Gentoo Forums
dbus and ldap == hangs at boot

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Tue Aug 29, 2006 1:52 pm    Post subject: dbus and ldap == hangs at boot

Hi Friends!

I have a laptop that is used as a desktop most of the time, connected to a LAN. Users authenticate through LDAP and their /home directories are mounted over NFS.

Everything works fine, but if I try to use the laptop disconnected from the LAN, the boot hangs when starting D-BUS. According to the few sparse references I have found, the problem seems to be that D-BUS is trying to look something up through LDAP (via nss_ldap). Since the laptop is not connected, there is no LDAP server to answer, and it hangs forever.

Setting bind_policy to soft in /etc/ldap.conf did not help either.

Any hints?

Thanks a lot in advance, Cantão!

ellingsw
n00b

Joined: 31 May 2004
Posts: 40
Location: Kansas, USA

Posted: Sun Sep 03, 2006 3:56 am

I have a desktop machine that is configured to authenticate users via LDAP, and /home is NFS mounted. My machine is connected to the network all the time; however, no Ethernet devices are up by the time D-BUS attempts to start.

During boot my machine hangs for a bit after it displays "Cleaning /tmp directory", then it hangs indefinitely when it gets to "Starting D-BUS system messagebus". And no, my problem with cleaning /tmp is not due to the tpm devices line in /etc/udev/rules.d/50-udev.rules, because I already commented it out.

As of right now, I've been waiting for my desktop to boot for about 5 hours. I can't tell you which version of D-BUS is installed because I cannot get my machine booted far enough to look.

Also, one other thing that really bothers me: if the LDAP server is unavailable, I cannot even log in as root on my desktop.

ellingsw
n00b

Joined: 31 May 2004
Posts: 40
Location: Kansas, USA

Posted: Sun Sep 03, 2006 5:22 am

I was finally able to boot my system to a point where I could disable dbus and reboot. Some services failed to start because they were unable to find the dbus service, but at least I can log in now.

I had dbus-0.60-r4 installed, dating from March 19. I performed an emerge sync, updated to dbus-0.61-r1, then rebooted; however, this did not fix the issue.

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Sun Sep 03, 2006 9:46 pm

Hi ellingsw!

Sorry for the delay in response! It's Sunday *and* I was in the middle of a power outage.

I'm currently physically far from the laptop in question, but what I can tell you now is that the "Cleaning /tmp" issue simply vanished with (I guess) the last baselayout update. But the D-BUS issue is really annoying.

I created a local (non-LDAP) user to be used when disconnected from the net, and took the machine to a presentation at a client's office. My surprise? The laptop simply got stuck at the D-BUS thing. Very uncomfortable to have 5 pairs of eyes staring blankly at you :-D

Anyway, I have to fix this @*#($@%&$ for a new presentation on Wednesday. I'll try the latest ~x86 dbus to see what happens. I'll keep in touch, in case of success.

It's worth mentioning that my /etc/nsswitch.conf is set to "files ldap". To my understanding, the system should look *first* in local files, then in LDAP, if the information (users, groups, whatever) is not locally available. Am I correct about this?

Best regards and good luck, Cantão!

.:chrome:.
Advocate

Joined: 19 Feb 2005
Posts: 4588
Location: Brescia, Italy

Posted: Sun Sep 03, 2006 10:58 pm    Post subject: Re: dbus and ldap == hangs at boot

Set this in /etc/ldap.conf:
Code:
bind_policy soft

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Mon Sep 04, 2006 10:58 am

Hi k.gothmog!

Yes, I tried that (it's in the original post). It didn't work and had a side effect: it prevented me from ssh-ing into the machine. I had to revert it.

Thanks a lot, Cantão!

UberLord
Retired Dev

Joined: 18 Sep 2003
Posts: 6835
Location: Blighty

Posted: Mon Sep 04, 2006 11:35 am

ellingsw wrote:
During boot my machine hangs for a bit after it displays "Cleaning /tmp directory"


It's not hanging, it's working out the order to start services. Due to doing a complex topological sort in bash it's a bit slow. We're trying to rectify this for baselayout-1.13 (tsort is not an option for the curious btw)
_________________
Use dhcpcd for all your automated network configuration needs
Use dhcpcd-ui (GTK+/Qt) as your System Tray Network tool

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Mon Sep 04, 2006 11:47 am

Hi UberLord!

Quote:
It's not hanging, it's working out the order to start services. Due to doing a complex topological sort in bash it's a bit slow. We're trying to rectify this for baselayout-1.13 (tsort is not an option for the curious btw)


Yep, the problem here seems to be the D-BUS thing. The question is, if I have "files ldap" in nsswitch.conf, and the messagebus user is local (/etc/passwd, /etc/group), why should the service look into LDAP?

Worse, why can't ellingsw log in as root without LDAP (presuming root is local too)?

Cheers, Cantão!

UberLord
Retired Dev

Joined: 18 Sep 2003
Posts: 6835
Location: Blighty

Posted: Mon Sep 04, 2006 1:10 pm

cantao wrote:
Yep, the problem here seems to be the D-BUS thing. The question is, if I have "files ldap" in nsswitch.conf, and the messagebus user is local (/etc/passwd, /etc/group), why should the service look into LDAP?


Depends on how dbus works. If it's set to enumerate users/groups then it will look in LDAP regardless of order in nsswitch.conf
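
The difference between a keyed lookup and an enumeration can be checked from a shell. This is only a minimal sketch, assuming a glibc-style /etc/nsswitch.conf; the helper name nss_sources is made up for illustration and is not part of any package:

```shell
#!/bin/sh
# nss_sources DB [FILE]: print the sources configured for one NSS database.
# Hypothetical helper; FILE defaults to /etc/nsswitch.conf.
nss_sources() {
    db=$1
    file=${2:-/etc/nsswitch.conf}
    # Drop comments, find the line for DB, print what follows the colon.
    sed -e 's/#.*//' "$file" |
        awk -F: -v db="$db" '
            $1 == db {
                sub(/^[ \t]+/, "", $2)   # trim leading blanks
                sub(/[ \t]+$/, "", $2)   # trim trailing blanks
                print $2
                exit
            }'
}

# Keyed lookup: with the default actions it stops at the first source that
# has an answer, so a user present in /etc/passwd should not reach LDAP:
#   getent passwd messagebus
# Enumeration: walks every source listed, LDAP included:
#   getent passwd
# Supplementary group lookups (initgroups) are a separate call and may still
# consult every source listed for "group".
```

Running `nss_sources passwd` shows the order that applies to keyed lookups; if the enumeration form hangs while the keyed form returns at once, that points at something enumerating users or groups.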

Quote:
Worse, why can't ellingsw log in as root without LDAP (presuming root is local too)?


I would assume that the rc process hasn't completed, which disables login until it has.

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Mon Sep 04, 2006 1:52 pm

Hi UberLord!

Quote:
Depends on how dbus works. If it's set to enumerate users/groups then it will look in LDAP regardless of order in nsswitch.conf


Hum... Very good guess. Upstream problem, perhaps? I'm going to do some research :)

Cheers, Cantão!

UberLord
Retired Dev

Joined: 18 Sep 2003
Posts: 6835
Location: Blighty

Posted: Mon Sep 04, 2006 1:55 pm

It will also do an LDAP lookup if it searches for a user/uid not in /etc/passwd

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Mon Sep 04, 2006 2:06 pm

UberLord wrote:
It will also do an LDAP lookup if it searches for a user/uid not in /etc/passwd


Exactly. But, AFAIK, messagebus is the user related to D-BUS, and it's a local user.

Cheers, Cantão!

ellingsw
n00b

Joined: 31 May 2004
Posts: 40
Location: Kansas, USA

Posted: Thu Sep 07, 2006 6:26 am

I found the command "chown 0:0 /tmp/.{ICE,X11}-unix" was re-introduced into /etc/init.d/bootmisc after I had already changed it to "chown root:root /tmp/.{ICE,X11}-unix"---I can't remember the bug number. I have since updated to baselayout-1.12.4-r7 and the command is now commented out by default in the init script. I have not rebooted yet to see if my problem with "Cleaning /tmp" is fixed or at least does not delay as long.

According to the documentation on nsswitch.conf you are correct, cantao, but it does not appear to work that way. Of course, this does apply to calls to getpwent functions. I don't know if there are other functions that implement password `db' lookups and what programs would use them; however, I doubt /bin/login would use them.

I have the following in /etc/nsswitch.conf:
Code:
passwd:      files ldap
shadow:      files ldap
group:       files ldap


I do not have the bind_policy option in /etc/ldap.conf, and according to nss_ldap(5) the default policy is hard_open. I can try the "bind_policy soft" option but I doubt it will help. For one, it doesn't work for cantao. And two, IIRC, login reports an invalid user id and password for root instead of waiting for the LDAP timeout. root is a local account in /etc/passwd.

UberLord, I am presented with a login prompt so I know the rc process has completed. I can sit in front of the screen and watch my system go through the boot process but still not be able to log in as root at the login prompt if the ethernet interface fails to come up.

UberLord
Retired Dev

Joined: 18 Sep 2003
Posts: 6835
Location: Blighty

Posted: Thu Sep 07, 2006 7:15 am

ellingsw wrote:
UberLord, I am presented with a login prompt so I know the rc process has completed. I can sit in front of the screen and watch my system go through the boot process but still not be able to log in as root at the login prompt if the ethernet interface fails to come up.


Is that because the root password on LDAP is different from how it is locally or something?

ellingsw
n00b

Joined: 31 May 2004
Posts: 40
Location: Kansas, USA

Posted: Sun Sep 10, 2006 11:17 pm

UberLord wrote:
Is that because the root password on LDAP is different from how it is locally or something?


What?! What difference would root's password in LDAP make if LDAP is unavailable because the network interface is down? I AM typing in the correct password for the local root account.

As for "Cleaning /tmp": after updating to baselayout-1.12.4-r7 and rebooting, I do not see a noticeable delay at that step during boot.

The "bind_policy soft" option has an unfortunate consequence that makes it unusable. If bind_policy is set to soft, no user existing in LDAP can ssh into the box. When a user tries to ssh in, sshd authenticates the user (and displays /etc/issue in my case), then immediately disconnects the session.

Here is proof for reference:

~ $> ssh hostname

This is a private system. If you do not have
an account on this system please disconnect
now. Connections are logged and any attempt
to hack into this system will be reported to
the appropriate authorities.

Connection to hostname closed by remote host.
Connection to hostname closed.


==> /var/log/auth <==
Sep 10 18:03:28 hostname sshd[24836]: Accepted publickey for username from 192.168.2.101 port 3489 ssh2
Sep 10 18:03:28 hostname sshd(pam_unix)[24838]: session opened for user username by (uid=0)
Sep 10 18:03:28 hostname sshd[24836]: nss_ldap: could not search LDAP server - Server is unavailable
Sep 10 18:03:28 hostname sshd[24836]: fatal: login_get_lastlog: Cannot find account for uid 1001
Sep 10 18:03:28 hostname sshd[24836]: syslogin_perform_logout: logout() returned an error
Sep 10 18:03:28 hostname sshd(pam_unix)[24838]: session closed for user username

==> /var/log/error <==
Sep 10 18:03:28 hostname sshd[24836]: nss_ldap: could not search LDAP server - Server is unavailable
Sep 10 18:03:28 hostname sshd[24836]: fatal: login_get_lastlog: Cannot find account for uid 1001

==> /var/log/messages <==
Sep 10 18:03:28 hostname sshd[24836]: Accepted publickey for username from 192.168.2.101 port 3489 ssh2
Sep 10 18:03:28 hostname sshd(pam_unix)[24838]: session opened for user username by (uid=0)
Sep 10 18:03:28 hostname sshd[24836]: nss_ldap: could not search LDAP server - Server is unavailable
Sep 10 18:03:28 hostname sshd[24836]: fatal: login_get_lastlog: Cannot find account for uid 1001
Sep 10 18:03:28 hostname sshd[24836]: syslogin_perform_logout: logout() returned an error
Sep 10 18:03:28 hostname sshd(pam_unix)[24838]: session closed for user username
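
The fatal line in these logs is sshd failing to map uid 1001 back to an account right after nss_ldap reports the server as unavailable. One way to tell whether a uid can be resolved without LDAP at all is to look for it in the local passwd file only. A minimal sketch; local_name_for_uid is a hypothetical helper name, and uid 1001 is simply the one from the log:

```shell
#!/bin/sh
# local_name_for_uid UID [FILE]: print the login name that FILE maps UID to.
# Returns non-zero when the uid is not in the local file, i.e. when only a
# remote source such as LDAP could answer the lookup.
local_name_for_uid() {
    uid=$1
    file=${2:-/etc/passwd}
    # Field 3 of a passwd line is the numeric uid, field 1 the login name.
    awk -F: -v uid="$uid" '
        $3 == uid { print $1; found = 1; exit }
        END       { exit !found }
    ' "$file"
}

# Example:
#   local_name_for_uid 1001 || echo "uid 1001 is not a local account"
```

If the uid is not local, any program that needs the account name depends on the LDAP lookup succeeding, which would fit the behaviour seen with the soft policy.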

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Sun Sep 17, 2006 10:53 pm

Well, as ellingsw said, bind_policy soft is a shot in the foot, at least for remote admin :)

I have just given up. I'll duplicate accounts locally on the laptop and free it from LDAP.

Thanks to everybody, Cantão!

mattsk
n00b

Joined: 11 Apr 2003
Posts: 46
Location: Newcastle, Australia

Posted: Mon Sep 25, 2006 2:17 pm

I'm having similar problems. Have you tried using the timelimit or bind_timelimit in /etc/ldap.conf? Or have you tried downgrading versions of nss_ldap?

In my case, if the ldap service isn't started yet (or has been stopped for some reason) then the system is *really* slow. I've worked out that it's the system trying to connect to the ldap server for a user or group search, and I also have
Code:
files ldap

set for the appropriate services in /etc/nsswitch.conf. When this happens, I can't even log in as root at the console - it times out after 60 seconds.

I am reasonably certain that the problem lies with nss_ldap. I recently updated it, and it's been since then that the problems have started. It's been a while since I've been forced to, but I'm fairly sure I was previously able to log in as root at the console if the ldap server (which resides on the same computer, as it happens) was down for whatever reason. In addition to this, commands run at the command line sometimes took a long time to complete (even ls), but these symptoms would go away if I removed the 'ldap' keyword from the entries in /etc/nsswitch.conf. In fact I have to do this to be able to restart the ldap server (since it seems to do an ldap search for the ldap user). In one case so far I even had to reboot the machine into single user mode (done by editing the grub boot parameters on the fly) to edit the nsswitch.conf file just so I could boot. I haven't tried rebooting since, but I suspect I'll have to do it again.

During my struggles so far, I've tried setting the soft bind option, and it stopped ssh logins from happening for me too (I'd like to know why that is; the debug slapd logs I pored over didn't offer up any clue that I could see). And setting:
Code:
timelimit 30
bind_timelimit 30

in /etc/ldap.conf doesn't seem to have made a difference. Part of the problem seems to stem from the fact that the default behaviour of nss_ldap is to never stop trying to connect to the ldap server.
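
To see whether a particular lookup is what blocks, it can help to run it under a deadline instead of waiting it out. A minimal sketch, assuming the `timeout` utility from GNU coreutils is installed (nothing in this thread mentions it); with_deadline is a hypothetical helper name:

```shell
#!/bin/sh
# with_deadline SECONDS COMMAND [ARGS...]: run COMMAND, give up after SECONDS.
# Prints "ok", "timed out" or "failed (status N)" so that a hang is easy to
# tell apart from a quick negative answer.
with_deadline() {
    secs=$1
    shift
    timeout "$secs" "$@" >/dev/null 2>&1
    status=$?
    case $status in
        0)   echo "ok" ;;
        124) echo "timed out" ;;          # GNU timeout's exit status on expiry
        *)   echo "failed (status $status)" ;;
    esac
}

# Examples, to be run with the ldap server stopped:
#   with_deadline 5 getent passwd root
#   with_deadline 5 getent group ldap
```

A "timed out" for an account that lives in /etc/passwd would suggest the lookup is reaching nss_ldap even though "files" is listed first.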

The only real solution I have when

So with this in mind, you may want to try the above settings (maybe with smaller timeouts), or downgrade your nss_ldap and see if either fixes the problem. I can't remember which version I upgraded from - but the emerge.log seems to indicate that it was 226. I'm currently using 249. I also just realised that I'm running version 239-r1 on one of my other servers, so I'll do some tests and see how that computer behaves when the ldap service is down.

I'd like to know *why* it insists on still checking the ldap service. Fortunately, in my case, it's quite rare for the ldap server to be offline when that computer is online.
_________________
-- Matt Sk (etc)

mattsk
n00b

Joined: 11 Apr 2003
Posts: 46
Location: Newcastle, Australia

Posted: Mon Sep 25, 2006 2:22 pm

Update: my other server running version 239-r1 doesn't skip a beat when the ldap server is down. If I call groups <user> when the ldap server is up I get the groups for that user, and if I call it when the server is down, I get an "unknown user" error.

The servers have identical /etc/nsswitch.conf entries for passwd, group, and shadow.

I just downgraded to that version on the main server and all seems to be well. No pauses, and no need to edit nsswitch.conf to restart the ldap server.

I've temporarily solved the problem by putting
Code:
>=sys-auth/nss_ldap-249

in /etc/portage/package.mask
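
Adding the mask can be done so that running it twice does not leave duplicate lines. A minimal sketch, assuming /etc/portage/package.mask is a plain file (not a directory); mask_atom is a hypothetical helper name:

```shell
#!/bin/sh
# mask_atom ATOM [FILE]: append ATOM to a package.mask file unless an
# identical line is already there. FILE defaults to
# /etc/portage/package.mask and must be a plain file.
mask_atom() {
    atom=$1
    file=${2:-/etc/portage/package.mask}
    # -F: fixed string, -x: match the whole line, -q: quiet.
    # "--" guards against atoms that start with a dash.
    if [ -f "$file" ] && grep -Fxq -- "$atom" "$file"; then
        return 0
    fi
    printf '%s\n' "$atom" >> "$file"
}

# Example (as root):
#   mask_atom '>=sys-auth/nss_ldap-249'
```

The single quotes matter: without them the shell would treat `>=sys-auth/...` as an output redirection.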

I know this is all a month and a half after you gave up, but I hope this is still helpful.

This seems to be an nss_ldap bug, but I can't find anything else about it on the net so far. Does anybody know if it goes away in later versions? (I notice that there are versions 250, 250-r1, 252, and 253, all masked with the ~x86 keyword.)

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Mon Sep 25, 2006 3:01 pm

Hi Mattsk!

Thanks a lot for your reply.

Tomorrow I'll have the laptop in hand, so I can give the several nss_ldap versions a try. I'll see what happens with the unstable versions and post the results here.

mattsk wrote:
I can't remember which version I upgraded from - but the emerge.log seems to indicate that it was 226. I'm currently using 249


Here is a great package for that:
Code:
emerge -v genlop

and then:
Code:
genlop nss_ldap


It should tell you the history of your versions of nss_ldap (or of any other package, for that matter).
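
Without genlop, the same history can be pulled straight out of emerge.log. This is only a sketch, and it assumes the log uses lines of the form shown in the comment; merged_versions is a hypothetical helper name:

```shell
#!/bin/sh
# merged_versions PKG [LOG]: list the versions of PKG that finished merging,
# oldest first, one "epoch category/package-version" pair per line.
# Assumes log lines of the form
#   1161432345:  ::: completed emerge (1 of 1) sys-auth/nss_ldap-249 to /
merged_versions() {
    pkg=$1
    log=${2:-/var/log/emerge.log}
    awk -v pkg="$pkg" '
        / ::: completed emerge / {
            atom = $(NF - 2)              # e.g. sys-auth/nss_ldap-249
            if (index(atom, "/" pkg "-") > 0) {
                sub(/:$/, "", $1)         # strip the colon after the timestamp
                print $1, atom
            }
        }
    ' "$log"
}

# Example:
#   merged_versions nss_ldap
```

The first column is seconds since the epoch; with GNU date, `date -d @1161432345` turns one of them into a readable date.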

Thanks a lot, Cantão!

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Fri Sep 29, 2006 2:06 pm

It only gets worse...

I could not check the laptop (it was travelling), but I performed several updates on some machines (monolithic Xorg -> non-monolithic Xorg, openssl and gnutls), all by the book, with several revdep-rebuilds to make sure everything was ok.

And now these updated machines are also hanging at D-BUS, even when connected to the LAN and with the LDAP server working fine.

I'm trying to boot them to re-emerge dbus and nss_ldap. Let's see what happens.

Cheers, Cantão!

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Fri Sep 29, 2006 2:45 pm

Quote:
I'm trying to boot them to re-emerge dbus and nss_ldap. Let's see what happens.

Nope, that's not the problem. Checking /var/log/messages and the init scripts, I discovered that D-BUS is trying to connect to LDAP before net.eth0 is up. After a zillion attempts, it gives up and continues.

Updating baselayout as an attempt...

Cheers, Cantão!

blubbi
Guru

Joined: 27 Apr 2003
Posts: 564
Location: Halle (Saale), Germany

Posted: Fri Nov 10, 2006 1:29 pm

Now I have run into the same problem.

System is stable and up-to-date.

The only thing that is ~x86 is udev, to solve the nss_ldap and udev problem (a user called tss causes an LDAP lookup) described here:
https://bugs.gentoo.org/show_bug.cgi?id=99564

I worked around this problem in the following way.

I don't have any local users so I added all system users to /etc/ldap.conf:

Code:
# Append one nss_initgroups_ignoreusers line listing every account in /etc/passwd
echo "nss_initgroups_ignoreusers $(cut -d: -f1 /etc/passwd | paste -sd, -)" >> /etc/ldap.conf
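
A variant that can be re-run after adding system accounts, without piling up duplicate lines, might look like this. It is a minimal sketch: ignoreusers_line and set_ignoreusers are hypothetical helper names, and as far as nss_ldap(5) describes it the option only makes initgroups() skip LDAP for the listed users:

```shell
#!/bin/sh
# ignoreusers_line [PASSWD_FILE]: print an nss_initgroups_ignoreusers line
# naming every account in PASSWD_FILE (default /etc/passwd).
ignoreusers_line() {
    file=${1:-/etc/passwd}
    # Field 1 is the login name; paste joins the names with commas.
    printf 'nss_initgroups_ignoreusers %s\n' \
        "$(cut -d: -f1 "$file" | paste -sd, -)"
}

# set_ignoreusers CONF [PASSWD_FILE]: replace any existing
# nss_initgroups_ignoreusers line in CONF instead of appending a second one.
set_ignoreusers() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Keep every other setting, then add the freshly generated line.
    grep -v '^nss_initgroups_ignoreusers[[:space:]]' "$conf" > "$tmp"
    ignoreusers_line "${2:-/etc/passwd}" >> "$tmp"
    # Write back through the existing file so its permissions are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Example (as root):
#   set_ignoreusers /etc/ldap.conf
```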


My nsswitch.conf looks like this:

passwd: files compat ldap
shadow: files compat ldap
group: files compat ldap

Now the system boots and I can log in as root if there is no access to the LDAP server.

But there are still famd, nscd and kdm_config, which try to connect to LDAP. I have no clue what they are trying to look up there. Okay, nscd tries to cache the users from LDAP, but what about kdm_config and famd?

regards
blubbi

cantao
Apprentice

Joined: 07 Jan 2004
Posts: 166

Posted: Fri Nov 10, 2006 4:59 pm

Hi Friends!

In fact, I gave up on this... I recreated all users locally on the laptop, without any mention of LDAP in PAM or nsswitch.conf. This way I can use the NFS-mounted /home if the LAN is up, or a local user if the laptop is disconnected.

Not the most elegant solution, but it worked anyway.

Thanks to all, Cantão!

bunder
Bodhisattva

Joined: 10 Apr 2004
Posts: 5947

Posted: Fri Nov 10, 2006 10:18 pm

i currently have this problem as well. i believe i'm just going to hardmask the new openldap/nss_ldap/udev versions until they fix these bootup/login bugs. tbh, this has been a problem for quite some time, and i can't believe that it hasn't been fixed properly. bind_policy soft and nss_reconnect only make this problem worse.

cheers
_________________
Neddyseagoon wrote:
The problem with leaving is that you can only do it once and it reduces your influence.

banned from #gentoo since sept 2017

ellingsw
n00b

Joined: 31 May 2004
Posts: 40
Location: Kansas, USA

Posted: Mon Jan 08, 2007 4:00 am

Heads Up!

I had to reboot today because my system was not responding to user input from the console even though I could log in remotely. Upon boot, the problem with my machine hanging indefinitely when it gets to "Starting D-BUS system messagebus" is back.

I was able to get the system booted by restarting and performing an interactive boot... skipping dbus of course when it asked if I wanted to start it. After the system and NIC were up, I was able to start dbus without a problem.

I have not determined the cause of it this time but I'll see if I can figure it out.

The latest version of nss_ldap on my machine is 249 and was installed on Oct 14, 2006. My last reboot was Nov 12, 2006, which I did not have a problem with.

I have synced 4 times since then and performed at least 2 updates, with an emerge world in there somewhere after upgrading to gcc 4.1.1-r1.

Since Nov 12, the following packages, which are used during boot, have been updated:
baselayout (1.12.6)
udev (103)
dbus (0.62-r2)

There are other packages that have been updated but I doubt they are related to this issue.