NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54744 Location: 56N 3W
|
Posted: Thu Sep 28, 2017 10:12 am Post subject: Chroot broken on arm64 |
|
|
Team,
I got my arm64 install into a state where it either won't boot, because it hangs when it should be making static dev nodes, or it won't start Mate or Xfce4 due to a missing symbol.
So I've tried to chroot in from an older install as follows.
Code: | Pi3 64bit ~ # mount /dev/sdb1 /mnt/sdroot
Pi3 64bit ~ # mount --rbind /run /mnt/sdroot/run
Pi3 64bit ~ # mount --rbind /sys /mnt/sdroot/sys
Pi3 64bit ~ # mount --rbind /dev /mnt/sdroot/dev
Pi3 64bit ~ # mount -t proc proc /mnt/sdroot/proc
Pi3 64bit ~ # chroot /mnt/sdroot /bin/bash
"If you'll excuse me a minute, I'm going to have a cup of coffee."
- broadcast from Apollo 11's LEM, "Eagle", to Johnson Space Center, Houston
July 20, 1969, 7:27 P.M.
Pi3 64bit / # env-update
[1]+ Stopped chroot /mnt/sdroot /bin/bash
Pi3 64bit ~ # | That all looks good. /etc/bash/bashrc runs fortune as the very last thing.
Then, in the chroot, the env-update command stops the chroot.
It's like there is no job control somewhere.
Code: | chroot /mnt/sdroot /bin/busybox sh | works as busybox is built statically but lots of things don't work with busybox as the shell.
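One way to look closer at the stopped job, as a minimal sketch: bash prints "Stopped" when a job has received a job-control stop signal, and on Linux the kernel's view of that process can be read from /proc. The helper name proc_signal_state is an assumption for illustration, and the PID is whatever `jobs -l` reports in the outer shell.

```sh
# Hypothetical helper: show the state and signal masks of one process.
# Assumes a Linux /proc filesystem is mounted.
proc_signal_state() {
    pid="$1"
    # State shows e.g. "T (stopped)"; the Sig* lines are hex bitmasks.
    grep -E '^(State|SigBlk|SigIgn|SigCgt):' "/proc/${pid}/status"
}

# Usage from the outer shell, after the chroot has been stopped:
#   jobs -l                  # note the PID of the stopped chroot
#   proc_signal_state PID    # replace PID with that number
```

A State line of "T (stopped)" would confirm the stop, and the SigIgn and SigCgt masks show which signals the process ignores or handles. That does not say why the stop happens, only what the kernel sees.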
Any pointers? _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
LIsLinuxIsSogood Veteran
Joined: 13 Feb 2016 Posts: 1186
|
Posted: Fri Sep 29, 2017 9:46 am Post subject: |
|
|
Maybe boot from sysrescueCD (assuming that you haven't already tried that?) |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54744 Location: 56N 3W
|
Posted: Fri Sep 29, 2017 10:06 am Post subject: |
|
|
LIsLinuxIsSogood,
It's arm64, not amd64. It will be a year or two before System Rescue CD supports that arch.
The system is a Raspberry Pi 3 in 64 bit mode. It's all a bit like the wild west :)
I've tried booting from an image of the broken install as it was 9 months ago. That boots but won't chroot. _________________ Regards,
NeddySeagoon
|
nokilli Apprentice
Joined: 25 Feb 2004 Posts: 235
|
Posted: Fri Sep 29, 2017 5:50 pm Post subject: |
|
|
You've helped so many people here including myself and so there's a natural desire to return the favor but all I got is maybe it's a permissions problem? Something you no doubt dismissed in the first few seconds of considering the problem.
Usually with chroots that's where things go awry in my experience. _________________ We are the block device. The kernel is our client. |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54744 Location: 56N 3W
|
Posted: Fri Sep 29, 2017 8:16 pm Post subject: |
|
|
nokilli,
All help gratefully received. I think it's probably something silly that I'm doing.
This may not be related.
The system I'm trying to chroot into won't boot on its own. It stalls when it should be making static device nodes for the kernel.
If I downgrade glibc (that's a silly thing to do - don't do that at home) while portage isn't looking, it boots.
I've not tested the chroot with an older glibc.
Downgrading glibc prevents Xfce4 and/or Mate from working, so that's not an option but it does have the advantage that it boots :)
Rule One is assume nothing.
Feel free to ask anything you consider relevant. _________________ Regards,
NeddySeagoon
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20581
|
Posted: Sat Sep 30, 2017 3:45 am Post subject: |
|
|
OK, so a couple of questions mainly for my own benefit, but who knows, they may trigger an idea.
Keep in mind that I have little more than an ethereal concept that glibc provides "core libraries."
Quote: | It stalls when it should be making static device nodes for the kernel. | Is there a significant "transition" going on at the point where it should be creating the nodes? And more specifically, what code is it using at that point when it crashes? A new or previously rarely encountered bug in either code base might be the issue.
Since downgrading glibc results in solving the boot problem, I'm wondering if a different kernel would also solve the problem. If so, then perhaps finding a kernel that doesn't have the problem could narrow down the code change which led to the crash. Or maybe the same if the difference is in glibc code. If creating device nodes uses a smallish section of code, that would help narrow down the focus of the code search.
At least it sounds good from where I'm sitting (aka, no clue) _________________ Quis separabit? Quo animo? |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54744 Location: 56N 3W
|
Posted: Sat Sep 30, 2017 9:16 am Post subject: |
|
|
pjp,
My first thought was: how can I determine that?
After a little pondering, I can fill the init scripts with echo commands. That should tell me what was executed last.
I did try the interactive boot mode, but I have to say "No" to services so early that it's no better than init=/bin/bash.
init=/bin/bash works as well as you would expect it to.
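A minimal sketch of that echo approach, for anyone wanting to repeat it. The trace function and the TRACE_OUT variable are hypothetical names, not part of OpenRC; the output defaults to /dev/console on the assumption that no logger is running that early in boot.

```sh
# Hypothetical helper for sprinkling trace markers through init scripts.
# TRACE_OUT is an assumed override; by default markers go to the console.
trace() {
    printf 'TRACE %s: %s\n' "$1" "$2" >> "${TRACE_OUT:-/dev/console}"
}

# Example markers, as they might be placed in a service script:
#   trace devfs "start() entered"
#   trace devfs "start() finished"
```

The last marker that appears on the console is then the last point that was reached before the stall.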
-- edit --
After poking about with lots of echo statements, it appears that
/etc/inittab: | # System initialization, mount local filesystems, etc.
si::sysinit:/sbin/openrc sysinit
# Further system initialization, brings up the boot runlevel.
rc::bootwait:/sbin/openrc boot
...
l3:3:wait:/sbin/openrc default
... |
The bootwait and wait statements both wait forever. Everything in the sysinit runlevel appears to run.
Nothing in the boot runlevel ever starts.
-- edit some more --
The devfs service completes but is in the output 10 seconds later, so it appears that the parent never finds out. _________________ Regards,
NeddySeagoon
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20581
|
Posted: Sun Oct 01, 2017 2:21 am Post subject: |
|
|
Well, my perhaps overly unrealistic thought was doing something similar in either the kernel or glibc code. But I may have misremembered whether or not you did development. My thinking was that when the "create_device_node()" function was run, something in that was messed up.
NeddySeagoon wrote: | The bootwait and wait statements both wait forever. Everything in the sysinit runlevel appears to run.
Nothing in the boot runlevel ever starts. | So is it openrc that's creating the device nodes? I literally have no idea how or where that happens. Is the init system creating the devices which are then visible to the kernel?
NeddySeagoon wrote: | The devfs service completes but is in the output 10 seconds later, so it appears that the parent never finds out. | It is completing but not creating the devices?
And while going back to try ensuring I wasn't asking something completely unhelpful, I noticed your opening statement:
NeddySeagoon wrote: | I got my arm64 install into a state where it either won't boot | Do you recall what you did to get it in this state? This made it sound like it was just fine until you got an idea to experiment :)
The only other thing I could think of is whether or not the combination of the kernel, glibc and openrc are working on a different architecture (or the relevant pieces if they aren't all relevant). _________________ Quis separabit? Quo animo? |
|
vaxbrat l33t
Joined: 05 Oct 2005 Posts: 731 Location: DC Burbs
|
Posted: Sun Oct 01, 2017 2:40 am Post subject: Missing crucial node(s) in the static /dev tree maybe? |
|
|
Could you be missing something in the static /dev tree that's there in the initial filesystem? That would either be the initramfs or the root one itself if you don't bother to use one and then pivot. |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54744 Location: 56N 3W
|
Posted: Sun Oct 01, 2017 9:25 am Post subject: |
|
|
pjp, vaxbrat,
Responses intermingled.
I got into this mess by doing an update. The update included glibc. After that build, I noticed sandbox errors about being unable to bind some signals.
From memory, signals 3, 10 and 15. I can do the glibc downgrade, boot and glibc upgrade if exact numbers are important.
It may even be in some of the build logs.
Code: | >>> sys-libs/glibc-2.25-r4 merged.
>>> Regenerating /etc/ld.so.cache...
sandbox:main unable to bind signal 3: Bad file descriptor
sandbox:main signal 15 already had a handler ...
sandbox:main unable to bind signal 10: Bad file descriptor |
The entire build log is at http://bpaste.net/show/96eed28e20b0
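To pull the exact signal numbers out of a saved build log rather than relying on memory, a minimal sketch follows. It assumes the messages keep the "sandbox:main ... signal N" wording shown above; the function name sandbox_signals is hypothetical.

```sh
# Hypothetical helper: list the signal numbers named in sandbox:main
# messages. Reads a build log on stdin and prints one number per line,
# sorted numerically with duplicates removed.
sandbox_signals() {
    sed -nE 's/.*sandbox:main (unable to bind )?signal ([0-9]+).*/\2/p' |
        sort -n -u
}

# Usage (the log path is a placeholder):
#   sandbox_signals < /path/to/build.log
```

It matches both the "unable to bind signal" and the "already had a handler" forms, so the three lines quoted above would each contribute their number.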
Seeing that glibc had just been built and there were apparent errors, I rebooted, which didn't work.
I tried the emergency manual override to downgrade glibc. Don't do this at home!: | tar --xattrs -xpvf packages/sys-libs/glibc-2.24-r3.tbz2 -C /mnt/floppy |
Which allowed the system to boot, for some versions of glibc.
I've reverted openrc the same way Code: | tar --xattrs -xpf openrc-0.22.4.tbz2 -C /mnt/floppy/ | thinking that booting might be an openrc issue.
I don't think I'm missing static /dev nodes. devtmpfs is mounted on /dev so everything should be there.
The only user is root, so it should not be a permissions issue in /dev either.
-- edit --
Bug sys-libs/glibc-2.25-r5 fails with binutils-2.29: segfault running simple test during pkg_preinst on arm64
might be relevant. _________________ Regards,
NeddySeagoon
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20581
|
Posted: Mon Oct 02, 2017 4:32 am Post subject: |
|
|
Unfortunately it was well beyond me at the beginning. What seems strange is that xfce stopped working with the downgrade. And I'm guessing the downgrade was to versions on which xfce4 previously worked?
Hopefully the bug is promising. Have you tried the suggestion of disabling stack protection? _________________ Quis separabit? Quo animo? |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54744 Location: 56N 3W
|
Posted: Mon Oct 02, 2017 9:14 am Post subject: |
|
|
Team,
It's actually several problems. I'm unravelling it a bit at a time.
Downgrading glibc to 2.24-r3 allows booting to work.
The updated broken glibc-2.25.x provides a new symbol that libbsd-0.8.6 depends on.
Downgrading glibc to 2.24-r3 and libbsd to 0.8.5 allows both Xfce4 and Mate to work.
Strictly speaking the libbsd-0.8.6.ebuild should have an RDEPEND on >=glibc-2.25
but we don't tend to do that as glibc is in the system set and just works.
It's only ~arch users and beyond that will get caught out, and then only when they downgrade glibc,
which is, according to the error message, "a sure way to break your system".
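A minimal sketch for checking this kind of dependency by hand: it filters the dynamic symbol table that `objdump -T` prints and keeps the undefined symbols bound to one glibc symbol version. The function name undefined_from_glibc is hypothetical, and the library path in the usage line is a placeholder.

```sh
# Hypothetical helper: read `objdump -T` output on stdin and print the
# undefined symbols that are bound to the given glibc version (e.g. 2.25).
undefined_from_glibc() {
    awk -v ver="GLIBC_$1" '
        /\*UND\*/ {
            v = $(NF - 1)        # version column, sometimes wrapped in ()
            gsub(/[()]/, "", v)
            if (v == ver) print $NF
        }'
}

# Usage (replace the path with the real location of the library):
#   objdump -T /path/to/libbsd.so.0 | undefined_from_glibc 2.25
```

Any name it prints is a symbol the library expects that glibc version to provide, which is the situation that breaks when glibc is downgraded underneath it.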
I've built glibc with stack protection disabled, as per the bug, but it didn't install due
to file collisions. I'll install the tarball and see what happens. _________________ Regards,
NeddySeagoon
|
mDup Apprentice
Joined: 14 Apr 2006 Posts: 212
|
Posted: Tue Oct 03, 2017 2:29 pm Post subject: Re: Chroot broken on arm64 |
|
|
NeddySeagoon wrote: |
Code: | chroot /mnt/sdroot /bin/busybox sh | works as busybox is built statically but lots of things don't work with busybox as the shell.
Any pointers? |
If I run
or
on a working system everything seems to work.
So which things don't work?
Note that
and are usually statically linked too.
Maybe it helps to run ldconfig?
Or to fix some links manually with sln?
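On the ldconfig idea: glibc's ldconfig accepts `-r ROOT`, so the cache inside the broken root can be regenerated from the working system without entering the chroot. A minimal sketch, assuming the root is still mounted at /mnt/sdroot as in the first post; the wrapper name refresh_ld_cache and the DRY_RUN variable are hypothetical, and whether this changes anything here is untested.

```sh
# Hypothetical wrapper: rebuild ld.so.cache for another root filesystem
# using the running system's ldconfig. Set DRY_RUN=1 to only print the
# command instead of running it.
refresh_ld_cache() {
    root="$1"
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty.
    ${DRY_RUN:+echo} ldconfig -r "$root"
}

# Usage:
#   DRY_RUN=1 refresh_ld_cache /mnt/sdroot   # show what would run
#   refresh_ld_cache /mnt/sdroot             # run it (needs root)
```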
Sorry to not be of any help.
Good luck! |