Tzuriel Apprentice
Joined: 01 Jun 2004 Posts: 260
|
Posted: Tue Oct 31, 2006 10:15 pm Post subject: network unreachable mounting root via nfs |
|
|
I've got a diskless node configuration, and I think something is wrong with NFS at the point where the node starts to boot the kernel. I've been following the Gentoo diskless HOWTO.
I think error 101 below means "network is unreachable", but I don't see how that can be. My node gets a static IP via DHCP during PXE boot, fetches the bzImage via TFTP, and starts to boot the kernel, at which point the errors below appear. My guess is that eth0 is somehow being shut down, which leaves the network unreachable.
Does anyone have an idea what's going on here?
Node boot output ...
Code: |
...
Using IPI Shortcut mode
IP-Config: No network devices available.
Looking up port of RPC 100003/2 on 192.168.2.0
portmap: RPC call returned error 101
Root-NFS: Unable to get mountd port number from server, using default
Looking up port of RPC 100005/1 on 192.168.2.10
portmap: RPC call returned error 101
Root-NFS: Unable to get mountd port number from server, using default
mount: RPC call returned error 101
Root-NFS: Server returned error 101 while mounting /diskless/192.168.2.101
VFS: Unable to mount root fs via NFS, trying floppy.
VFS: cannot open root device "nfs" or unknown-block(2,0)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(2,0)
|
My /diskless/pxelinux.cfg/default file
Code: |
DEFAULT /bzImage
APPEND ip=dhcp root=/dev/nfs nfsroot=192.168.2.10:/diskless/192.168.2.101
|
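As a reading aid for the APPEND line above, here is a minimal sketch of how its parameters break down. It assumes the documented `nfsroot=[server:]path[,options]` form; the helper names `parse_append` and `split_nfsroot` are made up for this illustration and are not part of PXELINUX or the kernel.

```python
def parse_append(line):
    """Split a PXELINUX APPEND line into a dict of kernel parameters."""
    words = line.split()
    if words and words[0].upper() == "APPEND":
        words = words[1:]
    params = {}
    for word in words:
        key, sep, value = word.partition("=")
        # Parameters without "=" are treated as boolean flags.
        params[key] = value if sep else True
    return params


def split_nfsroot(value):
    """Split nfsroot=[server:]path[,options] into (server, path, options)."""
    target, _, options = value.partition(",")
    server, sep, path = target.rpartition(":")
    if not sep:
        # No server given: the kernel would use the one learned via DHCP/BOOTP.
        server, path = None, target
    return server, path, (options.split(",") if options else [])


params = parse_append(
    "APPEND ip=dhcp root=/dev/nfs nfsroot=192.168.2.10:/diskless/192.168.2.101"
)
server, path, options = split_nfsroot(params["nfsroot"])
print(params["ip"], params["root"], server, path, options)
```

So the kernel is being told to configure its own interface via DHCP (`ip=dhcp`) and then mount `/diskless/192.168.2.101` from 192.168.2.10. Both steps need a working NIC driver inside the kernel itself.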
My /etc/exports file
Code: |
# /etc/exports: NFS file systems being exported. See exports(5).
# for each node ...
/diskless/192.168.2.101 192.168.2.101(sync,rw,no_root_squash,no_all_squash)
# This is common to all nodes ...
/opt 192.168.2.0/24(sync,ro,no_root_squash,no_all_squash)
/usr 192.168.2.0/24(sync,ro,no_root_squash,no_all_squash)
/home 192.168.2.0/24(sync,rw,no_root_squash,no_all_squash)
# This is the shared log ...
/var/log 192.168.2.101(sync,rw,no_root_squash,no_all_squash)
|
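To double-check which of these exports a given node is allowed to mount, a rough sketch like the following may help. It only understands plain IP addresses and CIDR networks (no hostnames, wildcards or netgroups), and `exports_for` is a hypothetical helper name, not an NFS tool.

```python
import ipaddress

# The non-comment lines of the exports file shown above.
EXPORTS = """\
/diskless/192.168.2.101 192.168.2.101(sync,rw,no_root_squash,no_all_squash)
/opt 192.168.2.0/24(sync,ro,no_root_squash,no_all_squash)
/usr 192.168.2.0/24(sync,ro,no_root_squash,no_all_squash)
/home 192.168.2.0/24(sync,rw,no_root_squash,no_all_squash)
/var/log 192.168.2.101(sync,rw,no_root_squash,no_all_squash)
"""


def exports_for(client_ip, exports_text):
    """Return {path: [options]} for entries whose client spec covers client_ip."""
    client = ipaddress.ip_address(client_ip)
    visible = {}
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        path, *clients = line.split()
        for spec in clients:
            network, _, opts = spec.partition("(")
            # A bare address is treated as a /32 network.
            if client in ipaddress.ip_network(network, strict=False):
                visible[path] = opts.rstrip(")").split(",")
    return visible


for path, opts in exports_for("192.168.2.101", EXPORTS).items():
    print(path, ",".join(opts))
```

For 192.168.2.101 all five paths match, so the exports file itself does not look like the cause of the mount failure.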
My slave node's fstab
Code: |
192.168.2.10:/diskless/192.168.2.101 / nfs sync,hard,intr,rw,nolock,rsize=8192,wsize=8192 0 0
192.168.2.10:/opt /opt nfs sync,hard,intr,ro,nolock,rsize=8192,wsize=8192 0 0
192.168.2.10:/usr /usr nfs sync,hard,intr,ro,nolock,rsize=8192,wsize=8192 0 0
192.168.2.10:/home /home nfs sync,hard,intr,rw,nolock,rsize=8192,wsize=8192 0 0
none /proc proc defaults 0 0
192.168.2.10:/var/log /var/log nfs hard,intr,rw 0 0
|
|
|
anonybosh Guru
Joined: 20 Nov 2005 Posts: 324
|
Posted: Tue Oct 31, 2006 11:57 pm Post subject: |
|
|
Did you get any log output on your server regarding the NFS mounting? |
|
Tzuriel Apprentice
Joined: 01 Jun 2004 Posts: 260
|
Posted: Thu Nov 02, 2006 1:27 am Post subject: |
|
|
Yes, in /var/log/messages on the master, I get the following output. I still haven't found what's going wrong here, other than I think the NIC is shutting down while booting.
Code: |
Nov 1 13:09:07 master dhcpd: DHCPDISCOVER from 00:13:72:fb:0f:fc via eth0
Nov 1 13:09:07 master dhcpd: DHCPOFFER on 192.168.2.101 to 00:13:72:fb:0f:fc via eth0
Nov 1 13:09:11 master dhcpd: DHCPREQUEST for 192.168.2.101 (192.168.2.10) from 00:13:72:fb:0f:fc via eth0
Nov 1 13:09:11 master dhcpd: DHCPACK on 192.168.2.101 to 00:13:72:fb:0f:fc via eth0
Nov 1 13:09:11 master in.tftpd[10457]: RRQ from 192.168.2.101 filename pxelinux.0
Nov 1 13:09:11 master in.tftpd[10457]: tftp: client does not accept options
Nov 1 13:09:11 master in.tftpd[10458]: RRQ from 192.168.2.101 filename pxelinux.0
Nov 1 13:09:11 master in.tftpd[10459]: RRQ from 192.168.2.101 filename pxelinux.cfg/01-00-13-72-fb-0f-fc
Nov 1 13:09:11 master in.tftpd[10460]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A80265
Nov 1 13:09:11 master in.tftpd[10461]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A8026
Nov 1 13:09:11 master in.tftpd[10462]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A802
Nov 1 13:09:11 master in.tftpd[10463]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A80
Nov 1 13:09:11 master in.tftpd[10464]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A8
Nov 1 13:09:11 master in.tftpd[10465]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A
Nov 1 13:09:11 master in.tftpd[10466]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0
Nov 1 13:09:11 master in.tftpd[10467]: RRQ from 192.168.2.101 filename pxelinux.cfg/C
Nov 1 13:09:11 master in.tftpd[10468]: RRQ from 192.168.2.101 filename pxelinux.cfg/default
Nov 1 13:09:11 master in.tftpd[10469]: RRQ from 192.168.2.101 filename /bzImage
|
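The run of pxelinux.cfg requests in the log follows a fixed search order: first a name built from the MAC address, then the IP address in hexadecimal shortened one digit at a time, and finally `default`. A small sketch that reproduces that order for this node (the function name `pxelinux_config_candidates` is invented for the example):

```python
def pxelinux_config_candidates(mac, ip, arp_type=1):
    """List config file names in the order the TFTP log shows them requested."""
    # MAC-based name: ARP hardware type (01 = Ethernet), then the MAC,
    # lowercase and dash-separated.
    mac_name = "%02x-%s" % (arp_type, mac.lower().replace(":", "-"))
    # IP-based names: the address as 8 uppercase hex digits, then shortened
    # by one digit per attempt.
    hex_ip = "".join("%02X" % int(octet) for octet in ip.split("."))
    names = [mac_name]
    names += [hex_ip[:length] for length in range(len(hex_ip), 0, -1)]
    names.append("default")
    return ["pxelinux.cfg/" + name for name in names]


for name in pxelinux_config_candidates("00:13:72:fb:0f:fc", "192.168.2.101"):
    print(name)
```

Since only `pxelinux.cfg/default` exists here, every earlier request fails and the log ends with the fetch of `/bzImage`. Nothing after that point appears in the log, which means the node never contacts the master again once the kernel starts.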
|
|
anonybosh Guru
Joined: 20 Nov 2005 Posts: 324
|
Posted: Thu Nov 02, 2006 6:05 am Post subject: |
|
|
Ok, so your diskless system doesn't even seem to be contacting the master...
Your problem appears to lie right here: Code: | IP-Config: No network devices available. | This seems to tell me that you don't have the necessary network driver compiled into your kernel!
One other thing that looks wrong is the: Code: | VFS: cannot open root device "nfs" or unknown-block(2,0)
Please append a correct "root=" boot option | ...which seems to be caused by your declaration of "root=/dev/nfs". Try removing that string altogether.
Also, have you tried mounting your NFS shares from another system to be certain all is well with them? |
|
Tzuriel Apprentice
Joined: 01 Jun 2004 Posts: 260
|
Posted: Thu Nov 02, 2006 7:44 am Post subject: |
|
|
liber8ate wrote: | Ok, so your diskless system doesn't even seem to be contacting the master...
Your problem appears to lie right here: Code: | IP-Config: No network devices available. | This seems to tell me that you don't have the necessary network driver compiled into your kernel!
|
How can that be if I'm already on the network and grabbing the bzImage from the master? If I wasn't on the network talking to the master, then I wouldn't have been able to grab the kernel and load it. Or, maybe I'm not understanding something here.
liber8ate wrote: |
One other thing that looks wrong is the: Code: | VFS: cannot open root device "nfs" or unknown-block(2,0)
Please append a correct "root=" boot option | ...which seems to be caused by your declaration of "root=/dev/nfs". Try removing that string altogether.
Also, have you tried mounting your NFS shares from another system to be certain all is well with them? |
No, I haven't been able to find another Linux box to try it from.
OK, I'll try taking off the root=/dev/nfs in the morning. Every tutorial I saw, including the Gentoo ones, basically used it. And I've tried many combinations of boot params. |
|
wynn Advocate
Joined: 01 Apr 2005 Posts: 2421 Location: UK
|
Posted: Thu Nov 02, 2006 12:56 pm Post subject: |
|
|
liber8ate wrote: | One other thing that looks wrong is the: Code: | VFS: cannot open root device "nfs" or unknown-block(2,0)
Please append a correct "root=" boot option | ...which seems to be caused by your declaration of "root=/dev/nfs". Try removing that string altogether. | The comment in /usr/src/linux/init/do_mounts.c to name_to_dev_t says Code: | /*
* Convert a name into device number. We accept the following variants:
*
* 1) device number in hexadecimal represents itself
* 2) /dev/nfs represents Root_NFS (0xff)
* 3) /dev/<disk_name> represents the device number of disk
* 4) /dev/<disk_name><decimal> represents the device number
* of partition - device number of disk plus the partition number
* 5) /dev/<disk_name>p<decimal> - same as the above, that form is
* used when disk name of partitioned disk ends on a digit. | so root=/dev/nfs seems to be correct.
There is another section of code (which will be selected as Tzuriel's kernel config contains CONFIG_ROOT_NFS=y) Code: | #ifdef CONFIG_ROOT_NFS
if (MAJOR(ROOT_DEV) == UNNAMED_MAJOR) {
if (mount_nfs_root())
return;
printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying floppy.\n");
ROOT_DEV = Root_FD0;
}
#endif | which shows why the error message says "unknown-block(2,0)" which is /dev/fd0.
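To make the "(2,0)" concrete, here is a small sketch of the device-number arithmetic involved. The constants are assumptions recalled from 2.6-era kernel headers rather than anything quoted in this thread, and `pick_root` and `describe` are illustrative names only.

```python
# Constants as recalled from 2.6 kernel headers (assumptions, not copied
# from this thread): include/linux/kdev_t.h and include/linux/major.h.
MINORBITS = 20
UNNAMED_MAJOR = 0
FLOPPY_MAJOR = 2


def mkdev(major, minor):
    """Pack a major/minor pair the way the kernel's MKDEV macro does."""
    return (major << MINORBITS) | minor


def major_of(dev):
    return dev >> MINORBITS


def minor_of(dev):
    return dev & ((1 << MINORBITS) - 1)


ROOT_NFS = mkdev(UNNAMED_MAJOR, 255)  # what root=/dev/nfs resolves to (0xff)
ROOT_FD0 = mkdev(FLOPPY_MAJOR, 0)     # first floppy drive, /dev/fd0


def pick_root(root_dev, nfs_mount_ok):
    """Mimic the quoted #ifdef block: fall back to the floppy if NFS fails."""
    if major_of(root_dev) == UNNAMED_MAJOR and not nfs_mount_ok:
        return ROOT_FD0
    return root_dev


def describe(dev):
    """Format a device the way the panic message does."""
    return "unknown-block(%d,%d)" % (major_of(dev), minor_of(dev))


print(describe(pick_root(ROOT_NFS, nfs_mount_ok=False)))
```

In other words, the "unknown-block(2,0)" in the panic is just the floppy fallback after the NFS mount failed, not a sign that root=/dev/nfs was parsed incorrectly.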
Sorry I can't help with the main problem. |
|
Tzuriel Apprentice
Joined: 01 Jun 2004 Posts: 260
|
Posted: Thu Nov 02, 2006 5:48 pm Post subject: |
|
|
liber8ate wrote: | Ok, so your diskless system doesn't even seem to be contacting the master...
Your problem appears to lie right here: Code: | IP-Config: No network devices available. | This seems to tell me that you don't have the necessary network driver compiled into your kernel!
|
Great! OK, the driver was indeed the problem. I don't understand why, but I changed drivers and the boot got past that point.
Now my diskless node hangs at the following point while booting the new bzImage, which I compiled with the new Ethernet driver.
Code: |
...
* Mounting devpts at /dev/pts ... [ok]
* Remounting root filesystem read/write [ok]
* Updating modules.dep ... [ok]
FATAL: could not open '/System.map': No such file or directory [!!]
...
*Failed to set user font [!!] (I'm ignoring these errors for now)
*Starting eth0
* Bringing up eth0
* 192.168.2.10 [ok]
|
And then it just hangs. I'm also confused about why the IP shown is 192.168.2.10, which is my master. Shouldn't that be the IP of the node? Also, should there be a System.map file for a diskless node? Where should it go? The Gentoo diskless HOWTO doesn't mention anything about this. |
|
Tzuriel Apprentice
Joined: 01 Jun 2004 Posts: 260
|
Posted: Thu Nov 02, 2006 5:58 pm Post subject: |
|
|
Apologies, and thanks guys. I just had a leftover reference in /diskless/xxx/etc/conf.d/net that was wrong. My diskless node now boots, even though it reports a boatload of failures during startup. My biggest concern is the failure to open the System.map file. |
|
anonybosh Guru
Joined: 20 Nov 2005 Posts: 324
|
Posted: Thu Nov 02, 2006 6:04 pm Post subject: |
|
|
Thanks wynn for the clarification.
Quote: | How can that be if I'm already on the network and grabbing the bzImage from the master? If I wasn't on the network talking to the master, then I wouldn't have been able to grab the kernel and load it. | As far as I understand it, the process is:
1. Computer is turned on
2. BIOS runs its tests
3. BIOS looks for boot device (ethernet)
4. Following the PXE protocol, the BIOS uses the NIC (through the driver in its own PXE firmware) to acquire the next booting instructions.
5. The BIOS is told to download a kernel image and load it into memory.
--End direct BIOS control--
6. The kernel starts and initializes the drivers compiled into it.
7. The kernel looks for a root device (nfsroot).
8. The rest of the system is brought up via init scripts.
So basically, you are getting stuck at step #7 because the necessary driver is not loaded during step #6. The BIOS could reach the network in steps 4 and 5 because it has its own driver for the NIC; once the kernel takes over, it needs a driver of its own.
---
Just saw your post while editing this: Quote: | *Starting eth0
* Bringing up eth0
* 192.168.2.10 [ok] | You do not want init to start this! It screws the boot process up entirely. You will need to remove the net.eth0 from the 'boot'/'default' runlevel(s) Code: | # rc-update del net.eth0 | There was one other thing I had to edit, I am thinking that it was in the conf.d/rc file, but I cannot remember. I'll do some more looking and get back to you.
Edit:
Ok I found it; it was in the conf.d/rc file: Code: | RC_PLUG_SERVICES="!net.eth0" | . Hope this helps. |
|
PeterF n00b
Joined: 08 Feb 2004 Posts: 8 Location: GMT-6
|
Posted: Fri Nov 03, 2006 4:32 am Post subject: |
|
|
I started having a similar problem after upgrading a MythTV frontend diskless system from kernel 2.6.13 to 2.6.17. Use of udev was part of the change. liber8ate's suggestion to update the conf.d/rc file did the trick! System booting. Thanks! I hadn't considered events being triggered outside the "rc-update" mechanism.
Later, also found this suggestion at http://gentoo-wiki.com/HOWTO_Gentoo_Diskless_Install to modify the conf.d/net file to something like this:
Code: | config_eth0=( "noop" "192.168.1.2 netmask 255.255.255.248" ) |
Neither mechanism triggers the DNS update that was occurring before. I will create static entries if I cannot find another way.
Thanks again,
- Pete |
|