garythompson n00b
Joined: 28 Nov 2006 Posts: 52 Location: Brisbane, Australia
|
Posted: Mon Jun 07, 2010 9:45 am Post subject: Memtest type program for NIC |
|
|
Hi,
I suspect I have faulty hardware somewhere. My primary culprit is the network hardware (on board r8139) which is still under warranty.
Symptoms:
- Gnome, VirtualBox, Firefox and all desktop apps (metacity on xinerama) run without any crashes, hangs or freezes (except VirtualBox when the system is showing network instability; VirtualBox images are stored on the network)
- Taken from another thread of mine where I discovered this:
- SSH drop out during network or system loading with message noted above
- Garbled Web browsing (sporadic)
- Thunderbird failed to download mail with error ssl_error_bad_mac_read
- NTP failed to download (emerge) with Failed on RMD160 verification
- VirtualBox fails to start (loading saved state from network device)
I performed a reboot and everything is back to normal.
What tests can I run to confirm the health of my network hardware? I'm having difficulty reproducing the issue between reboots, and these symptoms started only in the last two weeks (the system was built two months ago from new parts). Also, I have not updated any packages on my system since the build. _________________ Life is for Living |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54808 Location: 56N 3W
|
Posted: Mon Jun 07, 2010 10:15 am Post subject: |
|
|
garythompson,
The services you mention use TCP/IP which does error checking and retransmissions if errors are found in packets.
Getting garbled data that passes those checks is possible but very unlikely. You would also notice the slowdown due to the retries.
Look at your ifconfig output for the error count.
Code: | $ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:24:8c:2a:63:85
inet addr:192.168.100.20 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15696 errors:0 dropped:0 overruns:0 frame:0
TX packets:15148 errors:0 dropped:0 overruns:0 carrier:2
collisions:0 txqueuelen:1000
RX bytes:10798634 (10.2 MiB) TX bytes:3452637 (3.2 MiB)
Interrupt:28 |
Try a new cable if you have errors there.
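If you want to check those counters repeatedly while the fault comes and goes, a small script can pull them out of the ifconfig text. This is only a rough sketch: it assumes the old net-tools output layout shown above, and the names `parse_counters` and `suspicious` are made up for illustration. The sample text is copied from the output above.

```python
import re

# Lines copied from the ifconfig output quoted above.
SAMPLE = """\
eth0 Link encap:Ethernet HWaddr 00:24:8c:2a:63:85
RX packets:15696 errors:0 dropped:0 overruns:0 frame:0
TX packets:15148 errors:0 dropped:0 overruns:0 carrier:2
collisions:0 txqueuelen:1000
RX bytes:10798634 (10.2 MiB) TX bytes:3452637 (3.2 MiB)
"""


def parse_counters(text):
    """Return {"RX": {...}, "TX": {...}} from old-style ifconfig output."""
    counters = {}
    for line in text.splitlines():
        line = line.strip()
        # Only the "RX packets:" / "TX packets:" lines carry the error counters.
        match = re.match(r"(RX|TX) packets:", line)
        if match:
            fields = re.findall(r"(\w+):(\d+)", line)
            counters[match.group(1)] = {name: int(value) for name, value in fields}
    return counters


def suspicious(counters):
    """List (direction, counter, value) for every non-zero counter except 'packets'."""
    return [
        (direction, name, value)
        for direction, fields in sorted(counters.items())
        for name, value in fields.items()
        if name != "packets" and value > 0
    ]


if __name__ == "__main__":
    for direction, name, value in suspicious(parse_counters(SAMPLE)):
        print(f"{direction} {name} = {value}")
```

Run it against fresh ifconfig output before and after the system starts misbehaving; counters that climb between the two runs are the ones worth chasing.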
If there are few or no errors there, it's further up the network stack ... maybe even your RAM, CPU or North Bridge chip.
memtest is a good check here. You must boot into it instead of the kernel, so it has full control of your hardware.
If it finds problems, it does not always mean it's a RAM issue. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
garythompson n00b
Joined: 28 Nov 2006 Posts: 52 Location: Brisbane, Australia
|
Posted: Tue Jun 08, 2010 8:34 am Post subject: |
|
|
NeddySeagoon,
Thanks for the reply. I checked ifconfig and found no errors reported on that side.
I ran Memtest and it reported no errors.
The only recent change made to my system is the addition of a second monitor and enabling of the Xinerama and composite extensions (as well as ADDARGBVisuals enabled on the screens for an nVidia card using proprietary drivers). I would expect these to have little, if any, effect on the symptoms I described above.
This is quite frustrating as I'm seeing instability with network, SSL and elsewhere but I can't reproduce the issue reliably (rebooting sometimes fixes it).
I can accept that a faulty motherboard may be the problem, and I would like to claim it under warranty if I can demonstrate the issue somehow.
Thanks again, any advice to help find this bug is greatly appreciated.
Downloading large packages continues to fail verification while the system is behaving unstably. I just tried emerging updated gentoo sources and received the following error:
Code: | (u'Failed on RMD160 verification', '12ec36864871c224531121a5072895c60f996777', u'b93742cbaf8174f2200d2dbef0d47a26c618039c')
!!! Fetched file: linux-2.6.32.tar.bz2 VERIFY FAILED!
!!! Reason: Failed on RMD160 verification
!!! Got: 12ec36864871c224531121a5072895c60f996777
!!! Expected: b93742cbaf8174f2200d2dbef0d47a26c618039c
|
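One way to narrow down a mismatch like this is to hash the file already on disk several times in a row. If the digest changes between runs, the corruption is probably happening while reading (RAM, disk, cabling) rather than during the download. The sketch below is an assumption-laden illustration, not what Portage does: it uses SHA-256 instead of RMD160 because `hashlib` does not guarantee RIPEMD-160 on every build, and the helper names `file_digest` and `digests_stable` are hypothetical.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digests_stable(path, rounds=3, algorithm="sha256"):
    """Hash the same file `rounds` times; True only if every digest is identical."""
    seen = {file_digest(path, algorithm) for _ in range(rounds)}
    return len(seen) == 1
```

Repeated reads may be served from the page cache, so a stable result does not prove the disk path is healthy; an unstable one, however, is hard to blame on the network.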
Edit: I rebooted and the package (from the same source) downloaded successfully. There's a bug in here somewhere! _________________ Life is for Living |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54808 Location: 56N 3W
|
Posted: Tue Jun 08, 2010 7:04 pm Post subject: |
|
|
garythompson,
Plug a NIC into a spare slot on the motherboard and demonstrate that using the plug-in NIC makes the problem go away.
It's only indirect evidence and does not rule out the kernel driver for the on board NIC.
You could also try a different kernel version. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
garythompson n00b
Joined: 28 Nov 2006 Posts: 52 Location: Brisbane, Australia
|
Posted: Wed Jun 09, 2010 3:10 am Post subject: |
|
|
Hi,
If only I had a PCI NIC! I've taken onboard network for granted.
I am updating my system first to see if the problem "goes away". Certainly, I can recreate the symptoms after a few hours of general usage, hopefully this will help me find out.
If it doesn't work then I'll start switching out hardware. I'll report back when/if things change.
Thanks for the help. _________________ Life is for Living |
|
|
|
dE_logics Advocate
Joined: 02 Jan 2009 Posts: 2289 Location: $TERM
|
Posted: Wed Jun 09, 2010 3:38 am Post subject: |
|
|
Try, as root:
ping -f www.yahoo.com
On Linux the flood option is -f (-t sets the TTL). If a large number of dots accumulates, packets are being lost, which points to faulty hardware or a bad ISP.
Get hold of a spare computer and connect the two machines directly through their LAN ports.
Set a static IP on both of them, then from your PC:
ping -f <the IP of the other PC> _________________ My blog |
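A long ping run ends with a summary line, and the loss figure there is easier to compare between runs than a screenful of output. The snippet below is a minimal sketch that assumes the usual "N packets transmitted, M received" wording; `packet_loss` is a hypothetical helper, and the sample line is invented for illustration, not taken from this thread.

```python
import re


def packet_loss(summary_line):
    """Return (transmitted, received, loss_percent) or None if the line does not match."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) (?:packets )?received", summary_line
    )
    if match is None:
        return None
    transmitted, received = int(match.group(1)), int(match.group(2))
    if transmitted == 0:
        return (0, received, 0.0)
    # Recompute the percentage from the counts instead of trusting the rounded figure.
    loss = 100.0 * (transmitted - received) / transmitted
    return (transmitted, received, loss)


if __name__ == "__main__":
    # Invented sample summary line, for illustration only.
    print(packet_loss("1000 packets transmitted, 990 received, 1% packet loss, time 12ms"))
```

Loss on the direct machine-to-machine link is the interesting number, since it takes the ISP and any switches out of the picture.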
|
|
|
garythompson n00b
Joined: 28 Nov 2006 Posts: 52 Location: Brisbane, Australia
|
Posted: Thu Jun 17, 2010 7:41 am Post subject: |
|
|
Pings are all fine, within 100ish ms.
On occasion some web pages loaded garbled (I realise TCP does checksums, which means the issue must be above the TCP layer on the client). Anyway, the problem seems to have been bypassed:
Summary
- This all started at https://forums.gentoo.org/viewtopic-p-6307498.html#6307498
- My system was stable, no changes at all made to config and it mysteriously went unstable, coinciding with making changes to a Samba server (so I thought initially that I broke the samba server)
- Problems only seem to occur in X Windows clients (terminal, nautilus, firefox)
- After killing X, I was able to move large amounts of data from my machine (samba client) to a samba server. I ran MD5 on everything and it all copied without a problem.
To Resolve:
- Wiped my computer
- Installed Fedora (wanted to try something new), broke Fedora following guides to set up the nVidia proprietary drivers, so I then installed Ubuntu (where Linux really started for me). I'm sad to say goodbye to Gentoo on my desktop; I've been using it for four years. I just don't have time to recompile my system
- System is working fine, operations which were regularly failing (save VM state in virtualbox to samba server) now work again
I'm stumped as to what happened. There were no configuration changes or system updates at all. Hardware is running stable now on a rebuild _________________ Life is for Living |
|
|
|
mbar Veteran
Joined: 19 Jan 2005 Posts: 1991 Location: Poland
|
Posted: Fri Jun 18, 2010 8:05 am Post subject: |
|
|
To me, your problems are consistent with a subtle RAM failure (maybe it overheated for a while) or a problem with a SATA cable (been there, done that). |
|