Gentoo Forums
Memtest type program for NIC
garythompson
n00b


Joined: 28 Nov 2006
Posts: 52
Location: Brisbane, Australia

Posted: Mon Jun 07, 2010 9:45 am    Post subject: Memtest type program for NIC

Hi,

I suspect I have faulty hardware somewhere. My primary suspect is the network hardware (onboard r8139), which is still under warranty.

Symptoms:
- Gnome, VirtualBox, Firefox and all desktop apps (metacity on xinerama) run without any crashes, hangs or freezes (except VirtualBox when the system is showing network instability - VirtualBox images are stored on the network)
- Taken from another thread of mine where I discovered this:
- SSH drops out under network or system load with message noted above
- Garbled web browsing (sporadic)
- Thunderbird failed to download mail with error ssl_error_bad_mac_read
- NTP failed to download (emerge) with Failed on RMD160 verification
- VirtualBox fails to start (loading saved state from network device)

I performed a reboot and everything is back to normal.

What tests can I run to confirm the health of my network hardware? I'm having difficulty reproducing the issue between reboots, and these symptoms only started in the last two weeks (the system was built two months ago from new parts). Also, I have not updated any packages on my system since the build two months ago.
_________________
Life is for Living
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54808
Location: 56N 3W

Posted: Mon Jun 07, 2010 10:15 am    Post subject:

garythompson,

The services you mention use TCP/IP, which does error checking and retransmits packets in which errors are found.
Getting garbled data that passes those checks is possible but very unlikely. You would also notice the slowdown due to the retries.

Look at your ifconfig output for the error count.
Code:
$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:24:8c:2a:63:85 
          inet addr:192.168.100.20  Bcast:192.168.100.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:15696 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15148 errors:0 dropped:0 overruns:0 carrier:2
          collisions:0 txqueuelen:1000
          RX bytes:10798634 (10.2 MiB)  TX bytes:3452637 (3.2 MiB)
          Interrupt:28


Try a new cable if you have errors there.
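
For checking those counters repeatedly while trying to reproduce the fault, here is a minimal sketch that pulls the RX/TX counters out of ifconfig-style text and lists the non-zero ones. The names parse_counters and suspicious are made up for illustration, and it assumes the older net-tools layout shown above; newer ifconfig versions format these lines differently, so the pattern would need adjusting.

```python
import re

# Sample text copied from the ifconfig output quoted above (old net-tools layout).
SAMPLE = """\
eth0      Link encap:Ethernet  HWaddr 00:24:8c:2a:63:85
          RX packets:15696 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15148 errors:0 dropped:0 overruns:0 carrier:2
          collisions:0 txqueuelen:1000
"""


def parse_counters(text):
    """Return {'RX': {...}, 'TX': {...}} with integer counters per direction."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(RX|TX) packets:(\d+)\s+(.*)", line)
        if not match:
            continue  # not a packet-counter line (e.g. the "RX bytes" line)
        direction, packets, rest = match.groups()
        fields = {"packets": int(packets)}
        for name, value in re.findall(r"(\w+):(\d+)", rest):
            fields[name] = int(value)
        counters[direction] = fields
    return counters


def suspicious(counters):
    """List (direction, counter, value) for each non-zero counter except packets."""
    return [
        (direction, name, value)
        for direction, fields in sorted(counters.items())
        for name, value in sorted(fields.items())
        if name != "packets" and value != 0
    ]


if __name__ == "__main__":
    print(suspicious(parse_counters(SAMPLE)))
```

Feeding it the live output instead of SAMPLE (for example via subprocess) before and after a bad spell would show whether any counter moves while the symptoms appear.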

If there are few or no errors there, it's further up the network stack ... maybe even your RAM, CPU or North Bridge chip.
memtest is a good check here. You must boot into it instead of the kernel, so it has full control of your hardware.
If it finds problems, it does not always mean it's a RAM issue.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
garythompson
n00b


Joined: 28 Nov 2006
Posts: 52
Location: Brisbane, Australia

Posted: Tue Jun 08, 2010 8:34 am    Post subject:

NeddySeagoon,

Thanks for the reply. I checked ifconfig and found no errors reported on that side.

I ran Memtest and it reported no errors.

The only recent change made to my system is the addition of a second monitor and the enabling of the Xinerama and composite extensions (as well as ADDARGBVisuals enabled on the screens for an nVidia card using proprietary drivers). I would expect these to have little, if any, effect on the symptoms I described above.

This is quite frustrating as I'm seeing instability with network, SSL and elsewhere but I can't repeat the issue reliably (rebooting sometimes fixes it).

I can accept that a faulty mobo may be the problem, and I would like to claim it under warranty if I can demonstrate the issue somehow.

Thanks again, any advice to help find this bug is greatly appreciated.

Downloading large packages continues to fail verification while the system is behaving unstably. I just tried emerging the updated gentoo sources and received the following error:

Code:
(u'Failed on RMD160 verification', '12ec36864871c224531121a5072895c60f996777', u'b93742cbaf8174f2200d2dbef0d47a26c618039c')
!!! Fetched file: linux-2.6.32.tar.bz2 VERIFY FAILED!
!!! Reason: Failed on RMD160 verification
!!! Got:      12ec36864871c224531121a5072895c60f996777
!!! Expected: b93742cbaf8174f2200d2dbef0d47a26c618039c


Edit: I rebooted and the package (from the same source) downloaded successfully. There's a bug in here somewhere!
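
One way to narrow this down while the system is in its bad state is to hash the already-downloaded file several times. This is only a rough sketch: file_digest, repeated_digests and verify are hypothetical helper names, and sha256 is used by default because hashlib may not offer ripemd160 on every build. If the digest changes between reads of the same file, the corruption is happening locally (RAM, disk or cache) and not on the wire. Note that later reads may be served from the page cache, so a stable result does not clear the disk.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash one file in chunks and return the hex digest."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def repeated_digests(path, rounds=5, algorithm="sha256"):
    """Hash the same file several times and return the set of distinct digests."""
    return {file_digest(path, algorithm) for _ in range(rounds)}


def verify(path, expected, rounds=5, algorithm="sha256"):
    """Return 'unstable', 'ok' or 'mismatch' for a file and its expected digest."""
    digests = repeated_digests(path, rounds, algorithm)
    if len(digests) > 1:
        return "unstable"  # same file, different hashes: local corruption
    return "ok" if digests == {expected.lower()} else "mismatch"


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python hashcheck.py <file> <expected hex digest>
    if len(sys.argv) == 3:
        print(verify(sys.argv[1], sys.argv[2]))
```

A steady "mismatch" would mean the file really did arrive (or get written) damaged, while "unstable" would point away from the NIC altogether.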
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54808
Location: 56N 3W

Posted: Tue Jun 08, 2010 7:04 pm    Post subject:

garythompson,

Plug a NIC into a spare slot on the motherboard and demonstrate that using the plug-in NIC makes the problem go away.
It's only indirect evidence and does not rule out the kernel driver for the onboard NIC.

You could also try a different kernel version.
garythompson
n00b


Joined: 28 Nov 2006
Posts: 52
Location: Brisbane, Australia

Posted: Wed Jun 09, 2010 3:10 am    Post subject:

Hi,

If only I had a PCI NIC! I've taken onboard network for granted.

I am updating my system first to see if the problem "goes away". I can certainly recreate the symptoms after a few hours of general usage, so hopefully this will help me find out.

If it doesn't work then I'll start switching out hardware. I'll report back when/if things change.

Thanks for the help.
dE_logics
Advocate


Joined: 02 Jan 2009
Posts: 2289
Location: $TERM

Posted: Wed Jun 09, 2010 3:38 am    Post subject:

Try -

ping -f www.yahoo.com as root.

If the number of dots produced is very high, then you have faulty hardware or a bad ISP.

Somehow get a spare computer and connect the two directly through their LAN cards.

Set a static IP on both of them, then from your PC -

ping -f <the IP of the other PC>
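
As a complement to the ping test, here is a rough "memtest for the NIC" sketch: it pushes random blocks through a TCP connection to an echo service and compares what comes back. The function names (recv_exact, serve_echo, run_client) are invented for this example, and it assumes you can run the echo side on the second PC. The demo at the bottom uses 127.0.0.1 only so that it runs anywhere; loopback traffic never touches the NIC, so a real test needs the other machine's IP.

```python
import os
import socket
import threading


def recv_exact(sock, n):
    """Read exactly n bytes from sock or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


def serve_echo(listener):
    """Accept one connection and echo everything back until the peer closes."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(data)


def run_client(host, port, rounds=100, block_size=16384):
    """Send random blocks, read them back, and count blocks that differ."""
    bad = 0
    with socket.create_connection((host, port)) as sock:
        for _ in range(rounds):
            block = os.urandom(block_size)
            sock.sendall(block)
            if recv_exact(sock, block_size) != block:
                bad += 1
    return bad


if __name__ == "__main__":
    # Loopback demo only: run serve_echo on the other PC for a real test.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_echo, args=(listener,), daemon=True)
    server.start()
    print("corrupted blocks:", run_client("127.0.0.1", port))
    server.join(timeout=5)
    listener.close()
```

Since TCP checksums each segment, wire errors should normally show up as retransmissions and slowness, not as mismatched blocks. A non-zero count here would therefore hint at corruption outside what the checksum protects, such as memory on either host.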
_________________
My blog
garythompson
n00b


Joined: 28 Nov 2006
Posts: 52
Location: Brisbane, Australia

Posted: Thu Jun 17, 2010 7:41 am    Post subject:

Pings are all fine, within 100ish ms.

On occasion some web pages loaded garbled (I realise TCP checksums its data, which means the issue must be above the TCP layer on the client). Anyway, the problem seems to have been bypassed:

Summary
- This all started at https://forums.gentoo.org/viewtopic-p-6307498.html#6307498
- My system was stable, with no changes at all made to the config, and it mysteriously went unstable, coinciding with changes I was making to a Samba server (so I initially thought that I had broken the Samba server)
- Problems only seem to occur in X Windows clients (terminal, nautilus, firefox)
- After killing X, I was able to move large amounts of data from my machine (samba client) to a samba server. I ran MD5 on everything and it all copied without a problem.
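
The "MD5 everything" check from the summary above can be sketched roughly as follows. This is not the exact procedure used in the post; md5_of, tree_digests and compare_trees are hypothetical names, and it assumes both the source directory and the copy (for example a mounted Samba share) are reachable as local paths.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of one file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path relative to root onto its MD5 digest."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = md5_of(full)
    return result


def compare_trees(source, copy):
    """Return sorted relative paths that are missing or differ between the trees."""
    src, dst = tree_digests(source), tree_digests(copy)
    return sorted(rel for rel in src.keys() | dst.keys() if src.get(rel) != dst.get(rel))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python treecheck.py <source dir> <copied dir>
    if len(sys.argv) == 3:
        for rel in compare_trees(sys.argv[1], sys.argv[2]):
            print("differs or missing:", rel)
```

Running it once with X up and once with X killed would give a like-for-like comparison of the two situations described.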

To Resolve:
- Wiped my computer
- Installed Fedora (wanted to try something new), broke Fedora following guides to set up the nVidia proprietary drivers, so I then installed Ubuntu (where Linux really started for me). I'm sad to say goodbye to Gentoo on my desktop, as I've been using it for four years. I just don't have time to recompile my system
- System is working fine, and operations which were regularly failing (saving VM state in VirtualBox to the samba server) now work again

I'm stumped as to what happened. There were no configuration changes or system updates at all. The hardware is running stably now on the rebuild.
mbar
Veteran


Joined: 19 Jan 2005
Posts: 1991
Location: Poland

Posted: Fri Jun 18, 2010 8:05 am    Post subject:

To me, your problems are consistent with a subtle RAM failure (maybe it overheated for a while) or a problem with a SATA cable (been there, done that :)).
All times are GMT