Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Posted: Sat Dec 19, 2015 6:50 pm Post subject: Computer freezes randomly.
i7-4770K.
The computer freezes randomly, usually when idle or under no load at all. It can re-emerge world and then freeze afterwards. There is no kernel panic, just a freeze; no input works at all.
My only clue is that kernels 4.3 to 4.4-git may be the cause, since the freezes started around the time I installed 4.3. But with gcc-5.3 I can't boot kernels older than 4.3.
WTF? What should I do to investigate?
Please help.
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
Posted: Sun Dec 20, 2015 6:57 am
Can you give some information about your hardware?
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
Posted: Sun Dec 20, 2015 11:43 am
And the power supply?
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
Posted: Sun Dec 20, 2015 2:07 pm
You can calculate the power consumption of your computer here:
http://www.bequiet.com/en/psucalculator
I think you can change the language easily. The power supply is oversized.
Can you post the hardware specifications a second time as a chart? The output of lshw and hwinfo is hard to read.
Last edited by Keruskerfuerst on Wed Dec 23, 2015 7:22 am; edited 1 time in total
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Posted: Sun Dec 20, 2015 2:35 pm
In short, the power supply is more than enough, as there is no discrete GPU or audio card, just the CPU, RAM, and 4 drives.
RAM - Corsair CMD16GX3M2A2400C10 x2
CPU - i7-4770K
SSD1 - root - M.2 - PCIe - PLEXTOR_PX-G256M6e - F2FS
SSD2 - SAMSUNG_MZ7WD240HAFV - F2FS
HDD1 - Hitachi_HUS724040ALE640 - XFS
HDD2 - Hitachi_HTS541010A9E680 - XFS
MB - ASUSTek Z97I-Plus
PS - Aerocool Strike-X 1100W
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54799 Location: 56N 3W
Posted: Sun Dec 20, 2015 3:27 pm
Perfect Gentleman,
Boot into memtest for a few cycles.
If it passes clean all the time, we have learned nothing.
If it returns random errors, it's probably not the RAM.
If you get the same errors at the same addresses, that's useful info, but it may not be the RAM either.
memtest uses a lot more than just your RAM.
It's not safe to assume everything else is OK because the tool is called memtest.
If you are overclocking - don't.
What does dmidecode tell us?
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
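The triage rules in the post above can be sketched in a few lines. This is only an illustration, assuming the failing addresses from each memtest pass are noted by hand; the function name `triage_memtest` and the input format (one set of failing addresses per pass) are hypothetical and not something memtest provides.

```python
def triage_memtest(passes):
    """Classify memtest results following the rules of thumb above.

    `passes` holds one set of failing addresses per completed pass
    (a hypothetical input format; memtest does not export this).
    """
    failing = [set(addrs) for addrs in passes if addrs]
    if not failing:
        # A clean run proves nothing: the fault may simply not have triggered.
        return "clean: nothing learned"
    # Addresses that failed in every pass that reported errors.
    repeated = set.intersection(*failing)
    if repeated and len(failing) > 1:
        return "repeatable: useful, but not necessarily the RAM"
    return "random: probably not the RAM itself"
```

A single failing pass is reported as "random" here, because one pass cannot show whether an address fails repeatedly.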
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Posted: Sun Dec 20, 2015 3:33 pm
dmidecode - https://bpaste.net/show/64acefe36f53
NeddySeagoon, I ran memtest for 1 cycle - no errors, and linpack (hpl) for an hour - no errors.
Googling suggests that it could be X.
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
Posted: Sun Dec 20, 2015 3:34 pm
Is there any relevant information in the log?
dmesg or any other log?
Run Memtest: check output
You can replace the parts in your computer: one part after another.
I would replace the parts in the following order:
1. CPU
2. Mainboard
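For the log check suggested above, a minimal sketch, assuming a persistent log from before the freeze is available (the in-memory dmesg buffer is lost on reboot). The keyword list and the helper name `suspicious_lines` are assumptions for illustration, not an exhaustive set of kernel messages.

```python
import re

# Substrings that often accompany hardware trouble or lockups in kernel logs.
# This list is an assumption for illustration; extend it for your system.
PATTERNS = re.compile(
    r"hardware error|machine check|soft lockup|hung_task|call trace|oops",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return the log lines that match any of the patterns above."""
    return [line.rstrip("\n") for line in lines if PATTERNS.search(line)]
```

It can be fed an open file object for whatever file your system logger writes, for example `suspicious_lines(open(path, errors="replace"))`, where `path` depends on your logger configuration. An empty result does not rule out hardware trouble, since a hard freeze often leaves nothing in the log.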
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Posted: Sun Dec 20, 2015 4:00 pm
Quote: Is there any relevant information in the log?
no
Quote: dmesg or any other log?
no errors or failures
Quote: Run Memtest: check output
I'll run it tomorrow
Quote: You can replace the parts in your computer: one part after another.
that's impossible, I have no spare parts for those
Last edited by Perfect Gentleman on Sun Dec 20, 2015 4:03 pm; edited 1 time in total
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54799 Location: 56N 3W
Posted: Sun Dec 20, 2015 4:03 pm
Perfect Gentleman,
The only vaguely interesting part of dmidecode is
Code:
Memory Device
Array Handle: 0x003C
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 8192 MB
Form Factor: DIMM
Set: None
Locator: DIMM_A1
Bank Locator: BANK 0
Type: DDR3
Type Detail: Synchronous
Speed: 2400 MHz
Manufacturer: 0215
Serial Number: 00000000
Asset Tag: 9876543210
Part Number: CMD16GX3M2A2400C10
Rank: 2
Configured Clock Speed: 2400 MHz
Minimum Voltage: 1.5 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.65 V
Your RAM is rated for 1.5 V operation but you have it set to 1.65 V.
I suspect it's both overclocked and overvolted.
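The voltage comparison in the post above can be automated. A minimal sketch, assuming the text output of `dmidecode --type memory` has been captured as a string; the function name `overvolted_modules` is hypothetical, and the field names are the ones shown in the listing above.

```python
def overvolted_modules(dmidecode_text):
    """Return (locator, configured_V, maximum_V) for overvolted modules.

    Parses "Memory Device" records from dmidecode text output and compares
    "Configured Voltage" against "Maximum Voltage".
    """
    results = []
    # dmidecode separates records with blank lines.
    for block in dmidecode_text.split("\n\n"):
        if "Memory Device" not in block:
            continue
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        try:
            configured = float(fields["Configured Voltage"].split()[0])
            maximum = float(fields["Maximum Voltage"].split()[0])
        except (KeyError, ValueError, IndexError):
            continue  # empty slot, or voltage not reported (e.g. "Unknown")
        if configured > maximum:
            results.append((fields.get("Locator", "?"), configured, maximum))
    return results
```

For the record quoted above this would return `[("DIMM_A1", 1.65, 1.5)]`. Whether the firmware fills in these fields accurately varies by board, so treat the result as a hint rather than a measurement.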
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Posted: Sun Dec 20, 2015 4:05 pm
NeddySeagoon, I've used the XMP profile for a long time, and there were no problems with it.
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54799 Location: 56N 3W
Posted: Sun Dec 20, 2015 4:21 pm
Perfect Gentleman,
That's rather like the man who jumped off a tall building ...
As he passed the 13th floor he was heard to say ... so far, so good.
You have had no problems ... yet.
A 10% overvolt is a lot. That's getting to the point where permanent damage can be expected.
It's not just the RAM either; the CPU's interface to the RAM is overvolted too.
10% on voltage is 21% on power at the same clock speed, since power scales with the square of the voltage. The on-board PSU (and the metal box PSU) has to provide the extra power and still keep the ripple within limits.
21% is a lot. Power is also proportional to clock speed, so if you increase the clock speed, that 21% over power goes up too.
If you want to deliberately operate your equipment outside its specified limits, that's fine.
You get to keep the pieces when you let the magic smoke out.
For debugging, do not overclock, not even accidentally.
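The 21% figure in the post above follows from dynamic power scaling roughly with the square of the voltage times the clock frequency. A small sketch of that arithmetic, assuming this simplified model; the function name `relative_power` and the 1600 MHz baseline in the second example are assumptions for illustration, not values from the thread.

```python
def relative_power(v_new, v_ref, f_new=1.0, f_ref=1.0):
    """Dynamic power relative to a reference, using P ~ V^2 * f.

    This is a simplified model; it ignores leakage and other static losses.
    """
    return (v_new / v_ref) ** 2 * (f_new / f_ref)


# 1.65 V instead of 1.5 V at the same clock: about 21% more power.
print(f"{relative_power(1.65, 1.5):.2f}")

# Hypothetical: the same overvolt plus 2400 MHz instead of an assumed 1600 MHz.
print(f"{relative_power(1.65, 1.5, 2400, 1600):.3f}")
```

The second call only illustrates how the voltage and clock factors multiply; the real baseline clock for this kit would have to come from its non-XMP profile.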
szatox Advocate
Joined: 27 Aug 2013 Posts: 3489
Posted: Sun Dec 20, 2015 5:28 pm
I had a similar problem and I believe it to be gone now, though with randomness one can never be sure.
What I changed was removing some unused hardware drivers from the kernel. They used to be built as modules, but I eventually removed them completely. In theory that should not matter at all; however, my system hasn't frozen even once since, and it's been a few weeks already. So I am not sure it helped, but it's promising enough that you may want to try it. (Drop modules for hardware you don't have.)
Don't go after all the hints at once, though. You'd lose track of them and never figure out what the cause was. Try resetting performance adjustments to factory defaults first; overclocking is known to be dangerous.
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
Posted: Sun Dec 20, 2015 5:44 pm
Can you test another OS?
SystemRescueCd or Ubuntu in live mode?
Windows, Solaris, or OpenIndiana?
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Posted: Mon Dec 21, 2015 7:18 am
@NeddySeagoon, I would agree that memory can cause it, but then there would be more hangs during merging, which there aren't; it only hangs when idle.
This memory was sold as overclocked.
@szatox, I removed unnecessary modules a long time ago; there are only modules for the hardware I have.
@Keruskerfuerst, I only have Gentoo installed. A LiveCD is an option, but I wouldn't like to use it because I need the software on my installed system.
P.S. I rebuilt modules and x11-modules yesterday, and no freezing so far.
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54799 Location: 56N 3W
Posted: Mon Dec 21, 2015 9:32 am
Perfect Gentleman,
Perfect Gentleman wrote: This memory was sold as overclocked.
I know that. Nobody makes 2400 memory.
It was tested in a test rig and passed in the test rig. It's rather like the way running benchmarks says very little about real-world performance.
The memory was not tested with your PSU, your motherboard, or your CPU.
It can still be the memory subsystem, which includes all the bits above, depending on how RAM is dynamically allocated.
The CPU switching in and out of idle stresses the motherboard PSU more than a constant load does. That induces large transients in all the CPU-related voltages.
I'm quite confident that, as you say, it's not the RAM; it's the system-level tolerance build-up due to the way the system is being asked to perform.
If rebuilding software appears to fix it, that points to an error in the original build.
The discussion above still holds good, as you have no idea if all these emerges produced correct output. It's not even ECC RAM, so errors (at any time) always go undetected.
-- edit --
I have a bridge for sale, in the middle of London :)
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Posted: Mon Dec 21, 2015 10:52 am
NeddySeagoon, there is a point that overclocked memory can produce builds with errors, but then all my applications would segfault or hang without reason. And there are none of those symptoms.
steveL Watchman
Joined: 13 Sep 2006 Posts: 5153 Location: The Peanut Gallery
Posted: Mon Dec 21, 2015 1:47 pm
Is it really so hard to clock the voltage back down for a week or two, to see how it goes?
Troubleshooting: are you using systemdbust?
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54799 Location: 56N 3W
Posted: Mon Dec 21, 2015 1:53 pm
Perfect Gentleman,
That's a vast oversimplification.
What about all the errors that don't lead to applications or the system stopping?
Applications still operate but produce incorrect results.
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1256
Posted: Mon Dec 21, 2015 2:18 pm
@steveL, of course not. I've already minimized overclocking, only XMP. No systemd.
@NeddySeagoon, that can be.
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54799 Location: 56N 3W
Posted: Mon Dec 21, 2015 2:22 pm
Perfect Gentleman,
XMP is still overclocking. Turn it off.
Logicien Veteran
Joined: 16 Sep 2005 Posts: 1555 Location: Montréal
Posted: Mon Dec 21, 2015 2:23 pm
The man who jumped off a tall building was not superstitious. He didn't sense any bad omen even as he was passing the 13th floor.
_________________
Paul
steveL Watchman
Joined: 13 Sep 2006 Posts: 5153 Location: The Peanut Gallery
Posted: Mon Dec 21, 2015 4:08 pm
No "of course not" about it; that was the third post in a row, where you were arguing the toss with Neddy.
I realise this is meta-discussion, and you just want to get on and make your machine work again. Do so, by all means.
It just comes up a lot, on IRC especially, where people would rather puzzle out an argument in their heads, than attempt to do what the people they've come to for advice, suggest.
This is frustrating because the only satisfaction helpers get is from knowing that you made progress (and usually they'd like follow-up to know how, since they've already given over headspace to the discussion.)
The common example on IRC is people arguing about the fact that they don't need to quote "$1" or "$foo", because in this call they know the value is "safe".
Arguing with them is a travail that leads to support-burnout, which is why bots are so handy.
If there isn't a term for that stubbornness in the face of one's chosen helpers, there needs to be.
Sorry if I'm over-reacting to you specifically; it's not really about you. We all understand the need to know what went wrong.
The trouble is it's not very interesting for others to explore borked thinking, which far too often the ramblers end up going over afterwards in conversation, rather than on reflection.
That's sometimes appropriate for real-world conversation, but very rarely for textual communication, ime.
Again, please don't take this personally; I'm just chatting, more about IRC than the forums.
Good luck with it. :-)