Gentoo Forums
Computer freezes randomly.
Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Sat Dec 19, 2015 6:50 pm    Post subject: Computer freezes randomly.

i7-4770K.
The computer freezes randomly, usually when idle or under no load at all. It can re-emerge world without trouble and then freeze afterwards. There is no kernel panic, just a hard freeze; no input works at all.
My only clue is that kernels 4.3 through 4.4-git may be the cause, as the freezes started around the time I installed 4.3. But with gcc-5.3 I can't boot kernels older than 4.3.
WTF? What should I investigate?
Please help.

Keruskerfuerst
Advocate

Joined: 01 Feb 2006
Posts: 2289
Location: near Augsburg, Germany

Posted: Sun Dec 20, 2015 6:57 am

Can you post some information about your hardware?

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Sun Dec 20, 2015 8:37 am

lshw - https://bpaste.net/show/da2a79211ac2
hwinfo - https://bpaste.net/show/38cbd10554e5

Keruskerfuerst
Advocate

Joined: 01 Feb 2006
Posts: 2289
Location: near Augsburg, Germany

Posted: Sun Dec 20, 2015 11:43 am

And the power supply?

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Sun Dec 20, 2015 1:02 pm

http://www.pc-specs.com/psu/Aerocool/Aerocool_Strike-X_1100W/1695

Keruskerfuerst
Advocate

Joined: 01 Feb 2006
Posts: 2289
Location: near Augsburg, Germany

Posted: Sun Dec 20, 2015 2:07 pm

You can calculate the power consumption of your computer here:

http://www.bequiet.com/en/psucalculator

I think you can change the language easily. The power supply is oversized.

Can you post the hardware specifications a second time as a list, since the output of lshw and hwinfo is hard to read?


Last edited by Keruskerfuerst on Wed Dec 23, 2015 7:22 am; edited 1 time in total

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Sun Dec 20, 2015 2:35 pm

In short, the power supply is more than enough, as there is no discrete GPU or sound card, just the CPU, RAM and 4 drives.
RAM - Corsair CMD16GX3M2A2400C10 x2
CPU - i7-4770K
SSD1 - root - M.2- PCIe - PLEXTOR_PX-G256M6e - F2FS
SSD2 - SAMSUNG_MZ7WD240HAFV - F2FS
HDD1 - Hitachi_HUS724040ALE640 - XFS
HDD2 - Hitachi_HTS541010A9E680 - XFS
MB - ASUSTeK Z97I-PLUS
PSU - Aerocool Strike-X 1100W

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54799
Location: 56N 3W

Posted: Sun Dec 20, 2015 3:27 pm

Perfect Gentleman,

Boot into memtest for a few cycles.

If it passes clean all the time we have learned nothing.
If it returns random errors, it's probably not the RAM.
If you get the same errors at the same addresses, that's useful info but it may not be RAM either.

memtest uses a lot more than just your RAM.
It's not safe to assume everything else is OK because the tool is called memtest.

If you are overclocking - don't.

What does dmidecode tell us?
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Sun Dec 20, 2015 3:33 pm

dmidecode - https://bpaste.net/show/64acefe36f53
NeddySeagoon, I ran memtest for 1 cycle with no errors, and linpack (hpl) for an hour with no errors.
Googling suggests that it could be X.

Keruskerfuerst
Advocate

Joined: 01 Feb 2006
Posts: 2289
Location: near Augsburg, Germany

Posted: Sun Dec 20, 2015 3:34 pm

Is there any relevant information in the log?

dmesg or any other log?

Run Memtest: check output

You can replace the parts in your computer: one part after another.

I would replace the parts in the following order:
1. CPU
2. Mainboard

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Sun Dec 20, 2015 4:00 pm

Quote:
Is there any relevant information in the log?

No.
Quote:
dmesg or any other log?

No errors or failures.
Quote:
Run Memtest: check output

I'll run it tomorrow.
Quote:
You can replace the parts in your computer: one part after another.

That's impossible; I have no spare parts for those.


Last edited by Perfect Gentleman on Sun Dec 20, 2015 4:03 pm; edited 1 time in total

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54799
Location: 56N 3W

Posted: Sun Dec 20, 2015 4:03 pm

Perfect Gentleman,

The only vaguely interesting part of dmidecode is
Code:
Memory Device
   Array Handle: 0x003C
   Error Information Handle: Not Provided
   Total Width: 64 bits
   Data Width: 64 bits
   Size: 8192 MB
   Form Factor: DIMM
   Set: None
   Locator: DIMM_A1
   Bank Locator: BANK 0
   Type: DDR3
   Type Detail: Synchronous
   Speed: 2400 MHz
   Manufacturer: 0215
   Serial Number: 00000000
   Asset Tag: 9876543210
   Part Number: CMD16GX3M2A2400C10
   Rank: 2
   Configured Clock Speed: 2400 MHz
   Minimum Voltage: 1.5 V
   Maximum Voltage: 1.5 V
   Configured Voltage: 1.65 V


Your RAM is rated for 1.5 V operation but you have it set to 1.65 V.
I suspect it's both overclocked and overvolted.
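
The same comparison can be scripted. What follows is only a minimal sketch in Python, assuming the text comes from `dmidecode --type memory` and has the same field layout as the excerpt above. The function name `find_overvolted` and the trimmed `SAMPLE` string are illustrative and not part of any tool.

```python
def find_overvolted(dmidecode_text):
    """Return (locator, configured_v, maximum_v) for each DIMM whose
    configured voltage exceeds the maximum voltage it reports."""
    flagged = []
    # Assumption: every "Memory Device" heading starts one DIMM slot section.
    for section in dmidecode_text.split("Memory Device")[1:]:
        fields = {}
        for line in section.splitlines():
            key, sep, value = line.strip().partition(": ")
            if sep:
                fields[key] = value
        try:
            configured = float(fields["Configured Voltage"].split()[0])
            maximum = float(fields["Maximum Voltage"].split()[0])
        except (KeyError, ValueError, IndexError):
            continue  # empty slot, or the firmware did not report a voltage
        if configured > maximum:
            flagged.append((fields.get("Locator", "?"), configured, maximum))
    return flagged


# Trimmed copy of the section quoted above, used as sample input.
SAMPLE = """Memory Device
   Locator: DIMM_A1
   Minimum Voltage: 1.5 V
   Maximum Voltage: 1.5 V
   Configured Voltage: 1.65 V
"""

if __name__ == "__main__":
    for locator, configured, maximum in find_overvolted(SAMPLE):
        print(f"{locator}: configured {configured} V, rated maximum {maximum} V")
```

Slots that report "Unknown" or have no voltage fields are skipped rather than guessed at, so an empty result only means nothing was flagged, not that the settings are within spec.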

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Sun Dec 20, 2015 4:05 pm

NeddySeagoon, I've used XMP-profile for a long time, there were no problems with that.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54799
Location: 56N 3W

Posted: Sun Dec 20, 2015 4:21 pm

Perfect Gentleman,

That's rather like the man who jumped off a tall building ...
As he passed the 13th floor he was heard to say ... so far, so good.

You have had no problems ... yet.
A 10% overvolt is a lot. That's getting to the point where permanent damage can be expected.
It's not just the RAM either; the CPU's interface to the RAM is overvolted too.
10% on voltage is 21% on power at the same clock speed, since power scales with the square of the voltage. The on-board PSU (and the metal-box PSU) has to provide the extra power and still keep the ripple within limits.
21% is a lot. Power is proportional to clock speed, so if you increase the clock speed, that 21% overpower goes up too.
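
The arithmetic above matches the usual first-order model of dynamic power in CMOS logic, P ∝ V² · f. A minimal sketch in Python, assuming that model (it ignores leakage and is not a measurement of this system) and using the 1.5 V rated and 1.65 V configured values from the dmidecode output earlier in the thread; the function name `relative_power` is illustrative.

```python
def relative_power(v_actual, v_rated, clock_ratio=1.0):
    """Dynamic power relative to rated conditions, assuming P ~ V^2 * f."""
    return (v_actual / v_rated) ** 2 * clock_ratio


if __name__ == "__main__":
    # Same clock speed, 1.65 V instead of the rated 1.5 V.
    extra = relative_power(1.65, 1.5) - 1.0
    print(f"Extra power at the same clock: {extra:.0%}")

    # Hypothetical case: the same overvolt plus a 10% higher clock.
    extra_oc = relative_power(1.65, 1.5, clock_ratio=1.10) - 1.0
    print(f"Extra power with a 10% higher clock: {extra_oc:.0%}")
```

Under this model the first case comes out at 21% and the second at about 33%, which is why the extra power grows once the clock goes up as well.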

If you want to deliberately operate your equipment outside its specified limits, that's fine.
You get to keep the pieces when you let the magic smoke out.

For debugging, do not over clock, not even accidentally.

szatox
Advocate

Joined: 27 Aug 2013
Posts: 3489

Posted: Sun Dec 20, 2015 5:28 pm

I had a similar problem and I believe it to be gone now, though with randomness one can never be sure.
The thing I changed was removing some unused hardware drivers from the kernel. They used to be built as modules, but I eventually removed them completely. In theory that should not matter at all; however, my system hasn't frozen even once since, and it's been a few weeks already. So I am not sure it helped, but it's promising enough that you may want to try it. (Drop modules for hardware you don't have.)

Don't go after all the hints at once, though. You'd lose track of them and never figure out which one it was. Try resetting performance adjustments to factory defaults first; overclocking is known to be dangerous.

Keruskerfuerst
Advocate

Joined: 01 Feb 2006
Posts: 2289
Location: near Augsburg, Germany

Posted: Sun Dec 20, 2015 5:44 pm

Can you test another OS?
SystemRescueCd or Ubuntu in live mode?
Windows, Solaris or OpenIndiana?

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Mon Dec 21, 2015 7:18 am

@NeddySeagoon, I would agree that memory can cause it, but then there would be more hangs during merging, which is not the case; it hangs only when idle.
This memory was sold as overclocked.

@szatox, I removed unnecessary modules a long time ago; only modules for the hardware I have are left.

@Keruskerfuerst, I only have Gentoo installed. A LiveCD is an option, but I would rather not use it because I need the software on my installed system.

P.S. I rebuilt modules and x11-modules yesterday, and there has been no freezing so far.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54799
Location: 56N 3W

Posted: Mon Dec 21, 2015 9:32 am

Perfect Gentleman,

Perfect Gentleman wrote:
This memory was sold as overclocked.

I know that. Nobody makes 2400 memory.

It was tested in a test rig and passed in the test rig. It's rather like how running benchmarks says very little about real-world performance.
The memory was not tested with your PSU, your motherboard nor your CPU.

It can still be the memory subsystem, which includes all the bits above, depending on how RAM is dynamically allocated.
The CPU switching in and out of idle stresses the motherboard PSU more than a constant load. That induces large transients in all the CPU related voltages.

I'm quite confident that, as you say, it's not the RAM; it's the system-level tolerance build-up due to the way the system is being asked to perform.

If rebuilding software appears to fix it, that points to an error in the original build.
The discussion above still holds good, as you have no idea whether all these emerges produced correct output. It's not even ECC RAM, so errors (at any time) always go undetected.

-- edit --

I have a bridge for sale, in the middle of London :)

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Mon Dec 21, 2015 10:52 am

NeddySeagoon, I take the point that overclocked memory can produce builds with errors, but then all my applications would segfault or hang without reason, and there are none of those symptoms.

steveL
Watchman

Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Mon Dec 21, 2015 1:47 pm

Is it really so hard to clock the voltage back down for a week or two, to see how it goes?

Troubleshooting: are you using systemdbust?

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54799
Location: 56N 3W

Posted: Mon Dec 21, 2015 1:53 pm

Perfect Gentleman,

That's a vast oversimplification.

What about all the errors that don't lead to applications or the system stopping?
Applications still operate but produce incorrect results.

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1256

Posted: Mon Dec 21, 2015 2:18 pm

@steveL, of course not. I've already minimized overclocking; only XMP is left. No systemd.

@NeddySeagoon, that can be.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54799
Location: 56N 3W

Posted: Mon Dec 21, 2015 2:22 pm

Perfect Gentleman,

XMP is still overclocking. Turn it off.

Logicien
Veteran

Joined: 16 Sep 2005
Posts: 1555
Location: Montréal

Posted: Mon Dec 21, 2015 2:23 pm

The man who jumped off a tall building was not superstitious. He didn't sense any bad omen even though he was passing the 13th floor.

:D
_________________
Paul

steveL
Watchman

Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Mon Dec 21, 2015 4:08 pm

No "of course not" about it; that was the third post in a row, where you were arguing the toss with Neddy.

I realise this is meta-discussion, and you just want to get on and make your machine work again. Do so, by all means.

It just comes up a lot, on IRC especially, where people would rather puzzle out an argument in their heads than attempt to do what the people they've come to for advice suggest.
This is frustrating because the only satisfaction helpers get is from knowing that you made progress (and usually they'd like follow-up to know how, since they've already given over headspace to the discussion.)

The common example on IRC is people arguing about the fact that they don't need to quote "$1" or "$foo", because in this call they know the value is "safe".
Arguing with them is a travail that leads to support-burnout, which is why bots are so handy.

If there isn't a term for that stubbornness in the face of one's chosen helpers, there needs to be.

Sorry if I'm over-reacting to you specifically; it's not really about you. We all understand the need to know what went wrong.

The trouble is it's not very interesting for others to explore borked thinking, which far too often the ramblers end up going over afterwards in conversation, rather than on reflection.
That's sometimes appropriate for real-world conversation, but very rarely for textual communication, ime.

Again, please don't take this personally; I'm just chatting, more about IRC than the forums.

Good luck with it. :-)

All times are GMT
Page 1 of 2