jesnow l33t
Joined: 26 Apr 2006 Posts: 888
Posted: Tue Apr 30, 2024 5:37 pm Post subject: wireguard stopped working (again) [never solved] |
Yet again, wireguard fails silently. At least the debug chain is less frustrating because I know it's probably wireguard's fault.
I sit down at my computer, and my remote mounts aren't available. OK. So I try to ssh through the wg tunnel: nothing, wg is broken. So when did this happen? Four days ago, around midnight. Makes sense, I haven't used the machine since then (WFH). My previous experiences with this problem (two debug sessions) are here:
https://forums.gentoo.org/viewtopic-t-1167259-highlight-.html
https://forums.gentoo.org/viewtopic-t-1166203-highlight-.html
Only this time I haven't rebuilt the kernel. It just stopped working four days ago. I can see this from the output of wg (latest handshake) and from this:
Code: |
vanaert jesnow # grep nfs: /var/log/messages
Apr 26 00:55:20 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 00:55:49 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 00:57:48 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 00:58:19 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 00:59:46 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 00:59:46 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 01:01:59 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 01:01:59 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 01:04:12 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 01:06:26 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 03:57:16 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 03:59:29 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 03:59:29 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 12:51:58 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 12:51:58 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
Apr 26 12:54:11 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
|
NFS is thankfully doing me the favor of telling me that it had a connection that suddenly stopped responding.
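The grep output above answers the "when did this happen?" question by eye. As a minimal sketch, the same lookup could be scripted. The function name `first_nfs_timeout` and its `year` parameter are made up for illustration; the sketch assumes the traditional syslog line format shown above (which carries no year) and English month abbreviations.

```python
import re
from datetime import datetime

# Matches traditional syslog lines such as:
#   Apr 26 00:55:20 vanaert kernel: nfs: server 10.0.17.1 not responding, still trying
NFS_TIMEOUT = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"nfs: server (?P<server>\S+) not responding"
)


def first_nfs_timeout(lines, year):
    """Return (datetime, server) for the earliest NFS timeout line, or None.

    ASSUMPTION: the log timestamps have no year, so the caller supplies one.
    """
    hits = []
    for line in lines:
        m = NFS_TIMEOUT.match(line)
        if m:
            stamp = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
            hits.append((stamp, m["server"]))
    # Log files are normally in time order, but min() does not rely on that.
    return min(hits) if hits else None
```

Fed the lines of /var/log/messages (for example via `open("/var/log/messages", errors="replace")`) and the year 2024, this would report the first moment the NFS server at 10.0.17.1 stopped answering.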
I restart nfs and wireguard. Nothing. But also no error messages, just nothing. I really wish that when either of them is told there should be an interface or a remote mount, and there is nothing there, it would treat that as a problem and write to the syslog. I get *endless* chatter in /var/log/messages from dbus, networkmanager, and other super chatty daemons, to the point where it's hard to scan the logs for problems because there's so much junk filling them up.
But I can't connect to my remotes, because the wireguard interface is dead and wireguard says "OK, this is fine". So I get no indication that it's not working, and no way of finding out why.
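A minimal sketch of a watchdog that would put such a failure into the syslog, assuming the interface is wg0 and that the script runs as root (for example from cron). It relies on `wg show wg0 latest-handshakes`, which prints one line per peer with the public key and the Unix time of the last completed handshake (0 if there has never been one). The function names and the 300-second threshold are arbitrary choices for illustration, and the check only makes sense for a peer that carries regular traffic or has PersistentKeepalive set, since an idle tunnel legitimately stops handshaking.

```python
import subprocess
import syslog
import time


def stale_peers(output, now, max_age=300):
    """Return public keys whose last handshake is older than max_age seconds.

    `output` is the text printed by `wg show <interface> latest-handshakes`.
    ASSUMPTION: max_age=300 is an arbitrary threshold, not a WireGuard constant.
    """
    stale = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        key, stamp = fields[0], int(fields[1])
        # A timestamp of 0 means no handshake has ever completed.
        if stamp == 0 or now - stamp > max_age:
            stale.append(key)
    return stale


def check_interface(interface="wg0"):
    """Write one syslog warning per peer without a recent handshake."""
    out = subprocess.run(
        ["wg", "show", interface, "latest-handshakes"],
        capture_output=True, text=True, check=True,
    ).stdout
    for key in stale_peers(out, int(time.time())):
        syslog.syslog(
            syslog.LOG_WARNING,
            f"wireguard {interface}: no recent handshake with peer {key}",
        )
```

Calling `check_interface()` every few minutes would leave a greppable line in /var/log/messages as soon as the tunnel goes quiet, instead of the failure only showing up when a mount hangs.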
So, as in the previous threads, I re-installed net-vpn/wireguard-tools (this rebuilds the key signatures from wg0.conf) and rebooted. Nothing.
Anybody with real insight into this problem please let me know. I'm finding it frustrating how often wireguard fails silently on me. Silent fail is the worst. I have other work to do that isn't this, and I'm glad I have a backup machine.
Update:
With debugging turned on, it's the same symptoms as last time.
Code: |
Apr 30 19:39:34 vanaert kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 5 seconds, retrying (try 6)
Apr 30 19:39:34 vanaert kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Apr 30 19:39:39 vanaert kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 5 seconds, retrying (try 7)
Apr 30 19:39:39 vanaert kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Apr 30 19:39:44 vanaert kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 5 seconds, retrying (try 8)
Apr 30 19:39:44 vanaert kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Apr 30 19:39:50 vanaert kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 5 seconds, retrying (try 9)
Apr 30 19:39:50 vanaert kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Apr 30 19:39:55 vanaert kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 5 seconds, retrying (try 2)
Apr 30 19:39:55 vanaert kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Apr 30 19:40:00 vanaert kernel: wireguard: wg0: Handshake for peer 1 (104.176.81.55:51820) did not complete after 5 seconds, retrying (try 3)
Apr 30 19:40:00 vanaert kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
Apr 30 19:40:05 vanaert kernel: wireguard: wg0: Sending handshake initiation to peer 1 (104.176.81.55:51820)
...
Apr 30 19:40:52 vanaert mount[5447]: mount to NFS server '10.0.17.1' failed: timed out, retrying
|
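The debug output above can be condensed into three numbers: initiations sent, retries, and responses received. A minimal sketch follows; the first two patterns are taken from the log lines shown above, while the "Receiving handshake response" wording is an assumption about what a successful reply looks like in the debug log, since no such line appears in the excerpt. The function name is made up for illustration.

```python
import re

SENT = re.compile(r"wireguard: (\S+): Sending handshake initiation to peer (\d+)")
RETRY = re.compile(r"wireguard: (\S+): Handshake for peer (\d+) \(\S+\) did not complete")
# ASSUMPTION: wording of the success message; it does not occur in the excerpt.
RESPONSE = re.compile(r"wireguard: (\S+): Receiving handshake response from peer (\d+)")


def summarize_handshakes(lines):
    """Count handshake initiations, retries, and responses in kernel log lines."""
    counts = {"sent": 0, "retries": 0, "responses": 0}
    for line in lines:
        if SENT.search(line):
            counts["sent"] += 1
        elif RETRY.search(line):
            counts["retries"] += 1
        elif RESPONSE.search(line):
            counts["responses"] += 1
    return counts
```

A result with many initiations and zero responses means the initiations are either not reaching the peer or the peer is not answering them. The local log alone cannot tell the two cases apart.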
Ping obviously does not work either, nor does ssh through the non-existent tunnel.
But regular ssh login to that server works fine:
Code: |
jesnow@vanaert ~ $ ssh merckx
Last login: Tue Apr 30 19:45:46 2024 from 130.39.188.91
jesnow@merckx ~ $
|
This shows that the route between the two machines is fine end to end, at least for ssh over TCP; only wireguard has, yet again, failed silently. Last time it was a problem with the keys needing to be ported between the old and new kernel. There are multiple ways to do this (see the chain of posts), and I tried them all. The system is up to date. My backup system (nearly identical setup, just much slower) is working fine: wg, nfs, the works.
Kind of at a loss.
[another update] I have a workaround that doesn't use wireguard: I can ssh into my server, start a forward tunnel, and connect to samba through it. It's about as fast, but I'm still at a loss as to why wireguard would simply, and without any error message, refuse to work on this machine.
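A minimal sketch of how such a forward tunnel could be started from a script. The host name merckx comes from the ssh session above; the local port 1445 and both function names are hypothetical choices for illustration, and 445 is the standard SMB port.

```python
import subprocess


def forward_tunnel_command(host, local_port=1445, remote_port=445):
    """Build an ssh command that forwards local_port to remote_port on the server.

    -N asks ssh not to run a remote command, -L sets up the local forward.
    ASSUMPTION: local_port=1445 is an arbitrary unprivileged port.
    """
    return ["ssh", "-N", "-L", f"{local_port}:localhost:{remote_port}", host]


def open_tunnel(host):
    """Start the tunnel in the background and return the ssh process handle."""
    # Popen returns immediately; the tunnel lives until the process is terminated.
    return subprocess.Popen(forward_tunnel_command(host))
```

With `open_tunnel("merckx")` running, the samba share is reachable on localhost port 1445, for instance through the `port=1445` option of mount.cifs.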
Cheers,
Jon.
Last edited by jesnow on Fri May 31, 2024 8:00 pm; edited 1 time in total |
jesnow l33t
Joined: 26 Apr 2006 Posts: 888
Posted: Fri May 31, 2024 7:56 pm Post subject: |
So this is worrisome. This issue was never resolved. That is, for a couple of months wireguard continued to neither work nor give any reason why it didn't. During that time I fell back to mounting my remote volumes via samba over an ssh tunnel. NFS would have worked too.
Today I did a system update that required a kernel rebuild. After that, wireguard resumed working. OK. But there was no way to figure out what went wrong. That's a weakness in wireguard. It is impossible to debug. It either works (and doesn't tell you what it's doing) or it doesn't. I suspect that the kernel module that contains the encrypted keys (see the links in the previous posts about this) got updated when I rebuilt the kernel. But I had done this several times before. It's possible that an emerge update made the wg user-space tools incompatible with the kernel I was running, but there was no debug info I could use to find this out.
This is just a problem with wireguard.
Cheers,
Jon. |