Gentoo Forums
Postfix conversation with .. timed out while sending message
lostinspace2011
Apprentice
Joined: 09 Sep 2005
Posts: 230

Posted: Wed Oct 20, 2010 2:54 pm    Post subject: Postfix conversation with .. timed out while sending message

I managed to run my mail server for some time without any issues. In my setup I am using Postfix, amavisd, SpamAssassin and clamd as described in http://www.gentoo.org/doc/en/mailfilter-guide.xml. However, for the past two days I have been struggling with messages getting stuck in the outgoing queue.

sendmail -bp shows the following errors for some (most) of my outgoing messages.

Quote:

conversation with mx1.bt.mail.yahoo.com[212.82.111.207] timed out while sending message body
...
lost connection with mx292.emailfiltering.com[194.116.199.31] while sending end of data -- message may be sent more than once
...
(host nk11b01-smtp-mx004.mac.com[17.148.17.44] said: 421 4.4.2 Timeout while waiting for command. (in reply to end of DATA command))
...
-- 407 Kbytes in 6 Requests


Most of the messages are very small (<10K) and one is larger (>350K), so I don't think this problem is linked to the size of the message.

I have tried to disable amavisd by commenting out the following in /etc/postfix/main.cf:

Code:
#content_filter = smtp-amavis:[localhost]:10024


However, this did not resolve the problem either. I can see outbound traffic, and some of the messages still seem to be received at their destination. However, they are still shown in the outgoing queue.

I have already tried sending messages via telnet directly to the same destination address, and this works, so I am led to rule out a network problem. I also tried using telnet to send via my own server on ports 25, 10024 (amavisd) and 10025 (smtp-delivery); however, the message ended up getting stuck every time. Turning the firewall on or off made no difference.

Any other suggestions on what I can do to debug and analyse the cause of this problem?

Thanks in advance
Alex
Anarcho
Advocate
Joined: 06 Jun 2004
Posts: 2970
Location: Germany

Posted: Wed Oct 20, 2010 3:21 pm

There are suggestions on the net to lower the MTU to 1400 or even 1000 and try again.

If this helps, maybe your server doesn't get the ICMP messages for path MTU discovery?

See also: http://www.postfix.org/faq.html#timeouts
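
One way to test this theory is to send pings with the Don't Fragment bit set and see which sizes get through. The sketch below assumes the Linux iputils ping (its -M do option) and the usual 28 bytes of IPv4 plus ICMP header overhead; the function names ping_payload and probe_mtu are made up for illustration.

```shell
# ICMP echo payload size that exactly fills a packet of the given MTU:
# 20 bytes IPv4 header (no options) + 8 bytes ICMP header = 28 bytes overhead.
ping_payload() {
    echo $(( $1 - 28 ))
}

# Send three non-fragmentable pings sized for the given MTU.
# Assumption: iputils ping on Linux, where "-M do" prohibits fragmentation.
probe_mtu() {
    host=$1
    mtu=$2
    ping -M do -c 3 -s "$(ping_payload "$mtu")" "$host"
}

# Example (not run here): probe_mtu mx1.bt.mail.yahoo.com 1492
```

If probes sized for 1492 get no reply while probes sized for 1400 do, that would point at large packets being dropped somewhere on the path. Keep in mind that some hosts ignore ICMP echo altogether, so a silent target proves little on its own.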
_________________
...it's only Rock'n'Roll, but I like it!
lostinspace2011

Posted: Wed Oct 20, 2010 4:35 pm    Post subject: Fixed.. but not sure why

I decided to leave this issue for now and do something else for an hour or two. When I got back, all messages had been sent. I then re-enabled the amavis integration and tried sending new messages. All went through without any issue. I am not sure what caused this. Maybe it was just some strange network congestion which only affected certain email recipients. Oh well. Still, I will keep an eye on this one for a couple of days.

Thanks for your help and the link
Alex
lostinspace2011

Posted: Sat Oct 23, 2010 4:25 pm    Post subject: Problem persists

It would appear that after some time the problem came back. I haven't been able to figure out exactly what is causing it yet, so I am still looking for any other suggestions on what can be done about this.

Could it be that one of the timeout settings in Postfix is too small? Here is the output of postconf | grep timeout:
Code:

smtp_connect_timeout = 30s
smtp_data_done_timeout = 600s
smtp_data_init_timeout = 120s
smtp_data_xfer_timeout = 180s
smtp_helo_timeout = 300s
smtp_mail_timeout = 300s
smtp_quit_timeout = 300s
smtp_rcpt_timeout = 300s
smtp_rset_timeout = 20s
smtp_starttls_timeout = 300s
smtp_tls_session_cache_timeout = 3600s
smtp_xforward_timeout = 300s

smtpd_policy_service_timeout = 100s
smtpd_proxy_timeout = 100s
smtpd_starttls_timeout = 300s
smtpd_timeout = 300s
smtpd_tls_session_cache_timeout = 3600s

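To check whether any of these values is unusually small, the postconf output can be filtered by value. This is only a sketch: the function name timeouts_below is made up for illustration, and it assumes every value is given in seconds with an "s" suffix, as in the listing above.

```shell
# Print the timeout settings whose value is below a limit given in seconds.
# Reads "name = value" lines, as printed by postconf, on standard input.
# Assumption: all values carry an "s" (seconds) suffix, as in the listing above.
timeouts_below() {
    awk -v limit="$1" '
        $1 ~ /timeout$/ && $2 == "=" {
            value = $3
            sub(/s$/, "", value)        # strip the unit suffix
            if (value + 0 < limit + 0)
                print $0
        }'
}

# Example (not run here): postconf | timeouts_below 200
```

With the values shown above and a limit of 200, this would list smtp_connect_timeout, smtp_data_init_timeout, smtp_data_xfer_timeout, smtp_rset_timeout, smtpd_policy_service_timeout and smtpd_proxy_timeout.
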

Thanks in advance
Alex
lostinspace2011

Posted: Sun Oct 24, 2010 11:29 am    Post subject: Small progress

It would appear that the messages are timing out after 5 minutes (300 seconds).

Quote:

messages:Oct 24 11:03:53 bumblebee amavis[5533]: (05533-05) sending SMTP response: "250 2.0.0 Ok, id=05533-05, from MTA([127.0.0.1]:10025): 250 2.0.0 Ok: queued as E1EFB6283CC"
messages:Oct 24 11:03:53 bumblebee postfix/smtp[8654]: 169A66283C0: to=<REMOVED@iinet.net.au>, relay=localhost[127.0.0.1]:10024, delay=2.9, delays=0.07/0.01/0.02/2.8, dsn=2.0.0, status=sent (250 2.0.0 Ok, id=05533-05, from MTA([127.0.0.1]:10025): 250 2.0.0 Ok: queued as E1EFB6283CC)
messages:Oct 24 11:08:55 bumblebee postfix/smtp[8660]: E1EFB6283CC: to=<REMOVED@iinet.net.au>, relay=as-av.iinet.net.au[203.0.178.180]:25, delay=302, delays=0.03/0.02/0.58/301, dsn=4.4.2, status=deferred (lost connection with as-av.iinet.net.au[203.0.178.180] while sending end of data -- message may be sent more than once)


However, this message is rather small (10631 bytes):
Quote:

E1EFB6283CC 10631 Sun Oct 24 11:03:53 REMOVED @ sender.xyz (lost connection with as-av.iinet.net.au[203.0.178.180] while sending end of data -- message may be sent more than once)
REMOVED @ iinet.net.au

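The delays= field in the Postfix log lines quoted above splits the total delay into four stages: time before the queue manager, time in the queue manager, connection setup (DNS, HELO and TLS), and message transmission. A small awk filter can make that easier to read. The function name show_delays and the output labels are made up for this sketch.

```shell
# Break the delays=a/b/c/d field of Postfix delivery log lines into labelled parts.
# Reads log lines on standard input; lines without a delays= field print nothing.
show_delays() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^delays=/) {
                v = $i
                sub(/^delays=/, "", v)   # drop the field name
                sub(/,$/, "", v)         # drop the trailing comma
                split(v, d, "/")
                printf "before_qmgr=%s in_qmgr=%s conn_setup=%s transmit=%s\n", d[1], d[2], d[3], d[4]
            }
        }
    }'
}

# Example (not run here): grep E1EFB6283CC /var/log/messages | show_delays
```

For the deferred delivery of E1EFB6283CC this reports transmit=301, so almost all of the 302 seconds went by after the connection had already been set up.
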

I'd say it must be some sort of network issue, but it affects most of my messages, not all. I also managed to send an 18 MB email to a test account and receive it back, so I don't think it is network related. I did some ping measurements to this particular server with the following results:
Quote:
bumblebee ~ # ping 203.0.178.180
PING 203.0.178.180 (203.0.178.180) 56(84) bytes of data.
64 bytes from 203.0.178.180: icmp_req=1 ttl=60 time=20.4 ms
64 bytes from 203.0.178.180: icmp_req=2 ttl=60 time=19.3 ms
64 bytes from 203.0.178.180: icmp_req=3 ttl=60 time=19.0 ms
64 bytes from 203.0.178.180: icmp_req=4 ttl=60 time=18.2 ms
64 bytes from 203.0.178.180: icmp_req=5 ttl=60 time=21.1 ms
64 bytes from 203.0.178.180: icmp_req=6 ttl=60 time=18.9 ms
64 bytes from 203.0.178.180: icmp_req=7 ttl=60 time=19.9 ms
^C
--- 203.0.178.180 ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 6026ms
rtt min/avg/max/mdev = 18.278/19.606/21.192/0.935 ms


I eventually tried changing the MTU on my ADSL router to 1400 and issued sendmail -q to retry message delivery. Shortly afterwards all messages had been delivered.

I then disabled path MTU discovery using the following command and set the MTU back to 1492:
Code:
sysctl -w net.ipv4.ip_no_pmtu_disc=1

Then I resent some test messages, all of which were delivered. Upon further testing other messages got stuck again and were only delivered after changing the MTU to 1400.

Thanks in advance
Alex
lostinspace2011

Posted: Mon Oct 25, 2010 1:01 pm    Post subject: More progress

Using an MTU of 1492 on my router I get:
Code:

bumblebee log # tracepath -n web47.justhost.com
 1:  192.168.0.3                                           0.223ms pmtu 1500
 1:  192.168.0.1                                           0.976ms
 1:  192.168.0.1                                           0.883ms
 2:  192.168.0.1                                           0.966ms pmtu 1492
 2:  203.215.5.244                                        36.195ms
 3:  203.215.4.18                                         80.277ms
 4:  203.215.20.68                                        91.846ms asymm  5
 5:  114.31.193.237                                       93.607ms asymm  6
 6:  114.31.193.237                                       96.809ms
 7:  208.178.246.85                                      296.737ms asymm  8
 8:  208.178.246.85                                      368.962ms
 9:  69.31.111.94                                        313.895ms asymm 13
10:  69.31.111.94                                        315.777ms asymm 13
11:  99.198.126.118                                      317.966ms asymm 14
12:  no reply
13:  no reply


And using an MTU of 1400 on my router I get:
Code:

bumblebee log # tracepath -n web47.justhost.com
 1:  192.168.0.3                                           0.233ms pmtu 1500
 1:  192.168.0.1                                           0.997ms
 1:  192.168.0.1                                           0.902ms
 2:  192.168.0.1                                           0.983ms pmtu 1400
 2:  203.215.5.244                                        34.550ms
 3:  203.215.4.36                                         35.384ms
 4:  203.215.20.6                                         88.938ms
 5:  203.215.20.148                                       92.369ms asymm  4
 6:  114.31.199.58                                       293.905ms
 7:  114.31.199.58                                       296.088ms asymm  6
 8:  69.31.110.229                                       314.788ms asymm 12
 9:  69.31.110.229                                       316.066ms asymm 12
10:  99.198.126.118                                      314.154ms asymm 14
11:  99.198.126.118                                      315.451ms asymm 14
12:  no reply
13:  no reply


Not quite sure what this means though. Will be consulting the man pages.
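
Going by the tracepath man page, a pmtu value is printed at the hop where the discovered path MTU changes, and asymm means the return path appears to use a different number of hops. To compare the two runs it can help to pull out just the pmtu lines. The helper name pmtu_changes is made up for this sketch.

```shell
# Print only the hops at which tracepath reports a path MTU value.
# Reads tracepath -n output on standard input.
pmtu_changes() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "pmtu") {
                hop = $1
                sub(/:$/, "", hop)       # "2:" becomes "2"
                print "hop " hop " (" $2 "): pmtu " $(i + 1)
            }
        }
    }'
}

# Example (not run here): tracepath -n web47.justhost.com | pmtu_changes
```

For both runs above this leaves just two lines: the local interface at 1500 and the router at 1492 or 1400. No hop beyond the router reports a smaller value.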


Last edited by lostinspace2011 on Mon Oct 25, 2010 3:34 pm; edited 1 time in total
lostinspace2011

Posted: Mon Oct 25, 2010 3:28 pm    Post subject: Confusion prevails

After playing with various MTU values on both the server and the router (trying 1400 on both), and then various permutations of PPPoA and PPPoE, messages were still getting stuck. At this point, setting my server's MTU back to 1500 and the router's to 1400 and issuing sendmail -q did not deliver the messages. In this case a reboot and some patience resolved the problem.

It is still very confusing, and there seem to be several contributing factors.

Alex
All times are GMT