Thread: Incoming fine, outgoing time out

  1. #1
    Join Date: Feb 2010
    Location: South Africa
    Posts: 107
    Rep Power: 5

    Incoming fine, outgoing time out

    Incoming messages arrive without problems, but with outgoing mail I get a lot of log messages like the ones below.

    There are over 1000 messages in the deferred queue. I put them all on hold to see if that would help, but even with just a few in the queue I still get this.
    Split DNS is fine, and DNS lookups are on in the global & server settings. I ran zmfixperms, upgraded from 7.0.0 to 7.1.0, etc.

    Apart from these errors, I don't see anything else in the logs.

    ANY advice would be greatly appreciated. It's 3:30 am here, and as I'm an ISP, people need to get their messages out before business starts...

    May 13 03:27:40 mail postfix/smtp[27106]: A69D8255809E: to=<lemontree@intekom.co.za>, relay=mail.intekom.com[196.25.211.70]:25, delay=184614, delays=184572/3/5.8/33, dsn=4.4.2, status=deferred (lost connection with mail.intekom.com[196.25.211.70] while sending MAIL FROM)
    status=deferred (lost connection with mail.intekom.com[196.25.211.70] while sending MAIL FROM)
    status=deferred (lost connection with mail.intekom.com[196.25.211.70] while performing the EHLO handshake)
    May 13 03:28:08 mail postfix/smtp[26889]: 9909329580C1: lost connection with j.mx.mail.yahoo.com[66.94.237.64] while sending message body
    May 13 03:28:20 mail postfix/smtp[27070]: 548215CC030C: lost connection with mx2.telkomsa.net[196.25.211.172] while sending message body
    May 13 03:28:20 mail postfix/smtp[27049]: 20A3F21D8034: lost connection with mail.telkomsa.net[196.25.211.70] while sending DATA command
    status=deferred (lost connection with g.mx.mail.yahoo.com[98.137.54.238] while sending MAIL FROM)
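
    With a log this repetitive, it can help to tally the failures by remote IP and by SMTP stage. The following is a minimal Python sketch, not something from the thread: the function names and both regular expressions are assumptions written against the log lines quoted above. It also splits the delays=a/b/c/d field, which Postfix documents as time before the queue manager, time in the queue manager, connection setup time, and message transmission time.

```python
import re
from collections import Counter

# Assumed pattern for the "lost connection" text seen in the excerpt above.
LOST_RE = re.compile(
    r"lost connection with (?P<host>\S+)\[(?P<ip>[0-9.]+)\] while (?P<stage>[^)]+)"
)
# Assumed pattern for the delay=total, delays=a/b/c/d fields.
DELAYS_RE = re.compile(
    r"delay=(?P<total>[\d.]+), "
    r"delays=(?P<a>[\d.]+)/(?P<b>[\d.]+)/(?P<c>[\d.]+)/(?P<d>[\d.]+)"
)


def tally_lost_connections(lines):
    """Count lost-connection events per remote IP and per SMTP stage."""
    by_ip, by_stage = Counter(), Counter()
    for line in lines:
        match = LOST_RE.search(line)
        if match:
            by_ip[match.group("ip")] += 1
            by_stage[match.group("stage").strip()] += 1
    return by_ip, by_stage


def split_delays(line):
    """Return the delay breakdown in seconds, or None if the fields are absent."""
    match = DELAYS_RE.search(line)
    if match is None:
        return None
    return {
        "total": float(match.group("total")),
        "before_qmgr": float(match.group("a")),
        "in_qmgr": float(match.group("b")),
        "conn_setup": float(match.group("c")),
        "transmission": float(match.group("d")),
    }


# Lines copied from the post above (some were already truncated there).
SAMPLE_LOG = [
    "May 13 03:27:40 mail postfix/smtp[27106]: A69D8255809E: to=<lemontree@intekom.co.za>, relay=mail.intekom.com[196.25.211.70]:25, delay=184614, delays=184572/3/5.8/33, dsn=4.4.2, status=deferred (lost connection with mail.intekom.com[196.25.211.70] while sending MAIL FROM)",
    "status=deferred (lost connection with mail.intekom.com[196.25.211.70] while performing the EHLO handshake)",
    "May 13 03:28:08 mail postfix/smtp[26889]: 9909329580C1: lost connection with j.mx.mail.yahoo.com[66.94.237.64] while sending message body",
    "May 13 03:28:20 mail postfix/smtp[27070]: 548215CC030C: lost connection with mx2.telkomsa.net[196.25.211.172] while sending message body",
    "May 13 03:28:20 mail postfix/smtp[27049]: 20A3F21D8034: lost connection with mail.telkomsa.net[196.25.211.70] while sending DATA command",
    "status=deferred (lost connection with g.mx.mail.yahoo.com[98.137.54.238] while sending MAIL FROM)",
]

by_ip, by_stage = tally_lost_connections(SAMPLE_LOG)
delays = split_delays(SAMPLE_LOG[0])
```

    For the first line of the excerpt, nearly all of the 184614-second delay (about 51 hours) was accumulated before the message last reached the queue manager; the final attempt spent only about 39 seconds on connection setup and transmission before the connection was lost. The tally also shows the drops happening at several different SMTP stages and to more than one provider, which points away from any single command being rejected.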

  2. #2
    phoenix (Zimbra Consultant & Moderator)
    Join Date: Sep 2005
    Location: Vannes, France
    Posts: 23,587
    Rep Power: 58

    You need to look at what's causing the highlighted 'lost connection' problem (yes, I know it's stating the obvious). Is this a new problem? What's the output of the 'Verify...' commands in the Split DNS article? Is there a performance problem on this server (what's the specification)? Does a Zimbra restart or a reboot of the server improve anything? Is it a VM or on real hardware? How much RAM is in the server? Does 'top' show any performance problem or excessive I/O? Is it on a RAID system and, if so, what RAID level? Is there any firewall or SELinux enabled? When did the problem start, and have any updates been done to the server?
    Regards


    Bill


    Acompli: A new adventure for Co-Founder KevinH.

  3. #3
    Join Date: Nov 2009
    Location: Ljubljana, Slovenia
    Posts: 268
    Rep Power: 6

    It might not be related, but I've had a similar problem. My issue was the datacenter provider's DNS, which I was using for my Zimbra box - some DNS queries simply never got an answer.
    I changed to a publicly available (reliable) DNS server and the queue on the server emptied within the next hour.

    And another idea - jumbo frames? You might have a networking issue, such as a NIC degrading over time. Maybe try setting it to 100 Mbps full-duplex. Worth trying.

  4. #4
    Join Date: Feb 2010
    Location: South Africa
    Posts: 107
    Rep Power: 5

    I triple-checked my DNS settings.
    It seems my local (split) DNS is 100% and my ISP's DNS is 100%, but the other ISP, where most of my mail is failing, does not see my DNS records.

    dig mydomain.com mx - 100%
    dig @myisp.dns mydomain.com mx - 100%
    dig @failingdomain.dns mydomain.com mx - no records in answer.

    But they are the largest ISP in South Africa, so that is strange.
    To answer the other questions: yes, it is a VM running on XenServer whose storage is on a SAN with RAID5. It has been running for more than a year with no problems in this environment. I upped the RAM from 2GB to 3GB and the CPUs from 2 to 4, but the issues are the same...

    What is strange is that mail to Google (and we send a lot) goes out without (much) trouble, but mail to the national ISP, which should be fine, is timing out.
    I hope it is a DNS issue on their end and will see if it clears up.
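
    The three dig checks above can be scripted so that each resolver's answer is compared against the local one. This is a rough Python sketch that shells out to dig with its +short, +time and +tries options; the function names are assumptions, and the resolver names are the placeholders used in the post, not real hosts. Note that an empty answer from another provider's resolver does not by itself prove the records are missing: many resolvers refuse recursive queries from clients outside their own network.

```python
import subprocess


def parse_mx_short(output):
    """Parse `dig +short MX` output such as '10 mail.example.com.' into
    a sorted list of (preference, hostname) tuples; other lines are ignored."""
    records = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            records.append((int(parts[0]), parts[1].rstrip(".").lower()))
    return sorted(records)


def query_mx(domain, resolver=None, timeout=5):
    """Ask one resolver (or the system default) for the MX records of domain."""
    cmd = ["dig", "+short", f"+time={timeout}", "+tries=1"]
    if resolver:
        cmd.append(f"@{resolver}")
    cmd += [domain, "MX"]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout + 5)
    return parse_mx_short(result.stdout)


def compare_resolvers(domain, resolvers):
    """Return {resolver: True/False} depending on whether its MX answer
    matches what the default resolver returns."""
    baseline = query_mx(domain)
    return {name: query_mx(domain, name) == baseline for name in resolvers}


# Hypothetical usage with the placeholder names from the post:
# compare_resolvers("mydomain.com", ["myisp.dns", "failingdomain.dns"])
```

    Running plain dig (without +short) against the failing resolver and reading the status field in the header would show whether the empty answer is a refusal or a genuine empty result.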

  5. #5
    phoenix (Zimbra Consultant & Moderator)
    Join Date: Sep 2005
    Location: Vannes, France
    Posts: 23,587
    Rep Power: 58

    Quote Originally Posted by ekkas View Post
    Triple checked my DNS settings.
    It seems my local (split) DNS is 100% and my ISP's DNS is 100%, but the other ISP, where most of my mail is failing, does not see my DNS records.

    dig mydomain.com mx - 100%
    dig @myisp.dns mydomain.com mx - 100%
    dig @failingdomain.dns mydomain.com mx - no records in answer.
    Without exact details of the sites, obviously, I can't comment, but the lack of a response would indicate a DNS problem. I assume those commands were run on the Zimbra server?

    Quote Originally Posted by ekkas View Post
    To answer the other questions: yes, it is a VM running on XenServer whose storage is on a SAN with RAID5. It has been running for more than a year with no problems in this environment. I upped the RAM from 2GB to 3GB and the CPUs from 2 to 4, but the issues are the same...
    A RAID5 is not recommended for a production server with more than 100 users (and I'd prefer it wasn't used at all, but I understand its attraction); two processors should be sufficient and I'd also suggest more RAM for a reasonable size installation.

    Quote Originally Posted by ekkas View Post
    I hope it is a DNS issue on their end and will see if it clears up.
    I hope it clears up; let us know the outcome.
    Regards


    Bill



  6. #6
    Join Date: Feb 2010
    Location: South Africa
    Posts: 107
    Rep Power: 5

    "I assume those commands were run on the Zimbra server?"
    Yes, they were run on the Zimbra server.

    "I'd also suggest more RAM for a reasonable size installation."
    I'll see if I can get it up to 4GB. (500 users, but not using ZWC; mostly POP, with a handful using IMAP.)

    "A RAID5 is not recommended for a production server with more than 100 users"
    I can't see why not. The RAID is running on a SAN with large amounts of cache. Increased storage requirements nowadays make mirroring (RAID1) impractical. Besides, the SAN performs at well over 100MBps (bytes) sustained write speed, saturating a 1Gbps Ethernet link. But I do not want to start (another!) RAID x vs RAID y, NFS vs FC vs iSCSI vs FoE debate.
    Our next SAN project is going to have 40Gbps InfiniBand and SSDs acting as a large non-volatile cache, in addition to 32GB of volatile cache (using Nexenta SAN software), making RAID5/RAID6 performance almost a non-issue.

    My word, I suspect I've strayed slightly off-topic.

  7. #7
    Join Date: Feb 2010
    Location: South Africa
    Posts: 107
    Rep Power: 5

    Nope, same problem. It seems the other ISP simply rejected my DNS request; after logging a call with them, everything appears to be correct.

    So I'm back to square one.

    It is strange that only some domains give problems.
    Any ideas where I should look?
    Even if I use telnet, I get a relatively quick timeout. I have to type really fast, otherwise I can't send:

    [root@mail ~]# telnet 196.25.211.70 25
    Trying 196.25.211.70...
    Connected to mail.telkomsa.net (196.25.211.70).
    Escape character is '^]'.
    220 as5.telkomsa.net ESMTP
    helo mail.mydomain.co.za
    250 as5.telkomsa.net
    mail from:support@mydomain.co.za
    250 sender <support@mydomain.co.za> ok
    Connection closed by foreign host.

    Sometimes I get as far as the "rcpt to:" command, but it kicks me off quite quickly.
    They say they do not know of any issues, and I say I can send to most other domains, so I'm stuck with no idea where to look.

    If it were a Postfix issue, then telnet should at least be working?
    Now it seems even telnet times out after a few seconds.
    Maybe some other CentOS setting? This started happening out of the blue. I have done yum & Zimbra updates since, but they didn't cure the problem.

    Thanks for the replies so far.

    Ekkas
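
    The manual telnet test above depends on typing speed. A scripted version can record how many seconds pass before the remote side drops the session, and can be repeated with different pauses between commands. The sketch below uses Python's standard smtplib; the function name probe, its parameters, and the pause value are assumptions for illustration, and the host and addresses would be the ones from the telnet session.

```python
import smtplib
import time


def probe(host, helo_name, sender, pause=0.0, port=25, timeout=30):
    """Open an SMTP session, send HELO and MAIL FROM with an optional pause
    before each command, and return a list of (step, seconds_elapsed, outcome)."""
    events = []
    start = time.monotonic()

    def record(step, outcome):
        events.append((step, round(time.monotonic() - start, 1), outcome))

    smtp = None
    try:
        # The constructor connects and reads the 220 greeting.
        smtp = smtplib.SMTP(host, port, timeout=timeout)
        record("connect", "ok")
        steps = (
            ("helo", lambda: smtp.helo(helo_name)),
            ("mail", lambda: smtp.mail(sender)),
        )
        for step, action in steps:
            time.sleep(pause)
            code, _message = action()
            record(step, code)
        smtp.quit()
    except (smtplib.SMTPException, OSError) as exc:
        # SMTPServerDisconnected lands here when the remote side hangs up.
        record("error", repr(exc))
    finally:
        if smtp is not None:
            smtp.close()
    return events


# Hypothetical usage, mirroring the telnet session above:
# for pause in (0, 2, 5, 10):
#     print(pause, probe("196.25.211.70", "mail.mydomain.co.za",
#                        "support@mydomain.co.za", pause=pause))
```

    If the session is dropped after roughly the same number of seconds regardless of the pause, that suggests something is cutting the connection on a timer rather than reacting to a particular command.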

  8. #8
    Join Date: Feb 2010
    Location: South Africa
    Posts: 107
    Rep Power: 5

    I also tried changing the MTU to a lower setting and checked the timeout settings; I don't know if it will help...

    [zimbra@mail root]$ postconf | grep timeout
    connection_cache_protocol_timeout = 5s
    daemon_timeout = 18000s
    ipc_timeout = 3600s
    lmtp_connect_timeout = 0s
    lmtp_data_done_timeout = 600s
    lmtp_data_init_timeout = 120s
    lmtp_data_xfer_timeout = 180s
    lmtp_lhlo_timeout = 300s
    lmtp_mail_timeout = 300s
    lmtp_quit_timeout = 300s
    lmtp_rcpt_timeout = 300s
    lmtp_rset_timeout = 20s
    lmtp_starttls_timeout = 300s
    lmtp_tls_session_cache_timeout = 3600s
    lmtp_xforward_timeout = 300s
    milter_command_timeout = 30s
    milter_connect_timeout = 30s
    milter_content_timeout = 300s
    qmqpd_timeout = 300s
    smtp_connect_timeout = 300s
    smtp_data_done_timeout = 600s
    smtp_data_init_timeout = 120s
    smtp_data_xfer_timeout = 180s
    smtp_helo_timeout = 300s
    smtp_mail_timeout = 300s
    smtp_quit_timeout = 300s
    smtp_rcpt_timeout = 300s
    smtp_rset_timeout = 20s
    smtp_starttls_timeout = 300s
    smtp_tls_session_cache_timeout = 3600s
    smtp_xforward_timeout = 300s
    smtpd_policy_service_timeout = 100s
    smtpd_proxy_timeout = 100s
    smtpd_starttls_timeout = 300s
    smtpd_timeout = ${stress?10}${stress:300}s
    smtpd_tls_session_cache_timeout = 3600s
    trigger_timeout = 10s
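
    One way to read the list above is to pull out only the smtp_ (outbound client) timers and compare them with how quickly the sessions die. The Python sketch below is an illustration with assumed function names; it only understands plain values ending in "s" and skips expressions such as the smtpd_timeout line. Postfix normally logs "timed out" when one of its own timers expires, whereas the log in this thread says "lost connection", so with the shortest outbound timer at 20s, a drop after a few seconds is unlikely to come from these settings.

```python
import re


def parse_timeouts(postconf_output):
    """Map parameter name -> seconds for lines like 'smtp_mail_timeout = 300s'.
    Lines whose value is not a plain number of seconds are skipped."""
    timeouts = {}
    for line in postconf_output.splitlines():
        match = re.fullmatch(r"\s*(\w+)\s*=\s*(\d+)s\s*", line)
        if match:
            timeouts[match.group(1)] = int(match.group(2))
    return timeouts


def shortest_timeouts(timeouts, prefix="smtp_", limit=3):
    """Return the `limit` smallest timers whose name starts with `prefix`.
    The prefix 'smtp_' does not match the inbound 'smtpd_' parameters."""
    selected = {k: v for k, v in timeouts.items() if k.startswith(prefix)}
    return sorted(selected.items(), key=lambda item: item[1])[:limit]


# A few values copied from the postconf output above.
SAMPLE = """\
smtp_connect_timeout = 300s
smtp_data_xfer_timeout = 180s
smtp_helo_timeout = 300s
smtp_mail_timeout = 300s
smtp_rset_timeout = 20s
smtpd_timeout = ${stress?10}${stress:300}s
trigger_timeout = 10s
"""

parsed = parse_timeouts(SAMPLE)
shortest = shortest_timeouts(parsed, limit=2)
```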

  9. #9
    phoenix (Zimbra Consultant & Moderator)
    Join Date: Sep 2005
    Location: Vannes, France
    Posts: 23,587
    Rep Power: 58

    I have no problem connecting to their mail servers; they don't kick me off when I try to send an email. The MTU should be set to 1500 for your network. I assume you are also on a fairly recent version of Xen? Do you have any firewall hardware (or any Cisco PIX devices) between you and the outside world?

    I do remember there was a problem with Xen a while back with the 'checksum offload' function (I haven't used it for years so it may have been fixed) - feel free to ignore the following if it doesn't apply anymore:

    This is probably the NIC causing the problem. You can check this by running 'tcpdump -nvvi eth0' in your Dom0 and then initiating some traffic; for example, run 'traceroute microsoft.com' and see what output tcpdump gives. If there's any error about 'bad chksum' then you need to modify your NIC driver settings. The problem is caused by checksum offloading in the NIC driver, and you can check it with the following commands:

    $ ethtool -k eth0 -- display the settings for your driver, you should see something like this:

    tx-checksumming: on

    If that's the case, disable it with:

    $ ethtool -K eth0 tx off

    I assume that SELinux is disabled on the server? Apart from either a DNS issue or a router/firewall issue between you and the receiving site, I can't really imagine what else it could be.
    Regards


    Bill



  10. #10
    Join Date: Feb 2010
    Location: South Africa
    Posts: 107
    Rep Power: 5

    Thanks a lot, I'll try it and see what happens.
    I have rx on and tried to turn it off; or should I not do that?
    Otherwise I can swap interfaces later and see if it's maybe the physical NIC that's giving problems.

    [root@Xen2 ~]# ethtool -k eth1
    Offload parameters for eth1:
    Cannot get device flags: Operation not supported
    Cannot get device GRO settings: Operation not supported
    rx-checksumming: on
    tx-checksumming: off
    scatter-gather: off
    tcp-segmentation-offload: off
    udp-fragmentation-offload: off
    generic-segmentation-offload: off
    generic-receive-offload: off
    large-receive-offload: off
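
    To compare offload settings across several interfaces or hosts, the ethtool -k listing can be reduced to the features that are switched on. The parser below is a small Python sketch with an assumed function name; it expects "name: on" or "name: off" lines as in the output above and ignores everything else, including the "Operation not supported" lines. In the listing above (eth1 on the Xen host), tx-checksumming is already off and only rx-checksumming is on.

```python
def parse_offloads(ethtool_output):
    """Return {feature: bool} for lines such as 'tx-checksumming: off'.
    Header and error lines are ignored; a trailing marker after on/off
    (some ethtool versions print one) is tolerated."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(": ")
        if not sep or not value:
            continue
        state = value.split()[0]
        if state in ("on", "off"):
            features[name] = state == "on"
    return features


# Output copied from the post above.
SAMPLE = """\
Offload parameters for eth1:
Cannot get device flags: Operation not supported
Cannot get device GRO settings: Operation not supported
rx-checksumming: on
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off
"""

offloads = parse_offloads(SAMPLE)
enabled = [name for name, is_on in offloads.items() if is_on]
```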
