
Zimbra stops working properly after some hours

Posted: Sat May 16, 2015 7:15 pm
by ninjavz
/etc/hosts

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.0.5 mail.clinicasantaisabel.com



/etc/resolv.conf

nameserver 127.0.0.1

# Generated by NetworkManager

search clinicasantaisabel.com

nameserver 192.168.0.1

nameserver 192.168.0.6

#nameserver 8.8.8.8



Note: I removed 8.8.8.8 because dnscache may use it, in which case the name resolves to a public IP instead of the internal one and the SMTP server then rejects email

http://community.zimbra.com/collaboration/f/1886/t/1138065
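
The nameserver concern in the note above can be checked mechanically. This is only a minimal sketch in Python, assuming the resolv.conf contents posted here; the function names `active_nameservers` and `public_nameservers` are made up for illustration and are not part of Zimbra.

```python
import ipaddress

# resolv.conf contents as posted above (blank lines removed)
RESOLV_SAMPLE = """\
nameserver 127.0.0.1
# Generated by NetworkManager
search clinicasantaisabel.com
nameserver 192.168.0.1
nameserver 192.168.0.6
#nameserver 8.8.8.8
"""


def active_nameservers(resolv_conf_text):
    """Return the nameserver addresses in file order, skipping comments."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or commented-out entry
        parts = line.split()
        if parts[0] == "nameserver" and len(parts) >= 2:
            servers.append(parts[1])
    return servers


def public_nameservers(resolv_conf_text):
    """Return active nameservers that are neither private nor loopback."""
    public = []
    for server in active_nameservers(resolv_conf_text):
        ip = ipaddress.ip_address(server)
        if not (ip.is_private or ip.is_loopback):
            public.append(server)
    return public


if __name__ == "__main__":
    print(active_nameservers(RESOLV_SAMPLE))
    print(public_nameservers(RESOLV_SAMPLE))
```

With 8.8.8.8 commented out, no public resolver is left in the list, which is the state the note describes.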



#dig mx clinicasantaisabel.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.2 <<>> mx clinicasantaisabel.com

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5161

;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0



;; QUESTION SECTION:

;clinicasantaisabel.com. IN MX



;; ANSWER SECTION:

clinicasantaisabel.com. 3600 IN MX 10 mail.clinicasantaisabel.com.



;; Query time: 0 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Sat May 16 19:12:52 2015

;; MSG SIZE rcvd: 61



#dig a clinicasantaisabel.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.2 <<>> a clinicasantaisabel.com

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19890

;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 0



;; QUESTION SECTION:

;clinicasantaisabel.com. IN A



;; ANSWER SECTION:

clinicasantaisabel.com. 600 IN A 192.168.0.1

clinicasantaisabel.com. 600 IN A 192.168.0.3

clinicasantaisabel.com. 600 IN A 192.168.0.6

clinicasantaisabel.com. 600 IN A 192.168.0.2

clinicasantaisabel.com. 600 IN A 192.168.0.83



;; Query time: 0 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Sat May 16 19:13:29 2015

;; MSG SIZE rcvd: 120



Hmm, this last one seems strange. Shouldn't it be the server IP? That is 192.168.0.5
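
To compare what the resolver returns with the server IP, the A records can be pulled out of the dig text. A rough sketch in Python, assuming the `name TTL class type rdata` line layout seen in the output above; `a_records` is a hypothetical helper, not a dig or Zimbra feature.

```python
# Relevant part of the "dig a" output posted above
DIG_SAMPLE = """\
;; QUESTION SECTION:
;clinicasantaisabel.com. IN A

;; ANSWER SECTION:
clinicasantaisabel.com. 600 IN A 192.168.0.1
clinicasantaisabel.com. 600 IN A 192.168.0.3
clinicasantaisabel.com. 600 IN A 192.168.0.6
clinicasantaisabel.com. 600 IN A 192.168.0.2
clinicasantaisabel.com. 600 IN A 192.168.0.83
"""


def a_records(dig_output, name):
    """Collect the A-record addresses that dig printed for `name`."""
    wanted = name.rstrip(".").lower() + "."
    addresses = []
    for line in dig_output.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # dig comments, section headers, question line
        fields = line.split()
        # Record lines look like: name TTL class type rdata
        if (len(fields) >= 5 and fields[0].lower() == wanted
                and fields[2] == "IN" and fields[3] == "A"):
            addresses.append(fields[4])
    return addresses


if __name__ == "__main__":
    found = a_records(DIG_SAMPLE, "clinicasantaisabel.com")
    print(found)
    print("192.168.0.5" in found)
```

For the output above, 192.168.0.5 is not among the five addresses returned. The sketch only reports what the resolver answered; whether the bare domain should point at the mail server depends on the setup.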



-Miguel

Zimbra stops working properly after some hours

Posted: Sat May 16, 2015 9:38 pm
by ninjavz
I made some changes to the DNS; new info:

#dig mx clinicasantaisabel.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.2 <<>> mx clinicasantaisabel.com

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27284

;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1



;; QUESTION SECTION:

;clinicasantaisabel.com. IN MX



;; ANSWER SECTION:

clinicasantaisabel.com. 2503 IN MX 10 mail.clinicasantaisabel.com.



;; ADDITIONAL SECTION:

mail.clinicasantaisabel.com. 2438 IN A 192.168.0.5



;; Query time: 0 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Sat May 16 21:35:25 2015

;; MSG SIZE rcvd: 77



#dig a clinicasantaisabel.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.2 <<>> a clinicasantaisabel.com

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13958

;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0



;; QUESTION SECTION:

;clinicasantaisabel.com. IN A



;; AUTHORITY SECTION:

clinicasantaisabel.com. 3093 IN SOA server1.clinicasantaisabel.com. server2.clinicasantaisabel.com. 2007124551 36000 600 1296000 3600



;; Query time: 0 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Sat May 16 21:36:48 2015

;; MSG SIZE rcvd: 94



Currently testing...

Zimbra stops working properly after some hours

Posted: Sun May 17, 2015 1:49 am
by phoenix

The entry for your server in the hosts file is incorrect; take a look at the details in the Split DNS wiki article for the correct format.


Zimbra stops working properly after some hours

Posted: Sun May 17, 2015 12:23 pm
by ninjavz

The link returns: "Unfortunately, the page you've requested no longer exists."


Still can this hang the whole server? Looking for articles on split dns...


Now trying to connect through the webclient:


HTTP ERROR 504


Problem accessing ZCS upstream server. Reason: Cannot connect to the ZCS upstream server. Connection timeout.
Possible reasons:



  • upstream server is blocked by a firewall

  • upstream server is failing to send back the response in time

  • upstream server is down


Please contact your ZCS administrator to fix the problem.



Powered by Nginx-Zimbra://


 


[zimbra@mail ~]$ zmcontrol status
Host mail.clinicasantaisabel.com
amavis Running
antispam Running
antivirus Running
dnscache Running
ldap Running
logger Running
mailbox Running
memcached Running
mta Running
opendkim Running
proxy Running
service webapp Running
snmp Running
spell Running
stats Stopped
zimbra webapp Running
zimbraAdmin webapp Running
zimlet webapp Running
zmconfigd Running



 


Zimbra stops working properly after some hours

Posted: Sun May 17, 2015 1:50 pm
by phoenix

[quote user="ninjavz"]Still can this hang the whole server? Looking for articles on split dns...[/quote]I've already given you the link to the Split DNS article.


Zimbra stops working properly after some hours

Posted: Sun May 17, 2015 2:29 pm
by ninjavz
phoenix, the link is broken -> http://community.zimbra.com/utility/error-notfound.aspx

Zimbra stops working properly after some hours

Posted: Sun May 17, 2015 2:48 pm
by ninjavz
Anyway, I made some changes to DNS and hosts, but after restarting I get this from $zmcontrol status:

Host mail.clinicasantaisabel.com

amavis Running

antispam Running

antivirus Running

dnscache Running

ldap Running

logger Running

mailbox Running

memcached Running

mta Running

opendkim Running

proxy Running

service webapp Running

snmp Running

spell Running

stats Stopped

zimbra webapp Running

zimbraAdmin webapp Running

zimlet webapp Running

zmconfigd Running



stats stopped?



After $zmcontrol start all services are running... but it is still strange, although I had to force the shutdown (hard reset)
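
Spotting a stopped service in a long status listing is easy to automate. A small sketch in Python, assuming every service line ends with its state word as in the listing above; `not_running` is a hypothetical helper, not a Zimbra command.

```python
# zmcontrol status output as posted above (blank lines removed)
STATUS_SAMPLE = """\
Host mail.clinicasantaisabel.com
amavis Running
antispam Running
antivirus Running
dnscache Running
ldap Running
logger Running
mailbox Running
memcached Running
mta Running
opendkim Running
proxy Running
service webapp Running
snmp Running
spell Running
stats Stopped
zimbra webapp Running
zimbraAdmin webapp Running
zimlet webapp Running
zmconfigd Running
"""


def not_running(status_output):
    """Return the service names whose last word is not 'Running'."""
    problems = []
    for line in status_output.splitlines():
        line = line.strip()
        if not line or line.startswith("Host "):
            continue  # skip blanks and the host header line
        # Service names may contain spaces, so split on the last one
        name, _, state = line.rpartition(" ")
        if name and state != "Running":
            problems.append(name)
    return problems


if __name__ == "__main__":
    print(not_running(STATUS_SAMPLE))
```

For the listing above this reports only `stats`.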

Zimbra stops working properly after some hours

Posted: Sun May 17, 2015 2:58 pm
by ninjavz
Now back to dig

#dig mx clinicasantaisabel.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.2 <<>> mx clinicasantaisabel.com

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27986

;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1



;; QUESTION SECTION:

;clinicasantaisabel.com. IN MX



;; ANSWER SECTION:

clinicasantaisabel.com. 3600 IN MX 10 mail.clinicasantaisabel.com.



;; ADDITIONAL SECTION:

mail.clinicasantaisabel.com. 3405 IN A 192.168.0.5



;; Query time: 0 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Sun May 17 14:49:19 2015

;; MSG SIZE rcvd: 77



This looks good.



#dig any clinicasantaisabel.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.30.rc1.el6_6.2 <<>> any clinicasantaisabel.com

;; global options: +cmd

;; Got answer:

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25513

;; flags: qr rd ra; QUERY: 1, ANSWER: 8, AUTHORITY: 0, ADDITIONAL: 3



;; QUESTION SECTION:

;clinicasantaisabel.com. IN ANY



;; ANSWER SECTION:

clinicasantaisabel.com. 3600 IN A 192.168.0.5

clinicasantaisabel.com. 600 IN A 192.168.0.1

clinicasantaisabel.com. 600 IN A 192.168.0.6

clinicasantaisabel.com. 3600 IN NS kingu.clinicasantaisabel.com.

clinicasantaisabel.com. 3600 IN NS marduk.clinicasantaisabel.com.

clinicasantaisabel.com. 3600 IN SOA marduk.clinicasantaisabel.com. postmaster.clinicasantaisabel.com. 2007124590 36000 600 1296000 3600

clinicasantaisabel.com. 3560 IN MX 10 mail.clinicasantaisabel.com.

clinicasantaisabel.com. 3600 IN TXT "v=spf1 mx -all"



;; ADDITIONAL SECTION:

kingu.clinicasantaisabel.com. 3600 IN A 192.168.0.1

marduk.clinicasantaisabel.com. 3600 IN A 192.168.0.6

mail.clinicasantaisabel.com. 3365 IN A 192.168.0.5



;; Query time: 0 msec

;; SERVER: 127.0.0.1#53(127.0.0.1)

;; WHEN: Sun May 17 14:49:59 2015

;; MSG SIZE rcvd: 272



I get 2 more servers here, kingu and marduk. I believe this is because they are Windows AD controllers running DNS Server (main and backup). I erased the .1 and .6 A records but they come back, so this must be a Windows issue.



So now this is the /etc/hosts file:

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.0.5 mail.clinicasantaisabel.com mail clinicasantaisabel.com



Pings to mail, mail.clinicasantaisabel.com and clinicasantaisabel.com all resolve to 192.168.0.5, which I assume is correct.



Server is again under test to see if it hangs again...

Zimbra stops working properly after some hours

Posted: Sun May 17, 2015 5:00 pm
by ninjavz

The zimbra service is hanging again and everything is running extremely slowly, even su - zimbra and exit.

Well, this is about it: 2 weeks trying to fix this server and it is still hanging... The server worked for 4 years; now this new one (like the old one) keeps running into this odd problem.


Zimbra admin kind of works, but when you try to open a user (email@domain.com) it just hangs there.


$zmcontrol status (after a long while)


Host mail.clinicasantaisabel.com
amavis Running
antispam Running
antivirus Running
dnscache Running
ldap Running
logger Running
mailbox Running
memcached Running
mta Running
opendkim Running
proxy Running
service webapp Running
snmp Running
spell Running
stats Running
Timeout after 180 seconds


$ zmstatctl start
zmstat-ldap already running, skipping.
zmstat-fd already running, skipping.
zmstat-allprocs already running, skipping.
zmstat-proc already running, skipping.
zmstat-io already running, skipping.
zmstat-nginx already running, skipping.
zmstat-mysql already running, skipping.
zmstat-mtaqueue already running, skipping.
zmstat-io-x already running, skipping.
zmstat-cpu already running, skipping.
zmstat-vm already running, skipping.
zmstat-df already running, skipping.


Also found this:


2015-05-17 15:23:19,283 ERROR [{RemoteManager: mail.clinicasantaisabel.com->zimbra@mail.clinicasantaisabel.com:22}-zmqstat incoming] [] rmgmt - error scanning com.zimbra.cs.rmgmt.RemoteMailQueue$QueueHandler@19a73b2f: Can't use an undefined value as an ARRAY reference at /opt/zimbra/libexec/zmqstat line 159.


But I cannot pinpoint exactly which service goes wrong, or when...



Zimbra stops working properly after some hours

Posted: Mon May 18, 2015 4:02 am
by jorgedlcruz
Hi,

This line on the hosts file is wrong -- 192.168.0.5 mail.clinicasantaisabel.com mail clinicasantaisabel.com



Needs to be -- 192.168.0.5 mail.clinicasantaisabel.com mail



Can you please change it, restart, and let us know?



Best regards
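
The hosts-file advice in this post can be written down as a quick check. This is only a sketch in Python that encodes the format suggested here (IP, then the FQDN, then the short name, with no bare-domain alias); `check_hosts_entry` is a hypothetical helper and not an official Zimbra validator.

```python
def check_hosts_entry(line, fqdn):
    """Return a list of problems found in one hosts line for `fqdn`."""
    # Drop any trailing comment, then split into IP and names
    fields = line.split("#", 1)[0].split()
    if len(fields) < 2:
        return ["entry has no hostname"]
    names = fields[1:]
    short, _, domain = fqdn.partition(".")
    problems = []
    if names[0] != fqdn:
        problems.append("first name should be the FQDN")
    if short not in names[1:]:
        problems.append("short hostname alias is missing")
    if domain in names:
        problems.append("bare domain should not be listed as an alias")
    return problems


if __name__ == "__main__":
    fqdn = "mail.clinicasantaisabel.com"
    print(check_hosts_entry(
        "192.168.0.5 mail.clinicasantaisabel.com mail clinicasantaisabel.com", fqdn))
    print(check_hosts_entry("192.168.0.5 mail.clinicasantaisabel.com mail", fqdn))
```

The first call flags the bare-domain alias and the second returns an empty list, matching the correction suggested above.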