Continuous problems with fresh multi-server install

Posted: Wed Aug 07, 2019 11:00 am
by pup_seba

In a new environment, I'm unable to install Zimbra 8.8.12 and have it working properly.

The environment has 2 sites, each with 1 store + 1 mta/proxy (4 servers in total). Site 1 has the ldap master alongside the store. Site 2 has an ldap replica alongside its store server. Logger is deployed on store 1.

- I can log in to the adminUI of the store in site 1, but I can't log in to the adminUI of the store in site 2, as it gets stuck on the "loading" screen. In the logs, I see this:

2019-08-07 12:39:24,974 INFO [main] [] misc - Thread pool was configured to max=250
2019-08-07 12:39:25,996 INFO [ Activity Thread] [] extensions - Zimbra docs read an empty configuration
2019-08-07 12:39:29,681 INFO [qtp1258084361-19:https:] [] AuthProvider - Adding auth provider: zimbra com.zimbra.cs.service.ZimbraAuthProvider
2019-08-07 12:39:29,682 INFO [qtp1258084361-19:https:] [] AuthProvider - Adding auth provider: sampleoauth com.zimbra.cs.service.ZimbraAuthProviderForOAuth
2019-08-07 12:39:29,724 INFO [qtp1258084361-19:https:] [;ip=;port=50025;ua=ZimbraWebClient - GC75 (Win);soapId=3086576a;] extensions - Using two-factor auth factory ZimbraTwoFactorAuth
2019-08-07 12:39:29,947 INFO [qtp1258084361-19:https:] [;ip=;port=50025;ua=ZimbraWebClient - GC75 (Win);soapId=3086576a;] soap - AuthRequest elapsed=245
2019-08-07 12:39:30,007 INFO [qtp1258084361-21:https:] [;ip=;port=50025;ua=ZimbraWebClient - GC75 (Win);soapId=3086576b;] soap - Proxying request: requestedAccountId=9a3f0944-4c47-4dda-9138-8c27100f1b2b authAcct id=9a3f0944-4c47-4dda-9138-8c27100f1b2b reason: onLocalSvr=false isLocal=false
2019-08-07 12:39:30,069 INFO [qtp1258084361-21:https:] [;ip=;port=50025;ua=ZimbraWebClient - GC75 (Win);soapId=3086576b;] HttpMethodDirector - I/O exception ( caught when processing request: Connection refused (Connection refused)
2019-08-07 12:39:30,069 INFO [qtp1258084361-21:https:] [;ip=;port=50025;ua=ZimbraWebClient - GC75 (Win);soapId=3086576b;] HttpMethodDirector - Retrying request
2019-08-07 12:39:30,094 WARN [qtp1258084361-21:https:] [;ip=;port=50025;ua=ZimbraWebClient - GC75 (Win);soapId=3086576b;] SoapEngine - handler exception
com.zimbra.common.service.ServiceException: operation sent to wrong host (you want '')

- If I try to access the store in site 2 from the admin console on the store in site 1 (configuration, servers, mbox02), I get another error on the screen:
A pop-up with this message and no further details: "JavaScript error encountered in method ZaOverviewPanelController.prototype._overviewTreeListener"

- I can't run "zmmailbox" against any mailbox from either of the 2 stores. I always get this error (no matter which account, no matter which store):
[zimbra@correo_mbox01 log]$ zmmailbox -z -m
[] INFO: I/O exception ( caught when processing request: Connection refused (Connection refused)
[] INFO: Retrying request
ERROR: remote.CONNECT_FAILURE ( (cause: Connection refused (Connection refused))

- On top of that, and despite these errors, I've imported data from a backup_ng of an older server (the ones we are migrating from), and my logs are full of messages like these:
2019-08-07 12:49:18,760 WARN [Index-6] [;mid=85;] ParsedMessage - Unable to parse part=2 filename=Esquema CIP La Polesa.pdf content-type=application/pdf message-id=<>
com.zimbra.cs.mime.MimeHandlerException: extraction failed
at com.zimbra.cs.mime.handler.ConverterHandler.getContentImpl(
at com.zimbra.cs.mime.MimeHandler.getContent(
at com.zimbra.cs.mime.ParsedMessage.analyzePart(
at com.zimbra.cs.mime.ParsedMessage.analyzeNonBodyParts(
at com.zimbra.cs.mime.ParsedMessage.analyzeFully(
at com.zimbra.cs.mailbox.Message.generateIndexData(
at com.zimbra.cs.mailbox.MailboxIndex.indexItemList(
at com.zimbra.cs.mailbox.MailboxIndex.indexDeferredItems(
at com.zimbra.cs.mailbox.MailboxIndex.access$600(
at com.zimbra.cs.mailbox.MailboxIndex$BatchIndexTask.exec(
at com.zimbra.cs.mailbox.MailboxIndex$
at java.base/java.util.concurrent.Executors$
at java.base/
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.base/java.util.concurrent.ThreadPoolExecutor$
at java.base/
Caused by: com.zimbra.cs.convert.ConversionException: connect failed
... 16 more
Caused by: Connection refused (Connection refused)
at java.base/ Method)
at java.base/
at java.base/
at java.base/
at java.base/
at java.base/
at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(
at org.apache.commons.httpclient.HttpClient.executeMethod(
at org.apache.commons.httpclient.HttpClient.executeMethod(
at com.zimbra.cs.convert.AbstractConverterClient.putOrPost(
at com.zimbra.cs.convert.LegacyConverterClient.extract(
at com.zimbra.cs.convert.PooledConverterClient.extract(
at com.zimbra.cs.mime.handler.ConverterHandler.getContentImpl(
... 15 more

- That looks like "just a convertd" indexing problem and wouldn't scare me, except that I have 3M items backed up on origin and 3M items restored on these new servers, yet when I access one of the accounts (via the webclient, as I can't use zmmailbox), folders are missing, and in one particular case the inbox shows 6 thousand mails after the restore on the new servers versus 24 thousand on the origin server.

I've configured and reconfigured the DoSfilters, making sure the config is OK (delays, throttleips, maxrequests, trustedips). I even disabled all TLS and interprocess security, just in case connections were being refused because of that.

Firewalld is disabled, selinux is disabled, and there is no other firewall or "weird thing" (not even nagios agents, etc.) on these machines, which btw are deployed from the same CentOS vSphere template and configured with the same ansible playbooks I use for all my deployments.

I can telnet in any direction and from any of the store servers to ports 7071, 8443, 7047, 8080.
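
For anyone who wants to script the telnet check above, here is a minimal Python sketch. It only tests whether a TCP connection can be opened; it says nothing about whether the service behind the port is healthy or which name it answers to. The function names `port_open` and `check_ports` are made up for illustration, and the host names in the usage comment are placeholders, not the real servers.

```python
import socket

# Ports mentioned in the post above.
ZIMBRA_PORTS = [7071, 8443, 7047, 8080]


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


def check_ports(hosts, ports=ZIMBRA_PORTS, timeout=3.0):
    """Return a dict mapping (host, port) to True/False."""
    return {(h, p): port_open(h, p, timeout) for h in hosts for p in ports}


# Usage (placeholder host names, run it from each store server):
#   for (host, port), ok in check_ports(["mbox01.example.com",
#                                        "mbox02.example.com"]).items():
#       print(host, port, "open" if ok else "CLOSED")
```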

Any hint/suggestion/experience is more than welcome, I'm quite stuck here :/

Posted: Thu Aug 08, 2019 11:53 am
by pup_seba
Bumping this one. Please help.

Posted: Mon Aug 12, 2019 7:43 am
by pup_seba

Everything seems to be fixed, thanks to Zimbra support (kudos to Gopal in this case). I overlooked the name my customer gave to the servers (lame way to try to excuse myself and say it was not my fault :D...but it is), which included a "_". So slap me and call me noob: apache has a bug with that symbol in names ( ... bug=851357), so everything was working just awfully wrong, hence the previously described symptoms.

Anyways, hope this helps someone someday.
PS: No ego, no shame...always learning :)

Posted: Mon Aug 12, 2019 10:13 am
by Klug
IIRC, it's not just about apache, the "_" is not DNS compliant. ... host-names
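
A quick way to catch this before installing is to check the server names against the usual hostname rules (RFC 952 as relaxed by RFC 1123: each label is letters, digits and hyphens only, does not start or end with a hyphen, and is at most 63 characters). Below is a minimal Python sketch of that check. The function names are made up for illustration, and `example.com` is a placeholder domain, since the real one is not shown in this thread.

```python
import re

# One hostname label: letters, digits, hyphens; no leading or trailing
# hyphen; 1 to 63 characters. Underscores are not allowed.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def invalid_labels(hostname):
    """Return the labels of `hostname` that break the hostname rules."""
    labels = hostname.rstrip(".").split(".")
    return [label for label in labels if not _LABEL.fullmatch(label)]


def is_valid_hostname(hostname):
    """Return True if every label is valid and the total length is sane."""
    name = hostname.rstrip(".")
    return 0 < len(name) <= 253 and not invalid_labels(name)


if __name__ == "__main__":
    # "correo_mbox01" is the short name from the shell prompt earlier in
    # the thread; the domain part is a placeholder.
    for name in ("correo_mbox01.example.com", "correo-mbox01.example.com"):
        print(name, is_valid_hostname(name), invalid_labels(name))
```

Running something like this over the inventory before the ansible playbooks would flag a name like `correo_mbox01` up front.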