
LDAP Multi-Master Replication issues

Posted: Tue Mar 13, 2018 11:01 am
by TLS
Hi

Zimbra FOSS 8.7.11 (upgraded from 8.6), Ubuntu 16.04.

I have 3 MMR nodes. One of them has higher CPU usage than the others. I have checked journalctl and there are many log entries like this:

Code: Select all

mar 13 11:27:11 host2 slapd[27393]: log csn 20180313094706.903171Z#000000#001#000000
mar 13 11:27:11 host2 slapd[27393]: cmp -1, too old
mar 13 11:27:11 host2 slapd[27393]: log csn 20180313094711.502080Z#000000#001#000000
mar 13 11:27:11 host2 slapd[27393]: cmp -1, too old
mar 13 11:27:11 host2 slapd[27393]: log csn 20180313094711.955960Z#000000#001#000000
mar 13 11:27:11 host2 slapd[27393]: cmp -1, too old
mar 13 11:27:11 host2 slapd[27393]: log csn 20180313094712.520711Z#000000#001#000000
mar 13 11:27:11 host2 slapd[27393]: cmp -1, too old
mar 13 11:27:11 host2 slapd[27393]: log csn 20180313094715.972866Z#000000#001#000000
mar 13 11:27:11 host2 slapd[27393]: cmp -1, too old
mar 13 11:27:11 host2 slapd[27393]: log csn 20180313094717.419157Z#000000#001#000000
mar 13 11:27:11 host2 slapd[27393]: cmp -1, too old


On the other two nodes CPU usage is normal and log entries like the above do not occur.

I have also checked the database sizes, and the problematic node has a smaller database than the other two nodes. What could the problem be?

Re: LDAP Multi-Master Replication issues

Posted: Tue Mar 13, 2018 12:55 pm
by fs.schmidt
Hello,

Since you said that you upgraded from Zimbra 8.6, it seems that you are facing this bug: https://bugzilla.zimbra.com/show_bug.cgi?id=101054

"Due to bug 101054 it is strongly advised that, after installation completes on new MMR nodes, their primary database be reloaded from the primary master via slapcat/slapadd: LDAP_data_import_export. The accesslog DB on the secondary node should be erased (rm -f /opt/zimbra/data/ldap/accesslog/db/*) while slapd is offline.

For 8.6, there are updated OpenLDAP builds available upon request that include the fix for this issue. It is recommended to deploy the updated builds on any 8.6 LDAP deployment."
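The workaround quoted above can be sketched as a short command sequence, run as the zimbra user on the affected secondary node. This is a sketch only, assuming the default Zimbra 8.7 paths; /tmp/ldap.bak stands for an export previously taken on the primary master with zmslapcat and copied over, and the authoritative procedure is the linked LDAP_data_import_export article.

```shell
# Take slapd offline before touching any database files
ldap stop

# Erase the accesslog DB on the secondary node, per the bug note
rm -f /opt/zimbra/data/ldap/accesslog/db/*

# Reload the primary database from the primary master's export
# (/tmp/ldap.bak is a placeholder for the copied zmslapcat output)
rm -f /opt/zimbra/data/ldap/mdb/db/*
/opt/zimbra/libexec/zmslapadd /tmp/ldap.bak

ldap start
```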

Re: LDAP Multi-Master Replication issues

Posted: Tue Mar 13, 2018 2:27 pm
by TLS
Thank you for your answer.

In that case, should I do the LDAP data export and then the import according to this article? https://wiki.zimbra.com/wiki/LDAP_data_import_export

From which node should the LDAP data be exported?

Re: LDAP Multi-Master Replication issues

Posted: Tue Mar 13, 2018 3:05 pm
by fs.schmidt
TLS wrote:Thank you for your answer.

In that case, should I do the LDAP data export and then the import according to this article? https://wiki.zimbra.com/wiki/LDAP_data_import_export

From which node should the LDAP data be exported?


Hi,

You should export the database from the first master server, since the database on the second server is not consistent due to the bug.

You can follow the steps in the "LDAP data import export" article. You shouldn't export and import the config database, and the accesslog can simply be cleared.
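As a rough sketch of the export side, run on the first (healthy) master as the zimbra user; zmslapcat writes an LDIF backup of the main database, and the destination directory and hostname below are placeholders:

```shell
# Export the main database to LDIF (the directory is just an example)
mkdir -p /tmp/ldap-export
/opt/zimbra/libexec/zmslapcat /tmp/ldap-export

# Copy the resulting backup to the node being rebuilt
scp /tmp/ldap-export/ldap.bak zimbra@host2:/tmp/
```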

Re: LDAP Multi-Master Replication issues

Posted: Tue Mar 13, 2018 3:09 pm
by fs.schmidt
You should run these steps as well:

Note for when the accesslog DB is cleared
After slapd is back online, it is advised to do a no-op (no real update) change on the master where the accesslog database was reset:

ldapmodify -x -H ldapi:/// -D cn=config -w `zmlocalconfig -s -m nokey ldap_root_password`
dn: cn=admins,cn=zimbra
changetype: modify
replace: description
description: admin accounts
Then press Control-D to submit the change. This will write a single entry to the accesslog, allowing replication between the two nodes to resume.

Source: https://wiki.zimbra.com/wiki/LDAP_Multi ... eplication

Re: LDAP Multi-Master Replication issues

Posted: Tue Mar 13, 2018 6:12 pm
by TLS
I hope that the LDAP data import procedure will go smoothly.

Thank you for your help in this matter.

Re: LDAP Multi-Master Replication issues

Posted: Wed Mar 14, 2018 11:14 am
by TLS
Unfortunately, it did not help.

What I did:

1. Exported the main database on the LDAP master
2. Imported it on the problematic node
3. Cleared the accesslog DB on the problematic node
4. Did a no-op (no real update) change on the problematic node

CPU usage is still high, and the "cmp -1, too old" entries still appear in journalctl.

Re: LDAP Multi-Master Replication issues

Posted: Wed Mar 14, 2018 2:08 pm
by fs.schmidt
Hi,

Please try to export and import the LDAP database on the first master as well. I had this issue and this step solved it.

After running the export/import on the first master, do it again on the second master.

Kind regards.
Fabio

Re: LDAP Multi-Master Replication issues

Posted: Thu Mar 15, 2018 8:05 am
by TLS
I resolved the problem with high CPU usage: a java process was hung.

Code: Select all

zimbra   29556 99.4  2.5 2624844 104468 ?      Sl   mar11 4403:07 /opt/zimbra/common/lib/jvm/java/bin/java -XX:ErrorFile=/opt/zimbra/log -client -Xmx256m -Djava.net.preferIPv4Stack=true -Dzimbra.home=/opt/zimbra -Djava.library.path=/opt/zimbra/lib -Djava.ext.dirs=/opt/zimbra/common/lib/jvm/java/jre/lib/ext:/opt/zimbra/lib/jars:/opt/zimbra/lib/ext-common:/opt/zimbra/lib/ext/com_zimbra_ssdb_ephemeral_store com.zimbra.cs.account.ProvUtil -l gs hostname.domain zimbraServiceEnabled


I killed the java process and restarted all Zimbra services on the problematic node. Now the CPU load is fine.
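For anyone hitting the same symptom, a hung ProvUtil invocation like the one above can be found and cleaned up along these lines (a sketch; run as the zimbra user, and note that kill -9 may be needed if the process ignores SIGTERM):

```shell
# List any stuck zmprov/ProvUtil java processes
ps aux | grep '[P]rovUtil'

# Kill them by PID, then restart services
pid=$(pgrep -f com.zimbra.cs.account.ProvUtil)
[ -n "$pid" ] && kill "$pid"
zmcontrol restart
```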

fs.schmidt wrote:Please try to export and import the LDAP database on the first master as well. I had this issue and this step solved it.

After running the export/import on the first master, do it again on the second master.


I do not quite understand. Should I export the LDAP data from the master, import the same data back on the master, and clear its accesslog? And then import the same data from the master to the last node (the one where no import has been done yet)? I have 3 MMR nodes (the import has already been done on one node).

Re: LDAP Multi-Master Replication issues

Posted: Thu Mar 15, 2018 6:41 pm
by fs.schmidt
TLS wrote:I resolved the problem with high CPU usage: a java process was hung.

Code: Select all

zimbra   29556 99.4  2.5 2624844 104468 ?      Sl   mar11 4403:07 /opt/zimbra/common/lib/jvm/java/bin/java -XX:ErrorFile=/opt/zimbra/log -client -Xmx256m -Djava.net.preferIPv4Stack=true -Dzimbra.home=/opt/zimbra -Djava.library.path=/opt/zimbra/lib -Djava.ext.dirs=/opt/zimbra/common/lib/jvm/java/jre/lib/ext:/opt/zimbra/lib/jars:/opt/zimbra/lib/ext-common:/opt/zimbra/lib/ext/com_zimbra_ssdb_ephemeral_store com.zimbra.cs.account.ProvUtil -l gs hostname.domain zimbraServiceEnabled


I killed the java process and restarted all Zimbra services on the problematic node. Now the CPU load is fine.

fs.schmidt wrote:Please try to export and import the LDAP database on the first master as well. I had this issue and this step solved it.

After running the export/import on the first master, do it again on the second master.


I do not quite understand. Should I export the LDAP data from the master, import the same data back on the master, and clear its accesslog? And then import the same data from the master to the last node (the one where no import has been done yet)? I have 3 MMR nodes (the import has already been done on one node).


Hello,

Since you have 3 MMR nodes, you should export the database from the first LDAP master and import it on the other two servers. All replicas and multi-masters are affected by the mentioned bug.
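Put together, the procedure for a 3-node ring might look like the outline below: one export on the first master, then a reload on each of the other two nodes. Hostnames and paths are placeholders, and the authoritative steps remain the LDAP_data_import_export wiki article:

```shell
# On host1 (first master): export the main database once
mkdir -p /tmp/ldap-export
/opt/zimbra/libexec/zmslapcat /tmp/ldap-export

# Copy the backup to both of the other nodes
for node in host2 host3; do
  scp /tmp/ldap-export/ldap.bak zimbra@$node:/tmp/
done

# Then on each of host2 and host3, as the zimbra user:
#   ldap stop
#   rm -f /opt/zimbra/data/ldap/mdb/db/* /opt/zimbra/data/ldap/accesslog/db/*
#   /opt/zimbra/libexec/zmslapadd /tmp/ldap.bak
#   ldap start
# ...followed by the no-op ldapmodify change described earlier in the thread.
```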

Kind regards.
Fabio S. Schmidt