[SOLVED] LDAP / slapd - Database environment corrupt (Issue & Solution)

greenrenault
Advanced member
Posts: 180
Joined: Fri Sep 12, 2014 10:13 pm

Postby greenrenault » Mon Jul 16, 2007 7:14 pm

Detailed below are both the issue and the solution for when the LDAP database becomes corrupt. I am posting this to help others, as I had to raise it with Zimbra support to get an answer, which worked!
-------------------- Our Issue --------------------
Our Zimbra server lost power recently and halted unexpectedly. When Zimbra restarted, the slapd database was corrupt and the LDAP server would not start at all. Investigation revealed that the LDAP log file had become corrupt, so the log file was moved to a temporary directory and Zimbra was restarted. Zimbra started up OK, but from then on the following errors were consistently written to the Zimbra log.
I ran db_recover on the database, but this did not solve the problem.
Errors in the Zimbra log:
Jun 30 13:47:42 black clamd[6319]: Reading databases from /opt/zimbra/clamav/db
Jun 30 13:47:58 black slapd[2699]: is_entry_objectclass("", "2.5.6.1") no objectClass attribute
Jun 30 13:47:58 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/1503308 past current end-of-log of 1/516427
Jun 30 13:47:58 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Jun 30 13:47:58 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/3245384 past current end-of-log of 1/516427
Jun 30 13:47:58 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Jun 30 13:47:58 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/1067196 past current end-of-log of 1/516427
Jun 30 13:47:58 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Jun 30 13:47:59 black slapd[2699]: is_entry_objectclass("", "2.5.6.1") no objectClass attribute
Jun 30 13:48:06 black zmtomcatmgr[6520]: status requested
Jun 30 13:51:09 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/3279665 past current end-of-log of 1/516803
Jun 30 13:51:09 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Jun 30 13:51:09 black slapd[2699]: bdb(): sn.bdb: unable to flush page: 0
Jun 30 13:51:09 black slapd[2699]: bdb(): txn_checkpoint: failed to flush the buffer cache Invalid argument
Jun 30 13:51:13 black slapd[2699]: is_entry_objectclass("", "2.5.6.1") no objectClass attribute
And errors when trying to recover the LDAP database.
[zimbra@black ~]$ /opt/zimbra/sleepycat/bin/db_recover -h /opt/zimbra/openldap-data/
db_recover: Log sequence error: page LSN 1 512617; previous LSN 2 6235099
db_recover: Recovery function for LSN 1 878 failed on forward pass
db_recover: PANIC: Invalid argument
db_recover: PANIC: fatal region error detected; run recovery
[the line above is printed 34 times in total]
db_recover: DB_ENV->open: DB_RUNRECOVERY: Fatal error, run database recovery
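
To check whether your own log shows the same symptoms, it can help to filter for slapd's bdb() messages only. The following is a minimal sketch, not part of the original post: the helper name bdb_errors is made up for illustration, and the log path in the usage comment (/var/log/zimbra.log) is an assumption about where your syslog writes Zimbra messages.

```shell
# Hypothetical helper: print slapd's Berkeley DB messages from a syslog-style file.
# Matches lines such as "slapd[2699]: bdb(): Database environment corrupt; ..."
# (the "]" after the PID needs no escaping outside a bracket expression).
bdb_errors() {
    grep -E 'slapd\[[0-9]+]: bdb\(\)' "$1"
}

# Example (the path is an assumption; adjust it to your syslog configuration):
#   bdb_errors /var/log/zimbra.log | tail -n 20
```

If this prints nothing, the problem is probably a different one from the corruption described here.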


-------------------- Solutions --------------------

The commands below should get LDAP back to a stable state. Each line begins with # or $ to show whether it is run as root or as the zimbra user.
If you have a sizeable LDAP database, it will speed things up to tune LDAP performance a little as described in this wiki guide (specifically, adding set_cachesize to DB_CONFIG): Performance Tuning Guidelines for Large Deployments - ZimbraWiki
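
Before tuning, it is worth checking whether DB_CONFIG already sets a cache size. This is a small sketch under assumptions: the helper name show_cachesize is hypothetical, and the value in the comment is only an illustration of the syntax, not a recommendation from the wiki guide.

```shell
# Hypothetical helper: show the set_cachesize line in a DB_CONFIG file, if any.
# Berkeley DB's syntax is:  set_cachesize <gbytes> <bytes> <ncache>
# e.g. "set_cachesize 0 268435456 1" would mean 256 MB in one region (illustrative value).
show_cachesize() {
    grep -E '^[[:space:]]*set_cachesize[[:space:]]' "$1" \
        || echo "no set_cachesize in $1 (Berkeley DB default applies)"
}

# Example:
#   show_cachesize /opt/zimbra/openldap-data/DB_CONFIG
```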
Solution 1 - Try this first

The following is based on this post: http://www.zimbra.com/forums/administrators/10169-ldap-slapd-database-environment-corrupt-issue-solution.html#post91501

# su - zimbra
$ ldap stop
$ cd /opt/zimbra/openldap-data
$ /opt/zimbra/sleepycat/bin/db_recover

This will *recover* the database and checkpoint out the logs. If the database is still corrupt at that point, you will have to follow Solution 2 below, but that should only be done as a last resort.
If the server is a master, the accesslog database also needs to be recovered:

$ cd /opt/zimbra/openldap-data/accesslog/db
$ /opt/zimbra/sleepycat/bin/db_recover

If you are restoring from a backup on a master, make sure the accesslog directory structure exists first (see the zmldapenablereplica script), and make sure you select the correct database when running slapadd (the -b '' flag).
Simply moving the log files aside and starting slapd will permanently destroy a database that might otherwise have been recoverable without resorting to backups.
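
Since a botched recovery attempt can make things worse, one cautious variation on Solution 1 is to copy the database directory aside before running db_recover. This is a sketch only, and the copy step is my own precaution rather than something Zimbra support prescribed: the function name safe_recover and the DB_RECOVER override are hypothetical, and slapd must already be stopped.

```shell
# Hypothetical helper: copy a BDB environment aside, then run db_recover in it.
# DB_RECOVER can be overridden; the default is the path used in this post.
safe_recover() {
    env_dir=$1
    copy_dir=$2
    db_recover_bin=${DB_RECOVER:-/opt/zimbra/sleepycat/bin/db_recover}

    [ -d "$env_dir" ] || { echo "no such directory: $env_dir" >&2; return 1; }
    if [ -e "$copy_dir" ]; then
        echo "refusing to overwrite existing copy: $copy_dir" >&2
        return 1
    fi

    # Keep an untouched copy so a failed recovery can be inspected or retried.
    cp -Rp "$env_dir" "$copy_dir" || return 1

    # With no -h option, db_recover works on the current directory.
    ( cd "$env_dir" && "$db_recover_bin" )
}

# Example, as the zimbra user with slapd stopped (the copy path is illustrative):
#   safe_recover /opt/zimbra/openldap-data /opt/zimbra/openldap-data.pre-recover
```

On a master the same call could be repeated for /opt/zimbra/openldap-data/accesslog/db with its own copy directory.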
Solution 2 - Last resort (as provided by Zimbra support)

Look for the latest ldap backup. On my system it's from this morning; you may want to use the one from yesterday if the system was already down by backup time this morning. For the example I'm using my ldap backup filename: /opt/zimbra/backup/ldap/incr-20070704.080005.554/ldap.bak.
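
If you are not sure which backup is the newest, something like the following can list it. This is a sketch under assumptions: the helper name latest_ldap_backup is hypothetical, and it simply goes by file modification time, so you still need to judge whether that backup predates the crash.

```shell
# Hypothetical helper: print the most recently modified ldap.bak under a backup root.
latest_ldap_backup() {
    find "$1" -type f -name ldap.bak -exec ls -t {} + | head -n 1
}

# Example:
#   latest_ldap_backup /opt/zimbra/backup
```

Replacing head -n 1 with head -n 2 shows the two newest candidates, which is handy if this morning's backup may already contain the damage.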


# su - zimbra
$ ldap stop
$ exit
# mv /opt/zimbra/openldap-data /opt/zimbra/openldap-data-0704-crash
# mkdir /opt/zimbra/openldap-data
# cp /opt/zimbra/openldap-data-0704-crash/DB_CONFIG /opt/zimbra/openldap-data/DB_CONFIG
# chown -R zimbra:zimbra /opt/zimbra/openldap-data
# su - zimbra
$ ~/openldap/sbin/slapadd -w -q -f ~/conf/slapd.conf -l /opt/zimbra/backup/ldap/incr-20070704.080005.554/ldap.bak
$ ~/openldap/sbin/slapindex -f ~/conf/slapd.conf
$ ldap start
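
On a master, the slapadd step may need the -b '' flag mentioned under Solution 1 so that the backup is loaded into the main database rather than the accesslog one. The wrapper below is only a sketch of how that could look: the name load_backup and the SLAPADD override are hypothetical, and the default binary path assumes the zimbra user's home is /opt/zimbra.

```shell
# Hypothetical wrapper around the slapadd call above. Pass a third argument
# (which may be the empty string) to select the database by suffix with -b.
# SLAPADD can be overridden; the default assumes ~zimbra is /opt/zimbra.
load_backup() {
    conf=$1
    ldif=$2
    slapadd_bin=${SLAPADD:-/opt/zimbra/openldap/sbin/slapadd}
    if [ "$#" -ge 3 ]; then
        "$slapadd_bin" -w -q -f "$conf" -b "$3" -l "$ldif"
    else
        "$slapadd_bin" -w -q -f "$conf" -l "$ldif"
    fi
}

# Example, as the zimbra user on a master:
#   load_backup ~/conf/slapd.conf /opt/zimbra/backup/ldap/incr-20070704.080005.554/ldap.bak ''
```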


Thanks Zimbra support!


4211PeterH
Posts: 48
Joined: Fri Sep 12, 2014 9:58 pm

Postby 4211PeterH » Sun Jan 20, 2008 2:05 pm

Yesterday I upgraded from 4.5.10 to 5.0.1_GA_NE.

The upgrade went smoothly.

Today I tried to import a huge mailbox from Domino (2.5 GB) with the newest import wizard (1900).

Later in the day the server threw SOAP errors at the client and then suddenly would not start anymore... :eek:
I noticed my free disk space had dropped from more than 20 GB to ZERO.

That crashed slapd, so the server would not start.
Using the guidelines above I managed to get it up and running again. (I had to change the location of the log file, but I could read that from the console.) Seems OK...
So: thanks for these instructions!!! :)
What still remains to be answered is what is eating my disk space and how to stop it.

Could it be temp files from the import wizard? If so, where are they located?

Any other suggestions for where to look are welcome.

I'll post my findings as I look further into this. I just hope my server will be running OK tomorrow when my users come in.

regards,

Peter
4211PeterH
Posts: 48
Joined: Fri Sep 12, 2014 9:58 pm

Postby 4211PeterH » Sun Jan 20, 2008 3:02 pm

OK, so it was the full backup I ran right after the upgrade, as per the instructions... I could have figured that out sooner. :o

I have now moved the backup location to another mount via Global Settings => Backup/Restore => Backup Location.
Maybe this helps someone else in the future. :cool:
bubarooni
Advanced member
Posts: 184
Joined: Fri Sep 12, 2014 10:29 pm

Postby bubarooni » Mon May 12, 2008 5:56 pm

OK, I had a power failure over the weekend. I was trying to replicate the solution offered here in order to fix the same problem.
[root@mail ~]# su zimbra
[zimbra@mail root]$ ldap stop
slapd not running
[zimbra@mail root]$ mv /opt/zimbra/openldap-data /opt/zimbra/openldap-data-0511-crash
mv: cannot move `/opt/zimbra/openldap-data' to `/opt/zimbra/openldap-data-0511-crash': Permission denied
[zimbra@mail root]$ exit
exit
[root@mail ~]# mv /opt/zimbra/openldap-data /opt/zimbra/openldap-data-0511-crash
[root@mail ~]# mkdir /opt/zimbra/openldap-data
[root@mail ~]# cp /opt/zimbra/openldap-data-0511-crash/DB_CONFIG /opt/zimbra/openldap-data/DB_CONFIG
[root@mail ~]# chown -R zimbra:zimbra /opt/zimbra/openldap-data
[root@mail ~]# su zimbra
[zimbra@mail root]$ ~/openldap/sbin/slapadd -w -q -f ~/conf/slapd.conf -l /opt/zimbra/backup/sessions/incr-20080511.060003.318/ldap/ldap.bak
The first database does not allow slapadd; using the first available one (2)
[zimbra@mail root]$ ~/openldap/sbin/slapindex -f ~conf/slapd.conf
could not stat config file "~conf/slapd.conf": Permission denied (13)
slapindex: bad configuration file!
[zimbra@mail root]$ ldap start
Failed to start slapd. Attempting debug start to determine error.
bdb(): PANIC: DB_RUNRECOVERY: Fatal error, run database recovery
bdb_db_close: txn_checkpoint failed: Invalid argument (22)
backend_startup_one: bi_db_open failed! (-30978)
bdb_db_close: alock_close failed
Any ideas would be greatly appreciated!
quanah
Zimbra Alumni
Posts: 1667
Joined: Fri Sep 12, 2014 10:33 pm

Postby quanah » Mon May 12, 2008 7:15 pm

[quote user="greenrenault"]Detailed below is both the issue and solution when the LDAP database becomes corrupt. Posting here to help others as I had to raise this with Zimbra support for an answer, which worked!
-------------------- Our Issue --------------------
Our Zimbra server lost power recently and as a result performed an unexpected halt. When Zimbra restarted the slapd database was corrupt and LDAP server would not start at all. Investigation revealed that the LDAP log file had become corrupt. So the log file was moved to a temporary directory and Zimbra restarted. Zimbra started up OK however now the following errors were consistently displayed in the Zimbra log.[/QUOTE]
The instructions here are wrong, and what you did was incorrect. As a result, you forced yourself into a situation requiring you to restore from backup.
What you should have done was make sure slapd wasn't running (which of course it likely wasn't), and then run:

cd /opt/zimbra/openldap-data
/opt/zimbra/sleepycat/bin/db_recover


This will *recover* the database and checkpoint out the logs. If at that point, they still remain corrupt, then yes you would have to take the steps in your post, but that should only be done as a last resort.
Also, if it is a master, the accesslog database will need to be recovered as well:

cd /opt/zimbra/openldap-data/accesslog/db
/opt/zimbra/sleepycat/bin/db_recover

If you are trying to restore from a backup on a master, you'll need to make sure the accesslog directory structure exists first (see the zmldapenablereplica script), and you'll need to make sure you select the correct database when doing slapadd (the -b '' flag).
Just moving aside the log files and starting slapd will forever destroy the database when it may otherwise have been recoverable without resorting to backups.
--Quanah
--
Quanah Gibson-Mount
Product Architect, Symas http://www.symas.com/
OpenLDAP Core team http://www.openldap.org/project/
bubarooni
Advanced member
Posts: 184
Joined: Fri Sep 12, 2014 10:29 pm

Postby bubarooni » Thu May 15, 2008 8:09 am

Soooo...
Can I move everything back and take the steps you outline?
bubarooni
Advanced member
Posts: 184
Joined: Fri Sep 12, 2014 10:29 pm

Postby bubarooni » Thu May 15, 2008 8:44 am

Apparently not. This is crazy. I'm going to take this thing live in two weeks and I can't get the darn thing working.
I'm just going to wipe it and start from scratch.
greenrenault
Advanced member
Posts: 180
Joined: Fri Sep 12, 2014 10:13 pm

Postby greenrenault » Thu May 15, 2008 2:46 pm

[quote user="quanah"]The instructions here are wrong, and what you did was incorrect. --Quanah[/QUOTE]
Instructions updated based on your comments, dude. If they are still wrong, please correct them :)
quanah
Zimbra Alumni
Posts: 1667
Joined: Fri Sep 12, 2014 10:33 pm

Postby quanah » Thu May 15, 2008 11:34 pm

[quote user="bubarooni"]apparently not. this is crazy. i'm gonna take this thing live in two weeks and i can't get the darn thing working.
i'm just gonna wipe it and start from scratch.[/QUOTE]
What release are you running? What platform? What type of disks? Is /opt/zimbra/openldap-data in NFS or on a SAN rather than local disk? Are you running Xen?
jsilence
Posts: 16
Joined: Fri Sep 12, 2014 11:18 pm

Postby jsilence » Tue Sep 16, 2008 3:30 pm

I have a corrupt LDAP database due to a server freeze last week.

Previously I enabled replication using the zmldapenablereplica script, but I did not finish setting up the replica server before the problems occurred.
Right now I am in a situation where I can restore from an almost three-week-old backup, which unfortunately becomes corrupt again after a while. This might be because I upgraded from 5.0.6 to 5.0.9 after that backup. After a while I get errors like the ones below, and the admin interface can no longer change anything in LDAP.


Sep 8 09:06:02 zimbra slapd[4207]: bdb(): Ignoring log file: /opt/zimbra/openldap-data/logs/log.0000000191: magic number 0, not 40988
Sep 8 09:06:02 zimbra slapd[4207]: bdb(): Invalid log file: log.0000000191: Invalid argument
Sep 8 09:06:02 zimbra slapd[4207]: bdb(): First log record not found
Sep 8 09:06:02 zimbra slapd[4207]: bdb(): PANIC: Invalid argument
Sep 8 09:06:02 zimbra slapd[4207]: bdb_db_open: Database cannot be recovered, err -30978. Restore from backup!
Sep 8 09:06:02 zimbra slapd[4207]: bdb(): DB_ENV->lock_id_free interface requires an environment configured for the locking subsystem
Sep 8 09:06:02 zimbra slapd[4207]: bdb(): txn_checkpoint interface requires an environment configured for the transaction subsystem
Sep 8 09:06:02 zimbra slapd[4207]: bdb_db_close: txn_checkpoint failed: Invalid argument (22)
Sep 8 09:06:02 zimbra slapd[4207]: backend_startup_one: bi_db_open failed! (-30978)
Sep 8 09:06:02 zimbra slapd[4207]: bdb_db_close: alock_close failed
Sep 8 09:06:02 zimbra slapd[4207]: slapd stopped.


When I try to recover following recipe #2 from the original poster, I get the following error:


zimbra@zimbra:~$ ~/openldap/sbin/slapadd -w -q -f ~/conf/slapd.conf -l /opt/zimbra/backup/ldap.bak
The first database does not allow slapadd; using the first available one (2)
slapadd: empty dn="" (line=5)


greenrenault writes something about

[QUOTE]and you'll need to make sure you select the correct database when doing slapadd (The -b '' flag).[/QUOTE]

But I don't know whether that is related and if so, how to select the correct database.
Any help would be appreciated.
-jsl
