Thread: DRBD & Heartbeat not quite working as expected

  1. #1
    Join Date: Dec 2007 | Location: Raleigh, NC | Posts: 91 | Rep Power: 8

    After several days (heartbeat and DRBD are new to me) I've gotten Zimbra working with heartbeat, mostly.

    If Zimbra is running on Server-B and Server-B goes down, Zimbra fails over to Server-A. The problem is that the servers reboot so quickly during a test (in less than a minute) that Zimbra is only about 90% started on Server-A when it receives a heartbeat command to transfer back to Server-B. Server-A takes a while to unmount /opt, DRBD ends up Secondary/Secondary on both servers, and the shared IP is never assigned again. I end up rebooting both servers, and then everything comes back up.

    auto_failback is set to off on both servers, and heartbeat is set to prefer Server-A to start with.
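
    A quick way to confirm what each node's ha.cf actually says is to read the directive back out of the file. The following is a minimal sketch, assuming the usual "directive value" layout with "#" comments; the function name and the sample excerpt are hypothetical, not taken from the poster's servers.

```python
def read_directive(ha_cf_text, name):
    """Return every value given for `name` in ha.cf-style text, in file order."""
    values = []
    for raw in ha_cf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        if parts[0] == name:
            values.append(parts[1].strip() if len(parts) > 1 else "")
    return values


# Hypothetical excerpt for illustration only.
SAMPLE = """\
# hypothetical ha.cf excerpt
keepalive 2
auto_failback off   # stay put after a failover
node mailserver1a
node mailserver1b
"""

if __name__ == "__main__":
    print(read_directive(SAMPLE, "auto_failback"))
```

    Running the same check against the file on both nodes would show whether the two settings really match.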

    I've been pulling my hair out on this one, and these are new servers.
    2.66 GHz 64-bit Pentium Ds
    1 GB of RAM
    1 mailbox (I was still testing heartbeat and haven't set up the mailboxes yet)

    Does anyone know what I need to tweak?

    Doug

  2. #2
    Join Date: May 2006 | Location: USA | Posts: 6,242 | Rep Power: 21

    Originally posted by DougWare:
        "The problem is that the servers reboot so quickly during a test (less than a minute) that Zimbra is about 90% started on Server-A when it receives a heartbeat command to transfer back to Server-B."

    Did you remove Zimbra from your runlevels on Server-A (/etc/rc#.d/S99zimbra)?
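
    One way to check for leftover start links is to scan the rc directories. This is a minimal sketch, assuming a SysV-style layout under /etc/rc?.d as in the path above; the function name and its parameters are hypothetical.

```python
import glob
import os


def find_init_links(service="zimbra", root="/etc"):
    """Return sorted S?? (start) links for `service` under root/rc?.d."""
    pattern = os.path.join(root, "rc?.d", "S[0-9][0-9]" + service)
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Any path printed here means init would start the service at boot,
    # independently of heartbeat.
    for path in find_init_links():
        print(path)
```

    An empty result on both nodes would suggest that only heartbeat is left to start Zimbra.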

  3. #3
    Join Date: Dec 2007 | Location: Raleigh, NC | Posts: 91 | Rep Power: 8

    I did, but then I reinstalled Zimbra on Server-B.

    I guess I forgot to remove the runlevel links again after the reinstall. I've removed them now and am restarting to see if that corrects the problem.

    Thank you for pointing that out!

    Doug

  4. #4
    Join Date: Dec 2007 | Location: Raleigh, NC | Posts: 91 | Rep Power: 8

    Same outcome. Here is the log from Server-B:

    Dec 17 20:23:48 mailserver1B heartbeat: [2506]: info: mailserver1a wants to go standby [foreign]
    Dec 17 20:23:49 mailserver1B heartbeat: [2506]: info: standby: acquire [foreign] resources from mailserver1a
    Dec 17 20:23:49 mailserver1B heartbeat: [2842]: info: acquire local HA resources (standby).
    Dec 17 20:23:49 mailserver1B heartbeat: [2842]: info: local HA resource acquisition completed (standby).
    Dec 17 20:23:49 mailserver1B heartbeat: [2506]: info: Standby resource acquisition done [foreign].
    Dec 17 20:23:49 mailserver1B heartbeat: [2506]: info: remote resource transition completed.

    Doug

  5. #5
    Join Date: Dec 2007 | Location: Raleigh, NC | Posts: 91 | Rep Power: 8

    Here's the same output from Server-A....

    Dec 17 20:23:11 mailserver1A heartbeat: [2498]: WARN: T_STARTING received during takeover.
    Dec 17 20:23:11 mailserver1A heartbeat: [2498]: info: remote resource transition completed.
    Dec 17 20:23:13 mailserver1A ResourceManager[18922]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.2.20/24/bond0 stop
    Dec 17 20:23:13 mailserver1A IPaddr[24657]: INFO: ifconfig bond0:0 down
    Dec 17 20:23:13 mailserver1A IPaddr[24628]: INFO: Success
    Dec 17 20:23:13 mailserver1A ResourceManager[18922]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt reiserfs stop
    Dec 17 20:23:13 mailserver1A Filesystem[24719]: INFO: Running stop for /dev/drbd0 on /opt
    Dec 17 20:23:13 mailserver1A Filesystem[24719]: INFO: Trying to unmount /opt
    Dec 17 20:23:13 mailserver1A Filesystem[24719]: ERROR: Couldn't unmount /opt; trying cleanup with SIGTERM
    Dec 17 20:23:14 mailserver1A Filesystem[24719]: INFO: Some processes on /opt were signalled
    Dec 17 20:23:15 mailserver1A Filesystem[24719]: INFO: unmounted /opt successfully
    Dec 17 20:23:15 mailserver1A Filesystem[24708]: INFO: Success
    Dec 17 20:23:15 mailserver1A ResourceManager[18922]: info: Running /etc/ha.d/resource.d/drbddisk r0 stop
    Dec 17 20:23:15 mailserver1A kernel: drbd0: role( Primary -> Secondary )
    Dec 17 20:23:15 mailserver1A kernel: drbd0: Writing meta data super block now.
    Dec 17 20:23:15 mailserver1A heartbeat: [18896]: info: local HA resource acquisition completed (standby).
    Dec 17 20:23:15 mailserver1A heartbeat: [2498]: info: Standby resource acquisition done [all].
    Dec 17 20:23:15 mailserver1A harc[24828]: info: Running /etc/ha.d/rc.d/status status
    Dec 17 20:23:15 mailserver1A mach_down[24844]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
    Dec 17 20:23:15 mailserver1A mach_down[24844]: info: mach_down takeover complete for node mailserver1b.
    Dec 17 20:23:15 mailserver1A heartbeat: [2498]: info: mach_down takeover complete.
    Dec 17 20:23:15 mailserver1A harc[24878]: info: Running /etc/ha.d/rc.d/status status
    Dec 17 20:23:15 mailserver1A harc[24894]: info: Running /etc/ha.d/rc.d/status status
    Dec 17 20:23:15 mailserver1A harc[24910]: info: Running /etc/ha.d/rc.d/status status
    Dec 17 20:23:45 mailserver1A hb_standby[24946]: Going standby [foreign].
    Dec 17 20:23:45 mailserver1A heartbeat: [2498]: info: mailserver1a wants to go standby [foreign]
    Dec 17 20:23:45 mailserver1A heartbeat: [2498]: info: standby: mailserver1b can take our foreign resources
    Dec 17 20:23:45 mailserver1A heartbeat: [24960]: info: give up foreign HA resources (standby).
    Dec 17 20:23:45 mailserver1A heartbeat: [24960]: info: foreign HA resource release completed (standby).
    Dec 17 20:23:45 mailserver1A heartbeat: [2498]: info: Local standby process completed [foreign].
    Dec 17 20:23:46 mailserver1A heartbeat: [2498]: WARN: 1 lost packet(s) for [mailserver1b] [46:48]
    Dec 17 20:23:46 mailserver1A heartbeat: [2498]: info: remote resource transition completed.
    Dec 17 20:23:46 mailserver1A heartbeat: [2498]: info: No pkts missing from mailserver1b!
    Dec 17 20:23:46 mailserver1A heartbeat: [2498]: info: Other node completed standby takeover of foreign resources.
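
    To line up the two nodes' logs, it can help to pull out only the role changes, standby requests, warnings, and errors. The sketch below is one way to do that, assuming syslog lines shaped like the ones in this thread; the keyword list and function name are hypothetical choices, and the sample lines are copied from the posts above.

```python
import re

# "Dec 17 20:23:15 mailserver1A rest-of-message"
LINE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) (.*)$")
KEYWORDS = ("role(", "standby", "ERROR", "WARN")


def key_events(log_text):
    """Return (timestamp, host, message) for lines containing a keyword, in input order."""
    events = []
    for raw in log_text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        stamp, host, message = match.groups()
        if any(word in message for word in KEYWORDS):
            events.append((stamp, host.lower(), message))
    return events


SAMPLE = """\
Dec 17 20:23:13 mailserver1A Filesystem[24719]: ERROR: Couldn't unmount /opt; trying cleanup with SIGTERM
Dec 17 20:23:15 mailserver1A Filesystem[24719]: INFO: unmounted /opt successfully
Dec 17 20:23:15 mailserver1A kernel: drbd0: role( Primary -> Secondary )
Dec 17 20:23:45 mailserver1A hb_standby[24946]: Going standby [foreign].
Dec 17 20:23:49 mailserver1B heartbeat: [2506]: info: standby: acquire [foreign] resources from mailserver1a
"""

if __name__ == "__main__":
    for stamp, host, message in key_events(SAMPLE):
        print(stamp, host, message)
```

    On these lines the filter keeps the failed unmount, the DRBD demotion to Secondary at 20:23:15, the hb_standby call on Server-A at 20:23:45, and Server-B's acquisition at 20:23:49.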

  6. #6
    Join Date: May 2010 | Location: Budapest | Posts: 56 | Rep Power: 5

    Can you please tell me what's in your /etc/heartbeat/haresources file?
    I can't get Zimbra to start from a DRBD-backed mount under heartbeat.

    Thanks,
    Tibby
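
    The original poster's haresources file is not shown in the thread, but the Server-A log in post #5 lists the resource scripts heartbeat stopped and their arguments. Heartbeat stops resources in the reverse of the order they are listed, so a rough reconstruction is possible. This is a sketch only: the function name is hypothetical, the node name is taken from heartbeat's own log messages, and no Zimbra entry is reconstructed because the excerpt shows no Zimbra stop line.

```python
import re

# Matches e.g. "Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt reiserfs stop"
RUN = re.compile(r"Running /etc/ha\.d/resource\.d/(\S+)((?:\s+\S+)*?)\s+stop\s*$")


def haresources_from_stop_log(log_text, preferred_node):
    """Rebuild a haresources-style line from ResourceManager 'stop' log entries."""
    stopped = []
    for line in log_text.splitlines():
        match = RUN.search(line)
        if match:
            script, args = match.group(1), match.group(2).split()
            stopped.append("::".join([script] + args))
    # Stop order is the reverse of the listed (start) order.
    return " ".join([preferred_node] + stopped[::-1])


SAMPLE = """\
Dec 17 20:23:13 mailserver1A ResourceManager[18922]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.2.20/24/bond0 stop
Dec 17 20:23:13 mailserver1A ResourceManager[18922]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt reiserfs stop
Dec 17 20:23:15 mailserver1A ResourceManager[18922]: info: Running /etc/ha.d/resource.d/drbddisk r0 stop
"""

if __name__ == "__main__":
    print(haresources_from_stop_log(SAMPLE, "mailserver1a"))
```

    For the three log lines above this yields a line starting with the node name, followed by drbddisk::r0, the Filesystem entry for /dev/drbd0 on /opt, and the IPaddr entry. A Zimbra init script would still need to be added to the line; where it goes and what it is called on a given install is an assumption to verify.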
