Re: Stopping zimlet webapp... takes 10-15 minutes

Posted: Fri Sep 14, 2018 5:28 pm
by L. Mark Stone
PaperAdvocate wrote: Both files are the same except for two things: the order of the line items is different, and the value of innodb_buffer_pool_size is different (4163895296 for the server with the delay and 2511535718 for the server that is fine).

The 4163895296 value was given to me by Zimbra support when I sent them my logs, so this is why it's different, but the issue was present prior to changing this value.

On both servers, innodb_max_dirty_pages_pct = 30.

Dunno, but the InnoDB Buffer Pool size should ideally be at least 1.25x the size of the InnoDB databases. If the databases are bigger than the buffer pool, some portions of the databases get paged out to disk. If that's what's happening in your case, then MariaDB likely needs to pull that data in from swap and write it to disk before allowing itself to shut down, and that can add to the shutdown time (users will also notice periodic "stalls" in UI responsiveness).

You can use a tool like to get the InnoDB database size, and you can also use the M suffix to set the pool size in megabytes rather than bytes. That's much easier to read and less likely to result in a typo.

Here you can see one system where, over time and with more and larger mailboxes, I eventually needed to increase the size of the buffer pool to 7GB. A number of clients have buffer pools of 20GB or more, so it's a good thing to check periodically. Be sure to add more RAM to your server if needed too!

Code:

zimbra@zimbra:~$ cat conf/my.cnf | grep -i innodb_buffer
# innodb_buffer_pool_size        = 5047721164
# innodb_buffer_pool_size        = 6144M
innodb_buffer_pool_size        = 7168M
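
To illustrate the rule of thumb above (a pool of roughly 1.25x the InnoDB data size) and the M suffix, here is a minimal Python sketch. The function names are made up for this example, and the 1.25 factor is only the guideline from this post, not a MariaDB requirement.

```python
import math

# Binary multipliers for the size suffixes handled in this sketch.
_SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(value: str) -> int:
    """Convert a my.cnf size such as '7168M' or '4163895296' to bytes."""
    value = value.strip().upper()
    if value and value[-1] in _SUFFIXES:
        return int(value[:-1]) * _SUFFIXES[value[-1]]
    return int(value)


def as_megabyte_setting(size_bytes: int) -> str:
    """Format a byte count as an 'M'-suffixed value, rounding up."""
    return f"{math.ceil(size_bytes / 1024**2)}M"


def suggested_pool_size(data_bytes: int, factor: float = 1.25) -> str:
    """Apply the rule-of-thumb factor to the InnoDB data size (assumed 1.25x)."""
    return as_megabyte_setting(math.ceil(data_bytes * factor))
```

For example, as_megabyte_setting(4163895296) turns the byte value quoted above into 3971M, and suggested_pool_size(parse_size("4G")) suggests 5120M for 4GB of InnoDB data.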

Hope that helps,

Re: Stopping zimlet webapp... takes 10-15 minutes

Posted: Fri Sep 14, 2018 6:55 pm
by PaperAdvocate
Thank you. I will check those values.

I haven't timed it precisely, but it seems to be timing out at 10 minutes, which is a common timeout value. There is also a "timeout" logged during the shutdown process:

Code:

Apr  3 23:37:49 mail postfix/amavisd/smtpd[11477]: timeout after END-OF-MESSAGE from localhost[]
Apr  3 23:37:49 mail postfix/amavisd/smtpd[11477]: disconnect from localhost[] ehlo=1 mail=1 rcpt=1 data=1 commands=4

On researching "timeout after END-OF-MESSAGE from localhost", I found several posts linking the behavior to SpamAssassin or amavisd-new. So I looked through my conf folder for differences in those configs and found only that the server with the delay was missing the IPv6 address entries in @mynetworks. I will add them just in case and try again.

On the server with the timeout, there are additional files related to SpamAssassin that aren't present on the server without issues. These are: a directory named ~/conf/sa with a file inside name, and in the root of the conf folder and I'm going to try to clean those up and see if that changes anything.

Re: Stopping zimlet webapp... takes 10-15 minutes

Posted: Sat Sep 15, 2018 12:24 am
by L. Mark Stone
FWIW I always remove every trace of IPv6 before deploying Zimbra on an operating system. When IPv6 is ubiquitous, I'll remove every trace of IPv4 before deploying Zimbra on an operating system.

In my experience, Zimbra works well on either an IPv4 OR an IPv6 system, but I have always found "something" (as Roseanne Roseannadanna would say...) when both are present on a Zimbra server.

Hope that helps,

Re: Stopping zimlet webapp... takes 10-15 minutes

Posted: Mon Sep 17, 2018 7:25 am
by andrey.ivanov
I've sometimes seen shutdowns that take a long time. Usually it's because some connections established from the proxy to the backend are still active or in the 'TIME_WAIT' state. What I usually do to avoid it:
* Stop the proxy and memcached services.
* On the mailbox server, I monitor the connections coming from the proxy using iptstate -t. I've found that connections in the 'TIME_WAIT' state cause the shutdown delays (at least for me), so I wait until they disappear (usually about 2 minutes).
* Then I stop all the other Zimbra services.

Using this sequence to stop the Zimbra mailbox server, I no longer have the delays I had in the past.
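
The "wait until TIME_WAIT disappears" step above could be scripted roughly as follows. This is only a sketch under some assumptions: it parses the text output of `ss -tan` rather than iptstate (ss shows the local socket table, not the conntrack table that iptstate reads, so the two views may differ), and the function names, the 180 second timeout and the 5 second polling interval are invented for the example.

```python
import time
from typing import Callable


def count_time_wait(ss_output: str, proxy_ip: str) -> int:
    """Count TIME-WAIT sockets whose peer is proxy_ip in `ss -tan` text output."""
    count = 0
    for line in ss_output.splitlines():
        fields = line.split()
        # Assumed columns: State Recv-Q Send-Q Local:Port Peer:Port
        if len(fields) < 5 or fields[0] != "TIME-WAIT":
            continue
        # Drop the port; strip the brackets ss puts around IPv6 addresses.
        peer_host = fields[4].rsplit(":", 1)[0].strip("[]")
        if peer_host == proxy_ip:
            count += 1
    return count


def wait_for_drain(
    read_sockets: Callable[[], str],
    proxy_ip: str,
    timeout_s: float = 180.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until no TIME-WAIT sockets from proxy_ip remain.

    Returns True once they are gone, or False if timeout_s elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if count_time_wait(read_sockets(), proxy_ip) == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)
```

In practice read_sockets could be a small function that runs `ss -tan` through subprocess.run and returns its stdout. A shutdown script would stop the proxy and memcached first, call wait_for_drain, and only then stop the remaining services.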

Re: Stopping zimlet webapp... takes 10-15 minutes

Posted: Fri Sep 21, 2018 9:52 pm
by PaperAdvocate
@ L. Mark Stone

I cleaned up the extra SpamAssassin conf files and other misc items to make it match the server without the issue, all without result. Set vm.swappiness=1 (as 0 now means to disable swap altogether), also without result. Ran the and got these results (which are the same as the results on the server without the shutdown delay):

Code:

[OK] InnoDB File per table is activated
[OK] InnoDB buffer pool / data size: 3.9G/661.9M
[OK] Ratio InnoDB log file size / InnoDB Buffer pool size: 500.0M * 2/3.9G should be equal 25%
[!!] InnoDB buffer pool instances: 8
[--] InnoDB Buffer Pool Chunk Size not used or defined in your version
[OK] InnoDB Read buffer efficiency: 100.00% (754767404 hits/ 754803238 total)
[!!] InnoDB Write Log efficiency: 85.19% (206209 hits/ 242065 total)
[OK] InnoDB log waits: 0.00% (0 waits / 35856 writes)

I'm still curious about the amavis interaction in the delay... do you know if there is a way to bypass/disable amavis for troubleshooting?

Thank you for the info, I'll look into it. My concern is automating this for a graceful shutdown in the event of a power outage (which means I won't be logged on to do the steps manually).

Re: Stopping zimlet webapp... takes 10-15 minutes

Posted: Fri Jul 03, 2020 5:02 am
by PaperAdvocate
Posting it here as well... I wanted to post what ended up fixing it for me (I've posted about this issue in another thread). Zimbra support had an open ticket with me for a year and never resolved it (I still love Zimbra anyway). For an unrelated reason, I had to do a database repair and ran the steps found here:

After that, no more huge delay on shutdown/restart. Hopefully this helps others...