I would need a bit more information to even make a guess.
Have you grabbed any other stats? Do you have an educated guess as to which subsystem is involved: disk, CPU, interrupts, memory, network, etc.?
Dedicated hardware or virtualized (KVM, Xen, VMware, etc.)? What type of RAID configuration are you running with those SATA disks?
What is the network link speed on your host, and have you done anything to tune it if connections are dropping or being denied? See, e.g., https://access.redhat.com/sites/default/files/attachments/20150325_network_performance_tuning.pdf
Code:
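# Per-process memory (RSS/VSZ) and thread counts
ps axo pid,ppid,rss,vsz,nlwp,cmd
# Errors logged in the Zimbra logs
grep -i error /opt/zimbra/log/*.log
# Watch the MySQL slow query log live
tail -f /opt/zimbra/log/myslow.log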
What does "freeze" mean? Are the users seeing lockups/delays, or are you rebooting the server because it has locked up, is unresponsive on the network, and your out-of-band console access hangs?
What is the mix of web users to POP/IMAP/etc.? How many established connections are there when it happens? Anything in dmesg? Anything in /var/log/messages? Etc.
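To put a number on the established connections question, here is a rough sketch, assuming ss from iproute2 is available on the box. The function name count_states is just something I made up for illustration. It tallies TCP connections by state from ss -tan style output on stdin, where the first line is the header and the first column is the state.

```shell
# Tally TCP connections by state.
# Reads `ss -tan` style output on stdin: line 1 is the header,
# column 1 is the connection state (ESTAB, LISTEN, TIME-WAIT, ...).
# count_states is a hypothetical helper name, not a standard tool.
count_states() {
    awk 'NR > 1 { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

# Live usage (assumes ss is installed):
#   ss -tan | count_states
```

Run it while the freeze is happening and again when things are normal, then compare the ESTAB counts.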
I tend to run vmstat and drill deeper from there, but you need to exhaust looking through your logs for errors or warnings. It could be as simple as disk errors, and the logs would have that information. Depending on how the hardware fails, it can lock up the bus until a kernel watchdog fires and "unfreezes" everything. It's fairly obvious in dmesg if that is happening, and easy to verify from there.
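As a starting point for the log check, something along these lines would pull the usual suspects out of the kernel log. It is only a sketch: scan_kernel_log is a made-up helper name, and the pattern list is my own guess at common disk, lockup and watchdog messages, not an exhaustive set, so adjust it for your hardware.

```shell
# Filter kernel log text on stdin for lines that commonly accompany
# disk trouble, stalls, or watchdog activity.
# scan_kernel_log is a hypothetical helper; the patterns are illustrative only.
scan_kernel_log() {
    grep -Ei 'i/o error|soft lockup|blocked for more than|watchdog|ata[0-9]+.*(error|reset|timeout)|out of memory'
}

# Live usage:
#   dmesg | scan_kernel_log
#   scan_kernel_log < /var/log/messages
```

If that turns up ATA resets or I/O errors around the time of a freeze, the disks or controller are the first thing to look at.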
I tend to focus on kernel subsystems and work back from there. Lots of ways to isolate the problem.