Handling SYN Flood and SYN Queue


Postby JDunphy » Fri Aug 23, 2019 2:27 pm

There has been a low-bandwidth SYN flood using spoofed IPs hitting most of the IPv4 address space. We are seeing it across 5 data centers with 5 different providers. It hits every service that is listening, so if you are seeing SYN cookies being deployed you could be both experiencing and participating in this low-bandwidth amplification attack. The quickest way to verify whether you are participating is to run one of these:

Code: Select all

% netstat -na |grep SYN          # sockets sitting in SYN_RECV / SYN_SENT
or
% ss -n state syn-recv           # the same half-open connections via ss
or
% netstat -s |grep SYN           # counters: SYNs to LISTEN dropped, SYN cookies sent, etc.

Observe whether these entries fail to disappear, or whether the counters show a high number of resets and SYN cookies. We normally have nothing listed, but prior to mitigation we were seeing between 50-70 connections in SYN or SYN-RECV state.
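
If you want to keep an eye on the half-open count while you experiment, a minimal sketch using standard tools is something like the following (the -H flag to suppress the ss header needs a reasonably recent iproute2):

Code: Select all

# watch -n 5 'ss -Hn state syn-recv | wc -l'   # print the number of half-open connections every 5 seconds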

The issue is that you receive 1 SYN packet carrying the spoofed IP address of a victim. Your server will then retransmit the SYN+ACK with exponential backoff, holding the half-open connection for roughly a minute or more at the default value of net.ipv4.tcp_synack_retries=5. Each one takes a slot in your SYN queue, which can quickly fill up and lock out real users unless you have SYN cookies enabled.

My proposed solution is not ideal, but it can work fairly well if you have DNS running on your Zimbra box alongside the services being hit by this attack. The idea is to watch TCP port 53 (even better, a TCP port that is unique to your servers and not normally open elsewhere - don't pick common ports like 443, etc.), count the number of attempted connections, and use 2 firewall rules to automatically block those IPs so you don't participate in the attack. We will use an ipset with a timeout value that removes the IPs automatically sometime in the future. I'll show an example with 24 hrs, but it could be something very short, like 5 mins, while you figure this out.

Code: Select all

# su -
# ipset create blacklist24hr hash:ip hashsize 4096 timeout 86400   # entries expire automatically after 86400s (24 hrs)
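
One thing to be aware of (general ipset behaviour, not covered above): the set lives only in kernel memory, so it disappears on reboot. If you want it to survive, recreate it at boot before the iptables rules that reference it load, or save and restore it, e.g. (the file path is just an example):

Code: Select all

# ipset save blacklist24hr > /etc/ipset.blacklist24hr    # dump the set to a file
# ipset restore < /etc/ipset.blacklist24hr               # reload it, e.g. from a boot script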

Then add these to your firewall rules. I am showing iptables on CentOS 6 and 7, where the rules live in /etc/sysconfig/iptables, but you can also do this from the command line if you use a different interface to netfilter.

Code: Select all

-A Block -m set --match-set blacklist24hr src -j DROP
-A Block -p tcp --dport 53 -m state --state NEW -m recent --set --name DNS
-A Block -p tcp --dport 53 -m state --state NEW -m recent --update --seconds 700 --hitcount 10 --rttl --name DNS -j SET --add-set blacklist24hr src
-A Block -m state --state NEW -m tcp -p tcp --dport 53 -j ACCEPT
-A Block -m state --state NEW -m udp -p udp --dport 53 -j ACCEPT

Here Block is a custom chain (typically INPUT, or FORWARD on a gateway), so make it INPUT in your configuration or on the command line.
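
For reference, here is a sketch of the same rules entered directly with the iptables command, assuming you apply them to the INPUT chain and that nothing earlier in that chain already accepts or rejects this traffic:

Code: Select all

# iptables -A INPUT -m set --match-set blacklist24hr src -j DROP
# iptables -A INPUT -p tcp --dport 53 -m state --state NEW -m recent --set --name DNS
# iptables -A INPUT -p tcp --dport 53 -m state --state NEW -m recent --update --seconds 700 --hitcount 10 --rttl --name DNS -j SET --add-set blacklist24hr src
# iptables -A INPUT -p tcp --dport 53 -m state --state NEW -j ACCEPT
# iptables -A INPUT -p udp --dport 53 -m state --state NEW -j ACCEPT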

Either way, these rules say that if we receive 10 attempted connections to the TCP DNS service within 700 seconds, that source IP is added to blacklist24hr. You can then observe the blocked IPs at any time by doing something like this:

Code: Select all

# su -
# ipset list blacklist24hr
Name: blacklist24hr
Type: hash:ip
Header: family inet hashsize 4096 maxelem 65536 timeout 86400
Size in memory: 68856
References: 2
Members:
34.249.72.29 timeout 75370
63.33.107.88 timeout 75311
54.77.86.28 timeout 75234
134.209.197.140 timeout 7788
104.248.93.27 timeout 7556
134.209.198.237 timeout 7855
134.209.204.225 timeout 9483
159.69.42.212 timeout 4326
52.51.174.182 timeout 75453
104.248.85.96 timeout 4315
54.229.20.26 timeout 75048
104.248.93.69 timeout 7554
134.209.206.130 timeout 8032
134.209.193.107 timeout 7616
54.229.41.139 timeout 75040
134.209.206.170 timeout 8026
54.72.16.57 timeout 75397
134.209.197.21 timeout 7785
167.71.15.163 timeout 4313
134.209.207.173 timeout 8732
134.209.196.85 timeout 7797
134.209.81.178 timeout 8145
52.19.32.79 timeout 75036
104.248.93.253 timeout 7489
52.50.56.61 timeout 75469
52.213.23.238 timeout 75241
165.22.203.65 timeout 9476
34.246.156.178 timeout 75072
128.199.55.152 timeout 7615
134.209.93.78 timeout 8894
104.248.89.196 timeout 7436
134.209.202.54 timeout 7910
104.248.93.29 timeout 7551
134.209.203.232 timeout 7963
134.209.194.206 timeout 7673
134.209.195.132 timeout 7730
52.48.34.24 timeout 75510
134.209.198.83 timeout 7860

Hint: Instead of

Code: Select all

-A Block -m set --match-set blacklist24hr src -j DROP

You could change that to ACCEPT until you are sure this is appropriate for your environment. The net effect of this attack is that the same IPs are hitting every TCP socket in a LISTEN state, so your external Zimbra ports (443, 25, 80, etc.) would have been participating in the amplification. It has been ongoing for a while, and the low rate of the attack has created a mess for the victims whose IP addresses are being spoofed. Sometimes it's a whole country, other times specific ISPs; every night it has been a different set of victims. Some on NANOG are speculating it is a research project, but it could just as easily be the start of a state-sponsored attack. Until BCP 38 ingress filtering is universally deployed, there is very little to stop this type of spoofing. If the frequency increases, there could be a lot of pain for everyone. These types of attacks are pretty common, but this one has been sustained for a while, so I thought I would share one method of dealing with it.
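
If you want visibility while running in that observe-only mode, one option (my own variation using the standard LOG target, not part of the recipe above) is to log matches before accepting them:

Code: Select all

-A Block -m set --match-set blacklist24hr src -j LOG --log-prefix "SYNFLOOD-BLACKLIST: "
-A Block -m set --match-set blacklist24hr src -j ACCEPT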

Warnings: If someone knows you are doing this and has a grudge against you, they could spoof an IP address you actually need. That is less likely if you target a TCP service that is unique to you instead of port 53. I saw these probes on OpenVPN ports too, so they have done a port scan; scanning the entire 2^32 IPv4 address space takes only about 45 minutes these days over a 10Gb/s link, or is instant via a service like Shodan. TCP port 53 is generally not used for ordinary queries (mostly zone transfers, etc.), since normal lookups go over UDP 53, but accidentally blocking legitimate addresses you deal with is the real concern here, so consider making the timeout window much shorter than 24 hours to mitigate that. We haven't seen any false positives, but it is something to be aware of, because a threshold of 10 attempts over 700 seconds needs to be tuned to your environment. I think this is safe because it's TCP port 53, but it would be a really bad idea for port 25 or port 443 for obvious reasons.
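
If a legitimate address does end up in the set, you can check for it and remove it by hand with the standard ipset commands (the address below is just an example):

Code: Select all

# ipset test blacklist24hr 203.0.113.25     # is this address currently in the set?
# ipset del blacklist24hr 203.0.113.25      # remove it now instead of waiting for the timeout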

A few other things you can also do to help your servers ride through these SYN flooding storms that happen from time to time.

Code: Select all

# sysctl -w net.ipv4.tcp_synack_retries=1    # apply immediately from the command line

or add these lines to /etc/sysctl.conf and have sysctl reload that file:

# TCP SYN Flood Protection
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 2048   # or much larger depending on the size of the host
net.ipv4.tcp_synack_retries = 3
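
After editing /etc/sysctl.conf, a quick way to apply and verify the values (plain sysctl usage, nothing specific to this setup) is:

Code: Select all

# sysctl -p                                # reload /etc/sysctl.conf
# sysctl net.ipv4.tcp_syncookies net.ipv4.tcp_max_syn_backlog net.ipv4.tcp_synack_retries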

This is a pretty nice description of the problem if you want to know more.
ref: https://blog.cloudflare.com/syn-packet- ... -the-wild/

