Yesterday I noticed that one of my servers was seeing more traffic than usual. The hourly traffic limit I had set was reached, and my mailbox filled up with traffic warning messages. I wanted to look into the matter but wondered: how can I quickly see which IP connections exist? Fortunately the server runs Debian Linux, so I simply installed the package iptraf. This IP LAN monitor generates various network statistics and showed me that most of the new connections came from one particular IP range: 18.104.22.168/16. A WHOIS lookup shows that this range belongs to Baidu, the well-known Chinese search engine company. Several hundred megabytes of traffic per hour was too much for me. You can see the traffic line here:
At first I thought it might be the indexing work of many Baidu spiders, or of spiders claiming to be Baidu. On that subject I found a very helpful post by the folks at Perishable Press, which explains how to block Baidu's spiders through the .htaccess file.
This did not really solve the problem, so I took more serious measures and blocked Baidu's whole address space with the iptables firewall. This is the command you need (you must be root to run it):
iptables -A INPUT -s 22.214.171.124/16 -j DROP
Just remember that this rule really blocks all traffic from the Baidu network. So if your servers are meant to reach Chinese visitors, you might want to think twice before you block it all!
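Since a rule like this is easy to add and easy to forget, it can help to wrap the block, unblock and inspect steps in small shell functions. The following is only a minimal sketch: the function names and the IPTABLES override variable are my own invention for illustration, while the flags themselves (-A, -D, -L, -n, -v, --line-numbers) are standard iptables options. Everything still has to run as root.

```shell
# Hypothetical helper functions, not part of iptables itself.
# IPTABLES may be overridden (for example IPTABLES=echo) to print the
# commands instead of executing them.
IPTABLES="${IPTABLES:-iptables}"

# Append a rule to the INPUT chain that drops everything from the given source range.
block_range() {
    "$IPTABLES" -A INPUT -s "$1" -j DROP
}

# Delete that rule again by repeating the same specification with -D.
unblock_range() {
    "$IPTABLES" -D INPUT -s "$1" -j DROP
}

# List the INPUT chain with numeric addresses, packet counters and rule numbers.
show_input_rules() {
    "$IPTABLES" -L INPUT -n -v --line-numbers
}
```

Calling block_range 22.214.171.124/16 is then equivalent to the command above, show_input_rules lets you check via the packet counters whether the rule is actually matching traffic, and unblock_range 22.214.171.124/16 removes it again should you change your mind. Note that rules added this way do not survive a reboot unless you save them separately.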