How to retaliate against the Yandex bot (Updated)


A few days ago I noticed that my site’s bandwidth usage was suddenly up. And I mean way up. Bandwidth is expensive, so I dug into the server logs and found that one particular computer was repeatedly accessing every page on my domain, several times a day. Further research revealed that the culprit is a bot that indexes web pages for a Russian search engine called Yandex.
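
If you want to do the same kind of log digging, here is a rough Python sketch that totals bytes served per client. It assumes your server writes Apache's "combined" log format; the function name, the sample lines, and the sample user-agent strings are all made up for illustration, and only the IP address comes from this post.

```python
import re
from collections import Counter

# Apache "combined" log format is assumed here; adjust the pattern if your
# server's LogFormat differs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<size>\S+) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def bandwidth_by_client(lines):
    """Return total bytes sent per (IP, user agent) pair."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not a combined-format entry; skip it
        size = match.group("size")
        # Apache logs "-" when no response body was sent.
        sent = 0 if size == "-" else int(size)
        totals[(match.group("ip"), match.group("agent"))] += sent
    return totals


# Made-up sample lines for illustration only (not taken from real logs).
SAMPLE_LOG = [
    '77.88.26.27 - - [01/Jan/2000:10:00:00 +0000] "GET /index.htm HTTP/1.1" 200 5120 "-" "Yandex/1.0 (made-up example)"',
    '77.88.26.27 - - [01/Jan/2000:10:00:05 +0000] "GET /about.htm HTTP/1.1" 200 5120 "-" "Yandex/1.0 (made-up example)"',
    '192.0.2.10 - - [01/Jan/2000:10:01:00 +0000] "GET /index.htm HTTP/1.1" 200 2048 "-" "ExampleBrowser/1.0"',
    '192.0.2.10 - - [01/Jan/2000:10:01:30 +0000] "GET /index.htm HTTP/1.1" 304 - "-" "ExampleBrowser/1.0"',
]

if __name__ == "__main__":
    # Print the heaviest clients first.
    for (ip, agent), total in bandwidth_by_client(SAMPLE_LOG).most_common(5):
        print(f"{total:>10}  {ip}  {agent}")
```

To run it against a real log, pass an open file object instead of SAMPLE_LOG. Whatever client sits at the top of the list is the one to look at first.
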
My attempts to rebuff the Yandex bot using the familiar robots.txt method failed utterly. The Yandex bots hitting my site ignored that file, which causes no small amount of stomach acid online among people like me who don’t have money to burn.
I decided to retaliate.


I added the following lines to my .htaccess file, so that every time a bot whose name begins with Yandex tries to access my site it gets a 403 error instead of downloading the page it’s trying to see.

# BAD BOT EXCLUSION
# block known trouble makers dumb enough to
# announce who they are
SetEnvIfNoCase User-Agent "^Yandex" bad_bot
<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
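
Before relying on a pattern like ^Yandex, it is worth checking which user-agent strings it would actually catch. The Python sketch below mimics the match: case-insensitive and anchored at the start of the string. Python's re module is only a stand-in for the regex engine Apache uses, though the two should agree on a pattern this simple. The function name and the first two sample strings are made up; the third is, to the best of my knowledge, the user agent Yandex currently publishes for its main crawler.

```python
import re

# Same pattern as the SetEnvIfNoCase line: starts with "Yandex", any case.
BAD_BOT = re.compile(r"^Yandex", re.IGNORECASE)


def is_bad_bot(user_agent: str) -> bool:
    """True if the user agent would be tagged bad_bot by the pattern."""
    return BAD_BOT.search(user_agent) is not None


SAMPLES = [
    "Yandex/1.0 (made-up example)",   # hypothetical, begins with "Yandex"
    "yandexSomething/2.0",            # hypothetical, lowercase still matches
    "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
]

if __name__ == "__main__":
    for agent in SAMPLES:
        print(is_bad_bot(agent), agent)
```

The third string does not begin with "Yandex", so an anchored pattern lets it through. Dropping the ^ would catch it, at the cost of also matching any other user agent that merely mentions Yandex.
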

The bandwidth dropped back down to where it used to be, but I noticed one stupid Yandex bot kept coming back from IP address 77.88.26.27 even when I fed it a never-ending stream of 403 errors. Since every static page on my site ends in .htm and only my 403 error page ends in .shtml, I got nasty by adding these lines to my .htaccess file to target all visitors from 77.88.26.27 who try to access a page ending in .shtml:

# permanently redirect one IP's requests for any .shtml page (my 403 error page)
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{REMOTE_ADDR} ^77\.88\.26\.27$
RewriteRule \.shtml$ http://www.youtube.com/watch?v=oHg5SJYRHA0 [R=301,L]
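
A note on IP patterns in a RewriteCond: the pattern is a regular expression, so without ^ and $ anchors it can match in the middle of a longer address. The sketch below uses Python's re module as a stand-in for Apache's regex engine (an assumption, though the two should agree on a pattern this simple) to show the difference. The function names and the extra addresses are made up for the comparison.

```python
import re

# The same IP pattern, with and without anchors.
UNANCHORED = re.compile(r"77\.88\.26\.27")
ANCHORED = re.compile(r"^77\.88\.26\.27$")


def matches_unanchored(ip: str) -> bool:
    """True if the pattern appears anywhere in the address string."""
    return UNANCHORED.search(ip) is not None


def matches_anchored(ip: str) -> bool:
    """True only if the whole address string is exactly the target IP."""
    return ANCHORED.search(ip) is not None


# Hypothetical addresses chosen to show near misses.
CANDIDATES = ["77.88.26.27", "177.88.26.27", "77.88.26.270", "77.88.26.28"]

if __name__ == "__main__":
    for ip in CANDIDATES:
        print(f"{ip:<14} unanchored={matches_unanchored(ip)} anchored={matches_anchored(ip)}")
```

Only the anchored form limits the redirect to the one address; the unanchored form would also send 177.88.26.27 and 77.88.26.270 off to YouTube.
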

That Yandex bot now gets rickrolled every time it tries to index my site. Problem solved.

6/30 Update: Looks like the Yandex bots have gone away. My server logs show zero hits from that domain. Now witness the firepower of this fully ARMED and OPERATIONAL battle station!
firepower -- screw subtlety
#^@&ing spammers.

8/13 Update: Even better!

5 comments

  1. Jon Gifford

    You are now officially one of my heroes. Seriously.
    I hate this thing. Starting with the fact that it’s from an essentially lawless place for trade practices. It might as well be in North Korea.
    While I don’t really care about bandwidth, it’s so obviously up to no good.
    /for those deficient in sarcasm: Actually, I care. Just remembered how many other sites I have on that one blade server 😉

  2. Alo Konsen

    Thanks, Jon. It’s satisfying to stick it to a spammer.
    Can’t wait ’til the bots start nibbling at the other nasty bits of bait I’ve scattered around.

  3. Wayne

    Hopefully this works. I’ve been trying to block these Yandex bots for around a month and a half. Will let you know the outcome.
    Cheers

  4. anonymousradioshow

    Why this site is fucking stoopid:
    1. You use lame-ass language, the word “darn” (as in KNITTING?) instead of proper ENGLISH; F-U-C-K
    2. Obama name calling – whaddaya? 12 years old?
    3. That ridiculous ad for our “heroes” DOGS – What the fuck are animals doing in an amerikin invasion?
    4. Your slamming of Universal HEALTH care – you racist fuck! Your buddies in the us army get free healthcare already, no niggers?
    5. I’m so fucking smart; i “defeated” the Yanex bot – are you fucking kidding me!! – any asshole can write code, moron.
    6. the caliber of people you attract: “…it’s from an essentially lawless place for trade practices. It might as well be in North Korea”
    7. and (of course) YOUR IRRELEVANT TAKE on the “arms” race. GET A FUCKING LIFE – AND STOP TAKING UP SPACE !!