The Bot Hunter: An Event Processing Challenge


 

Tim Bass
08-15-2008 02:35 AM
Recently we penned The Attack of the Spiders from the Clouds, where we mentioned how cloud computing infrastructures can be used to stage malicious or accidental network attacks.

Today I challenge our CEP/ESP/EP vendors (or SIs) to create the following solution to detect and block rogue bots on Apache web sites. I will install and test each submitted solution on The UNIX Forums and post the results here.

Here are some basic requirements:

  1. Your solution must run on Linux and be installable and configurable remotely with SSH or HTTP. There will be no physical access to the server. No exceptions.
  2. Preferably, the configuration can be done with a Web-Based Interface (WBI) - a browser.
  3. Your solution will listen to continuous updates to the Apache2 access log, exact location configurable in your solution, and identify robots (bots), also known as spiders, from the log.
  4. Your solution will provide a confidence metric, key indicator (KI), for each bot detected, from 0 to 10, where 10 indicates “absolutely a bot” and 0 indicates “absolutely not a bot.”
  5. Your solution will record the IP address and KI of each bot you identify in a file/table called, for example, ./bot_scorecard.txt, where each line is the IP address of a bot, followed by a semicolon (or other delimiter of your choice) and the confidence factor. For example, 10.0.0.1;10 means that 10.0.0.1 is a bot, 100% sure.
  6. Your solution must compare detected bots against files/tables called, for example, ./bots_allowed.txt and ./bots_denied.txt, which are in the format IP address/mask, for example 10.0.0.1/24 or 10.0.0.1/32.
  7. If the KI “confidence factor” of the IP address of your detected bot is higher than the tunable “is a bot” KI, then your solution should update the tables/files and then call iptables and block the bot.
  8. It should send an email to one or more email addresses with a message, for example: “New Bot Detected - Confidence 8” with the IP address, etc. in the message. Another example would be an email, “Bot Blocked” - with details, etc.
  9. You cannot automatically block any traffic that is not a bot. Blocking one “non-bot” results in failure, no exceptions.
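To make requirements 5 to 7 concrete, here is a minimal Python sketch of the scorecard line, the allowed/denied lookup, and the threshold test. Everything in it is illustrative rather than required: the function names are hypothetical, the precedence (allowed wins, then denied, then the KI threshold) is an assumption the requirements do not spell out, and the actual iptables call and email are left out.

```python
import ipaddress


def load_networks(lines):
    """Parse 'IP address/mask' entries, one per line, skipping blank lines."""
    # strict=False accepts entries with host bits set, such as 10.0.0.1/24.
    return [ipaddress.ip_network(line.strip(), strict=False)
            for line in lines if line.strip()]


def in_any(ip, networks):
    """Return True if ip falls inside at least one of the networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def scorecard_line(ip, ki):
    """Format one scorecard entry, e.g. '10.0.0.1;10'."""
    return f"{ip};{ki}"


def should_block(ip, ki, threshold, allowed, denied):
    """Decide whether to block ip (assumed precedence, see note above)."""
    if in_any(ip, allowed):
        return False          # never block an allowed address
    if in_any(ip, denied):
        return True           # explicitly denied addresses are blocked
    return ki > threshold     # otherwise block only above the tunable KI
```

In use, the two lists would be loaded from ./bots_allowed.txt and ./bots_denied.txt (for example `load_networks(open("./bots_allowed.txt"))`), and a True result would be followed by the scorecard update, the iptables call, and the notification email.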
These are some basic requirements; I don’t want to restrict your thinking or solution, so be creative! Feel free to ask any questions in the comment section of this thread.

Remember, sometimes you may have to manage the state of IP addresses for hours, or days, before you can accurately determine whether one is a bot based on behavior alone. So, you will need to work with both long and short time windows. Latency is not important. Detection accuracy is important.
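As a rough illustration of what "long and short time windows" could mean in practice, the Python sketch below keeps per-IP request timestamps and reports counts over two sliding windows. The class name, the default window lengths, and the assumption that timestamps arrive in increasing order (as seconds) are all hypothetical choices for the example, not part of the challenge. How the two counts feed into a KI is left entirely to your solution.

```python
from collections import defaultdict, deque


class WindowCounter:
    """Count requests per IP over a short and a long sliding window."""

    def __init__(self, short_seconds=60, long_seconds=86400):
        self.short_seconds = short_seconds
        self.long_seconds = long_seconds
        # ip -> request timestamps in seconds, oldest first
        self.events = defaultdict(deque)

    def add(self, ip, timestamp):
        """Record one request; assumes timestamps arrive in order."""
        q = self.events[ip]
        q.append(timestamp)
        # Discard state that has aged out of the long window.
        while q and q[0] <= timestamp - self.long_seconds:
            q.popleft()

    def counts(self, ip, now):
        """Return (short_window_count, long_window_count) as of 'now'."""
        q = self.events.get(ip, ())
        short = sum(1 for t in q if t > now - self.short_seconds)
        long_ = sum(1 for t in q if t > now - self.long_seconds)
        return short, long_
```

A burst that only shows up in the short count and a slow crawl that only shows up in the long count are different behaviors, which is why both windows are kept side by side here.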

Anyone care to submit a solution for testing?


