The UNIX and Linux Forums  




Old 04-13-2008
Spam Filtering: Understanding SEP and CEP

Greg Reemler
Mon, 14 Apr 2008 04:56:52 +0000

To help folks further understand the differences between CEP and SEP, and prompted by Marc’s reply in the blogosphere, More Cloudy Thoughts, here is the scoop.
In the early days of spam filtering, around 10 years ago, spam was detected with rule-based systems. In fact, one of the first papers to document rule-based approaches to spam filtering is E-Mail Bombs and Countermeasures: Cyber Attacks on Availability and Brand Integrity, published in IEEE Network Magazine, Volume 12, Issue 2, pp. 10-17 (1998). At the time, rule-based approaches were common (the state of the art) in antispam filtering.
Over time, however, spammers got more clever and found many ways to poke holes in rule-based detection approaches. They learned to write with spaces between the letters in words, they changed the subject and message text frequently, they randomized their originating IP addresses, they used the IP addresses of your best friends, they changed the timing and frequency of the spam, and so on, ad infinitum.
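To make the letter-spacing trick concrete, here is a minimal sketch of a keyword rule and a message that slips past it. The pattern list and the function name `rule_based_is_spam` are hypothetical, invented for illustration; the post does not describe any specific rule set.

```python
import re

# Hypothetical rule base: each rule is a regular expression for a known spam phrase.
BLOCKED_PATTERNS = [
    re.compile(r"\bviagra\b", re.IGNORECASE),
    re.compile(r"\bfree money\b", re.IGNORECASE),
]


def rule_based_is_spam(message: str) -> bool:
    """Flag a message if any hand-written rule matches it."""
    return any(pattern.search(message) for pattern in BLOCKED_PATTERNS)


plain = "Buy viagra now"
spaced = "Buy v i a g r a now"  # same word, with spaces between the letters

print(rule_based_is_spam(plain))   # True: the rule matches
print(rule_based_is_spam(spaced))  # False: the rule is evaded
```

Each new evasion needs another hand-written rule, which is the maintenance burden described here.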
Not to sound like an elitist for speaking the truth, but the more operational experience you have with detection-oriented solutions, the more you will understand that rule-based approaches (alone) are neither scalable nor efficient. If you followed a rules-based approach (only) against heavy, complex spam (the type of spam we see in cyberspace today), you would spend much of your time writing rules and still not stop very much of the spam!
The same is true for the security situation-detection example in Marc’s reply.
Like Google’s Gmail spam filter, and Microsoft’s old Mr Clippy (the goofy help algorithm of the past), you need detection techniques that use advanced statistical methods to detect complex situations as they emerge. With rules, you can only detect simple situations unless you have a tremendous amount of resources to build and maintain very complex rule bases (and even then rules have limitations for real-time analytics).
We did not make this up at Techrotech, BTW. Neither did our favorite search engine and leading free email provider, Google!
This is precisely why Gmail has a great spam filter. Google detects spam with a Bayesian classifier, not a rule-based system. If they used (only) a rule-based approach, your Gmail inbox would be full of spam!!!
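For readers who have not seen one, the following is a minimal sketch of a naive Bayes text classifier, the simplest kind of Bayesian classifier. It is not Gmail's implementation, which is not public. The class name, the whitespace tokenizer, the add-one smoothing and the four training messages are all assumptions made for illustration.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy naive Bayes classifier over whitespace-separated words."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Record one labeled message ('spam' or 'ham')."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def _log_score(self, words, label: str) -> float:
        # log P(label) + sum of log P(word | label), with add-one smoothing
        # so that unseen words do not zero out the whole score.
        # Assumes each label has at least one training message.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        score = math.log(self.message_counts[label] / total_messages)
        for word in words:
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def is_spam(self, text: str) -> bool:
        words = text.lower().split()
        return self._log_score(words, "spam") > self._log_score(words, "ham")


# Made-up training data, purely for illustration.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize claim now", "spam")
spam_filter.train("meeting notes attached", "ham")
spam_filter.train("lunch at noon tomorrow", "ham")

print(spam_filter.is_spam("claim free money"))        # True
print(spam_filter.is_spam("notes from the meeting"))  # False
```

Nobody writes a rule for "claim free money" here: the classifier weighs word frequencies learned from labeled examples, so retraining on new spam adapts it without new hand-written rules.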
The same is true for search and retrieval algorithms, but that is a topic for another day. However, you can bet your annual paycheck that Google uses a Bayesian type of classifier in their highly confidential search and retrieval (and, hint, classification) algorithms.
In closing, don’t let the folks selling software and analysts promoting three-letter acronyms (TLAs) cloud your thinking.
What we are seeing in the marketplace, the so-called CEP marketplace, are simple event processing engines. CEP is already happening in the operations of Google, a company that needs real-time CEP for spam filtering and also for search and retrieval. We also see real-time CEP in top-quality security products that use advanced neural networks and Bayesian networks to detect problems (fraud, abuse, denial-of-service attacks, phishing, identity theft) in cyberspace.



The UNIX and Linux Forums Content Copyright ©1993-2008 The CEP Blog All Rights Reserved -Ad Management by RedTyger
