Greg Reemler
Sun, 20 Apr 2008 11:16:38 +0000
Ironically, our favorite software vendors have decided, in a nutshell, to redefine Dr. David Luckham’s definition of “event cloud” to match the lack of capabilities in their products.
This is really funny, if you think about it.
The definition of “event cloud” was coordinated over a long (over two-year) period with the leading vendors in the event processing community and is based on the same concepts in David’s book, The Power of Events.
But, since the stream-processing oriented vendors do not yet have the analytical capability to discover unknown causal relationships in contextually complex data sets, they have chosen to reduce and redefine the term “event cloud” to match their products’ lack of capability. Why not simply admit they can only process a subdomain of the CEP space as defined by both Dr. Luckham and the CEP community-at-large?
What’s the big deal? Stream processing is a perfectly respectable profession!
David, along with the “event processing community,” defined the term “event cloud” as follows:
Event cloud: a partially ordered set of events (poset), either bounded or unbounded, where the partial orderings are imposed by the causal, timing and other relationships between the events.
Notes: Typically an event cloud is created by the events produced by one or more distributed systems. An event cloud may contain many event types, event streams and event channels. The difference between a cloud and a stream is that there is no event relationship that totally orders the events in a cloud. A stream is a cloud, but the converse is not necessarily true.
Note: CEP usually refers to event processing that assumes an event cloud as input, and thereby can make no assumptions about the arrival order of events.
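To make the cloud-versus-stream distinction concrete, here is a minimal sketch in Python. It is not taken from the glossary or from The Power of Events. The event names and the `precedes` pairs are hypothetical, chosen only to show one relation that totally orders its events (a stream) and one that does not (a cloud).

```python
from itertools import combinations


def transitive_closure(precedes):
    """Return every (a, b) pair where a precedes b directly or transitively."""
    closure = set(precedes)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_totally_ordered(events, precedes):
    """True if every pair of distinct events is comparable under the relation."""
    closure = transitive_closure(precedes)
    return all(
        (a, b) in closure or (b, a) in closure
        for a, b in combinations(events, 2)
    )


# Hypothetical stream: three trades of one symbol, ordered by timestamp.
stream_events = ["trade_1", "trade_2", "trade_3"]
stream_order = {("trade_1", "trade_2"), ("trade_2", "trade_3")}

# Hypothetical cloud: one event causes two others, but those two are
# not related to each other by any causal or timing relationship we know of.
cloud_events = ["rate_cut", "bank_rally", "bond_selloff"]
cloud_order = {("rate_cut", "bank_rally"), ("rate_cut", "bond_selloff")}

print(is_totally_ordered(stream_events, stream_order))
print(is_totally_ordered(cloud_events, cloud_order))
```

The stream passes the check because every pair of trades is comparable. The cloud fails it because nothing orders `bank_rally` against `bond_selloff`, which is exactly the case the definition says a stream cannot represent.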
Oddly enough, quite a few event processing vendors seem to have succeeded at confusing their customers, as evident in this post, Abstracting the CEP Engine, where a customer has seemingly been convinced by the disinformational marketing pitches: “there are no clouds of events, only ordered streams.”
I think the problem is that folks are not comfortable with uncertainty and hidden causal relationships, so they give the standard “let’s run a calculation over a stream” example and state “that is all there is…”, confusing the customers who know there is more to solving complex event processing problems.
So, let’s make this simple (we hope), referencing the invited keynote at DEBS 2007, Mythbusters: Event Stream Processing Versus Complex Event Processing.
In a nutshell (these examples are in the PDF above, BTW):
The set of market data from Citigroup (C) is an example of multiple “event streams.”
The set of all events that influence the NASDAQ is an “event cloud”.
Why?
Because a stream of market data is a linearly ordered set of data, related by the timestamp of each transaction and linked (relatively speaking) in context because it is Citigroup market data. So, event processing software can process a stream of market data, compute a VWAP (volume-weighted average price) if it chooses, and estimate a good time to enter and exit the market. This is “good”.
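As a rough illustration of the kind of calculation a stream makes easy, here is a minimal Python sketch of a VWAP over a sliding time window. The class name `SlidingVWAP`, the 60-second window, and the sample trades are assumptions made for this example, not any vendor’s API. Note that the eviction logic only works because trades arrive in timestamp order, which is precisely the total ordering that a stream provides and a cloud does not.

```python
from collections import deque


class SlidingVWAP:
    """Volume-weighted average price over the last `window_seconds` of trades.

    Assumes trades arrive in non-decreasing timestamp order (a stream)
    and that every trade has a positive volume.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.trades = deque()   # (timestamp, price, volume), oldest first
        self.pv_sum = 0.0       # running sum of price * volume in the window
        self.volume_sum = 0.0   # running sum of volume in the window

    def add(self, timestamp, price, volume):
        """Add one trade and return the VWAP of the current window."""
        self.trades.append((timestamp, price, volume))
        self.pv_sum += price * volume
        self.volume_sum += volume
        # Evict trades that have fallen out of the time window.
        cutoff = timestamp - self.window_seconds
        while self.trades and self.trades[0][0] <= cutoff:
            _, old_price, old_volume = self.trades.popleft()
            self.pv_sum -= old_price * old_volume
            self.volume_sum -= old_volume
        return self.pv_sum / self.volume_sum


# Hypothetical trades for a single symbol: (seconds, price, shares).
vwap = SlidingVWAP(window_seconds=60)
for ts, price, volume in [(0, 10.0, 100), (30, 11.0, 100), (70, 12.0, 200)]:
    print(ts, vwap.add(ts, price, volume))
```

If the trades arrived out of order, the oldest-first eviction would no longer be valid, so the same code could not simply be pointed at a cloud of events.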
However, the same software, at this point in time, cannot process many market data feeds in NASDAQ and provide a reasonable estimate of why the market moved in a certain direction based on a statistical analysis of a large set of event data where the cause-and-effect features (in this case, relationships) are difficult to extract. (BTW, this is generally called “feature extraction” in the scientific community.)
Why?
Because the current state of the art of stream-processing oriented event processing software cannot perform the required backward chaining to infer causality from large sets of data where causality is unknown, undiscovered and uncertain.
Forward chaining, continuous queries, and time series analytics across sliding time windows of streaming data can only address a subset of the overall CEP domain as defined by Dr. Luckham et al.
It is really that simple. Why cloud and confuse the community?
We like forward chaining using continuous queries and time series analysis across sliding time windows of streaming data.
There is nothing dishonorable about forward chaining using continuous queries and time series analysis across sliding time windows of streaming data.
There is nothing wrong with forward chaining using continuous queries and time series analysis across sliding time windows of streaming data.
There is nothing embarrassing about forward chaining using continuous queries and time series analysis across sliding time windows of streaming data.
Forward chaining using continuous queries and time series analysis across sliding time windows of streaming data is a subset of the CEP space, just like the definition above, repeated below:
The difference between a cloud and a stream is that there is no event relationship that totally orders the events in a cloud. A stream is a cloud, but the converse is not necessarily true.
It is really simple. Why cloud a concept so simple and so accurate?