Sunday, January 21, 2007

Complex Event Processing

Complex Event Processing (CEP) was the topic for the SDForum Business Intelligence SIG January meeting. Mark Tsimelzon, President, CTO and Founder of Coral8 spoke on "Drinking from a Fire Hose: the Why's and How's of Complex Event Processing".

Mark started out by showing us a long list of applications such as, RFID, financial securities, e-commerce, telecom and computer network security that share the same characteristics. Each of these applications can generate hundreds of thousands of event per second that need to be processed, filtered and have critical events identified and responded to in a millisecond or second timeframe.

The first response to building a system for one of these complex event processing applications is to load the data into a database and continuously run queries against the data. Unfortunately this introduces a number of delays that interfere with response time. Firstly there is the delay in loading the data into the database, as efficient database loading works best in batches. Next there is a delay in waiting for the query to be run as it is run periodically. Finally there is a delay caused by interference between the load process that is writing data and the query process that is trying to read the same data.

Given the problem of using a database, the next response to building a CEP system is to write a custom program in Java or C to do the job. This can be coded to meet the response time and data rate requirements, however it is inflexible. Any change to the requirements or data streams requires recoding and testing which take time and money. Coral8 and other vendors in the CEP space provide a system like a database that is programmable in a high level SQL-like language and that can process event streams at a rate similar to the hand coded system.

In a conventional database system, the data is at rest in the database and the queries act on the data. In a CEP system, the queries are static and the event data streams past the queries. When an event triggers a query, the query typically generate new event data. This structure allows event data processing to be parallelized by having several event processors that run different queries in parallel on the same data stream. Processing can be pipelined by having the output streams of one event processor feed into the inputs of another event processor.

It is important to understand that the purpose of a CEP system is not to store data. While events can linger, they eventually pass out of the system and are gone. A database complements a CEP system. For example, Coral8 can read data from database systems and even caches the data for improved efficiency. Also, output streams from Coral8 can, and usually are, fed into database systems.

If you want to try out CEP, visit the Coral8 web-site. There you can download documentation and a trial version of the software.

No comments: