Using Drools Fusion for complex event processing in hybris
Introduction
In this post, I would like to introduce the first results of integrating Drools Fusion with hybris. With this approach you can build comprehensive, cost-effective and scalable e-commerce solutions for tasks such as real-time personalization and customer segmentation, monitoring and lean data processing. Take personalization, for instance. The hybris personalization module analyzes customer behavior on the same servers that serve the website, and it does so during the web request, so the analysis and the page rendering run sequentially. It is no secret that the personalization capabilities in hybris always slow the system down, by a factor of up to 1.5 to 2. The typical remedy is to add more CPU and memory, but this has the same limited effect as optimizing the code and the database. Most often the personalization features end up not being used at all. I set out to design a highly scalable and cost-efficient solution, and eventually I built a proof of concept (PoC).
Solution
The solution explained in this article lets you build customer-behavior-driven e-commerce websites without any significant impact on performance. Its key components are the following:
- Drools. The core of the system is the Drools rule engine. Rules are pieces of knowledge often expressed as “When some conditions occur, then do some tasks.” Unlike the Drools engine built into hybris, this system is stateful.
- Drools Fusion. The system deals with streams of events instead of static data. This means that you can use conditions such as “if there are more than 10 events of type X from customer Y during the last 5 minutes, then change the state”. The system is based on Drools Fusion, a well-known event processing engine.
- Drools server. The Drools execution server lets you interact with Drools through a REST interface over HTTP.
- Drools Workbench. Drools Workbench is a web application that provides a generic web UI for authoring and managing rules. It is integrated with the Drools server and supports Drools Fusion.
- Completely asynchronous data exchange and processing. The engine works in parallel with the website. For the example above, if the personalization engine needs more time to make a decision, the decision is delivered back later, with the next web request.
- AJAX requests are used to deliver some decision-dependent content back to the customer without waiting for the next customer request.
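The asynchronous exchange described in the last two items can be illustrated with a minimal sketch. It is written in plain Python rather than against the hybris or Drools APIs, and every name in it (`events`, `decisions`, `handle_web_request`, `engine_worker`) is hypothetical. The web tier only enqueues an event and returns the decision that is already known; the engine consumes events on its own thread, so a decision that takes longer simply shows up on a later request.

```python
import queue
import threading

# Hypothetical names throughout: this is not hybris or Drools code.
events = queue.Queue()            # events published by the web tier
decisions = {}                    # session_id -> latest decision from the engine
decisions_lock = threading.Lock()


def handle_web_request(session_id, page_category):
    """Web tier: return the decision known so far, then publish the event."""
    with decisions_lock:
        known = decisions.get(session_id, "unknown")
    events.put((session_id, page_category))  # enqueue only, never wait for the engine
    return known


def engine_worker():
    """Engine side: consume events independently of the web requests."""
    while True:
        item = events.get()
        if item is None:                     # sentinel used to stop the worker
            events.task_done()
            break
        session_id, page_category = item
        # Placeholder decision; a real engine would evaluate its rules here.
        with decisions_lock:
            decisions[session_id] = "saw:" + page_category
        events.task_done()


worker = threading.Thread(target=engine_worker, daemon=True)
worker.start()
```

In this sketch the first request for a session always gets the default answer, and the engine's decision becomes visible only to a later request (or to an AJAX call issued after the engine has caught up).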
Consider two example rules:
- When the customer visits more than 5 webpages marked as “Photo” during the last 30 seconds, this customer is a photographer.
- When the customer visits more than 5 webpages marked as “Non-Photo” during the last 30 seconds, this customer is a non-photographer.
The photographer rule checks three conditions:
- If a transaction event is found in working memory, the value of its “sessionId” attribute is saved in the $sessionId variable. If no such event is found, processing stops.
- The next check looks for a SessionState object. There are many of them, but this rule needs only a specific one: the one whose state is not equal to “photographer” and whose sessionId equals the one saved above ($sessionId).
- The third condition is a bit more complicated. It counts all the (past) events that have pageCategory="photo" and sessionId=$sessionId within a 30-second time frame. The condition is true when the total count is more than 5.
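The three conditions can be sketched outside Drools to make the sliding-window logic concrete. The following is a plain Python illustration, not the rule as written in DRL and not a Drools API. The class names, the function `on_event`, the initial state `"unknown"`, and the exact handling of the window boundary are all assumptions made for the example. It also assumes that events for a session arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass, field

WINDOW_SECONDS = 30   # time frame from the rule description
THRESHOLD = 5         # the rule fires when the count is more than 5


@dataclass
class PageEvent:
    session_id: str
    page_category: str
    timestamp: float  # seconds; assumed non-decreasing per session


@dataclass
class SessionState:
    session_id: str
    state: str = "unknown"  # assumed initial state, not taken from the article
    photo_times: deque = field(default_factory=deque)  # timestamps of recent "photo" events


def on_event(event, states):
    """Insert one event and evaluate the photographer rule for its session."""
    # Condition 1: an event is present; remember its session id.
    session_id = event.session_id
    state = states.setdefault(session_id, SessionState(session_id))

    if event.page_category == "photo":
        state.photo_times.append(event.timestamp)
    # Drop "photo" events that are more than WINDOW_SECONDS older than this event.
    while state.photo_times and state.photo_times[0] < event.timestamp - WINDOW_SECONDS:
        state.photo_times.popleft()

    # Condition 2: only sessions not yet marked as "photographer" are considered.
    if state.state == "photographer":
        return state.state
    # Condition 3: more than THRESHOLD "photo" events inside the window.
    if len(state.photo_times) > THRESHOLD:
        state.state = "photographer"
    return state.state
```

With this sketch, six “photo” page views within a few seconds switch the session to `"photographer"`, while the same kind of page views spread ten seconds apart never exceed the threshold because older events fall out of the window.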
Performance
CEP engines are designed to process large volumes of events at extremely high speed. Throughput requirements are often well over 100,000 events per second, while latency requirements can be as low as one millisecond or less. Thanks to its asynchronous nature, the system is very scalable. The hybris part will likely not require scaling at all, because this solution does not add any “heavy” components inside hybris.
© Rauf Aliev, October 2016