Using Drools Fusion for complex event processing in hybris
In this post, I would like to introduce the first results of integrating Drools Fusion with hybris. Based on this approach, you can create comprehensive, cost-competitive and scalable e-commerce solutions for tasks such as real-time personalization and customer segmentation, monitoring, and lean data processing.
Let’s take “Personalization” for instance. The hybris personalization module performs customer behavior analysis on the same servers and during the web request.
The behavior analysis and the page rendering are sequential steps within the same request. It is no secret that the personalization capabilities in hybris always slow down the system, by up to 1.5-2 times. The typical solution is to add more CPU and memory resources, but this has the same limited effect as optimizing the code and the database. Most often the personalization features are not used at all.
I set out to create a prototype of a very scalable and cost-efficient solution, and eventually I created a PoC.
The solution explained in this article allows you to create customer-behavior-driven e-commerce websites without any significant impact on performance. The key components of the solution are the following:
- Drools. The core of the system is the Drools rule engine. Rules are pieces of knowledge often expressed as “When some conditions occur, then do some tasks.” Unlike the Drools engine built into hybris, this system is stateful.
- Drools Fusion. The system deals with streams of events instead of static data. It means that you can use such conditions as “if there are more than 10 events of type X from customer Y during the last 5 minutes, then change the state”. The system is based on Drools Fusion, a well-known event processing engine.
- Drools server. The Drools execution server lets you interact with Drools through a REST interface over HTTP.
- Drools Workbench. Drools Workbench is a web application that provides a generic web UI for authoring and managing rules. It is integrated with the Drools server and supports Drools Fusion.
There are also key concepts that make the solution unique:
- Completely asynchronous data exchange and processing. It means that the engine works in parallel to the web site. For the example above, if the personalization engine needs more time to make a decision, the decision will be delivered back later, with the next web request.
- AJAX requests are used to deliver some decision-dependent content back to the customer without waiting for the next customer request.
In the video below you will see a simple demonstration of the concept. There are two sample rules in the example:
- when the customer visits more than 5 webpages marked as “Photo” during the last 30 seconds, then this customer is a photographer.
- when the customer visits more than 5 webpages marked as “Non-Photo” during the last 30 seconds, then this customer is a non-photographer.
In this simple example, a page is marked as “Photo” if the page URL contains the keyword “camera”. Otherwise the page is marked as “Non-photo”.
Also, to keep things simple, when I say “customer” I mean “customer session”.
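The page-marking rule described above can be sketched in a few lines of Java. This is only an illustration, not the code from the PoC: the class name PageCategorizer and the method categorize are hypothetical, and matching the keyword case-insensitively and treating a missing URL as “Non-photo” are my assumptions.

```java
import java.util.Locale;

// Hypothetical helper: marks a page as "Photo" or "Non-photo" by its URL.
final class PageCategorizer {
    static final String PHOTO = "Photo";
    static final String NON_PHOTO = "Non-photo";

    private PageCategorizer() {
    }

    // Assumption: the keyword match ignores case, and a missing URL
    // is treated as a non-photo page.
    static String categorize(String url) {
        if (url == null) {
            return NON_PHOTO;
        }
        return url.toLowerCase(Locale.ROOT).contains("camera") ? PHOTO : NON_PHOTO;
    }
}
```

The storefront would call something like PageCategorizer.categorize(requestUrl) once per page request and attach the result to the event.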
Step 1. The hybris storefront creates an event. When a customer requests a page, the hybris storefront sends an event, “TransactionEvent”. In our example, the event structure is super simple: a session id and a pageCategory (Photo/Non-photo). It is a very fast operation, because hybris doesn’t care about message delivery. It simply throws the message into the queue.
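To make the “throw it into the queue and forget” idea more concrete, here is a minimal sketch. It assumes an in-memory java.util.concurrent.BlockingQueue as a stand-in for whatever message queue sits between hybris and Drools Fusion; the class FireAndForgetSketch and its publish method are hypothetical names, not hybris APIs.

```java
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of the storefront side: publish and return immediately.
final class FireAndForgetSketch {

    // The event carries only what the article describes: session id and page category.
    static final class TransactionEvent {
        final String sessionId;
        final String pageCategory;

        TransactionEvent(String sessionId, String pageCategory) {
            this.sessionId = sessionId;
            this.pageCategory = pageCategory;
        }
    }

    private final BlockingQueue<TransactionEvent> queue;

    FireAndForgetSketch(BlockingQueue<TransactionEvent> queue) {
        this.queue = queue;
    }

    // offer() never blocks: if the queue is full the event is dropped
    // and false is returned, so the web request is never delayed.
    boolean publish(String sessionId, String pageCategory) {
        return queue.offer(new TransactionEvent(sessionId, pageCategory));
    }
}
```

Dropping an event when the queue is full is a design choice of this sketch: losing one page view costs less than slowing down the page.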
Step 2. The message is inserted into the Drools Fusion working memory. It happens milliseconds after the message is thrown by the hybris storefront. Message processing is a bit slower, but it is still a pretty fast operation, because only one Drools Fusion command is performed, “insert an event”. The event is pulled from the queue and created in the memory. At this point the rule starts working. It is important that no process waits for the system to come back with any processing results. The message is simply moved from the queue to Drools Fusion.
In the example, the event is configured to live for the next 30 seconds. After that, the event is purged automatically.
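The following sketch imitates this step in plain Java, without Drools, just to show the mechanics: messages are drained from the queue, stored with a timestamp, and purged once they are 30 seconds old. In Drools Fusion the expiration is handled by the engine itself. The class ExpiringWorkingMemorySketch, the String[] message format and the exact boundary (an event expires when it is exactly 30 seconds old) are assumptions made for the illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Queue;

// Hypothetical stand-in for the working memory: events live for 30 seconds.
final class ExpiringWorkingMemorySketch {
    static final long TTL_MILLIS = 30_000L;

    static final class StoredEvent {
        final String sessionId;
        final String pageCategory;
        final long insertedAtMillis;

        StoredEvent(String sessionId, String pageCategory, long insertedAtMillis) {
            this.sessionId = sessionId;
            this.pageCategory = pageCategory;
            this.insertedAtMillis = insertedAtMillis;
        }
    }

    // Oldest event first; assumes nowMillis never goes backwards between calls.
    private final Deque<StoredEvent> events = new ArrayDeque<>();

    // Pulls every pending message {sessionId, pageCategory} from the queue
    // and stores it as an event. Returns the number of inserted events.
    int drain(Queue<String[]> inbound, long nowMillis) {
        int inserted = 0;
        String[] message;
        while ((message = inbound.poll()) != null) {
            events.addLast(new StoredEvent(message[0], message[1], nowMillis));
            inserted++;
        }
        return inserted;
    }

    // Removes events that have reached the 30-second lifetime.
    int purgeExpired(long nowMillis) {
        int purged = 0;
        while (!events.isEmpty()
                && nowMillis - events.peekFirst().insertedAtMillis >= TTL_MILLIS) {
            events.removeFirst();
            purged++;
        }
        return purged;
    }

    int size() {
        return events.size();
    }
}
```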
Step 2.1. hybris can provide Drools Fusion with some data used in the rules, such as customer profile information or session information. There was no need for it in my PoC, but this part is simple and very similar to what is explained in “Step 2”, with one exception: hybris pushes facts rather than events. Presumably, it can use the same interfaces and approach as for events.
Step 3. Drools Fusion processes the events/facts stored in the working memory. There is a rule that triggers when the customer has visited more than 5 photo-related pages within the last 30 seconds.
A Drools rule has two parts, LHS (left hand side) and RHS (right hand side). The LHS is the “when” part, a set of conditions. There are three conditions in the rule:
- If a transaction event is found in the working memory, save the “sessionId” attribute of the event in the $sessionId variable. Not found? Stop processing.
- The next check is for a SessionState object. There are many of them, but this rule needs only a specific one: the one whose state is not equal to “photographer” and whose sessionId is equal to the one bound above ($sessionId).
- The third condition is a bit more complicated. It counts all the (past) events that have pageCategory=”photo” and sessionId=$sessionId within a 30-second time frame. This condition is true when the total count is more than 5.
As a result, the session state is changed. There are different ways to implement the “then” part of the rule. I set a new state on the session object that was created in the working memory (there is another rule for that). The object in the working memory is updated.
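To show how the three conditions and the “then” part fit together, here is the same logic written as plain Java rather than as a Drools rule. It is a sketch for illustration only: the real rule is evaluated by the engine, not by loops. The names PhotographerRuleSketch, Event, SessionState and fire are hypothetical, and comparing the category case-insensitively and treating the window boundary as inclusive are assumptions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical plain-Java imitation of the "photographer" rule.
final class PhotographerRuleSketch {
    static final long WINDOW_MILLIS = 30_000L;
    static final int THRESHOLD = 5;

    static final class Event {
        final String sessionId;
        final String pageCategory;
        final long timestampMillis;

        Event(String sessionId, String pageCategory, long timestampMillis) {
            this.sessionId = sessionId;
            this.pageCategory = pageCategory;
            this.timestampMillis = timestampMillis;
        }
    }

    static final class SessionState {
        final String sessionId;
        String state;

        SessionState(String sessionId, String state) {
            this.sessionId = sessionId;
            this.state = state;
        }
    }

    // Evaluates the rule once; returns how many sessions changed state.
    static int fire(List<Event> events, Map<String, SessionState> states, long nowMillis) {
        int fired = 0;
        Set<String> evaluated = new HashSet<>();
        for (Event trigger : events) {
            // Condition 1: an event exists; bind its session id.
            String sessionId = trigger.sessionId;
            if (!evaluated.add(sessionId)) {
                continue; // this session was already evaluated
            }
            // Condition 2: a SessionState for this session that is not "photographer".
            SessionState session = states.get(sessionId);
            if (session == null || "photographer".equals(session.state)) {
                continue;
            }
            // Condition 3: count photo events of this session in the last 30 seconds.
            int count = 0;
            for (Event e : events) {
                long age = nowMillis - e.timestampMillis;
                if (e.sessionId.equals(sessionId)
                        && "photo".equalsIgnoreCase(e.pageCategory)
                        && age >= 0 && age <= WINDOW_MILLIS) {
                    count++;
                }
            }
            // "Then" part: update the session object.
            if (count > THRESHOLD) {
                session.state = "photographer";
                fired++;
            }
        }
        return fired;
    }
}
```

Note that the second condition also keeps the rule from firing again for a session that is already marked as a photographer.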
It is important that Drools Fusion works in STREAM mode in this example. Events are time-ordered, and old (expired) events are removed automatically.
Step 4. hybris requests the status of the session and updates its internal state. (Drools Fusion is able to push this information back to hybris instead of waiting for the hybris data request, but in my example I used the simplest approach.)
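A minimal sketch of this pull approach is shown below. It assumes that the remote call to the Drools server is hidden behind a java.util.function.Function, so the sketch does not depend on any particular REST client. The class SessionStatePollerSketch, its methods and the “unknown” default state are hypothetical. The point is that the web request only reads a local copy of the state and never waits for the rule engine.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of the hybris side: a local copy of the session states.
final class SessionStatePollerSketch {
    static final String DEFAULT_STATE = "unknown";

    private final Map<String, String> localStates = new ConcurrentHashMap<>();

    // Stand-in for the remote lookup (for example a REST call);
    // returns null when the engine has no decision for the session yet.
    private final Function<String, String> remoteLookup;

    SessionStatePollerSketch(Function<String, String> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    // Called while serving a web request: reads the local copy only.
    String currentState(String sessionId) {
        return localStates.getOrDefault(sessionId, DEFAULT_STATE);
    }

    // Called outside the request path: asks the engine and stores the answer.
    void refresh(String sessionId) {
        String state = remoteLookup.apply(sessionId);
        if (state != null) {
            localStates.put(sessionId, state);
        }
    }
}
```

If the decision is not ready when the page is rendered, currentState simply returns the previous value, and the new state is picked up by a later request.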
CEP engines are designed to process a large volume of events at extremely high speeds. Throughput requirements are often well over 100,000 events per second, while processing latency demands can be as low as one millisecond, or less.
Due to its asynchronous nature, the system is very scalable. The hybris part is unlikely to require scaling at all, because this solution doesn’t have any “heavy” components inside hybris.
© Rauf Aliev, October 2016