Page fragment caching for hybris

Category: Other. Author: Rauf ALIEV. 24 July 2016.

Introduction

Caching is essential for a high-performance, scalable web application. Several types of caching are already available in hybris projects. Some are included in the platform, such as the entity cache and the query cache. Some sit on the database side, such as the database query cache. SOLR also plays the role of a caching server for products. However, almost every project needs additional components to make hybris more resilient to high traffic or to meet page load time requirements.

Sometimes it is impractical to cache an entire page because portions of the page need to change on each request. In those cases, you can cache just a portion of the page. Default hybris doesn’t have any capabilities like this. There is a package from hybris Professional Services for page and page fragment caching based on Varnish. Since the package is poorly documented and licensed separately, I decided to create my own PoC to estimate the effort needed to build a similar extension. I believe that my solution has some advantages in terms of features and flexibility, namely:

My solution has tools for cache invalidation at a coarse-grained or fine-grained level. For example, if objects change, the caches where these objects are mentioned or used must be invalidated and recreated. With an external reverse proxy that you cannot manage, the only option is to wait until the cache TTL expires. With my solution, you have tools to manage the cache contents and change them easily.
My solution is not limited to CMS objects. You can cache any page fragments, including parts of components or parts of the page controller templates.
With my solution, cache fragments may depend on each other or on external entities such as customer id, post data or session parameters.

Solution

Syntax: custom tags

I used JSP custom tags to mark the areas for caching.

In order to use custom tags, you need to create a custom tag library. In my PoC, I created cachetags.tld and put it into resources/WEB-INF:

<?xml version="1.0" encoding="UTF-8"?>

<taglib xmlns="http://java.sun.com/xml/ns/javaee"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-jsptaglibrary_2_1.xsd"
 version="2.1">
 <tlib-version>1.1</tlib-version>
 <short-name>cachetags</short-name>
 <tag>
    <name>cached</name>
    <tag-class>org.training.storefront.cache.CacheTags</tag-class>
    <body-content>JSP</body-content>
    <dynamic-attributes>true</dynamic-attributes>
 </tag>
</taglib>

You can specify any number of attributes in the tag, with any names and values. Each attribute name and value is stored as a field of the cache record, and the attribute values, joined with underscores, form the key of the cached fragment. For example:

<%@ taglib prefix="cache" uri="/WEB-INF/cachetags.tld" %>
<cache:cached attr1="value1" attr2="value2">
  something
</cache:cached>

will create the following record in the cache:

{
  "value1_value2" : "something",
  "attr1" : "value1",
  "attr2" : "value2",
  "ctime" : 146863804587
}

“ctime” is the creation time in milliseconds. This value can be used to invalidate old records.
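A minimal sketch of how such a record could be assembled from the tag attributes. It assumes the key is simply the attribute values joined with an underscore, in the order they appear on the tag, as the example above suggests. The class and method names (`CacheRecords`, `build`) are hypothetical and are not taken from the PoC.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original PoC.
final class CacheRecords {

    // Builds one cache record: "<value1>_<value2>..." -> rendered body,
    // plus one field per tag attribute and the creation time in ms.
    static Map<String, Object> build(Map<String, String> attributes, String body, long nowMillis) {
        // Assumption: the key is the attribute values joined with "_",
        // in the order the attributes were declared on the tag.
        String key = String.join("_", attributes.values());

        Map<String, Object> record = new LinkedHashMap<>();
        record.put(key, body);          // cached fragment under its key
        record.putAll(attributes);      // searchable fields used for purging
        record.put("ctime", nowMillis); // creation time in milliseconds
        return record;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("attr1", "value1");
        attributes.put("attr2", "value2");
        System.out.println(build(attributes, "something", System.currentTimeMillis()));
    }
}
```

A `LinkedHashMap` is used so that the attribute order, and therefore the key, stays stable between requests.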

Cache storage

You can use any NoSQL database or in-memory cache library to store these JSON records. In my PoC I used MongoDB, mainly because it works well on Windows.
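Because the storage is interchangeable, it can help to hide it behind a small interface. The sketch below is an assumption about how that could look, not the PoC's actual code: `FragmentStore` and `InMemoryFragmentStore` are hypothetical names, and a MongoDB-backed class would implement the same two methods.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical storage contract; the PoC's real interface is not shown in the article.
interface FragmentStore {
    Optional<String> find(String key);   // empty when the fragment is not cached
    void save(String key, String html);  // stores or overwrites the fragment
}

// In-memory implementation, useful for tests or a single-node setup.
final class InMemoryFragmentStore implements FragmentStore {
    private final Map<String, String> fragments = new ConcurrentHashMap<>();

    @Override
    public Optional<String> find(String key) {
        return Optional.ofNullable(fragments.get(key));
    }

    @Override
    public void save(String key, String html) {
        fragments.put(key, html);
    }

    public static void main(String[] args) {
        FragmentStore store = new InMemoryFragmentStore();
        store.save("value1_value2", "something");
        System.out.println(store.find("value1_value2").orElse("(miss)"));
    }
}
```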

Invalidation

Some server-side logic may make a cached fragment invalid. In my PoC, I purge the records whose attributes reference the affected objects. For example:

Cache contents and the JSP tags that produce them:

| key | obj | ProductCode | CustomerId | CategoryId | JSP tag |
|---|---|---|---|---|---|
| ProductDetails_1234_14 | ProductDetails | 1234 | 14 | | <cache:cached obj="ProductDetails" ProductCode="${product.code}" CustomerId="${customer.id}"> … </cache:cached> |
| ProductDetails_4321_14 | ProductDetails | 1234 | 14 | | <cache:cached obj="ProductDetails" ProductCode="${product.code}" CustomerId="${customer.id}"> … </cache:cached> |
| CategoryPage_/c/123 | CategoryPage | | | /c/123 | <cache:cached obj="CategoryPage" CategoryId="${category.id}"> … </cache:cached> |
| CategoryPage_/c/123 | CategoryPage | | | /c/123 | <cache:cached obj="CategoryPage" CategoryId="${category.id}"> … </cache:cached> |

If Product #1234 is changed, you need to send two requests to the cache engine:

cache.purge("ProductCode", "1234") or cache.purge("obj", "ProductDetails", "ProductCode", "1234");
cache.purge("CategoryId", getProductByCode("1234").getCategories());

If product 1234 is in category 123, all the records will be removed. If not, only the first two rows will be purged. The next time a customer requests the product page, the cache will be repopulated. However, if this product is also shown in a product carousel component, and that component is cached, it is much harder to work out which carousel caches should be invalidated. A simple solution to this problem is to associate every cache entry with a time-to-live (TTL) value. An entry then expires after some time, and hits on expired entries are treated as misses. The trade-off is between content freshness and scalability.
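The purge-by-attribute and TTL behaviour described above can be sketched with a small in-memory stand-in for the cache engine. Everything here is illustrative: `PurgeableCache`, its method signatures and the varargs form of `purge` are assumptions rather than the PoC's API, and a real implementation would run the equivalent query against MongoDB or another store.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for the cache engine; names are illustrative.
final class PurgeableCache {

    // One cached fragment together with the attributes it was stored under.
    private static final class Entry {
        final Map<String, String> attributes;
        final String html;
        final long ctime;

        Entry(Map<String, String> attributes, String html, long ctime) {
            this.attributes = attributes;
            this.html = html;
            this.ctime = ctime;
        }
    }

    private final Map<String, Entry> entries = new LinkedHashMap<>();
    private final long ttlMillis;

    PurgeableCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    void put(String key, Map<String, String> attributes, String html, long nowMillis) {
        entries.put(key, new Entry(attributes, html, nowMillis));
    }

    // Removes every entry whose attributes match ALL given name/value pairs,
    // e.g. purge("ProductCode", "1234") or
    //      purge("obj", "ProductDetails", "ProductCode", "1234").
    // Returns the number of removed entries.
    int purge(String... nameValuePairs) {
        if (nameValuePairs.length == 0 || nameValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("expected one or more name/value pairs");
        }
        int before = entries.size();
        entries.values().removeIf(entry -> matches(entry, nameValuePairs));
        return before - entries.size();
    }

    private static boolean matches(Entry entry, String[] pairs) {
        for (int i = 0; i < pairs.length; i += 2) {
            if (!pairs[i + 1].equals(entry.attributes.get(pairs[i]))) {
                return false;
            }
        }
        return true;
    }

    // A hit on an expired entry is treated as a miss.
    Optional<String> get(String key, long nowMillis) {
        Entry entry = entries.get(key);
        if (entry == null || nowMillis - entry.ctime >= ttlMillis) {
            return Optional.empty();
        }
        return Optional.of(entry.html);
    }

    public static void main(String[] args) {
        PurgeableCache cache = new PurgeableCache(60_000);
        long now = System.currentTimeMillis();
        cache.put("ProductDetails_1234_14",
                Map.of("obj", "ProductDetails", "ProductCode", "1234", "CustomerId", "14"),
                "<div>product</div>", now);
        cache.put("CategoryPage_/c/123",
                Map.of("obj", "CategoryPage", "CategoryId", "/c/123"),
                "<div>category</div>", now);
        System.out.println("purged: " + cache.purge("ProductCode", "1234"));
    }
}
```

To purge by category, the caller would loop over the product's categories and call `purge("CategoryId", ...)` once per category.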

Examples of cached fragments definitions

Product details. In the example below, the cached fragment depends on the product code and the session. This means hybris will use a different cached fragment for each pair of product and user session. If product prices are not customer-dependent, you can use only the productCode attribute, and all customers will share the same cached fragment. [Image: image2016-7-16 11-16-27]

Category page. In the next example, the fragment depends on the URL and the query string. Facets change the URL parameters, and the category id is part of the URL. Therefore, different category pages have different cached fragments. [Image: image2016-7-16 11-18-48.png]

Product carousel component. As you can see from the code, the cached fragment depends on the URL and ${title}. The same product carousel component can be placed on a page several times with different configurations. In our case, the configurations have different titles, so each instance is cached separately. If you remove the second attribute, both instances will render identical fragments, because the second instance reuses the fragment cached for the first. With ${title} as a second key, hybris uses a separate cache entry for each instance. [Image: image2016-7-16 11-22-8]
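The effect of the key attributes on the carousel can be illustrated with a small stand-alone sketch. It is not the tag implementation: `CarouselKeyDemo`, `getOrRender`, the page URL and the carousel titles are all made up for the illustration, and the cache is a plain in-memory map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration of how the key attributes decide fragment sharing.
final class CarouselKeyDemo {
    private final Map<String, String> cache = new HashMap<>();

    // Returns the cached fragment for the key; renders and stores it on a miss.
    String getOrRender(String key, Supplier<String> body) {
        return cache.computeIfAbsent(key, k -> body.get());
    }

    public static void main(String[] args) {
        String url = "/electronics/en/"; // assumed page URL

        // Keyed by URL only: both carousel instances share one key,
        // so the second instance gets the first instance's fragment.
        CarouselKeyDemo byUrl = new CarouselKeyDemo();
        System.out.println(byUrl.getOrRender(url, () -> "<div>Best sellers</div>"));
        System.out.println(byUrl.getOrRender(url, () -> "<div>New arrivals</div>"));

        // Keyed by URL and title: each instance has its own cache entry.
        CarouselKeyDemo byUrlAndTitle = new CarouselKeyDemo();
        System.out.println(byUrlAndTitle.getOrRender(url + "_Best sellers", () -> "<div>Best sellers</div>"));
        System.out.println(byUrlAndTitle.getOrRender(url + "_New arrivals", () -> "<div>New arrivals</div>"));
    }
}
```

With the URL-only key, the first two lines printed are both the "Best sellers" markup. With the title added to the key, each instance prints its own markup.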

Quick and dirty performance testing

https://electronics.local:9002/trainingstorefront/electronics/en/Open-Catalogue/Cameras/c/571
Number of threads (users): 80

Tags: caching, cms

7 Responses

Prashanth Reddy

29 July 2016 at 16:37

Hi Rauf,
Can you please provide some info on page fragment caching using MongoDB? Is there any source code that you can provide? I am really interested in that.
1. Rauf Aliev
  
  29 July 2016 at 19:59
  
  Contact me on Skype please. I will share something
  1. Prashanth Reddy Koppula
    
    29 September 2016 at 13:24
    
    Hi Rauf,
    Can you please share your skypeId.
    1. Rauf Aliev
      
      30 September 2016 at 14:12
      
      rauf_aliev
      Welcome 🙂
Prashanth Reddy Koppula

29 September 2016 at 13:22

Hi Rauf,
Can you please let me know your skypeId
Gyanendra

17 January 2017 at 07:53

I am also working on a PoC. I think I am missing ‘org.training.storefront.cache.CacheTags’ here. Where would I find this tag class?
1. Rauf Aliev
  
  17 January 2017 at 10:55
  
  CacheTags.java is my custom tag implementation. I can’t find it just now to show the internals, because it is on my home laptop. It pulls data from NoSQL or memory by the key. If the key is not found in the database or memory, it stores the HTML between the tags under that key. The library has two methods, startTag and endTag (or something), so I am able to check data against the cache in startTag and put data in the cache in endTag.