Consolidating content and product SOLR search

Situation

Hybris search is designed primarily to deal with products. There are no content page search capabilities in hybris.

Hybris uses Apache SOLR for search. Using SOLR allows hybris to introduce such features as facet search, fuzzy search, and search-based category pages.

Solr’s basic unit of information is a document, which is a set of data that describes something. These documents are composed of fields. A product document would contain the product attributes, categories of the product, keywords, and so on. Shoe size could be a field.

The structure of the SOLR document is defined by the SOLR schema. Some fields can be defined as dynamic, which allows hybris to store dynamic sets of product attributes in the SOLR index.
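To make the "document with dynamic fields" idea concrete, here is a minimal sketch in plain Java. It does not use the hybris or SolrJ APIs: the class name, the sample values and the "<attribute>_<type>" naming of dynamic fields are my assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a SOLR document: a flat map of field name -> value.
final class ProductDocumentSketch {

    // Builds a dynamic field name such as "shoeSize_string".
    // The "<attribute>_<type>" suffix convention is an assumption of this sketch.
    static String dynamicField(String attribute, String type) {
        return attribute + "_" + type;
    }

    // A product document: attributes, categories and keywords are all just fields.
    static Map<String, Object> shoeDocument() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "shoe-001");
        doc.put(dynamicField("name", "text"), "Running shoe");
        doc.put(dynamicField("shoeSize", "string"), "42");
        // A multi-valued field holds a list of values.
        doc.put(dynamicField("category", "string_mv"), Arrays.asList("shoes", "running"));
        return doc;
    }
}
```

Because the field name carries the type suffix, a new product attribute only needs a new field name, not a schema change.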

The hybris SOLR indexer fetches the information from the database, converts it into the SOLR document format, and loads these documents into SOLR.

To fetch this data back from SOLR, hybris uses the Lucene query syntax and the indexes created by the SOLR indexer.
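As an illustration of what such a query looks like, here is a small sketch that builds a query string in Lucene syntax. The helper class and the field names ("keywords_text", "topic_string") are hypothetical. Hybris assembles its queries internally, so this only shows the shape of the result.

```java
// Hypothetical helper that builds a query string in Lucene syntax.
final class LuceneQuerySketch {

    // Produces a clause like field:"value", escaping backslashes and double quotes.
    static String term(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    // Joins clauses so that all of them must match.
    static String and(String... clauses) {
        return String.join(" AND ", clauses);
    }
}
```

For example, combining term("keywords_text", "running shoes") and term("topic_string", "trail") with and(...) gives a query that matches only documents containing the phrase and tagged with the topic.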

For full-text and faceted queries, SOLR is typically much faster than a traditional database, which makes it one of the best choices for e-commerce solutions. However, there are some important limitations.

Complexity

If your website has both plain static content and product pages, you may want to do a keyword search over both of them. For example, if your content pages are articles (reviews or news), they might be tagged in the same way as the product pages. These tags could be used as facets to filter pages (both product pages and content pages) of the same topic.

It is easy to configure hybris to use two indexes, one per page type. The results will then also be grouped by page type.

[Screenshot: search_1]

However, this approach doesn’t allow customers to filter the combined results (both products and content) by topic, for example. The idea of today’s experiment is to get consolidated results. For the example mentioned above, it should look like this:

[Screenshot: search_2]
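To show what "consolidated" means in practice, here is a minimal sketch of a single result list that mixes products and content pages and is faceted by topic. It is plain Java with made-up types (Hit, the pageType and topic values) and is not the hybris search API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of one result list that mixes products and content pages.
final class ConsolidatedFacetSketch {

    static final class Hit {
        final String id;
        final String pageType; // "product" or "content" in this sketch
        final String topic;    // the shared tag used as a facet

        Hit(String id, String pageType, String topic) {
            this.id = id;
            this.pageType = pageType;
            this.topic = topic;
        }
    }

    // Counts hits per topic across both page types (the facet values).
    static Map<String, Long> topicFacet(List<Hit> hits) {
        Map<String, Long> counts = new TreeMap<>();
        for (Hit hit : hits) {
            counts.merge(hit.topic, 1L, Long::sum);
        }
        return counts;
    }

    // Applies the facet filter: keeps products and content pages with the given topic.
    static List<Hit> filterByTopic(List<Hit> hits, String topic) {
        List<Hit> filtered = new ArrayList<>();
        for (Hit hit : hits) {
            if (hit.topic.equals(topic)) {
                filtered.add(hit);
            }
        }
        return filtered;
    }
}
```

With two separate indexes, the facet counts and the filter would have to be computed per page type. With one consolidated list, a single topic facet covers both.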

The first guess is to add a new type to the list of item types (indexed types).

[Screenshot: search5]

However, this will not work in hybris. Out of the box (OOB), hybris comes with only one indexed type, so its code has one and expects only one.

The hybris SOLR indexer creates a SOLR core per type, and hybris SOLR search is not able to mix items from different SOLR cores. Moreover, hybris SOLR search can’t work with a collection of item types. All the classes of hybris SOLR search work with only ONE item type instance, even when you have more. If you have two types, hybris SOLR search will use only the first one. In addition to that, hybris SOLR search is designed to work with product catalogs only.
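The difference between "one core per type" and "one shared core" can be sketched as two naming strategies. This is a simplified model: the interface below is hypothetical and is not the real hybris SolrCoreNameResolver, and the naming patterns are assumptions.

```java
// Hypothetical model of core naming; not the real hybris SolrCoreNameResolver interface.
interface CoreNameResolverSketch {
    String resolve(String indexName, String indexedTypeName);
}

final class CoreNaming {

    // Models the behaviour described above: every indexed type gets its own core.
    static final CoreNameResolverSketch PER_TYPE =
            (indexName, typeName) -> indexName + "_" + typeName;

    // Models the consolidated setup: all types of one index land in the same core.
    static final CoreNameResolverSketch SHARED =
            (indexName, typeName) -> indexName + "_shared";
}
```

With PER_TYPE, Product and ContentPage documents end up in different cores and can't be queried together. With SHARED, both resolve to the same core name.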

To overcome these limitations, you need to customize both the indexer and the search module. The technical details are under the video. How deeply they need to be customized depends on the specific requirements.

Solution

Technical details

To get the consolidation working, you need to:

  1. Add a new SolrIndexedType item for the ContentPage type.
  2. Add a new full/update query. Let’s take the simplest one: “SELECT {PK} FROM {ContentPage}”.
  3. Add Solr indexed properties. They should be compatible with the commerce Solr properties, because they will share the same SOLR core.
  4. Create a new populator that extends SearchSolrQueryPopulator. You need it to overcome the issue with the hybris search module and content catalogs.
    The original hybris populator works with product catalogs only:

        final Collection<CatalogVersionModel> catalogVersions = getSessionProductCatalogVersions();
        target.setCatalogVersions(catalogVersions);

    In my PoC I got rid of it:

        target.setCatalogVersions(new ArrayList<CatalogVersionModel>());
  5. Create your own SolrCoreNameResolver to make hybris use one core for different types.
  6. Provide your own ConfigurationExporterListener.beforeIndex and FullDirectIndexOperationStrategy.beforeIndex, because the default ones re-create the SOLR core every time the indexer moves on to the next type in the indexedTypes list.
  7. Add your own keyword providers for content pages. For example, they can pull out all the indexable content of all the page’s components and place it into a single text Solr field (“keywords”, for example).
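Steps 6 and 7 above can be sketched in plain Java as follows. This is a simplified model under my own assumptions, not the hybris classes. The class, the log messages and the method names are hypothetical, and a real implementation would hook into the hybris indexer listeners and value providers instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the indexing flow; it does not call hybris or SOLR.
final class ConsolidatedIndexingSketch {

    // Records what the simulated indexer did, so the behaviour can be inspected.
    final List<String> log = new ArrayList<>();

    // Step 6 idea: re-create the shared core once, before the first type only,
    // so that indexing the second type does not wipe the documents of the first.
    void fullIndex(List<String> indexedTypes) {
        boolean coreRecreated = false;
        for (String type : indexedTypes) {
            if (!coreRecreated) {
                log.add("recreate core");
                coreRecreated = true;
            }
            log.add("index " + type);
        }
    }

    // Step 7 idea: join the indexable text of all page components
    // into one value for a single "keywords" field. Empty texts are skipped.
    static String keywords(List<String> componentTexts) {
        StringBuilder joined = new StringBuilder();
        for (String text : componentTexts) {
            if (text == null || text.trim().isEmpty()) {
                continue;
            }
            if (joined.length() > 0) {
                joined.append(' ');
            }
            joined.append(text.trim());
        }
        return joined.toString();
    }
}
```

The important part of the first method is the guard: without it, the core would be re-created before ContentPage is indexed and the Product documents would be lost.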
