Consolidating content and product search in SOLR

Situation

Hybris search is designed primarily to deal with products. There are no content page search capabilities in hybris.

Hybris uses Apache SOLR for search. Using SOLR allows hybris to introduce such features as facet search, fuzzy search, and search-based category pages.

Solr’s basic unit of information is a document, which is a set of data that describes something. These documents are composed of fields. A product document would contain the product attributes, categories of the product, keywords, and so on. Shoe size could be a field.

The structure of a SOLR document is defined by the SOLR schema. Some fields can be defined as dynamic, which allows hybris to store varying sets of product attributes in the SOLR index.
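To make the idea of fields and dynamic fields more concrete, here is a minimal sketch in plain Java. It does not use SolrJ or any hybris class: the document is just a map, and the field names (name_text_en, category_string_mv, shoeSize_string) and the "*_string" pattern are assumptions chosen for illustration, not values taken from a real schema.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SolrDocumentSketch {

    // In-memory stand-in for a product document: field name -> value.
    // All field names here are illustrative assumptions.
    static Map<String, Object> productDocument() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "product-0001");                       // unique key of the document
        doc.put("name_text_en", "Running shoe");             // localized text field
        doc.put("category_string_mv", List.of("shoes", "running")); // multi-valued field
        doc.put("shoeSize_string", "42");                    // attribute stored via a dynamic field
        return doc;
    }

    // Returns true if fieldName would be covered by a dynamic-field pattern.
    // Only the leading-wildcard form ("*suffix") is handled in this sketch.
    static boolean matchesDynamicPattern(String fieldName, String pattern) {
        return pattern.startsWith("*") && fieldName.endsWith(pattern.substring(1));
    }

    public static void main(String[] args) {
        Map<String, Object> doc = productDocument();
        for (String field : doc.keySet()) {
            System.out.println(field + " dynamic(*_string)=" + matchesDynamicPattern(field, "*_string"));
        }
    }
}
```

The point of the dynamic pattern is that a new attribute such as heelHeight_string could be indexed without touching the schema, as long as its name ends with a suffix the schema already declares.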

The hybris SOLR indexer fetches the information from the database, converts it into the SOLR document format, and loads these documents into SOLR.

To fetch this data back from SOLR, hybris uses the Lucene query language and the indexes created by the SOLR indexer.
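As a rough illustration of what such a query looks like, the sketch below builds a Lucene-style query string in plain Java. The helper names (phrase, and) and the field names are hypothetical; hybris assembles its queries internally, so this only shows the shape of the syntax, not the hybris API.

```java
final class LuceneQuerySketch {

    // Builds a phrase clause such as field:"some value".
    // Backslashes and double quotes inside the value are escaped.
    static String phrase(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    // Joins clauses so that every one of them must match.
    static String and(String... clauses) {
        return String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        // Field names are assumptions made for this example.
        String query = and(
                phrase("keywords_text_en", "running shoes"),
                phrase("shoeSize_string", "42"));
        System.out.println(query);
    }
}
```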

SOLR is a lot faster than traditional databases for full-text and faceted search, which makes it one of the best choices for e-commerce solutions. However, there are some important limitations.

Complexity

If your website has both plain static content and product pages, you may want to run a keyword search over both of them. For example, if your content pages are articles (reviews or news), they might be tagged in the same way as the product pages. These tags could be used as facets to filter pages (both product pages and content pages) on the same topic.

It is easy to configure hybris to use two indexes, one per page type. The results will then also be grouped by page type.

search_1

However, this approach doesn’t allow customers to filter all the results (both products and content pages) by topic, for example. The idea of today’s experiment is to get consolidated results. For the example mentioned above, it should look like this:

search_2

The first guess is to add a new type to the list of item types (indexed types).

search5

However, this will not work in hybris. Out of the box, hybris ships with only one indexed type, so its code expects exactly one.

The hybris SOLR indexer creates a SOLR core per type. Hybris SOLR search is not able to mix items from different SOLR cores. Moreover, hybris SOLR search can’t work with a collection of item types. All the classes of hybris SOLR search work with only ONE item type instance, even when you have more. If you have two types, hybris SOLR search will use only the first one. In addition to that, hybris SOLR search is designed to work with product catalogs only.

To overcome these limitations, you need to customize both the indexer and the search module. The technical details are below the video. How deeply they need to be customized depends on the specific requirements.

Solution

Technical details

To get consolidation working, you need to:

  1. Add a new SolrIndexedType item for the ContentPage type.
  2. Add new full/update queries. Let’s take the simplest one, “SELECT {PK} FROM {ContentPage}”.
  3. Add Solr indexed properties. They should be compatible with the commerce Solr properties, because they will share the same SOLR core.
  4. Create a new populator that extends SearchSolrQueryPopulator. You need it to overcome the issue with the hybris search module and content catalogs.
    The original hybris populator works with product catalogs only:

        final Collection<CatalogVersionModel> catalogVersions = getSessionProductCatalogVersions();
        target.setCatalogVersions(catalogVersions);

    In my PoC I got rid of it:

        target.setCatalogVersions(new ArrayList<CatalogVersionModel>());
  5. Create your own SolrCoreNameResolver to make hybris use one core for different types.
  6. Create your own versions of ConfigurationExporterListener.beforeIndex and FullDirectIndexOperationStrategy.beforeIndex, because the default ones re-create the SOLR core every time the indexer moves on to the next type in the indexedTypes list.
  7. Add your own keyword providers for content pages. For example, they can pull out the indexable content of all the page’s components and place it into a single text SOLR field (“keywords”, for example).
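Steps 5 and 7 can be sketched roughly as follows. This is plain Java with no hybris dependencies: the class and both methods are hypothetical stand-ins, not the real SolrCoreNameResolver or value provider interfaces, whose signatures are not shown in this article. The "_consolidated" suffix and the naive tag stripping are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class ConsolidatedIndexSketch {

    // Stand-in for step 5: every indexed type resolves to the same core name.
    // The indexedTypeCode is deliberately ignored so that Product and
    // ContentPage documents end up in one shared core.
    static String resolveCoreName(String indexName, String indexedTypeCode) {
        return indexName + "_consolidated";
    }

    // Stand-in for step 7: merges the text of all page components into one
    // value for a single "keywords" field. Markup is removed with a naive
    // regex, which is enough for a sketch but not for arbitrary HTML.
    static String buildKeywords(List<String> componentContents) {
        List<String> parts = new ArrayList<>();
        for (String content : componentContents) {
            if (content == null) {
                continue; // component without indexable content
            }
            String text = content.replaceAll("<[^>]*>", " ")
                                 .replaceAll("\\s+", " ")
                                 .trim();
            if (!text.isEmpty()) {
                parts.add(text);
            }
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        System.out.println(resolveCoreName("shop", "Product"));
        System.out.println(resolveCoreName("shop", "ContentPage"));
        System.out.println(buildKeywords(List.of("<p>Summer <b>sale</b></p>", "New arrivals")));
    }
}
```

In a real project the same logic would live in the customized hybris beans described in the list above. With one shared core, it may also help to index a field that tells products and content pages apart, so the results can still be faceted by type.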


2 comments

  1. Hi. Thanks for the detailed article. I have a question about products rather than content pages.

    Is it possible to “group” the products by base product? I want to see the other color variants next to the first color variant. For example, I have 10 products:

    C Red
    C Blue
    B Red
    A Gray
    D Yellow
    B Blue
    C Yellow

    I want to see these products this way:

    C Red
    C Blue
    C Yellow (moved here)
    B Red
    B Blue (moved here)
    A Gray
    D Yellow

    Is it possible OOTB? If not, what would be the best way?

    I look forward to your advice. Thanks.


  2. The site looking good. I can’t wait to see future posts on this site. I hope you will share informative posts through this blog. Looking forward to see the next post.
