In SAP Hybris Commerce, requests to the SOLR search engine are created by query builders. Simply put, these components convert user queries into SOLR queries. You cannot explain search results unless you know which SOLR request was generated and why it contains particular conditions in a particular form. Unfortunately, the official documentation is sparse and lacks examples. This article explains the differences between the available query builders. It also complements one of the previous articles about search relevancy.
Hybris has the following OOTB query builders:
- Default Free Text Query Builder
- Multi-Field Free Text Query Builder
- DisMax Free Text Query Builder
Hybris uses the custom relevance formula for two of these builders, Default and DisMax. For the Multi-Field builder, Hybris uses the SOLR default formula (LuceneQParser). For details on this topic, see the last section of this article.
Default Free Text Query Builder
The default query builder is the simplest in Hybris. As its name suggests, this query builder is used by default. Note that it uses the Hybris custom relevancy formula, multiMaxScore (see the last section for details).
Example. If you search for:
full text string
your SOLR request will look like this:
- For each field defined as “Full text”:
  - OR EXACT MATCH for: full (boosting X)
  - OR EXACT MATCH for: text (the same)
  - OR EXACT MATCH for: string (the same)
  - If the “wildcard” flag is active for the field:
    - OR WILDCARD MATCH for: full* (boosting X/2)
    - OR WILDCARD MATCH for: text* (the same)
    - OR WILDCARD MATCH for: string* (the same)
  - If fuzzy search is active for the field:
    - If the field type is “text”:
      - OR FUZZY MATCH for: full~ (with the specified fuzziness, if specified; boosting X/4)
      - OR FUZZY MATCH for: text~ (the same)
      - OR FUZZY MATCH for: string~ (the same)
  - If the field type is “text”:
    - If the “phrase search” flag is active for the field:
      - OR PHRASE MATCH for: full text string (boosting X*2)
Note that:
- There is only one configurable boosting factor: the one for the field. The wildcard, fuzzy, and phrase search boosting factors are derived from the field boosting factor and are non-configurable (hard-coded).
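The boosting rules listed above can be sketched in a few lines of Python. This is only an illustration of the factors named in the list (X, X/2, X/4, X*2), not the Hybris implementation; the function names `derived_boosts` and `default_builder_clauses` and their parameters are hypothetical, and the real builder may format or order clauses differently.

```python
def derived_boosts(field_boost):
    """Derive the hard-coded boosts from the single configurable field boost X."""
    x = float(field_boost)
    return {
        "exact": x,         # X
        "wildcard": x / 2,  # X/2
        "fuzzy": x / 4,     # X/4
        "phrase": x * 2,    # X*2
    }


def default_builder_clauses(field, field_boost, terms, *, is_text,
                            wildcard=False, fuzzy=False, phrase=False):
    """Return the OR-ed clauses for one field (hypothetical helper)."""
    boosts = derived_boosts(field_boost)
    exact, wild = boosts["exact"], boosts["wildcard"]
    fuzz, phr = boosts["fuzzy"], boosts["phrase"]

    clauses = [f"({field}:{t}^{exact})" for t in terms]
    if wildcard:
        clauses += [f"({field}:{t}*^{wild})" for t in terms]
    if fuzzy and is_text:
        # Fuzzy clauses are only generated for fields of type "text".
        clauses += [f"({field}:{t}~^{fuzz})" for t in terms]
    if phrase and is_text:
        # The phrase clause uses the whole user query as one quoted phrase.
        joined = " ".join(terms)
        clauses.append(f'({field}:"{joined}"^{phr})')
    return clauses


if __name__ == "__main__":
    terms = ["full", "text", "string"]
    for clause in default_builder_clauses("name_text_fr", 100, terms,
                                          is_text=True, fuzzy=True, phrase=True):
        print(clause)
```

The point of the sketch is that only `field_boost` is an input; every other factor follows from it.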
For example, the SOLR query for the request:
word1 "word2 word3" word4
will look like this:
(
(code_string:word1^90.0) OR
(keywords_text_fr:word1^20.0) OR
...
(name_text_fr:word1^100.0)
) OR (
(code_string:"word2 word3"^90.0) OR
(keywords_text_fr:"word2 word3"^20.0) OR
...
(name_text_fr:"word2 word3"^100.0)
) OR (
(code_string:word4^90.0) OR
...
(name_text_fr:word4^100.0)
) OR (
(keywords_text_fr:word1~^10.0) OR
...
(name_text_fr:word1~^25.0)
) OR (
(keywords_text_fr:"word2 word3"~^10.0) OR
...
(name_text_fr:"word2 word3"~^25.0)
) OR (
(keywords_text_fr:word4~^10.0) OR
...
(name_text_fr:word4~^25.0)
) OR (
(code_string:word1*^45.0) OR
(ean_string:word1*^50.0)
) OR (
(code_string:"word2 word3"*^45.0) OR
(ean_string:"word2 word3"*^50.0)
) OR (
(code_string:word4*^45.0) OR
(ean_string:word4*^50.0)
) OR (
(keywords_text_fr:"word1 word2 word3 word4"^40.0) OR
...
(name_text_fr:"word1 word2 word3 word4"^100.0)
)
So the pattern is:
EXACT (f1,f2,...fN) OR FUZZY (f1,f2,...fN) OR WILDCARD (f1,f2,...fN) OR PHRASE (f1,f2,...fN).
Hybris uses multiMaxParser, so the largest score wins in all pattern components (both f1…fN and EXACT/FUZZY/WILDCARD/PHRASE groups).
Multi-Field Free Text Query Builder
According to the documentation, it builds the query in a way that the final score will be the sum of the scores of all subqueries. This is how SOLR works by default.
However, it works differently from the default free text query builder in other aspects as well.
Tokens. The builder tokenizes the user query by splitting it by whitespace characters. However, it also supports quoted phrases. For example:
User query:
word1 "word2 word3" word4
Result:
- word1
- word2 word3
- word4
Note that the second and third words are considered a single token here. Only double quotes work.
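The tokenization rule described above (split on whitespace, keep double-quoted phrases together) can be sketched as follows. This is a minimal illustration, not the Hybris tokenizer; the function name `tokenize` and the regular expression are assumptions made for this example.

```python
import re

# Either a double-quoted phrase (captured without the quotes)
# or a run of non-whitespace characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def tokenize(user_query):
    """Split a user query into tokens, treating "..." as a single token."""
    tokens = []
    for quoted, bare in _TOKEN.findall(user_query):
        token = quoted if quoted else bare
        if token:  # skip empty quoted strings such as ""
            tokens.append(token)
    return tokens


if __name__ == "__main__":
    print(tokenize('word1 "word2 word3" word4'))
    # Single quotes are not special, so the words stay separate tokens.
    print(tokenize("word1 'word2 word3' word4"))
```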
Phrase queries. In the example above, the phrase query is built from the original query by removing the double quotes. So the phrase query will look like this:
word1 word2 word3 word4
Boosting. It uses specific boost factors for the exact match, fuzzy match, wildcard match, and phrase match, and these factors are configurable in Backoffice. Fuzziness, sloppiness, and the wildcard query type are configurable too.
Sloppiness. A sloppy phrase query specifies a maximum “slop,” or the number of positions tokens need to be moved to get a match. In other words, it defines how many transpositions of the words need to be done for the exact match. The slop is zero by default, requiring exact matches.
For example, “the President of first” with:
slop=3
will match the document containing “the first President of the USA is Washington”, but:
slop=2
won’t.
Slop=2 will work for the query “the President first”, for example.
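To check the slop values in the examples above, here is a simplified model of how the required slop can be estimated: for each query term, take its position in the document minus its position in the query, and measure the spread between the largest and smallest shift. This is a sketch under simplifying assumptions (whitespace tokenization, lowercasing, no stop-word removal, no special handling of repeated query terms); the real Lucene sloppy phrase matching is more involved. The function name `required_slop` is hypothetical.

```python
from itertools import product


def required_slop(document, phrase):
    """Estimate the minimum slop needed for `phrase` to match `document`.

    Returns None if some query term does not occur in the document.
    """
    doc_tokens = document.lower().split()
    query_tokens = phrase.lower().split()

    # All document positions at which each query term occurs.
    positions = []
    for term in query_tokens:
        hits = [i for i, tok in enumerate(doc_tokens) if tok == term]
        if not hits:
            return None
        positions.append(hits)

    best = None
    for combo in product(*positions):
        # Shift of each term: where it is in the document vs. in the query.
        shifts = [doc_pos - query_pos for query_pos, doc_pos in enumerate(combo)]
        spread = max(shifts) - min(shifts)
        if best is None or spread < best:
            best = spread
    return best


if __name__ == "__main__":
    doc = "the first President of the USA is Washington"
    print(required_slop(doc, "the President of first"))
    print(required_slop(doc, "the President first"))
```

Under this model, a phrase matches when the configured slop is greater than or equal to the returned value.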
Fuzziness. Fuzziness is a similar thing, but for the letters of the tokens. It is the maximum allowed number of edits to match. For example:
persident
will match:
president
with fuzziness=1.
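The example above works with fuzziness=1 only if swapping two adjacent letters counts as a single edit, which is how Lucene fuzzy queries behave by default as far as I know (Damerau-Levenshtein distance). The sketch below implements the "optimal string alignment" variant of that distance to make the arithmetic visible; the function name `edit_distance` is an assumption for this example, and it is not the code Lucene runs internally.

```python
def edit_distance(a, b):
    """Edit distance where insert, delete, substitute, and adjacent swap cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters ("er" <-> "re").
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


if __name__ == "__main__":
    print(edit_distance("persident", "president"))
```

Without the transposition rule, the same pair would need two substitutions, so fuzziness=1 would not be enough.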
For example, the SOLR query for the request:
word1 "word2 word3" word4
will look like this:
(code_string:
(word1^90.0 OR
"word2 word3"^90.0 OR
word4^90.0 OR
word1*^45.0 OR
"word2 word3"*^45.0 OR
word4*^45.0)
) OR
(keywords_text_fr:
(word1^20.0 OR
"word2 word3"^20.0 OR
word4^20.0 OR
word1~^10.0 OR
"word2 word3"~^10.0 OR
word4~^10.0 OR
"word1 word2 word3 word4"^40.0)
) OR
...
(name_text_fr:
(word1^100.0 OR
"word2 word3"^100.0 OR
word4^100.0 OR
word1~^25.0 OR
"word2 word3"~^25.0 OR
word4~^25.0 OR
"word1 word2 word3 word4"^100.0)
)
So the pattern is:
f1 (EXACT, FUZZY, WILDCARD, PHRASE) OR f2 (EXACT, FUZZY, WILDCARD, PHRASE) ... OR fN (EXACT, FUZZY, WILDCARD, PHRASE).
DisMax Free Text Query Builder
Similar to the previous one, but it groups some of the subqueries. The score for the group will be the maximum score of the subqueries that belong to that group, not the sum. Hybris uses its custom relevancy formula (multiMaxScore). For details on this topic, see the last section of this article.
The DisMax Query Builder also supports quotes in the query, boosting, sloppiness, and fuzziness.
This query builder supports the parameters groupByQueryType and tie.
- Group By Query Type. It changes the way disjunction max queries are grouped. If set to true, it also groups queries by type, where the types are: free text query, free text fuzzy query, and free text wildcard query.
- Tie. The tie parameter defines how much the final score of the query will be influenced by the scores of the lower-scoring fields compared to the highest-scoring field: 0.0 makes a query a pure “disjunction max query”; 1.0 makes the query a pure “disjunction sum query,” where it doesn’t matter what the maximum-scoring subquery is.
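The effect of the tie parameter can be illustrated with the standard disjunction max formula: the best subquery score plus tie times the sum of the remaining scores. The sketch below uses made-up scores and a hypothetical function name `dismax_score`; it shows the arithmetic only, not how Hybris wires the parameter into the query.

```python
def dismax_score(subquery_scores, tie=0.0):
    """Combine subquery scores: max + tie * (sum of all the other scores)."""
    if not subquery_scores:
        return 0.0
    best = max(subquery_scores)
    others = sum(subquery_scores) - best
    return best + tie * others


if __name__ == "__main__":
    scores = [0.55, 0.30, 0.10]  # hypothetical per-field scores
    for tie in (0.0, 0.1, 1.0):
        print(tie, round(dismax_score(scores, tie), 4))
```

With tie=0.0 only the best field counts; with tie=1.0 the result equals the plain sum of all field scores.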
Understanding MultiMax Query Parser
Hybris uses the custom MultiMax query parser developed by SAP for two query builders, DisMax and Default. The plugin is very simple, but you need to know that it modifies the way the score is calculated.
The easiest way to explain it is to demonstrate the internals by example.
Let’s take the following sample documents for experimentation:
Document #1.
- id: “doc1”
- title_text_en: “the first President of the USA is Washington titleA”
- description_text_en: “the first President of the USA is Washington”
Document #2
- id: “doc2”
- title_text_en: “the second President of the USA is John Adams titleB”
- description_text_en: “the second President of the USA is John Adams”
Document #3
- id: “doc3”
- title_text_en: “the first head of the USA is Washington titleC”
- description_text_en: “the first head of the USA is Washington”
Take the following sample request:
(title_text_en:"first" OR description_text_en:first) OR (title_text_en:"titleC" OR description_text_en:"titleC")
All components are joined using OR. The request is very close to what Hybris creates using the query builders (see the examples above).
Let’s examine the relevancy calculation for two different cases:
- Default parser (LuceneQParser)
- Hybris custom parser (multiMaxScoreParser)
and compare the results.
The screenshots below may look difficult to understand. Don’t read everything — just look through. Note that multiMaxScore uses a max function, while LuceneQParser uses a sum function. This is a key difference between the custom and default query parsers.



This debug information shows that:
- LuceneQParser calculates the score for each subquery and sums them up to get the total query score.
- multiMaxScoreParser also sums up the scores of the subqueries. However, within each subquery it doesn’t sum up the scores of the components; it takes the maximum of them instead.
The last statement means that in the default Hybris implementation of the scoring formula and with the DisMax/MultiMax parsers, it may not be important how many fields contain a particular token. For the particular token and particular field, the score depends on the global and local frequency of the term and the field length. I used “may not” because there are other components of the formula that make the dependency indirect.
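The difference between the two behaviours described above can be shown with a toy model. Each document is represented by a list of groups (one per parenthesized subquery), and each group holds the scores of the field clauses that matched. The scores and document names below are invented for illustration, and the functions `lucene_style_score` and `multimax_style_score` are hypothetical simplifications that ignore every other part of the real scoring formula.

```python
def lucene_style_score(groups):
    """Sum every matching clause score, inside and across groups."""
    return sum(sum(group) for group in groups)


def multimax_style_score(groups):
    """Take the maximum inside each group, then sum the group results."""
    return sum(max(group) for group in groups if group)


if __name__ == "__main__":
    # Hypothetical clause scores: doc_a matches the token in two fields,
    # doc_b matches it in one field only, but with a higher score.
    docs = {
        "doc_a": [[0.5, 0.5]],
        "doc_b": [[0.7]],
    }
    for name, scorer in (("sum", lucene_style_score), ("max", multimax_style_score)):
        ranking = sorted(docs, key=lambda d: scorer(docs[d]), reverse=True)
        print(name, ranking)
```

With summing, the document that matches in more fields comes first; with the maximum, the number of matching fields stops mattering and the single best field decides.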
For example, “first” is used in both fields, in the title and in the description. The Hybris formula, multiMaxScore, calculates:
score = 0.55
because it is the maximum of the scores for:
title_text_en:first
and:
description_text_en:first
For example, we have the following documents in SOLR:

Let’s take the following request:
title_text_en:"first" OR description_text_en:first
Different parsers will show the documents in a different order:

The eDisMax parser is based on LuceneQParser, so you will have the same scores and results with eDisMax for this set.
Default Query Builder Example

Multi-Field Query Builder Example
Note that the order of the documents is a bit different because of the different way of grouping and calculating subqueries.

To sum up:
- Default Query Builder uses only one boosting factor; all others are built based on this one. It doesn’t recognize quotes in the query. It uses multiMaxScore instead of the SOLR default LuceneQ.
- Multi-Field Query Builder doesn’t use multiMaxScore. It recognizes quotes in the query. It supports exact match, phrase, fuzzy and wildcard boosts, fuzziness, and sloppiness.
- DisMax Query Builder uses multiMaxScore and recognizes quotes. It supports exact match, phrase, fuzzy and wildcard boosts, fuzziness, and sloppiness.
© Rauf Aliev, August 2017