Implementing Product Whitelisting/Blacklisting in SAP Commerce Cloud for Large Product and Customer Bases
Introduction
Document-level access control ensures that search results contain only the products a logged-in customer is authorized to see. This is a common requirement for B2B solutions with large and sophisticated product and customer models. Many manufacturers and suppliers want to provide exclusive or restricted access to products for particular partners. Such an approach reduces the number of incorrect or incomplete orders and makes navigation easier. In this article, we discuss in detail how to support product whitelisting/blacklisting per customer in SAP Commerce Cloud for a large product set and a large customer base. We also present our solution and possible alternatives. Our tests showed that the solution is capable of processing millions of documents, tens of thousands of customers, and millions of access rules defining which product is blacklisted/whitelisted for which customer. This article is a collaborative effort of EPAM solution architects:
- Igor Sokolov, EPAM Solution Architect, USA, igor_sokolov@epam.com
- Rauf Aliev, EPAM Solution Architect, USA, rauf_aliev@epam.com
- Artsiom Yemelyanenka, EPAM Solution Architect, USA, Artsiom_Yemelyanenka@epam.com
Problem definition
SAP Commerce has limited support for fine-grained product access control on large amounts of data. One out-of-the-box option is Product Visibility, configured at the category level, which works only with the persistence layer and not through search and Apache Solr. Another approach is item-level access control. Both work fine when the amounts of data involved are relatively small; when it comes to millions of products and customers, the out-of-the-box solutions won't fit. There are many facets and specific details in the original task we needed to take into account for solutioning. In this article, we discuss only one particular problem in isolation:
- The Access Control Lists (ACL), the rules saying which product is whitelisted/blacklisted for which customers, are provided by an external system via data integration. The integration details are out of scope in the context of this article.
- The key challenge is how to implement product search. For other components, the solution is trivial.
- There are 1,000,000 products (P) in the product catalog.
- There are 1,000 product groups (PG).
- There are 30,000 customers (C).
- There are 5,000 customer groups (CG).
- There are 2,000,000 rules (C<->P, CG<->P, C<->PG, CG<->PG).
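To make the four rule shapes concrete, here is a minimal in-memory sketch of resolving whether a customer may see a product. The class and method names are ours, not part of SAP Commerce, and the precedence choice (an explicit deny wins over any allow) is an assumption for illustration; the actual precedence is defined by the external system providing the rules.

```java
import java.util.*;

// Hypothetical in-memory model of the four rule shapes (C<->P, CG<->P, C<->PG, CG<->PG).
// A rule grants (whitelist) or denies (blacklist) a principal access to a target.
public class AclResolver {
    record Rule(String principal, String target, boolean allow) {}

    private final List<Rule> rules = new ArrayList<>();

    public void add(String principal, String target, boolean allow) {
        rules.add(new Rule(principal, target, allow));
    }

    // principals = the customer id plus all of its customer-group ids;
    // targets = the product id plus all of its product-group ids.
    // Assumption for this sketch: a blacklist rule wins over any whitelist rule.
    public boolean isVisible(Set<String> principals, Set<String> targets) {
        boolean allowed = false;
        for (Rule r : rules) {
            if (principals.contains(r.principal) && targets.contains(r.target)) {
                if (!r.allow) return false; // explicit deny
                allowed = true;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        AclResolver acl = new AclResolver();
        acl.add("CG1", "PG7", true);   // group CG1 may see product group PG7
        acl.add("C2", "P100", false);  // customer C2 may not see product P100
        System.out.println(acl.isVisible(Set.of("C1", "CG1"), Set.of("P100", "PG7"))); // true
        System.out.println(acl.isVisible(Set.of("C2", "CG1"), Set.of("P100", "PG7"))); // false
    }
}
```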
Solution
As you know, SAP Commerce Cloud is tightly integrated with Apache Solr. This search engine is used not only for full-text search but also for populating product listing pages. There is no easy way to implement the required functionality by reconfiguring SAP Commerce or Apache Solr. Additionally, because of the cloud nature of the new SAP Commerce and the limitations that come with it, adding new third-party software capable of supporting document-level access is not a solution either.

Apache Solr, like many other search engines, uses an inverted index, a central component of almost all search engines and a key concept in Information Retrieval. Both full-text search and facets are built on top of the inverted index, and the limitations of search engines originate from the limitations of the inverted index. The simplest and most straightforward approach is listing the relevant customers or customer groups in a designated product attribute and using it for facet filtering by putting a customer id or customer group into a hidden facet. At the indexing phase, these ids are treated as terms by Solr. However, it was obvious to us that such a straightforward approach won't work with millions of products and tens of thousands of customers and customer groups.

In this document, we'll use the abbreviation ACL (Access Control List) to represent a list of customers and customer groups that can access a product or product group. Products have an ACL associated with them. The list is non-ordered. There are separate lists for allow and disallow rule groups. There are four topics we needed to study:
- ACL format: the ACL items, their order, and format.
- Where to store the ACLs: should we store the ACL field along with the other product information?
- How to store the ACLs: what changes should we make to the Solr configuration? How should the field type be configured in Solr's schema.xml for performance and scalability?
- What changes should we make in SAP Commerce? How scalable is the solution after making these changes?
ACL format
An ACL specifies allowed and disallowed customers as well as allowed and disallowed customer groups. Each customer or group is represented by a unique ID of up to eight characters in length. The order of the items doesn't matter. There are two types of ACL:
- whitelist
- blacklist
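Since the order of ACL items is irrelevant and IDs are at most eight characters, an ACL can be normalized before indexing. A minimal sketch (assuming lowercase IDs, consistent with the search listener shown later, which lowercases group uids; sorting is only for stable output, as the list itself is unordered):

```java
import java.util.*;
import java.util.stream.*;

// Normalize an ACL into a space-delimited form suitable for indexing:
// lowercase, deduplicated, with the eight-character ID limit enforced.
public class AclFormat {
    public static String normalize(Collection<String> ids) {
        return ids.stream()
                .map(String::toLowerCase)
                .peek(id -> {
                    if (id.length() > 8) throw new IllegalArgumentException("ID too long: " + id);
                })
                .distinct()
                .sorted()                      // stable output; order carries no meaning
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(normalize(List.of("CG6", "C3", "cg6", "C5")));
    }
}
```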
How to store ACL
We experimented with two methods of storing ACLs for products:
- in Apache Solr
- in Redis
Apache Solr: StrField type vs TextField type
To store ACLs in Solr, we need to find the field type best suited for storing a simple list of IDs. Apache Solr provides two types for text fields out of the box:
- solr.StrField (the associated Lucene field class is StringField)
- solr.TextField type (associated field java class is TextField)
<fieldType name="string" class="solr.StrField" docValues="true" sortMissingLast="true"/>
…
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
…
<dynamicField name="*_string" type="string" indexed="true" stored="true"/>
<dynamicField name="*_string_mv" type="string" indexed="true" stored="true" multiValued="true"/>
…
<dynamicField name="*_text" type="text" indexed="true" stored="true"/>
<dynamicField name="*_text_mv" type="text" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_text_en" type="text_en" indexed="true" stored="true"/>
<dynamicField name="*_text_en_mv" type="text_en" indexed="true" stored="true" multiValued="true"/>
…
It is worth drawing attention to the attribute docValues="true" in the string type definition. According to the Apache Solr documentation, docValues tells Solr to use a column-oriented approach, with a document-to-value mapping built at index time, instead of the standard row-oriented one. In other words, values of docValues fields are densely packed into columns instead of being sparsely stored like they are with stored fields. This feature was added in Lucene 4.0 to improve performance for faceting, sorting, and highlighting.
The faceting engine, for example, needs to look up each term that appears in each document in the result set and pull the document IDs in order to build the list of facets. Of course, docValues consumes significantly more memory than the regular inverted-index representation. Before describing how we used docValues in our final solution, let's have a look at the tests and experiments we conducted to gather more input and insights.
Load Tests
The purpose of this test is to get ballpark estimates of Solr indexing performance for a large dataset with different text field types. We generated a test set with random ACL field values and indexed it using curl (see Uploading data with index handlers) on a regular MacBook. We needed rough estimates and relative numbers, so we used the standard SAP Commerce schema and server configuration (including JVM settings). The setup includes:
- 2.2 GHz Intel Core i7, 16GB RAM, macOS X
- 1,000,000 products
- Two Solr attributes, each containing a comma-separated list of customers/customer groups:
- those allowed to access the item
- those not allowed to access the item
Product ID | Customers and groups allowed | Customers and groups not allowed |
Product1 | C3, C5, CG1, CG6 | C2, C4 |
Product2 | C1, CG3, C5 | C2, CG6 |
Product3 | CG1, C2 | C3 |
- The number of items in each list is random, from 0 to 1000.
- The customer and customer group IDs are random, from 0 to 10000.
- Product names and codes are random and unique.
- Product IDs are random and unique.
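A minimal sketch of such a test-data generator is shown below. The field names (code_string, allowed_groups_acl, denied_groups_acl) are illustrative, not the exact names used in the original test set, and the output shape is a simplified stand-in for the generated JSON.

```java
import java.util.*;

// Generate one random test document in the shape described above:
// random product code, plus random allowed/denied ACLs of 0..maxItems entries.
public class TestDataGenerator {
    static String randomAcl(Random rnd, int maxItems) {
        int n = rnd.nextInt(maxItems + 1);
        StringJoiner sj = new StringJoiner(" ");
        for (int i = 0; i < n; i++) {
            // customer (c) or customer-group (cg) id in the range 0..9999
            sj.add((rnd.nextBoolean() ? "c" : "cg") + rnd.nextInt(10000));
        }
        return sj.toString();
    }

    public static void main(String[] args) {
        Random rnd = new Random(42); // fixed seed for reproducibility
        String doc = String.format(
            "{\"id\":\"%s\",\"code_string\":\"%s\",\"allowed_groups_acl\":\"%s\",\"denied_groups_acl\":\"%s\"}",
            UUID.randomUUID(),
            "product-" + rnd.nextInt(1_000_000),
            randomAcl(rnd, 1000),
            randomAcl(rnd, 1000));
        System.out.println(doc);
    }
}
```

A real generator would loop a million times and stream the documents to a file rather than hold them in memory.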
Field type | Solr.TextField | Solr.StrField |
Loading the whole dataset | 1388 items/sec | Out-of-memory |
Loading dataset in 20000-item chunks into an empty index | 1333 items/sec | Initially 2000-2500 items/sec. But Out of memory! |
‘Multivalued’ vs space-delimited field
The next challenge is how to represent a list of user/user-group IDs. According to the Apache Solr documentation, there are two major options:
- 'multivalued' fields containing a list of IDs (see Solr field properties)
- a text field containing a space-delimited list of IDs (tokenized, for example, with the Standard Tokenizer)
Load and Update tests
This test was aimed at putting the arguments above to the test. Additionally, we added a new field type, Solr.StrField with multiValued on and docValues disabled. The setup was the same as in the previous experiment.
Field type | Space-delimited Solr.TextField | Multivalued Solr.TextField | Multivalued Solr.StrField (docValues=false) |
Loading the whole dataset | 1388 items/sec | Out-of-memory | 980 items/sec |
Loading dataset in 20000-item chunks into an empty index | 1333 items/sec | 500 items/sec | 1111 items/sec |
Atomic update: Removing items from the list, only even groups are removed (~50%) | N/A | 444 items/sec. | 645-700 items/sec |
Atomic update: adding one item to the list | N/A | 606 items / sec | 1333 items/sec |
Atomic update: removing one item from the list | N/A | 476 items / sec | 1333 items/sec |
Atomic update: replacing the list with the shorter version (50% shorter, removed all even items) | 1300 items/sec | N/A | 1333 items/sec |
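The atomic updates benchmarked above use Solr's JSON update syntax with per-field modifiers ("add", "remove", "set"). A sketch of the request bodies (the field and document names are illustrative):

```java
// Build Solr atomic-update request bodies matching the operations benchmarked above.
public class AtomicUpdates {
    static String update(String id, String field, String op, String value) {
        return String.format("[{\"id\":\"%s\",\"%s\":{\"%s\":\"%s\"}}]", id, field, op, value);
    }

    public static void main(String[] args) {
        // Add one item to a multivalued ACL field:
        System.out.println(update("Product1", "allowed_groups_acl", "add", "cg7"));
        // Remove one item:
        System.out.println(update("Product1", "allowed_groups_acl", "remove", "cg7"));
        // Replace the whole value (the only option for a plain text field):
        System.out.println(update("Product1", "allowed_groups_acl", "set", "c3 c5 cg1"));
    }
}
```

Note that Solr atomic updates require the other fields of the document to be stored or to have docValues, so that the document can be reconstructed server-side; this interacts with the stored=false optimization discussed later.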
Optimizing SOLR and OOTB Search Server Configuration
In the previous test, we used the OOTB configuration, which is expectedly not optimized for such volumes and such a data model. From the whole list of properties on the Field type definitions and properties page, the following ones deserve attention:
- stored. "If true, the actual value of the field can be retrieved by queries."
- omitTermFreqAndPositions. "If true, omits term frequency, positions, and payloads from postings for this field. This can be a performance boost for fields that don't require that information. It also reduces the storage space required for the index. Queries that rely on position that are issued on a field with this option will silently fail to find documents."
- omitNorms. "If true, omits the norms associated with this field (this disables length normalization for the field, and saves some memory). Defaults to true for all primitive (non-analyzed) field types, such as int, float, data, bool, and string. Only full-text fields or fields that need an index-time boost need norms."
<fieldType name="acl_text" class="solr.TextField" positionIncrementGap="0" omitNorms="true" omitTermFreqAndPositions="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
…
<dynamicField name="*_acl_nonstored" type="acl_text" indexed="true" stored="false"/>
<dynamicField name="*_acl" type="acl_text" indexed="true" stored="true"/>
Load tests
This time, the tests were run in AWS (r5x.large, 4 vCPU, 32GB) using the SolrJ library. SolrJ provides multi-threaded data load and uses the CPU more efficiently than the simple curl-based loader. We used a dockerized Apache Solr 7.2.1 with Xmx=2GB. The enhanced field type was added to SAP Commerce's out-of-the-box Solr configuration. The size of the generated JSON file with 1M products and up to 3,000 ACL elements was about 8GB.
Dataset – 1M products and… | _text_en single thread – curl | _text_en solrj, 4 threads | _acl solrj, 4 threads |
500 acl elements | 42 min | 31 min | 5.3 min |
1000 acls | 86 min | 57 min | 6.5 min |
2000 acls | 163 min | 111 min | 9 min |
3000 acls | 252 min | 168 min | 12 min |
- The enhancements help a lot. The load time is up to 20x faster with the non-stored _acl fields.
- Multi-threaded data load is about 1.5x faster than single-threaded.
- The load time grows with the number of terms (ACL elements) per document, but this growth is smallest for the optimized field type.
Where to store ACL
There are two major options for where to store the ACLs:
- as a product attribute, in the main core/collection
- as an attribute in a separate Solr core/collection
- As an attribute of the separate SOLR core/collection
Load/Update Tests and Results
The experiment below was done on Solr running in standalone mode in AWS with the same parameters as in the previous experiment. The numbers below are in minutes.
 | Single ACL core, solrj, 4 threads | | Separate ACL core, solrj, 4 threads | |
Dataset – 1M products and… | Initial load *_acl | Update *_acl | Initial load *_acl | Update *_acl |
500 retailer groups | 5.3 min | 7.4 min | 3 min | 4 min |
1000 r.g. | 6.5 min | 9 min | 5 min | 6 min |
2000 r.g. | 9 min | 12 min | 8 min | 8.5 min |
3000 r.g. | 12 | 15 | 9.5 | 13 |
- Moving ACL data to a separate core leads to better performance. The improvement ranges from 25% to 76%, depending on the size of the ACL record.
- Expectedly, the update operation takes ~30% more time than the initial load.
Testing all in one
The goal of the final test is to measure the maximum processing speed / minimum processing time for the combined approach:
- a custom Solr field type (not stored, with omitNorms and omitTermFreqAndPositions on)
- A separate SOLR core for ACLs
- Multi-thread processing
Dataset | vCPU x 4 stored=”false” solrj, 4 threads | vCPU x 16 stored=”false” solrj, 16 threads |
1M products with 30K retailer groups | 83 min | 17 min |
Results
Finally, to sum up our findings, we may conclude that:
- The default SAP Commerce/Solr configuration is good for typical/standard tasks and shows good query performance.
- Understanding Solr's out-of-the-box capabilities and tuning options is critical when it comes to huge volumes, indexing-intensive operations, and non-standard search logic. For such cases, Solr tuning is a must.
- In terms of indexing and updating, multivalued field types are generally slower than the normal text fields equipped with a tokenizer.
- Atomic Updates changing only one or a few values in the ‘multivalued’ field (=array) take about the same time as Atomic Updates replacing the whole value for a normal text field.
- A separate, dedicated SOLR core/collection for ACL storage can speed up data load up to 76%. However, it creates additional challenges (see the details in the appropriate section).
- Using non-stored ACL fields helps to boost the data load significantly. However, it creates new challenges (see the details in the appropriate section).
- Making the indexing process multi-threaded is an efficient way of performance tuning.
- All the changes applied together give great results. If the data were represented as a JSON file, its size would be about 50GB, and it would take about 17 minutes to load it into Solr.
What and how to change SAP Commerce
Let's have a look at the adjustments that need to be made in SAP Commerce to use the ACLs. Here is where we are:
- The ACL is a single field of the "*_acl" type.
- The ACL is stored in a separate Solr core/collection as a whitespace-separated list of the customers or customer groups a product is accessible or not accessible to.
- The groups are organized into a hierarchy.
The filter query is injected via a FacetSearchListener that joins the product core with the ACL core and restricts the results to the current user's groups:
public class DefaultAclGroupFacetSearchListener implements FacetSearchListener {
private static final String FILTER_QUERY_FIELD = "{!join from=%s fromIndex=%s to=%s}%s";
@Autowired
private UserService userService;
@Override
public void beforeSearch(FacetSearchContext facetSearchContext) throws FacetSearchException {
facetSearchContext.getSearchQuery().addFilterQuery(this.getFilterQuery());
}
@Override
public void afterSearch(FacetSearchContext facetSearchContext) throws FacetSearchException {
// Handling notifications is not expected
}
@Override
public void afterSearchError(FacetSearchContext facetSearchContext) throws FacetSearchException {
// Handling notifications is not expected
}
private String[] getGroupsUid() {
return userService.getCurrentUser().getGroups().stream()
.map(PrincipalGroupModel::getUid)
.map(String::toLowerCase)
.toArray(String[]::new);
}
private QueryField getFilterQuery() {
String[] value = getGroupsUid();
String field = String.format(FILTER_QUERY_FIELD, "id", "acl_core", "id", "allowed_groups_acl");
return new QueryField(field, SearchQuery.Operator.OR, value);
}
}
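For illustration, the listener above produces a filter query based on Solr's Join Query Parser. Assuming a user in groups cg1 and cg6, and the core/field names from the snippet, the rendered fq looks roughly like this (how SAP Commerce renders a QueryField with the OR operator is our assumption):

```java
// Render the join-based filter query produced by the listener above
// for a user belonging to groups cg1 and cg6.
public class JoinQueryExample {
    public static void main(String[] args) {
        String field = String.format("{!join from=%s fromIndex=%s to=%s}%s",
                "id", "acl_core", "id", "allowed_groups_acl");
        // SearchQuery.Operator.OR joins the values with OR, roughly:
        String fq = field + ":(" + String.join(" OR ", "cg1", "cg6") + ")";
        System.out.println(fq);
    }
}
```

The {!join} prefix tells Solr to match documents in the acl_core core whose allowed_groups_acl contains one of the group ids, then join them back to the product core by id.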
Alternative solutions, Known Limitations and Challenges
ACL Data outside Solr
Our solution works well if there is direct access to Solr and we are able to create a new core and load ACL data into it in a multithreaded manner. However, some restrictions can interfere with these plans; for example, Solr may be provided as search-as-a-service, in which case you most probably won't be able to create a separate core and use join requests. As an alternative to the Solr-based solution, the ACLs can be stored in Redis. This NoSQL database is known as an ultrafast data store and demonstrates its best results when manipulating simple data structures, such as lists and sets. Benchmarks show that Redis is capable of running more than 100k SET requests per second on an Intel(R) Xeon(R) CPU E5520 @ 2.27GHz. We ran load tests against our data (AWS, r5x.large for the Redis server):
Measure | Time (s) |
Loading 1.5M product visibility rules | ~100 |
Checking visibility for random groups for 1000 random products | 0.1 |
Checking visibility for random groups for 10000 random products | 1 |
With this approach, search works in two phases:
- Executing the text search in Solr. At this phase, the ACLs are not taken into account, and the list of products returned will contain items the customer can't access, which is part of the design. These items are filtered out in the next phase.
- Filtering out the items the customer has no access to. For that, the system checks each item from the returned set via the Redis-based API until the customer-facing list is complete. Since the results are normally delivered paginated, the size of the customer-facing list is limited to the page size.
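The second phase can be sketched as follows. The BiPredicate stands in for a Redis lookup (for instance, SISMEMBER on a per-product set of allowed principals); the in-memory map is only a stand-in for illustration, not the actual Redis client code.

```java
import java.util.*;
import java.util.function.BiPredicate;

// Sketch of phase 2: filter Solr hits through a visibility check until a page is full.
public class PostFilter {
    static List<String> firstVisiblePage(List<String> solrHits, String customerId,
                                         BiPredicate<String, String> isVisible, int pageSize) {
        List<String> page = new ArrayList<>();
        for (String productId : solrHits) {
            if (isVisible.test(productId, customerId)) {
                page.add(productId);
                if (page.size() == pageSize) break; // stop as soon as the page is full
            }
        }
        return page;
    }

    public static void main(String[] args) {
        // In-memory stand-in for Redis: product -> set of principals allowed to see it.
        Map<String, Set<String>> acl = Map.of(
            "p1", Set.of("c1"), "p2", Set.of("c2"), "p3", Set.of("c1"), "p4", Set.of("c1"));
        List<String> page = firstVisiblePage(List.of("p1", "p2", "p3", "p4"), "c1",
            (p, c) -> acl.getOrDefault(p, Set.of()).contains(c), 2);
        System.out.println(page);
    }
}
```

Note that if most hits are invisible to the customer, many Solr results must be fetched and checked before the page fills up, which is exactly the distribution-dependent overhead discussed below.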
The advantages of the Redis-based approach:
- No need to customize Solr or add a new core. This point is important if the cloud environment has customization limitations, such as those of the Azure-based SAP Commerce Cloud.
- Initial data load is fast.
The drawbacks:
- Slower search. The total execution time is the sum of the Solr query time, the Redis query time, and the post-processing (intersection of the sets).
- It may involve significant overhead (memory and CPU), because post-processing is performed on the application node for each customer query or product listing request.
- It depends on the distribution: the post-processing will take more time and resources (CPU, memory) if the majority of products are not visible/accessible to the majority of users.
- Facets won't work properly, because the hidden products are involved in the facet calculation. If such products are removed from the result set, the facet counts won't be valid.
Making it scalable
In our task definition, we had no more than 30k user groups participating in ACL fields. How do we build the system if the bar is 10 times higher? What if the rules change frequently? If we go with the Solr solution, we'll end up with a huge Solr index. If we go with the Redis-based solution, the visibility calculation will be slower per item and, consequently, much slower for batch processing.

The scalable solution involves the use of both the Solr and Redis components. In order to make the ACLs shorter and speed up indexing, you can use hashes instead of groups and perform the calculation in two phases. Instead of storing the customer or customer group id in the ACL field, you store a hashed/grouped value in the index. The hash/grouping function should be designed to reduce the number of unique elements in Solr. At the query phase, the results are post-processed to filter out irrelevant items: for each item in the result set returned by Solr, the system asks Redis for the final decision, and once a sufficient number of products has been collected, the post-processed result set is delivered to the customer.

For the logged-in user, the customer id and group ids are known. To create a search result list or a list of products for a category, the system calculates the hashes of the customer id and all customer groups, and these hashed IDs are used for filtering in the Solr filter query. The total number of hashed IDs is smaller than the total number of customer and group IDs.

Conclusion
In this article, we presented solutions to support product whitelisting/blacklisting per customer in SAP Commerce Cloud for large product sets and a large customer base, namely:
- The Solr-based solution, where the ACLs (lists of customers or customer groups) are stored in the Solr index.
- The Redis-based solution, where the ACLs are stored in Redis, which is queried for each item returned by Solr.
- The combined solution, where the ACL items in Solr are hashed, and Redis is used in post-processing to filter out the irrelevant items.
Authors
The design, prototypes, and implementation have been led by a team of solution architects and developers from the EPAM USA and EPAM RU offices.
- Igor Sokolov, EPAM Solution Architect, USA
- Rauf Aliev, EPAM Solution Architect, USA
- Artsiom Yemelyanenka, EPAM Solution Architect, USA