The Refactoring Challenge

If you have spent your whole career in the consulting industry, one thing you may fail to truly appreciate is the longevity of code. More often than not, it lasts “forever” (or at least for the lifetime of the overall solution it’s part of). From my time in software product companies, the harsh reality is that, due to time and cost constraints, a developer typically has one and only one shot to “get it right”. I’m not talking about minor bugfixes here; I’m referring to the actual design of the code for a given feature and how it is implemented. Product owners, responsible for the next “killer app” (i.e. feature), simply do not have a vested interest in spending precious development cycles on “refactoring”. Expending effort to (re)achieve the status quo simply doesn’t sell more software licences, even if those familiar with the code can definitively “prove” that it is a wise (typically long-term) investment to do so. Likewise, code over time becomes entangled with other code, making it increasingly brittle and riskier to change. Backward compatibility also needs to be factored into the decision-making process.
Most developers in consulting companies tend to transition between projects and accounts within a year, and while the average “rotation time” at EPAM is likely longer than at most of EPAM’s competitors, for the individual developer (or BA, QA, etc.) there is always the lure of “greener pastures” on the “next big project” to fuel the desire for change. The consequence, of course, is that many consultants fail to truly appreciate the challenges of supporting established code bases over the long haul, since they are not around to witness them. They don’t have to “eat their own dogfood”. So my topic today hinges on how well Hybris (now SAP) has done in keeping the Hybris Commerce codebase “fresh” and overcoming the “refactoring challenge”.
In the case of SAP Commerce Cloud, formerly Hybris, we are talking about a code base nearing 22 years old. That’s a lot of “growing up” to do. And, frankly, it has done a lot of “growing out” too, with the current code base (platform) so massive that it can easily take years to master every corner of it. Over the years we have seen countless iterations and refactorings of the platform, from the rise of the accelerators and add-ons to the service layer and the promotion engine overhaul. There have been what I would deem “failed” refactorings as well, with legacy Core+ concepts such as CIS from the v5.x era being my personal favourite example. Having watched Hybris evolve since the v4.x era, I would say that Hybris (now SAP) has done a decent job of balancing the competing pressures of net-new development, enhancements to existing features, and core refactoring. Back in January 2016, I conducted an extensive analysis of what was then called SAP Hybris. Below is the rough weighting that, by my estimate, SAP Hybris was giving to each of 5 categories when allocating its R&D dollars at the time:
  1. Net-new Features
  2. Enhancing Existing Features
  3. Refactoring Existing Features
  4. SAP Integration
  5. YaaS
    • Does everyone remember this precursor to today’s microservices rage?
Figure 1 – Product Development Focus For Commerce By Hybris/SAP, January 2016
There are a few interesting things about this pie chart and the distribution it represents. Firstly, we see a fairly even distribution of R&D dollars across the spectrum. Secondly, we see that innovation, as represented by the “net-new features” slice, carries the most weight. I would argue that this is a key reason SAP Commerce Cloud has maintained its #1 positioning in the years since I collected this data. Thirdly, and most critical to the topic at hand, the second largest slice is actually “refactoring existing features”. To invest nearly ¼ of all R&D effort in refactoring is, in my mind, impressive. Each one of those dollars represents a concerted effort by Hybris leadership to ensure that their platform remains relevant not only today but also tomorrow, thereby arguably avoiding (thus far) the fate of Oracle ATG, which seems to have fallen off a cliff in terms of popularity despite its dominance in years past. This investment should not be underrated. Think about the platform today, and then take away the backoffice. Return to a time when the promotion and voucher engines didn’t really work together. Try convincing someone today that they should start from an accelerator that does not natively support responsive design. Hybris simply wouldn’t be where it is today if these systems and many more had not been given “major surgery” in years past.
Fast forward now to (almost) 2019. While I have not “rerun” the math for recent R&D investments (perhaps a topic for a future blog article), my anecdotal impression, based on a reading of various roadmaps and release notes over the years, is that things have not dramatically changed. For those familiar with Hybris history, YaaS has obviously disappeared. (Some would say “evolved” into Context-Driven Services.)
There is plenty of “net-new” investment in Kubernetes, Docker and SAP’s “Cloud” offering, and recently SAP’s “Cloud v2” offering, to name just a few. Existing feature sets continue to evolve, as the APIs expand to support more API-based designs and integrations, including headless commerce and microservice-based architectures. And SAP continues to expand and enhance the out-of-the-box integrations with various other SAP software, whether from the SAP Customer Experience portfolio or the broader SAP portfolio.
Yet despite all that investment, there is some “grey hair” to be concerned about. I remember Hybris boldly proclaiming in the v4.x era that “the end of JALO [layer] was nigh” with the soon to be released v5.x series. Fast forward 2+ years, and we had SAP proclaiming “the end of JALO [layer] was nigh” (again), with the soon to be released v6.x series. Lo and behold, we still have JALO today. Despite the value that Service Layer and the newer Service Layer Direct have brought to the platform, despite the intention that these replace the JALO layer, and despite the longstanding “deprecated” status of the JALO layer, the reality is that the JALO layer is so deeply embedded in the platform that changing it or outright removing it would have a disproportionately disruptive impact on the nearly 1,869 Hybris code bases in production today globally. That in turn would cause major pain in those production environments, dampen the desire of licence holders to upgrade, and otherwise severely increase the friction associated with adopting the legitimately cool new technology that SAP opts to roll out in future SAP Commerce Cloud versions. Basically, even if it’s a “win” in the long term, it’s in no one’s interest in the short term.
While we tend to think of refactoring in terms of “code” by default, it’s not necessarily limited to “code only”.
Think of the install scripts that SAP has introduced over the years as a way to improve the process of installing the platform and configuring it (e.g. selecting an accelerator) for your purposes. But this raises another interesting question: why hasn’t SAP refactored the build system? Ant, for all its capabilities, is ancient in software lifecycle terms. The core Java world has more or less replaced it with Maven as the new de facto build system king. And the pain and friction caused by replacing the default build system of the Hybris platform would not be nearly as severe as that of attacking the JALO layer. Is it disruptive? Absolutely. It would impact almost all Hybris deployments everywhere. But the key difference is that the (code) customizations embodied by those 1,869 production Hybris e-commerce sites are not threatened. Testing a new build system can easily be done in lower-level environments, and even in a worst-case scenario, a production deployment would simply take longer, with some site downtime, while unexpected deployment kinks were worked out. Once built and deployed, the overall functionality of the site would not be in question.
In my capacity as SAP Customer Experience (Hybris) Competency Center Head at EPAM, I have the privilege of working with some of the best SAP Commerce Cloud technical talent in the business. For a select group of them, the aforementioned Ant/Maven challenge has not gone unnoticed. This team set out to prove the feasibility of converting the SAP Commerce Cloud build system from Ant to Maven. And they have succeeded. In fact, they discovered that converting from Ant to Maven is relatively straightforward; it is more “brawn” than “brain”, insofar as there are a few key patterns to follow in the conversion process. Once these patterns are “known”, the rest of the conversion is just brute-forcing your way through the remaining packages/modules/extensions.
In our next Hybrismart blog post, this team from EPAM is going to tell the story of their journey to “Mavenize” SAP Commerce Cloud, the benefits they have seen as a result, and their hope that one day SAP overcomes the “refactoring challenge” and officially retires Ant in favour of Maven.
