The promise of machine learning (ML) is vast and investment is increasing, but results are lagging because too many projects don’t make it to production in an operational system. A Wall Street data scientist recently said to me, “It took us two months to build and train the model and six months to deploy it.” This is the norm. Analytics leaders cite deployment and productionization – aka the “last mile” – as the biggest bottleneck for analytics and ML in particular. This is where analytics and production IT converge and where business value creation starts.
A few industry leaders in insurance, retail, media, and finance have solved the last mile challenge. They are iterating hundreds or thousands of models on the frontline. They are also deploying analytics in real time to make decisions and continuously grow and improve their businesses. What do they have that most enterprises do not? Answer: a modernized model deployment strategy and capability.
Why Is Model Productionization So Hard?
Embedding models into business applications and deploying them into production workflows is where data science meets IT. Here are four of the most common reasons model productionization is so challenging:
Technology heterogeneity: Today, models sit between their data and the end-point applications running on infrastructure, yet the tools and technology used at each stage are in a constant state of change. As model development continues to be democratized, data scientists across business units are building models with an increasing variety of solutions. Similarly, tool options for data integration and management are evolving toward ever more flexible, friendly, and fast solutions. Compute and storage options from cloud and on-prem vendors offer continuous improvements in price-performance, while end-point applications, both legacy and new, come in all shapes and sizes.
Cultural divide between data science and production IT: There’s a great divide between data scientists and the production IT world where models create value. While software promotion from development to production is fine-tuned, analytic models are different in terms of processes and players. Production IT is a complex, fast-moving, high-volume, governed universe where data scientists neither make nor necessarily understand the rules. Absent a solution, data scientists tend to throw their models over the IT wall and hope for the best.
Model pipeline complexity: Models are pipelines of many components whose processes must be orchestrated and automated. Each component may run on its own schedule, on a different type of compute resource, and with its own dependencies. Because champion/challenger model competition is a bedrock of continuous improvement, there’s likely to be at least one “challenger” pipeline competing against the current “champion”. So, a “model” is, in fact, many nodes to be deployed, monitored, scaled, and managed.
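To make the champion/challenger idea concrete, here is a minimal Python sketch. It assumes both models can be scored on the same labeled holdout data, and that a challenger is promoted only when it beats the champion by a chosen margin. The names `Candidate`, `accuracy`, `pick_champion`, and the `min_gain` threshold are illustrative, not taken from any particular platform.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Candidate:
    """A deployable model: a name plus a prediction function (hypothetical shape)."""
    name: str
    predict: Callable[[float], int]


def accuracy(model: Candidate, inputs: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of holdout examples the model labels correctly."""
    correct = sum(1 for x, y in zip(inputs, labels) if model.predict(x) == y)
    return correct / len(labels)


def pick_champion(champion: Candidate, challenger: Candidate,
                  inputs: Sequence[float], labels: Sequence[int],
                  min_gain: float = 0.02) -> Candidate:
    """Promote the challenger only if it wins by at least min_gain (an assumed policy)."""
    gain = accuracy(challenger, inputs, labels) - accuracy(champion, inputs, labels)
    return challenger if gain >= min_gain else champion


# Toy holdout set: the true label is 1 when the input exceeds 0.5.
inputs = [0.1, 0.4, 0.6, 0.9]
labels = [0, 0, 1, 1]

champion = Candidate("champion", lambda x: int(x > 0.3))
challenger = Candidate("challenger", lambda x: int(x > 0.5))

winner = pick_champion(champion, challenger, inputs, labels)
print(winner.name)
```

In a real deployment each candidate would be a full pipeline rather than a single function, and the comparison would typically run on live or recent production data instead of a fixed list.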
Data challenges: Models are only as reliable as their data pipelines, and the data itself can stress model performance in a variety of ways. Missing files or streams can throw off a pipeline, and missing or invalid values can throw off a model. Changes in data volume can change model behavior too. Since it’s the norm to develop a model on one data set, train it on another, and certify it on yet another, resetting model parameters is expected before production. As a result, data challenges are often considered the “known unknown”.
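One common defense is to check each incoming batch before it reaches the model. The sketch below is a simplified illustration in Python: it assumes rows arrive as dictionaries, and it flags an unexpectedly small batch, missing values, and NaN values. The function name `validate_rows`, the field names, and the thresholds are all hypothetical.

```python
import math
from typing import Any, Dict, List, Sequence


def validate_rows(rows: Sequence[Dict[str, Any]],
                  required: Sequence[str],
                  expected_min_rows: int) -> List[str]:
    """Return a list of human-readable problems found in a batch of input rows."""
    problems: List[str] = []

    # A sharp drop in volume often signals a missing file or stream upstream.
    if len(rows) < expected_min_rows:
        problems.append(
            f"row count {len(rows)} below expected minimum {expected_min_rows}"
        )

    for i, row in enumerate(rows):
        for field in required:
            value = row.get(field)
            if value is None:
                problems.append(f"row {i}: missing {field}")
            elif isinstance(value, float) and math.isnan(value):
                problems.append(f"row {i}: invalid {field}")

    return problems


batch = [
    {"age": 41.0, "income": 52000.0},
    {"age": None, "income": 61000.0},
    {"age": 29.0, "income": float("nan")},
]

for problem in validate_rows(batch, required=["age", "income"], expected_min_rows=5):
    print(problem)
```

A pipeline could refuse to score a batch when this list is non-empty, or score it and raise an alert, depending on how much risk the business application tolerates.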
Components of a Model Deployment Platform
It’s hard to imagine, but there was a time when a data lake or data warehouse wasn’t a given. The next wave of centralized service for analytics will be the model deployment platform. Making model deployment work at scale (supporting many internal data science clients, supporting hundreds of models, minimizing deployment time and compute costs, ensuring governance, and meeting SLAs and compliance requirements) is a sophisticated technical and organizational challenge that needs to be tackled only once. So, the solution needs three facets: technology, process, and organization.
Technology: There’s consensus on many technology best practices, including: microservices-based architectures to support diverse model development tools and frameworks; future-proofing against changes in application, data, and infrastructure; metadata logging to support reproducibility and traceability; version control and CI/CD compliance; API wrappers and API management; support for collaboration and re-use of models; performance monitoring dashboards; and so on. A model deployment platform requires tools that companies already use for collaboration, workflow, source code management, CI/CD, scheduling, and dashboards. New tools from emerging vendors are available to help with model asset management, deployment, monitoring, and scaling.
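As an illustration of metadata logging for reproducibility and traceability, the Python sketch below records a model’s version, framework, parameters, and a hash of its training data as one JSON log line. The `ModelRecord` fields, the model name, and the sample values are assumptions chosen for the example, not a standard schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModelRecord:
    """Metadata needed to trace a deployed model back to how it was built (hypothetical schema)."""
    name: str
    version: str
    framework: str
    training_data_sha256: str
    params: dict


def fingerprint(data: bytes) -> str:
    """Hash the training data so a later audit can confirm exactly what was used."""
    return hashlib.sha256(data).hexdigest()


def to_log_line(record: ModelRecord) -> str:
    """Serialize with sorted keys so identical records always produce identical lines."""
    return json.dumps(asdict(record), sort_keys=True)


# Stand-in for the real training file contents.
training_data = b"age,income,label\n41,52000,1\n29,61000,0\n"

record = ModelRecord(
    name="churn-model",
    version="1.2.0",
    framework="scikit-learn",
    training_data_sha256=fingerprint(training_data),
    params={"max_depth": 4, "n_estimators": 100},
)

print(to_log_line(record))
```

Writing one such line per deployment, alongside the source control commit that produced the model, gives auditors and the support team a simple trail to follow.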
Process: Getting fast and fluid with model deployment and support, where iteration is the holy grail for model excellence, requires hardening an enterprise’s model development life cycle (MDLC) and clarifying the steps where IT and data science converge. The model deployment platform is not just post-production but traverses the MDLC with touchpoints for QA, model asset management, source control, and data certification. Moreover, the MDLC must be instrumented to co-exist with the enterprise’s DevOps processes for continuous integration/continuous deployment, which manage risk, compliance, and information security, as this is where data science and production IT really dance together.
Organization: Most data-driven enterprises are missing a unit: a support team for data scientists’ models, aka a “ModelOps” team. This team can sit in IT or data science, but absent a ModelOps team, the workload defaults to data scientists. Because the work is part IT and part data science operations, that is a bad use of data scientists’ time. The ModelOps team is the nexus of communication between data scientists, data engineers, and DevOps, ensuring proper hand-offs and execution of go-live protocol. Coupled with a model deployment platform is a new role: the “analytics engineer”, whose job is to glue the IT and data science objects together (e.g., version management, policy adherence on what goes into production, creating automated workflows, performance monitoring, etc.).
Tips to Getting Started
Yes, a model deployment platform requires a technology investment, team changes, and process enhancements. But it’s a sine qua non for achieving the vision of becoming an intelligent, model-driven digitized business.
But how do you get started? Some companies find that the most practical way to build clarity and consensus is to create a minimum viable product (MVP). Starting with one model and one business unit, devise a 3- to 6-month sprint plan for creating an MVP that burns in the technology, roles, and process and provides a working example for your company. At each sprint retrospective, share the key lessons learned and which goals were achieved. Work with business and technical stakeholders to enlist others to “champion” the success. With a modern deployment platform and some careful planning, data science can converge with production IT and the flywheel effect of continuously learning models can become a reality.