Many data scientists launch into machine learning projects without a production plan, which often leads to serious problems at deployment time. Models are time-consuming and expensive to create, so you need a plan before putting them into production. Here are the steps to follow for a successful deployment of ML models to production.
Let’s say you have an amazingly powerful machine learning model. How do you take it to production? What are the steps in building a real product that uses cutting-edge AI technology? I’ll break down some common scenarios and give a general approach for each one:
Case 1: Self-service app powered by ML models built and maintained by a centralized ML team
In this scenario, access to your customer data is controlled through well-defined API endpoints protected by authentication (e.g., OAuth) and/or encryption in transit (SSL/TLS). An example would be an eCommerce site built mostly on public data from eBay or Alibaba.
The usual path for data scientists in this scenario is to build an internal Python package that the app and its models can use. The code is then deployed to production servers as part of CI/CD.
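As a rough illustration of what such an internal package might expose, here is a minimal sketch. Everything in it is hypothetical: the module name, the `LinearModel` class, the feature names, and the hard-coded weights are stand-ins for whatever your team actually trains. The point is only that the app calls one small, versioned entry point, and that CI can run a cheap smoke test before deploying.

```python
"""Hypothetical module, e.g. `shop_model/predict.py`, inside an internal package."""
from dataclasses import dataclass
from typing import Mapping

MODEL_VERSION = "0.1.0"  # hypothetical version string, bumped on each release


@dataclass(frozen=True)
class LinearModel:
    """Toy stand-in for a trained model: a weighted sum of named features."""
    weights: Mapping[str, float]
    bias: float = 0.0

    def predict(self, features: Mapping[str, float]) -> float:
        # Fail loudly if the app forgets a feature the model was trained on.
        missing = set(self.weights) - set(features)
        if missing:
            raise ValueError(f"missing features: {sorted(missing)}")
        return self.bias + sum(w * features[name] for name, w in self.weights.items())


def load_model() -> LinearModel:
    # Assumption: a real package would load a trained artifact from disk or a
    # model registry. The weights are hard-coded here to keep the sketch runnable.
    return LinearModel(weights={"views": 0.5, "cart_adds": 2.0}, bias=1.0)


def smoke_test() -> None:
    """Cheap check a CI job could run before deploying the package."""
    model = load_model()
    assert model.predict({"views": 2.0, "cart_adds": 1.0}) == 4.0
```

The app would import `load_model` once at startup and call `predict` per request. The CI/CD pipeline would run `smoke_test()` (or a fuller test suite) before pushing the package to production servers.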
Case 2: Enterprise app with public APIs that integrate with existing applications
In some cases, you might have access to your customer’s data via an application programming interface (API). In these scenarios, it’s important that security be a first-class citizen. Your data scientists should work closely with security teams when implementing new features or updating the system, so that sensitive information stays protected. The usual path for data scientists working in this scenario is almost identical to Case 1, since they are unlikely to need direct access to production systems or to live user data for training.
Case 3: Enterprise app with private APIs that integrate with on-premise or cloud applications
In some cases, your enterprise has access to data via existing APIs, but the training and testing data are difficult to get hold of. In these scenarios, it is common for people in non-data science roles (e.g., engineers, product managers) to have different ideas from the data scientists about how a model should be implemented! Models take time and resources to train properly, so we recommend regular meetings between all parties involved and alignment with key stakeholders early in the process. As in Case 2, the usual path for data scientists is almost identical to Case 1, since they are unlikely to need direct access to production systems or to live user data for training.
Case 4: Enterprise app with on-premise ML infrastructure and API access to customer data
In many enterprise organizations, there is a centralized machine learning (ML) team that builds the models in-house and provides them for use by other teams through an internal API. In this scenario, it is likely that you will have direct access to production systems, customer training data, and possibly even live production traffic if your model needs to update as new events happen. However, we still recommend having somebody who understands the full stack of your ML system involved in steering product development early on in the process so that all parties are aware of how their work fits into a bigger picture.
Another thing to watch out for in this scenario is over-fitting to the conditions of your training data. For example, say your training data is noisy and/or sparse, so you choose a modeling technique that works well under those conditions (e.g., boosted decision trees). In production, however, the noise goes away and now your model doesn’t work as well! This is why it’s important to have systems in place for monitoring your models and getting alerts when they stop working or start reporting strange results.
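To make the monitoring idea concrete, here is a minimal sketch of one possible check. It compares recent production values of a single numeric feature against a training-time baseline and flags a shift in the mean or a change in spread (the "noise goes away" case). The function name, thresholds, and alert labels are all assumptions chosen for illustration, and a real monitoring system would track many features as well as the model's outputs.

```python
import math
import statistics
from typing import List, Sequence


def check_feature(
    baseline: Sequence[float],
    recent: Sequence[float],
    max_z: float = 3.0,             # assumed threshold, tune for your data
    max_spread_ratio: float = 2.0,  # assumed threshold, tune for your data
) -> List[str]:
    """Return alert labels for one feature; an empty list means no alert.

    Both inputs need at least two values, because the sample standard
    deviation is undefined otherwise.
    """
    alerts: List[str] = []
    base_mean = statistics.fmean(baseline)
    base_std = statistics.stdev(baseline)
    recent_mean = statistics.fmean(recent)
    recent_std = statistics.stdev(recent)

    # Mean shift: distance between the means, in standard errors of the recent mean.
    std_err = base_std / math.sqrt(len(recent))
    if std_err > 0 and abs(recent_mean - base_mean) / std_err > max_z:
        alerts.append("mean shift")

    # Spread change: production is much noisier, or much cleaner, than training.
    low, high = sorted((base_std, recent_std))
    if (low == 0 and high > 0) or (low > 0 and high / low > max_spread_ratio):
        alerts.append("spread change")

    return alerts
```

For example, with a baseline of `[1.0, 2.0, 3.0, 4.0, 5.0]`, recent values of `[3.0, 3.1, 2.9, 3.0]` trigger only the spread alert, because the mean is unchanged but the noise has nearly vanished. A scheduled job could run a check like this and send any non-empty result to your alerting channel.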
Case 5: Enterprise app with on-premise ML infrastructure that provides batch jobs to training servers
In many enterprise organizations, there are existing machine learning pipelines in place that ingest customer data from web services of some sort and then train models on it using Hadoop, Spark, or another system. In this scenario, it is likely that you will not have direct access to production systems, customer training data, or even live production traffic.
Case 5 is similar to Case 2 in the sense that you should work with security and compliance teams early on in the process, as well as with the core ML teams who will likely be providing code for your app to use directly. You’ll also need a way of monitoring your models so that you can get alerts when they stop working or start reporting strange results.
In general, it’s very common for feature engineering tasks (e.g., cleaning up messy text) to fall to product managers and data analysts/scientists, who are shown how to do them once or twice on sample data and then left to go on their merry way. In my experience, this results in messy code that has a high chance of breaking over time, as well as users who don’t really understand why or how the model is making its predictions. Instead, I highly recommend hiring a junior engineer or data scientist to help with the more complex feature engineering tasks. Give them access to your source code so they can build tools and learn how your system works in production.
Case 6: Enterprise app with original training data
In these scenarios, you’ll need to make sure that any sensitive user information (e.g., names, emails) is either sanitized or removed from the training dataset before it is passed along to the teams building models as part of their product development lifecycle. You may also want to check out Shared Learning Environments (SLEs) to see whether they can help you with this.
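As a rough sketch of what "sanitized or removed" could look like in practice, the snippet below drops fields that are sensitive by name and masks email addresses that appear inside free-text values. The field names, the `[EMAIL]` placeholder, and the regular expression are assumptions for illustration. A simple pattern like this will not catch every email format, and it does nothing about names mentioned inside free text, so treat it as a starting point rather than a complete solution.

```python
import re
from typing import Any, Dict

# Assumption: these are the column names that hold sensitive values in your data.
SENSITIVE_FIELDS = {"name", "email"}

# Deliberately simple email pattern; it is not a full address validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def sanitize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` that is safer to hand to a modeling team."""
    clean: Dict[str, Any] = {}
    for key, value in record.items():
        if key.lower() in SENSITIVE_FIELDS:
            continue  # remove the sensitive field entirely
        if isinstance(value, str):
            # Mask email addresses embedded in free text.
            value = EMAIL_RE.sub("[EMAIL]", value)
        clean[key] = value
    return clean
```

Applied to a record such as `{"name": "Ada", "email": "ada@example.com", "note": "contact ada@example.com please", "spend": 12.5}`, this keeps only `note` (with the address masked) and `spend`. You would run it over every row before the dataset leaves the team that owns the raw data.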