By the end of 2024, 75% of enterprises will have shifted from piloting to operationalizing artificial intelligence, and the vast majority of workloads will end up in the cloud. For some enterprises planning their move, the complexity, magnitude, and length of a cloud migration can be daunting. Different teams have different appetites for new tooling: while the application development team is focused on running their web applications on premises, the data science team is hungry for the latest cloud technology. And even with a multi-year cloud migration plan, some product releases must be built on the cloud in order to meet the enterprise's business outcomes.
For these customers, we propose hybrid machine learning (ML) patterns as an intermediate step in the journey to the cloud. Hybrid ML patterns typically involve local compute resources, such as personal laptops or corporate data centers, working alongside the cloud. With the hybrid ML architecture patterns described in this post, enterprises can achieve their desired business goals without having to wait for the cloud migration to complete. We want to support customer success in all forms.
We have published a new whitepaper to help you integrate the cloud with your existing machine learning infrastructure; see the whitepaper for more information.
Hybrid ML architecture patterns
The whitepaper gives you an overview of the various hybrid ML patterns across the entire lifecycle, including model development, data preparation, training, deployment, and ongoing management. The eight hybrid ML architecture patterns are summarized in the following table. In addition to the advantages and disadvantages, we provide a preliminary reference architecture for each pattern. We also identify a "when to move" criterion for each pattern, to help you decide when the effort to maintain and scale a pattern has exceeded the value it provides.
| Development | Training | Deployment |
| --- | --- | --- |
| Develop on personal computers, train and host in the cloud | Train locally, deploy in the cloud | Serve ML models in the cloud to applications hosted on premises |
| Develop on local servers, train and host in the cloud | Store data locally, train and deploy in the cloud | Host ML models with Lambda@Edge to applications on premises |
| Develop in the cloud while connecting to data hosted on premises | Train with a third-party SaaS provider to host in the cloud | |
| | Train in the cloud, deploy ML models on premises | Orchestrate hybrid ML workloads with Kubeflow and Amazon EKS Anywhere |
In this post, we focus on the pattern of serving ML models hosted in the cloud to applications hosted on premises.
Architecture overview
The most common use case for this hybrid pattern is enterprise migrations. Your data science team may be ready to deploy to the cloud, but your application team is still refactoring their code to host on cloud-native services. This approach lets the data scientists bring their newest models to market while the application team separately considers when, where, and how to move the rest of the application to the cloud.
The diagram shows the architecture for hosting an ML model on Amazon SageMaker and serving responses to requests from applications hosted on premises.
Technical deep dive
In this section, we focus on the various components that comprise the hybrid workload, and refer to external resources as necessary.
Consider a retail company whose application development team has hosted their ecommerce website on premises. The company wants to improve brand loyalty, grow sales and revenue, and increase efficiency by using data. They plan to increase customer engagement by 50% by adding a "Recommended for You" section to their home screen. However, they are struggling to deliver personalized experiences due to the limitations of static, rule-based systems, the complexity and cost of their current architecture, and its lack of platform integration.
The application team has a 5-year enterprise migration strategy to move their web application to the cloud, whereas the data science team is ready to begin implementation in the cloud now. With the hybrid architecture pattern described in this post, the company can achieve its desired business outcome quickly, without having to wait for the 5-year enterprise migration to complete.
The data scientists train and deploy the model in the cloud, exposing endpoints that the ecommerce web application consumes. Let's walk through this in detail.
In the model development phase, data scientists can use local development environments, such as PyCharm or Jupyter installations on their personal computers, and connect to the cloud via the AWS Command Line Interface (AWS CLI) or an AWS SDK. Alternatively, they can use Amazon SageMaker Studio, a single web-based visual interface that comes preconfigured with common data science packages and kernels for model development.
Data scientists can take advantage of SageMaker's training capabilities, which include access to on-demand training instances, automatic model tuning, managed Spot Instances, checkpointing for saving model state, managed distributed training, and many more. For an overview, see Train a Model with Amazon SageMaker.
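As a sketch of what submitting such a training job can look like, the following Python snippet assembles a `CreateTrainingJob` request with managed Spot Instances and checkpointing enabled, then submits it through Boto3. All names, ARNs, instance types, and S3 paths here are placeholder assumptions, not values from this post.

```python
def build_training_job_request(job_name, image_uri, role_arn, bucket):
    """Assemble a SageMaker CreateTrainingJob request with managed Spot
    Instances and checkpointing enabled. All names and paths are placeholders."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": f"s3://{bucket}/output"},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": 3600,
            # MaxWaitTimeInSeconds is required when Spot training is enabled
            "MaxWaitTimeInSeconds": 7200,
        },
        "EnableManagedSpotTraining": True,
        "CheckpointConfig": {"S3Uri": f"s3://{bucket}/checkpoints"},
    }


def submit_training_job(request):
    """Send the request to SageMaker (requires boto3 and AWS credentials)."""
    import boto3

    boto3.client("sagemaker").create_training_job(**request)
```

Checkpointing matters here because a managed Spot instance can be interrupted; SageMaker resumes training from the last checkpoint saved to the configured S3 location.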
After the model is trained, data scientists can deploy it using SageMaker hosting capabilities and expose a REST HTTP endpoint that serves predictions to end applications hosted on premises. The application development teams can then interact with the machine learning model through these hosted endpoints. The deployed models respond in as little as a few milliseconds, which suits use cases that require real-time responses.
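A minimal sketch of standing up such a real-time endpoint via Boto3 follows. It builds an endpoint configuration with a single production variant and then creates the endpoint; the model name, endpoint name, and instance type are placeholder assumptions.

```python
def build_endpoint_config(config_name, model_name):
    """Endpoint configuration with a single real-time production variant.
    Names and instance type are placeholders for illustration."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": "ml.m5.large",
                "InitialInstanceCount": 1,
                "InitialVariantWeight": 1.0,
            }
        ],
    }


def deploy_endpoint(endpoint_name, config):
    """Create the endpoint config and endpoint (requires AWS credentials)."""
    import boto3

    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(**config)
    sm.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config["EndpointConfigName"],
    )
```

The variant structure is what later allows better models to be rolled out gradually, for example by adding a second variant and shifting traffic weights.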
To provide inference results to its end users, the client application on premises connects to the ML model hosted on the SageMaker endpoint on AWS over a private network. The client application can use any client library to invoke the endpoint with an HTTP POST request. Alternatively, the SageMaker Runtime `invoke_endpoint` API in the Boto3 client abstracts some of the low-level details, such as signing requests with the AWS credentials saved in the client application's environment.
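A client-side sketch of that call is shown below. The JSON request and response schema for the recommender is a made-up example; the endpoint name is also a placeholder.

```python
import json


def build_payload(user_id, recent_items):
    """Serialize the application's request. This JSON schema is an
    illustrative assumption, not a SageMaker requirement."""
    return json.dumps({"user_id": user_id, "recent_items": recent_items})


def parse_recommendations(body):
    """Extract the recommendations list from the endpoint's JSON response."""
    return json.loads(body)["recommendations"]


def get_recommendations(endpoint_name, user_id, recent_items):
    """Invoke the SageMaker endpoint (requires boto3 and AWS credentials)."""
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(user_id, recent_items),
    )
    return parse_recommendations(response["Body"].read())
```

The serialization format is entirely up to the model's inference container; `ContentType` simply tells the container how to interpret the body.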
It is also possible to make the endpoint accessible over the internet. A common pattern is to add Amazon API Gateway and an AWS Lambda function in front of the SageMaker hosted endpoint. The Lambda function can perform preprocessing to transform the request into the format expected by the endpoint, or postprocessing to transform the response into the format required by the client application. For more information, see Call an Amazon SageMaker model endpoint using Amazon API Gateway and AWS Lambda.
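A hedged sketch of such a Lambda function follows. The request and response field names are illustrative assumptions, as is the endpoint name; the pre- and postprocessing helpers are pure functions so they can be tested without AWS access.

```python
import json


def to_model_input(event):
    """Preprocess: reshape an API Gateway proxy request body into the
    format the endpoint expects (field names are illustrative)."""
    body = json.loads(event["body"])
    return json.dumps({"user_id": body["userId"]})


def to_client_response(model_output):
    """Postprocess: wrap the model response as an API Gateway proxy response."""
    recommendations = json.loads(model_output)["recommendations"]
    return {"statusCode": 200, "body": json.dumps({"items": recommendations})}


def handler(event, context):
    """Lambda entry point: forward the request to the SageMaker endpoint."""
    import boto3  # available in the Lambda Python runtime

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName="recommender-endpoint",  # placeholder name
        ContentType="application/json",
        Body=to_model_input(event),
    )
    return to_client_response(response["Body"].read())
```

Keeping the transformations in separate functions also makes it easy to evolve the client-facing contract without touching the model's input format.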
In our scenario, the client application on premises connects to the models hosted on SageMaker over a private network, using an AWS Direct Connect connection, to provide inference results to its end users.
The diagram shows how the data science team develops the model, performs training, and deploys it in the cloud, while the application development team develops and deploys the application on premises.
After the model is deployed into the production environment, your data scientists can use Amazon SageMaker Model Monitor to continuously monitor the quality of the models in real time. They can also set up automated alerts that trigger when there are deviations in model quality, with Amazon CloudWatch notifying them when quality metrics cross defined thresholds. Early detection lets data scientists take corrective actions, such as retraining models, auditing upstream systems, or fixing data quality issues, without having to monitor models manually. Using AWS managed services this way avoids the downside of implementing monitoring solutions from scratch.
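As an illustration of the alerting half of this setup, the sketch below defines a CloudWatch alarm on a model-quality metric. The namespace shown is the one Model Monitor uses for endpoint data-quality metrics, but the metric name, dimensions, period, and threshold are assumptions for this example.

```python
def build_quality_alarm(alarm_name, metric_name, endpoint_name, threshold):
    """CloudWatch alarm definition for a model-quality metric published by
    Model Monitor. Metric name, dimensions, and threshold are illustrative."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "aws/sagemaker/Endpoints/data-metrics",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "Endpoint", "Value": endpoint_name}],
        "Statistic": "Average",
        "Period": 3600,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def create_alarm(alarm):
    """Register the alarm with CloudWatch (requires AWS credentials)."""
    import boto3

    boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

An alarm like this can notify an SNS topic or trigger a retraining pipeline, so deviations are acted on rather than just logged.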
Your data scientists can reduce the time required to get their models into production by using Amazon SageMaker Inference Recommender, which helps them choose the best instance type and configuration for their models.
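A minimal sketch of starting an Inference Recommender job is below. The ARNs and job name are placeholders, and the exact set of required input fields can vary with how the model is packaged, so treat this as an assumption-laden outline rather than a definitive recipe.

```python
def build_recommendations_job(job_name, role_arn, model_package_arn):
    """Minimal Inference Recommender job request. ARNs are placeholders;
    required fields may differ depending on model packaging."""
    return {
        "JobName": job_name,
        "JobType": "Default",  # "Advanced" runs custom load tests
        "RoleArn": role_arn,
        "InputConfig": {"ModelPackageVersionArn": model_package_arn},
    }


def start_recommendations_job(request):
    """Start the job via the SageMaker API (requires AWS credentials)."""
    import boto3

    boto3.client("sagemaker").create_inference_recommendations_job(**request)
```

When the job completes, its results rank candidate instance types by latency and cost, which feeds directly into the endpoint configuration described earlier.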
It is always a good idea to decouple model hosting from application hosting. The data scientists host their models on dedicated resources, separate from the application, which simplifies the process of pushing better models to production. This separation is a key enabler of innovation, and avoiding tight coupling between the model and the application also keeps the model highly performant.
This approach makes it possible to redeploy a model with updated data and to improve model performance as research advances. Markets change all the time, and the model needs to stay current with the latest trends; retraining and redeploying your model with updated data is how you fulfill that requirement.
Check out the hybrid machine learning whitepaper, in which we look at additional patterns for hosting ML models in the cloud and across the entire lifecycle, including developing locally while training and deploying in the cloud, training locally to deploy in the cloud, and hosting models in the cloud to serve applications on premises.
How do you plan to integrate the cloud with your existing ML infrastructure? Contact us at hybrid-ml-support@amazon if you want to engage the authors of this whitepaper for advice on your cloud migration.
About the authors
Alak Eswaradass is a Solutions Architect at AWS, based in Chicago, Illinois. She is passionate about helping customers design cloud architectures utilizing AWS services to solve business challenges. She hangs out with her daughters and explores the outdoors in her free time.
Emily Webber joined AWS just after SageMaker launched, and has been trying to tell the world about it ever since! Outside of building new ML experiences for customers, Emily enjoys meditating and studying Tibetan Buddhism.
Roop Bains is a Solutions Architect at AWS focusing on AI/ML. He is passionate about machine learning and helping customers achieve their business objectives. In his spare time, he enjoys reading and hiking.