Amazon SageMaker gets eight new capabilities

Eight new capabilities have been unveiled for Amazon SageMaker, AWS’s end-to-end machine learning (ML) service. Developers, data scientists, and business analysts use Amazon SageMaker to build, train, and deploy ML models quickly and easily using its fully managed infrastructure, tools, and workflows.

The new features include new Amazon SageMaker governance capabilities that provide visibility into model performance throughout the ML lifecycle. New Amazon SageMaker Studio Notebook capabilities provide an enhanced notebook experience that enables customers to inspect and address data-quality issues in just a few clicks, facilitate real-time collaboration across data science teams, and accelerate the process of going from experimentation to production by converting notebook code into automated jobs. Finally, new capabilities within Amazon SageMaker automate model validation and make it easier to work with geospatial data.

“Today, tens of thousands of customers of all sizes and across industries rely on Amazon SageMaker. AWS customers are building millions of models, training models with billions of parameters, and generating trillions of predictions every month. Many customers are using ML at a scale that was unheard of just a few years ago,” said Bratin Saha, vice president of Artificial Intelligence and Machine Learning at AWS. “The new Amazon SageMaker capabilities announced today make it even easier for teams to expedite the end-to-end development and deployment of ML models.

“From purpose-built governance tools to a next-generation notebook experience and streamlined model testing to enhanced support for geospatial data, we are building on Amazon SageMaker’s success to help customers take advantage of ML at scale.”

New ML governance capabilities in Amazon SageMaker

Amazon SageMaker offers new capabilities that help customers more easily scale governance across the ML model lifecycle. As the number of models and users within an organization increases, it becomes harder to set least-privilege access controls and establish governance processes to document model information (e.g., input data sets, training environment information, model-use description, and risk rating). Once models are deployed, customers also need to monitor for bias and feature drift to ensure they perform as expected.

Amazon SageMaker Role Manager makes it easier to control access and permissions: Appropriate user-access controls are a cornerstone of governance and support data privacy, prevent information leaks, and ensure practitioners can access the tools they need to do their jobs. Implementing these controls becomes increasingly complex as data science teams swell to dozens or even hundreds of people. ML administrators (individuals who create and monitor an organization’s ML systems) must balance the push to streamline development against the need to control access to tasks, resources, and data within ML workflows.

Today, administrators create spreadsheets or use ad hoc lists to navigate access policies needed for dozens of different activities (e.g., data prep and training) and roles (e.g., ML engineer and data scientist). Maintaining these tools is manual, and it can take weeks to determine the specific tasks new users will need to do their jobs effectively. Amazon SageMaker Role Manager makes it easier for administrators to control access and define permissions for users. Administrators can select and edit prebuilt templates based on various user roles and responsibilities. The tool then automatically creates the access policies with necessary permissions within minutes, reducing the time and effort to onboard and manage users over time.
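As a rough illustration of what template-driven policy generation involves, the sketch below builds an IAM policy document from a persona template. The persona names, the activity groupings, and the actions assigned to each activity are hypothetical choices made for this example, not the templates Role Manager actually ships. Only the overall layout follows the standard IAM policy JSON format.

```python
import json

# Hypothetical activity templates: each maps an ML activity to the IAM
# actions it needs. These groupings are illustrative assumptions only.
ACTIVITY_ACTIONS = {
    "data_prep": ["sagemaker:CreateProcessingJob", "sagemaker:DescribeProcessingJob"],
    "training": ["sagemaker:CreateTrainingJob", "sagemaker:DescribeTrainingJob"],
}

# Hypothetical persona templates: each persona is a list of activities.
PERSONA_ACTIVITIES = {
    "data_scientist": ["data_prep", "training"],
    "ml_engineer": ["training"],
}


def build_policy(persona: str, resource: str = "*") -> dict:
    """Return an IAM policy document with one Allow statement per activity."""
    statements = [
        {
            # "data_prep" becomes "DataPrep" so the Sid stays alphanumeric.
            "Sid": activity.title().replace("_", ""),
            "Effect": "Allow",
            "Action": ACTIVITY_ACTIONS[activity],
            "Resource": resource,
        }
        for activity in PERSONA_ACTIVITIES[persona]
    ]
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    print(json.dumps(build_policy("ml_engineer"), indent=2))
```

In practice an administrator would narrow the resource scope rather than leave it as a wildcard. The wildcard default here only keeps the example short.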

Amazon SageMaker Model Cards simplify model information gathering: Today, most practitioners rely on disparate tools (e.g., email, spreadsheets, and text files) to document the business requirements, key decisions, and observations during model development and evaluation. Practitioners need this information to support approval workflows, registration, audits, customer inquiries, and monitoring, but it can take months to gather these details for each model. Some practitioners try to solve this by building complex recordkeeping systems, an approach that is manual, time-consuming, and error-prone.

Amazon SageMaker Model Cards provide a single location to store model information in the AWS console, streamlining documentation throughout a model’s lifecycle. The new capability auto-populates training details like input datasets, training environment, and training results directly into Amazon SageMaker Model Cards. Practitioners can also include additional information using a self-guided questionnaire to document model information (e.g., performance goals, risk rating), training and evaluation results (e.g., bias or accuracy measurements), and observations for future reference to further improve governance and support the responsible use of ML.
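To make the idea of a single record per model concrete, here is a minimal sketch of such a record in plain Python. The class and its field names are hypothetical and simply mirror the items listed above. They are not the schema the service uses, and the example does not call any AWS API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ModelCardSketch:
    """Hypothetical record; field names are illustrative, not the service schema."""

    model_name: str
    # Details the article describes as auto-populated from training.
    input_datasets: List[str] = field(default_factory=list)
    training_environment: str = ""
    training_results: Dict[str, float] = field(default_factory=dict)
    # Details a practitioner would add through the questionnaire.
    performance_goals: str = ""
    risk_rating: str = "unknown"
    evaluation_results: Dict[str, float] = field(default_factory=dict)
    observations: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """List fields that are still empty, so gaps are visible before an audit."""
        empty_values = ("", [], {}, "unknown")
        return [name for name, value in asdict(self).items() if value in empty_values]

    def to_json(self) -> str:
        """Serialize the record so it can be stored or shared in one place."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

A record created with only a name, one dataset, and a risk rating would report the remaining five fields as missing, which is the kind of gap a reviewer would want to see before approving a model.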

Amazon SageMaker Model Dashboard provides a central interface to track ML models: Once a model has been deployed to production, practitioners want to track their model over time to understand how it performs and to identify potential issues. This task is normally done on an individual basis for each model, but as an organization starts to deploy thousands of models, this becomes increasingly complex and requires more time and resources.

Amazon SageMaker Model Dashboard provides a comprehensive overview of deployed models and endpoints, enabling practitioners to track resources and model behavior in one place. From the dashboard, customers can also use built-in integrations with Amazon SageMaker Model Monitor (AWS’s model and data drift monitoring capability) and Amazon SageMaker Clarify (AWS’s ML bias-detection capability). This end-to-end visibility into model behavior and performance provides the necessary information to streamline ML governance processes and quickly troubleshoot model issues.

Next-generation Notebooks

Amazon SageMaker Studio Notebook gives practitioners a fully managed notebook experience, from data exploration to deployment. As teams grow in size and complexity, dozens of practitioners may need to collaboratively develop models using notebooks. AWS continues to offer the best notebook experience for users with the launch of three new features that help customers coordinate and automate their notebook code.

Simplified data preparation: Practitioners want to explore datasets directly in notebooks to spot and correct potential data-quality issues (e.g., missing information, extreme values, skewed datasets, and biases) as they prepare data for training.

Practitioners can spend months writing boilerplate code to visualize and examine different parts of their dataset to identify and fix problems. Amazon SageMaker Studio Notebook now offers a built-in data preparation capability that allows practitioners to visually review data characteristics and remediate data-quality problems in just a few clicks—all directly in their notebook environment.

When users display a data frame (i.e., a tabular representation of data) in their notebook, Amazon SageMaker Studio Notebook automatically generates charts to help users identify data-quality issues and suggests data transformations to help fix common problems. Once the practitioner selects a data transformation, Amazon SageMaker Studio Notebook generates the corresponding code within the notebook so it can be repeatedly applied every time the notebook is run.
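The checks described here can be approximated by hand. The sketch below, which uses only the Python standard library, counts missing values in a numeric column, flags extreme values with Tukey's 1.5 × IQR rule, and applies one possible transformation (median fill). The function names and the choice of outlier rule are assumptions for illustration. The article does not say which checks or transformations the notebook feature uses internally.

```python
import statistics
from typing import List, Optional, Sequence


def column_quality_report(values: Sequence[Optional[float]]) -> dict:
    """Summarize missing and extreme values for one numeric column."""
    present = [v for v in values if v is not None]
    report = {"rows": len(values), "missing": len(values) - len(present), "outliers": []}
    if len(present) >= 4:
        # Tukey's rule: flag points more than 1.5 * IQR outside the quartiles.
        q1, _, q3 = statistics.quantiles(present, n=4)
        spread = q3 - q1
        low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
        report["outliers"] = [v for v in present if v < low or v > high]
    return report


def fill_missing_with_median(values: Sequence[Optional[float]]) -> List[float]:
    """One possible transformation: replace missing entries with the median."""
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    return [median if v is None else v for v in values]


if __name__ == "__main__":
    sample = [10.0, 12.0, None, 11.0, 13.0, 500.0, 12.0, 11.0, 12.0]
    print(column_quality_report(sample))
    print(fill_missing_with_median(sample))
```

Keeping the transformation as an ordinary function in the notebook is what makes it repeatable: it runs again, unchanged, each time the notebook is executed.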

Accelerate collaboration across data science teams: After data has been prepared, practitioners are ready to start developing a model—an iterative process that may require teammates to collaborate within a single notebook. Today, teams must exchange notebooks and other assets (e.g., models and datasets) over email or chat applications to work on a notebook together in real time, leading to communication fatigue, delayed feedback loops, and version-control issues.

Amazon SageMaker now gives teams a workspace where they can read, edit, and run notebooks together in real time to streamline collaboration and communication. Teammates can review notebook results together to immediately understand how a model performs, without passing information back and forth.

With built-in support for services like Bitbucket and AWS CodeCommit, teams can easily manage different notebook versions and compare changes over time. Affiliated resources, like experiments and ML models, are also automatically saved to help teams stay organized.

Automatic conversion of notebook code to production-ready jobs: When practitioners want to move a finished ML model into production, they usually copy snippets of code from the notebook into a script, package the script with all its dependencies into a container, and schedule the container to run.

To run this job repeatedly on a schedule, they must set up, configure, and manage a continuous integration and continuous delivery (CI/CD) pipeline to automate their deployments. It can take weeks to get all the necessary infrastructure set up, which takes time away from core ML development activities.

Amazon SageMaker Studio Notebook now allows practitioners to select a notebook and automate it as a job that can run in a production environment. Once a notebook is selected, Amazon SageMaker Studio Notebook takes a snapshot of the entire notebook, packages its dependencies in a container, builds the infrastructure, runs the notebook as an automated job on a schedule set by the practitioner, and deprovisions the infrastructure upon job completion, reducing the time it takes to move a notebook to production from weeks to hours.

Automated validation of new models using real-time inference requests: Before deploying to production, practitioners test and validate every model to check performance and identify errors that could negatively impact the business. Typically, they use historical inference request data to test the performance of a new model, but this data sometimes fails to account for current, real-world inference requests. For example, historical data for an ML model to plan the fastest route might fail to account for an accident or a sudden road closure that significantly alters the flow of traffic.

To address this issue, practitioners route a copy of the inference requests going to a production model to the new model they want to test. It can take weeks to build this testing infrastructure, mirror inference requests, and compare how models perform across key metrics (e.g., latency and throughput).

While this provides practitioners with greater confidence in how the model will perform, the cost and complexity of implementing these solutions for hundreds or thousands of models make the approach unscalable.

Amazon SageMaker Inference now provides a capability to make it easier for practitioners to compare the performance of new models against production models, using the same real-world inference request data in real time. Now, they can easily scale their testing to thousands of new models simultaneously, without building their own testing infrastructure. To start, a customer selects the production model they want to test against, and Amazon SageMaker Inference deploys the new model to a hosting environment with the exact same conditions.

Amazon SageMaker routes a copy of the inference requests received by the production model to the new model and creates a dashboard to display performance differences across key metrics, so customers can see how each model differs in real time. Once the customer validates the new model’s performance and is confident it is free of potential errors, they can safely deploy it.
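The mirroring pattern described above can be sketched in a few lines of plain Python. This is a simplified, single-process illustration and not how the managed service is implemented: the function name is hypothetical, both models are stand-in callables, and the mirrored call runs sequentially here, whereas a real deployment would send it to a separate endpoint without delaying the production response.

```python
import time
from typing import Any, Callable, Dict, List


def _mean(samples: List[float]) -> float:
    return sum(samples) / len(samples) if samples else 0.0


def shadow_test(
    production: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    requests: List[Any],
) -> Dict[str, Any]:
    """Serve each request from production and mirror a copy to the candidate."""
    responses = []
    latencies: Dict[str, List[float]] = {"production": [], "candidate": []}
    mismatches = 0
    for request in requests:
        start = time.perf_counter()
        prod_out = production(request)
        latencies["production"].append(time.perf_counter() - start)

        start = time.perf_counter()
        try:
            cand_out = candidate(request)
        except Exception:
            cand_out = None  # a failing candidate must never affect callers
        latencies["candidate"].append(time.perf_counter() - start)

        if prod_out != cand_out:
            mismatches += 1
        responses.append(prod_out)  # only the production output is returned

    return {
        "responses": responses,
        "mismatches": mismatches,
        "mean_latency_s": {name: _mean(vals) for name, vals in latencies.items()},
    }
```

The key property is that callers only ever see production responses. The candidate's outputs and timings are collected on the side for comparison.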

New geospatial capabilities in Amazon SageMaker make it easier for customers to make predictions using satellite and location data: Today, most data captured has geospatial information (e.g., location coordinates, weather maps, and traffic data). However, only a small amount of it is used for ML purposes because geospatial datasets are difficult to work with and can often be petabytes in size, spanning entire cities or hundreds of acres of land.

To start building a geospatial model, customers typically augment their proprietary data by procuring third-party data sources like satellite imagery or map data. Practitioners need to combine this data, prepare it for training, and then write code to divide datasets into manageable subsets due to the massive size of geospatial data. Once customers are ready to deploy their trained models, they must write more code to recombine multiple datasets to correlate the data and ML model predictions.

To extract predictions from a finished model, practitioners then need to spend days using open-source visualization tools to render the results on a map. The entire process from data enrichment to visualization can take months, which makes it hard for customers to take advantage of geospatial data and generate timely ML predictions.

Amazon SageMaker now accelerates and simplifies generating geospatial ML predictions by enabling customers to enrich their datasets, train geospatial models, and visualize the results in hours instead of months. With just a few clicks or using an API, customers can use Amazon SageMaker to access a range of geospatial data sources from AWS (e.g., Amazon Location Service), open-source datasets (e.g., Amazon Open Data), or their own proprietary data including from third-party providers (like Planet Labs).

Once a practitioner has selected the datasets they want to use, they can take advantage of built-in operators to combine these datasets with their own proprietary data. To speed up model development, Amazon SageMaker provides access to pre-trained deep-learning models for use cases such as increasing crop yields with precision agriculture, monitoring areas after natural disasters, and improving urban planning. After training, the built-in visualization tool displays data on a map to uncover new predictions.