sagemaker
Open source library for training and deploying models on Amazon SageMaker.
Description
.. image:: https://github.com/aws/sagemaker-python-sdk/raw/master/branding/icon/sagemaker-banner.png
   :height: 100px
   :alt: SageMaker

====================
SageMaker Python SDK
====================

.. image:: https://img.shields.io/pypi/v/sagemaker.svg
   :target: https://pypi.python.org/pypi/sagemaker
   :alt: Latest Version

.. image:: https://img.shields.io/pypi/pyversions/sagemaker.svg
   :target: https://pypi.python.org/pypi/sagemaker
   :alt: Supported Python Versions

.. image:: https://img.shields.io/badge/code_style-black-000000.svg
   :target: https://github.com/python/black
   :alt: Code style: black

.. image:: https://readthedocs.org/projects/sagemaker/badge/?version=stable
   :target: https://sagemaker.readthedocs.io/en/stable/
   :alt: Documentation Status

.. image:: https://github.com/aws/sagemaker-python-sdk/actions/workflows/codebuild-ci-health.yml/badge.svg
   :target: https://github.com/aws/sagemaker-python-sdk/actions/workflows/codebuild-ci-health.yml
   :alt: CI Health

SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker.

With the SDK, you can train and deploy models using popular deep learning frameworks **Apache MXNet** and **PyTorch**.
You can also train and deploy models with **Amazon algorithms**,
which are scalable implementations of core machine learning algorithms that are optimized for SageMaker and GPU training.
If you have **your own algorithms** built into SageMaker compatible Docker containers, you can train and host models using these as well.

To install SageMaker Python SDK, see `Installing SageMaker Python SDK <#installing-the-sagemaker-python-sdk>`_.

SageMaker V3 Release
-------------------------

Version 3.0.0 represents a significant milestone in our product's evolution. This major release introduces a modernized architecture, enhanced performance, and powerful new features while maintaining our commitment to user experience and reliability.

**Important: Please review these breaking changes before upgrading.**

* Older interfaces such as Estimator, Model, Predictor and all their subclasses will not be supported in V3.
* Please see our `V3 examples folder <https://github.com/aws/sagemaker-python-sdk/tree/master/v3-examples>`__ for example notebooks and usage patterns.

Migrating to V3
----------------

**Upgrading to 3.x**

To upgrade to the latest version of SageMaker Python SDK 3.x:

::

    pip install --upgrade sagemaker

If you prefer to downgrade to the 2.x version:

::

    pip install sagemaker==2.*

See `SageMaker V2 Examples <#sagemaker-v2-examples>`__ for V2 documentation and examples.

**Key Benefits of 3.x**

* **Modular Architecture**: Separate PyPI packages for core, training, and serving capabilities

  * `sagemaker-core <https://pypi.org/project/sagemaker-core/>`__
  * `sagemaker-train <https://pypi.org/project/sagemaker-train/>`__
  * `sagemaker-serve <https://pypi.org/project/sagemaker-serve/>`__
  * `sagemaker-mlops <https://pypi.org/project/sagemaker-mlops/>`__

* **Unified Training & Inference**: Single classes (ModelTrainer, ModelBuilder) replace multiple framework-specific classes
* **Object-Oriented API**: Structured interface with auto-generated configs aligned with AWS APIs
* **Simplified Workflows**: Reduced boilerplate and more intuitive interfaces

**Training Experience**

V3 introduces the unified ModelTrainer class to reduce complexity of initial setup and deployment for model training. This replaces the V2 Estimator class and framework-specific classes (PyTorchEstimator, SKLearnEstimator, etc.).

This example shows how to train a model using a custom training container with training data from S3.

*SageMaker Python SDK 2.x:*

.. code:: python

    from sagemaker.estimator import Estimator

    estimator = Estimator(
        image_uri="my-training-image",
        role="arn:aws:iam::123456789012:role/SageMakerRole",
        instance_count=1,
        instance_type="ml.m5.xlarge",
        output_path="s3://my-bucket/output"
    )
    estimator.fit({"training": "s3://my-bucket/train"})

*SageMaker Python SDK 3.x:*

.. code:: python

    from sagemaker.train import ModelTrainer
    from sagemaker.train.configs import InputData

    trainer = ModelTrainer(
        training_image="my-training-image",
        role="arn:aws:iam::123456789012:role/SageMakerRole"
    )

    train_data = InputData(
        channel_name="training",
        data_source="s3://my-bucket/train"
    )

    trainer.train(input_data_config=[train_data])

**See more examples:** `SageMaker V3 Examples <#sagemaker-v3-examples>`__

**Inference Experience**

V3 introduces the unified ModelBuilder class for model deployment and inference. This replaces the V2 Model class and framework-specific classes (PyTorchModel, TensorFlowModel, SKLearnModel, XGBoostModel, etc.).

This example shows how to deploy a trained model for real-time inference.

*SageMaker Python SDK 2.x:*

.. code:: python

    fro

Release History
| Version | Changes | Urgency | Date |
|---|---|---|---|
| 3.8.0 | Imported from PyPI (3.8.0) | Low | 4/21/2026 |
| v2.257.2 | ### Enhancements * Update SDK to use latest LMI v23 image for sdk v2.x (#5710) * Update SDK to use latest LMIv22 image for sdk v2.x (#5641) * Update SDK to use latest LMI image for sdk v2.x (#5617) ### Bug Fixes * Security fixes for Triton HMAC key exposure and missing integrity check (v2) (#5656) * Include jumpstart/region_config.json in MANIFEST.in (#5605) * Apply default experiment config for pipelines only in regions with SageMaker Experiments (#5570) ### Other Changes * Remove | High | 4/21/2026 |
| v3.8.0 | ## v3.8.0 (2026-04-16) ### - New Feature - Feature Group Manager - Image Upgrades - Remove Pytorch hard dependency - Bug Fixes: - Add MLFLowConfig to Base Model - Support for docker compose > v2 - Improve SDK v3 Hugging Face support | High | 4/16/2026 |
| v3.7.1 | ### Features - **Telemetry**: Added telemetry emitter to `ScriptProcessor` and `FrameworkProcessor`, enabling SDK usage tracking for processing jobs via the telemetry attribution module (new `PROCESSING` feature enum added to telemetry constants) ### Fixes - **ModelBuilder**: Fixed `accept_eula` handling in ModelBuilder's LoRA deployment path: previously hardcoded to `True`, it now respects the user-provided value and raises a `ValueError` if not explicitly set to `True` - **Evaluate**: Fixe | Medium | 4/1/2026 |
| v3.7.0 | ### Fixes - **ModelBuilder**: Sync Nova hosting configs with AGISageMakerInference - **Evaluate**: Remove GPT OSS model evaluation restriction ### Features - **AWS Batch**: Add support for Quota Management job submission and job priority update - **AWS Batch**: Extend list_jobs_by_share for quota_share_name - **Evaluate**: Support IAM role for BaseEvaluator - **Telemetry**: Add telemetry attribution module for SDK usage provenance - **MLflow**: Metrics visualization, enhanced wait UI, | Medium | 3/25/2026 |
| v3.6.0 | ### Fixes - **HyperparameterTuner**: Include sm_drivers channel in HyperparameterTuner jobs - **Pipeline**: Fix handling of training step dependencies to allow successful pipeline creation - **ModelBuilder**: Fix the bug in deploy from LORA finetuning job ### Features - **Feature Processor**: Port feature processor to v3 - **Jumpstart**: Add EUSC region config for JumpStart | Low | 3/20/2026 |
| v2.257.1 | ### Bug Fixes * Fix test failures with pytest and setuptools compatibility (#5574) ### Dependency Updates * Relax protobuf version constraint to <7.0 (#5573) ### Enhancements * Add telemetry for Feature Store (#5557) * Add VERL (Versatile Reinforcement Learning) support (#5498) | Low | 3/5/2026 |
| v3.5.0 | ### Features - **Feature Store v3**: New version of Feature Store functionality - **Batch job listing by share identifier**: Added support for listing Batch jobs filtered by share identifier - **Stop condition for model customization trainers**: Added stopping condition support to model customization trainers - **EMRStep smart output**: Enhanced EMR step output handling with smart output capabilities - **Transform AMI version support**: Added support for specifying AMI version in SageMaker | Low | 3/3/2026 |
| v3.4.1 | ### Fixes - **Pipelines**: Correct Tag class usage in pipeline creation (#5526) - **ModelTrainer**: Support PipelineVariables in hyperparameters (#5519) - **HyperparameterTuner**: Include ModelTrainer internal channels (#5516) - **Experiments**: Don't apply default experiment config for pipelines in non-Eureka GA regions (#5500) ### Features - **JumpStart**: Added ISO regions support (#5505) - **JumpStart**: Added version 1.4 and 1.5 (#5538) ### Chores - Added unit and integration | Low | 2/11/2026 |
| v2.257.0 | ## v2.257.0 (2026-02-03) ### Features * Update image URIs for DJL 0.36.0 release | Low | 2/3/2026 |
| v3.4.0 | ### Features - feat: add emr-serverless step for SageMaker Pipelines ### Bug fixes and Other Changes - Add Nova recipe training support in ModelTrainer - Add Partner-app Auth provider - Add sagemaker dependency for remote function by default V3 | Low | 1/23/2026 |
| v2.256.1 | ### Bug fixes and Other Changes * Bug fixes for remote function | Low | 1/21/2026 |
| v3.3.1 | ### Bug fixes and Other Changes * ProcessingJob fix - Remove tags in Processor during job creation * Telemetry Updates * sagemaker-mlops bug fix - Correct source code 'dependencies' parameter to 'requirements' * aws_batch bug fix - remove experiment config parameter as Estimator is deprecated. | Low | 1/13/2026 |
| v2.256.0 | ### Features - Image for Numpy 2.0 support with XGBoost ### Bug fixes and Other Changes - Bug fix for Triton Model server for inference - Removal of hmac key parameter for remote function - Bug fixes for input validation for local mode and resource management for iterators | Low | 1/9/2026 |
| v3.3.0 | ### Features - AWS_Batch: queueing of training jobs with ModelTrainer ### Bug fixes and Other Changes - Fixes for model registry with ModelBuilder | Low | 12/20/2025 |
| v3.2.0 | ### Features - Evaluator handshake with trainer - Datasets Format validation ### Bug fixes and Other Changes - Add xgboost 3.0-5 to release - Fix get_child_process_ids parsing issue | Low | 12/20/2025 |
| v3.1.1 | ### Bug fixes and Other Changes - Fine-tuning SDK: - Add validation to bedrock reward models - Hyperparameter issue fixes, Add validation s3 output path - Fix the recipe selection for multiple recipe scenario - Train wait() timeout exception handling - Update example notebooks to reflect recent code changes | Low | 12/11/2025 |
| v3.1.0 | # Model Fine-Tuning Support in SageMaker Python SDK V3 We're excited to introduce comprehensive model fine-tuning capabilities in the SageMaker Python SDK V3, bringing state-of-the-art fine-tuning techniques to production ML workflows. Fine-tune foundation models with enterprise features including automated experiment tracking, serverless infrastructure, and integrated evaluation, all with just a few lines of code. ## What's New The SageMaker Python SDK V3 now includes four specialized F | Low | 12/3/2025 |
| v2.255.0 | ## What's Changed - Extracts reward Lambda ARN from Nova recipes - Passes it as training job hyperparameter - Added LLMFT recipe support with standardized recipe handling - Enhanced recipe validation and multi-model type compatibility | Low | 12/3/2025 |
| v3.0.1 | ## What's Changed * Update pyproject.toml and prepare for v3.0.1 release by @zhaoqizqwang in https://github.com/aws/sagemaker-python-sdk/pull/5329 **Full Changelog**: https://github.com/aws/sagemaker-python-sdk/compare/v3.0.0...v3.0.1 ## Note This release is created retroactively for code deployed on Thu Nov 20 2025 All changes listed below are already live in production. | Low | 12/3/2025 |
| v3.0.0 | SageMaker V3 Release Version 3.0.0 represents a significant milestone in our product's evolution. This major release introduces a modernized architecture, enhanced performance, and powerful new features while maintaining our commitment to user experience and reliability. Important: Please review these breaking changes before upgrading. Older interfaces such as Estimator, Model, Predictor and all their subclasses will not be supported in V3. Please see our [V3 examples folder](https | Low | 12/3/2025 |
| v2.254.1 | ### Bug Fixes and Other Changes * update get_execution_role to directly return the ExecutionRoleArn if it is present in the resource metadata file * [hf] HF PT Training DLCs | Low | 10/31/2025 |
| v2.254.0 | ### Features * Triton v25.09 DLC ### Bug Fixes and Other Changes * Add Numpy 2.0 support * add HF Optimum Neuron DLCs * [Hugging Face][Pytorch] Inference DLC 4.51.3 * [hf] HF Inference TGI | Low | 10/29/2025 |
| v2.253.1 | ### Bug Fixes and Other Changes * Update instance type regex to also include hyphens * Revert the change "Add Numpy 2.0 support" * [hf-tei] add image uri to utils * add TEI 1.8.2 | Low | 10/14/2025 |
| v2.253.0 | ### Features * Added condition to allow eval recipe. * add model_type hyperparameter support for Nova recipes ### Bug Fixes and Other Changes * Fix for a failed slow test: numpy fix * Add numpy 2.0 support * chore: domain support for eu-isoe-west-1 * Adding default identity implementations to InferenceSpec * djl regions fixes #5273 * Fix flaky integ test | Low | 10/10/2025 |
| v2.252.0 | ### Features * change S3 endpoint env name * add eval custom lambda arn to hyperparameters ### Bug Fixes and Other Changes * merge rba without the iso region changes * handle trial component status message longer than API supports * Add nova custom lambda in hyperparameter from estimator * add retryable option to emr step in SageMaker Pipelines * Feature/js mlops telemetry * latest tgi | Low | 9/29/2025 |
| v2.251.1 | ### Bug Fixes and Other Changes * chore: onboard tei 1.8.0 | Low | 8/29/2025 |
| v2.251.0 | ### Features * support pipeline versioning ### Bug Fixes and Other Changes * GPT OSS Hotfix * dockerfile stuck on interactive shell * add sleep for model deployment | Low | 8/21/2025 |
| v2.250.0 | ### Features * Add support for InstancePlacementConfig in Estimator for training jobs running on ultraserver capacity ### Bug Fixes and Other Changes * Add more constraints to test requirements | Low | 8/8/2025 |
| v2.249.0 | ### Features * AWS Batch for SageMaker Training jobs ### Bug Fixes and Other Changes * Directly use customer-provided endpoint name for ModelBuilder deployment. * update image_uri_configs 07-23-2025 07:18:25 PST | Low | 7/31/2025 |
| v2.248.2 | ### Bug Fixes and Other Changes * Relax boto3 version requirement * update image_uri_configs 07-22-2025 07:18:25 PST * update image_uri_configs 07-18-2025 07:18:28 PST * add hard dependency on sagemaker-core pypi lib * When rootlessDocker is enabled, return a fixed SageMaker IP | Low | 7/22/2025 |
| v2.248.1 | ### Bug Fixes and Other Changes * Nova training support | Low | 7/16/2025 |
| v2.248.0 | ### Features * integrate amtviz for visualization of tuning jobs ### Bug Fixes and Other Changes * build(deps): bump requests in /tests/data/serve_resources/mlflow/pytorch * build(deps): bump protobuf from 4.25.5 to 4.25.8 in /requirements/extras * build(deps): bump mlflow in /tests/data/serve_resources/mlflow/xgboost * build(deps): bump torch in /tests/data/modules/script_mode * sanitize git clone repo input url * Adding Hyperpod feature to enable hyperpod telemetry * Adding Hyperpod | Low | 7/15/2025 |
| v2.247.1 | ### Bug Fixes and Other Changes * update image_uri_configs 06-19-2025 07:18:34 PST | Low | 6/23/2025 |
| v2.247.0 | ### Features * Add support for MetricDefinitions in ModelTrainer ### Bug Fixes and Other Changes * update jumpstart region_config, update image_uri_configs 06-12-2025 07:18:12 PST * Add ignore_patterns in ModelTrainer to ignore specific files/folders * Allow import failure for internal _hashlib module | Low | 6/13/2025 |
| v2.246.0 | ### Features * Triton v25.04 DLC ### Bug Fixes and Other Changes * Update Attrs version to widen support * update estimator documentation regarding hyperparameters for source_dir | Low | 6/4/2025 |
| v2.245.0 | ### Features * Correct mypy type checking through PEP 561 ### Bug Fixes and Other Changes * MLFLow update for dependabot * addWaiterTimeoutHandling * merge method inputs with class inputs * update image_uri_configs 05-20-2025 07:18:17 PST | Low | 5/28/2025 |
| v2.244.2 | ### Bug Fixes and Other Changes * include model channel for gated uncompressed models * clarify model monitor one time schedule bug * update jumpstart region_config 05-15-2025 07:18:15 PST * update image_uri_configs 05-14-2025 07:18:16 PST * Add image configs and region config for TPE (ap-east-2) * Improve defaults handling in ModelTrainer | Low | 5/19/2025 |
| v2.244.1 | ### Bug Fixes and Other Changes * Fix Flask-Limiter version * Fix test_huggingface_tei_uris() * huggingface-llm-neuronx dlc * huggingface-neuronx dlc image_uri * huggingface-tei dlc image_uri * Fix test_deploy_with_update_endpoint() * add AG v1.3 * parameter mismatch in update_endpoint * remove --strip-component for untar source tar.gz * Fix type annotations * chore: Allow omegaconf >=2.2,<3 * honor json serialization of HPs * Map llama models to correct script * pin test dependen | Low | 5/15/2025 |
| v2.244.0 | ### Features * support custom workflow deployment in ModelBuilder using SMD image. ### Bug Fixes and Other Changes * Add Owner ID check for bucket with path when prefix is provided * Add model server timeout * pin mamba version to 24.11.3-2 to avoid inconsistent test runs * Update ModelTrainer to support s3 uri and tar.gz file as source_dir * chore: add huggingface images | Low | 5/2/2025 |
| v2.243.3 | ### Bug Fixes and Other Changes * update readme to reflect py312 upgrade * Revert the PR changes 5122 * Py312 upgrade step 2: Update dependencies, integ tests and unit tests * update pr test to deprecate py38 and add py312 * update image_uri_configs 04-16-2025 07:18:18 PST * update image_uri_configs 04-15-2025 07:18:10 PST * update image_uri_configs 04-11-2025 07:18:19 PST | Low | 4/23/2025 |
| v2.243.2 | ### Bug Fixes and Other Changes * tgi image uri unit tests * Fix deepdiff dependencies | Low | 4/16/2025 |
| v2.243.1 | ### Bug Fixes and Other Changes * Added handler for pipeline variable while creating process job * Fix issue #4856 by copying environment variables * remove historical job_name caching which causes long job name * Update instance gpu info * Master * Add mlflow tracking arn telemetry * chore: fix semantic versioning for wildcard identifier * flaky test ### Documentation Changes * update pipelines step caching examples to include more steps * update ModelStep data dependency info | Low | 4/11/2025 |
| v2.243.0 | ### Features * Enabled update_endpoint through model_builder ### Bug Fixes and Other Changes * Update for PT 2.5.1, SMP 2.8.0 * chore: move jumpstart region definitions to json file * fix flaky clarify model monitor test * fix flaky spark processor integ * use temp file in unit tests * Update transformers version * Aligned disable_output_compression for @remote with Estimator * Update Jinja version * update image_uri_configs 03-26-2025 07:18:16 PST * chore: fix integ tests to use | Low | 3/27/2025 |
| v2.242.0 | ### Features * add integ tests for training JumpStart models in private hub ### Bug Fixes and Other Changes * Torch upgrade * Prevent RunContext overlap between test_run tests * remove s3 output location requirement from hub class init * Fixing Pytorch training python version in tests * update image_uri_configs 03-11-2025 07:18:09 PST * resolve infinite loop in _find_config on Windows systems * pipeline definition function doc update | Low | 3/14/2025 |
| v2.241.0 | ### Features * Make DistributedConfig Extensible * support training for JumpStart model references as part of Curated Hub Phase 2 * Allow ModelTrainer to accept hyperparameters file ### Bug Fixes and Other Changes * Skip tests with deprecated instance type * Ensure Model.is_repack() returns a boolean * Fix error when there is no session to call _create_model_request() * Use sagemaker session's s3_resource in download_folder * Added check for the presence of model package group before | Low | 3/6/2025 |
| v2.240.0 | ### Features * Add support for TGI Neuronx 0.0.27 and HF PT 2.3.0 image in PySDK ### Bug Fixes and Other Changes * Remove main function entrypoint in ModelBuilder dependency manager. * forbid extras in Configs * altconfig hubcontent and reenable integ test * Merge branch 'master-rba' into local_merge * py_version doc fixes * Add backward compatibility for RecordSerializer and RecordDeserializer * update image_uri_configs 02-21-2025 06:18:10 PST * update image_uri_configs 02-20-2025 | Low | 2/25/2025 |
| v2.239.3 | ### Bug Fixes and Other Changes * added ap-southeast-7 and mx-central-1 for Jumpstart * update image_uri_configs 02-19-2025 06:18:15 PST | Low | 2/19/2025 |
| v2.239.2 | ### Bug Fixes and Other Changes * Add warning about not supporting torch.nn.SyncBatchNorm * pass in inference_ami_version to model_based endpoint type * Fix hyperparameter strategy docs * Add framework_version to all TensorFlowModel examples * Move RecordSerializer and RecordDeserializer to sagemaker.serializers and sagemaker.deserializers | Low | 2/18/2025 |
| v2.239.1 | ### Bug Fixes and Other Changes * keep sagemaker_session from being overridden to None * Fix all type hint and docstrings for callable * Fix the workshop link for Step Functions * Fix Tensorflow doc link * Fix FeatureGroup docstring * Add type hint for ProcessingOutput * Fix sourcedir.tar.gz filenames in docstrings * Fix documentation for local mode * bug in get latest version was getting the max sorted alphabetically * Add cleanup logic to model builder integ tests for endpoints * F | Low | 2/14/2025 |
| v2.239.0 | ### Features * Add support for deepseek recipes ### Bug Fixes and Other Changes * mpirun protocol - distributed training with @remote decorator * Allow telemetry only in supported regions * Fix ssh host policy | Low | 2/1/2025 |
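
The table above interleaves two release lines, 2.x and 3.x, so the newest row is not always the newest release of the line you are pinned to. The following is a minimal sketch, not part of the SDK, of how the latest tag per major version could be picked out. The helper names `parse_tag` and `latest_per_major` are hypothetical, and the sample tags are a handful copied from the table. Tags are compared as tuples of integers rather than as strings, which avoids the alphabetical-sort pitfall mentioned in the v2.239.1 notes.

```python
def parse_tag(tag):
    """Turn a tag such as 'v2.257.2' or '3.8.0' into a tuple of ints."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def latest_per_major(tags):
    """Map each major version number to its highest tag (hypothetical helper)."""
    newest = {}
    for tag in tags:
        key = parse_tag(tag)
        major = key[0]
        # Compare numerically: (2, 257, 2) > (2, 239, 0), whereas the
        # string "2.9" would sort after "2.257" alphabetically.
        if major not in newest or key > parse_tag(newest[major]):
            newest[major] = tag
    return newest


# A few tags copied from the release table above.
tags = ["v2.257.2", "v3.8.0", "v3.7.1", "v2.257.1", "v3.0.0", "v2.239.0"]
latest = latest_per_major(tags)
print(latest)  # {2: 'v2.257.2', 3: 'v3.8.0'}
```

For real dependency management, a version specifier such as the `sagemaker==2.*` form shown in the upgrade instructions is the simpler route; the sketch only illustrates how to read the table.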
