awswrangler

Pandas on AWS.

Why this rank: Strong adoption, Release freshness, Healthy release cadence

Description

# AWS SDK for pandas (awswrangler)

*Pandas on AWS*

Easy integration with Athena, Glue, Redshift, Timestream, OpenSearch, Neptune, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

![AWS SDK for pandas](https://github.com/aws/aws-sdk-pandas/blob/main/docs/source/_static/logo2.png?raw=true "AWS SDK for pandas")
![tracker](https://d3tiqpr4kkkomd.cloudfront.net/img/pixel.png?asset=GVOYN2BOOQ573LTVIHEW)

> An [AWS Professional Service](https://aws.amazon.com/professional-services/) open source initiative | aws-proserve-opensource@amazon.com

[![PyPi](https://img.shields.io/pypi/v/awswrangler)](https://pypi.org/project/awswrangler/)
[![Conda](https://img.shields.io/conda/vn/conda-forge/awswrangler)](https://anaconda.org/conda-forge/awswrangler)
[![Python Version](https://img.shields.io/pypi/pyversions/awswrangler.svg)](https://pypi.org/project/awswrangler/)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)
![Static Checking](https://github.com/aws/aws-sdk-pandas/workflows/Static%20Checking/badge.svg?branch=main)
[![Documentation Status](https://readthedocs.org/projects/aws-sdk-pandas/badge/?version=latest)](https://aws-sdk-pandas.readthedocs.io/?badge=latest)

| Source | Downloads | Installation Command |
|--------|-----------|----------------------|
| **[PyPi](https://pypi.org/project/awswrangler/)** | [![PyPI Downloads](https://img.shields.io/pypi/dm/awswrangler)](https://pypi.org/project/awswrangler/) | `pip install awswrangler` |
| **[Conda](https://anaconda.org/conda-forge/awswrangler)** | [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/awswrangler.svg)](https://anaconda.org/conda-forge/awswrangler) | `conda install -c conda-forge awswrangler` |

> ⚠️ **Starting version 3.0, optional modules must be installed explicitly:**<br>
➡️`pip install 'awswrangler[redshift]'`

## Table of contents

- [Quick Start](#quick-start)
- [At Scale](#at-scale)
- [Read The Docs](#read-the-docs)
- [Getting Help](#getting-help)
- [Logging](#logging)

## Quick Start

Installation command: `pip install awswrangler`

> ⚠️ **Starting version 3.0, optional modules must be installed explicitly:**<br>
➡️`pip install 'awswrangler[redshift]'`

```py3
import awswrangler as wr
import pandas as pd
from datetime import datetime

df = pd.DataFrame({"id": [1, 2], "value": ["foo", "boo"]})

# Storing data on Data Lake
wr.s3.to_parquet(
    df=df,
    path="s3://bucket/dataset/",
    dataset=True,
    database="my_db",
    table="my_table"
)

# Retrieving the data directly from Amazon S3
df = wr.s3.read_parquet("s3://bucket/dataset/", dataset=True)

# Retrieving the data from Amazon Athena
df = wr.athena.read_sql_query("SELECT * FROM my_table", database="my_db")

# Get a Redshift connection from Glue Catalog and retrieving data from Redshift Spectrum
con = wr.redshift.connect("my-glue-connection")
df = wr.redshift.read_sql_query("SELECT * FROM external_schema.my_table", con=con)
con.close()

# Amazon Timestream Write
df = pd.DataFrame({
    "time": [datetime.now(), datetime.now()],
    "my_dimension": ["foo", "boo"],
    "measure": [1.0, 1.1],
})
rejected_records = wr.timestream.write(df,
    database="sampleDB",
    table="sampleTable",
    time_col="time",
    measure_col="measure",
    dimensions_cols=["my_dimension"],
)

# Amazon Timestream Query
wr.timestream.query("""
SELECT time, measure_value::double, my_dimension
FROM "sampleDB"."sampleTable" ORDER BY time DESC LIMIT 3
""")
```

## At scale

AWS SDK for pandas can also run your workflows at scale by leveraging [Modin](https://modin.readthedocs.io/en/stable/) and [Ray](https://www.ray.io/). Both projects aim to speed up data workloads by distributing processing over a cluster of workers.

Read our [docs](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/scale.html) or head to our latest [tutorials](https://github.com/aws/aws-sdk-pandas/tree/main/tutorials) to learn more.

## [Read The Docs](https://aws-sdk-pandas.readthedocs.io/)

- [**What is AWS SDK for pandas?**](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/about.html)
- [**Install**](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html)
  - [PyPi (pip)](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#pypi-pip)
  - [Conda](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#conda)
  - [AWS Lambda Layer](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#aws-lambda-layer)
  - [AWS Glue Python Shell Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#aws-glue-python-shell-jobs)
  - [AWS Glue PySpark Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#aws-glue-pyspark-jobs)
  - [Amazon SageMa
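
The Quick Start note above says that, starting with version 3.0, optional modules must be installed explicitly. As a rough sketch of how a script might check for this before calling an optional module, the snippet below looks up a driver package for each extra and builds the matching `pip install` hint. The extra names come from the 3.0.0 release notes; the driver module names in `EXTRA_TO_MODULE` and the helper functions are assumptions made for illustration and are not part of awswrangler.

```python
import importlib.util

# Assumed mapping from an extra to the driver module it is expected to provide.
# These module names are illustrative assumptions, not taken from this page.
EXTRA_TO_MODULE = {
    "redshift": "redshift_connector",
    "postgres": "pg8000",
    "mysql": "pymysql",
}


def missing_extras(extras):
    """Return the extras whose assumed driver module cannot be found."""
    return [
        extra
        for extra in extras
        if importlib.util.find_spec(EXTRA_TO_MODULE[extra]) is None
    ]


def install_hint(extras):
    """Build the pip command for the given extras, or None if the list is empty."""
    if not extras:
        return None
    return "pip install 'awswrangler[{}]'".format(",".join(extras))


missing = missing_extras(["redshift", "postgres", "mysql"])
print(install_hint(missing) or "all checked extras are importable")
```

Running the check once at start-up gives a clearer message than an import error raised in the middle of a job.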

Release History

Version	Changes	Urgency	Date
3.16.1	## Notable Changes ⚠️ * pyarrow upgraded from v20.0.0 to v22.0.0 in AWS Lambda layers ⚠️ ### Bugfixes 🐛 * fix(athena): verify bucket ownership and manifest integrity by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3314 ### Security / Dependency Updates 🛡️ * chore(deps): bump cryptography from 46.0.6 to 46.0.7 by @dependabot[bot] in https://github.com/aws/aws-sdk-pandas/pull/3297 * chore(deps): bump uv from 0.10.10 to 0.11.6 by @dependabot[bot] in https://github.com/aws/	High	5/7/2026
3.16.0	Imported from PyPI (3.16.0)	Low	4/21/2026
3.15.1	### Security / Dependency Updates 🛡️ * fix: upgrade setuptools due to CVE-2026-23949 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3261 * chore: pyasn1, wheel, filelock security fixes by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3262 * chore: wheel security fix #3262 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3263 * chore: Update dependencies by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3268 ### Housekeeping 🧹 * chore(dep	Low	2/5/2026
3.15.0	## Notable Changes ⚠️ * fix: upgrade aiohttp due to CVE-2025-69223 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3250 * chore: Build Python 3.14 layers by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3251 * chore: Drop Python 3.9 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3257 ### Features / Enhancements 🚀 * feat(s3): add to_deltalake_streaming for single-commit Delta writes by @skoschik in https://github.com/aws/aws-sdk-pandas/pull/3231 * f	Low	1/13/2026
3.14.0	## Notable Changes ⚠️ * chore: upgrade pg8000 due to CVE-2025-61385 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3225 ### Features / Enhancements 🚀 * feat: support redshift `CLEANPATH` by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3211 * feat: add result reuse configuration to query execution functions by @DavidKatz-il in https://github.com/aws/aws-sdk-pandas/pull/3212 ### Bugfixes 🐛 * fix: Add `s3_output` parameter to `_start_query_execution` call in "	Low	10/30/2025
3.13.0	## Notable Changes ⚠️ * updated `aiohttp==3.12.15` to fix CVE-2025-53643 (LOW) by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3197 ### Features / Enhancements 🚀 * feat: ray 2.49.0 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3194 * feat: add support for aurora-mysql and aurora-postgresql engines by @senorcinco in https://github.com/aws/aws-sdk-pandas/pull/3188 ### Bugfixes 🐛 * fix: opensearch session by @kukushking in https://github.com/aws/aws-sdk-pandas	Low	9/10/2025
3.12.1	## Notable Changes ⚠️ * Moved to [uv package manager](https://github.com/astral-sh/uv) 🔥 🔥 🔥 ### Features / Enhancements 🚀 * feat: uv by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3132 ### Security / Dependency Updates 🛡️ * chore(deps): bump the production-dependencies group with 4 updates by @dependabot in https://github.com/aws/aws-sdk-pandas/pull/3159 * chore(deps): bump the production-dependencies group with 4 updates by @dependabot in https://github.com/aws/	Low	6/18/2025
3.12.0	## Notable Changes ⚠️ * AWS Lambda Layers: pyarrow was upgraded to 20.0.0 ### Features / Enhancements 🚀 * feat: add pyarrow_additional_kwargs to athena.to_iceberg by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/3094 * feat: add dtype argument to delete_from_iceberg by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/3099 * feat: add redshift and rds data api query params by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3111 * chore: ray 2.45 by @kukus	Low	5/29/2025
3.11.0	## Notable Changes ⚠️ * AWS SDK for pandas now supports Python 3.13! 🎉 * Python 3.8 is no longer supported (reached [end-of-life](https://devguide.python.org/versions/) Oct 7 2024) 🚫 * AWS Lambda Layers: pyarrow was upgraded to 18.1.0 * AWS Lambda Layers: numpy was upgraded to 2.2.1 ### Features / Enhancements 🚀 * add support for Python 3.13 & deprecate Python 3.8 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3045 * return opensearch aggregation top hits by @ku	Low	1/10/2025
3.10.1	## Bug fixes 🐛 * fix: update references in introduction notebook by @emmanuel-ferdman in https://github.com/aws/aws-sdk-pandas/pull/3009 * fix: read parquet file in chunked mode per row group by @FredericKayser in https://github.com/aws/aws-sdk-pandas/pull/3016 * fix: add missing raise statement in RS Data API by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/3025 ## Documentation 📚 * chore: Prepare 3.10.1 release by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/30	Low	12/4/2024
3.10.0	## Features * feat: Support numpy 2.0 by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2944 * feat(redshift): Automatically add new DataFrame columns to Redshift tables during write operation by @jack-dell in https://github.com/aws/aws-sdk-pandas/pull/2948 * feat: modify_refresh_interval flag in opensearch index_documents by @AvihaiSam in https://github.com/aws/aws-sdk-pandas/pull/2980 * feat: support postgresql array types by @kukushking in https://github.com/aws/aws-sdk-p	Low	10/31/2024
3.9.1	## Bug fixes 🐛 * bucketing error with newer version of Modin (0.31.0) by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2896 * `athena.read_sql_query` failing for time columns by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2895 * add an argument to control handling nulls in merge criteria by @brendan-cook-87 in https://github.com/aws/aws-sdk-pandas/pull/2892 * address Ray deprecation warnings by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas	Low	8/19/2024
3.9.0	## Enhancements 🎉 * Support ORC and CSV in `redshift.copy_from_files` function by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2849 * Support different merge conditions in `athena.to_iceberg` function by @aldder in https://github.com/aws/aws-sdk-pandas/pull/2861 * Manage `NULL` values in `athena.to_iceberg` merge statement by @aldder in https://github.com/aws/aws-sdk-pandas/pull/2872 * Upgrade Ray to 2.30 by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2870	Low	7/8/2024
3.8.0	## Enhancements 🎉 * support client-side parameter resolution in athena.create_ctas_table by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2797 * add commit_transaction to postgres.to_sql by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2795 * add columns parameters support by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2814 * add overwrite_method to `postgresql.to_sql` by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2820 * add u	Low	6/5/2024
3.7.3	## Bug fixes 🐛 - Iceberg schema evolution fails for map, array and struct types by @LeonLuttenberger in #2755 - trickle down `s3_output` in `athena.to_iceberg` by @jaidisido in #2767 - respect order of columns in `to_iceberg` by @jaidisido in #2768 - add PyArrow `fixed_size_binary` dtype support by @jaidisido in #2775 - Opensearch serverless vector search collections - remove default `_id` by @kukushking in #2784 - missing keys in `list_to_arrow_table` by @kukushking in #2778 - prevent `	Low	4/22/2024
3.7.2	## Features/Enhancements 🚀 - Add support for DeltaLake's DynamoDB lock mechanism by @LeonLuttenberger in #2705 ## Bug fixes 🐛 - `wr.athena.to_iceberg` - Insert query has mismatched column types #2678 by @GalvFionic in #2715 - allow `s3_output` in `athena.to_iceberg` by @jaidisido in #2727 - replace deprecated `np.split_array` by @jaidisido in #2735 - Athena `to_iceberg` fails with non-lowercase column names by @LeonLuttenberger in #2736 - Support Ray 2.10 by @kukushking in #2741 ##	Low	3/27/2024
3.7.1	## Bug fixes 🐛 * fix breaking change in `_create_table` by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2711 * pin pyarrow to version 8 and above by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2709 ## Documentation 📚 * fix `redshift.to_sql` doc indentation error by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2706 Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.7.0...3.7.1	Low	3/7/2024
3.7.0	## Breaking changes 💥 Lake Formation Governed tables are being phased out and we are dropping support (#2692). ## Features/Enhancements 🚀 * support parquet client encryption (#2642) by @Marwen94 in https://github.com/aws/aws-sdk-pandas/pull/2674 ## Bug fixes 🐛 * Index columns removed on s3.to_parquet by @robert-schmidtke in https://github.com/aws/aws-sdk-pandas/pull/2655 * Missing timezone metadata by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2682 * remove enforced	Low	3/5/2024
3.6.0	## Features/Enhancements 🚀 * Enable Iceberg row deletion & add `mode` parameter to `to_iceberg` by @LeonLuttenberger in #2632 * Add support for pyarrow type `large_string` by @joakibo in #2663 * Add `max_results` to `athena.list_query_executions` by @LeonLuttenberger in #2665 ## Bug fixes 🐛 * Pyarrow 15 imports & remove unused code by @kukushking in #2649 ## New Contributors * @joakibo made their first contribution in https://github.com/aws/aws-sdk-pandas/pull/2663 **Full Changel	Low	2/14/2024
3.5.2	## Bug fixes 🐛 * DynamoDB key & filter expressions attribute overwrite by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2615 * Allow PostgreSQL reserved keywords as column names by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2619 * Add `to_iceberg` support for filling missing columns in the DataFrame with None by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2616 * Forward `ignore_nulls` for container types by @raaidarshad in #2636 ## Doc	Low	1/25/2024
3.5.1	## Bug fixes 🐛 * Deserialization error when reading from DynamoDB using `KeyConditionExpression` by @LeonLuttenberger in #2607 * Reading of chunked parquet when columns parameter is specified by @rchromik in #2599 ## Documentation 📚 * Add `show_create_table` to Athena API page by @MikeSchriefer in #2610 ## Other 🤖 * chore: Replace `bump2version` with `bump-my-version` by @LeonLuttenberger in #2608 * chore(deps-dev): bump jinja2 from 3.1.2 to 3.1.3 by @dependabot in #2609 * chore(d	Low	1/12/2024
3.5.0	## Breaking changes 💥 Due to [CVEs](https://www.anyscale.com/blog/update-on-ray-cves-cve-2023-6019-cve-2023-6020-cve-2023-6021-cve-2023-48022-cve-2023-48023), Ray is capped to patched version 2.9.x. As a result, the latest version of the library cannot be used on the Glue for Ray runtime. We have raised the CVEs issue to the Glue team ## Features/Enhancements 🚀 * Add `spark_properties` to athena spark by @rajagurunath in https://github.com/aws/aws-sdk-pandas/pull/2508 * Add `MERGE INTO`	Low	1/11/2024
3.4.2	## Features/Enhancements 🚀 * Update pyarrow to 14.0.1 to fix [arbitrary code execution security vulnerability](https://github.com/aws/aws-sdk-pandas/security/dependabot/35) Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.4.1...3.4.2	Low	11/13/2023
3.4.1	## Features/Enhancements 🚀 * feat: Add schema evolution to `athena.to_iceberg` by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2465 * feat: Athena - add `client_request_token` by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2474 * feat: Redshift data api - allow all auth combinations by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2475 * feat: add columns comments to iceberg by @frenchytheasian in https://github.com/aws/aws-sdk-pandas/pull/2482	Low	10/24/2023
3.4.0	## Features/Enhancements 🚀 * Geospatial - parse Athena geospatial types via geopandas by @kukushking in #2346 * Allow group identifiers to be used in `wr.cloudwatch` queries by @LeonLuttenberger in #2430 * Add ignore null store parquet metadata by @raaidarshad in #2450 ## Bug fixes 🐛 * Add missing boto3 session in `athena.to_iceberg` wait_query by @jaidisido in #2428 * Add catalog ID in `athena.to_iceberg` by @jaidisido in #2446 * Return None for missing column and partition key comm	Low	9/11/2023
3.3.0	## Features/Enhancements 🚀 * Support Athena query prepared statements & Athena parameterized queries by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2344 * Add dtype parameter in to_iceberg function by @paulobrunheroto in https://github.com/aws/aws-sdk-pandas/pull/2359 * Add CleanRooms read module by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2366 * Escape and validate table identifiers and literals in PostgreSQL by @kukushking in https://github.com/aws/aws-	Low	8/1/2023
3.2.1	## Fixes 🛠️ * Fix error where library could not be imported on Windows due to `No module named 'pyarrow._orc'` by @LeonLuttenberger in #2341 #2337 * Lower `packaging` version requirement by @LeonLuttenberger in #2340 * Allow Ray 2.5 & downgrade tox by @kukushking in #2338 Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.2.0...3.2.1	Low	6/14/2023
3.2.0	### Features/Enhancements 🚀 * Add `s3.read_orc` and `s3.to_orc` by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2312 🔥 * Apache Spark on Amazon Athena - `wr.athena.create_spark_session` & `wr.athena.run_spark_calculation` by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2314 🚀 * EMR Serverless by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2304 🔥 * Add `to_sql` for RDS Data API by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pu	Low	6/13/2023
3.1.1	## What's Changed * fix: Add missing `packaging` dependency by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2281 Full Changelog: https://github.com/aws/aws-sdk-pandas/compare/3.1.0...3.1.1	Low	5/16/2023
3.1.0	### Features/Enhancements 🚀 * Add `neptune.bulk_load` for bulk loading data into Neptune by @LeonLuttenberger in #2238 #2267 * Add `s3.to_deltalake` function by @LeonLuttenberger in #2228 * Add Timestream Batch Load support by @jaidisido in #2214 * Add Iceberg insert by @kukushking in #2233 * Support upsert mode for OracleDB by @LeonLuttenberger in #2265 * Add `chunked` parameter to DynamoDB read functions by @LeonLuttenberger in #2227 * Upgrade Modin to 0.20.1 & allow Ray 2.4 by @kuku	Low	5/15/2023
3.0.0	### Breaking changes 💥 * Move dependencies to optional by @jaidisido in #1992 🔓 * Dependencies required by the following modules have been moved to optional: redshift, mysql, postgres, sqlserver, oracle, gremlin, sparql, deltalake * The required dependencies can be easily installed with `pip install awswrangler[<MODULE_NAME>]`, for example `pip install awswrangler[redshift]` * Change SQL formatters for Athena and LakeFormation so that they properly format types by @Taragolis and @Le	Low	4/13/2023
2.20.1	## What's Changed * (fix) Timestream - ignore None, NaN, and NaT measure values by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2072 * (docs) Minor - update opensearch api docs by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2085 * Correct documentation for `chunksize=True` by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2087 * fix: timestream empty batches by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2098 * enhancement: Add times	Low	3/21/2023
3.0.0rc3	## What's Changed ### Breaking changes: * breaking change: Move dependencies to optional by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1992 * breaking change: Use ExecuteStatement instead of Scan for DynamoDB read_partiql by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1964 ### Features/Enhancements: * enhancement: Refactor engine switching when Ray is installed by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1792 * logging: Enable user to	Low	3/9/2023
2.20.0	### Breaking changes - `dynamodb.read_partiql` no longer performs a Scan operation under the hood. Instead the `ExecuteStatement` API is used. It means that the `PartiQL` IAM permission is required instead of `Scan` ### Noteworthy (feat): opensearch serverless by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1922. See the [tutorial](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/035%20-%20OpenSearch%20Serverless.ipynb) 🔥 * (breaking change): Use `ExecuteStateme	Low	3/1/2023
2.19.0	## Noteworthy * Glue Data Quality now supported, check out the [tutorial](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/034%20-%20Glue%20Data%20Quality.ipynb) 🔥 * Delta lake support by @fvaleye * New DynamoDB `read_items` method by @a-slice-of-py ## Features & enhancements * feat: add read_items to dynamodb module by @a-slice-of-py in https://github.com/aws/aws-sdk-pandas/pull/1877 * Add deltalake support in AWS S3 with Pandas by @fvaleye in https://github.com/aws/aws-sdk-pa	Low	1/9/2023
2.18.0	## Noteworthy - Pyarrow 10 support 🔥 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1731 - Lambda layers now available in `af-south-1` (Cape Town) 🌍 by @malachi-constant ## Features & enhancements - Add unload_approach to athena.read_sql_table by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1634 - Pass additional partition projection params to wr.s3.to_parquet & cat… by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1627 - Regenerate poetry.lock wi	Low	12/2/2022
3.0.0rc2	## What's Changed * (enhancement): Enable missing unit tests and Redshift, Athena, LF load tests by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1736 * (enhancement): configure scheduling options, remove dependencies on internal ray impl by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1734 * (testing): Enable Athena and Redshift tests, and address errors by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1721 * (feat): Make tqdm progress reporting opt	Low	11/23/2022
3.0.0rc1	## What's Changed * (enhancement): Move RayLogger out of non-distributed modules by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1686 * (perf): Distribute data types inference by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1692 * (docs): Update config tutorial to include new configuration values by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1696 * (fix): partition block overwriting by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1695	Low	10/27/2022
3.0.0b3	## What's Changed * (feat): Add partitioning on block level by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1653 * (refactor): Make room for additional distributed engines by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1646 * (feat): Distribute s3 write text by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1631 * (docs): Add "Introduction to Ray" Tutorial by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1661 * (fix): Return addre	Low	10/12/2022
3.0.0b2	## What's Changed * (feat) Update to Ray 2.0 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1635 * (feat) Ray logging by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/1623 * (enhancement): Reduce LOC in S3 write methods create_table by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1626 * (docs) Tutorial: Run SDK for pandas job on ray cluster by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/1616 Full Changelog: https://github	Low	9/30/2022
3.0.0b1	## What's Changed * (test) Consolidate unit and load tests by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1525 * (feat) Distribute S3 read text by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1567 * (feat) Distribute s3 wait_objects by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1539 * (test) Ray Load Tests CDK Stack and Instructions for Load Testing by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/1583 * (fix) Fix S3 rea	Low	9/22/2022
2.17.0	## New Functionalities - RedshiftDataAPI serverless support 🔥 #1530 - Check out the [tutorial](https://aws-sdk-pandas.readthedocs.io/en/latest/tutorials/030%20-%20Data%20Api.html) - Add `get_query_results` to the Athena module #1496 - Check out the [function documentation](https://aws-sdk-pandas.readthedocs.io/en/latest/stubs/awswrangler.athena.get_query_results.html#awswrangler.athena.get_query_results) - Add `generate_create_query` to the Athena module #1514 - Check out the	Low	9/20/2022
3.0.0a2	This is a pre-release for the Wrangler@Scale project ## What's Changed * (feat): Add directory for Distributed Wrangler Load Tests by @malachi-constant in https://github.com/awslabs/aws-data-wrangler/pull/1464 * (CI): Distribute tests in tox config by @malachi-constant in https://github.com/awslabs/aws-data-wrangler/pull/1469 * (feat): Distribute s3 delete objects by @malachi-constant in https://github.com/awslabs/aws-data-wrangler/pull/1474 * (CI): Enable new CI pipeline for standard & d	Low	8/17/2022
3.0.0a1	This is a pre-release for the Wrangler@Scale project ## What's Changed * (feat): Add distributed config flag and initialise method by @jaidisido in https://github.com/awslabs/aws-data-wrangler/pull/1389 * (feat): Add distributed Lake Formation read by @jaidisido in https://github.com/awslabs/aws-data-wrangler/pull/1397 * (feat): Distribute S3 select over multiple paths and scan ranges by @jaidisido in https://github.com/awslabs/aws-data-wrangler/pull/1445 * (refactor): Refactor threading/	Low	8/17/2022
2.16.1	### Noteworthy > 🐛 Fixed issue introduced by `2.16.0` to method `s3.read_parquet()` ### Patch - Fix bug: pq_file.schema.names(): TypeError: 'list' object is not callable `s3.read_parquet()` #1412 --- *P.S.* The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. [Just upload it and run](https://aws-data-wrangler.readthedocs.io/en/stable/install.html) or [use](https://aws-data-wrangler.readthedocs.io/en/2.16.1/install.html#public-artifacts) them fro	Low	6/28/2022
2.16.0	### Noteworthy > ⚠️ For platforms without PyArrow 7 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):<br> ➡️ `pip install pyarrow==2 awswrangler` ### New Functionalities - Add support for Oracle Database 🔥 #1259 Check out the [tutorial](https://aws-data-wrangler.readthedocs.io/en/latest/tutorials/007%20-%20Redshift%2C%20M	Low	6/22/2022
2.15.1	### Noteworthy > ⚠️ Dropped Python 3.6 support > ⚠️ For platforms without PyArrow 7 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):<br> ➡️ `pip install pyarrow==2 awswrangler` ### Patch - Add `sparql` extra & make `SPARQLWrapper` dependency optional #1252 --- *P.S.* The AWS Lambda Layer file (.zip) and the	Low	4/11/2022
2.15.0	### Noteworthy > ⚠️ Dropped Python 3.6 support > ⚠️ For platforms without PyArrow 7 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):<br> ➡️ `pip install pyarrow==2 awswrangler` ### New Functionalities - Amazon Neptune module 🚀 #1084 Check out the [tutorial](https://aws-data-wrangler.readthedocs.io/en/latest/tutorials/	Low	3/28/2022
2.14.0	### Caveats > ⚠️ For platforms without PyArrow 6 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):<br> ➡️ `pip install pyarrow==2 awswrangler` ### New Functionalities - Support Athena Unload 🚀 #1038 ### Enhancements - Add the `ExcludeColumnSchema=True` argument to the glue.get_partitions call to reduce response si	Low	1/28/2022
2.13.0	### Caveats > ⚠️ For platforms without PyArrow 6 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):<br> ➡️ `pip install pyarrow==2 awswrangler` ### Breaking changes - Fix sanitize methods to align with Glue/Hive naming conventions #579 ### New Functionalities - AWS Lake Formation Governed Tables 🚀 #570 - Support for	Low	12/3/2021
2.12.1	### Caveats > ⚠️ For platforms without PyArrow 5 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):<br> ➡️ `pip install pyarrow==2 awswrangler` ### Patch - Removing unnecessary dev dependencies from main #961 --- *P.S.* The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. [Just uploa	Low	10/18/2021
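
The 2.20.0 entry above notes that `dynamodb.read_partiql` now uses the `ExecuteStatement` API, so the PartiQL IAM permission is required instead of `Scan`. The snippet below is a minimal sketch of what such a policy document might look like, built as a Python dictionary. It assumes read-only `SELECT` statements, for which the IAM action is `dynamodb:PartiQLSelect`; the helper name and the table ARN are hypothetical and should be adapted to your own account.

```python
import json


def partiql_read_policy(table_arn):
    """Build an IAM policy document allowing PartiQL SELECT on one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # PartiQL reads need this action; dynamodb:Scan alone is not enough.
                "Action": ["dynamodb:PartiQLSelect"],
                "Resource": [table_arn],
            }
        ],
    }


# Hypothetical ARN, for illustration only.
EXAMPLE_TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/my_table"

policy = partiql_read_policy(EXAMPLE_TABLE_ARN)
print(json.dumps(policy, indent=2))
```

Statements that write data would need other PartiQL actions, which this sketch deliberately leaves out.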


Similar Packages

sagemaker-studio (1.1.13): Python library to interact with Amazon SageMaker Unified Studio

alibabacloud-adb20211201 (master@2026-06-06): Alibaba Cloud adb (20211201) SDK Library for Python

ydb (3.29.1): YDB Python SDK

sagemaker (v3.13.0): Open source library for training and deploying models on Amazon SageMaker.

typer (0.26.7): Typer, build great CLIs. Easy to code. Based on Python type hints.

More from Amazon Web Services

aws-cdk-cloud-assembly-schema: Schema for the protocol between CDK framework and CDK CLI

boto3: The AWS SDK for Python

aws-lambda-powertools: Powertools for AWS Lambda (Python) is a developer toolkit to implement Serverless best practices and increase developer velocity.

sagemaker: Open source library for training and deploying models on Amazon SageMaker.

More in Databases

orbit: One API for 20+ LLM providers, your databases, and your files; a self-hosted, open-source AI gateway with RAG, voice, and guardrails.

alibabacloud-adb20211201: Alibaba Cloud adb (20211201) SDK Library for Python

milvus: Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

qdrant: Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/