awswrangler

Pandas on AWS.

Why this rank: Strong adoption, Release freshness, Healthy release cadence

Description

# AWS SDK for pandas (awswrangler)

*Pandas on AWS*

Easy integration with Athena, Glue, Redshift, Timestream, OpenSearch, Neptune, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

![AWS SDK for pandas](https://github.com/aws/aws-sdk-pandas/blob/main/docs/source/_static/logo2.png?raw=true "AWS SDK for pandas")

> An [AWS Professional Service](https://aws.amazon.com/professional-services/) open source initiative | aws-proserve-opensource@amazon.com

[![PyPi](https://img.shields.io/pypi/v/awswrangler)](https://pypi.org/project/awswrangler/)
[![Conda](https://img.shields.io/conda/vn/conda-forge/awswrangler)](https://anaconda.org/conda-forge/awswrangler)
[![Python Version](https://img.shields.io/pypi/pyversions/awswrangler.svg)](https://pypi.org/project/awswrangler/)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)
![Static Checking](https://github.com/aws/aws-sdk-pandas/workflows/Static%20Checking/badge.svg?branch=main)
[![Documentation Status](https://readthedocs.org/projects/aws-sdk-pandas/badge/?version=latest)](https://aws-sdk-pandas.readthedocs.io/?badge=latest)

| Source | Downloads | Installation Command |
|--------|-----------|----------------------|
| **[PyPi](https://pypi.org/project/awswrangler/)** | [![PyPI Downloads](https://img.shields.io/pypi/dm/awswrangler)](https://pypi.org/project/awswrangler/) | `pip install awswrangler` |
| **[Conda](https://anaconda.org/conda-forge/awswrangler)** | [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/awswrangler.svg)](https://anaconda.org/conda-forge/awswrangler) | `conda install -c conda-forge awswrangler` |

> ⚠️ **Starting version 3.0, optional modules must be installed explicitly:**<br>
> ➡️ `pip install 'awswrangler[redshift]'`

## Table of contents

- [Quick Start](#quick-start)
- [At Scale](#at-scale)
- [Read The Docs](#read-the-docs)
- [Getting Help](#getting-help)
- [Logging](#logging)

## Quick Start

Installation command: `pip install awswrangler`

```py3
import awswrangler as wr
import pandas as pd
from datetime import datetime

df = pd.DataFrame({"id": [1, 2], "value": ["foo", "boo"]})

# Storing data on Data Lake
wr.s3.to_parquet(
    df=df,
    path="s3://bucket/dataset/",
    dataset=True,
    database="my_db",
    table="my_table"
)

# Retrieving the data directly from Amazon S3
df = wr.s3.read_parquet("s3://bucket/dataset/", dataset=True)

# Retrieving the data from Amazon Athena
df = wr.athena.read_sql_query("SELECT * FROM my_table", database="my_db")

# Get a Redshift connection from Glue Catalog and retrieving data from Redshift Spectrum
con = wr.redshift.connect("my-glue-connection")
df = wr.redshift.read_sql_query("SELECT * FROM external_schema.my_table", con=con)
con.close()

# Amazon Timestream Write
df = pd.DataFrame({
    "time": [datetime.now(), datetime.now()],
    "my_dimension": ["foo", "boo"],
    "measure": [1.0, 1.1],
})
rejected_records = wr.timestream.write(df,
    database="sampleDB",
    table="sampleTable",
    time_col="time",
    measure_col="measure",
    dimensions_cols=["my_dimension"],
)

# Amazon Timestream Query
wr.timestream.query("""
SELECT time, measure_value::double, my_dimension
FROM "sampleDB"."sampleTable" ORDER BY time DESC LIMIT 3
""")
```

## At scale

AWS SDK for pandas can also run your workflows at scale by leveraging [Modin](https://modin.readthedocs.io/en/stable/) and [Ray](https://www.ray.io/). Both projects aim to speed up data workloads by distributing processing over a cluster of workers.

Read our [docs](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/scale.html) or head to our latest [tutorials](https://github.com/aws/aws-sdk-pandas/tree/main/tutorials) to learn more.

## [Read The Docs](https://aws-sdk-pandas.readthedocs.io/)

- [**What is AWS SDK for pandas?**](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/about.html)
- [**Install**](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html)
  - [PyPi (pip)](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#pypi-pip)
  - [Conda](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#conda)
  - [AWS Lambda Layer](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#aws-lambda-layer)
  - [AWS Glue Python Shell Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#aws-glue-python-shell-jobs)
  - [AWS Glue PySpark Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.16.0/install.html#aws-glue-pyspark-jobs)
  - [Amazon SageMa
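
Because optional modules have to be installed explicitly from version 3.0 onward, a script may want to check up front whether the driver behind an extra is importable. The sketch below is illustrative only: the mapping from extras to import names (`redshift_connector`, `pymysql`, `pg8000`) is an assumption about which packages those extras pull in, and `missing_extras` is a hypothetical helper, not part of awswrangler.

```python
import importlib.util

# Assumed mapping from an awswrangler extra to the import name of its driver.
# These names are illustrative assumptions; confirm them in the install docs.
ASSUMED_DRIVERS = {
    "redshift": "redshift_connector",
    "mysql": "pymysql",
    "postgres": "pg8000",
}


def missing_extras(extras, drivers=ASSUMED_DRIVERS):
    """Return the extras whose driver module cannot be found in this environment."""
    missing = []
    for extra in extras:
        module_name = drivers[extra]
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module_name) is None:
            missing.append(extra)
    return missing


if __name__ == "__main__":
    # Print the install command for each extra that appears to be missing.
    for extra in missing_extras(["redshift", "mysql", "postgres"]):
        print(f"pip install 'awswrangler[{extra}]'")
```

Checking with `find_spec` avoids importing the driver itself, so the check stays cheap and has no side effects.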

Release History

| Version | Changes | Urgency | Date |
|---------|---------|---------|------|
| 3.16.1 | ## Notable Changes ⚠️ * pyarrow upgraded from v20.0.0 to v22.0.0 in AWS Lambda layers ⚠️ ### Bugfixes 🐛 * fix(athena): verify bucket ownership and manifest integrity by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3314 ### Security / Dependency Updates 🛡️ * chore(deps): bump cryptography from 46.0.6 to 46.0.7 by @dependabot[bot] in https://github.com/aws/aws-sdk-pandas/pull/3297 * chore(deps): bump uv from 0.10.10 to 0.11.6 by @dependabot[bot] in https://github.com/aws/ | High | 5/7/2026 |
| 3.16.0 | Imported from PyPI (3.16.0) | Low | 4/21/2026 |
| 3.15.1 | ### Security / Dependency Updates 🛡️ * fix: upgrade setuptools due to CVE-2026-23949 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3261 * chore: pyasn1, wheel, filelock security fixes by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3262 * chore: wheel security fix #3262 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3263 * chore: Update dependencies by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3268 ### Housekeeping 🧹 * chore(dep | Low | 2/5/2026 |
| 3.15.0 | ## Notable Changes ⚠️ * fix: upgrade aiohttp due to CVE-2025-69223 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3250 * chore: Build Python 3.14 layers by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3251 * chore: Drop Python 3.9 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3257 ### Features / Enhancements 🚀 * feat(s3): add to_deltalake_streaming for single-commit Delta writes by @skoschik in https://github.com/aws/aws-sdk-pandas/pull/3231 * f | Low | 1/13/2026 |
| 3.14.0 | ## Notable Changes ⚠️ * chore: upgrade pg8000 due to CVE-2025-61385 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3225 ### Features / Enhancements 🚀 * feat: support redshift `CLEANPATH` by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3211 * feat: add result reuse configuration to query execution functions by @DavidKatz-il in https://github.com/aws/aws-sdk-pandas/pull/3212 ### Bugfixes 🐛 * fix: Add `s3_output` parameter to `_start_query_execution` call in " | Low | 10/30/2025 |
| 3.13.0 | ## Notable Changes ⚠️ * updated `aiohttp==3.12.15` to fix CVE-2025-53643 (LOW) by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3197 ### Features / Enhancements 🚀 * feat: ray 2.49.0 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3194 * feat: add support for aurora-mysql and aurora-postgresql engines by @senorcinco in https://github.com/aws/aws-sdk-pandas/pull/3188 ### Bugfixes 🐛 * fix: opensearch session by @kukushking in https://github.com/aws/aws-sdk-pandas | Low | 9/10/2025 |
| 3.12.1 | ## Notable Changes ⚠️ * Moved to [uv package manager](https://github.com/astral-sh/uv) 🔥 🔥 🔥 ### Features / Enhancements 🚀 * feat: uv by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3132 ### Security / Dependency Updates 🛡️ * chore(deps): bump the production-dependencies group with 4 updates by @dependabot in https://github.com/aws/aws-sdk-pandas/pull/3159 * chore(deps): bump the production-dependencies group with 4 updates by @dependabot in https://github.com/aws/ | Low | 6/18/2025 |
| 3.12.0 | ## Notable Changes ⚠️ * AWS Lambda Layers: **pyarrow** was upgraded to 20.0.0 ### Features / Enhancements 🚀 * feat: add pyarrow_additional_kwargs to athena.to_iceberg by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/3094 * feat: add dtype argument to delete_from_iceberg by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/3099 * feat: add redshift and rds data api query params by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3111 * chore: ray 2.45 by @kukus | Low | 5/29/2025 |
| 3.11.0 | ## Notable Changes ⚠️ * AWS SDK for pandas now supports Python 3.13! 🎉 * Python 3.8 is no longer supported (reached [end-of-life](https://devguide.python.org/versions/) Oct 7 2024) 🚫 * AWS Lambda Layers: **pyarrow** was upgraded to 18.1.0 * AWS Lambda Layers: **numpy** was upgraded to 2.2.1 ### Features / Enhancements 🚀 * add support for Python 3.13 & deprecate Python 3.8 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/3045 * return opensearch aggregation top hits by @ku | Low | 1/10/2025 |
| 3.10.1 | ## Bug fixes 🐛 * fix: update references in introduction notebook by @emmanuel-ferdman in https://github.com/aws/aws-sdk-pandas/pull/3009 * fix: read parquet file in chunked mode per row group by @FredericKayser in https://github.com/aws/aws-sdk-pandas/pull/3016 * fix: add missing raise statement in RS Data API by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/3025 ## Documentation 📚 * chore: Prepare 3.10.1 release by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/30 | Low | 12/4/2024 |
| 3.10.0 | ## Features * feat: Support numpy 2.0 by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2944 * feat(redshift): Automatically add new DataFrame columns to Redshift tables during write operation by @jack-dell in https://github.com/aws/aws-sdk-pandas/pull/2948 * feat: modify_refresh_interval flag in opensearch index_documents by @AvihaiSam in https://github.com/aws/aws-sdk-pandas/pull/2980 * feat: support postgresql array types by @kukushking in https://github.com/aws/aws-sdk-p | Low | 10/31/2024 |
| 3.9.1 | ## Bug fixes 🐛 * bucketing error with newer version of Modin (0.31.0) by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2896 * `athena.read_sql_query` failing for time columns by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2895 * add an argument to control handling nulls in merge criteria by @brendan-cook-87 in https://github.com/aws/aws-sdk-pandas/pull/2892 * address Ray deprecation warnings by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas | Low | 8/19/2024 |
| 3.9.0 | ## Enhancements 🎉 * Support ORC and CSV in `redshift.copy_from_files` function by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2849 * Support different merge conditions in `athena.to_iceberg` function by @aldder in https://github.com/aws/aws-sdk-pandas/pull/2861 * Manage `NULL` values in `athena.to_iceberg` merge statement by @aldder in https://github.com/aws/aws-sdk-pandas/pull/2872 * Upgrade Ray to 2.30 by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2870 | Low | 7/8/2024 |
| 3.8.0 | ## Enhancements 🎉 * support client-side parameter resolution in athena.create_ctas_table by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2797 * add commit_transaction to postgres.to_sql by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2795 * add columns parameters support by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2814 * add overwrite_method to `postgresql.to_sql` by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2820 * add u | Low | 6/5/2024 |
| 3.7.3 | ## Bug fixes 🐛 - Iceberg schema evolution fails for map, array and struct types by @LeonLuttenberger in #2755 - trickle down `s3_output` in `athena.to_iceberg` by @jaidisido in #2767 - respect order of columns in `to_iceberg` by @jaidisido in #2768 - add PyArrow `fixed_size_binary` dtype support by @jaidisido in #2775 - Opensearch serverless vector search collections - remove default `_id` by @kukushking in #2784 - missing keys in `list_to_arrow_table` by @kukushking in #2778 - prevent ` | Low | 4/22/2024 |
| 3.7.2 | ## Features/Enhancements 🚀 - Add support for DeltaLake's DynamoDB lock mechanism by @LeonLuttenberger in #2705 ## Bug fixes 🐛 - `wr.athena.to_iceberg` - Insert query has mismatched column types #2678 by @GalvFionic in #2715 - allow `s3_output` in `athena.to_iceberg` by @jaidisido in #2727 - replace deprecated `np.split_array` by @jaidisido in #2735 - Athena `to_iceberg` fails with non-lowercase column names by @LeonLuttenberger in #2736 - Support Ray 2.10 by @kukushking in #2741 ## | Low | 3/27/2024 |
| 3.7.1 | ## Bug fixes 🐛 * fix breaking change in `_create_table` by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2711 * pin pyarrow to version 8 and above by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2709 ## Documentation 📚 * fix `redshift.to_sql` doc indentation error by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2706 **Full Changelog**: https://github.com/aws/aws-sdk-pandas/compare/3.7.0...3.7.1 | Low | 3/7/2024 |
| 3.7.0 | ## Breaking changes 💥 Lake Formation Governed tables are being phased out and we are dropping support (#2692). ## Features/Enhancements 🚀 * support parquet client encryption (#2642) by @Marwen94 in https://github.com/aws/aws-sdk-pandas/pull/2674 ## Bug fixes 🐛 * Index columns removed on s3.to_parquet by @robert-schmidtke in https://github.com/aws/aws-sdk-pandas/pull/2655 * Missing timezone metadata by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2682 * remove enforced | Low | 3/5/2024 |
| 3.6.0 | ## Features/Enhancements 🚀 * Enable Iceberg row deletion & add `mode` parameter to `to_iceberg` by @LeonLuttenberger in #2632 * Add support for pyarrow type `large_string` by @joakibo in #2663 * Add `max_results` to `athena.list_query_executions` by @LeonLuttenberger in #2665 ## Bug fixes 🐛 * Pyarrow 15 imports & remove unused code by @kukushking in #2649 ## New Contributors * @joakibo made their first contribution in https://github.com/aws/aws-sdk-pandas/pull/2663 **Full Changel | Low | 2/14/2024 |
| 3.5.2 | ## Bug fixes 🐛 * DynamoDB key & filter expressions attribute overwrite by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2615 * Allow PostgreSQL reserved keywords as column names by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2619 * Add `to_iceberg` support for filling missing columns in the DataFrame with None by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2616 * Forward `ignore_nulls` for container types by @raaidarshad in #2636 ## Doc | Low | 1/25/2024 |
| 3.5.1 | ## Bug fixes 🐛 * Deserialization error when reading from DynamoDB using `KeyConditionExpression` by @LeonLuttenberger in #2607 * Reading of chunked parquet when columns parameter is specified by @rchromik in #2599 ## Documentation 📚 * Add `show_create_table` to Athena API page by @MikeSchriefer in #2610 ## Other 🤖 * chore: Replace `bump2version` with `bump-my-version` by @LeonLuttenberger in #2608 * chore(deps-dev): bump jinja2 from 3.1.2 to 3.1.3 by @dependabot in #2609 * chore(d | Low | 1/12/2024 |
| 3.5.0 | ## Breaking changes 💥 Due to [CVEs](https://www.anyscale.com/blog/update-on-ray-cves-cve-2023-6019-cve-2023-6020-cve-2023-6021-cve-2023-48022-cve-2023-48023), Ray is capped to patched version 2.9.x. As a result, the latest version of the library cannot be used on the Glue for Ray runtime. We have raised the CVEs issue to the Glue team ## Features/Enhancements 🚀 * Add `spark_properties` to athena spark by @rajagurunath in https://github.com/aws/aws-sdk-pandas/pull/2508 * Add `MERGE INTO` | Low | 1/11/2024 |
| 3.4.2 | ## Features/Enhancements 🚀 * Update pyarrow to 14.0.1 to fix [arbitrary code execution security vulnerability](https://github.com/aws/aws-sdk-pandas/security/dependabot/35) **Full Changelog**: https://github.com/aws/aws-sdk-pandas/compare/3.4.1...3.4.2 | Low | 11/13/2023 |
| 3.4.1 | ## Features/Enhancements 🚀 * feat: Add schema evolution to `athena.to_iceberg` by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2465 * feat: Athena - add `client_request_token` by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2474 * feat: Redshift data api - allow all auth combinations by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2475 * feat: add columns comments to iceberg by @frenchytheasian in https://github.com/aws/aws-sdk-pandas/pull/2482 | Low | 10/24/2023 |
| 3.4.0 | ## Features/Enhancements 🚀 * Geospatial - parse Athena geospatial types via geopandas by @kukushking in #2346 * Allow group identifiers to be used in `wr.cloudwatch` queries by @LeonLuttenberger in #2430 * Add ignore null store parquet metadata by @raaidarshad in #2450 ## Bug fixes 🐛 * Add missing boto3 session in `athena.to_iceberg` wait_query by @jaidisido in #2428 * Add catalog ID in `athena.to_iceberg` by @jaidisido in #2446 * Return None for missing column and partition key comm | Low | 9/11/2023 |
| 3.3.0 | ## Features/Enhancements 🚀 * Support Athena query prepared statements & Athena parameterized queries by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2344 * Add dtype parameter in to_iceberg function by @paulobrunheroto in https://github.com/aws/aws-sdk-pandas/pull/2359 * Add CleanRooms read module by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/2366 * Escape and validate table identifiers and literals in PostgreSQL by @kukushking in https://github.com/aws/aws- | Low | 8/1/2023 |
| 3.2.1 | ## Fixes 🛠️ * Fix error where library could not be imported on Windows due to `No module named 'pyarrow._orc'` by @LeonLuttenberger in #2341 #2337 * Lower `packaging` version requirement by @LeonLuttenberger in #2340 * Allow Ray 2.5 & downgrade tox by @kukushking in #2338 **Full Changelog**: https://github.com/aws/aws-sdk-pandas/compare/3.2.0...3.2.1 | Low | 6/14/2023 |
| 3.2.0 | ### Features/Enhancements 🚀 * Add `s3.read_orc` and `s3.to_orc` by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2312 🔥 * Apache Spark on Amazon Athena - `wr.athena.create_spark_session` & `wr.athena.run_spark_calculation` by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2314 🚀 * EMR Serverless by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2304 🔥 * Add `to_sql` for RDS Data API by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pu | Low | 6/13/2023 |
| 3.1.1 | ## What's Changed * fix: Add missing `packaging` dependency by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2281 **Full Changelog**: https://github.com/aws/aws-sdk-pandas/compare/3.1.0...3.1.1 | Low | 5/16/2023 |
| 3.1.0 | ### Features/Enhancements 🚀 * Add `neptune.bulk_load` for bulk loading data into Neptune by @LeonLuttenberger in #2238 #2267 * Add `s3.to_deltalake` function by @LeonLuttenberger in #2228 * Add Timestream Batch Load support by @jaidisido in #2214 * Add Iceberg insert by @kukushking in #2233 * Support upsert mode for OracleDB by @LeonLuttenberger in #2265 * Add `chunked` parameter to DynamoDB read functions by @LeonLuttenberger in #2227 * Upgrade Modin to 0.20.1 & allow Ray 2.4 by @kuku | Low | 5/15/2023 |
| 3.0.0 | ### Breaking changes 💥 * Move dependencies to optional by @jaidisido in #1992 🔓 * Dependencies required by the following modules have been moved to optional: redshift, mysql, postgres, sqlserver, oracle, gremlin, sparql, deltalake * The required dependencies can be easily installed with `pip install awswrangler[<MODULE_NAME>]`, for example `pip install awswrangler[redshift]` * Change SQL formatters for Athena and LakeFormation so that they properly format types by @Taragolis and @Le | Low | 4/13/2023 |
| 2.20.1 | ## What's Changed * (fix) Timestream - ignore None, NaN, and NaT measure values by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2072 * (docs) Minor - update opensearch api docs by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2085 * Correct documentation for `chunksize=True` by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/2087 * fix: timestream empty batches by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/2098 * enhancement: Add times | Low | 3/21/2023 |
| 3.0.0rc3 | ## What's Changed ### Breaking changes: * breaking change: Move dependencies to optional by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1992 * breaking change: Use ExecuteStatement instead of Scan for DynamoDB read_partiql by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1964 ### Features/Enhancements: * enhancement: Refactor engine switching when Ray is installed by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1792 * logging: Enable user to | Low | 3/9/2023 |
| 2.20.0 | ### Breaking changes - `dynamodb.read_partiql` no longer performs a Scan operation under the hood. Instead the `ExecuteStatement` API is used. It means that the `PartiQL*` IAM permission is required instead of `Scan` ### Noteworthy * (feat): opensearch serverless by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1922. See the [tutorial](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/035%20-%20OpenSearch%20Serverless.ipynb) 🔥 * (breaking change): Use `ExecuteStateme | Low | 3/1/2023 |
| 2.19.0 | ## Noteworthy * Glue Data Quality now supported, check out the [tutorial](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/034%20-%20Glue%20Data%20Quality.ipynb) 🔥 * Delta lake support by @fvaleye * New DynamoDB `read_items` method by @a-slice-of-py ## Features & enhancements * feat: add read_items to dynamodb module by @a-slice-of-py in https://github.com/aws/aws-sdk-pandas/pull/1877 * Add deltalake support in AWS S3 with Pandas by @fvaleye in https://github.com/aws/aws-sdk-pa | Low | 1/9/2023 |
| 2.18.0 | ## Noteworthy - Pyarrow 10 support 🔥 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1731 - Lambda layers now available in `af-south-1` (Cape Town) 🌍 by @malachi-constant ## Features & enhancements - Add unload_approach to athena.read_sql_table by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1634 - Pass additional partition projection params to wr.s3.to_parquet & cat… by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1627 - Regenerate poetry.lock wi | Low | 12/2/2022 |
| 3.0.0rc2 | ## What's Changed * (enhancement): Enable missing unit tests and Redshift, Athena, LF load tests by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1736 * (enhancement): configure scheduling options, remove dependencies on internal ray impl by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1734 * (testing): Enable Athena and Redshift tests, and address errors by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1721 * (feat): Make tqdm progress reporting opt | Low | 11/23/2022 |
| 3.0.0rc1 | ## What's Changed * (enhancement): Move RayLogger out of non-distributed modules by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1686 * (perf): Distribute data types inference by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1692 * (docs): Update config tutorial to include new configuration values by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1696 * (fix): partition block overwriting by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1695 | Low | 10/27/2022 |
| 3.0.0b3 | ## What's Changed * (feat): Add partitioning on block level by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1653 * (refactor): Make room for additional distributed engines by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1646 * (feat): Distribute s3 write text by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1631 * (docs): Add "Introduction to Ray" Tutorial by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1661 * (fix): Return addre | Low | 10/12/2022 |
| 3.0.0b2 | ## What's Changed * (feat) Update to Ray 2.0 by @kukushking in https://github.com/aws/aws-sdk-pandas/pull/1635 * (feat) Ray logging by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/1623 * (enhancement): Reduce LOC in S3 write methods create_table by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1626 * (docs) Tutorial: Run SDK for pandas job on ray cluster by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/1616 **Full Changelog**: https://github | Low | 9/30/2022 |
| 3.0.0b1 | ## What's Changed * (test) Consolidate unit and load tests by @jaidisido in https://github.com/aws/aws-sdk-pandas/pull/1525 * (feat) Distribute S3 read text by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1567 * (feat) Distribute s3 wait_objects by @LeonLuttenberger in https://github.com/aws/aws-sdk-pandas/pull/1539 * (test) Ray Load Tests CDK Stack and Instructions for Load Testing by @malachi-constant in https://github.com/aws/aws-sdk-pandas/pull/1583 * (fix) Fix S3 rea | Low | 9/22/2022 |
| 2.17.0 | ## New Functionalities - RedshiftDataAPI serverless support 🔥 #1530 - Check out the [tutorial](https://aws-sdk-pandas.readthedocs.io/en/latest/tutorials/030%20-%20Data%20Api.html) - Add `get_query_results` to the Athena module #1496 - Check out the [function documentation](https://aws-sdk-pandas.readthedocs.io/en/latest/stubs/awswrangler.athena.get_query_results.html#awswrangler.athena.get_query_results) - Add `generate_create_query` to the Athena module #1514 - Check out the | Low | 9/20/2022 |
| 3.0.0a2 | This is a pre-release for the Wrangler@Scale project ## What's Changed * (feat): Add directory for Distributed Wrangler Load Tests by @malachi-constant in https://github.com/awslabs/aws-data-wrangler/pull/1464 * (CI): Distribute tests in tox config by @malachi-constant in https://github.com/awslabs/aws-data-wrangler/pull/1469 * (feat): Distribute s3 delete objects by @malachi-constant in https://github.com/awslabs/aws-data-wrangler/pull/1474 * (CI): Enable new CI pipeline for standard & d | Low | 8/17/2022 |
| 3.0.0a1 | This is a pre-release for the Wrangler@Scale project ## What's Changed * (feat): Add distributed config flag and initialise method by @jaidisido in https://github.com/awslabs/aws-data-wrangler/pull/1389 * (feat): Add distributed Lake Formation read by @jaidisido in https://github.com/awslabs/aws-data-wrangler/pull/1397 * (feat): Distribute S3 select over multiple paths and scan ranges by @jaidisido in https://github.com/awslabs/aws-data-wrangler/pull/1445 * (refactor): Refactor threading/ | Low | 8/17/2022 |
| 2.16.1 | ### Noteworthy > 🐛 Fixed issue introduced by `2.16.0` to method `s3.read_parquet()` ### Patch - Fix bug: pq_file.schema.names(): TypeError: 'list' object is not callable `s3.read_parquet()` #1412 --- ***P.S.*** The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. [Just upload it and run](https://aws-data-wrangler.readthedocs.io/en/stable/install.html) or [use](https://aws-data-wrangler.readthedocs.io/en/2.16.1/install.html#public-artifacts) them fro | Low | 6/28/2022 |
| 2.16.0 | ### Noteworthy > ⚠️ **For platforms without PyArrow 7 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):**<br> ➡️ `pip install pyarrow==2 awswrangler` ### New Functionalities - Add support for Oracle Database 🔥 #1259 Check out the [tutorial](https://aws-data-wrangler.readthedocs.io/en/latest/tutorials/007%20-%20Redshift%2C%20M | Low | 6/22/2022 |
| 2.15.1 | ### Noteworthy > ⚠️ Dropped Python 3.6 support > ⚠️ **For platforms without PyArrow 7 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):**<br> ➡️ `pip install pyarrow==2 awswrangler` ### Patch - Add `sparql` extra & make `SPARQLWrapper` dependency optional #1252 --- ***P.S.*** The AWS Lambda Layer file (.zip) and the | Low | 4/11/2022 |
| 2.15.0 | ### Noteworthy > ⚠️ Dropped Python 3.6 support > ⚠️ **For platforms without PyArrow 7 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):**<br> ➡️ `pip install pyarrow==2 awswrangler` ### New Functionalities - Amazon Neptune module 🚀 #1084 Check out the [tutorial](https://aws-data-wrangler.readthedocs.io/en/latest/tutorials/ | Low | 3/28/2022 |
| 2.14.0 | ### Caveats > ⚠️ **For platforms without PyArrow 6 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):**<br> ➡️ `pip install pyarrow==2 awswrangler` ### New Functionalities - Support Athena Unload 🚀 #1038 ### Enhancements - Add the `ExcludeColumnSchema=True` argument to the glue.get_partitions call to reduce response si | Low | 1/28/2022 |
| 2.13.0 | ### Caveats > ⚠️ **For platforms without PyArrow 6 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):**<br> ➡️ `pip install pyarrow==2 awswrangler` ### Breaking changes - Fix sanitize methods to align with Glue/Hive naming conventions #579 ### New Functionalities - AWS Lake Formation Governed Tables 🚀 #570 - Support for | Low | 12/3/2021 |
| 2.12.1 | ### Caveats > ⚠️ **For platforms without PyArrow 5 support (e.g. MWAA, [EMR](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/stable/install.html#aws-glue-pyspark-jobs)):**<br> ➡️ `pip install pyarrow==2 awswrangler` ### Patch - Removing unnecessary dev dependencies from main #961 --- ***P.S.*** The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. [Just uploa | Low | 10/18/2021 |
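
The 2.20.0 entry notes that `dynamodb.read_partiql` now goes through the `ExecuteStatement` API, so the caller needs a `PartiQL*` IAM permission rather than `Scan`. As a rough sketch of what that might look like, the snippet below builds an IAM policy document as a Python dict. The table ARN and the helper name `partiql_read_policy` are hypothetical, and picking `dynamodb:PartiQLSelect` (the read-only member of the `PartiQL*` family) is an assumption about what a read-only workload needs.

```python
import json


def partiql_read_policy(table_arn):
    """Build a minimal IAM policy document allowing PartiQL SELECT on one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: read-only access; the release note only says `PartiQL*`.
                "Action": ["dynamodb:PartiQLSelect"],
                "Resource": [table_arn],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical account ID and table name, for illustration only.
    arn = "arn:aws:dynamodb:us-east-1:123456789012:table/my_table"
    print(json.dumps(partiql_read_policy(arn), indent=2))
```

A role that previously granted only `dynamodb:Scan` would, under this reading of the release note, need a statement along these lines before upgrading past 2.20.0.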

Similar Packages

- sagemaker-studio: Python library to interact with Amazon SageMaker Unified Studio (1.1.13)
- alibabacloud-adb20211201: Alibaba Cloud adb (20211201) SDK Library for Python (master@2026-06-06)
- ydb: YDB Python SDK (3.29.1)
- sagemaker: Open source library for training and deploying models on Amazon SageMaker. (v3.13.0)
- typer: Typer, build great CLIs. Easy to code. Based on Python type hints. (0.26.7)

More from Amazon Web Services

- aws-cdk-cloud-assembly-schema: Schema for the protocol between CDK framework and CDK CLI
- boto3: The AWS SDK for Python
- aws-lambda-powertools: Powertools for AWS Lambda (Python) is a developer toolkit to implement Serverless best practices and increase developer velocity.
- sagemaker: Open source library for training and deploying models on Amazon SageMaker.

More in Databases

- orbit: One API for 20+ LLM providers, your databases, and your files - self-hosted, open-source AI gateway with RAG, voice, and guardrails.
- alibabacloud-adb20211201: Alibaba Cloud adb (20211201) SDK Library for Python
- milvus: Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
- qdrant: Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/