# pathway

> Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

- **URL**: https://www.freshcrate.ai/projects/pathway
- **Author**: pathwaycom
- **Category**: Frameworks
- **Latest version**: `v0.31.0` (2026-05-25)
- **License**: NOASSERTION
- **Source**: https://github.com/pathwaycom/pathway
- **Homepage**: https://pathway.com
- **Language**: Python
- **GitHub**: 63,440 stars, 1,624 forks
- **Registry**: github
- **Tags**: `batch-processing`, `data-analytics`, `data-pipelines`, `data-processing`, `dataflow`, `etl`, `etl-framework`, `iot-analytics`, `python`

## Recent releases

| Version | Date | Urgency | Changes |
| --- | --- | --- | --- |
| `v0.31.0` | 2026-05-25 | High | ### Added - `pw.io.sqlite.write` connector, which writes a Pathway table into a SQLite database file. Supports two modes: `stream_of_changes` (default) appends each event alongside `time`/`diff` metadata columns, while `snapshot` maintains the current state of the table via `INSERT ... ON CONFLICT DO UPDATE` on insertions and `DELETE` on retractions, keyed on the `primary_key` parameter. Values are encoded using the same storage-class mapping that `pw.io.sqlite.read` accepts, so `write` / `read |
| `v0.30.1` | 2026-04-23 | High | ### Added - `pw.io.rabbitmq.read` and `pw.io.rabbitmq.write` connectors for reading from and writing to RabbitMQ Streams. Supports JSON, plaintext, and raw formats; streaming and static modes; persistence with offset recovery; dynamic topics (writing to different streams per row); `start_from` parameter (`"beginning"`, `"end"`, or `"timestamp"`); TLS configuration; and message metadata including AMQP 1.0 properties and application properties. Header values are JSON-encoded for round-trip compat |
| `v0.30.0` | 2026-03-24 | Medium | ### Added - `pw.io.mongodb.read` connector, which reads data from a MongoDB collection. The connector first delivers a full snapshot of the collection and then, if the streaming mode is used, subscribes to the change stream to receive incremental updates in real time. - `pw.io.postgres.read` connector, which reads data from a PostgreSQL table directly by parsing the Write-Ahead Log (WAL). - `pw.io.postgres.write` and `pw.io.postgres.read` now support serialization/deserialization of `np.ndarr |
| `v0.29.1` | 2026-02-16 | Low | ### Added - `pw.io.kafka.read` and `pw.io.kafka.write` connectors now support OAUTHBEARER authentication. - `pw.io.mongodb.write` connector now supports an `output_table_type` parameter with two modes: `stream_of_changes` (default) and `snapshot`. In `snapshot` mode, the connector maintains the current state of the Pathway table in MongoDB using the `_id` field as the primary key, while `stream_of_changes` preserves the existing behavior by writing all events with `time` and `diff` flags to re |
| `v0.29.0` | 2026-01-22 | Low | ### Added - Pathway Web Dashboard providing a user-friendly interface for monitoring Pathway pipelines in real time with interactive graph plotting and latency/memory metrics. - `pw.io.kafka.read` now includes message headers in the parsed metadata. The headers are available at the top level of the metadata in the `headers` array. Each element of the array is a pair consisting of a string header name and a base64-encoded header value. If the header is null, the corresponding value is also null. |
| `v0.28.0` | 2026-01-08 | Low | ### Added - `pw.io.kafka.read` and `pw.io.redpanda.read` now allow each schema field to be specified as coming from either the message key or the message value. - Connector groups now support the specification of an idle duration. When this is set, if a source does not provide any data for the specified period of time, it will be excluded from the group until it produces data again. - It is now possible to assign priorities to sources within a connector group. When a priority is set, it ensur |
| `v0.27.1` | 2025-12-08 | Low | ### Added - `pw.Table.filter_out_results_of_forgetting` method, which makes it possible to revert the effects of forgetting at a later stage.  ### Changed - The MCP server `tool` method now accepts an optional `description`; by default, the handler's docstring is used. - `pw.io.kafka.read` and `pw.io.redpanda.read` now create a `key` column storing the contents of the message keys. |
| `v0.27.0` | 2025-11-13 | Low | ### Added - JetStream extension is now supported in both NATS read and write connectors. - The Iceberg connectors now support Glue as a catalog backend. - New `Table.add_update_timestamp_utc` function for tracking update time of rows in the table  ### Changed - **BREAKING** The API for the Iceberg connectors has changed. The `catalog` parameter is now required in both `pw.io.iceberg.read` and `pw.io.iceberg.write`. This parameter can be either of type `pw.io.iceberg.RestCatalog` or `pw.io. |
| `v0.26.4` | 2025-10-16 | Low | ### Added - New external integration with [Qdrant](https://qdrant.tech/). - `pw.io.mysql.write` method for writing to MySQL. It supports two output table types: stream of changes and a realtime-updated data snapshot.  ### Changed - `pw.io.deltalake.read` now accepts the `start_from_timestamp_ms` parameter for non-append-only tables. In this case, the connector will replay the history of changes in the table version by version starting from the state of the table at the given timestamp. The |
| `v0.26.3` | 2025-10-03 | Low | ### Added - New parser `pathway.xpacks.llm.parsers.PaddleOCRParser` supporting parsing of PDF, PPTX and images. |
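
The `v0.31.0` entry distinguishes two output modes for `pw.io.sqlite.write`: `stream_of_changes` appends every event with `time`/`diff` columns, while `snapshot` upserts on insertion and deletes on retraction. The sketch below illustrates that difference using only Python's standard `sqlite3` module. It does not call Pathway and is not Pathway's implementation; the table names, column names, helper functions, and the `EVENTS` sample are all hypothetical, and it assumes an update arrives as a retraction followed by an insertion.

```python
import sqlite3

# Hypothetical change events: (key, value, time, diff).
# diff = +1 is an insertion, diff = -1 is a retraction.
EVENTS = [
    (1, "a", 0, 1),
    (2, "b", 0, 1),
    (1, "a", 2, -1),   # an update is modeled as a retraction ...
    (1, "a2", 2, 1),   # ... followed by an insertion of the new value
    (2, "b", 4, -1),   # a plain deletion
]


def apply_stream_of_changes(conn, events):
    """Append every event as-is, keeping the time/diff metadata."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changes "
        "(id INTEGER, value TEXT, time INTEGER, diff INTEGER)"
    )
    conn.executemany("INSERT INTO changes VALUES (?, ?, ?, ?)", events)
    conn.commit()


def apply_snapshot(conn, events):
    """Maintain only the current state, keyed on the primary key `id`."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshot (id INTEGER PRIMARY KEY, value TEXT)"
    )
    for key, value, _time, diff in events:
        if diff > 0:
            # Insertion: upsert on the primary key.
            conn.execute(
                "INSERT INTO snapshot (id, value) VALUES (?, ?) "
                "ON CONFLICT(id) DO UPDATE SET value = excluded.value",
                (key, value),
            )
        else:
            # Retraction: remove the row for that key.
            conn.execute("DELETE FROM snapshot WHERE id = ?", (key,))
    conn.commit()
```

Applied to the same events, the `changes` table keeps the full history (five rows here), whereas the `snapshot` table ends up holding only the rows that are still present.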

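The `v0.29.0` entry states that `pw.io.kafka.read` exposes message headers as a `headers` array of pairs, each holding a string name and a base64-encoded value (or null). A minimal sketch of decoding such an array is shown below. It assumes the metadata has already been converted to a plain Python `dict`; the `decode_headers` helper and the sample metadata are hypothetical and not part of Pathway.

```python
import base64


def decode_headers(metadata):
    """Return (name, raw_bytes_or_None) pairs from a metadata dict.

    Assumes `metadata["headers"]` is a list of [name, base64_value] pairs,
    where the value may be None for a null header.
    """
    decoded = []
    for name, value in metadata.get("headers", []):
        raw = None if value is None else base64.b64decode(value)
        decoded.append((name, raw))
    return decoded


# Hypothetical metadata, shaped as described in the release note.
SAMPLE_METADATA = {"headers": [["trace-id", "YWJj"], ["empty", None]]}
```

Calling `decode_headers(SAMPLE_METADATA)` yields the header names paired with their raw bytes, leaving null values as `None`.
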
## Citation

- HTML: https://www.freshcrate.ai/projects/pathway
- Markdown: https://www.freshcrate.ai/projects/pathway.md
- Dependencies JSON: https://www.freshcrate.ai/api/projects/pathway/deps

_Generated by freshcrate.ai. Indexes GitHub releases for AI-agent ecosystem packages._
