# waybackpy

> Python package that interfaces with the Internet Archive's Wayback Machine APIs. Archive pages and retrieve archived pages easily.

- **URL**: https://www.freshcrate.ai/projects/waybackpy
- **Author**: Akash Mahanty
- **Category**: Frameworks
- **Latest version**: `3.0.6` (2026-04-21)
- **License**: MIT
- **Source**: https://github.com/akamhy/waybackpy/wiki
- **Homepage**: https://akamhy.github.io/waybackpy/
- **Language**: Python
- **GitHub**: 572 stars, 40 forks
- **Registry**: pypi (`waybackpy`)
- **Tags**: `archive`, `internet`, `machine`, `pypi`, `wayback`, `website`

## Description

<!-- markdownlint-disable MD033 MD041 -->
<div align="center">

<img src="https://raw.githubusercontent.com/akamhy/waybackpy/master/assets/waybackpy_logo.svg"><br>

<h3>A Python package & CLI tool that interfaces with the Wayback Machine API</h3>

</div>

<p align="center">
<a href="https://github.com/akamhy/waybackpy/actions?query=workflow%3ATests"><img alt="Unit Tests" src="https://github.com/akamhy/waybackpy/workflows/Tests/badge.svg"></a>
<a href="https://codecov.io/gh/akamhy/waybackpy"><img alt="codecov" src="https://codecov.io/gh/akamhy/waybackpy/branch/master/graph/badge.svg"></a>
<a href="https://pypi.org/project/waybackpy/"><img alt="pypi" src="https://img.shields.io/pypi/v/waybackpy.svg"></a>
<a href="https://pepy.tech/project/waybackpy?versions=2*&versions=1*&versions=3*"><img alt="Downloads" src="https://pepy.tech/badge/waybackpy/month"></a>
<a href="https://app.codacy.com/gh/akamhy/waybackpy?utm_source=github.com&utm_medium=referral&utm_content=akamhy/waybackpy&utm_campaign=Badge_Grade_Settings"><img alt="Codacy Badge" src="https://api.codacy.com/project/badge/Grade/6d777d8509f642ac89a20715bb3a6193"></a>
<a href="https://github.com/akamhy/waybackpy/commits/master"><img alt="GitHub latest commit" src="https://img.shields.io/github/last-commit/akamhy/waybackpy?color=blue&style=flat-square"></a>
<a href="#"><img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/waybackpy?style=flat-square"></a>
<a href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
</p>

---

# <img src="https://github.githubassets.com/images/icons/emoji/unicode/2b50.png" width="30"></img> Introduction

Waybackpy is a Python package and a CLI tool that interfaces with the Wayback Machine APIs.

The Wayback Machine has three client-side APIs:

- SavePageNow or Save API
- CDX Server API
- Availability API

All three APIs can be accessed through waybackpy, either by importing it in a Python file/module or from the command-line interface.
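
To make the distinction between the three APIs concrete, here is a minimal sketch that builds the request URL for each one using only the standard library. The endpoints are the publicly documented Wayback Machine ones; the helper names (`save_request_url`, `cdx_request_url`, `availability_request_url`) are hypothetical and are not part of waybackpy, and how waybackpy issues these requests internally is not shown here.

```python
from urllib.parse import urlencode

# Publicly documented Wayback Machine endpoints. Whether waybackpy builds
# its requests exactly like this is an assumption of this sketch.
SAVE_ENDPOINT = "https://web.archive.org/save/"
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def save_request_url(url: str) -> str:
    """SavePageNow: the target URL is appended to the save endpoint."""
    return SAVE_ENDPOINT + url


def cdx_request_url(url: str, limit: int = 5) -> str:
    """CDX Server API: list captures of `url`, returning at most `limit` rows."""
    return CDX_ENDPOINT + "?" + urlencode({"url": url, "limit": limit})


def availability_request_url(url: str, timestamp: str = "") -> str:
    """Availability API: look up the capture closest to `timestamp`."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


print(save_request_url("https://github.com"))
print(cdx_request_url("google.com", limit=1))
print(availability_request_url("google.com", "20220118"))
```

The functions only construct strings, so running the sketch sends no requests to the Internet Archive.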

## <img src="https://github.githubassets.com/images/icons/emoji/unicode/1f3d7.png" width="20"></img> Installation

**Using [pip](https://en.wikipedia.org/wiki/Pip_(package_manager)), from [PyPI](https://pypi.org/) (recommended)**:

```bash
pip install waybackpy
```

**Using [conda](https://en.wikipedia.org/wiki/Conda_(package_manager)), from [conda-forge](https://anaconda.org/conda-forge/waybackpy) (recommended)**:

See also the [waybackpy feedstock](https://github.com/conda-forge/waybackpy-feedstock). Its maintainers are [@rafaelrdealmeida](https://github.com/rafaelrdealmeida/),
 [@labriunesp](https://github.com/labriunesp/)
 and [@akamhy](https://github.com/akamhy/).

```bash
conda install -c conda-forge waybackpy
```

**Install directly from [this git repository](https://github.com/akamhy/waybackpy) (NOT recommended)**:

```bash
pip install git+https://github.com/akamhy/waybackpy.git
```
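
Whichever installation method is used, one way to confirm that the package is visible to the current interpreter is to query its installed version. The sketch below relies only on the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "waybackpy") -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if waybackpy is installed, otherwise None.
print(installed_version())
```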

## <img src="https://github.githubassets.com/images/icons/emoji/unicode/1f433.png" width="20"></img> Docker Image

Docker Hub: [hub.docker.com/r/secsi/waybackpy](https://hub.docker.com/r/secsi/waybackpy)

The Docker image is automatically updated on every release by [Regularly and Automatically Updated Docker Images](https://github.com/cybersecsi/RAUDI) (RAUDI).

RAUDI is a tool by [SecSI](https://secsi.io), an Italian cybersecurity startup.

## <img src="https://github.githubassets.com/images/icons/emoji/unicode/1f680.png" width="20"></img> Usage

### As a Python package

#### Save API aka SavePageNow

```python
>>> from waybackpy import WaybackMachineSaveAPI
>>> url = "https://github.com"
>>> user_agent = "Mozilla/5.0 (Windows NT 5.1; rv:40.0) Gecko/20100101 Firefox/40.0"
>>>
>>> save_api = WaybackMachineSaveAPI(url, user_agent)
>>> save_api.save()
https://web.archive.org/web/20220118125249/https://github.com/
>>> save_api.cached_save
False
>>> save_api.timestamp()
datetime.datetime(2022, 1, 18, 12, 52, 49)
```
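
The archive URL returned by `save()` embeds the capture time as a 14-digit `YYYYMMDDhhmmss` timestamp, which is why the output above pairs `.../web/20220118125249/...` with `datetime.datetime(2022, 1, 18, 12, 52, 49)`. The following is a minimal standard-library sketch of that relationship, not waybackpy's implementation. The names `parse_archive_url` and `looks_cached` are hypothetical. The three-minute threshold is taken from the 2.4.2 release note on `cached_save`, and it is an assumption that later versions behave the same way.

```python
import re
from datetime import datetime, timedelta

# Matches the 14-digit capture timestamp in a Wayback Machine archive URL.
ARCHIVE_URL_PATTERN = re.compile(r"https?://web\.archive\.org/web/(\d{14})/(.+)")


def parse_archive_url(archive_url: str):
    """Split an archive URL into (capture time, original URL)."""
    match = ARCHIVE_URL_PATTERN.match(archive_url)
    if match is None:
        raise ValueError(f"not a Wayback Machine archive URL: {archive_url}")
    captured_at = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured_at, match.group(2)


def looks_cached(captured_at: datetime, requested_at: datetime) -> bool:
    """Hypothetical check: treat a capture older than 3 minutes as cached."""
    return requested_at - captured_at > timedelta(minutes=3)


captured_at, original = parse_archive_url(
    "https://web.archive.org/web/20220118125249/https://github.com/"
)
print(captured_at, original)
```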

#### CDX API aka CDXServerAPI

```python
>>> from waybackpy import WaybackMachineCDXServerAPI
>>> url = "https://google.com"
>>> user_agent = "my new app's user agent"
>>> cdx_api = WaybackMachineCDXServerAPI(url, user_agent)
```

##### oldest

```python
>>> cdx_api.oldest()
com,google)/ 19981111184551 http://google.com:80/ text/html 200 HOQ2TGPYAEQJPNUA6M4SMZ3NGQRBXDZ3 381
>>> oldest = cdx_api.oldest()
>>> oldest
com,google)/ 19981111184551 http://google.com:80/ text/html 200 HOQ2TGPYAEQJPNUA6M4SMZ3NGQRBXDZ3 381
>>> oldest.archive_url
'https://web.archive.org/web/19981111184551/http://google.com:80/'
>>> oldest.original
'http://google.com:80/'
>>> oldest.urlkey
'com,google)/'
>>> oldest.timestamp
'19981111184551'
>>> oldest.datetime_timestamp
datetime.datetime(1998, 11, 11, 18, 45, 51)
>>> oldest.statuscode
'200'
>>> oldest.mimetype
'text/html'
```
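
The printed snapshot above is a single space-separated CDX record, and the attributes shown correspond to its fields. As an illustration only, the sketch below parses such a line with the standard library and derives `archive_url` and `datetime_timestamp` from it. The `Snapshot` dataclass and `parse_cdx_line` here are hypothetical stand-ins, not waybackpy's own classes. The field order follows the usual CDX server layout (urlkey, timestamp, original, mimetype, statuscode, digest, length).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Snapshot:
    """Illustrative stand-in for the object returned by oldest() (hypothetical)."""

    urlkey: str
    timestamp: str
    original: str
    mimetype: str
    statuscode: str
    digest: str
    length: str

    @property
    def datetime_timestamp(self) -> datetime:
        # The CDX timestamp is a 14-digit YYYYMMDDhhmmss string.
        return datetime.strptime(self.timestamp, "%Y%m%d%H%M%S")

    @property
    def archive_url(self) -> str:
        # Archive URLs combine the timestamp with the originally captured URL.
        return f"https://web.archive.org/web/{self.timestamp}/{self.original}"


def parse_cdx_line(line: str) -> Snapshot:
    """Parse one space-separated CDX record with the seven default fields."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 CDX fields, got {len(fields)}")
    return Snapshot(*fields)


oldest = parse_cdx_line(
    "com,google)/ 19981111184551 http://google.com:80/ text/html 200 "
    "HOQ2TGPYAEQJPNUA6M4SMZ3NGQRBXDZ3 381"
)
print(oldest.archive_url)
print(oldest.datetime_timestamp)
```

All fields are kept as strings, which matches the quoted `'200'` status code in the session above.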

##### newest

```python
>>> newest = cdx_api.newest()
>>> newest
com,google)/ 20220217234427 http://@google.com/ text/html 301 Y6PVK4XWOI3BXQEXM5WLLWU5JKUVNSFZ 563
```

## Recent releases

| Version | Date | Urgency | Changes |
| --- | --- | --- | --- |
| `3.0.6` | 2026-04-21 | Low | Imported from PyPI (3.0.6) |
| `3.0.5` | 2022-02-18 | Low | **What's Changed**<br>- undo drop python3.6 by @akamhy in https://github.com/akamhy/waybackpy/pull/163<br>**Full Changelog**: https://github.com/akamhy/waybackpy/compare/3.0.4...3.0.5 |
| `3.0.4` | 2022-02-18 | Low | **What's Changed**<br>- Move metadata from `__init__.py` into `setup.cfg` by @eggplants in https://github.com/akamhy/waybackpy/pull/153<br>- add sort param support in CDX API class by @akamhy in https://github.com/akamhy/waybackpy/pull/156<br>- Add sort, use_pagination and closest by @akamhy in https://github.com/akamhy/waybackpy/pull/158<br>- Cdx based oldest newest and near by @akamhy in https://github.com/akamhy/waybackpy/pull/159<br>**Full Changelog**: https://github.com/akamhy/waybackpy/compare/3.0.3. |
| `3.0.3` | 2022-02-09 | Low | **What's Changed**<br>- Dropped Python 3.4 to 3.6, both inclusive.<br>- Catch 429 and 509 status code for save page now API<br>- Increase the default CDX limit from 5000 to 25000 records per API call.<br>- Added type hint<br>- The package will now close the sessions explicitly.<br>- Removed useless code.<br>- Added docstrings.<br>**New Contributors**<br>- @eggplants made their first contribution in https://github.com/akamhy/waybackpy/pull/124<br>- @deepsource-autofix made their first contribution in https://github. |
| `3.0.2` | 2022-01-25 | Low | Nothing changed wrt to the previous version but creating a release for Conda forge.<br>Replace the NON-ASCII character figlet with ASCII character figlet.<br>see https://github.com/conda-forge/staged-recipes/pull/17643 |
| `3.0.1` | 2022-01-25 | Low | **What's Changed**<br>- escape '.' before 'archive.org' by @akamhy in https://github.com/akamhy/waybackpy/pull/112<br>- Update setup.py by @rafaelrdealmeida in https://github.com/akamhy/waybackpy/pull/114<br>- do not use f-strings in setup.py by @akamhy in https://github.com/akamhy/waybackpy/pull/115<br>**New Contributors**<br>- @rafaelrdealmeida made their first contribution in https://github.com/akamhy/waybackpy/pull/114<br>See also https://github.com/conda-forge/staged-recipes/pull/17634 and https:// |
| `3.0.0` | 2022-01-18 | Low | **What's Changed**<br>- 3 different APIs have now 3 different classes, `WaybackMachineCDXServerAPI`, `WaybackMachineSaveAPI` and `WaybackMachineAvailabilityAPI`.<br>- CLI now supports the CDX API.<br>- The past Url class will be continued to be supported, don't need to worry that your old code will break.<br>- Get is now deprecated, it was a bad idea even trying to add tasks meant for urllib.<br>**Full Changelog**: https://github.com/akamhy/waybackpy/compare/2.4.4...3.0.0 |
| `2.4.4` | 2021-09-03 | Low | - When the response code is 509, raise an error with an explanation (based on the actual error message contained in the response HTML).<br>- Fix typo |
| `2.4.3` | 2021-04-02 | Low | - Fix redirect issues with HTTP and HTTPS redirection<br>- More stable archiving |
| `2.4.2` | 2021-01-24 | Low | - added CLI Arg `--file`, if this Arg is not used with known URLs than waybackpy will not save the output URLs in file.<br>- added `cached_save` flag on waybackpy URL object, if the returned saved archive is older than 3 mins the flag is true else false.<br>- BUG FIX : the CLI `--json` arg was not returning valid JSON instead JSON loaded python dict. This is now fixed. |

## Citation

- HTML: https://www.freshcrate.ai/projects/waybackpy
- Markdown: https://www.freshcrate.ai/projects/waybackpy.md
- Dependencies JSON: https://www.freshcrate.ai/api/projects/waybackpy/deps

_Generated by freshcrate.ai. Indexes pypi releases for AI-agent ecosystem packages._
