freshcrate
preshed

Cython hash table that trusts the keys are pre-hashed

Description

<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>

# preshed: Cython Hash Table for Pre-Hashed Keys

Simple but high-performance Cython hash table mapping pre-randomized keys to `void*` values. Inspired by [Jeff Preshing](http://preshing.com/20130107/this-hash-table-is-faster-than-a-judy-array/).

All Python APIs provided by the `BloomFilter` and `PreshMap` classes are thread-safe on both the GIL-enabled build and the free-threaded build of Python 3.14 and newer. If you use the C API or the `PreshCounter` class, you must provide external synchronization when using this library's data structures in a multithreaded environment.

[![tests](https://github.com/explosion/preshed/actions/workflows/tests.yml/badge.svg)](https://github.com/explosion/preshed/actions/workflows/tests.yml)
[![pypi Version](https://img.shields.io/pypi/v/preshed.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.python.org/pypi/preshed)
[![conda Version](https://img.shields.io/conda/vn/conda-forge/preshed.svg?style=flat-square&logo=conda-forge&logoColor=white)](https://anaconda.org/conda-forge/preshed)
[![Python wheels](https://img.shields.io/badge/wheels-%E2%9C%93-4c1.svg?longCache=true&style=flat-square&logo=python&logoColor=white)](https://github.com/explosion/wheelwright/releases)

## Installation

```bash
pip install preshed --only-binary preshed
```

Or with conda:

```bash
conda install -c conda-forge preshed
```

## Usage

### PreshMap

A hash map for pre-hashed keys, mapping `uint64` to `uint64` values.

```python
from preshed.maps import PreshMap

map = PreshMap()                   # create with default size
map = PreshMap(initial_size=1024)  # create with initial capacity (must be power of 2)

map[key] = value                   # set a value
value = map[key]                   # get a value (returns None if missing)
value = map.pop(key)               # remove and return a value
del map[key]                       # delete a key
key in map                         # membership test
len(map)                           # number of entries

for key in map:                    # iterate over keys
    pass
for key, value in map.items():     # iterate over key-value pairs
    pass
for value in map.values():         # iterate over values
    pass
```

### BloomFilter

A probabilistic set for fast membership testing of integer keys.

```python
from preshed.bloom import BloomFilter

bloom = BloomFilter(size=1024, hash_funcs=23)                # explicit parameters
bloom = BloomFilter.from_error_rate(10000, error_rate=1e-4)  # auto-sized

bloom.add(42)              # add a key
42 in bloom                # membership test (may have false positives)

data = bloom.to_bytes()    # serialize
bloom.from_bytes(data)     # deserialize in-place
```

### PreshCounter

A counter backed by a hash map, for counting occurrences of `uint64` keys.

```python
from preshed.counter import PreshCounter

counter = PreshCounter()
counter.inc(key, 1)          # increment key by 1
count = counter[key]         # get current count
len(counter)                 # number of buckets

for key, count in counter:   # iterate over entries
    pass

counter.smooth()             # apply Good-Turing smoothing
prob = counter.prob(key)     # get smoothed probability
```

### Cython API

All classes expose a C-level API via `.pxd` files for use in Cython extensions. The low-level `MapStruct` and `BloomStruct` functions operate on raw structs and can be called without the GIL:

```cython
from preshed.maps cimport PreshMap, map_get, map_set, map_iter, key_t
from preshed.bloom cimport BloomFilter, bloom_add, bloom_contains

cdef PreshMap table = PreshMap()

# Low-level nogil access (requires external synchronization)
cdef void* value
with nogil:
    value = map_get(table.c_map, some_key)
```
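To make the "pre-hashed keys" idea concrete, here is a minimal pure-Python sketch. It is not preshed's implementation: the names `prehash` and `ToyPreHashedMap` are hypothetical, the choice of BLAKE2b as the hash is an assumption, and the load factor and probing strategy are picked only for illustration. It shows the caller hashing a string to a well-mixed unsigned 64-bit key, and a table that then uses the key's low bits directly as the slot index instead of hashing again.

```python
import hashlib


def prehash(text: str) -> int:
    """Hypothetical helper: hash a string to an unsigned 64-bit integer key."""
    digest = hashlib.blake2b(text.encode("utf8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class ToyPreHashedMap:
    """Illustrative open-addressing table that trusts keys to be pre-hashed."""

    def __init__(self, initial_size: int = 8):
        # A power-of-two size lets `key & (size - 1)` stand in for a modulo.
        assert initial_size > 0 and initial_size & (initial_size - 1) == 0
        self._keys = [None] * initial_size
        self._values = [None] * initial_size
        self._filled = 0

    def _find_slot(self, key: int) -> int:
        mask = len(self._keys) - 1
        i = key & mask  # no hashing here: the key is assumed to be random already
        while self._keys[i] is not None and self._keys[i] != key:
            i = (i + 1) & mask  # linear probing to the next slot
        return i

    def set(self, key: int, value: int) -> None:
        # Grow at roughly 60% load so probing always finds an empty slot.
        if (self._filled + 1) * 5 >= len(self._keys) * 3:
            self._resize()
        i = self._find_slot(key)
        if self._keys[i] is None:
            self._filled += 1
        self._keys[i] = key
        self._values[i] = value

    def get(self, key: int):
        i = self._find_slot(key)
        return self._values[i] if self._keys[i] == key else None

    def _resize(self) -> None:
        old = [(k, v) for k, v in zip(self._keys, self._values) if k is not None]
        self._keys = [None] * (len(self._keys) * 2)
        self._values = [None] * len(self._keys)
        self._filled = 0
        for k, v in old:
            i = self._find_slot(k)
            self._keys[i], self._values[i] = k, v
            self._filled += 1

    def __len__(self) -> int:
        return self._filled


table = ToyPreHashedMap()
for word in ["apple", "banana", "cherry"]:
    table.set(prehash(word), len(word))
```

The same `prehash`-style step is what a caller would do before storing a string-derived key in a table that expects integer keys. If keys are not well mixed (for example, small sequential integers whose low bits collide), a table built this way degrades into long probe sequences, which is why the hashing responsibility sits with the caller.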

Release History

| Version | Changes | Urgency | Date |
| --- | --- | --- | --- |
| 3.0.13 | Imported from PyPI (3.0.13) | Low | 4/21/2026 |
| release-v3.0.13 | Improve noexcept markers in headers to support better nogil usage. | Medium | 3/23/2026 |
| release-v3.0.12 | Add Windows ARM wheels. | Low | 11/17/2025 |
| release-v3.0.11 | Release release-v3.0.11 | Low | 11/13/2025 |
| release-v3.0.10 | Publish wheels for Python 3.13 and Linux ARM | Low | 5/26/2025 |
| v3.0.9 | Package updates and binary wheels for Python 3.12. | Low | 9/15/2023 |
| v4.0.0 | Fix bloom size issues with Windows serialization (#38). bloom: replace raw array with `std::vector` and raw pointer with `std::unique_ptr` (#39). map/counter: replace raw array with `std::vector` and raw pointer with `std::unique_ptr` (#40). Remove `PreshMapArray` (#40). | Low | 4/27/2023 |
| v3.0.8 | Package updates and binary wheels for Python 3.11. | Low | 10/14/2022 |
| v3.0.7 | Update setup for Setuptools v65.0.0 | Low | 8/16/2022 |
| v2.0.0 | Require `cymem>=2.0.0`. Add `wheel` to `setup_requires`. | Low | 10/14/2018 |
| v1.0.1 | Regenerate C source for Python 3.7 | Low | 7/21/2018 |

Similar Packages

| Package | Description | Version |
| --- | --- | --- |
| azure-search-documents | Microsoft Azure Cognitive Search Client Library for Python | azure-template_0.1.0b6187637 |
| apache-tvm-ffi | tvm ffi | 0.1.10 |
| luqum | A Lucene query parser generating ElasticSearch queries and more! | 1.0.0 |
| torchao | Package for applying ao techniques to GPU models | 0.17.0 |
| banks | A prompt programming language | 2.4.1 |