freshcrate

pyannote-audio

State-of-the-art speaker diarization toolkit

Description

<p align="center"> <a href="https://pyannote.ai/" target="blank"><img src="https://avatars.githubusercontent.com/u/162698670" width="64" /></a> </p>

<div align="center"> <h1><code>pyannote</code> speaker diarization toolkit</h1> </div>

`pyannote.audio` is an open-source toolkit written in Python for speaker diarization. Based on the [PyTorch](https://pytorch.org) machine learning framework, it comes with state-of-the-art [pretrained models and pipelines](https://hf.co/pyannote) that can be further fine-tuned to your own data for even better performance.

<p align="center"> <a href="https://www.youtube.com/watch?v=37R_R82lfwA"><img src="https://img.youtube.com/vi/37R_R82lfwA/0.jpg"></a> </p>

## Highlights

- :exploding_head: state-of-the-art performance (see [Benchmark](#benchmark))
- :hugs: pretrained [pipelines](https://hf.co/models?other=pyannote-audio-pipeline) (and [models](https://hf.co/models?other=pyannote-audio-model)) on the [:hugs: model hub](https://huggingface.co/pyannote)
- :rocket: built-in support for [pyannoteAI](https://pyannote.ai) premium speaker diarization
- :snake: Python-first API
- :zap: multi-GPU training with [pytorch-lightning](https://pytorchlightning.ai/)

## `community-1` open-source speaker diarization

1. Make sure [`ffmpeg`](https://ffmpeg.org/) is installed on your machine (needed by the [`torchcodec`](https://docs.pytorch.org/torchcodec/) audio decoding library)
2. Install with [`uv`](https://docs.astral.sh/uv/): `uv add pyannote.audio` (recommended) or `pip install pyannote.audio`
3. Accept the [`pyannote/speaker-diarization-community-1`](https://hf.co/pyannote/speaker-diarization-community-1) user conditions
4. Create a Hugging Face access token at [`hf.co/settings/tokens`](https://hf.co/settings/tokens)

```python
import torch
from pyannote.audio import Pipeline
from pyannote.audio.pipelines.utils.hook import ProgressHook

# Community-1 open-source speaker diarization pipeline
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-community-1",
    token="HUGGINGFACE_ACCESS_TOKEN")

# send pipeline to GPU (when available)
pipeline.to(torch.device("cuda"))

# apply pretrained pipeline (with optional progress hook)
with ProgressHook() as hook:
    output = pipeline("audio.wav", hook=hook)  # runs locally

# print the result
for turn, speaker in output.speaker_diarization:
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")
# start=0.2s stop=1.5s speaker_0
# start=1.8s stop=3.9s speaker_1
# start=4.2s stop=5.7s speaker_0
# ...
```

## `precision-2` premium speaker diarization

1. Create a pyannoteAI API key at [`dashboard.pyannote.ai`](https://dashboard.pyannote.ai)
2. Enjoy free credits!

```python
from pyannote.audio import Pipeline

# Precision-2 premium speaker diarization service
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-precision-2",
    token="PYANNOTEAI_API_KEY")

output = pipeline("audio.wav")  # runs on pyannoteAI servers

# print the result
for turn, speaker in output.speaker_diarization:
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s {speaker}")
# start=0.2s stop=1.6s SPEAKER_00
# start=1.8s stop=4.0s SPEAKER_01
# start=4.2s stop=5.6s SPEAKER_00
# ...
```

Visit [`docs.pyannote.ai`](https://docs.pyannote.ai) to learn about other pyannoteAI features (voiceprinting, confidence scores, ...)
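The diarization output above is a sequence of (turn, speaker) pairs, which is often serialized to RTTM, the standard exchange format for diarization results. A minimal, dependency-free sketch of that serialization; the `to_rttm` helper and the `segments` data are made up for illustration and mirror the printed output above, not an actual `pyannote.audio` API:

```python
def to_rttm(segments, uri="audio"):
    """Format (start, end, speaker) segments as RTTM SPEAKER lines.

    Each RTTM line carries the file URI, channel, onset, duration,
    and speaker label; unused fields are written as '<NA>'.
    """
    lines = []
    for start, end, speaker in segments:
        lines.append(
            f"SPEAKER {uri} 1 {start:.3f} {end - start:.3f} "
            f"<NA> <NA> {speaker} <NA> <NA>"
        )
    return "\n".join(lines)

# hypothetical segments matching the example output above
segments = [(0.2, 1.5, "speaker_0"), (1.8, 3.9, "speaker_1")]
print(to_rttm(segments))
# SPEAKER audio 1 0.200 1.300 <NA> <NA> speaker_0 <NA> <NA>
# SPEAKER audio 1 1.800 2.100 <NA> <NA> speaker_1 <NA> <NA>
```

RTTM is line-oriented, so files produced this way can be consumed directly by common diarization scoring tools.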
## Benchmark

| Benchmark (last updated 2025-09) | <a href="https://hf.co/pyannote/speaker-diarization-3.1">`legacy` (3.1)</a> | <a href="https://hf.co/pyannote/speaker-diarization-community-1">`community-1`</a> | <a href="https://docs.pyannote.ai">`precision-2`</a> |
| --- | --- | --- | --- |
| [AISHELL-4](https://arxiv.org/abs/2104.03603) | 12.2 | 11.7 | 11.4 |
| [AliMeeting](https://www.openslr.org/119/) (channel 1) | 24.5 | 20.3 | 15.2 |
| [AMI](https://groups.inf.ed.ac.uk/ami/corpus/) (IHM) | 18.8 | 17.0 | 12.9 |
| [AMI](https://groups.inf.ed.ac.uk/ami/corpus/) (SDM) | 22.7 | 19.9 | 15.6 |
| [AVA-AVD](https://arxiv.org/abs/2111.14448) | 49.7 | 44.6 | 37.1 |
| [CALLHOME](https://catalog.ldc.upenn.edu/LDC2001S97) ([part 2](https://github.com/BUTSpeechFIT/CALLHOME_sublists/issues/1)) | 28.5 | 26.7 | 16.6 |
| [DIHARD 3](https://catalog.ldc.upenn.edu/LDC2022S14) ([full](https://arxiv.org/abs/2012.01477)) | 21.4 | 20.2 | 14.7 |
| [Ego4D](https://arxiv.org/abs/2110.07
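The benchmark figures above are diarization error rates (DER, in %): the fraction of reference speech time that is falsely detected, missed, or attributed to the wrong speaker. A toy computation of that ratio, with made-up durations (in practice, tools such as `pyannote.metrics` compute this from reference and hypothesis annotations):

```python
def diarization_error_rate(false_alarm, missed, confusion, total_speech):
    """DER = (false alarm + missed detection + speaker confusion) / total reference speech."""
    return (false_alarm + missed + confusion) / total_speech

# illustrative durations in seconds (not from any benchmark above)
der = diarization_error_rate(false_alarm=3.0, missed=5.0, confusion=4.0, total_speech=100.0)
print(f"{100 * der:.1f}")  # 12.0
```

Note that DER can exceed 100% when false alarms are large relative to the amount of reference speech.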

Release History

| Version | Changes | Urgency | Date |
| --- | --- | --- | --- |
| 4.0.4 | Imported from PyPI (4.0.4) | Low | 4/21/2026 |


Similar Packages

- `pre-commit` (v4.6.0): A framework for managing and maintaining multi-language pre-commit hooks.
- `azure-core-tracing-opentelemetry`: Microsoft Azure Core OpenTelemetry plugin library for Python.
- `spdx-tools` (v0.8.5): SPDX parser and tools.
- `laces` (v0.1.2): Django components that know how to render themselves.
- `django-tasks` (v0.12.0): A backport of Django's built-in Tasks framework.