freshcrate

sentence-transformers

Embeddings, Retrieval, and Reranking

Description

# Sentence Transformers: Embeddings, Retrieval, and Reranking

This framework provides an easy method to compute embeddings for accessing, using, and training state-of-the-art embedding and reranker models. It can be used to compute embeddings using Sentence Transformer models ([quickstart](https://sbert.net/docs/quickstart.html#sentence-transformer)), to calculate similarity scores using Cross-Encoder (a.k.a. reranker) models ([quickstart](https://sbert.net/docs/quickstart.html#cross-encoder)), or to generate sparse embeddings using Sparse Encoder models ([quickstart](https://sbert.net/docs/quickstart.html#sparse-encoder)). This unlocks a wide range of applications, including [semantic search](https://sbert.net/examples/applications/semantic-search/README.html), [semantic textual similarity](https://sbert.net/docs/sentence_transformer/usage/semantic_textual_similarity.html), and [paraphrase mining](https://sbert.net/examples/applications/paraphrase-mining/README.html).
A wide selection of over [15,000 pre-trained Sentence Transformers models](https://huggingface.co/models?library=sentence-transformers) is available for immediate use on 🤗 Hugging Face, including many of the state-of-the-art models from the [Massive Text Embeddings Benchmark (MTEB) leaderboard](https://huggingface.co/spaces/mteb/leaderboard). Additionally, it is easy to train or finetune your own [embedding models](https://sbert.net/docs/sentence_transformer/training_overview.html), [reranker models](https://sbert.net/docs/cross_encoder/training_overview.html), or [sparse encoder models](https://sbert.net/docs/sparse_encoder/training_overview.html) using Sentence Transformers, enabling you to create custom models for your specific use cases.

For the **full documentation**, see **[www.SBERT.net](https://www.sbert.net)**.

## Installation

We recommend **Python 3.10+**, **[PyTorch 1.11.0+](https://pytorch.org/get-started/locally/)**, and **[transformers v4.34.0+](https://github.com/huggingface/transformers)**.

**Install with pip**

```
pip install -U sentence-transformers
```

**Install with conda**

```
conda install -c conda-forge sentence-transformers
```

**Install from sources**

Alternatively, you can clone the latest version from the [repository](https://github.com/huggingface/sentence-transformers) and install it directly from the source code:

```
pip install -e .
```

**PyTorch with CUDA**

If you want to use a GPU / CUDA, you must install PyTorch with a matching CUDA version. See [PyTorch - Get Started](https://pytorch.org/get-started/locally/) for details on how to install PyTorch.

## Getting Started

See [Quickstart](https://www.sbert.net/docs/quickstart.html) in our documentation.

### Embedding Models

First download a pretrained embedding (a.k.a. Sentence Transformer) model:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
```

Then provide some texts to the model.
```python
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# => (3, 384)
```

And that's already it. We now have numpy arrays with the embeddings, one for each text. We can use these to compute similarities:

```python
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6660, 0.1046],
#         [0.6660, 1.0000, 0.1411],
#         [0.1046, 0.1411, 1.0000]])
```

### Reranker Models

First download a pretrained reranker (a.k.a. Cross Encoder) model:

```python
from sentence_transformers import CrossEncoder

# 1. Load a pretrained CrossEncoder model
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")
```

Then provide some texts to the model:

```python
# The texts for which to predict similarity scores
query = "How many people live in Berlin?"
passages = [
    "Berlin had a population of 3,520,031 registered inhabitants in an area …
```
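For intuition, `model.similarity` defaults to cosine similarity, which can be reproduced with plain NumPy. The sketch below is illustrative only and uses made-up toy vectors rather than real model embeddings:

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity: normalize each row, then take dot products."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    normalized = embeddings / norms
    return normalized @ normalized.T

# Toy 3 x 4 "embeddings" (real models produce e.g. 384 dimensions)
toy = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.9, 0.1, 1.0, 0.0],  # similar to the first row
    [0.0, 1.0, 0.0, 1.0],  # dissimilar to both
])
similarities = cosine_similarity_matrix(toy)
print(similarities.shape)  # (3, 3); diagonal entries are 1.0
```

With real embeddings you would pass `model.encode(sentences)` in place of the toy matrix; the library's `similarity` method can also be configured for other scoring functions such as dot product.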

Release History

- **5.4.1** (4/21/2026, low urgency): Imported from PyPI (5.4.1).
- **v5.4.1** (4/14/2026, medium urgency): This patch release allows `encode()` and `predict()` to accept 1D numpy string arrays as inputs. Install with `pip install sentence-transformers==5.4.1` (or the `[train]`, `[onnx]`, `[onnx-gpu]`, or `[openvino]` extras).
- **v5.4.0** (4/9/2026, medium urgency): This large release introduces first-class multimodal support for both `SentenceTransformer` and `CrossEncoder`, making it easy to compute embeddings and rerank across text, images, audio, and video. The `CrossEncoder` class has been fully modularized, allowing for generative rerankers (CausalLM-based models) via a new `LogitScore` module. Flash Attention 2 now automatically skips padding for text-only inputs, providing significant speedups and memory reductions, especially when input lengths vary.
- **v5.3.0** (3/12/2026, low urgency): This minor version brings several improvements to contrastive learning: `MultipleNegativesRankingLoss` now supports alternative InfoNCE formulations (symmetric, GTE-style) and optional hardness weighting for harder negatives. Two new losses are introduced: `GlobalOrthogonalRegularizationLoss` for embedding space regularization and `CachedSpladeLoss` for memory-efficient SPLADE training. The release also adds a faster hashed batch sampler, fixes `GroupByLabelBatchSampler` for triplet losses, and …
- **v5.2.3** (2/17/2026, low urgency): This patch release introduces compatibility with Transformers v5.2. Install with `pip install sentence-transformers==5.2.3`.
- **v5.2.2** (1/27/2026, low urgency): This patch release replaces the mandatory `requests` dependency with an optional `httpx` dependency. Install with `pip install sentence-transformers==5.2.2`.
- **v5.2.1** (1/26/2026, low urgency): This patch release adds support for the full [Transformers v5 release](https://github.com/huggingface/transformers/releases/tag/v5.0.0). Install with `pip install sentence-transformers==5.2.1`.
- **v5.2.0** (12/11/2025, low urgency): This minor release introduces multi-processing for CrossEncoder (rerankers), multilingual NanoBEIR evaluators, similarity score outputs in `mine_hard_negatives`, Transformers v5 support, Python 3.9 deprecations, and more.
- **v5.1.2** (10/22/2025, low urgency): This patch celebrates the transition of Sentence Transformers to Hugging Face, and improves model saving, loading defaults, and loss compatibilities.
- **v5.1.1** (9/22/2025, low urgency): This patch makes Sentence Transformers more explicit about incorrect arguments and introduces fixes for multi-GPU processing, evaluators, and hard negatives mining.
- **v5.1.0** (8/6/2025, low urgency): This release introduces 2 new efficient computing backends for SparseEncoder embedding models: [ONNX and OpenVINO + optimization & quantization, allowing for speedups up to 2x-3x](https://sbert.net/docs/sparse_encoder/usage/efficiency.html); a new "n-tuple-score" output format for hard negative mining for distillation; gathering across devices for a free lunch on multi-GPU training; trackio support; MTEB documentation; and many small fixes and features.
- **v5.0.0** (7/1/2025, low urgency): This release consists of significant updates including the introduction of Sparse Encoder models, new methods `encode_query` and `encode_document`, multi-processing support in `encode`, the `Router` module for asymmetric models, custom learning rates for parameter groups, composite loss logging, and various small improvements and bug fixes.
- **v4.1.0** (4/15/2025, low urgency): This release introduces 2 new efficient computing backends for CrossEncoder (reranker) models: [ONNX and OpenVINO + optimization & quantization, allowing for speedups up to 2x-3x](https://sbert.net/docs/cross_encoder/usage/efficiency.html); improved hard negatives mining strategies; and minor improvements.
- **v4.0.2** (4/3/2025, low urgency): This patch release updates some logic for maximum sequence lengths, typing issues, FSDP training, and distributed training device placement.
- **v4.0.1** (3/26/2025, low urgency): This release consists of a major refactor that overhauls the reranker (a.k.a. Cross Encoder) [training approach](https://huggingface.co/blog/train-reranker) (introducing multi-GPU training, bf16, loss logging, callbacks, and much more), including all-new [Training Overview](https://sbert.net/docs/cross_encoder/training_overview.html), [Loss Overview](https://sbert.net/docs/cross_encoder/loss_overview.html), and [API Reference](https://sbert.net/docs/package_reference/cross_encoder/index.html) docs, …
- **v3.4.1** (1/29/2025, low urgency): This release introduces convenient compatibility with [Model2Vec models](https://huggingface.co/models?library=model2vec), and fixes a bug that caused an outgoing request even when using a local model.
- **v3.4.0** (1/23/2025, low urgency): This release resolves a memory leak when deleting a model & trainer, adds compatibility between the Cached... losses and the Matryoshka loss modifier, resolves numerous bugs, and adds several small features.
- **v3.3.1** (11/18/2024, low urgency): This patch release fixes a small issue with loading private models from Hugging Face using the `token` argument.
- **v3.3.0** (11/11/2024, low urgency): 4x speedup on CPU with [OpenVINO int8 static quantization](https://sbert.net/docs/sentence_transformer/usage/efficiency.html#quantizing-openvino-models), [training with prompts for a free performance boost](https://sbert.net/examples/training/prompts/README.html), convenient evaluation on [NanoBEIR](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#nanobeirevaluator) (a subset of a strong Information Retrieval benchmark), PEFT compatibility by easily adding/loading ad…
- **v3.2.1** (10/21/2024, low urgency): This patch release fixes some small bugs related to loading CLIP models, automatic model card generation issues, and compatibility with third-party libraries.
- **v3.2.0** (10/10/2024, low urgency): This release introduces 2 new efficient computing backends for SentenceTransformer models: [ONNX and OpenVINO + optimization & quantization, allowing for speedups up to 2x-3x](https://sbert.net/docs/sentence_transformer/usage/efficiency.html); static embeddings via [Model2Vec](https://github.com/MinishLab/model2vec), allowing for lightning-fast models (i.e., 50x-500x speedups) at a ~10%-20% performance cost; and various small improvements and fixes.
- **v3.1.1** (9/19/2024, low urgency): This patch release fixes hard negatives mining for models that don't automatically normalize their embeddings, and lifts the `numpy<2` restriction that was previously required. Includes the hard negatives mining patch (#2944) for the `mine_hard_negatives` utility …
- **v3.1.0** (9/11/2024, low urgency): This release introduces a [hard negatives mining utility](https://sbert.net/docs/package_reference/util.html#sentence_transformers.util.mine_hard_negatives) to get better models out of your data, a new strong [loss function](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativessymmetricrankingloss) for symmetric tasks, training with streaming datasets to avoid having to store datasets fully on disk, custom modules to allow for more creativity from …
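The idea behind hard negatives mining can be sketched without the library: for each query, rank corpus entries by embedding similarity and keep the top-scoring candidates that are not the known positive. This is a simplified, hypothetical stand-in for `mine_hard_negatives`, using made-up vectors:

```python
import numpy as np

def mine_hard_negatives_toy(query_emb, corpus_embs, positive_idx, num_negatives=2):
    """Return indices of the most query-similar corpus entries, excluding the positive."""
    # Cosine similarity between the query and every corpus embedding
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_embs / np.linalg.norm(corpus_embs, axis=1, keepdims=True)
    scores = c @ q
    # Highest-scoring candidates first, skipping the known positive
    order = np.argsort(-scores)
    negatives = [int(i) for i in order if i != positive_idx]
    return negatives[:num_negatives]

corpus = np.array([
    [1.0, 0.0, 0.0],   # 0: the known positive
    [0.9, 0.4, 0.0],   # 1: very similar -> a "hard" negative
    [0.0, 0.0, 1.0],   # 2: unrelated -> an "easy" negative
])
query = np.array([1.0, 0.1, 0.0])
print(mine_hard_negatives_toy(query, corpus, positive_idx=0))  # [1, 2]
```

The real utility works on texts rather than precomputed vectors and offers many more options (score thresholds, margins, sampling strategies); this only shows the core ranking idea.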
- **v3.0.1** (6/7/2024, low urgency): This patch release introduces some improvements for the SentenceTransformerTrainer, as well as some updates for the automatic model card generation. It also patches some minor evaluator bugs and a bug with `MatryoshkaLoss`. Lastly, every single Sentence Transformer model can now be saved and loaded with the safer `model.safetensors` files.
- **v3.0.0** (5/28/2024, low urgency): This release consists of a major refactor that overhauls the [training approach](https://huggingface.co/blog/train-sentence-transformers) (introducing multi-GPU training, bf16, loss logging, callbacks, and much more), and adds convenient [`similarity`](https://sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer.similarity) and `similarity_pairwise` methods …
- **v2.7.0** (4/17/2024, low urgency): This release introduces a [new promising loss function](https://sbert.net/docs/package_reference/losses.html#cachedgistembedloss), CachedGISTEmbedLoss (#2592), easier inference for [Matryoshka models](https://sbert.net/examples/training/matryoshka/README.html), and new functionality for CrossEncoders and inference on Intel Gaudi2, along with much more. Install with `pip install sentence-transformers==2.7.0`.
- **v2.6.1** (3/26/2024, low urgency): This is a patch release to fix a bug in [`semantic_search_faiss`](https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_faiss) and [`semantic_search_usearch`](https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.semantic_search_usearch) that caused the scores to not correspond to the returned corpus indices. Additionally, you can now evaluate embedding models after quantizing their embeddings.
- **v2.6.0** (3/22/2024, low urgency): This release brings embedding quantization, a way to heavily speed up retrieval and other tasks, and a new powerful loss function: GISTEmbedLoss. Install with `pip install sentence-transformers==2.6.0`. Embeddings may be challenging to scale up, which leads to expensive solutions and high latencies; quantization counters this problem by reducing the size of each individual value in the embedding …
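To illustrate the concept, binary quantization (one common scheme) maps each embedding dimension to a single bit. The sketch below is a hypothetical minimal version of that idea, not the library's implementation, and uses random numbers in place of real embeddings:

```python
import numpy as np

def binary_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Map each float dimension to one bit (>= 0 -> 1, < 0 -> 0), packed into bytes."""
    bits = (embeddings >= 0).astype(np.uint8)
    return np.packbits(bits, axis=1)

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((3, 384)).astype(np.float32)
packed = binary_quantize(embeddings)
# 384 float32 values (1536 bytes) become 48 bytes per embedding: a 32x size reduction
print(embeddings.nbytes // 3, packed.nbytes // 3)  # 1536 48
```

Quantized embeddings are then compared with cheap bitwise operations (e.g. Hamming distance) instead of float dot products, which is where the retrieval speedups come from.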
- **v2.5.1** (3/1/2024, low urgency): This is a patch release to fix a bug in `CrossEncoder.rank` that caused the last value to be discarded when using the default `top_k=-1`.
- **v2.5.0** (2/29/2024, low urgency): This release brings two new loss functions, a new way to (re)rank with CrossEncoder models, and more fixes. Install with `pip install sentence-transformers==2.5.0`. 2D Matryoshka & Adaptive Layer models (#2506): embedding models are often encoder models with numerous layers, such as 12 (e.g. [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)) or 6 (e.g. [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)) …
- **v2.4.0** (2/23/2024, low urgency): This release introduces numerous notable features that are well worth learning about! Install with `pip install sentence-transformers==2.4.0`. MatryoshkaLoss (#2485): dense embedding models typically produce embeddings with a fixed size, such as 768 or 1024, and all further computations (clustering, classification, semantic search, retrieval, reranking, etc.) must then be done on these full embeddings. Matryoshka Representation Learning …
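Matryoshka-style models are trained so that the leading embedding dimensions are useful on their own, which means embeddings can simply be truncated (and renormalized) at inference time to trade accuracy for speed and storage. A minimal sketch of that truncation step, using random toy vectors rather than a trained model:

```python
import numpy as np

def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` dimensions and renormalize each row to unit length."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / norms

rng = np.random.default_rng(0)
full = rng.standard_normal((2, 768))     # stand-in for full-size embeddings
small = truncate_embeddings(full, 256)   # keep only the leading 256 dimensions
print(small.shape)  # (2, 256)
```

With a model trained with `MatryoshkaLoss`, cosine similarities on the truncated vectors stay close to those on the full vectors; with an ordinary model this truncation would degrade quality much more.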
- **v2.3.1** (1/30/2024, low urgency): This release patches a niche bug when loading a Sentence Transformer model which (1) is local, (2) uses a `Normalize` module as specified in `modules.json`, and (3) does not contain the directory specified in the model configuration. This only occurs when a model with `Normalize` is downloaded from the Hugging Face hub and then later used locally. See #2458 and #2459 for more details. Release highlight: don't require loading files for Normalize by @tomaarsen (#2460).
- **v2.3.0** (1/29/2024, low urgency): This release focuses on various bug fixes & improvements to keep up with adjacent works like `transformers` and `huggingface_hub`. Key change: pushing models to the Hugging Face Hub (#2376). Prior to v2.3.0, saving models to the Hub may have resulted in various errors depending on dependency versions; v2.3.0 introduces a refactor of `save_to_hub` …
- **v2.2.2** (6/26/2022, low urgency): `huggingface_hub` dropped support for Python 3.6 in version 0.5.0. This release fixes the issue so that `huggingface_hub` version 0.4.0 and Python 3.6 can still be used.
- **v2.2.1** (6/23/2022, low urgency): Version `0.8.1` of `huggingface_hub` introduced several changes that resulted in errors and warnings; this version of `sentence-transformers` fixes these issues. Further, `util.community_detection` was improved: (1) it works in a batched mode to save memory, (2) overlapping clusters are no longer dropped but resolved by removing overlapping items, and (3) the `init_max_size` parameter was removed and replaced by a heuristic to estimate the max size of clusters …
- **v2.2.0** (2/10/2022, low urgency): You can now use the encoder from T5 to learn text embeddings, like any other transformer model:

  ```python
  from sentence_transformers import SentenceTransformer, models

  word_embedding_model = models.Transformer('t5-base', max_seq_length=256)
  pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
  model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
  ```

  See the T5 benchmark results in the documentation …
- **v2.1.0** (10/1/2021, low urgency): This is a smaller release with some new features. [MarginMSELoss](https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/losses/MarginMSELoss.py) is a great method to train embedding models with the help of a cross-encoder model; the details are explained in [MSMARCO - MarginMSE Training](https://www.sbert.net/examples/training/ms_marco/README.html#marginmse) …
- **v2.0.0** (6/24/2021, low urgency): All pre-trained models are now hosted on the [Huggingface Models hub](https://huggingface.co/models); ours can be found at [https://huggingface.co/sentence-transformers](https://huggingface.co/sentence-transformers). You can also easily share your own sentence-transformers model on the hub: simply upload the folder and have people load it via `model = SentenceTransformer('[your_username]/[model_na` …
- **v1.2.1** (6/24/2021, low urgency): Final release of version 1: makes v1 of sentence-transformers forward compatible with models from version 2 of sentence-transformers.
- **v1.2.0** (5/12/2021, low urgency): Unsupervised Sentence Embedding Learning: new methods integrated to train sentence embedding models without labeled data. See [Unsupervised Learning](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning) for an overview of all existing methods. New methods include [CT](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/CT), an integration of Semantic Re-Tuning With Contrastive Tension (CT) …
- **v1.1.0** (4/21/2021, low urgency): This release integrates methods that allow learning sentence embeddings without labeled data: [TSDAE](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/TSDAE), which uses a denoising auto-encoder to learn sentence embeddings (presented in our [recent paper](https://arxiv.org/abs/2104.06979), achieving state-of-the-art performance for several tasks), and GenQ …
- **v1.0.4** (4/1/2021, low urgency): It was not possible to fine-tune and save the CLIPModel; this release fixes it. CLIPModel can now be saved like any other model by calling `model.save(path)`.
- **v1.0.3** (3/22/2021, low urgency): Patch for the `util.paraphrase_mining` method.
- **v1.0.2** (3/19/2021, low urgency): Patch for CLIPModel, plus new image examples. Bugfix in CLIPModel: too-long inputs raised a RuntimeError; now they are truncated. New util function `util.paraphrase_mining_embeddings` to find the most similar embeddings in a matrix. **Image Clustering** and **Duplicate Image Detection** examples added: [more info](https://www.sbert.net/examples/applications/image-search/README.html#examples).
- **v1.0.0** (3/18/2021, low urgency): This release brings many improvements and new features, and updates the version number scheme to x.y.z (x: major releases, y: smaller releases with new features, z: bugfixes). Text-image model CLIP: you can now encode text and images in the same vector space using the OpenAI CLIP model …
- **v0.4.1** (1/4/2021, low urgency): Refactored tokenization: faster tokenization speed using batched tokenization for training & inference, with all sentences in a batch tokenized simultaneously. Usage of `SentencesDataset` is no longer needed for training; you can pass your train examples (as `InputExample` objects) directly to the DataLoader …
- **v0.4.0** (12/22/2020, low urgency): Updated the dependencies so that it works with Huggingface Transformers version 4. Sentence-Transformers still works with transformers version 3, but an update to version 4 is recommended, as future changes might break with version 3. New naming of pre-trained models: models will be named {task}-{transformer_model}, so 'bert-base-nli-stsb-mean-tokens' becomes 'stsb-bert-base'. Models will still be available under their old names, but newer models will …
- **v0.3.9** (11/18/2020, low urgency): This release only includes some smaller updates: the code was tested with transformers 3.5.1 and the requirement was updated accordingly; as some parts and models require PyTorch >= 1.6.0, the requirement was updated to at least PyTorch 1.6.0 (most of the code and models will work with older versions); and `model.encode()` stored the embeddings on the GPU, which required quite a lot of GPU memory when encoding millions of sentences, so the embeddings are now moved to …
- **v0.3.8** (10/19/2020, low urgency): Added support for training and using [CrossEncoder](https://www.sbert.net/docs/usage/cross-encoder.html); the data augmentation method [AugSBERT](https://www.sbert.net/examples/training/data_augmentation/README.html) was added; new models trained on large-scale paraphrase data that work much better on internal benchmarks than previous models: **distilroberta-base-paraphrase-v1** and **xlm-r-distilroberta-base-paraphrase-v1**; and a new model for Information Retrieval trained on MS Marco: **distilroberta-b…**
- **v0.3.7** (9/29/2020, low urgency): Upgraded the transformers dependency; transformers 3.1.0, 3.2.0, and 3.3.1 are working. Added example code for model distillation: sentence embedding models can be drastically reduced to e.g. only 2-4 layers while keeping 98+% of their performance (code in examples/training/distillation). Transformer models can now accept two inputs (['sentence 1', 'context for sent1']), which are encoded as the two inputs for BERT. Minor change: tokenization in the multi-process encoding …
- **v0.3.6** (9/11/2020, low urgency): Huggingface Transformers version 3.1.0 had a breaking change compared to version 3.0.2. This release fixes the issue so that Sentence-Transformers is compatible with Huggingface Transformers 3.1.0. Note that this and future versions will not be compatible with transformers < 3.1.0.


Similar Packages

- **transformers** (5.5.4): Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
- **ipython** (9.12.0): IPython: Productive Interactive Computing.
- **tokenizers** (0.22.2): No description.
- **LettuceDetect** (0.1.8): Lightweight hallucination detection framework for RAG applications.
- **azure-storage-blob** (azure-template_0.1.0b6187637): Microsoft Azure Blob Storage Client Library for Python.