gensim

Python framework for fast Vector Space Modelling

decomposition indexing latent lsa pypi semantic singular svd value

Why this rank: Strong adoption, Release freshness, Healthy release cadence

Description

==============================================
gensim -- Topic Modelling in Python
==============================================

|GA|_
|Wheel|_

.. |GA| image:: https://github.com/RaRe-Technologies/gensim/actions/workflows/tests.yml/badge.svg?branch=develop
.. |Wheel| image:: https://img.shields.io/pypi/wheel/gensim.svg

.. _GA: https://github.com/RaRe-Technologies/gensim/actions
.. _Downloads: https://pypi.org/project/gensim/
.. _License: https://radimrehurek.com/gensim/intro.html#licensing
.. _Wheel: https://pypi.org/project/gensim/

Gensim is a Python library for *topic modelling*, *document indexing* and *similarity retrieval* with large corpora.
Target audience is the *natural language processing* (NLP) and *information retrieval* (IR) community.

Features
---------

* All algorithms are **memory-independent** w.r.t. the corpus size (can process input larger than RAM, streamed, out-of-core)
* **Intuitive interfaces**

  * easy to plug in your own input corpus/datastream (simple streaming API)
  * easy to extend with other Vector Space algorithms (simple transformation API)

* Efficient multicore implementations of popular algorithms, such as online **Latent Semantic Analysis (LSA/LSI/SVD)**,
  **Latent Dirichlet Allocation (LDA)**, **Random Projections (RP)**, **Hierarchical Dirichlet Process (HDP)** or **word2vec deep learning**.
* **Distributed computing**: can run *Latent Semantic Analysis* and *Latent Dirichlet Allocation* on a cluster of computers.
* Extensive `documentation and Jupyter Notebook tutorials <https://github.com/RaRe-Technologies/gensim/#documentation>`_.

If this feature list left you scratching your head, you can first read more about the `Vector Space Model <https://en.wikipedia.org/wiki/Vector_space_model>`_ and `unsupervised document analysis <https://en.wikipedia.org/wiki/Latent_semantic_indexing>`_ on Wikipedia.
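The "simple streaming API" mentioned in the feature list can be illustrated with a minimal sketch. It uses only the standard library and does not call gensim; ``StreamedCorpus`` and ``build_vocabulary`` are hypothetical names invented for this example, and the three sample sentences are made up. The sketch assumes the convention that a corpus is any iterable yielding one sparse vector per document, each vector being a list of ``(token_id, count)`` pairs, and that documents are read one at a time instead of being loaded into RAM all at once.

```python
import os
import tempfile
from collections import Counter


class StreamedCorpus:
    """Iterable yielding one bag-of-words vector per line of a text file.

    Hypothetical helper for illustration only; it is not part of gensim.
    """

    def __init__(self, path, token2id):
        self.path = path
        self.token2id = token2id  # maps token -> integer id

    def __iter__(self):
        # Re-open the file on every pass so the corpus can be iterated
        # repeatedly while holding only one line in memory at a time.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                counts = Counter(
                    self.token2id[token]
                    for token in line.lower().split()
                    if token in self.token2id
                )
                # Sparse vector: sorted list of (token_id, count) pairs.
                yield sorted(counts.items())


def build_vocabulary(path):
    """Assign ids to tokens in order of first appearance, streaming the file once."""
    token2id = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            for token in line.lower().split():
                token2id.setdefault(token, len(token2id))
    return token2id


# Write a tiny, made-up sample corpus to a temporary file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("human computer interface\n")
    tmp.write("graph of trees\n")
    tmp.write("graph minors trees graph\n")
    corpus_path = tmp.name

token2id = build_vocabulary(corpus_path)
corpus = StreamedCorpus(corpus_path, token2id)

# Materialised into a list here only because the sample is tiny;
# a real pipeline would consume the iterable lazily.
vectors = list(corpus)
os.remove(corpus_path)
```

Because ``__iter__`` re-opens the file on each pass, the same object can be iterated several times, which multi-pass training typically needs; a plain generator would be exhausted after the first pass.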
Installation
------------

This software depends on `NumPy and Scipy <https://scipy.org/install/>`_, two Python packages for scientific computing.
You must have them installed prior to installing `gensim`.

It is also recommended you install a fast BLAS library before installing NumPy. This is optional, but using an optimized BLAS such as MKL, `ATLAS <https://math-atlas.sourceforge.net/>`_ or `OpenBLAS <https://xianyi.github.io/OpenBLAS/>`_ is known to improve performance by as much as an order of magnitude. On OSX, NumPy picks up its vecLib BLAS automatically, so you don't need to do anything special.

Install the latest version of gensim::

    pip install --upgrade gensim

Or, if you have instead downloaded and unzipped the `source tar.gz <https://pypi.org/project/gensim/>`_ package::

    python setup.py install

For alternative modes of installation, see the `documentation <https://radimrehurek.com/gensim/#install>`_.

Gensim is being `continuously tested <https://radimrehurek.com/gensim/#testing>`_ under all `supported Python versions <https://github.com/RaRe-Technologies/gensim/wiki/Gensim-And-Compatibility>`_.
Support for Python 2.7 was dropped in gensim 4.0.0 – install gensim 3.8.3 if you must use Python 2.7.

How come gensim is so fast and memory efficient? Isn't it pure Python, and isn't Python slow and greedy?
--------------------------------------------------------------------------------------------------------

Many scientific algorithms can be expressed in terms of large matrix operations (see the BLAS note above). Gensim taps into these low-level BLAS libraries, by means of its dependency on NumPy. So while gensim-the-top-level-code is pure Python, it actually executes highly optimized Fortran/C under the hood, including multithreading (if your BLAS is so configured).

Memory-wise, gensim makes heavy use of Python's built-in generators and iterators for streamed data processing.
Memory efficiency was one of gensim's `design goals <https://radimrehurek.com/gensim/intro.html#design-principles>`_, and is a central feature of gensim, rather than something bolted on as an afterthought.

Documentation
-------------

* `QuickStart`_
* `Tutorials`_
* `Tutorial Videos`_
* `Official Documentation and Walkthrough`_

Citing gensim
-------------

When `citing gensim in academic papers and theses <https://scholar.google.cz/citations?view_op=view_citation&hl=en&user=9vG_kV0AAAAJ&citation_for_view=9vG_kV0AAAAJ:u-x6o8ySG0sC>`_, please use this BibTeX entry::

    @inproceedings{rehurek_lrec,
          title = {{Software Framework for Topic Modelling with Large Corpora}},
          author = {Radim {\v R}eh{\r u}{\v r}ek and Petr Sojka},
          booktitle = {{Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks}},
          pages = {45--50},
          year = 2010,
          month = May,
          day = 22,
          publisher = {ELRA},
          address = {Valletta, Malta},
          language={English}
    }

----------------

Gensim is open source software released under the `GNU LGPLv2.1 license <https://www.gnu.org/licenses/old-licenses/lgpl-2.1.en.h

Release History

Version	Changes	Urgency	Date
4.4.0	Imported from PyPI (4.4.0)	Low	4/21/2026
4.3.2	Changes ======= ## 4.3.2, 2023-08-23 ### :red_circle: Bug fixes * Fix incorrect conversion of cosine distance to cosine similarity (__[monash849](https://github.com/monash849)__, [#3441](https://github.com/RaRe-Technologies/gensim/pull/3441)) ### :books: Tutorial and doc improvements * Fix inconsistent documentation for LdaSeqModel #3474 (__[rsokolewicz](https://github.com/rsokolewicz)__, [#3475](https://github.com/RaRe-Technologies/gensim/pull/3475)) * Update the licence link t	Low	8/24/2023
4.3.0	## What's Changed * Allow overriding the Cython version requirement by @pabs3 in https://github.com/RaRe-Technologies/gensim/pull/3323 * Update Python module MANIFEST by @pabs3 in https://github.com/RaRe-Technologies/gensim/pull/3343 * Clean up references to `Morfessor`, `tox` and `gensim.models.wrappers` by @pabs3 in https://github.com/RaRe-Technologies/gensim/pull/3345 * Disable the Gensim 3=>4 warning in docs by @piskvorky in https://github.com/RaRe-Technologies/gensim/pull/3346 * pin	Low	12/21/2022
4.2.0	A number of incremental improvements, optimizations and bugfixes: [CHANGELOG](https://github.com/RaRe-Technologies/gensim/blob/develop/CHANGELOG.md)	Low	5/1/2022
4.1.2	## 4.1.2, 2021-09-17 This is a bugfix release that addresses left over compatibility issues with older versions of numpy and MacOS. ## 4.1.1, 2021-09-14 This is a bugfix release that addresses compatibility issues with older versions of numpy. ## 4.1.0, 2021-08-15 Gensim 4.1 brings two major new functionalities: * [Ensemble LDA](https://radimrehurek.com/gensim/auto_examples/tutorials/run_ensemblelda.html) for robust training, selection and comparison of LDA models. * [FastSS m	Low	9/18/2021
4.1.1	## 4.1.1, 2021-09-14 This is a bugfix release that addresses compatibility issues with older versions of numpy. ## 4.1.0, 2021-08-15 Gensim 4.1 brings two major new functionalities: * [Ensemble LDA](https://radimrehurek.com/gensim/auto_examples/tutorials/run_ensemblelda.html) for robust training, selection and comparison of LDA models. * [FastSS module](https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/similarities/fastss.pyx) for super fast Levenshtein "fuzzy search"	Low	9/14/2021
4.1.0	## 4.1.0, 2021-08-15 Gensim 4.1 brings two major new functionalities: * [Ensemble LDA](https://radimrehurek.com/gensim/auto_examples/tutorials/run_ensemblelda.html) for robust training, selection and comparison of LDA models. * [FastSS module](https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/similarities/fastss.pyx) for super fast Levenshtein "fuzzy search" queries. Used e.g. for ["soft term similarity"](https://github.com/RaRe-Technologies/gensim/pull/3146) calculations.	Low	8/29/2021
4.0.1	## 4.0.1, 2021-04-01 Bugfix release to address issues with wheels on Windows due to Numpy binary incompatibility: - https://github.com/RaRe-Technologies/gensim/issues/3095 - https://github.com/RaRe-Technologies/gensim/issues/3097 ## 4.0.0, 2021-03-24 ⚠️ Gensim 4.0 contains breaking API changes! See the [Migration guide](https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4) to update your existing Gensim 3.x code and models. Gensim 4.0 is a major rel	Low	4/1/2021
4.0.0	Changes ======= ## 4.0.0, 2021-03-24 ⚠️ Gensim 4.0 contains breaking API changes! See the [Migration guide](https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4) to update your existing Gensim 3.x code and models. Gensim 4.0 is a major release with lots of performance & robustness improvements, and a new website. ### Main highlights * Massively optimized popular algorithms the community has grown to love: [fastText](https://radimrehurek.com/gensim/m	Low	3/25/2021
4.0.0.rc1	## 4.0.0.rc1, 2021-03-19 ⚠️ Gensim 4.0 contains breaking API changes! See the [Migration guide](https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4) to update your existing Gensim 3.x code and models. Gensim 4.0 is a major release with lots of performance & robustness improvements and a new website. ### Main highlights (see also 👍 Improvements below) * Massively optimized popular algorithms the community has grown to love: [fastText](https://radimre	Low	3/22/2021
4.0.0beta	## 4.0.0beta, 2020-10-31 ⚠️ Gensim 4.0 contains breaking API changes! See the [Migration guide](https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4) to update your existing Gensim 3.x code and models. ### Main highlights * Massively optimized popular algorithms the community has grown to love: [fastText](https://radimrehurek.com/gensim/models/fasttext.html), [word2vec](https://radimrehurek.com/gensim/models/word2vec.html), [doc2vec](https://radimrehurek.c	Low	11/1/2020
3.8.3	## :warning: 3.8.x will be the last gensim version to support Py2.7. Starting with 4.0.0, gensim will only support Py3.5 and above ## 3.8.3, 2020-05-03 This is primarily a bugfix release to bring back Py2.7 compatibility to gensim 3.8. ### :red_circle: Bug fixes * Bring back Py27 support (PR [#2812](https://github.com/RaRe-Technologies/gensim/pull/2812), __[@mpenkov](https://github.com/mpenkov)__) * Fix wrong version reported by setup.py (Issue [#2796](https://github.com/RaRe-Techno	Low	5/4/2020
3.8.2	## 3.8.2, 2020-04-10 ### :red_circle: Bug fixes * Pin `smart_open` version for compatibility with Py2.7 ### :warning: Deprecations (will be removed in the next major release) * Remove - `gensim.models.FastText.load_fasttext_format`: use load_facebook_vectors to load embeddings only (faster, less CPU/memory usage, does not support training continuation) and load_facebook_model to load full model (slower, more CPU/memory intensive, supports training continuation) - `gensim.mo	Low	4/12/2020
3.8.1	## 3.8.1, 2019-09-23 ### :red_circle: Bug fixes * Fix usage of base_dir instead of BASE_DIR in _load_info in downloader. (__[movb](https://github.com/movb)__, [#2605](https://github.com/RaRe-Technologies/gensim/pull/2605)) * Update the version of smart_open in the setup.py file (__[AMR-KELEG](https://github.com/AMR-KELEG)__, [#2582](https://github.com/RaRe-Technologies/gensim/pull/2582)) * Properly handle unicode_errors arg parameter when loading a vocab file (__[wmtzk](https://github.co	Low	9/26/2019
3.8.0	## 3.8.0, 2019-07-08 ## :warning: 3.8.x will be the last Gensim version to support Py2.7. Starting with 4.0.0, Gensim will only support Py3.5 and above ### :star2: New Features * Enable online training of Poincare models (__[koiizukag](https://github.com/koiizukag)__, [#2505](https://github.com/RaRe-Technologies/gensim/pull/2505)) * Make BM25 more scalable by adding support for generator inputs (__[saraswatmks](https://github.com/saraswatmks)__, [#2479](https://github.com/RaRe-Technolo	Low	7/9/2019
3.7.3	## 3.7.3, 2019-05-06 ### :red_circle: Bug fixes * Fix fasttext model loading from gzip files (__[mpenkov](https://github.com/mpenkov)__, [#2476](https://github.com/RaRe-Technologies/gensim/pull/2476)) * Clean up FastText Cython code, fix division by zero (__[mpenkov](https://github.com/mpenkov)__, [#2382](https://github.com/RaRe-Technologies/gensim/pull/2382)) * Update legacy model loading (__[mpenkov](https://github.com/mpenkov)__, [#2454](https://github.com/RaRe-Technologies/gensim/pul	Low	5/8/2019
3.7.2	## 3.7.2, 2019-04-06 ### :star2: New Features - `gensim.models.fasttext.load_facebook_model` function: load full model (slower, more CPU/memory intensive, supports training continuation) ```python >>> from gensim.test.utils import datapath >>> >>> cap_path = datapath("crime-and-punishment.bin") >>> fb_model = load_facebook_model(cap_path) >>> >>> 'landlord' in fb_model.wv.vocab # Word is out of vocabulary False >>> oov_term = fb_model.wv['landlord'] >>> >>	Low	4/10/2019
3.7.1	## 3.7.1, 2019-01-31 ### :+1: Improvements * NMF optimization & documentation (__[@anotherbugmaster](https://github.com/anotherbugmaster)__, [#2361](https://github.com/RaRe-Technologies/gensim/pull/2361)) * Optimize `FastText.load_fasttext_model` (__[@mpenkov](https://github.com/mpenkov)__, [#2340](https://github.com/RaRe-Technologies/gensim/pull/2340)) * Add warning when string is used as argument to `Doc2Vec.infer_vector` (__[@tobycheese](https://github.com/tobycheese)__, [#2347](https	Low	1/31/2019
3.7.0	## 3.7.0, 2019-01-18 ### :star2: New features * Fast Online NMF (__[@anotherbugmaster](https://github.com/anotherbugmaster)__, [#2007](https://github.com/RaRe-Technologies/gensim/pull/2007)) - Benchmark `wiki-english-20171001` \| Model \| Perplexity \| Coherence \| L2 norm \| Train time (minutes) \| \|-------\|------------\|-----------\|---------\|----------------------\| \| LDA \| 4727.07 \| -2.514 \| 7.372 \| 138 \| \| NMF \| 975.74 \| -2.814 \| 7.265 \| 73 \|	Low	1/18/2019
3.6.0	## 3.6.0, 2018-09-20 ### :star2: New features * File-based training for `2Vec` models (__[@persiyanov](https://github.com/persiyanov)__, [#2127](https://github.com/RaRe-Technologies/gensim/pull/2127) & [#2078](https://github.com/RaRe-Technologies/gensim/pull/2078) & [#2048](https://github.com/RaRe-Technologies/gensim/pull/2048)) [Blog post / Jupyter tutorial](https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/Any2Vec_Filebased.ipynb). New training mode for `	Low	9/20/2018
3.5.0	## 3.5.0, 2018-07-06 This release comprises a glorious 38 pull requests from 28 contributors. Most of the effort went into improving the documentation—hence the release code name "Docs 💬"! Apart from the massive overhaul of all Gensim documentation (including docstring style and examples—[you asked for it](https://rare-technologies.com/gensim-survey-2018/)), we also managed to sneak in some new functionality and a number of bug fixes. As usual, see the notes below for a complete list,	Low	7/6/2018
3.4.0	## 3.4.0, 2018-03-01 ### :star2: New features: * Massive optimizations of `gensim.models.LdaModel`: much faster training, using Cython. (__[@arlenk](https://github.com/arlenk)__, [#1767](https://github.com/RaRe-Technologies/gensim/pull/1767)) - Training benchmark :boom: \| dataset \| old LDA [sec] \| optimized LDA [sec] \| speed up \| \|---------\|---------------\|---------------------\|---------\| \| nytimes \| 3473 \| 1975 \| 1.76x \| \| enron \| 774 \| 437 \| 1.77x \|	Low	3/1/2018
3.3.0	## 3.3.0, 2018-02-02 :star2: New features: * Re-designed all "2vec" implementations (__[@manneshiva](https://github.com/manneshiva)__, [#1777](https://github.com/RaRe-Technologies/gensim/pull/1777)) - Modular organization of `Word2Vec`, `Doc2Vec`, `FastText`, etc ..., making it easier to add new models in the future and re-use code - Fully backward compatible (even with loading models stored by a previous Gensim version) - [Detailed documentation for the 2vec refactoring pro	Low	2/2/2018
3.2.0	## 3.2.0, 2017-12-09 :star2: New features: * New download API for corpora and pre-trained models (__[@chaitaliSaini](https://github.com/chaitaliSaini)__ & __[@menshikh-iv](https://github.com/menshikh-iv)__, [#1705](https://github.com/RaRe-Technologies/gensim/pull/1705) & [#1632](https://github.com/RaRe-Technologies/gensim/pull/1632) & [#1492](https://github.com/RaRe-Technologies/gensim/pull/1492)) - Download large NLP datasets in one line of Python, then use with memory-efficien	Low	12/9/2017
3.1.0	## 3.1.0, 2017-11-06 :star2: New features: * Massive optimizations to LSI model training (__[@isamaru](https://github.com/isamaru)__, [#1620](https://github.com/RaRe-Technologies/gensim/pull/1620) & [#1622](https://github.com/RaRe-Technologies/gensim/pull/1622)) - LSI model allows use of single precision (float32), to consume 40% less memory while being 40% faster. - LSI model can now also accept CSC matrix as input, for further memory and speed boost. - Overall, if your enti	Low	11/6/2017
3.0.1	## 3.0.1, 2017-10-12 :red_circle: Bug fixes: * Fix Keras import, speedup importing time. Fix #1614 (@menshikh-v, [#1615](https://github.com/RaRe-Technologies/gensim/pull/1615)) * Fix Sphinx warnings and retrieve all missing .rst (@anotherbugmaster and @menshikh-iv, [#1612](https://github.com/RaRe-Technologies/gensim/pull/1612)) * Fix logger message in lsi_dispatcher (@lorosanu, [#1603](https://github.com/RaRe-Technologies/gensim/pull/1603)) :books: Tutorial and doc improvements: * Fi	Low	10/12/2017
3.0.0	## 3.0.0, 2017-09-27 :star2: New features: * Add unsupervised FastText to Gensim (@chinmayapancholi13, [#1525](https://github.com/RaRe-Technologies/gensim/pull/1525)) * Add sklearn API for gensim models (@chinmayapancholi13, [#1462](https://github.com/RaRe-Technologies/gensim/pull/1462)) * Add callback metrics for LdaModel and integration with Visdom (@parulsethi, [#1399](https://github.com/RaRe-Technologies/gensim/pull/1399)) * Add TranslationMatrix model (@robotcator, [#1434](https://	Low	9/27/2017
2.3.0	## 2.3.0, 2017-07-25 :star2: New features: * Add Dockerfile for gensim with external wrappers (@parulsethi, [#1368](https://github.com/RaRe-Technologies/gensim/pull/1368)) * Add sklearn wrapper for Word2Vec (@chinmayapancholi13, [#1437](https://github.com/RaRe-Technologies/gensim/pull/1437)) * Add loss function for Word2Vec. Fix #999 (@chinmayapancholi13, [#1201](https://github.com/RaRe-Technologies/gensim/pull/1201)) * Add sklearn wrapper for AuthorTopic model (@chinmayapancholi13, [#1	Low	7/25/2017
2.2.0	## 2.2.0, 2017-06-21 :star2: New features: * Add sklearn wrapper for RpModel (@chinmayapancholi13, [#1395](https://github.com/RaRe-Technologies/gensim/pull/1395)) * Add sklearn wrappers for LdaModel and LsiModel (@chinmayapancholi13, [#1398](https://github.com/RaRe-Technologies/gensim/pull/1398)) * Add sklearn wrapper for LdaSeq (@chinmayapancholi13, [#1405](https://github.com/RaRe-Technologies/gensim/pull/1405)) * Add keras wrapper for Word2Vec model (@chinmayapancholi13, [#1248](https	Low	6/21/2017
2.1.0	## 2.1.0, 2017-05-12 :star2: New features: * Add modified save_word2vec_format for Doc2Vec, to save document vectors. (@parulsethi, [#1256](https://github.com/RaRe-Technologies/gensim/pull/1256)) :+1: Improvements: * Add automatic code style check limited only to the code modified in PR (@tmylk, [#1287](https://github.com/RaRe-Technologies/gensim/pull/1287)) * Replace `logger.warn` by `logger.warning` (@chinmayapancholi13, [#1295](https://github.com/RaRe-Technologies/gensim/pull/1295)	Low	5/12/2017
2.0.0	Breaking changes: Any direct calls to method train() of Word2Vec/Doc2Vec now require an explicit epochs parameter and explicit estimate of corpus size. The most usual way to call `train` is `vec_model.train(sentences, total_examples=self.corpus_count, epochs=self.iter)` See the [method documentation](https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/word2vec.py#L766) for more information. * Explicit epochs and corpus size in word2vec train(). (@gojomo, @robotcator	Low	4/10/2017
1.0.1	* Rebuild cumulative table on load. Fix #1180. (@tmylk,[#1181](https://github.com/RaRe-Technologies/gensim/pull/893)) * most_similar_cosmul bug fix (@dkim010, [#1177](https://github.com/RaRe-Technologies/gensim/pull/1177)) * Fix loading old word2vec models pre-1.0.0 (@jayantj, [#1179](https://github.com/RaRe-Technologies/gensim/pull/1179)) * Load utf-8 words in fasttext (@jayantj, [#1176](https://github.com/RaRe-Technologies/gensim/pull/1176))	Low	3/4/2017
1.0.0	1.0.0, 2017-02-24 Deprecated methods: In order to share word vector querying code between different training algos(Word2Vec, Fastext, WordRank, VarEmbed) we have separated storage and querying of word vectors into a separate class `KeyedVectors`. Two methods and several attributes in word2vec class have been deprecated. The methods are `load_word2vec_format` and `save_word2vec_format`. The attributes are `syn0norm`, `syn0`, `vocab`, `index2word` . They have been moved to `KeyedVectors`	Low	2/24/2017
1.0.0rc2	1.0.0RC2, 2017-02-16 Deprecated methods: In order to share word vector querying code between different training algos(Word2Vec, Fastext, WordRank, VarEmbed) we have separated storage and querying of word vectors into a separate class `KeyedVectors`. Two methods and several attributes in word2vec class have been deprecated. The methods are `load_word2vec_format` and `save_word2vec_format`. The attributes are `syn0norm`, `syn0`, `vocab`, `index2word` . They have been moved to `KeyedVectors` c	Low	2/17/2017
0.13.4.1	0.13.4.1, 2017-01-04 - Disable direct access warnings on save and load of Word2vec/Doc2vec (@tmylk, [#1072](https://github.com/RaRe-Technologies/gensim/pull/1072)) - Making Default hs error explicit (@accraze, [#1054](https://github.com/RaRe-Technologies/gensim/pull/1054)) - Removed unnecessary numpy imports (@bhargavvader, [#1065](https://github.com/RaRe-Technologies/gensim/pull/1065)) - Utils and Matutils changes (@bhargavvader, [#1062](https://github.com/RaRe-Technologies/gensim/pull/1062)	Low	1/4/2017
0.13.4	# Deprecation warning After upgrading to this release you might see deprecation warnings like this: ``` WARNING:gensim.models.word2vec:direct access to syn0norm will not be supported in future gensim releases, please use model.wv.syn0norm ``` These warnings are correct and you are encouraged to change your Word2vec/Doc2vec code to use the new model.wv.syn0norm and model.wv.vocab fields instead of old direct access like model.syn0norm and model.vocab. The direct access will be deprecated in Fe	Low	12/25/2016
0.13.3	0.13.3, 2016-10-20 - Add vocabulary expansion feature to word2vec. (@isohyt, [#900](https://github.com/RaRe-Technologies/gensim/pull/900)) - Tutorial: Reproducing Doc2vec paper result on wikipedia. (@isohyt, [#654](https://github.com/RaRe-Technologies/gensim/pull/654)) - Add Save/Load interface to AnnoyIndexer for index persistence (@fortiema, [#845](https://github.com/RaRe-Technologies/gensim/pull/845)) - Fixed issue [#938](https://github.com/RaRe-Technologies/gensim/issues/938),Creating a unif	Low	10/21/2016
0.13.2	0.13.2, 2016-08-19 - wordtopics has changed to word_topics in ldamallet, and fixed issue #764. (@bhargavvader, [#771](https://github.com/RaRe-Technologies/gensim/pull/771)) - assigning wordtopics value of word_topics to keep backward compatibility, for now - topics, topn parameters changed to num_topics and num_words in show_topics() and print_topics()(@droudy, [#755](https://github.com/RaRe-Technologies/gensim/pull/755)) - In hdpmodel and dtmmodel - NOT BACKWARDS COMPATIBLE! - Added rand	Low	8/26/2016
0.13.1	Initial release of Topic Coherence C_v and U_mass. More work will be done here but external API will remain the same.	Low	6/23/2016
0.13.0	0.12.5, 2016 Tutorials migrated from website to ipynb (@j9chan, #721), (@jesford, #733, #725, 716) New doc2vec intro tutorial (@seanlaw, #730) Gensim Quick Start Tutorial (@andrewjlm, #727) Add export_phrases(sentences) to model Phrases (hanabi1224 #588) SparseMatrixSimilarity returns a sparse matrix if maintain_sparsity is True (@davechallis, #590) added functionality for Topics of Words in document - i.e, dynamic topics. (@bhargavvader, #704) also included tutorial which explains new function	Low	6/22/2016
0.13.0rc1	# Changes 0.12.5, 2016 - Tutorials migrated from website to ipynb (@j9chan, #721), (@jesford, #733, #725, 716) - New doc2vec intro tutorial (@seanlaw, #730) - Gensim Quick Start Tutorial (@andrewjlm, #727) - Add export_phrases(sentences) to model Phrases (hanabi1224 #588) - SparseMatrixSimilarity returns a sparse matrix if `maintain_sparsity` is True (@davechallis, #590) - added functionality for Topics of Words in document - i.e, dynamic topics. (@bhargavvader, #704) - also included tutorial	Low	6/10/2016
0.12.4	- Word2vec in line with original word2vec.c (Andrey Kutuzov, #538) - Same default values. See diff https://github.com/akutuzov/gensim/commit/6456cbcd75e6f8720451766ba31cc046b4463ae2 - Standalone script with command line arguments matching those of original C tool. Usage ./word2vec_standalone.py -train data.txt -output trained_vec.txt -size 200 -window 2 -sample 1e-4 - load_word2vec_format() performance (@svenkreiss, #555) - Remove `init_sims()` call for performance improvements when normali	Low	1/31/2016
0.12.3	0.12.3rc1, 05/11/2015 - Make show_topics return value consistent across models (Christopher Corley, #448) - All models with the `show_topics` method should return a list of `(topic_number, topic)` tuples, where `topic` is a list of `(word, probability)` tuples. - This is a breaking change that affects users of the `LsiModel`, `LdaModel`, and `LdaMulticore` that may be reliant on the old tuple layout of `(probability, word)`. - Mixed integer & string document-tags (key	Low	11/6/2015
0.12.3rc1	0.12.3rc1, 05/11/2015 - Make show_topics return value consistent across models (Christopher Corley, #448) - All models with the `show_topics` method should return a list of `(topic_number, topic)` tuples, where `topic` is a list of `(word, probability)` tuples. - This is a breaking change that affects users of the `LsiModel`, `LdaModel`, and `LdaMulticore` that may be reliant on the old tuple layout of `(probability, word)`. - Mixed integer & string document-tags (keys to doc	Low	11/5/2015
