# langcodes

> Tools for labeling human languages with IETF language tags

- **URL**: https://www.freshcrate.ai/projects/langcodes
- **Author**: pypi
- **Category**: Developer Tools
- **Latest version**: `3.5.1` (2026-04-21)
- **License**: MIT (as stated in the README below)
- **Source**: https://github.com/georgkrause/langcodes
- **Homepage**: https://pypi.org/project/langcodes/
- **Language**: Python
- **GitHub**: 28 stars, 9 forks
- **Registry**: pypi (`langcodes`)
- **Tags**: `pypi`

## Description

# Langcodes: a library for language codes

**langcodes** knows what languages are. It knows the standardized codes that
refer to them, such as `en` for English, `es` for Spanish and `hi` for Hindi.

These are [IETF language tags][]. You may know them by their old name, ISO 639
language codes. IETF has done some important things for backward compatibility
and supporting language variations that you won't find in the ISO standard.

[IETF language tags]: https://www.w3.org/International/articles/language-tags/

It may sound to you like langcodes solves a pretty boring problem. At one
level, that's right. Sometimes you have a boring problem, and it's great when a
library solves it for you.

But there's an interesting problem hiding in here. How do you work with
language codes? How do you know when two different codes represent the same
thing? How should your code represent relationships between codes, like the
following?

* `eng` is equivalent to `en`.
* `fra` and `fre` are both equivalent to `fr`.
* `en-GB` might be written as `en-gb` or `en_GB`. Or as `en-UK`, which is
  erroneous, but should be treated as the same.
* `en-CA` is not exactly equivalent to `en-US`, but it's really, really close.
* `en-Latn-US` is equivalent to `en-US`, because written English must be written
  in the Latin alphabet to be understood.
* The difference between `ar` and `arb` is the difference between "Arabic" and
  "Modern Standard Arabic", a difference that may not be relevant to you.
* You'll find Mandarin Chinese tagged as `cmn` on Wiktionary, but many other
  resources would call the same language `zh`.
* Chinese is written in different scripts in different territories. Some
  software distinguishes the script. Other software distinguishes the territory.
  The result is that `zh-CN` and `zh-Hans` are used interchangeably, as are
  `zh-TW` and `zh-Hant`, even though occasionally you'll need something
  different such as `zh-HK` or `zh-Latn-pinyin`.
* The Indonesian (`id`) and Malaysian (`ms` or `zsm`) languages are mutually
  intelligible.
* `jp` is not a language code. (The language code for Japanese is `ja`, but
  people confuse it with the country code for Japan.)

One way to know is to read IETF standards and Unicode technical reports.
Another way is to use a library that implements those standards and guidelines
for you, which langcodes does.
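
To give a sense of what even the simplest item in the list above involves, here is a minimal sketch of the formatting step alone: treating `en_GB` and `en-gb` as `en-GB`. It is not how langcodes is implemented. The function name `format_tag` is hypothetical, and the sketch assumes the usual BCP 47 casing conventions (lowercase language, title-case script, uppercase region) while ignoring extensions and private-use subtags.

```python
def format_tag(tag: str) -> str:
    """Normalize the separators and letter case of a language tag.

    Illustrative only: this handles formatting, not code replacement
    or validation, and it ignores extension and private-use subtags.
    """
    subtags = tag.replace("_", "-").split("-")
    formatted = []
    for position, subtag in enumerate(subtags):
        if position == 0:
            formatted.append(subtag.lower())   # language: lowercase
        elif len(subtag) == 4 and subtag.isalpha():
            formatted.append(subtag.title())   # script: title case
        elif len(subtag) == 2 and subtag.isalpha():
            formatted.append(subtag.upper())   # region: uppercase
        else:
            formatted.append(subtag.lower())   # anything else: lowercase
    return "-".join(formatted)


print(format_tag("en_GB"))       # en-GB
print(format_tag("ZH-hans-cn"))  # zh-Hans-CN
```

Everything else in the list, such as knowing that `en-UK` means `en-GB` or that `zh-CN` and `zh-Hans` are close matches, needs data that formatting rules cannot supply.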

When you're working with these short language codes, you may want to see the
name that the language is called _in_ a language: `fr` is called "French" in
English. That language doesn't have to be English: `fr` is called "français" in
French. A supplement to langcodes, [`language_data`][language-data], provides
this information.

[language-data]: https://github.com/rspeer/language_data

langcodes is maintained by Elia Robyn Lake a.k.a. Robyn Speer, and is released
as free software under the MIT license.


## Standards implemented

Although this is not the only reason to use it, langcodes will make you more
acronym-compliant.

langcodes implements [BCP 47](http://tools.ietf.org/html/bcp47), the IETF Best
Current Practices on Tags for Identifying Languages. BCP 47 is also known as
RFC 5646. It subsumes ISO 639 and is backward compatible with it. langcodes
also implements recommendations from the [Unicode CLDR](http://cldr.unicode.org).

langcodes can also refer to a database of language properties and names, built
from Unicode CLDR and the IANA subtag registry, if you install `language_data`.

In summary, langcodes takes language codes and does the Right Thing with them,
and if you want to know exactly what the Right Thing is, there are some
documents you can go read.


# Documentation

## Standardizing language tags

The `standardize_tag` function standardizes tags, as strings, in several ways.

It replaces overlong tags with their shortest version, and also formats them
according to the conventions of BCP 47:

    >>> from langcodes import standardize_tag
    >>> standardize_tag('eng_US')
    'en-US'

It removes script subtags that are redundant with the language:

    >>> standardize_tag('en-Latn')
    'en'

It replaces deprecated values with their correct versions, if possible:

    >>> standardize_tag('en-uk')
    'en-GB'

Sometimes this involves complex substitutions, such as replacing Serbo-Croatian
(`sh`) with Serbian in Latin script (`sr-Latn`), or the entire tag `sgn-US`
with `ase` (American Sign Language). In the first example below, the region
code `QU` is also replaced with `EU`:

    >>> standardize_tag('sh-QU')
    'sr-Latn-EU'

    >>> standardize_tag('sgn-US')
    'ase'
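
The simpler replacements shown above (overlong codes such as `eng`, erroneous regions such as `UK`) can be pictured as table lookups. The sketch below is only an illustration of that idea, not the library's code. The tables `LANGUAGE_ALIASES` and `REGION_ALIASES` and the function `replace_aliases` are hypothetical, and they cover only examples that appear in this document.

```python
# Hypothetical, hand-written tables covering only examples from this document.
# A real implementation would draw on the IANA subtag registry and Unicode CLDR.
LANGUAGE_ALIASES = {"eng": "en", "fra": "fr", "fre": "fr", "spa": "es"}
REGION_ALIASES = {"UK": "GB"}


def replace_aliases(tag: str) -> str:
    """Replace known overlong or erroneous subtags in an already formatted tag."""
    language, *rest = tag.split("-")
    # The first subtag is the language; look it up in the language table.
    language = LANGUAGE_ALIASES.get(language, language)
    # Later subtags are checked against the (tiny) region table.
    rest = [REGION_ALIASES.get(subtag, subtag) for subtag in rest]
    return "-".join([language, *rest])


print(replace_aliases("eng-US"))  # en-US
print(replace_aliases("en-UK"))   # en-GB
```

A per-subtag lookup like this cannot express the complex cases: `sh` expands into two subtags (`sr-Latn`), and `sgn-US` collapses a whole tag into `ase`. Those need rules that look at the tag as a whole.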

If *macro* is True, it uses macrolanguage codes as a replacement for the most
common standardized language within that macrolanguage.

    >>> standardize_tag('arb-Arab', macro=True)
    'ar'

Even when *macro* is False, it shortens tags that contain both the
macrolanguage and the language:

    >>> standardize_tag('zh-cmn-hans-cn')
    'zh-Hans-CN'

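If the tag can't be parsed according to BCP 47, this will raise a
`LanguageTagError` (a subclass of `ValueError`). A tag whose subtags appear in
a valid order, such as language, script, region, is standardized as usual: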

    >>> standardize_tag('spa-latn-mx')
    'es-MX'
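
To make the idea of a tag that "can't be parsed" concrete, here is a minimal sketch of an order check. It rests on a strong simplification: it accepts only the shape language, optional script, optional region, whereas real BCP 47 tags also allow extlangs, variants, extensions and private-use subtags. The names `SIMPLE_TAG`, `SimpleTagError` and `split_simple_tag` are hypothetical, and `spa-mx-latn` is a made-up input with the region placed before the script.

```python
import re

# Simplified pattern: language, then optional script, then optional region.
SIMPLE_TAG = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?$",
    re.IGNORECASE,
)


class SimpleTagError(ValueError):
    """Hypothetical stand-in for an error type such as LanguageTagError."""


def split_simple_tag(tag: str) -> dict:
    """Split a tag into named subtags, or raise if the order is not accepted."""
    match = SIMPLE_TAG.match(tag.replace("_", "-"))
    if match is None:
        raise SimpleTagError(
            f"cannot parse {tag!r} as language[-script][-region]"
        )
    # Keep only the subtags that were actually present.
    return {name: value for name, value in match.groupdict().items() if value}


print(split_simple_tag("spa-latn-mx"))

try:
    split_simple_tag("spa-mx-latn")  # region before script: rejected
except ValueError as error:
    print("rejected:", error)
```

Because the error subclasses `ValueError`, callers that already guard tag handling with `except ValueError` keep working, which is the same design choice the text describes for `LanguageTagError`.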

## Recent releases

| Version | Date | Urgency | Changes |
| --- | --- | --- | --- |
| `3.5.1` | 2026-04-21 | Low | Imported from PyPI (3.5.1) |
| `v3.5.1` | 2025-12-02 | Low | ## What's Changed * style: fix typos by @kianmeng in https://github.com/georgkrause/langcodes/pull/28 * fix: Do not install language-data by default by @georgkrause in https://github.com/georgkrause/langcodes/pull/30 * fix: Add warning that best_match is deprecated by @georgkrause in https://github.com/georgkrause/langcodes/pull/31 * chore(deps): update dependency python to 3.14 by @renovate[bot] in https://github.com/georgkrause/langcodes/pull/41 * chore(deps): update actions/upload-artifa |

## Citation

- HTML: https://www.freshcrate.ai/projects/langcodes
- Markdown: https://www.freshcrate.ai/projects/langcodes.md
- Dependencies JSON: https://www.freshcrate.ai/api/projects/langcodes/deps

_Generated by freshcrate.ai. Indexes pypi releases for AI-agent ecosystem packages._
