babylonhealth/fastText_multilingual

Multilingual word vectors in 78 languages

45 / 100 (Emerging)

Provides pre-computed SVD-based alignment matrices that project fastText's monolingual word vectors from 78 languages into a shared vector space, enabling cross-lingual similarity and translation prediction via nearest-neighbor lookup. Each language matrix is learned by aligning against English using bilingual dictionaries derived from Google Translate, achieving ~73% precision@1 for translation retrieval while preserving original monolingual relationships. The approach requires only applying a linear transformation to existing fastText vectors—no retraining needed.
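The alignment idea described above can be sketched with NumPy. This is a minimal illustration, not the repository's own code: it assumes the SVD-based alignment takes the form of an orthogonal Procrustes solution, uses synthetic vectors in place of real fastText embeddings, and the function names (`normalize_rows`, `learn_alignment`, `nearest_neighbor`) are hypothetical.

```python
import numpy as np


def normalize_rows(matrix):
    """Scale every row to unit length so dot products are cosine similarities."""
    return matrix / np.linalg.norm(matrix, axis=1, keepdims=True)


def learn_alignment(source_vecs, target_vecs):
    """Orthogonal W minimizing ||source_vecs @ W - target_vecs|| (Procrustes).

    Row i of each matrix holds the two vectors of one dictionary pair.
    """
    u, _, vt = np.linalg.svd(source_vecs.T @ target_vecs)
    return u @ vt


def nearest_neighbor(query, candidates):
    """Index of the candidate row with the highest cosine similarity to query."""
    sims = normalize_rows(candidates) @ (query / np.linalg.norm(query))
    return int(np.argmax(sims))


# Synthetic stand-in for real fastText vectors (an assumption for this demo):
# the "foreign" space is an exact rotation of the "English" space, so a
# perfect alignment exists. Real embeddings only align approximately.
rng = np.random.default_rng(0)
dim, n_pairs = 5, 50
english = normalize_rows(rng.normal(size=(n_pairs, dim)))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
foreign = english @ rotation

# Learn the matrix from the paired rows, then apply it as a plain linear map.
W = learn_alignment(foreign, english)
aligned = foreign @ W

# "Translate" foreign word 3 by looking up its nearest English neighbor.
translation_index = nearest_neighbor(aligned[3], english)
```

Because `W` is orthogonal in this sketch, applying it changes neither vector lengths nor the angles between vectors of the same language, which is one way to read the claim that monolingual relationships are preserved.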

1,202 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 19 / 25


Stars: 1,202
Forks: 120
Language: Jupyter Notebook
License: BSD-3-Clause
Last pushed: Mar 10, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/babylonhealth/fastText_multilingual"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
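The same request can be made from Python with the standard library. This is a sketch under stated assumptions: the path layout (category, then owner, then repository) is inferred from the single URL shown above, the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is returned as raw text because its format is not documented here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; the segment layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the raw response body as text (UTF-8 is assumed, not confirmed)."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the URL needs no network access; call fetch_quality(...) to send it.
url = quality_url("embeddings", "babylonhealth", "fastText_multilingual")
```

Keeping URL construction separate from the network call makes it possible to check the address without spending any of the daily request quota.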