curiosity-ai/catalyst

🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.

57 / 100 (Established)

Built in pure C# with .NET Standard 2.0 support, it achieves >1M tokens/second through regex-free tokenization and supports multiple NER approaches (gazetteer, rule-based patterns, and perceptron models). Models serialize efficiently via MessagePack, and the library integrates with FastText/StarSpace for embedding training. Companion libraries provide HNSW similarity search and UMAP dimensionality reduction, and language-specific models are distributed as modular NuGet packages trained on Universal Dependencies data.

836 stars. Actively maintained with 7 commits in the last 30 days.

No Package · No Dependents
Maintenance 20 / 25
Adoption 10 / 25
Maturity 9 / 25
Community 18 / 25

Stars: 836
Forks: 83
Language: C#
License: MIT
Last pushed: Mar 09, 2026
Commits (30d): 7

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/curiosity-ai/catalyst"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
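
The same request can be made from a script. The sketch below is a minimal Python version of the curl command above, using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical. Treating the path as `<category>/<owner>/<repo>` is an assumption inferred from the single example URL, and so is the expectation that the response body is JSON; neither is documented here.

```python
import json
import urllib.request

# Base path copied from the curl example; everything after it is inferred.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumes the path layout is <category>/<owner>/<repo>, as in the
    one example URL shown above.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Request the data without an API key and decode the body.

    Assumes (unverified) that the service returns JSON. No fields are
    read here, because the response schema is not documented above.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("embeddings", "curiosity-ai/catalyst")` requests the same URL as the curl command. Keyless access is limited to 100 requests/day, so a script that polls many repositories may want to cache results.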