OFA-Sys/Chinese-CLIP

A Chinese version of CLIP for Chinese cross-modal retrieval and representation generation.

48 / 100 (Emerging)

Trained on roughly 200 million Chinese image-text pairs, it pairs vision encoders (ResNet50 through ViT-H) with RoBERTa or RBT3 text encoders optimized for Chinese semantic alignment. The framework supports several deployment formats (ONNX, TensorRT, CoreML) and includes training features such as FlashAttention, gradient accumulation, and knowledge distillation for efficient fine-tuning on downstream tasks.
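To make the retrieval idea concrete, here is a minimal sketch of how a CLIP-style model ranks images against a text query: both sides are embedded into one vector space and candidates are sorted by cosine similarity. The three-dimensional vectors, the file names, and the helper names `cosine` and `rank_images` are made up for illustration; they are not outputs or APIs of Chinese-CLIP.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def rank_images(text_vec, image_vecs):
    """Return image names sorted by descending similarity to the text vector."""
    scores = {name: cosine(text_vec, vec) for name, vec in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy embeddings (hypothetical values, not real model outputs).
text_vec = [0.9, 0.1, 0.0]  # stands in for an encoded Chinese text query
image_vecs = {
    "cat.jpg": [0.8, 0.2, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
    "dog.jpg": [0.5, 0.5, 0.0],
}

ranking = rank_images(text_vec, image_vecs)
print(ranking)  # ['cat.jpg', 'dog.jpg', 'car.jpg']
```

In a real pipeline the vectors would come from the model's image and text encoders, and the ranking step would usually run as a batched matrix product rather than a Python loop.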

5,820 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 20 / 25
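The four sub-scores listed here add up to the overall score of 48 / 100. The short check below assumes the total is a plain sum of the four 25-point components, which matches the numbers on this page but is not confirmed as the scoring method.

```python
# Sub-scores as listed on this page, each out of 25.
sub_scores = {"Maintenance": 2, "Adoption": 10, "Maturity": 16, "Community": 20}

# Assumption: the overall score is the unweighted sum of the components.
total = sum(sub_scores.values())
maximum = 25 * len(sub_scores)

print(f"{total} / {maximum}")  # 48 / 100
```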


Stars: 5,820
Forks: 548
Language: Jupyter Notebook
License: MIT
Last pushed: Aug 29, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/OFA-Sys/Chinese-CLIP"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
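The same request can be made from Python with only the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the URL pattern (category, then owner/repo) is inferred from the single curl example above, and the response is assumed to be JSON, with no particular fields relied on.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, repo):
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    return f"{BASE_URL}/{urllib.parse.quote(category)}/{urllib.parse.quote(repo)}"

def fetch_quality(category, repo, timeout=10):
    """Fetch and decode the response, assuming the endpoint returns JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    data = fetch_quality("transformers", "OFA-Sys/Chinese-CLIP")
    print(json.dumps(data, indent=2, ensure_ascii=False))
```

With a limit of 100 unauthenticated requests per day, caching the decoded result locally is a reasonable choice for anything that polls repeatedly.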