DRSY/MoTIS

[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)

29 / 100
Experimental

Implements knowledge-distilled dual encoders (6-12 layer Transformers) that reach CLIP-parity retrieval on MS COCO while shrinking the model to 85-146MB and speeding up inference by 1.6-2.9x through layer pruning and supervised distillation. Provides multiple indexing strategies (linear scan, KMeans, Spotify Annoy) with lazy loading, which encodes high-resolution images in the background while thumbnails are displayed. Also transpiles CLIP's tokenizer and preprocessing pipeline into native Swift/iOS via TorchScript.
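The simplest of the indexing strategies listed above, linear scan, can be sketched as follows. This is a minimal illustration in Python, not code from the MoTIS repository (which is written in Swift): the function names `normalize` and `linear_scan` and the toy 3-dimensional embeddings are hypothetical, and cosine similarity is assumed as the ranking score, as is usual for CLIP-style dual encoders.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def linear_scan(query, image_embeddings, top_k=3):
    """Score every image embedding against the query and return the best top_k.

    Returns a list of (image_index, cosine_similarity) pairs, best match first.
    """
    q = normalize(query)
    scored = []
    for idx, emb in enumerate(image_embeddings):
        e = normalize(emb)
        score = sum(a * b for a, b in zip(q, e))
        scored.append((score, idx))
    # Sort by descending score; break ties by the lower image index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(idx, score) for score, idx in scored[:top_k]]


# Toy data standing in for a text embedding and three image embeddings.
images = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query = [0.9, 0.1, 0.0]
results = linear_scan(query, images, top_k=2)
print(results)
```

Linear scan compares the query against every image, so its cost grows linearly with the size of the photo library. KMeans and Annoy trade some exactness for speed by narrowing the search to a subset of candidates.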

126 stars. No commits in the last 6 months.

No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 11 / 25


Stars: 126
Forks: 10
Language: Swift
License: None
Last pushed: May 11, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/DRSY/MoTIS"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
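The same request can be made from a script. The sketch below rests on assumptions: the `category/owner/repo` path layout is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical rather than part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example; the segment layout after it is inferred.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the quality endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record, assuming the endpoint returns a JSON body."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = quality_url("embeddings", "DRSY", "MoTIS")
print(url)
```

Calling `fetch_quality("embeddings", "DRSY", "MoTIS")` performs the network request and counts against the daily request limit.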