samchengcs/IKEA-Dataset
A dataset for multimodal machine translation
This dataset helps e-commerce professionals, localization specialists, and product managers improve multilingual communication. It provides product descriptions in English-French and English-German pairs, alongside product images, sourced from IKEA and Under Armour. This allows users to train and evaluate systems that translate product information more accurately by understanding both text and visuals.
No commits in the last 6 months.
Use this if you need a specialized dataset to train or test machine translation systems that leverage both text and images for product descriptions.
Not ideal if you need general-purpose text translation, data for domains outside of e-commerce products, or require a very large dataset for a single language pair.
Stars: 13
Forks: —
Language: —
License: MIT
Category:
Last pushed: Dec 06, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/samchengcs/IKEA-Dataset"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
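The same request can be sketched in Python for use inside a script. This is a minimal sketch, not an official client: the helper names `build_url` and `fetch_listing` are hypothetical, the path layout (category, owner, repository) is inferred only from the curl example above, and the response is assumed to be JSON, which the listing does not state.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    The (category, owner, repo) layout is an assumption inferred from
    that single example URL.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_listing(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send an unauthenticated GET request and decode the body.

    Assumes the endpoint returns JSON; json.loads raises ValueError
    if it returns something else.
    """
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_listing("nlp", "samchengcs", "IKEA-Dataset")` would issue the same request as the curl command. Since keyless access is limited to 100 requests per day, it is worth caching the result rather than calling it in a loop.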
Higher-rated alternatives
luheng/deep_srl
Code and pre-trained model for: Deep Semantic Role Labeling: What Works and What's Next
sileod/tasksource
Datasets collection and preprocessings framework for NLP extreme multitask learning
loomchild/maligna
Bilingual sentence aligner
CK-Explorer/DuoSubs
Semantic subtitle aligner and merger for bilingual subtitle syncing.
coastalcph/lex-glue
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English