iral-lab/gold

Multimodal grounded language dataset

13 / 100 (Experimental)

This dataset provides images, depth data, text descriptions, and speech recordings of common objects. It includes 207 object instances across 47 object classes in categories such as food, home, medical, office, and tools, each captured from multiple angles. Researchers and developers working on domestic robots or intelligent systems can use it to train and test how well their systems understand and describe objects from both visual and spoken input.

No commits in the last 6 months.

Use this if you are developing or evaluating AI models that need to connect spoken language with visual object information, especially for applications like robot interaction or object recognition.

Not ideal if you need data for a domain outside of common household/office objects, or if your application doesn't require multimodal data linking speech, text, and visual inputs.

robotics computer-vision speech-recognition natural-language-processing multimodal-AI
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 8 / 25
Community 0 / 25
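
The four category scores above add up to the overall 13 / 100 shown for this repository. A minimal check, assuming the overall score is a plain sum of the four categories (the listing does not state the formula, so treat this as an observation rather than the scoring method):

```python
# Category scores copied from the listing; each is out of 25.
subscores = {"Maintenance": 0, "Adoption": 5, "Maturity": 8, "Community": 0}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
max_total = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {max_total}")  # prints "13 / 100"
```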


Stars 11
Forks
Language
License
Last pushed Dec 14, 2021
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/iral-lab/gold"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
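
The same request can be made from a script. The sketch below uses only the Python standard library and rests on two assumptions that the listing does not confirm: that the path segments after `quality/` are a category (`nlp`), an owner, and a repository name, and that the endpoint returns JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical, chosen for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example in the listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL. The category/owner/repo layout is an assumption
    inferred from the single example URL, not a documented scheme."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumes the response body is JSON; json.loads raises an error otherwise.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "iral-lab", "gold")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result locally instead of re-fetching it on every run.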