DunnBC22/Vision_Audio_and_Multimodal_Projects
This repository includes all computer vision, audio, document AI, and multimodal projects.
Score: 34 / 100 (Emerging)
No commits in the last 6 months.
No License
Stale 6m
No Package
No Dependents
Maintenance: 0 / 25
Adoption: 8 / 25
Maturity: 8 / 25
Community: 18 / 25
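The four sub-scores listed here add up to the overall score of 34 out of 100. That is an observation from the numbers on this page, not a documented formula, so the sketch below simply checks the arithmetic. The variable names are illustrative.

```python
# Sub-scores as listed on this page, each out of 25.
# Assumption: the overall score is the plain sum of the four sub-scores;
# the page does not state this, but the numbers shown are consistent with it.
subscores = {
    "Maintenance": 0,
    "Adoption": 8,
    "Maturity": 8,
    "Community": 18,
}

total = sum(subscores.values())
max_total = 25 * len(subscores)

print(f"{total} / {max_total}")  # prints "34 / 100"
```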
Stars: 51
Forks: 12
Language: Jupyter Notebook
License: none listed
Category
Last pushed: Jun 07, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/DunnBC22/Vision_Audio_and_Multimodal_Projects"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
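The same request can be made from Python with only the standard library. This is a minimal sketch, not an official client. The function names `quality_url` and `fetch_quality` are hypothetical, and calling the first path segment (`transformers`) a "category" is an assumption drawn from the URL above. The response format is not described on this page, so the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base of the endpoint shown in the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, following the pattern of the curl command.

    Assumption: the path is <category>/<owner>/<repo>; only one example
    URL is shown on the page, so this generalization is a guess.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Return the raw response body as text.

    Assumption: the body is UTF-8 encoded. Its structure is not documented
    here, so no parsing is attempted.
    """
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (performs a network request, so it is left commented out):
# print(fetch_quality("transformers", "DunnBC22", "Vision_Audio_and_Multimodal_Projects"))
```

Without a key, the limit of 100 requests per day stated above applies, so cache results instead of calling `fetch_quality` in a loop.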
Higher-rated alternatives
dorarad/gansformer (score: 47)
Generative Adversarial Transformers
j-min/VL-T5 (score: 46)
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
invictus717/MetaTransformer (score: 44)
Meta-Transformer for Unified Multimodal Learning
Yachay-AI/byt5-geotagging (score: 42)
Confidence and ByT5-based geotagging model predicting coordinates from text alone.
zinengtang/TVLT (score: 39)
PyTorch code for "TVLT: Textless Vision-Language Transformer" (NeurIPS 2022 Oral)