image_captioning_with_transformers and pytorch-image-captioning
These are competitors: both are standalone PyTorch implementations of transformer-based image captioning with no dependency relationship between them, so a user would choose one based on implementation details and code quality rather than use them together.
About image_captioning_with_transformers
zarzouram/image_captioning_with_transformers
PyTorch implementation of image captioning using a transformer-based model.
Implements an encoder-decoder transformer architecture with per-head attention visualization, enabled by modifying PyTorch's standard multi-head attention to expose each head's attention weights for detailed analysis. The model is trained on MS COCO 2017, decodes with beam search at inference time, and is evaluated with standard NLG metrics (BLEU, METEOR, GLEU). It also includes a preprocessing pipeline that builds image-caption datasets in HDF5 storage, plus TensorBoard integration for training monitoring.
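The beam search mentioned above keeps the k highest-scoring partial captions at each decoding step instead of greedily committing to the single best next token. A minimal, model-agnostic sketch of the idea (the `score_fn` interface, `toy_scores`, and the token names are illustrative assumptions, not code from either repository):

```python
import heapq
import math

def beam_search(score_fn, start_token, end_token, beam_width=3, max_len=10):
    """Generic beam search decoder.

    score_fn(sequence) must return a dict mapping each candidate
    next token to its log-probability given the sequence so far.
    """
    # Each beam entry is (cumulative log-prob, token sequence).
    beams = [(0.0, [start_token])]
    completed = []
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            if seq[-1] == end_token:
                completed.append((logp, seq))
                continue
            for tok, tok_logp in score_fn(seq).items():
                candidates.append((logp + tok_logp, seq + [tok]))
        if not candidates:
            break  # every surviving hypothesis has ended
        # Keep only the top-k partial hypotheses by cumulative log-prob.
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    completed.extend(b for b in beams if b[1][-1] == end_token)
    best = max(completed or beams, key=lambda c: c[0])
    return best[1]

# Toy "language model" standing in for the transformer decoder:
# prefers "a" for the first tokens, then strongly prefers ending.
def toy_scores(seq):
    if len(seq) >= 3:
        return {"<end>": math.log(0.9), "a": math.log(0.1)}
    return {"a": math.log(0.6), "b": math.log(0.3), "<end>": math.log(0.1)}

print(beam_search(toy_scores, "<start>", "<end>"))
```

In a captioning model, `score_fn` would run the decoder conditioned on the image features; summing log-probabilities keeps the beam comparison numerically stable.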
About pytorch-image-captioning
senadkurtisi/pytorch-image-captioning
Transformer & CNN Image Captioning model in PyTorch.