promptfoo and promptfoo-action
promptfoo-action is a wrapper that integrates the core promptfoo testing framework into CI/CD pipelines; the two are complements designed to be used together rather than alternatives.
About promptfoo
promptfoo/promptfoo
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
Supports automated red teaming and vulnerability scanning through LLM-generated adversarial prompts, alongside traditional metric-based evaluations with custom grading logic. Executes tests locally with configurable providers (OpenAI, Anthropic, Bedrock, Ollama, etc.) while storing results for comparison, and integrates natively with GitHub code scanning and CI/CD pipelines for continuous LLM app security validation.
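To make the declarative config concrete, here is a minimal sketch of a promptfooconfig.yaml; the prompt text, model IDs, and test values are illustrative placeholders, not taken from the repositories above:

```yaml
# promptfooconfig.yaml — minimal sketch; prompt, providers, and test data are illustrative
prompts:
  - "Summarize the following text in one sentence: {{text}}"

# Compare two providers side by side
providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-sonnet-20241022

# Each test supplies variables and assertions for grading outputs
tests:
  - vars:
      text: "Promptfoo is an open-source framework for testing LLM apps."
    assert:
      - type: icontains        # case-insensitive substring check
        value: "promptfoo"
```

Running `npx promptfoo eval` in the same directory executes the prompt-by-provider matrix locally and stores results for later comparison.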
About promptfoo-action
promptfoo/promptfoo-action
The GitHub Action for Promptfoo. Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
Automatically compares prompt changes between git commits and posts interactive before/after evaluations directly to pull requests, with support for push and manual workflow triggers. The action integrates with promptfoo's declarative YAML config system and web viewer for side-by-side result exploration, while supporting optional result caching and pass/fail thresholds to enforce quality gates in CI/CD pipelines.
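A workflow wiring the action into a pull-request check might look like the sketch below; the input names follow the action's documented interface, while the file paths and secret names are placeholders you would adapt to your repository:

```yaml
# .github/workflows/prompt-eval.yml — illustrative sketch, not a verbatim example from the repo
name: Evaluate prompts
on: [pull_request]

jobs:
  evaluate:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write   # required to post the before/after comment
    steps:
      - uses: actions/checkout@v4
      - uses: promptfoo/promptfoo-action@v1
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          config: promptfooconfig.yaml          # placeholder path
          prompts: prompts/summarize.txt        # placeholder path
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
```

On each pull request, the action evaluates only prompts changed between commits and posts the side-by-side results as a PR comment.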