aws-solutions-library-samples/guidance-for-scalable-model-inference-and-agentic-ai-on-amazon-eks
A comprehensive, scalable ML inference architecture on Amazon EKS that uses Graviton processors for cost-effective CPU-based inference and GPU instances for accelerated inference. The guidance provides an end-to-end platform for deploying LLMs with agentic AI capabilities, including RAG (Retrieval-Augmented Generation) and MCP (Model Context Protocol).
Built on EKS with Karpenter for dynamic scaling, the solution orchestrates multi-agent workflows using the Strands Agent SDK, while LiteLLM provides a unified OpenAI-compatible gateway across Ray Serve and vLLM inference engines. Key integration points include Amazon OpenSearch for RAG, Langfuse for LLM observability, and MCP servers for external tools such as Tavily web search, yielding a production-grade agentic AI platform with automated feedback loops and quality assurance via Bedrock-hosted evaluators.
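Because LiteLLM exposes an OpenAI-compatible gateway, any OpenAI-style chat completion request body works against it unchanged. A minimal sketch of building such a request with only the standard library; the model alias `"vllm-llama"` is a hypothetical example, not a name taken from this repo:

```python
import json

def chat_payload(model: str, question: str) -> str:
    """Build an OpenAI-style chat completion request body,
    which an OpenAI-compatible gateway such as LiteLLM accepts."""
    body = {
        "model": model,  # alias the gateway routes to a Ray Serve / vLLM backend
        "messages": [{"role": "user", "content": question}],
    }
    return json.dumps(body)

# Example: POST this body to the gateway's /v1/chat/completions endpoint.
payload = chat_payload("vllm-llama", "What is Karpenter?")
```

The same payload can be sent with the official OpenAI SDK by pointing its base URL at the gateway, which is the usual way clients consume a LiteLLM deployment.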
Stars: 21
Forks: 9
Language: Python
License: MIT-0
Category:
Last pushed: Feb 14, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/mlops/aws-solutions-library-samples/guidance-for-scalable-model-inference-and-agentic-ai-on-amazon-eks"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.