FareedKhan-dev/save-llm-api-cost

A straightforward method to reduce your LLM inference API costs and token usage.

Emerging

Implements a memory-efficient algorithm that compresses conversation history by storing and updating only essential facts instead of the entire chat log, reducing token usage by about 40%. It combines semantic fact extraction, embedding-based similarity matching, and classification of each fact as an ADD, UPDATE, or NOOP operation to manage long-context conversations. The approach integrates with OpenAI-compatible APIs (Nebius, OpenAI) and ships as a Python reference implementation with comparative benchmarks showing token savings at scale.
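As a rough illustration of the ADD / UPDATE / NOOP idea described above, here is a minimal Python sketch. It is not the repository's code: the `FactMemory` class, the `embed` helper, and both thresholds are hypothetical, and a toy bag-of-words vector stands in for a real embedding model so the example runs offline. A real setup would presumably call an embedding endpoint on an OpenAI-compatible API and might let an LLM choose the operation.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class FactMemory:
    """Hypothetical fact store that keeps short facts instead of the chat log."""

    def __init__(self, update_threshold=0.5, noop_threshold=0.99):
        # Both thresholds are illustrative values, not taken from the project.
        self.facts = []
        self.update_threshold = update_threshold
        self.noop_threshold = noop_threshold

    def classify(self, fact):
        """Return (operation, index of the closest stored fact or None)."""
        if not self.facts:
            return "ADD", None
        query = embed(fact)
        scores = [cosine(query, embed(stored)) for stored in self.facts]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= self.noop_threshold:
            return "NOOP", best      # already known, store nothing
        if scores[best] >= self.update_threshold:
            return "UPDATE", best    # same topic, newer value
        return "ADD", None           # unrelated to anything stored

    def apply(self, fact):
        """Classify the fact, mutate the store accordingly, return the operation."""
        op, idx = self.classify(fact)
        if op == "ADD":
            self.facts.append(fact)
        elif op == "UPDATE":
            self.facts[idx] = fact
        return op

    def as_context(self):
        """Compact text to send to the model in place of the full history."""
        return "\n".join(f"- {fact}" for fact in self.facts)
```

In this sketch the prompt would carry only `memory.as_context()` plus the latest user message, which is where the token savings would come from; how much is saved depends on the conversation and is not something this toy example measures.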

No commits in the last 6 months.

Stale 6m No Package No Dependents

Maintenance 2 / 25

Adoption 6 / 25

Maturity 9 / 25

Community 15 / 25

Language: Jupyter Notebook

License: MIT

Higher-rated alternatives

isEmmanuelOlowe/llm-cost-estimator

Estimating hardware and cloud costs of LLMs and transformer projects

saqibameen/model-cost

Compare LLM API pricing from your terminal. Supports 300+ models across all major providers....

WilliamJlvt/llm_price_scraper

A simple Python Scraper to retrieve pricing information for Large Language Models (LLMs) from an...

truefoundry/models

Community-maintained registry of AI/LLM model configurations - pricing, features, and limits...

quarkloop/llmcost

This repository is no longer actively maintained. Please use https://github.com/quarkloop/ai instead.
