towhee and examples

The towhee framework provides the core pipeline infrastructure, while the examples repository demonstrates practical implementations of that infrastructure for specific unstructured data search tasks. The two are complements designed to be used together.

towhee (Established)

Scores: Maintenance 0/25, Adoption 10/25, Maturity 25/25, Community 19/25

Stars: 3,458

Forks: 262

Downloads: —

Commits (30d): 0

Language: Python

License: Apache-2.0

Badges: Stale 6m

examples (Established)

Scores: Maintenance 0/25, Adoption 10/25, Maturity 16/25, Community 24/25

Stars: 520

Forks: 124

Downloads: —

Commits (30d): 0

Language: Jupyter Notebook

License: Apache-2.0

Badges: Stale 6m, No Package, No Dependents

About towhee

towhee-io/towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Towhee supports multimodal unstructured data (text, images, video, audio) with 140+ pre-built operators spanning CV, NLP, and audio domains, exposed through a Pythonic method-chaining API. It offers LLM-based pipeline orchestration with prompt management and knowledge retrieval, along with pre-configured ETL pipelines for RAG, image search, and video deduplication. It can also compile Python pipelines into high-performance Docker containers via Triton Inference Server, supporting TensorRT, PyTorch, and ONNX backends for CPU/GPU deployment.
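To make the method-chaining idea concrete, here is a minimal sketch in plain Python. It is not Towhee's actual API: the `Pipeline` class and the `tokenize`, `count_lengths`, and `normalize` functions are hypothetical stand-ins for the decode, embed, and post-process operators a real pipeline would chain.

```python
from typing import Any, Callable, List, Optional


class Pipeline:
    """Toy chained pipeline (hypothetical; NOT Towhee's real API)."""

    def __init__(self, steps: Optional[List[Callable[[Any], Any]]] = None):
        self._steps: List[Callable[[Any], Any]] = list(steps or [])

    def map(self, fn: Callable[[Any], Any]) -> "Pipeline":
        # Return a new Pipeline so a partially built chain can be reused.
        return Pipeline(self._steps + [fn])

    def __call__(self, value: Any) -> Any:
        # Feed the output of each step into the next one, in order.
        for step in self._steps:
            value = step(value)
        return value


# Hypothetical operators standing in for real model-backed stages.
def tokenize(text: str) -> List[str]:
    return text.lower().split()


def count_lengths(tokens: List[str]) -> List[int]:
    return [len(token) for token in tokens]


def normalize(values: List[int]) -> List[float]:
    total = sum(values)
    return [v / total for v in values] if total else [float(v) for v in values]


# Each .map() call appends one stage, so the chain reads top to bottom.
text_to_vec = Pipeline().map(tokenize).map(count_lengths).map(normalize)
result = text_to_vec("Towhee builds pipelines")
```

Returning a new `Pipeline` from `map` keeps each chain immutable, which makes it safe to branch several pipelines off a shared prefix.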

About examples

towhee-io/examples

Analyze unstructured data with Towhee, with examples such as reverse image search, reverse video search, audio classification, question answering systems, and molecular search.

Towhee generates dense embedding vectors through composable ML operator pipelines, democratizing `x2vec` tasks across modalities including images, video, audio, and molecular data. It integrates pre-built operators from the Towhee Hub (CLIP, DPR, RDKit, etc.) with vector databases for similarity search and supports cross-modal retrieval tasks like text-to-image and text-to-video matching. The framework enables end-to-end applications from embedding generation through approximate nearest neighbor indexing with minimal code.
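The embed-then-search flow described above can be sketched without any external libraries. This is an illustration under stated assumptions, not code from the examples repository: `embed` is a hypothetical stand-in (normalized letter counts) for a real model such as CLIP, the in-memory dictionary stands in for a vector database, and the search is exact brute force rather than an approximate nearest neighbor index.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def embed(text: str) -> List[float]:
    """Hypothetical embedding: L2-normalized letter counts (stand-in for a model)."""
    counts = Counter(ch for ch in text.lower() if ch in ALPHABET)
    vec = [float(counts[ch]) for ch in ALPHABET]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: List[float], b: List[float]) -> float:
    # Inputs are unit vectors, so the dot product equals cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def build_index(docs: Dict[str, str]) -> Dict[str, List[float]]:
    # In-memory stand-in for inserting vectors into a vector database.
    return {doc_id: embed(text) for doc_id, text in docs.items()}


def search(index: Dict[str, List[float]], query: str, top_k: int = 1) -> List[Tuple[str, float]]:
    # Exact brute-force scan; a real deployment would use an ANN index instead.
    q = embed(query)
    scored = [(doc_id, cosine(q, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


docs = {
    "cat": "a cat sat on the mat",
    "dog": "dogs dig big bogs",
    "zebra": "zebra zigzag",
}
index = build_index(docs)
hits = search(index, "cat on a mat", top_k=2)
```

Swapping `embed` for a model-backed operator and `build_index`/`search` for a vector database client changes the quality and scale of retrieval, but not the shape of the flow.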

Scores updated daily from GitHub, PyPI, and npm data.