
Context Engineering
on VeloDB

Store vectors, full-text, structured data, and JSON in one database. Retrieve with hybrid search and fusion re-ranking. Serve fresh context to LLMs and agents in milliseconds.

Keep context fresh with streaming ingest, real-time CDC, and incremental index updates across multi-model datasets
Store vectors, full-text, structured tables, and semi-structured JSON in one engine
Scale to billions of vectors with progressive filtering and IVPQ compression
Connect to AI applications through MCP Server, REST APIs, CLI and standard SQL
VeloDB context store for generative AI

Context that's fresh
Retrieval that's accurate

The quality of every AI answer depends on the context behind it. VeloDB keeps that context fresh, stores it all in one place, and retrieves exactly what matters.

Real-time context, not stale snapshots
Streaming ingest and native CDC keep your knowledge base current as source documents, policies, and transactional records change. Your LLM retrieves what is true now, not what was true when the index was last rebuilt.
Store all your context in one place
Your context lives in documents, metadata, embeddings, and event logs. VeloDB stores and queries all of them together. One SQL statement retrieves across vectors, full-text, structured columns, and JSON without stitching results from separate systems.
Retrieval accuracy without runaway infrastructure cost
VeloDB uses progressive filtering to keep retrieval accurate without driving up cost. Cheap SQL filters shrink the candidate set first, keyword matching then prunes it further by exact and negative matches, and vector similarity runs only on the data that remains. Every stage reduces the work for the next one, so accuracy goes up while computation time goes down.
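To make the stage order concrete, here is a minimal in-memory sketch of progressive filtering in Python. It is not VeloDB's implementation: the corpus, the field names (`team`, `text`, `vec`), and the `progressive_filter` helper are all hypothetical, and a real deployment would express the same stages in SQL against indexed columns.

```python
import math

# Hypothetical toy corpus; field names are illustrative, not a VeloDB schema.
DOCS = [
    {"id": 1, "team": "search", "text": "vector index tuning guide", "vec": [1.0, 0.0]},
    {"id": 2, "team": "search", "text": "deprecated vector index notes", "vec": [0.9, 0.1]},
    {"id": 3, "team": "search", "text": "keyword ranking basics", "vec": [0.0, 1.0]},
    {"id": 4, "team": "billing", "text": "vector index pricing", "vec": [1.0, 0.0]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def progressive_filter(docs, team, must, must_not, query_vec, top_k=1):
    # Stage 1: cheap structured predicate shrinks the candidate set.
    stage1 = [d for d in docs if d["team"] == team]

    # Stage 2: keyword pruning keeps documents containing every required
    # term and drops documents containing any excluded term.
    stage2 = []
    for d in stage1:
        words = d["text"].split()
        if all(t in words for t in must) and not any(t in words for t in must_not):
            stage2.append(d)

    # Stage 3: vector similarity runs only on the survivors.
    ranked = sorted(stage2, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    sizes = (len(docs), len(stage1), len(stage2))
    return [d["id"] for d in ranked[:top_k]], sizes


ids, sizes = progressive_filter(
    DOCS, team="search", must=["vector"], must_not=["deprecated"], query_vec=[1.0, 0.0]
)
print(ids, sizes)
```

The `sizes` tuple records how many candidates enter each stage, which is the quantity that determines how much vector work a query actually does.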
Trusted in production

AI teams run on VeloDB

ByteDance searches 1 billion vectors for talent matching with 94% relevance, up from 58% on pure vector search, at 400ms latency.

94%
Relevance, from 58%
400ms
Latency at billion scale
384x
Compression with IVPQ

Per-segment BM25 caused ranking instability on every segment merge. Global statistics in Doris 4.0 with progressive filtering fixed it. Relevance jumped from 58% to 94%. Latency dropped from 2.8 seconds to 400 milliseconds. Storage shrank from 10TB on 20 servers to 500GB on a single server.

Engineering Team, ByteDance
Talent matching across global product portfolio
Read the full story
ByteDance, AISpeech, Xiaomi, Meituan, Baidu, NetEase, Kwai, JD.com, Trip.com, Tencent
Real-world tradeoffs

Challenges building
production RAG and agents

01·Stale context
Your knowledge base is already outdated by the time your AI retrieves from it
Source documents change, policies get updated, transactional records shift. But most RAG pipelines rebuild indexes on a schedule, daily or weekly.
In the gap between updates, your AI retrieves information that is no longer true. Users get confident, wrong answers.
How VeloDB solves it
Context that stays current as the source data changes
VeloDB ingests updates through streaming load and native CDC so changes from upstream databases and document stores become searchable in seconds, not days. Only changed content is re-indexed, so there are no expensive full-corpus rebuilds. Your knowledge base reflects what is true now, not what was true at the last scheduled refresh.
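One generic way to re-index only changed content is to compare a content hash per document against the hash recorded at the last indexing pass. The sketch below illustrates that idea in Python. It is an assumption about how such a pipeline could work, not a description of VeloDB or CocoIndex internals, and `plan_reindex` and the sample document IDs are hypothetical.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(indexed_hashes, current_docs):
    """Compare stored hashes with current content.

    indexed_hashes: {doc_id: hash recorded when the doc was last indexed}
    current_docs:   {doc_id: current text from the source system}
    Returns (to_index, to_delete) as sorted lists of doc IDs.
    """
    # New or modified documents need to be chunked, embedded, and indexed again.
    to_index = sorted(
        doc_id
        for doc_id, text in current_docs.items()
        if indexed_hashes.get(doc_id) != content_hash(text)
    )
    # Documents that vanished upstream should be removed from the index.
    to_delete = sorted(doc_id for doc_id in indexed_hashes if doc_id not in current_docs)
    return to_index, to_delete


previous = {
    "policy-1": content_hash("Refunds within 30 days."),
    "policy-2": content_hash("Free shipping over $50."),
    "policy-3": content_hash("Old promotion."),
}
current = {
    "policy-1": "Refunds within 30 days.",   # unchanged
    "policy-2": "Free shipping over $75.",   # modified
    "policy-4": "New loyalty program.",      # added
}

to_index, to_delete = plan_reindex(previous, current)
print(to_index, to_delete)
```

Unchanged documents never reach the embedding step, which is where most of the cost of a full rebuild comes from.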
02·Fragmented context
Your AI needs context from four systems but can only query one at a time
Embeddings live in a vector database. Full-text lives in a search engine. Metadata lives in a relational database. JSON payloads live somewhere else.
Retrieving a complete answer means calling four APIs, merging results in application code, and hoping the data stays consistent across all of them.
How VeloDB solves it
All context types stored and queried in one place
VeloDB stores vectors, full-text, structured columns, JSON, and bitmap labels together. A single SQL query searches across all modalities and ranks results using reciprocal rank fusion. No app-layer joins, no cross-system consistency gaps, and no extra infrastructure to operate.
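For readers unfamiliar with reciprocal rank fusion, the sketch below shows the standard formula in Python: each document scores the sum of 1 / (k + rank) over every ranked list it appears in. The constant k = 60 is the conventional default from the RRF literature. The page does not state which value VeloDB uses, and the function name and sample IDs here are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a", "b", "c"]    # best-first from vector similarity
keyword_hits = ["b", "d", "a"]   # best-first from BM25 keyword search

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Because RRF uses only rank positions, it can combine signals whose raw scores are on different scales, such as BM25 scores and vector distances.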
03·Retrieval cost
Retrieval gets more expensive as your knowledge base grows, but accuracy does not improve
Vector search scans the full index on every query. As the corpus grows from thousands to millions of documents, compute cost grows linearly while relevance stays flat or gets worse.
Teams end up choosing between retrieval quality and infrastructure budget.
How VeloDB solves it
Progressive filtering for accurate and cost-efficient retrieval
VeloDB uses progressive filtering to keep retrieval accuracy high without driving up cost. Each query passes through multiple retrieval steps, and each step narrows the search space before the next one runs. As your knowledge base grows, the cost per query stays flat because expensive operations run only on the pre-filtered subset rather than on the full dataset.
Architecture overview

VeloDB for generative AI

Ingest documents, transactional records, and event streams. Embed and chunk incrementally. Store vectors, text, structured metadata, JSON, and labels in one engine. Retrieve and rank via progressive filtering. Serve fresh context to LLMs and agents.

VeloDB context store
Context Sources
Documents
Confluence, PDF, S3 buckets
Transactional
MySQL, Postgres via CDC
Event streams
Kafka, Pulsar
External APIs
SaaS, webhooks
Ingest & Embed
CocoIndex
Incremental transform, chunk, embed
Routine Load
Continuous from Kafka
Native CDC
Doris 4.1 Streaming Jobs
Stream Load
HTTP push from services
Store & Index
HNSW + IVPQ Vectors
Billion-scale, 384x compression
Inverted Index + BM25
Global statistics for stable ranking
VARIANT (JSON)
Semi-structured metadata
Bitmap Labels
Fast set operations
Merge-on-Write
Latest version query-ready
Retrieve & Rank
Progressive Filtering
SQL, BM25, vectors in order
Reciprocal Rank Fusion
Combine signals from all modalities
Pre-filter Predicates
Shrink candidate set first
Single SQL Query
All modalities, one round trip
Serve to AI
MCP Server
Model Context Protocol for agents
REST API
HTTP retrieval for app code
Standard SQL
MySQL-compatible wire protocol
CLI
CocoIndex real-time context pipeline
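As an illustration of the Stream Load path in the architecture above, the sketch below builds (but does not send) an HTTP PUT request in the shape Apache Doris documents for Stream Load: a `_stream_load` endpoint per table, basic auth, and `label`, `format`, and `strip_outer_array` headers. The host, port, database, table, credentials, and label are placeholders, and the exact endpoint for a VeloDB Cloud cluster may differ, so treat this as an assumption to check against your deployment.

```python
import base64
import json
import urllib.request


def build_stream_load_request(host, port, db, table, rows, user, password, label):
    """Build a Doris-style Stream Load request for a JSON array of rows."""
    url = f"http://{host}:{port}/api/{db}/{table}/_stream_load"
    body = json.dumps(rows).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",       # required by Stream Load
        "label": label,                 # unique label makes retries idempotent
        "format": "json",
        "strip_outer_array": "true",    # body is a JSON array of row objects
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# All values below are hypothetical placeholders.
req = build_stream_load_request(
    host="fe.example.internal",
    port=8030,
    db="kb",
    table="chunks",
    rows=[{"id": 1, "text": "hello"}],
    user="ingest_user",
    password="secret",
    label="kb-chunks-0001",
)
print(req.get_method(), req.full_url)
```

The request is only constructed here. Actually sending it needs an HTTP client that honors `Expect: 100-continue` and follows the redirect from the frontend to a backend node while keeping credentials, which is what curl's `--location-trusted` flag does.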

Stop juggling four systems.
Run your context on one.

Spin up VeloDB Cloud in under 60 seconds and run hybrid search across vectors, text, structured data, and JSON in a single SQL query.

Need help? Contact us!