Meetup

Lakehouse at Scale: Apache Doris × OLake

Saturday, May 9, 2026 | 11:00 AM IST
Hustlehub Tech Park Building, Bengaluru

Apache Iceberg adoption is accelerating, and with it come two operational realities data teams are running into head-on: table maintenance at scale and the demand for real-time, accurate retrieval powering AI systems.

This meetup brings together practitioners and contributors working at both ends of that problem. Expect technical depth, real production context, and open discussion with engineers who are actively building and operating lakehouse infrastructure.

Time | Agenda
11:00 - 11:30 | Welcome!
11:30 - 12:10 | How Apache Doris Powers AI Agents with Hybrid Search and Real-Time Analytics
Matt Yi, Apache Doris PMC Member, Tech VP at VeloDB
This talk introduces hybrid search with Apache Doris as a next-generation retrieval solution for generative AI and context engineering. It addresses the semantic confusion problem by combining vector search, full-text search, and SQL to capture exact-match intent alongside semantic similarity, yielding more accurate and cost-effective retrieval. Matt will then touch on how the native real-time capability that Apache Doris brings to OLAP can be extended to real-time RAG, helping organizations think about future challenges in this space.
12:10 - 12:50 | OLake Fusion: Solving Apache Iceberg Table Maintenance Problems at High Scale
Ankit Sharma, Tech Lead
Badal Prasad Singh, Software Engineer, OLake
This talk will address how continuous CDC ingestion can lead to small file accumulation and query performance degradation. The speakers will explore various compaction strategies (lite, medium, and full) and provide guidance on selecting the optimal mode based on specific workloads and file size targets. They will also cover practical implementation via Cron-based scheduling and orchestration using Helm and Docker Compose, while sharing key insights into building multi-catalog maintenance systems that ensure live ingestion remains uninterrupted.
12:50 - 13:00 | Break
13:00 - 13:30 | Apache Doris User Sharing
Nilanjan Sarkar
Meet Nilanjan, an experienced Data Engineer who specializes in enterprise-grade data solutions and is a familiar, supportive face in the Apache Doris community. Over the last year, Nilanjan has gone from exploring the potential of Apache Doris to deploying it in production. We are thrilled to have him share his unique journey, offering a deep dive into his technical experiences and the milestones of bringing Apache Doris to life in a real-world enterprise setting.
13:30 | Lunch

How to find us

Hustlehub Tech Park
PWD Quarters, 1st Sector, HSR Layout, Bengaluru, Karnataka 560102, India

The VeloDB Speaker

Matt Yi

Apache Doris PMC Member, Tech VP at VeloDB

As Tech VP at VeloDB and an Apache Doris PMC Member, Matt Yi brings over 10 years of expertise in database kernel R&D and technical management. He has spearheaded multiple major version iterations of Apache Doris, driving significant performance breakthroughs across diverse analytical scenarios. As a core driver of the community, Matt has witnessed and propelled the evolution of Apache Doris from a technical project into one of the most active and influential open-source communities in the global big data and database landscape.

Connect with Matt on LinkedIn

Register

Register for the event to receive the meeting link, event replay, and access to more resources!

By registering, you acknowledge that VeloDB will process your personal information in accordance with our Privacy Policy.

Join Slack or Discord to get real-time help, talk to core devs, and help plan the future of Apache Doris.