Apache Doris Empowers Real-time Lakehouse in Cainiao's Large-Scale Business Scenarios

VeloDB Engineering Team · 2025/08/18

Cainiao, a global e-commerce logistics giant, chose Apache Doris to upgrade its data platform. The step-by-step migration started in 2023 and proceeded in three phases: validating Doris in a mission-critical scenario, expanding Doris to more application scenarios, and executing full-scale deployment. Along the way, Doris has proven its cost efficiency, stability, and operational efficiency. Currently, Doris powers over 25 clusters (10,000+ CPUs) across 3 regions without any failure.

Why Apache Doris?

As the world's leading cross-border e-commerce logistics company, Cainiao operates one of the largest global logistics networks, spanning 200+ countries and regions. It has over 1,100 warehouses worldwide and handles over 80 million packages daily. The Cainiao App, with 60+ million MAUs, is the most popular logistics app globally.

Faced with such tremendous amounts of data, Cainiao developed a real-time data platform.

Cainiao chose Doris as the core OLAP engine of this platform for three reasons.

First, Doris is a sustainable open-source project. The Apache Software Foundation's governance model supports Doris's long-term, stable development and innovation. Open-source transparency lets users understand Doris's technical architecture and avoid vendor lock-in. Meanwhile, the vibrant Doris community provides abundant technical support and shares best practices.

Second, Cainiao had strict requirements for real-time data updates and efficient complex queries on its logistics data. Doris handles data updates with the Primary Key (Unique) Model, which uses Merge-on-Write (MoW) as its default storage implementation. During data ingestion, a record that does not yet exist is inserted, and a record that already exists is updated. Both whole-row updates (the default) and partial column updates are supported. Data updates complete within seconds, and queries return within hundreds of milliseconds.

Specifically, Doris supports:

  • Replenishment operations: Doris's sub-second data visibility provides the latest inventory status for replenishment decisions.
  • Inventory management: Doris's high-concurrency real-time updates keep inventory data consistent and fresh, so inventory changes are reflected as they happen.
  • Order processing: Doris's Unique Key Model efficiently synchronizes order status changes from upstream systems through UPSERT operations, keeping data current for high-frequency queries.
  • Logistics tracking: Doris's real-time update capability keeps tracking data fresh and consistent, giving users accurate information on package locations and status.
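
The insert-or-update behavior described above can be illustrated with a toy in-memory model. This is only a sketch of the semantics, not Doris's storage implementation: the UniqueKeyTable class, its upsert method, and the order columns are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, Tuple

Row = Dict[str, Any]


class UniqueKeyTable:
    """Toy model of upsert semantics on a unique key table (hypothetical)."""

    def __init__(self, key_columns: Tuple[str, ...]) -> None:
        self.key_columns = key_columns
        self.rows: Dict[Tuple[Any, ...], Row] = {}

    def _key(self, row: Row) -> Tuple[Any, ...]:
        return tuple(row[c] for c in self.key_columns)

    def upsert(self, row: Row, partial: bool = False) -> None:
        key = self._key(row)
        existing = self.rows.get(key)
        if existing is None or not partial:
            # New key, or whole-row update: the incoming row replaces any old one.
            self.rows[key] = dict(row)
        else:
            # Partial column update: only the supplied columns change.
            existing.update(row)


orders = UniqueKeyTable(key_columns=("order_id",))
orders.upsert({"order_id": 1, "status": "CREATED", "warehouse": "HZ-01"})
# A status change from upstream touches only one column of order 1.
orders.upsert({"order_id": 1, "status": "SHIPPED"}, partial=True)
orders.upsert({"order_id": 2, "status": "CREATED", "warehouse": "SZ-02"})
# A whole-row update replaces every column of order 2.
orders.upsert({"order_id": 2, "status": "DELIVERED", "warehouse": "SZ-02"})
```

In this sketch, a query issued right after each call sees one current row per key, which is the property the order and inventory scenarios depend on.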

Lastly, after extensive evaluation, Cainiao concluded that Doris could support a unified, stable operations system.

In summary, Doris brought low costs, high stability, high performance, and operational efficiency.

The Migration to Apache Doris

Validating Apache Doris in a Core Business Scenario

Cainiao chose package production progress, a mission-critical scenario, to validate Doris's viability. Package production progress is a multi-table ad-hoc query scenario involving over 35 filtering conditions, 38 dimensions, 50 metrics, six tables at the hundred-million-row scale, and 10,000 SQL patterns per day.

Doris delivered superior performance:

  • Point queries: point queries on unique key tables achieved 1,000-2,000 QPS, with response times ranging from 10 to 100 milliseconds.
  • Multi-table joins: multi-table joins returned results within one second and achieved 200-300 QPS in core scenarios.

With Doris, Cainiao reduced costs by 90% and average response time by 72%.

Migration Challenges and Solutions

  • SQL Syntax Compatibility: Doris is highly compatible with the MySQL protocol and supports standard SQL syntax. Cainiao analyzed incompatible syntax case by case and rewrote it where possible. Where rewriting was not possible, such as for strings concatenated with specific separators, Cainiao extended Doris's inverted index tokenizer to tokenize by custom separators.
  • Data Export: Doris supports only the UTF-8 character set encoding, so non-UTF-8 encoded content is displayed as garbled text. Cainiao solved this problem by injecting a BOM (Byte Order Mark) header into the output stream.
  • Data Synchronization: logistics data writes combine high TPS (typically 50,000 rows/sec per table) with wide tables (typically 300-400 fields). This wide-table ingestion pattern imposed strict requirements on real-time data updates. By combining unique key indexing, batch loading, and partitioning and bucketing, Doris achieved low I/O costs, real-time data visibility, sub-second queries, and high data consistency.
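
The BOM header injection mentioned under Data Export can be sketched as follows. This is a minimal illustration, not Cainiao's actual export code: it assumes the export is a CSV file consumed by a tool that relies on the BOM to detect UTF-8 (a scenario the article does not spell out), and the function name export_csv_with_bom and the sample rows are hypothetical.

```python
import csv
import io
from typing import Iterable, Sequence

# The three bytes that encode the Byte Order Mark (U+FEFF) in UTF-8.
UTF8_BOM = b"\xef\xbb\xbf"


def export_csv_with_bom(rows: Iterable[Sequence[str]]) -> bytes:
    """Serialize rows as CSV and prepend a UTF-8 BOM to the byte stream."""
    text = io.StringIO(newline="")
    csv.writer(text).writerows(rows)
    return UTF8_BOM + text.getvalue().encode("utf-8")


payload = export_csv_with_bom([["package_id", "city"], ["P001", "杭州"]])
```

A consumer that understands the BOM can then decode the payload with the utf-8-sig codec, which strips the marker and returns the original text.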

How Apache Doris Powers An Efficient, Stable Data Platform

Automated Operation and Maintenance (O&M)

Cainiao abstracted every reusable, independent O&M step into an API built on its internal infrastructure and Doris Manager, integrated these APIs into its Doris O&M platform, and composed them into workflows. This enabled 10-minute cluster creation, batch scaling, faster resource group isolation, and global deployment. Most O&M work can now be completed with one click.

BadSQL Identification

Cainiao relied mainly on two methods to identify BadSQL:

  • Traffic tagging: tagging helped quickly trace the origin and usage scenario of SQL queries.
  • Doris audit logs: Doris can audit database operations, recording user logins, queries, and modification operations on the database. Cainiao visualized the audit data in charts to pinpoint BadSQL quickly.
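
The two methods can be combined roughly as sketched below. The article does not describe Cainiao's tag format or audit schema, so everything here is an assumption made for illustration: the tag is carried in a leading SQL comment, each audit record is reduced to a (statement, elapsed milliseconds) pair, and the function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Assumed tag format: a leading comment such as /* tag=report.daily */
TAG_PATTERN = re.compile(r"/\*\s*tag=([\w.-]+)\s*\*/")


def tag_sql(sql: str, tag: str) -> str:
    """Prefix a statement with a comment naming its origin."""
    return f"/* tag={tag} */ {sql}"


def extract_tag(sql: str) -> Optional[str]:
    """Return the tag embedded in a statement, or None if it is untagged."""
    match = TAG_PATTERN.search(sql)
    return match.group(1) if match else None


def slow_queries_by_tag(
    audit_records: Iterable[Tuple[str, int]], threshold_ms: int
) -> Dict[str, int]:
    """Count statements at or above the latency threshold, grouped by tag."""
    counts: Dict[str, int] = defaultdict(int)
    for sql, elapsed_ms in audit_records:
        if elapsed_ms >= threshold_ms:
            counts[extract_tag(sql) or "untagged"] += 1
    return dict(counts)


records = [
    (tag_sql("SELECT * FROM package_progress", "report.daily"), 5200),
    (tag_sql("SELECT 1", "report.daily"), 30),
    ("SELECT * FROM package_progress", 9000),
    (tag_sql("SELECT * FROM tracking", "app.tracking"), 7000),
]
summary = slow_queries_by_tag(records, threshold_ms=5000)
```

Grouping slow statements by tag is what turns a raw audit log into a chart of which business scenario is producing BadSQL.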

Killing BadSQL

  • SQL Block Rule: Doris supports planning-time circuit breaking, which prevents the execution of statements that match specific patterns. Cainiao used traffic tags as block rules to stop BadSQL precisely.
  • Kill All Query: Doris supports cancelling a currently executing operation or disconnecting a session with the KILL command. Cainiao extended this capability into Kill All Query, which cancels all running operations with one click.
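
The idea of using tags as block rules can be sketched as a pattern check that runs before a statement is executed. This is a simplified model, not Doris's implementation: the BlockRules class, the rule name, and the comment-based tag format are assumptions introduced for illustration.

```python
import re
from typing import Dict, Optional


class BlockRules:
    """Toy registry of regex rules checked before a statement runs."""

    def __init__(self) -> None:
        self._rules: Dict[str, re.Pattern] = {}

    def add_rule(self, name: str, sql_regex: str) -> None:
        self._rules[name] = re.compile(sql_regex)

    def match(self, sql: str) -> Optional[str]:
        """Return the name of the first rule the statement matches, if any."""
        for name, pattern in self._rules.items():
            if pattern.search(sql):
                return name
        return None


rules = BlockRules()
# Block everything carrying the (hypothetical) tag adhoc.export.
rules.add_rule("block_adhoc_export", r"/\*\s*tag=adhoc\.export\s*\*/")

blocked = rules.match("/* tag=adhoc.export */ SELECT * FROM package_progress")
allowed = rules.match("/* tag=app.tracking */ SELECT * FROM tracking")
```

Matching on the tag rather than on the statement text means a single rule can cut off one misbehaving traffic source without affecting other callers that run similar SQL.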

Batch Deletion Based on Load

Doris supports deletion by adding a delete sign when loading data. Compared to the DELETE statement, using delete signs offers better usability and performance.
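
The effect of a delete sign in a load batch can be illustrated with a small in-memory model. This sketch only mimics the semantics: the column name delete_sign, the apply_load function, and the sample rows are hypothetical, and a dictionary keyed by order_id stands in for a unique key table.

```python
from typing import Any, Dict, Iterable

DELETE_SIGN = "delete_sign"  # hypothetical column name used by this sketch


def apply_load(
    table: Dict[int, Dict[str, Any]],
    batch: Iterable[Dict[str, Any]],
    key: str = "order_id",
) -> None:
    """Apply one load batch: rows flagged with the delete sign are removed,
    all other rows are inserted or updated by key."""
    for row in batch:
        if row.get(DELETE_SIGN) == 1:
            table.pop(row[key], None)
        else:
            table[row[key]] = {k: v for k, v in row.items() if k != DELETE_SIGN}


table = {
    1: {"order_id": 1, "status": "CANCELLED"},
    2: {"order_id": 2, "status": "SHIPPED"},
}
batch = [
    {"order_id": 1, "status": "CANCELLED", DELETE_SIGN: 1},  # delete order 1
    {"order_id": 3, "status": "CREATED", DELETE_SIGN: 0},    # insert order 3
]
apply_load(table, batch)
```

Deletes and upserts travel in the same batch here, which is why this approach suits bulk cleanup better than issuing one DELETE statement per condition.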

Resource Isolation

Doris supports isolation between different businesses. When businesses do not share data, each can be assigned its own Resource Group so that they do not interfere with one another. This effectively consolidates multiple physical clusters into a single large cluster for management. Doris also supports read-write isolation: a cluster can be divided into two Resource Groups, an Offline Resource Group for executing ETL jobs and an Online Resource Group for handling online queries, so that one unified cluster provides both online and offline services.
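
The online/offline split can be pictured as a routing table from workloads to groups of nodes. The sketch below is a conceptual model only: the node names, workload names, and the eligible_backends function are hypothetical and do not reflect how Doris configures Resource Groups.

```python
from typing import Dict, List

# Hypothetical cluster: each backend node belongs to exactly one group.
backends: Dict[str, str] = {
    "be-1": "online",
    "be-2": "online",
    "be-3": "offline",
}

# Hypothetical binding of workloads to groups.
workload_group: Dict[str, str] = {
    "dashboard_query": "online",
    "nightly_etl": "offline",
}


def eligible_backends(workload: str) -> List[str]:
    """Return the backends a workload may run on, based on its group."""
    group = workload_group[workload]
    return sorted(name for name, g in backends.items() if g == group)


etl_nodes = eligible_backends("nightly_etl")
query_nodes = eligible_backends("dashboard_query")
```

Because the two node sets never overlap, a heavy ETL job in this model cannot take capacity away from online queries, even though both run in the same cluster.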

The Future: Seamless Migration to Apache Doris

Cainiao will accelerate its migration to Apache Doris:

  • Introducing the compute-storage decoupled mode: scaling storage capacity and computing resources flexibly, step by step, for optimal performance and cost efficiency.
  • Building a unified lakehouse: focusing on accelerated data loading, full and incremental merge, and stream-batch processing. Currently, the Doris+Paimon data lake solution has been deployed in several scenarios.
  • Achieving regional disaster recovery: leveraging Doris's cross-cluster data synchronization capabilities to allow other clusters to quickly take over the business when one cluster fails.
  • Upgrading the automated operations system: building a stable, efficient operations system by automating more functions and introducing AI technologies to boost efficiency.