In big data analytics, cost efficiency remains a critical priority for enterprises, especially those handling massive volumes of data. The ARM architecture, with its energy efficiency and low cost, is rapidly becoming a preferred choice in data centers. AWS Graviton processors, built on the ARM architecture, are leading this transition. Compared to traditional x86 processors, Graviton delivers strong computing performance while significantly reducing operational expenses, offering clear benefits for data-intensive workloads.
Apache Doris is an MPP-based, high-performance, real-time analytical database designed for fast analytics on large-scale datasets. It serves diverse data processing and analytical workloads, including real-time analytics, lakehouse analytics, and observability analytics. VeloDB, a real-time database for search and analytics built on Apache Doris, extends these capabilities. Through its core features (real-time, unified, elastic, and open), VeloDB offers enterprises cost-effective, easy-to-use, secure, and stable data analytics at scale.
Both VeloDB and Apache Doris now support AWS Graviton, leveraging the efficiency of ARM processors. This combination delivers higher performance for large-scale data while maintaining lower energy consumption, significantly enhancing overall cost-effectiveness.
VeloDB's Innovation on AWS Graviton
VeloDB is extensively optimized for ARM architecture. Refined kernel scheduling and memory management significantly boost query speeds on ARM processors. Specific optimizations include:
- Native ARM Operator Vectorization: VeloDB fully supports operator vectorization on ARM. By using the CPU's SIMD (Single Instruction, Multiple Data) instructions, we significantly increase throughput during data processing, which is a crucial advantage for OLAP workloads. We achieved this by porting the code paths that use x86 SSE and AVX instructions to ARM NEON, ensuring VeloDB delivers top-tier data processing capabilities on ARM. Our next step is deeper adaptation to SVE (Scalable Vector Extension), which we expect to yield even better performance.
- Efficient Multi-threading Synchronization: Unlike x86, the ARM architecture has a more relaxed memory ordering model, which leaves more room for multi-thread parallelism. VeloDB takes advantage of this by tailoring its synchronization methods to each bottleneck, minimizing overhead so that CPU resources go to core data processing.
- Efficient Task Scheduling: VeloDB's engine efficiently handles massively parallel tasks under heavy load. It makes full use of modern multi-core CPUs by decomposing queries into tasks that execute in parallel. Combined with ARM's low power consumption and cost, this lets VeloDB users deploy more CPU cores for analytical workloads.
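To make the idea of decomposing a query for parallel execution more concrete, here is a minimal Python sketch. It is not VeloDB's engine code: the function names `partial_sum` and `parallel_group_sum`, the sample partitions, and the use of a thread pool are all assumptions made for illustration. It mimics a `SELECT key, SUM(value) ... GROUP BY key` query that is split into one fragment per data partition, with the partial results merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(rows):
    """One query fragment: aggregate SUM(value) per key within a single partition."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


def parallel_group_sum(partitions, workers=4):
    """Run one fragment per partition in a worker pool, then merge the partials."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, partitions))

    merged = {}
    for partial in partials:
        for key, subtotal in partial.items():
            merged[key] = merged.get(key, 0) + subtotal
    return merged


# Hypothetical sample data: three partitions of (key, value) rows.
partitions = [
    [("US", 1), ("DE", 2)],
    [("US", 3)],
    [("DE", 4), ("FR", 5), ("US", 6)],
]
result = parallel_group_sum(partitions)
print(result)
```

Python threads do not run CPU-bound work in true parallel because of the interpreter lock, so this only shows the split-and-merge structure. A native engine would run each fragment on its own core, which is where a higher core count pays off.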
VeloDB's Performance on ARM Architecture
To showcase VeloDB's performance across architectures, we set up both x86 and ARM clusters on AWS EC2, and evaluated VeloDB using industry-standard test suites.
As the summarized data below illustrates, VeloDB on the AWS EC2 ARM (48c) cluster consistently and significantly outperformed its equivalent x86 (48c) cluster across all five test sets.
Furthermore, given AWS Graviton's lower cost, the ARM 48c cluster also delivered higher cost efficiency, achieving a 32% improvement on the Clickbench test set.
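One common way to express cost efficiency is the cost of completing a test set: the hourly instance price multiplied by the total runtime. The Python sketch below follows that definition, which is an assumption on our part since the exact formula behind the reported figure is not spelled out here. The prices and runtimes in the example are placeholders chosen for easy arithmetic, not measured values or actual AWS prices.

```python
def cost_per_run(hourly_price_usd, runtime_seconds):
    """Cost of one full test-set run: hourly price times runtime in hours."""
    return hourly_price_usd * runtime_seconds / 3600.0


def cost_efficiency_gain(x86_price, x86_seconds, arm_price, arm_seconds):
    """Relative gain in work done per dollar on ARM versus x86.

    Computed as cost_x86 / cost_arm - 1, so 0.25 means 25% more work per dollar.
    """
    x86_cost = cost_per_run(x86_price, x86_seconds)
    arm_cost = cost_per_run(arm_price, arm_seconds)
    return x86_cost / arm_cost - 1.0


# Placeholder inputs only (NOT benchmark results, NOT real instance prices).
gain = cost_efficiency_gain(
    x86_price=2.0, x86_seconds=100.0,
    arm_price=1.6, arm_seconds=100.0,
)
print(f"cost-efficiency gain: {gain:.0%}")
```

With equal runtimes, the gain comes purely from the price difference. A faster ARM runtime would increase it further, since both terms multiply into the cost of a run.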
01 Cluster Configuration
We set up two VeloDB clusters on AWS EC2 cloud servers for testing, one on x86 machines and one on ARM machines. Both clusters used a 48c configuration.
- x86 architecture: 48c, equipped with the Ice Lake 8375C processor.
- ARM architecture: 48c, powered by AWS's proprietary Graviton3 processor.
The detailed configurations are as follows:
02 Test Methodology and Datasets
We used five of the most representative performance test sets (as shown in the table below) to comprehensively evaluate VeloDB's performance across various scenarios:
For each test set, all SQL queries within the set were executed sequentially. Each query was run 4 consecutive times (1 cold query and 3 hot queries). For the hot queries, the fastest execution time was taken as the actual time consumed for that SQL query, and the final results were aggregated.
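The measurement procedure above can be sketched as a small Python harness. This is an illustrative outline, not the actual benchmark script: `run_query` is a hypothetical callable that executes one SQL statement against the cluster, the `clock` parameter exists only to make the sketch testable, and summing the per-query times is our assumption about how the results were aggregated.

```python
import time


def time_query(run_query, sql, runs=4, clock=time.perf_counter):
    """Run `sql` `runs` times in a row; the first run is cold, the rest are hot.

    Returns (cold_seconds, best_hot_seconds).
    """
    timings = []
    for _ in range(runs):
        start = clock()
        run_query(sql)
        timings.append(clock() - start)
    cold, hot = timings[0], timings[1:]
    return cold, min(hot)  # the fastest hot run counts as the query's time


def run_suite(run_query, queries, clock=time.perf_counter):
    """Execute the queries sequentially and aggregate the best hot times.

    `queries` maps a query name to its SQL text. Returns (per_query, total),
    where total is the sum of per-query best hot times (assumed aggregation).
    """
    per_query = {}
    for name, sql in queries.items():
        _, best_hot = time_query(run_query, sql, clock=clock)
        per_query[name] = best_hot
    return per_query, sum(per_query.values())


# Usage with a stand-in runner that does nothing; replace it with a real
# database client call to measure an actual cluster.
per_query, total = run_suite(lambda sql: None, {"q1": "SELECT 1"})
```

Taking the minimum of the hot runs filters out transient noise such as background activity, while the cold run is kept separate because it includes the cost of loading data into cache.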
03 Performance Comparison of ARM and x86 Across Test Sets
- Clickbench
- SSB 100G
- SSB FLAT 100G
- TPC-H 100G
- TPC-DS 100G
Conclusion
Powered by Apache Doris, VeloDB now runs on AWS Graviton, and testing confirms it is a highly cost-effective solution for real-time data warehousing. Compared to equivalent systems on x86 architecture, it cuts costs by 36%. As data volumes keep growing, enterprises can use this solution to build a more economical big data analytics tech stack. For further questions, feel free to reach out.