Fresh Data and Fast Queries

Real-time analytics is the future

The value of data diminishes over time, and the data industry is moving into a real-time era: from batch reporting to real-time dashboards, from pre-defined reports to ad-hoc queries, from staff-facing to customer-facing analytics, and from decision support to AI-driven decision-making.

Data latency and query latency are the two key metrics of real-time analytics. Ingesting and storing data in real time keeps it fresh, and responding quickly to high-concurrency queries is what turns that fresh data into value.

Key capabilities

Real-time data ingestion

Two modes of data ingestion: push and pull

  • Push-based data ingestion within seconds
  • Change Data Capture (CDC) from databases within seconds
  • Streaming data ingestion from data streams
  • Batch and incremental ingestion from data lakes

Real-time data storage

Various real-time data models and light schema change

  • Unique key model: real-time upserts (merge-on-write or merge-on-read)
  • Duplicate key model: real-time append
  • Aggregate key model: real-time pre-aggregation
  • Light schema change: add or delete columns easily within seconds
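
The three key models differ in what happens when a newly ingested row shares a key with an existing one. The following is a minimal Python sketch of those semantics only, not of any engine's storage layer. The class names, the `ingest` method, and the choice of SUM as the aggregate function are illustrative assumptions.

```python
# Illustrative sketch: how each key model treats rows that share a key.
# All names here are hypothetical and do not correspond to a real API.

class DuplicateKeyTable:
    """Append-only: every ingested row is kept, even if keys repeat."""
    def __init__(self):
        self.rows = []

    def ingest(self, key, value):
        self.rows.append((key, value))


class UniqueKeyTable:
    """Upsert: the latest row for a key replaces the previous one."""
    def __init__(self):
        self.rows = {}

    def ingest(self, key, value):
        self.rows[key] = value


class AggregateKeyTable:
    """Rows with the same key are merged with an aggregate function.
    SUM is assumed here purely for illustration."""
    def __init__(self):
        self.rows = {}

    def ingest(self, key, value):
        self.rows[key] = self.rows.get(key, 0) + value


# The same event stream ingested into each model.
events = [("u1", 10), ("u2", 5), ("u1", 7)]

duplicate, unique, aggregate = DuplicateKeyTable(), UniqueKeyTable(), AggregateKeyTable()
for key, value in events:
    duplicate.ingest(key, value)
    unique.ingest(key, value)
    aggregate.ingest(key, value)

print(duplicate.rows)  # all three rows are kept
print(unique.rows)     # one row per key, latest value wins
print(aggregate.rows)  # one row per key, values summed
```

With this stream, the duplicate key table holds three rows, the unique key table holds the latest value per key, and the aggregate key table holds the per-key sum.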

Fast query response

Fast responses for real-time data serving and interactive ad-hoc analytics

  • High-concurrency point queries: hybrid row/column store, prefix index, inverted index
  • Large wide table queries: columnar storage, vectorized execution, materialized views
  • Complex join queries: CBO, MPP architecture, Runtime Filter
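
Of the join techniques listed, the runtime filter is the easiest to illustrate in a few lines. The idea is that the join keys collected while building the hash table on the smaller side are pushed down to the scan of the larger side, so non-matching rows are dropped before they reach the join. The sketch below is a simplified single-process illustration under stated assumptions: the function name and data are hypothetical, and a plain Python set stands in for the Bloom filters or min/max ranges that engines typically use.

```python
# Illustrative sketch of a runtime filter in a hash join.
# Names and data are hypothetical; a set stands in for a Bloom filter.

def hash_join_with_runtime_filter(build_rows, probe_rows):
    # Build phase: hash the smaller (build) side by join key.
    hash_table = {}
    for key, payload in build_rows:
        hash_table.setdefault(key, []).append(payload)

    # The runtime filter is derived from the build-side keys at run time.
    runtime_filter = set(hash_table)

    # Probe-side scan: rows whose key cannot match are dropped early.
    scanned = [(key, payload) for key, payload in probe_rows
               if key in runtime_filter]

    # Probe phase: only the surviving rows are joined.
    joined = [(key, build_payload, probe_payload)
              for key, probe_payload in scanned
              for build_payload in hash_table[key]]
    return joined, len(scanned)


regions = [(1, "US"), (2, "DE")]                           # small build side
orders = [(1, 100), (3, 50), (2, 70), (4, 20), (1, 30)]    # large probe side

joined, rows_after_filter = hash_join_with_runtime_filter(regions, orders)
print(joined)
print(rows_after_filter, "of", len(orders), "probe rows reached the join")
```

In this example only three of the five probe-side rows pass the filter. In a distributed setting the saving comes from not reading or shuffling the discarded rows at all.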