Key concepts and architecture

This topic explains the key concepts and architecture of CelerData Cloud Serverless to help you get familiar with the product quickly.

CelerData Cloud Serverless is offered in two editions: Standard Edition and Premium Edition.

  • Standard Edition is designed as a query engine for data lakehouses. It does not provide any managed storage. Customer data stays in the object storage within the customer's AWS account.
  • Premium Edition adds managed storage to Standard Edition. It can be used as a query engine as well as an analytical database. Premium Edition enables data ingestion from data lakehouses and the use of persistent materialized views to accelerate queries.

For the feature differences between the two editions, see Software editions.

Fully managed

CelerData Cloud Serverless is an easy-to-use, fully managed data analytics platform built on top of StarRocks. It provides user-friendly SQL processing and fast data analytics, enabling you to query your data wherever it lives (in a data lake, in Kafka, or elsewhere). You can concentrate on data analytics while CelerData takes care of the rest.

In this fully managed data analytics platform:

  • All compute resources (also referred to as warehouses in CelerData) are provided by the virtual machines within each CelerData cloud account. You do not need to provision any hardware, and you can start, stop, and scale compute resources on demand.
  • In Premium Edition, all storage resources are provided by the object storage within each CelerData cloud account. Storage resources scale independently without affecting ongoing compute workloads. Standard Edition does not provide managed storage.
  • All software O&M operations, such as StarRocks version upgrades and service restarts, are handled by CelerData on your behalf.

In addition to storage and compute resources, other components on which CelerData depends are also provisioned in the cloud.

The CelerData team directly maintains the underlying StarRocks version. This ensures that CelerData always runs on the most stable StarRocks version and that you can take advantage of up-to-date features and enhancements.

Architecture

CelerData Cloud Serverless is built on top of the shared-data architecture of StarRocks, where storage and compute resources are decoupled to achieve near-unlimited scalability.

Architecture of Premium Edition

The following figure illustrates the architecture of CelerData Cloud Serverless Premium Edition to help you understand how CelerData works and interacts with other components.

[Figure: Architecture of CelerData Cloud Serverless Premium Edition]

The components in the architecture, from the bottom up, are as follows:

  • Storage resources are provided by the object storage within each CelerData cloud account. Data is secured using the built-in server-side encryption of the object storage. Because of the separation of storage and compute, the scaling of storage resources does not interrupt compute nodes.
  • Compute resources (compute nodes on which warehouses are built) are provided by virtual machines from cloud providers. You can create multiple warehouses within one cloud account to run different SQL workloads with high concurrency and desired isolation.
    • Warehouses can be scaled horizontally by adjusting the number of compute nodes (virtual machines) to meet different query latency requirements.
    • The local NVMe disks of each virtual machine serve as a cache layer. When users analyze data in remote storage (such as CelerData-managed S3 or a user's data lake), the cached data can significantly improve I/O efficiency.
  • StarRocks Frontend nodes are responsible for SQL parsing, query planning, metadata management, and workload management. They run in a Kubernetes (K8s) cluster built by CelerData in AWS and can scale automatically according to the real-time queries per second (QPS) of the cluster.
  • Cloud SQL provides SQL gateway, user authentication, and network traffic control services. In the SQL gateway, customers can control inbound query traffic by customizing network policies. Customers without access to public networks can still establish a private link with Cloud SQL, as long as their client resides in the same AWS region as Cloud SQL.
  • The interaction layer provides a user-friendly GUI and allows connections from any MySQL-compliant client or application.
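
Because any MySQL-compliant client can connect, the stock mysql command-line client is one option. The following is a minimal sketch, not an official procedure: it only assembles the command line and does not open a connection. The function name mysql_cli_command, the host name, the port 9030, and the user name are placeholders assumed for illustration. Take the real endpoint and credentials from your CelerData console.

```python
import shlex
from typing import List


def mysql_cli_command(host: str, port: int, user: str) -> List[str]:
    """Build the argument list for the stock `mysql` command-line client.

    Hypothetical helper for illustration; it is not part of any CelerData SDK.
    """
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return [
        "mysql",
        "-h", host,       # server host name
        "-P", str(port),  # TCP port (capital P)
        "-u", user,       # account name
        "-p",             # prompt for the password instead of passing it inline
    ]


if __name__ == "__main__":
    # All three values are placeholders (assumptions), not real endpoints.
    argv = mysql_cli_command("your-endpoint.example.com", 9030, "admin")
    # shlex.join renders the list as a single shell-safe command string.
    print(shlex.join(argv))
```

Building the command as a list rather than a formatted string keeps each argument separate, so it can be passed to subprocess.run without shell quoting issues. The -p flag without a value makes the client prompt for the password, which keeps it out of the shell history.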

Architecture of Standard Edition

Standard Edition has a similar architecture to Premium Edition. The difference lies in the underlying storage.

Premium Edition provides managed storage, whereas Standard Edition does not. Customer data is stored in the object store within the customer's AWS account.

[Figure: Architecture of CelerData Cloud Serverless Standard Edition]