What is CelerData Cloud Serverless?
CelerData Cloud Serverless is a fully managed, blazingly fast data lake analytics platform built on top of StarRocks.
By taking advantage of an architecture that features separation of storage and compute, CelerData brings users cost-effective compute resources for their analytical SQL workloads.
With its out-of-the-box data analytics infrastructure, CelerData provides timely insights to all stakeholders. CelerData has brought many engineering breakthroughs to the market, delivering over 3x performance gains in standard benchmarks and up to an 80% reduction in operating costs. A number of significant customers worldwide choose CelerData as their analytics platform.
CelerData Cloud Serverless supports a wide range of use cases within one platform:
Data lake analytics
Directly query the data on your own data lake without data migration. CelerData Cloud Serverless supports all mainstream open data lake formats, including Apache Hive™, Apache Hudi, Apache Iceberg, and Delta Lake.
Query acceleration
Accelerate the analytics workloads on your data lake using asynchronous materialized views.
Data warehousing
Ingest data from external data sources into CelerData Cloud Serverless to support many more low-latency and high-concurrency data analysis scenarios.
The following figure shows the architecture of CelerData.
Software editions in CelerData Cloud Serverless
CelerData Cloud Serverless offers two software editions to choose from, ensuring that your usage fits your organization’s specific requirements:
- CelerData Cloud Serverless Standard Edition
- CelerData Cloud Serverless Premium Edition
Premium Edition builds on Standard Edition through the addition of edition-specific features and/or higher levels of service. As your organization’s needs change and grow, changing editions is easy.
When you create an account in CelerData, you need to define the software edition of the account.
Standard Edition
Customer benefits:
- Provides the compute engine for your data lake at minimum cost
- Stores in open standard formats
- Easy to query directly against AWS S3 or AWS Glue
- One-click migration from Trino, Presto, or Athena
- Provides a unified query layer to build reports based on multiple data sources
In Standard Edition, CelerData does not provide managed storage volumes for customers. This means that you can use CelerData as a query engine for low-latency interactive analysis of your data lake, but you cannot store data locally within CelerData.
Premium Edition
Customer benefits:
- Everything included in Standard Edition
- Higher performance queries for low-latency customer-facing workloads
- Pipeline-free query acceleration using materialized views for simplified architecture/maintenance
- Real-time low-latency workloads
In Premium Edition, CelerData provides managed storage for customers, which means data can be stored persistently within CelerData instead of, or in addition to, data in your data lake. In this edition, CelerData can therefore be used both as a query engine and as an analytical database. Importing data from your data lake into CelerData further accelerates analysis and supports business scenarios with more stringent latency and concurrency requirements.
Feature comparison
Feature | Standard Edition | Premium Edition |
---|---|---|
Integrate with an external metastore (AWS Glue or Hive metastore) and query data in external data systems (Apache Hudi, Apache Iceberg, Apache Hive, or Delta Lake) | ✓ | ✓ |
Table creation and data processing in external data systems (Apache Hudi, Apache Iceberg, Apache Hive, or Delta Lake) | ✓ | ✓ |
Query load isolation based on multi-warehousing | ✓ | ✓ |
Warehouse compute node scale-in and scale-out | ✓ | ✓ |
Warehouse auto-suspend | ✓ | ✓ |
Use local-disk cache to speed up queries | ✓ | ✓ |
Table creation in CelerData managed storage volumes | | ✓ |
Batch data ingestion from cloud storage (customer-managed AWS S3) into CelerData managed tables | | ✓ |
Integrate with Confluent Cloud and routinely load data into CelerData managed tables | | ✓ |
Use HTTP Streaming API to push data from local sources into CelerData managed tables | | ✓ |
Incrementally load Parquet or ORC files from customer-managed AWS S3 buckets into CelerData managed tables | | ✓ |
Accelerate queries and build models with materialized views | | ✓ |
Integration with BI tools | ✓ | ✓ |
Role-based access control | ✓ | ✓ |
Audit log | ✓ | ✓ |
Private link | ✓ | ✓ |
IP address whitelist | ✓ | ✓ |
Choosing between Standard and Premium Editions
Standard Edition
If you already have a data lake storage tier (for example, a data lake based on Iceberg, Hudi, Hive, or Delta Lake on your own AWS S3 bucket) and you are looking for a fully managed query engine that is compatible with it and delivers great query performance, consider Standard Edition. You only need to create warehouses (collections of compute resources) to execute your SQL queries. In CelerData Cloud Serverless, warehouses are responsible for executing SQL computations, not for storage. The underlying compute nodes do not persistently store data; they handle hot data caching and computation. You can scale computation horizontally by adding and removing warehouses, and you only need to choose a warehouse size that suits your trade-off between price and performance. With Standard Edition, you are charged only for the time your warehouses are running, so you can avoid unnecessary costs by suspending and resuming warehouses on demand.
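Because Standard Edition charges only for warehouse running time, a rough compute estimate is running hours multiplied by an hourly rate. The following Python sketch illustrates that arithmetic. The hourly rate and the run intervals are made-up placeholders, not CelerData prices, and actual billing granularity may differ.

```python
from datetime import datetime

# Hypothetical hourly rate, for illustration only; not an actual CelerData price.
HYPOTHETICAL_RATE_PER_HOUR = 2.0


def running_hours(intervals):
    """Sum the hours a warehouse was running, given (resume, suspend) pairs."""
    total_seconds = sum((end - start).total_seconds() for start, end in intervals)
    return total_seconds / 3600


def estimate_compute_cost(intervals, rate_per_hour=HYPOTHETICAL_RATE_PER_HOUR):
    """Estimate compute cost as running hours times the hourly rate."""
    return running_hours(intervals) * rate_per_hour


# A warehouse that was resumed twice in one day and suspended in between.
intervals = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 11, 30)),
    (datetime(2024, 1, 1, 14, 0), datetime(2024, 1, 1, 15, 0)),
]
print(running_hours(intervals))          # 3.5
print(estimate_compute_cost(intervals))  # 7.0
```

In this example the warehouse accrues 3.5 billable hours rather than 24, which is the saving that suspending idle warehouses is meant to capture.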
Premium Edition
If you have not built a storage tier yet and are looking for a fully managed data analytics platform that can persist data and provide high-performance analytics, you can go for Premium Edition. Premium Edition is also a good fit if you have already built a storage tier and want to load its data into a system with lower query latency to accelerate your analytics, whether the data resides in a streaming store (for example, Kafka), a data lake store (for example, AWS S3 or Apache Hive), or a database (for example, MySQL or PostgreSQL). In summary, Premium Edition provides data import and persistent storage capabilities, as well as the ability to query both CelerData managed data and external systems. Queries on CelerData managed data have lower latency.
It is worth mentioning that Premium Edition is built on the shared-data architecture of StarRocks, which separates storage and compute resources. Premium Edition uses object storage for data persistence, which offers almost unlimited storage capacity. With Premium Edition, you pay for both the storage resources and the compute consumption of the warehouse, and the hourly price is slightly higher than that of Standard Edition.
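The guidance above can be condensed into a small decision helper. This is only a sketch of one reading of the edition descriptions: the capability labels are invented for illustration and are not product identifiers, and the Premium-only set reflects the features that the text ties to CelerData managed storage.

```python
# Capabilities that, per the edition descriptions, depend on CelerData managed
# storage and are treated here as Premium-only. Labels are illustrative only.
PREMIUM_ONLY = {
    "managed_tables",
    "batch_load_from_s3",
    "routine_load_from_kafka",
    "http_streaming_load",
    "materialized_views",
}


def suggest_edition(required):
    """Return 'Premium' if any required capability is Premium-only, else 'Standard'."""
    return "Premium" if PREMIUM_ONLY & set(required) else "Standard"


# Querying an existing data lake only: Standard Edition is sufficient.
print(suggest_edition({"query_external_catalog", "role_based_access_control"}))

# Adding materialized views to accelerate lake queries: Premium Edition is needed.
print(suggest_edition({"query_external_catalog", "materialized_views"}))
```

The first call prints `Standard` and the second prints `Premium`.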
Pricing concern
In CelerData Cloud Serverless, the hourly warehouse price varies by software edition. We provide a free trial so that you can test the product without worrying about cost. If you are satisfied with the features and performance of the product and would like a detailed price list, please feel free to contact our sales team.
Upgrading
If you are testing with Standard Edition and want to upgrade to Premium Edition, contact our technical support team to help you with the upgrade. Automatic upgrades are not yet supported, so the upgrade takes about a day. Edition changes are one-way: you can upgrade from Standard Edition to Premium Edition, but CelerData Cloud Serverless does not currently support rolling back to a previous edition, so choose carefully when you create an account and select the edition.