- Release Notes
- Get Started
- Clusters
- Cloud Settings
- Table Type
- Query Data Lakes
- Integration
- Query Acceleration
- Data Loading
- Concepts
- Batch load data from Amazon S3
- Batch load data from Azure cloud storage
- Load data from a local file system
- Load data from Confluent Cloud
- Load data from Amazon MSK
- Load data from Amazon Kinesis
- Data Unloading
- Data Backup
- Security
- Console Access Control
- Data Access Control
- Application keys
- Service accounts
- Use SSL connection
- Alarm
- Usage and Billing
- Organizations and Accounts
- Reference
- Amazon Web Services (AWS)
- Microsoft Azure
- SQL Reference
- Keywords
- ALL statements
- User Account Management
- Cluster Management
- ADMIN CANCEL REPAIR
- ADMIN CHECK TABLET
- ADMIN REPAIR
- ADMIN SET CONFIG
- ADMIN SET REPLICA STATUS
- ADMIN SHOW CONFIG
- ADMIN SHOW REPLICA DISTRIBUTION
- ADMIN SHOW REPLICA STATUS
- ALTER RESOURCE GROUP
- ALTER SYSTEM
- CANCEL DECOMMISSION
- CREATE FILE
- CREATE RESOURCE GROUP
- DROP FILE
- DROP RESOURCE GROUP
- EXPLAIN
- INSTALL PLUGIN
- SET
- SHOW BACKENDS
- SHOW BROKER
- SHOW COMPUTE NODES
- SHOW FRONTENDS
- SHOW FULL COLUMNS
- SHOW INDEX
- SHOW PLUGINS
- SHOW PROCESSLIST
- SHOW RESOURCE GROUP
- SHOW TABLE STATUS
- SHOW FILE
- SHOW VARIABLES
- UNINSTALL PLUGIN
- DDL
- ALTER DATABASE
- ALTER MATERIALIZED VIEW
- ALTER TABLE
- ALTER VIEW
- ANALYZE TABLE
- BACKUP
- CANCEL ALTER TABLE
- CANCEL BACKUP
- CANCEL RESTORE
- CREATE ANALYZE
- CREATE DATABASE
- CREATE EXTERNAL CATALOG
- CREATE INDEX
- CREATE MATERIALIZED VIEW
- CREATE REPOSITORY
- CREATE TABLE AS SELECT
- CREATE TABLE LIKE
- CREATE TABLE
- CREATE VIEW
- CREATE FUNCTION
- DROP ANALYZE
- DROP STATS
- DROP CATALOG
- DROP DATABASE
- DROP INDEX
- DROP MATERIALIZED VIEW
- DROP REPOSITORY
- DROP TABLE
- DROP VIEW
- DROP FUNCTION
- KILL ANALYZE
- RECOVER
- REFRESH EXTERNAL TABLE
- RESTORE
- SET CATALOG
- SHOW ANALYZE JOB
- SHOW ANALYZE STATUS
- SHOW META
- SHOW FUNCTION
- TRUNCATE TABLE
- USE
- DML
- ALTER LOAD
- ALTER ROUTINE LOAD
- BROKER LOAD
- CANCEL LOAD
- CANCEL EXPORT
- CANCEL REFRESH MATERIALIZED VIEW
- CREATE ROUTINE LOAD
- DELETE
- EXPORT
- GROUP BY
- INSERT
- PAUSE ROUTINE LOAD
- RESUME ROUTINE LOAD
- REFRESH MATERIALIZED VIEW
- SELECT
- SHOW ALTER
- SHOW ALTER MATERIALIZED VIEW
- SHOW BACKUP
- SHOW CATALOGS
- SHOW CREATE CATALOG
- SHOW CREATE MATERIALIZED VIEW
- SHOW CREATE TABLE
- SHOW CREATE VIEW
- SHOW DATA
- SHOW DATABASES
- SHOW DELETE
- SHOW DYNAMIC PARTITION TABLES
- SHOW EXPORT
- SHOW LOAD
- SHOW MATERIALIZED VIEW
- SHOW PARTITIONS
- SHOW REPOSITORIES
- SHOW RESTORE
- SHOW ROUTINE LOAD
- SHOW ROUTINE LOAD TASK
- SHOW SNAPSHOT
- SHOW TABLES
- SHOW TABLET
- SHOW TRANSACTION
- STOP ROUTINE LOAD
- STREAM LOAD
- SUBMIT TASK
- UPDATE
- Auxiliary Commands
- Data Types
- Keywords
- SQL Functions
- Function list
- Java UDFs
- Window functions
- Lambda expression
- Date Functions
- add_months
- adddate
- convert_tz
- current_date
- current_time
- current_timestamp
- date
- date_add
- date_diff
- date_format
- date_slice
- date_sub, subdate
- date_trunc
- datediff
- day
- dayofweek_iso
- dayname
- dayofmonth
- dayofweek
- dayofyear
- days_add
- days_diff
- days_sub
- from_days
- from_unixtime
- hour
- hours_add
- hours_diff
- hours_sub
- jodatime_format
- last_day
- makedate
- microseconds_add
- microseconds_sub
- minute
- minutes_add
- minutes_diff
- minutes_sub
- month
- monthname
- months_add
- months_diff
- months_sub
- next_day
- now
- previous_day
- quarter
- second
- seconds_add
- seconds_diff
- seconds_sub
- str_to_date
- str_to_jodatime
- str2date
- time_slice
- time_to_sec
- timediff
- timestamp
- timestampadd
- timestampdiff
- to_date
- to_days
- to_iso8601
- to_tera_date
- to_tera_timestamp
- unix_timestamp
- utc_timestamp
- week
- week_iso
- weekofyear
- weeks_add
- weeks_diff
- weeks_sub
- year
- years_add
- years_diff
- years_sub
- Aggregate Functions
- any_value
- approx_count_distinct
- array_agg
- avg
- bitmap
- bitmap_agg
- count
- count_if
- corr
- covar_pop
- covar_samp
- group_concat
- grouping
- grouping_id
- hll_empty
- hll_hash
- hll_raw_agg
- hll_union
- hll_union_agg
- max
- max_by
- min
- min_by
- multi_distinct_sum
- multi_distinct_count
- percentile_approx
- percentile_cont
- percentile_disc
- retention
- stddev
- stddev_samp
- sum
- variance, variance_pop, var_pop
- var_samp
- window_funnel
- Geographic Functions
- String Functions
- append_trailing_char_if_absent
- ascii
- char
- char_length
- character_length
- concat
- concat_ws
- ends_with
- find_in_set
- group_concat
- hex
- hex_decode_binary
- hex_decode_string
- instr
- lcase
- left
- length
- locate
- lower
- lpad
- ltrim
- money_format
- null_or_empty
- parse_url
- repeat
- replace
- reverse
- right
- rpad
- rtrim
- space
- split
- split_part
- substring_index
- starts_with
- strleft
- strright
- str_to_map
- substring
- trim
- ucase
- unhex
- upper
- url_decode
- url_encode
- Pattern Matching Functions
- JSON Functions
- Overview of JSON functions and operators
- JSON operators
- JSON constructor functions
- JSON query and processing functions
- Bit Functions
- Bitmap Functions
- Array Functions
- all_match
- any_match
- array_agg
- array_append
- array_avg
- array_concat
- array_contains
- array_contains_all
- array_cum_sum
- array_difference
- array_distinct
- array_filter
- array_generate
- array_intersect
- array_join
- array_length
- array_map
- array_max
- array_min
- array_position
- array_remove
- array_slice
- array_sort
- array_sortby
- array_sum
- arrays_overlap
- array_to_bitmap
- cardinality
- element_at
- reverse
- unnest
- Map Functions
- Binary Functions
- cast function
- hash function
- Cryptographic Functions
- Math Functions
- Pattern Matching Functions
- Percentile Functions
- Scalar Functions
- Struct Functions
- Table Functions
- Utility Functions
- AUTO_INCREMENT
- Generated columns
- System variables
- System limits
- Information Schema
- Overview
- be_bvars
- be_cloud_native_compactions
- be_compactions
- character_sets
- collations
- column_privileges
- columns
- engines
- events
- global_variables
- key_column_usage
- load_tracking_logs
- loads
- materialized_views
- partitions
- pipe_files
- pipes
- referential_constraints
- routines
- schema_privileges
- schemata
- session_variables
- statistics
- table_constraints
- table_privileges
- tables
- tables_config
- task_runs
- tasks
- triggers
- user_privileges
- views
- System Metadatabase
- API
- Overview
- Actions
- Clusters
- Create and Manage Clusters
- Query Clusters
- Identity and Access Management
- Organization and Account
- Usage and Billing
- Clusters
- Terraform Provider
- Run scripts
CREATE TABLE
Description
Creates a new table in CelerData.
Syntax
CREATE TABLE [IF NOT EXISTS] [database.]table_name
(column_definition1[, column_definition2, ...]
[, index_definition1[, ndex_definition12,]])
[ENGINE = olap]
[key_desc]
[COMMENT "table comment"];
[partition_desc]
[distribution_desc]
[rollup_index]
[ORDER BY (column_definition1,...)]
[PROPERTIES ("key"="value", ...)]
[BROKER PROPERTIES ("key"="value", ...)]
Parameters
column_definition
Syntax:
col_name col_type [agg_type] [NULL | NOT NULL] [DEFAULT "default_value"] [AUTO_INCREMENT] [AS generation_expr]
Note:
col_name:Column name
col_type:Column type
Specific column information, such as types and ranges:
* TINYINT(1 byte)
Range: -2^7 + 1 ~ 2^7 - 1
* SMALLINT(2 bytes)
Range: -2^15 + 1 ~ 2^15 - 1
* INT(4 bytes)
Range: -2^31 + 1 ~ 2^31 - 1
* BIGINT(8 bytes)
Range: -2^63 + 1 ~ 2^63 - 1
* LARGEINT(16 bytes)
Range: -2^127 + 1 ~ 2^127 - 1
* FLOAT(4 bytes)
Support scientific notation
* DOUBLE(8 bytes)
Support scientific notation
* DECIMAL[(precision, scale)] (16 bytes)
Default value: DECIMAL(10, 0).
precision: 1 ~ 38
scale: 0 ~ precision
Integer part:precision - scale
Scientific notation is not supported
* DATE(3 bytes)
Range :0000-01-01 ~ 9999-12-31
* DATETIME(8 bytes )
Range :0000-01-01 00:00:00 ~ 9999-12-31 23:59:59
* CHAR[(length)]
Fixed length string. Range:1 ~ 255。Default value: 1.
* VARCHAR[(length)]
Variable length string. Range:1 ~ 65533.
* HLL (1~16385 bytes)
For HLL type, there's no need to specify length or default value.
The length will be controlled within the system according to data aggregation.
HLL column can only be queried or used by hll_union_agg、Hll_cardinality、hll_hash.
* BITMAP
Bitmap type does not require specified length or default value. It represents a set of unsigned bigint numbers. The largest element could be up to 2^64 - 1.
agg_type:aggregation type. If not specified, this column is key column.
If specified, it it value column.
The aggregation types supported are as follows:
* SUM、MAX、MIN、REPLACE
* HLL_UNION (only for HLL type)
* BITMAP_UNION(only for BITMAP)
* REPLACE_IF_NOT_NULL:This means the imported data will only be replaced when it is of non-null value. If it is of null value, CelerData will retain the original value.
Note: if NOT NULL is specified by REPLACE_IF_NOT_NULL column when the table was created, CelerData will still convert the data to NULL without sending an error report to the user. With this, the user can import selected columns.
This aggregation type applies ONLY to the aggregation table whose key_desc type is AGGREGATE KEY.
NULL is not allowed by default. NULL value should be represented by /N in the imported data.
Note:
When the column of aggregation type BITMAP_UNION is imported, its original data types must be TINYINT, SMALLINT,
AUTO_INCREMENT:specifies an AUTO_INCREMENT
column. The data types of AUTO_INCREMENT
columns must be BIGINT. Auto-incremented IDs start from 1 and increase at a step of 1. For more information about AUTO_INCREMENT
columns, see AUTO_INCREMENT.
AS generation_expr: specifies the generated column and its expression. The generated column can be used to precompute and store the results of expressions, which significantly accelerates queries with the same complex expressions.
index_definition
Syntax:
INDEX index_name (col_name[, col_name, ...]) [USING BITMAP] COMMENT 'xxxxxx'
Note:
index_name: Index name
col_name: Column name
Note: Currently only BITMAP index is supported and it only applies to single columns.
ENGINE type
Default value: olap
.
CelerData supports creating Parquet-formatted tables in Iceberg catalogs, and you can insert data to these Parquet-formatted Iceberg tables by using INSERT INTO. See Create an Iceberg table.
key_desc
Syntax:
`key_type(k1[,k2 ...])`
Note:
Data is sequenced in specified key columns and has different attributes for different key_type.
Key_type supports:
AGGREGATE KEY: Identical content in key columns will be aggregated into value columns according to the specified aggregation type. It usually applies to business scenarios such as financial statements and multi-dimensional analysis.
UNIQUE KEY/PRIMARY KEY: Identical content in key columns will be replaced in value columns according to the import sequence. It can be applied to make addition, deletion, modification and query on key columns.
DUPLICATE KEY: Identical content in key columns, which also exists in CelerData at the same time. It can be used to store detailed data or data with no aggregation attributes.
DUPLICATE KEY is set as default. Data will be sequenced according to key columns.
Please note:
Value columns do not need to specify aggregation types when other key_type is used to create tables with the exception of AGGREGATE KEY.
partition_desc
Partition description can be used in the following three ways:
LESS THAN
Syntax:
PARTITION BY RANGE (k1, k2, ...)
(
PARTITION partition_name1 VALUES LESS THAN MAXVALUE|("value1", "value2", ...),
PARTITION partition_name2 VALUES LESS THAN MAXVALUE|("value1", "value2", ...)
...
)
Note:
Please use specified key columns and specified value ranges for partitioning.
- Partition name only supports [A-z0-9_]
- Columns in Range partition only support the following types: TINYINT, SAMLLINT, INT, BIGINT, LARGEINT, DATE, and DATETIME.
- Partitions are left closed and right open. The left boundary of the first partition is of minimum value.
- NULL value is stored only in partitions that contain minimum values. When the partition containing the minimum value is deleted, NULL values can no longer be imported.
- Partition columns can either be single columns or multiple columns. The partition values are the default minimum values.
Please note:
- Partitions are often used for managing data related to time.
- When data backtracking is needed, you may want to consider emptying the first partition for adding partitions later when necessary.
Fixed Range
Syntax:
PARTITION BY RANGE (k1, k2, k3, ...)
(
PARTITION partition_name1 VALUES [("k1-lower1", "k2-lower1", "k3-lower1",...), ("k1-upper1", "k2-upper1", "k3-upper1", ...)),
PARTITION partition_name2 VALUES [("k1-lower1-2", "k2-lower1-2", ...), ("k1-upper1-2", MAXVALUE, )),
"k3-upper1-2", ...
)
Note:
- Fixed Range is more flexible than LESS THAN. You can customize the left and right partitions.
- Fixed Range is the same as LESS THAN in the other aspects.
Create partitions in bulk
Syntax
PARTITION BY RANGE (datekey) (
START ("2021-01-01") END ("2021-01-04") EVERY (INTERVAL 1 day)
)
Description
You can specify the value for START
and END
and the expression in EVERY
to create partitions in bulk .
- If
datekey
supports DATE and INTEGER data type, the data type ofSTART
,END
, andEVERY
must be the same as the data type ofdatekey
. - If
datekey
only supports DATE data type, you need to use theINTERVAL
keyword to specify the date interval. You can specify the date interval by day, week, month, or year. The naming conventions of partitions are the same as those for dynamic partitions.
distribution_des
Hash bucketing
Syntax:
`DISTRIBUTED BY HASH (k1[,k2 ...]) [BUCKETS num]`
Note:
Please use specified key columns for Hash bucketing. The default bucket number is 10.
It is recommended to use Hash bucketing method.
ORDER BY
The primary key and sort key are decoupled in the Primary Key table. The sort key is specified by the ORDER BY
keyword and can be the permutation and combination of any columns.
NOTICE
If the sort key is specified, the prefix index is built according to the sort key; if the sort key is not specified, the prefix index is built according to the primary key.
PROPERTIES
- If ENGINE type is olap. Users can specify storage medium, cooldown time and replica number.
PROPERTIES (
"storage_medium" = "[SSD|HDD]",
[ "storage_cooldown_time" = "yyyy-MM-dd HH:mm:ss", ]
[ "replication_num" = "3" ]
)
storage_medium: SSD or HDD could be specified as the initial storage media. You can specify default initial storage medium by specifying default_storage_medium=xxx through FE configuration file fe.conf. If no medium is specified, the default is HDD.
Note: When FE configuration item enable_strict_storage_medium_check is True and if storage medium is not set in the CelerData cluster, the statement for table creating will report an error: Failed to find enough host in all backends with storage medium is SSD|HDD.
storage_cooldown_time: When the storage medium is SSD, please specify its storage cooldown time. The default time is 30 days. Format: "yyyy-MM-dd HH:mm:ss"
replication_num: number of replicas in the specified partition. Default number: 3.
When the table has only one partition, the properties belongs to the table. When the table has two levels of partitions, the properties belong to each partition. Users can also specify different properties for different partitions through ADD ADDITION and MODIFY PARTITION statements.
- If Engine type is olap, users can specify a column to adopt Bloomfilter index which applies only to the condition where in and equal are query filters. More discrete values in this column will result in more precise queries. Bloomfilter currently supports the key column, with the exception of the key column in TINYINT FLOAT DOUBLE type, and the value column with the aggregation method REPLACE.
PROPERTIES (
"bloom_filter_columns"="k1,k2,k3"
)
- If you want to use Colocate Join attributes, please specify it in properties.
PROPERTIES (
"colocate_with"="table1"
)
- If you want to use dynamic partition attributes, please specify it in properties.
PROPERTIES (
"dynamic_partition.enable" = "true|false",
"dynamic_partition.time_unit" = "DAY|WEEK|MONTH",
"dynamic_partition.start" = "${integer_value}",
"dynamic_partition.end" = "${integer_value}",
"dynamic_partition.prefix" = "${string_value}",
"dynamic_partition.buckets" = "${integer_value}"
dynamic_partition.enable: It is used to specify whether dynamic partitioning at the table level is enabled. Default value: true.
dynamic_partition.time_unit: It is used to specify the time unit for adding partitions dynamically. Time unit could be DAY, WEEK, MONTH.
dynamic_partition.start: It is used to specify how many partitions should be deleted. The value must be less than 0. Default value: integer.Min_VALUE.
dynamic_partition.end: It is used to specify the how many partitions will be created in advance. The value must be more than 0.
dynamic_partition.prefix: It is used to specify the prefix of the created partition. For instance, if the prefix is p, the partition will be named p20200108 automatically.
dynamic_partition.buckets: It is used to specify the number of buckets automatically created in partitions.
- When building tables, Rollup can be created in bulk.
Syntax:
ROLLUP (rollup_name (column_name1, column_name2, ...)
[FROM from_index_name]
[PROPERTIES ("key"="value", ...)],...)
Examples
Example 1: Create an olap table that uses Hash bucketing and column-based storage, and that is aggregated by identical key records.
CREATE TABLE example_db.table_hash
(
k1 TINYINT,
k2 DECIMAL(10, 2) DEFAULT "10.5",
v1 CHAR(10) REPLACE,
v2 INT SUM
)
ENGINE=olap
AGGREGATE KEY(k1, k2)
COMMENT "my first starrocks table"
DISTRIBUTED BY HASH(k1) BUCKETS 10
PROPERTIES ("storage_type"="column");
Example 2: Create an olap table that uses Hash bucketing and column-based storage, and that is aggregated by identical key records. Also, please set the storage medium and the cooldown time.
CREATE TABLE example_db.table_hash
(
k1 BIGINT,
k2 LARGEINT,
v1 VARCHAR(2048) REPLACE,
v2 SMALLINT SUM DEFAULT "10"
)
ENGINE=olap
UNIQUE KEY(k1, k2)
DISTRIBUTED BY HASH (k1, k2) BUCKETS 10
PROPERTIES(
"storage_type"="column",
"storage_medium" = "SSD",
"storage_cooldown_time" = "2015-06-04 00:00:00"
);
Or
CREATE TABLE example_db.table_hash
(
k1 BIGINT,
k2 LARGEINT,
v1 VARCHAR(2048) REPLACE,
v2 SMALLINT SUM DEFAULT "10"
)
ENGINE=olap
PRIMARY KEY(k1, k2)
DISTRIBUTED BY HASH (k1, k2) BUCKETS 10
PROPERTIES(
"storage_type"="column",
"storage_medium" = "SSD",
"storage_cooldown_time" = "2015-06-04 00:00:00"
);
Example 3: Create an olap table that uses Range partition, Hash bucketing and the default column-based storage. Records with the same key should exist at the same time. Also, please set the initial storage medium and the cooldown time.
LESS THAN
CREATE TABLE example_db.table_range
(
k1 DATE,
k2 INT,
k3 SMALLINT,
v1 VARCHAR(2048),
v2 DATETIME DEFAULT "2014-02-04 15:36:00"
)
ENGINE=olap
DUPLICATE KEY(k1, k2, k3)
PARTITION BY RANGE (k1)
(
PARTITION p1 VALUES LESS THAN ("2014-01-01"),
PARTITION p2 VALUES LESS THAN ("2014-06-01"),
PARTITION p3 VALUES LESS THAN ("2014-12-01")
)
DISTRIBUTED BY HASH(k2) BUCKETS 10
PROPERTIES(
"storage_medium" = "SSD",
"storage_cooldown_time" = "2015-06-04 00:00:00"
);
Note:
This statement will create 3 data partitions:
( { MIN }, {"2014-01-01"} )
[ {"2014-01-01"}, {"2014-06-01"} )
[ {"2014-06-01"}, {"2014-12-01"} )
Data outside these ranges will be not be loaded.
Fixed Range
CREATE TABLE table_range
(
k1 DATE,
k2 INT,
k3 SMALLINT,
v1 VARCHAR(2048),
v2 DATETIME DEFAULT "2014-02-04 15:36:00"
)
ENGINE=olap
DUPLICATE KEY(k1, k2, k3)
PARTITION BY RANGE (k1, k2, k3)
(
PARTITION p1 VALUES [("2014-01-01", "10", "200"), ("2014-01-01", "20", "300")),
PARTITION p2 VALUES [("2014-06-01", "100", "200"), ("2014-07-01", "100", "300"))
)
DISTRIBUTED BY HASH(k2) BUCKETS 10
PROPERTIES(
"storage_medium" = "SSD"
);
- Create a table that contain HLL columns.
CREATE TABLE example_db.example_table
(
k1 TINYINT,
k2 DECIMAL(10, 2) DEFAULT "10.5",
v1 HLL HLL_UNION,
v2 HLL HLL_UNION
)
ENGINE=olap
AGGREGATE KEY(k1, k2)
DISTRIBUTED BY HASH(k1) BUCKETS 10
PROPERTIES ("storage_type"="column");
- Create table containing BITMAP_UNION aggregation type. (The original data type of v1 and v2 columns muse be TINYINT, SMALLINT, INT)
CREATE TABLE example_db.example_table
(
k1 TINYINT,
k2 DECIMAL(10, 2) DEFAULT "10.5",
v1 BITMAP BITMAP_UNION,
v2 BITMAP BITMAP_UNION
)
ENGINE=olap
AGGREGATE KEY(k1, k2)
DISTRIBUTED BY HASH(k1) BUCKETS 10
PROPERTIES ("storage_type"="column");
- Create table t1 and t2 that support Colocate Join.
CREATE TABLE `t1`
(
`id` int(11) COMMENT "",
`value` varchar(8) COMMENT ""
)
ENGINE=OLAP
DUPLICATE KEY(`id`)
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES
(
"colocate_with" = "t1"
);
CREATE TABLE `t2`
(
`id` int(11) COMMENT "",
`value` varchar(8) COMMENT ""
)
ENGINE=OLAP
DUPLICATE KEY(`id`)
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES
(
"colocate_with" = "t1"
);
- Create a table with bitmap index
CREATE TABLE example_db.table_hash
(
k1 TINYINT,
k2 DECIMAL(10, 2) DEFAULT "10.5",
v1 CHAR(10) REPLACE,
v2 INT SUM,
INDEX k1_idx (k1) USING BITMAP COMMENT 'xxxxxx'
)
ENGINE=olap
AGGREGATE KEY(k1, k2)
COMMENT "my first starrocks table"
DISTRIBUTED BY HASH(k1) BUCKETS 10
PROPERTIES ("storage_type"="column");
- Create a dynamic partition table. (The dynamic partitioning function should be turned on in FE configuration.) This table will create partitions for three days and delete those created three days ago. For example, assuming today is 2020-01-08, partitions with these names will be created: p20200108, p20200109, p20200110, p20200111. And their ranges are:
[types: [DATE]; keys: [2020-01-08]; ‥types: [DATE]; keys: [2020-01-09]; )
[types: [DATE]; keys: [2020-01-09]; ‥types: [DATE]; keys: [2020-01-10]; )
[types: [DATE]; keys: [2020-01-10]; ‥types: [DATE]; keys: [2020-01-11]; )
[types: [DATE]; keys: [2020-01-11]; ‥types: [DATE]; keys: [2020-01-12]; )
CREATE TABLE example_db.dynamic_partition
(
k1 DATE,
k2 INT,
k3 SMALLINT,
v1 VARCHAR(2048),
v2 DATETIME DEFAULT "2014-02-04 15:36:00"
)
ENGINE=olap
DUPLICATE KEY(k1, k2, k3)
PARTITION BY RANGE (k1)
(
PARTITION p1 VALUES LESS THAN ("2014-01-01"),
PARTITION p2 VALUES LESS THAN ("2014-06-01"),
PARTITION p3 VALUES LESS THAN ("2014-12-01")
)
DISTRIBUTED BY HASH(k2) BUCKETS 10
PROPERTIES(
"storage_medium" = "SSD",
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.start" = "-3",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "10"
);
Create a Primary Key table and specify the sort key
Suppose that you need to analyze user behavior in real time from dimensions such as users' address and last active time. When you create a table, you can define the user_id
column as the primary key and define the combination of the address
and last_active
columns as the sort key.
create table users (
user_id bigint NOT NULL,
name string NOT NULL,
email string NULL,
address string NULL,
age tinyint NULL,
sex tinyint NULL,
last_active datetime,
property0 tinyint NOT NULL,
property1 tinyint NOT NULL,
property2 tinyint NOT NULL,
property3 tinyint NOT NULL
)
PRIMARY KEY (`user_id`)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 4
ORDER BY(`address`,`last_active`)
PROPERTIES(
"enable_persistent_index" = "true"
);