- Release Notes
- Introduction to CelerData Cloud Serverless
- Quick Start
- Sign up for CelerData Cloud Serverless
- A quick tour of the console
- Connect to CelerData Cloud Serverless
- Create an IAM integration
- Create and assign a warehouse
- Create an external catalog
- Load data from cloud storage
- Load data from Apache Kafka/Confluent Cloud
- Try your first query
- Invite new users
- Design data access control policy
- Warehouses
- Catalog, database, table, view, and MV
- Overview of database objects
- Catalog
- Table types
- Asynchronous materialized views
- Data Loading
- Data access control
- Networking and private connectivity
- Usage and Billing
- Organization and Account
- Integration
- Query Acceleration
- Reference
- AWS IAM policies
- Information Schema
- Overview
- be_bvars
- be_cloud_native_compactions
- be_compactions
- character_sets
- collations
- column_privileges
- columns
- engines
- events
- global_variables
- key_column_usage
- load_tracking_logs
- loads
- materialized_views
- partitions
- pipe_files
- pipes
- referential_constraints
- routines
- schema_privileges
- schemata
- session_variables
- statistics
- table_constraints
- table_privileges
- tables
- tables_config
- task_runs
- tasks
- triggers
- user_privileges
- views
- Data Types
- System Metadatabase
- Keywords
- SQL Statements
- Account Management
- Data Definition
- CREATE TABLE
- ALTER TABLE
- DROP CATALOG
- CREATE TABLE LIKE
- REFRESH EXTERNAL TABLE
- RESTORE
- SET CATALOG
- DROP TABLE
- RECOVER
- USE
- CREATE MATERIALIZED VIEW
- DROP DATABASE
- ALTER MATERIALIZED VIEW
- DROP REPOSITORY
- CANCEL RESTORE
- DROP INDEX
- DROP MATERIALIZED VIEW
- CREATE DATABASE
- CREATE TABLE AS SELECT
- BACKUP
- CANCEL BACKUP
- CREATE REPOSITORY
- CREATE INDEX
- Data Manipulation
- INSERT
- SHOW CREATE DATABASE
- SHOW BACKUP
- SHOW ALTER MATERIALIZED VIEW
- SHOW CATALOGS
- SHOW CREATE MATERIALIZED VIEW
- SELECT
- SHOW ALTER
- SHOW MATERIALIZED VIEW
- RESUME ROUTINE LOAD
- ALTER ROUTINE LOAD
- SHOW TABLES
- STREAM LOAD
- SHOW PARTITIONS
- CANCEL REFRESH MATERIALIZED VIEW
- SHOW CREATE CATALOG
- SHOW ROUTINE LOAD TASK
- SHOW RESTORE
- CREATE ROUTINE LOAD
- STOP ROUTINE LOAD
- SHOW DATABASES
- BROKER LOAD
- SHOW ROUTINE LOAD
- PAUSE ROUTINE LOAD
- SHOW SNAPSHOT
- SHOW CREATE TABLE
- CANCEL LOAD
- REFRESH MATERIALIZED VIEW
- SHOW REPOSITORIES
- SHOW LOAD
- Administration
- DESCRIBE
- SQL Functions
- Function List
- String Functions
- CONCAT
- HEX
- LOWER
- SPLIT
- LPAD
- SUBSTRING
- PARSE_URL
- INSTR
- REPEAT
- LCASE
- REPLACE
- HEX_DECODE_BINARY
- RPAD
- SPLIT_PART
- STRCMP
- SPACE
- CHARACTER_LENGTH
- URL_ENCODE
- APPEND_TAILING_CHAR_IF_ABSENT
- LTRIM
- HEX_DECODE_STRING
- URL_DECODE
- LEFT
- STARTS_WITH
- CONCAT
- GROUP_CONCAT
- STR_TO_MAP
- STRLEFT
- STRRIGHT
- MONEY_FORMAT
- RIGHT
- SUBSTRING_INDEX
- UCASE
- TRIM
- FIND_IN_SET
- RTRIM
- ASCII
- UPPER
- REVERSE
- LENGTH
- UNHEX
- ENDS_WITH
- CHAR_LENGTH
- NULL_OR_EMPTY
- LOCATE
- CHAR
- Predicate Functions
- Map Functions
- Binary Functions
- Geospatial Functions
- Lambda Expression
- Utility Functions
- Bitmap Functions
- BITMAP_SUBSET_LIMIT
- TO_BITMAP
- BITMAP_AGG
- BITMAP_FROM_STRING
- BITMAP_OR
- BITMAP_REMOVE
- BITMAP_AND
- BITMAP_TO_BASE64
- BITMAP_MIN
- BITMAP_CONTAINS
- SUB_BITMAP
- BITMAP_UNION
- BITMAP_COUNT
- BITMAP_UNION_INT
- BITMAP_XOR
- BITMAP_UNION_COUNT
- BITMAP_HAS_ANY
- BITMAP_INTERSECT
- BITMAP_AND_NOT
- BITMAP_TO_STRING
- BITMAP_HASH
- INTERSECT_COUNT
- BITMAP_EMPTY
- BITMAP_MAX
- BASE64_TO_ARRAY
- BITMAP_TO_ARRAY
- Struct Functions
- Aggregate Functions
- RETENTION
- MI
- MULTI_DISTINCT_SUM
- WINDOW_FUNNEL
- STDDEV_SAMP
- GROUPING_ID
- HLL_HASH
- AVG
- HLL_UNION_AGG
- COUNT
- BITMAP
- HLL_EMPTY
- SUM
- MAX_BY
- PERCENTILE_CONT
- COVAR_POP
- PERCENTILE_APPROX
- HLL_RAW_AGG
- STDDEV
- CORR
- COVAR_SAMP
- MIN_BY
- MAX
- VAR_SAMP
- STD
- HLL_UNION
- APPROX_COUNT_DISTINCT
- MULTI_DISTINCT_COUNT
- VARIANCE
- ANY_VALUE
- COUNT_IF
- GROUPING
- PERCENTILE_DISC
- Array Functions
- ARRAY_CUM_SUM
- ARRAY_MAX
- ARRAY_LENGTH
- ARRAY_REMOVE
- UNNEST
- ARRAY_SLICE
- ALL_MATCH
- ARRAY_CONCAT
- ARRAY_SORT
- ARRAY_POSITION
- ARRAY_DIFFERENCE
- ARRAY_CONTAINS
- ARRAY_JOIN
- ARRAY_INTERSECT
- CARDINALITY
- ARRAY_CONTAINS_ALL
- ARRAYS_OVERLAP
- ARRAY_MIN
- ARRAY_MAP
- ELEMENT_AT
- ARRAY_APPEND
- ARRAY_SORTBY
- ARRAY_TO_BITMAP
- ARRAY_GENERATE
- ARRAY_AVG
- ARRAY_FILTER
- ANY_MATCH
- REVERSE
- ARRAY_AGG
- ARRAY_DISTINCT
- ARRAY_SUM
- Condition Functions
- Math Functions
- Date and Time Functions
- DAYNAME
- MINUTE
- FROM_UNIXTIME
- HOUR
- MONTHNAME
- MONTHS_ADD
- ADD_MONTHS
- DATE_SUB
- PREVIOUS_DAY
- TO_TERA_DATA
- MINUTES_SUB
- WEEKS_ADD
- HOURS_DIFF
- UNIX_TIMESTAMP
- DAY
- DATE_SLICE
- DATE
- CURTIME
- SECONDS_SUB
- MONTH
- WEEK
- TO_DATE
- TIMEDIFF
- MONTHS_DIFF
- STR_TO_JODATIME
- WEEK_ISO
- MICROSECONDS_SUB
- TIME_SLICE
- MAKEDATE
- DATE_TRUNC
- JODATIME
- DAYOFWEEK
- YEARS_SUB
- TIMESTAMP_ADD
- HOURS_SUB
- STR2DATE
- TIMESTAMP
- FROM_DAYS
- WEEK_OF_YEAR
- YEAR
- TIMESTAMP_DIFF
- TO_TERA_TIMESTAMP
- DAYOFMONTH
- DAYOFYEAR
- DATE_FORMAT
- MONTHS_SUB
- NEXT_DAY
- MINUTES_DIFF
- DATA_ADD
- MINUTES_ADD
- CURDATE
- DAY_OF_WEEK_ISO
- CURRENt_TIMESTAMP
- STR_TO_DATE
- LAST_DAY
- WEEKS_SUB
- TO_DAYS
- DATEDIFF
- NOW
- TO_ISO8601
- TIME_TO_SEC
- QUARTER
- SECONDS_DIFF
- UTC_TIMESTAMP
- DATA_DIFF
- SECONDS_ADD
- ADDDATE
- WEEKSDIFF
- CONVERT_TZ
- MICROSECONDS_ADD
- SECOND
- YEARS_DIFF
- YEARS_ADD
- HOURS_ADD
- DAYS_SUB
- DAYS_DIFF
- Cryptographic Functions
- Percentile Functions
- Bit Functions
- JSON Functions
- Hash Functions
- Scalar Functions
- Table Functions
CREATE MATERIALIZED VIEW
NOTE
This feature is included in the Premium software edition. Please see the Software editions documentation for details on the differences between Standard and Premium editions if you are subscribed to the Standard edition.
Description
Creates a materialized view. For usage information about materialized views, see Asynchronous materialized view.
CAUTION
Only users with the CREATE MATERIALIZED VIEW privilege in the database where the base table resides can create a materialized view.
Creating a materialized view is an asynchronous operation. Running this command successfully indicates that the task of creating the materialized view is submitted successfully. You can view the building status of an asynchronous materialized view by querying the metadata tables tasks
and task_runs
in Information Schema.
Asynchronous materialized view
Syntax
CREATE MATERIALIZED VIEW [IF NOT EXISTS] [database.]<mv_name>
[COMMENT ""]
-- distribution_desc
[DISTRIBUTED BY HASH(<bucket_key>[,<bucket_key2> ...]) [BUCKETS <bucket_number>]]
-- refresh_desc
[REFRESH
-- refresh_moment
[IMMEDIATE | DEFERRED]
-- refresh_scheme
[ASYNC | ASYNC (START <start_time>) EVERY INTERVAL <refresh_interval> | MANUAL]
]
-- partition_expression
[PARTITION BY
{<date_column> | date_trunc(fmt, <date_column>)}
]
-- order_by_expression
[ORDER BY (<sort_key>)]
[PROPERTIES ("key"="value", ...)]
AS
<query_statement>
Parameters in brackets [] are optional.
Parameters
mv_name (required)
The name of the materialized view. The naming requirements are as follows:
- The name must consist of letters (a-z or A-Z), digits (0-9), or underscores (_), and it can only start with a letter.
- The length of the name cannot exceed 64 characters.
- The name is case-sensitive.
CAUTION
Multiple materialized views can be created on the same base table, but the names of the materialized views in the same database cannot be duplicated.
COMMENT (optional)
Comment on the materialized view. Note that COMMENT
must be placed after mv_name
. Otherwise, the materialized view cannot be created.
distribution_desc (optional)
The bucketing strategy of the asynchronous materialized view. CelerData supports hash bucketing and random bucketing. If you do not specify this parameter, CelerData uses the random bucketing strategy and automatically sets the number of buckets.
Hash bucketing:
Syntax
DISTRIBUTED BY HASH (<bucket_key1>[,<bucket_key2> ...]) [BUCKETS <bucket_number>]
Random bucketing:
If you choose the random bucketing strategy and allow CelerData to set the number of buckets automatically, you do not need to specify
distribution_desc
. However, if you want to set the number of buckets manually, you can refer to the following syntax:DISTRIBUTED BY RANDOM BUCKETS <bucket_number>
CAUTION
Asynchronous materialized views with a random bucketing strategy cannot be assigned to a colocation group.
refresh_moment (optional)
The refresh moment of the materialized view. Default value: IMMEDIATE
. Valid values:
IMMEDIATE
: Refresh the asynchronous materialized view immediately after it is created.DEFERRED
: The asynchronous materialized view is not refreshed after it is created. You can manually refresh the materialized view or schedule regular refresh tasks.
refresh_scheme_desc (optional)
The refresh strategy of the asynchronous materialized view. Valid values:
ASYNC
: Asynchronous refresh mode. For a fixed-interval automatic refresh, you need to specify the refresh start time, refresh interval using the following units:DAY
,HOUR
,MINUTE
, andSECOND
. If you do not specify the interval, the materialized view refreshes each time the data in the base tables changes.MANUAL
: Manual refresh mode.
If this parameter is not specified, the default value MANUAL
is used.
partition_expression (optional)
The partitioning strategy of the asynchronous materialized view. As for the current version of CelerData, only one partition expression is supported when creating an asynchronous materialized view. Valid values:
column_name
: The name of the column used for partitioning. The expressionPARTITION BY dt
means to partition the materialized view according to thedt
column.- date_trunc function: The function used to truncate the time unit.
PARTITION BY date_trunc("MONTH", 'dt')
means that thedt
column is truncated to month as the unit for partitioning. The date_trunc function supports truncating time to units includingYEAR
,MONTH
,DAY
,HOUR
, andMINUTE
. You can further use time_slice or date_slice functions to convert the given time into the beginning or end of a time interval based on the specified time granularity, for example,PARTITION BY date_trunc("MONTH", time_slice(
dt, INTERVAL 7 DAY))
where time_slice's or date_slice's type must be of finer granularity thandate_trunc
's type.
If this parameter is not specified, no partitioning strategy is adopted by default.
order_by_expression (optional)
The sort key of the asynchronous materialized view. If you do not specify the sort key, CelerData chooses some of the prefix columns from SELECT columns as the sort keys. For example, in select a, b, c, d
, sort keys can be a
and b
.
PROPERTIES (optional)
Properties of the asynchronous materialized view. You can modify the properties of an existing materialized view using ALTER MATERIALIZED VIEW.
replication_num
: The number of materialized view replicas to create.storage_medium
: Storage medium type. Valid values:HDD
andSSD
.storage_cooldown_time
: the storage cooldown time for a partition. If both HDD and SSD storage mediums are used, data in the SSD storage is moved to the HDD storage after the time specified by this property. Format: "yyyy-MM-dd HH:mm:ss". The specified time must be later than the current time. If this property is not explicitly specified, the storage cooldown is not performed by default.partition_ttl_number
: The number of most recent materialized view partitions to keep. After the number of partitions exceeds this value, expired partitions will be deleted. CelerData will periodically check materialized view partitions according to the time interval specified in the FE configuration itemdynamic_partition_check_interval_seconds
, and automatically delete expired partitions. When the value is-1
, all partitions of the materialized view will be preserved. Default:-1
.partition_refresh_number
: In a single refresh, the maximum number of partitions to refresh. If the number of partitions to be refreshed exceeds this value, CelerData will split the refresh task and complete it in batches. Only when the previous batch of partitions is refreshed successfully, CelerData will continue to refresh the next batch of partitions until all partitions are refreshed. If any of the partitions fail to be refreshed, no subsequent refresh tasks will be generated. When the value is-1
, the refresh task will not be split. Default:-1
.excluded_trigger_tables
: If a base table of the materialized view is listed here, the automatic refresh task will not be triggered when the data in the base table is changed. This parameter only applies to load-triggered refresh strategy, and is usually used together with the propertyauto_refresh_partitions_limit
. Format:[db_name.]table_name
. When the value is an empty string, any data change in all base tables triggers the refresh of the corresponding materialized view. The default value is an empty string.auto_refresh_partitions_limit
: The number of most recent materialized view partitions that need to be refreshed when a materialized view refresh is triggered. You can use this property to limit the refresh range and reduce the refresh cost. However, because not all the partitions are refreshed, the data in the materialized view may not be consistent with the base table. Default:-1
. When the value is-1
, all partitions will be refreshed. When the value is a positive integer N, CelerData sorts the existing partitions in chronological order, and refreshes N partitions from the most recent partition. If the number of partitions is less than N, CelerData refreshes all existing partitions. If there are dynamic partitions created in advance in your materialized view, CelerData refreshes the pre-created partitions first, and then the existing partitions. Therefore, when setting this parameter, make sure that you have reserved margins for pre-created dynamic partitions.mv_rewrite_staleness_second
: If the materialized view's last refresh is within the time interval specified in this property, this materialized view can be used directly for query rewrite, regardless of whether the data in the base tables changes. If the last refresh is before this time interval, CelerData checks whether the base tables have been updated to determine whether the materialized view can be used for query rewrite. Unit: Second.colocate_with
: The colocation group of the asynchronous materialized view. See Colocate Join for further information.unique_constraints
andforeign_key_constraints
: The Unique Key constraints and Foreign Key constraints when you create an asynchronous materialized view for query rewrite in the View Delta Join scenario. See Asynchronous materialized view - Rewrite queries in View Delta Join scenario for further information.
CAUTION
The Unique Key constraints and Foreign Key constraints are only used for query rewrite. The Foreign Key constraint checks are not guaranteed when data is loaded into the table. You must ensure the data loaded into the table meets the constraints.
query_statement (required)
The query statement to create the asynchronous materialized view.
Query an asynchronous materialized view
An asynchronous materialized view is a physical table. You can operate it as any regular table except that you cannot directly load data into an asynchronous materialized view.
Automatic query rewrite with asynchronous materialized view
CelerData supports automatic and transparent query rewrite based on the SPJG-type asynchronous materialized views. The SPJG-type materialized views refer to materialized views whose plan only includes Scan, Filter, Project, and Aggregate types of operators. The SPJG-type materialized views query rewrite includes single table query rewrite, Join query rewrite, aggregation query rewrite, Union query rewrite and query rewrite based on nested materialized views.
See Asynchronous materialized view - Rewrite queries with the asynchronous materialized view for further information.
Supported data types
Asynchronous materialized views created based on the CelerData default catalog support the following data types:
- DATE
- DATETIME
- CHAR
- VARCHAR
- BOOLEAN
- TINYINT
- SMALLINT
- INT
- BIGINT
- LARGEINT
- FLOAT
- DOUBLE
- DECIMAL
- ARRAY
- JSON
- BITMAP
- HLL
- PERCENTILE
- MAP
- STRUCT
Asynchronous materialized views created based on the CelerData external catalogs support the following data types:
Hive Catalog
- INT/INTEGER
- BIGINT
- TIMESTAMP
- STRING
- VARCHAR
- CHAR
- DOUBLE
- FLOAT
- DECIMAL
- ARRAY
Hudi Catalog
- BOOLEAN
- INT
- DATE
- TimeMillis/TimeMicros
- TimestampMillis/TimestampMicros
- LONG
- FLOAT
- DOUBLE
- STRING
- ARRAY
- DECIMAL
Iceberg Catalog
- BOOLEAN
- INT
- LONG
- FLOAT
- DOUBLE
- DECIMAL(P, S)
- DATE
- TIME
- TIMESTAMP
- STRING
- UUID
- FIXED(L)
- BINARY
- LIST
Usage notes
The current version of CelerData does not support creating multiple materialized views at the same time. A new materialized view can only be created when the one before is completed.
About nested asynchronous materialized views:
- The refresh strategy for each materialized view only applies to the corresponding materialized view.
- Currently, CelerData does not limit the number of nesting levels. In a production environment, we recommend that the number of nesting layers not exceed THREE.
About external catalog asynchronous materialized views:
- External catalog materialized view only support async fixed-interval refresh and manual refresh.
- Strict consistency is not guaranteed between the materialized view and the base tables in the external catalog.
- Currently, building materialized views based on external resources is not supported.
- Currently, CelerData cannot perceive if the base table data in the external catalog has changed, so all partitions will be refreshed by default every time the base table is refreshed. You can manually refresh only some of partitions using REFRESH MATERIALIZED VIEW.
Examples
Examples of asynchronous materialized views
The following examples are based on the base tables below:
CREATE TABLE `lineorder` (
`lo_orderkey` int(11) NOT NULL COMMENT "",
`lo_linenumber` int(11) NOT NULL COMMENT "",
`lo_custkey` int(11) NOT NULL COMMENT "",
`lo_partkey` int(11) NOT NULL COMMENT "",
`lo_suppkey` int(11) NOT NULL COMMENT "",
`lo_orderdate` int(11) NOT NULL COMMENT "",
`lo_orderpriority` varchar(16) NOT NULL COMMENT "",
`lo_shippriority` int(11) NOT NULL COMMENT "",
`lo_quantity` int(11) NOT NULL COMMENT "",
`lo_extendedprice` int(11) NOT NULL COMMENT "",
`lo_ordtotalprice` int(11) NOT NULL COMMENT "",
`lo_discount` int(11) NOT NULL COMMENT "",
`lo_revenue` int(11) NOT NULL COMMENT "",
`lo_supplycost` int(11) NOT NULL COMMENT "",
`lo_tax` int(11) NOT NULL COMMENT "",
`lo_commitdate` int(11) NOT NULL COMMENT "",
`lo_shipmode` varchar(11) NOT NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`lo_orderkey`)
COMMENT "OLAP"
PARTITION BY RANGE(`lo_orderdate`)
(PARTITION p1 VALUES [("-2147483648"), ("19930101")),
PARTITION p2 VALUES [("19930101"), ("19940101")),
PARTITION p3 VALUES [("19940101"), ("19950101")),
PARTITION p4 VALUES [("19950101"), ("19960101")),
PARTITION p5 VALUES [("19960101"), ("19970101")),
PARTITION p6 VALUES [("19970101"), ("19980101")),
PARTITION p7 VALUES [("19980101"), ("19990101")))
DISTRIBUTED BY HASH(`lo_orderkey`);
CREATE TABLE IF NOT EXISTS `customer` (
`c_custkey` int(11) NOT NULL COMMENT "",
`c_name` varchar(26) NOT NULL COMMENT "",
`c_address` varchar(41) NOT NULL COMMENT "",
`c_city` varchar(11) NOT NULL COMMENT "",
`c_nation` varchar(16) NOT NULL COMMENT "",
`c_region` varchar(13) NOT NULL COMMENT "",
`c_phone` varchar(16) NOT NULL COMMENT "",
`c_mktsegment` varchar(11) NOT NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`c_custkey`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`c_custkey`);
CREATE TABLE IF NOT EXISTS `dates` (
`d_datekey` int(11) NOT NULL COMMENT "",
`d_date` varchar(20) NOT NULL COMMENT "",
`d_dayofweek` varchar(10) NOT NULL COMMENT "",
`d_month` varchar(11) NOT NULL COMMENT "",
`d_year` int(11) NOT NULL COMMENT "",
`d_yearmonthnum` int(11) NOT NULL COMMENT "",
`d_yearmonth` varchar(9) NOT NULL COMMENT "",
`d_daynuminweek` int(11) NOT NULL COMMENT "",
`d_daynuminmonth` int(11) NOT NULL COMMENT "",
`d_daynuminyear` int(11) NOT NULL COMMENT "",
`d_monthnuminyear` int(11) NOT NULL COMMENT "",
`d_weeknuminyear` int(11) NOT NULL COMMENT "",
`d_sellingseason` varchar(14) NOT NULL COMMENT "",
`d_lastdayinweekfl` int(11) NOT NULL COMMENT "",
`d_lastdayinmonthfl` int(11) NOT NULL COMMENT "",
`d_holidayfl` int(11) NOT NULL COMMENT "",
`d_weekdayfl` int(11) NOT NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`d_datekey`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`d_datekey`);
CREATE TABLE IF NOT EXISTS `supplier` (
`s_suppkey` int(11) NOT NULL COMMENT "",
`s_name` varchar(26) NOT NULL COMMENT "",
`s_address` varchar(26) NOT NULL COMMENT "",
`s_city` varchar(11) NOT NULL COMMENT "",
`s_nation` varchar(16) NOT NULL COMMENT "",
`s_region` varchar(13) NOT NULL COMMENT "",
`s_phone` varchar(16) NOT NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`s_suppkey`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`s_suppkey`);
CREATE TABLE IF NOT EXISTS `part` (
`p_partkey` int(11) NOT NULL COMMENT "",
`p_name` varchar(23) NOT NULL COMMENT "",
`p_mfgr` varchar(7) NOT NULL COMMENT "",
`p_category` varchar(8) NOT NULL COMMENT "",
`p_brand` varchar(10) NOT NULL COMMENT "",
`p_color` varchar(12) NOT NULL COMMENT "",
`p_type` varchar(26) NOT NULL COMMENT "",
`p_size` int(11) NOT NULL COMMENT "",
`p_container` varchar(11) NOT NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`p_partkey`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`p_partkey`);
create table orders (
dt date NOT NULL,
order_id bigint NOT NULL,
user_id int NOT NULL,
merchant_id int NOT NULL,
good_id int NOT NULL,
good_name string NOT NULL,
price int NOT NULL,
cnt int NOT NULL,
revenue int NOT NULL,
state tinyint NOT NULL
)
PRIMARY KEY (dt, order_id)
PARTITION BY RANGE(`dt`)
( PARTITION p20210820 VALUES [('2021-08-20'), ('2021-08-21')),
PARTITION p20210821 VALUES [('2021-08-21'), ('2021-08-22')) )
DISTRIBUTED BY HASH(order_id)
PROPERTIES (
"replication_num" = "3",
"enable_persistent_index" = "true"
);
Example 1: Create a non-partitioned materialized view.
CREATE MATERIALIZED VIEW lo_mv1
DISTRIBUTED BY HASH(`lo_orderkey`)
REFRESH ASYNC
AS
select
lo_orderkey,
lo_custkey,
sum(lo_quantity) as total_quantity,
sum(lo_revenue) as total_revenue,
count(lo_shipmode) as shipmode_count
from lineorder
group by lo_orderkey, lo_custkey
order by lo_orderkey;
Example 2: Create a partitioned materialized view.
CREATE MATERIALIZED VIEW lo_mv2
PARTITION BY `lo_orderdate`
DISTRIBUTED BY HASH(`lo_orderkey`)
REFRESH ASYNC START('2023-07-01 10:00:00') EVERY (interval 1 day)
AS
select
lo_orderkey,
lo_orderdate,
lo_custkey,
sum(lo_quantity) as total_quantity,
sum(lo_revenue) as total_revenue,
count(lo_shipmode) as shipmode_count
from lineorder
group by lo_orderkey, lo_orderdate, lo_custkey
order by lo_orderkey;
-- Use the date_trunc() function to partition the materialized view by month.
CREATE MATERIALIZED VIEW order_mv1
PARTITION BY date_trunc('month', `dt`)
DISTRIBUTED BY HASH(`order_id`)
REFRESH ASYNC START('2023-07-01 10:00:00') EVERY (interval 1 day)
AS
select
dt,
order_id,
user_id,
sum(cnt) as total_cnt,
sum(revenue) as total_revenue,
count(state) as state_count
from orders
group by dt, order_id, user_id;
Example 3: Create an asynchronous materialized view.
CREATE MATERIALIZED VIEW flat_lineorder
DISTRIBUTED BY HASH(`lo_orderkey`)
REFRESH MANUAL
AS
SELECT
l.LO_ORDERKEY AS LO_ORDERKEY,
l.LO_LINENUMBER AS LO_LINENUMBER,
l.LO_CUSTKEY AS LO_CUSTKEY,
l.LO_PARTKEY AS LO_PARTKEY,
l.LO_SUPPKEY AS LO_SUPPKEY,
l.LO_ORDERDATE AS LO_ORDERDATE,
l.LO_ORDERPRIORITY AS LO_ORDERPRIORITY,
l.LO_SHIPPRIORITY AS LO_SHIPPRIORITY,
l.LO_QUANTITY AS LO_QUANTITY,
l.LO_EXTENDEDPRICE AS LO_EXTENDEDPRICE,
l.LO_ORDTOTALPRICE AS LO_ORDTOTALPRICE,
l.LO_DISCOUNT AS LO_DISCOUNT,
l.LO_REVENUE AS LO_REVENUE,
l.LO_SUPPLYCOST AS LO_SUPPLYCOST,
l.LO_TAX AS LO_TAX,
l.LO_COMMITDATE AS LO_COMMITDATE,
l.LO_SHIPMODE AS LO_SHIPMODE,
c.C_NAME AS C_NAME,
c.C_ADDRESS AS C_ADDRESS,
c.C_CITY AS C_CITY,
c.C_NATION AS C_NATION,
c.C_REGION AS C_REGION,
c.C_PHONE AS C_PHONE,
c.C_MKTSEGMENT AS C_MKTSEGMENT,
s.S_NAME AS S_NAME,
s.S_ADDRESS AS S_ADDRESS,
s.S_CITY AS S_CITY,
s.S_NATION AS S_NATION,
s.S_REGION AS S_REGION,
s.S_PHONE AS S_PHONE,
p.P_NAME AS P_NAME,
p.P_MFGR AS P_MFGR,
p.P_CATEGORY AS P_CATEGORY,
p.P_BRAND AS P_BRAND,
p.P_COLOR AS P_COLOR,
p.P_TYPE AS P_TYPE,
p.P_SIZE AS P_SIZE,
p.P_CONTAINER AS P_CONTAINER FROM lineorder AS l
INNER JOIN customer AS c ON c.C_CUSTKEY = l.LO_CUSTKEY
INNER JOIN supplier AS s ON s.S_SUPPKEY = l.LO_SUPPKEY
INNER JOIN part AS p ON p.P_PARTKEY = l.LO_PARTKEY;