Feature Support: Data Distribution

This document outlines the partitioning and bucketing features supported by StarRocks.

Supported table types

  • Bucketing

    Hash Bucketing is supported in all table types. Random Bucketing (from v3.1 onwards) is supported only in Duplicate Key tables.

  • Partitioning

    Expression Partitioning (from v3.1 onwards), Range Partitioning, and List Partitioning (from v3.1 onwards) are supported in all table types.

Bucketing

| Feature | Key point | Support status | Note |
| --- | --- | --- | --- |
| Bucketing strategy | Hash Bucketing | Yes | |
| Bucketing strategy | Random Bucketing | Yes (v3.1+) | Random Bucketing is supported only in Duplicate Key tables. From v3.2, StarRocks supports dynamically adjusting the number of tablets to create according to cluster information and the data size. |
| Bucket Key data type | Date, Integer, String | Yes | |
| Bucket number | Automatically set the number of buckets | Yes (v3.0+) | Automatically determined by the number of BE nodes or the data volume of the largest historical partition. The logic has been optimized separately for partitioned tables and non-partitioned tables in later versions. |
| Bucket number | Dynamic increase of the Bucket number for Random Bucketing | Yes (v3.2+) | |
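
As an illustration of the bucketing options listed above, the statements below sketch one Duplicate Key table with Hash Bucketing and one with Random Bucketing. The table and column names (`site_access_hash`, `site_access_random`, `site_id`, and so on) are hypothetical, and the second statement assumes a cluster running v3.1 or later. Both statements omit the `BUCKETS` clause, which leaves the bucket number to StarRocks.

```sql
-- Hypothetical table: Hash Bucketing on site_id.
-- No BUCKETS clause, so the bucket number is set automatically.
CREATE TABLE site_access_hash (
    event_day DATE,
    site_id   INT,
    pv        BIGINT
)
DUPLICATE KEY(event_day, site_id)
DISTRIBUTED BY HASH(site_id);

-- Hypothetical table: Random Bucketing (Duplicate Key tables only, v3.1+).
CREATE TABLE site_access_random (
    event_day DATE,
    site_id   INT,
    pv        BIGINT
)
DUPLICATE KEY(event_day, site_id)
DISTRIBUTED BY RANDOM;
```

To pin the bucket number explicitly instead, append `BUCKETS <n>` to either `DISTRIBUTED BY` clause.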

Partitioning

| Feature | Key point | Support status | Note |
| --- | --- | --- | --- |
| Partitioning strategy | Expression Partitioning | Yes (v3.1+) | Including Partitioning based on a time function expression (since v3.0) and Partitioning based on the column expression (since v3.1). Supported time functions: date_trunc, time_slice. |
| Partitioning strategy | Range Partitioning | Yes (v3.2+) | Since v3.3.0, the following functions can be used for Partition Keys: from_unixtime, from_unixtime_ms, str2date, substr/substring. |
| Partitioning strategy | List Partitioning | Yes (v3.1+) | |
| Partition Key data type | Date, Integer, Boolean | Yes | |
| Partition Key data type | String | Yes | Only Expression Partitioning and List Partitioning support String-type Partition Keys. Range Partitioning does not support String-type Partition Keys; use str2date to convert the column to a date type. |
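
The str2date workaround for String-type columns in Range Partitioning might look like the sketch below. The table name `site_access_range`, its columns, and the date range are hypothetical, and the statement assumes a version in which Range Partitioning accepts str2date in the Partition Key.

```sql
-- Hypothetical table: event_day holds strings such as '2021-01-05'.
-- str2date converts the string to a date so it can serve as a range Partition Key.
CREATE TABLE site_access_range (
    event_day STRING,
    site_id   INT,
    pv        BIGINT
)
DUPLICATE KEY(event_day, site_id)
PARTITION BY RANGE(str2date(event_day, '%Y-%m-%d')) (
    -- Create daily partitions in batch before loading data.
    START ("2021-01-01") END ("2021-01-10") EVERY (INTERVAL 1 DAY)
)
DISTRIBUTED BY HASH(site_id);
```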

Differences between partitioning strategies

| Aspect | Expression Partitioning (time function expression-based) | Expression Partitioning (column expression-based) | Range Partitioning | List Partitioning |
| --- | --- | --- | --- | --- |
| Data type | Date (DATE/DATETIME) | String (except BINARY); Date (DATE/DATETIME); Integer and Boolean | String (except BINARY) [1]; Date or timestamp [1]; Integer | String (except BINARY); Date (DATE/DATETIME); Integer and Boolean |
| Support for multiple Partition Keys | / (Only supports one date-type Partition Key) | Yes | Yes | Yes |
| Support Null values for Partition Keys | Yes | / [2] | Yes | / [2] |
| Manual creation of partitions before data loading | / [3] | / [3] | Yes if the partitions are manually created in batch; No if the dynamic partitioning strategy is adopted | Yes |
| Automatic creation of partitions while data loading | Yes | Yes | / | / |
Note
  • [1]: You need to use from_unixtime, str2date or other time functions to transform the column to date types.
  • [2]: Null values will be supported in Partition Keys for List Partitioning from v3.3.3 onwards.
  • [3]: Partitions are automatically created.
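
The difference in partition creation can be sketched with two hypothetical tables: one using time function expression-based Expression Partitioning, where partitions are created automatically during loading, and one using List Partitioning, where partitions are declared up front. All table names, column names, and partition values are illustrative assumptions.

```sql
-- Hypothetical table: Expression Partitioning on a time function.
-- One partition per day is created automatically as data is loaded.
CREATE TABLE site_access_expr (
    event_day DATETIME NOT NULL,
    site_id   INT,
    pv        BIGINT
)
DUPLICATE KEY(event_day, site_id)
PARTITION BY date_trunc('day', event_day)
DISTRIBUTED BY HASH(site_id);

-- Hypothetical table: List Partitioning on a String column.
-- Each partition and its values are declared manually before loading.
CREATE TABLE recharge_detail_list (
    id    BIGINT,
    city  VARCHAR(20) NOT NULL,
    money DECIMAL(32, 2)
)
DUPLICATE KEY(id)
PARTITION BY LIST (city) (
    PARTITION pLos_Angeles    VALUES IN ("Los Angeles"),
    PARTITION pSan_Francisco  VALUES IN ("San Francisco")
)
DISTRIBUTED BY HASH(id);
```

In the second table, a row whose `city` matches no declared partition has no partition to land in, which is the practical cost of manual creation.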

For detailed comparisons between List Partitioning and Expression Partitioning, refer to Comparison between list partitioning and expression partitioning.