Resource group
This topic describes the resource group feature of StarRocks.
With this feature, you can run multiple workloads simultaneously in a single cluster, including short queries, ad-hoc queries, and ETL jobs, saving the extra cost of deploying multiple clusters. From a technical perspective, the execution engine schedules concurrent workloads according to users' specifications and isolates them from interfering with each other.
The roadmap of the resource group feature:
- Since v2.2, StarRocks supports limiting resource consumption for queries and implementing isolation and efficient use of resources among tenants in the same cluster.
- In StarRocks v2.3, you can further restrict the resource consumption of big queries and prevent oversized query requests from exhausting cluster resources, guaranteeing system stability.
- StarRocks v2.5 supports limiting computation resource consumption for data loading (INSERT).
- From v3.3.5 onwards, StarRocks supports imposing hard limits on CPU resources.
Version | Internal Table | External Table | Big Query Restriction | INSERT INTO | Broker Load | Routine Load, Stream Load, Schema Change | CPU Hard Limit
---|---|---|---|---|---|---|---
2.2 | √ | × | × | × | × | × | ×
2.3 | √ | √ | √ | × | × | × | ×
2.5 | √ | √ | √ | √ | × | × | ×
3.1 & 3.2 | √ | √ | √ | √ | √ | × | ×
3.3.5 and later | √ | √ | √ | √ | √ | × | √
Terms
This section describes the terms that you must understand before you use the resource group feature.
resource group
Each resource group is a set of computing resources from a specific BE. You can divide each BE of your cluster into multiple resource groups. When a query is assigned to a resource group, StarRocks allocates CPU and memory resources to the resource group based on the resource quotas that you specified for the resource group.
You can specify CPU and memory resource quotas for a resource group on a BE by using the following parameters:
Parameter | Description | Value Range | Default
---|---|---|---
`cpu_weight` | The CPU scheduling weight of this resource group on a BE node. | (0, `avg_be_cpu_cores`] (takes effect when greater than 0) | 0
`exclusive_cpu_cores` | The CPU hard isolation parameter for this resource group. | (0, `min_be_cpu_cores - 1`] (takes effect when greater than 0) | 0
`mem_limit` | The percentage of memory available for queries by this resource group on the current BE node. | (0, 1] (required) | -
`spill_mem_limit_threshold` | The memory usage threshold that triggers spilling to disk. | (0, 1] | 1.0
`concurrency_limit` | The maximum number of concurrent queries in this resource group. | Integer (takes effect when greater than 0) | 0
`big_query_cpu_second_limit` | The maximum CPU time (in seconds) for big query tasks on each BE node. | Integer (takes effect when greater than 0) | 0
`big_query_scan_rows_limit` | The maximum number of rows that big query tasks can scan on each BE node. | Integer (takes effect when greater than 0) | 0
`big_query_mem_limit` | The maximum memory (in bytes) that big query tasks can use on each BE node. | Integer (takes effect when greater than 0) | 0
CPU resource parameters
cpu_weight
This parameter specifies the CPU scheduling weight of a resource group on a single BE node, determining the relative share of CPU time allocated to tasks from this group. Before v3.3.5, this parameter was named `cpu_core_limit`.

Its value range is (0, `avg_be_cpu_cores`], where `avg_be_cpu_cores` is the average number of CPU cores across all BE nodes. The parameter takes effect only when it is set to a value greater than 0. Either `cpu_weight` or `exclusive_cpu_cores` must be greater than 0, but not both.
NOTE
For example, suppose three resource groups, rg1, rg2, and rg3, have cpu_weight values of 2, 6, and 8, respectively. On a fully loaded BE node, these groups would receive 12.5%, 37.5%, and 50% of the CPU time. If the node is not fully loaded and rg1 and rg2 are under load while rg3 is idle, rg1 and rg2 would receive 25% and 75% of the CPU time, respectively.
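The groups in the note above could be created as follows. This is a minimal sketch: the group names, classifier users, and `mem_limit` values are hypothetical, and the `cpu_weight` property assumes v3.3.5 or later (earlier versions use `cpu_core_limit`).

```sql
-- Three hypothetical groups with cpu_weight 2, 6, and 8.
-- On a fully loaded BE node they would receive
-- 12.5%, 37.5%, and 50% of the CPU time, respectively.
CREATE RESOURCE GROUP rg1
TO (user='user1')   -- classifier: routes user1's queries to rg1
WITH ('cpu_weight' = '2', 'mem_limit' = '20%');

CREATE RESOURCE GROUP rg2
TO (user='user2')
WITH ('cpu_weight' = '6', 'mem_limit' = '20%');

CREATE RESOURCE GROUP rg3
TO (user='user3')
WITH ('cpu_weight' = '8', 'mem_limit' = '20%');
```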
exclusive_cpu_cores
This parameter defines the CPU hard limit for a resource group. It has two implications:
- Exclusive: Reserves `exclusive_cpu_cores` CPU cores exclusively for this resource group, making them unavailable to other groups, even when idle.
- Quota: Limits the resource group to only these reserved CPU cores, preventing it from using available CPU resources from other groups.
The value range is (0, `min_be_cpu_cores - 1`], where `min_be_cpu_cores` is the minimum number of CPU cores across all BE nodes. It takes effect only when set to a value greater than 0. Only one of `cpu_weight` or `exclusive_cpu_cores` can be set to greater than 0.

- Resource groups with `exclusive_cpu_cores` greater than 0 are called Exclusive resource groups, and the CPU cores allocated to them are called Exclusive Cores. Other groups are called Shared resource groups and run on Shared Cores.
- The total number of `exclusive_cpu_cores` across all resource groups cannot exceed `min_be_cpu_cores - 1`. The upper limit is set to leave at least one Shared Core available.
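As a sketch (the group name, user, and values are hypothetical, and the `exclusive_cpu_cores` property requires v3.3.5 or later), an Exclusive resource group reserving 4 cores might be created like this:

```sql
-- Reserve 4 CPU cores exclusively for this group on each BE node.
-- The total exclusive_cpu_cores across all groups must stay within
-- min_be_cpu_cores - 1, so at least one Shared Core remains.
CREATE RESOURCE GROUP rg_exclusive
TO (user='etl_user')
WITH ('exclusive_cpu_cores' = '4', 'mem_limit' = '30%');
```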
The relationship between `exclusive_cpu_cores` and `cpu_weight`: Only one of `cpu_weight` or `exclusive_cpu_cores` can be active at a time. Exclusive resource groups operate on their own reserved Exclusive Cores without requiring a share of CPU time through `cpu_weight`.
You can configure whether Shared resource groups can borrow Exclusive Cores from Exclusive resource groups using the BE configuration item `enable_resource_group_cpu_borrowing`. When it is set to `true` (default), Shared groups can borrow CPU resources while Exclusive groups are idle.
To modify this configuration dynamically, use the following statement:

```sql
UPDATE information_schema.be_configs SET VALUE = "false" WHERE NAME = "enable_resource_group_cpu_borrowing";
```
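To inspect the current value on each BE, you can query the same table (a sketch; the column names are assumed to follow the `information_schema.be_configs` schema used in the statement above):

```sql
-- Check the borrowing setting on every BE node.
SELECT BE_ID, NAME, VALUE
FROM information_schema.be_configs
WHERE NAME = 'enable_resource_group_cpu_borrowing';
```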
Memory resource parameters
mem_limit
Specifies the percentage of memory (query pool) available to the resource group on the current BE node. The value range is (0,1].
spill_mem_limit_threshold
Defines the memory usage threshold that triggers spilling to disk. The value range is (0,1], with the default being 1 (inactive). Introduced in v3.1.7.
- When automatic spilling is enabled (`spill_mode` set to `auto`) but resource groups are disabled, the system spills intermediate results to disk when a query's memory usage exceeds 80% of `query_mem_limit`.
- When resource groups are enabled, spilling occurs if:
  - the total memory usage of all queries in the group exceeds `current BE memory limit * mem_limit * spill_mem_limit_threshold`, or
  - the memory usage of the current query exceeds 80% of `query_mem_limit`.
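For instance (a sketch with a hypothetical group name), the threshold can be lowered with `ALTER RESOURCE GROUP` so that spilling starts earlier:

```sql
-- Trigger spilling once the group's queries together use 80% of its
-- memory quota, i.e. current BE memory limit * mem_limit * 0.8.
ALTER RESOURCE GROUP rg1 WITH ('spill_mem_limit_threshold' = '0.8');
```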
Query concurrency parameters
concurrency_limit
Defines the maximum number of concurrent queries in the resource group to prevent system overload. Effective only when greater than 0, with a default value of 0.
Big query resource parameters
You can configure resource limits specifically for large queries using the following parameters:
big_query_cpu_second_limit
Specifies the maximum CPU time (in seconds) that large query tasks can use on each BE node, summing the actual CPU time used by parallel tasks. Effective only when greater than 0, with a default value of 0.
big_query_scan_rows_limit
Sets a limit on the number of rows large query tasks can scan on each BE node. Effective only when greater than 0, with a default value of 0.
big_query_mem_limit
Defines the maximum memory large query tasks can use on each BE node, in bytes. Effective only when greater than 0, with a default value of 0.
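Putting these limits together, a group for ad-hoc analysis might look like the following sketch (the group name, user, and all values are hypothetical):

```sql
-- Terminate any query in this group that, on a single BE node, uses more
-- than 100 seconds of CPU, scans more than 100000 rows, or uses more
-- than 1 GB of memory.
CREATE RESOURCE GROUP rg_adhoc
TO (user='analyst')
WITH (
    'cpu_weight' = '10',
    'mem_limit' = '30%',
    'concurrency_limit' = '20',
    'big_query_cpu_second_limit' = '100',
    'big_query_scan_rows_limit' = '100000',
    'big_query_mem_limit' = '1073741824'   -- 1 GB, in bytes
);
```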
NOTE

When a query running in a resource group exceeds any of the above big query limits, the query is terminated with an error. You can also view the error message in the `ErrorCode` column of the FE audit log `fe.audit.log`.