BE Configuration
Some BE configuration items are dynamic parameters that you can set interactively while BE nodes are online. The rest are static parameters, which you can set only by changing them in the CelerData Cloud BYOC console.
Follow these steps to configure BE static parameters of a classic cluster in the CelerData Cloud BYOC console:
- Sign in to the CelerData Cloud BYOC console.
- On the Clusters page, click the classic cluster that you want to configure.
- On the cluster details page, click the Cluster parameters tab.
- In the BE Node static configuration section, click View all parameter list.
- In the dialog box that appears, click New parameter.
- Enter the name of the static parameter you want to configure in the Parameter key field, and the parameter value you want to set in the Value field. For the details of the parameters you can configure, see Usage notes.
- Click the save button next to the Value field to save the record of the parameter change.
- After configuring all the parameters you want to change, click Save changes to save the changes.
- To allow the changes to take effect, you can either manually suspend and resume the cluster whenever you find it suitable, or click Apply to all nodes in the BE Node static configuration section to restart the cluster instantly.
Follow these steps to configure Compute Node parameters of a warehouse in the CelerData Cloud BYOC console:
- Sign in to the CelerData Cloud BYOC console.
- On the Clusters page, click the elastic cluster that you want to configure.
- On the cluster details page, click the Warehouses tab.
- On the Warehouses tab, click the warehouse that you want to configure.
- On the warehouse details page, click the Warehouse parameters tab.
- In the Compute Node static configuration section, click View all parameter list.
- In the dialog box that appears, click New parameter.
- Enter the name of the parameter you want to configure in the Parameter key field, and the parameter value you want to set in the Value field.
note
CelerData provides validation checks on parameter keys and value types. Only valid parameter keys and values can be applied. For the details of the parameters you can configure, see Usage notes.
- Click the save button next to the Value field to save the record of the parameter change.
- After configuring all the parameters you want to change, click Save changes to save the changes.
- To allow the changes to take effect, you can either manually suspend and resume the warehouse whenever you find it suitable, or click Apply to all nodes in the Compute Node static configuration section to restart the warehouse instantly. For default_warehouse, you can only restart the cluster.
View BE configuration items
You can view the BE configuration items using the following command:
SELECT * FROM information_schema.be_configs WHERE NAME LIKE "%<name_pattern>%"
Configure BE parameters
Configure BE dynamic parameters
You can configure a dynamic parameter of a BE node by using the curl command.
curl -XPOST http://be_host:http_port/api/update_config?<configuration_item>=<value>
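For illustration, the same endpoint can be targeted programmatically. The sketch below only builds the `update_config` URL; the host `10.0.0.1`, port `8040`, and the parameter/value shown are placeholder assumptions, and the helper name is invented:

```python
# Hypothetical helper that builds the update_config URL for a dynamic BE
# parameter. The endpoint form matches the curl command above; the host,
# port, and parameter values are placeholders, not real defaults you must use.
def build_update_config_url(be_host: str, http_port: int, item: str, value: str) -> str:
    return f"http://{be_host}:{http_port}/api/update_config?{item}={value}"

url = build_update_config_url("10.0.0.1", 8040, "disable_storage_page_cache", "true")
print(url)
```

Sending a POST request to the printed URL (for example, with curl as shown above) applies the change to that single BE node only; repeat for each node as needed.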
Configure BE static parameters
You can only set the static parameters of a BE by changing them in the CelerData Cloud BYOC console, and restarting the BE to allow the changes to take effect.
This topic introduces the following types of BE configurations:
Query
clear_udf_cache_when_start
- Default: false
- Type: Boolean
- Unit: -
- Is mutable: No
- Description: When enabled, the BE's UserFunctionCache will clear all locally cached user function libraries on startup. During UserFunctionCache::init, the code calls _reset_cache_dir(), which removes UDF files from the configured UDF library directory (organized into kLibShardNum subdirectories) and deletes files with Java/Python UDF suffixes (.jar/.py). When disabled (default), the BE loads existing cached UDF files instead of deleting them. Enabling this forces UDF binaries to be re-downloaded on first use after restart (increasing network traffic and first-use latency).
- Introduced in: v4.0.0
dictionary_speculate_min_chunk_size
- Default: 10000
- Type: Int
- Unit: Rows
- Is mutable: No
- Description: Minimum number of rows (chunk size) used by StringColumnWriter and DictColumnWriter to trigger dictionary-encoding speculation. If an incoming column (or the accumulated buffer plus incoming rows) has a size larger than or equal to `dictionary_speculate_min_chunk_size`, the writer runs speculation immediately and sets an encoding (DICT, PLAIN, or BIT_SHUFFLE) rather than buffering more rows. Speculation uses `dictionary_encoding_ratio` for string columns and `dictionary_encoding_ratio_for_non_string_column` for numeric/non-string columns to decide whether dictionary encoding is beneficial. Also, a large column byte_size (larger than or equal to UINT32_MAX) forces immediate speculation to avoid `BinaryColumn<uint32_t>` overflow.
- Introduced in: v3.2.0
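The trigger condition above can be sketched as follows. This is a simplified illustration, not the actual C++ writer code, and the function name is invented:

```python
# Simplified sketch of the speculation trigger described above.
# Assumption: names are illustrative, not the actual StarRocks symbols.
UINT32_MAX = 2**32 - 1

def should_speculate(buffered_rows: int, incoming_rows: int,
                     column_byte_size: int,
                     min_chunk_size: int = 10000) -> bool:
    # A very large byte size forces immediate speculation to avoid
    # BinaryColumn<uint32_t> offset overflow.
    if column_byte_size >= UINT32_MAX:
        return True
    # Otherwise, speculate once enough rows have accumulated.
    return buffered_rows + incoming_rows >= min_chunk_size

print(should_speculate(8000, 3000, 1024))   # enough rows accumulated
print(should_speculate(100, 100, 2**32))    # byte-size overflow guard
print(should_speculate(100, 100, 1024))     # keep buffering
```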
disable_storage_page_cache
- Default: false
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: A boolean value to control whether to disable PageCache.
  - When PageCache is enabled, StarRocks caches the recently scanned data.
  - PageCache can significantly improve the query performance when similar queries are repeated frequently.
  - `true` indicates disabling PageCache.
  - The default value of this item has been changed from `true` to `false` since StarRocks v2.4.
- Introduced in: -
enable_bitmap_index_memory_page_cache
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Whether to enable memory cache for Bitmap index. Memory cache is recommended if you want to use Bitmap indexes to accelerate point queries.
- Introduced in: v3.1
enable_compaction_flat_json
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Whether to enable compaction for Flat JSON data.
- Introduced in: v3.3.3
enable_json_flat
- Default: false
- Type: Boolean
- Unit:
- Is mutable: Yes
- Description: Whether to enable the Flat JSON feature. After this feature is enabled, newly loaded JSON data will be automatically flattened, improving JSON query performance.
- Introduced in: v3.3.0
enable_lazy_dynamic_flat_json
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Whether to enable Lazy Dynamic Flat JSON when a query misses the Flat JSON schema in the read process. When this item is set to `true`, StarRocks postpones the Flat JSON operation from the read process to the compute process.
- Introduced in: v3.3.3
enable_ordinal_index_memory_page_cache
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Whether to enable memory cache for ordinal index. Ordinal index is a mapping from row IDs to data page positions, and it can be used to accelerate scans.
- Introduced in: -
enable_string_prefix_zonemap
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Whether to enable ZoneMap for string (CHAR/VARCHAR) columns using prefix-based min/max values. For non-key string columns, the min/max values are truncated to a fixed prefix length configured by `string_prefix_zonemap_prefix_len`.
- Introduced in: -
enable_zonemap_index_memory_page_cache
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Whether to enable memory cache for zonemap index. Memory cache is recommended if you want to use zonemap indexes to accelerate scan.
- Introduced in: -
exchg_node_buffer_size_bytes
- Default: 10485760
- Type: Int
- Unit: Bytes
- Is mutable: Yes
- Description: The maximum buffer size on the receiver end of an exchange node for each query. This configuration item is a soft limit. A backpressure is triggered when data is sent to the receiver end with an excessive speed.
- Introduced in: -
exec_state_report_max_threads
- Default: 2
- Type: Int
- Unit: Threads
- Is mutable: Yes
- Description: Maximum number of threads for the exec-state-report thread pool. This pool is used by `ExecStateReporter` to asynchronously send non-priority execution status reports (such as fragment completion and error status) from BE to FE via RPC. The actual pool size at startup is `max(1, exec_state_report_max_threads)`. Changing this config at runtime triggers `update_max_threads` on the pool in every executor set (shared and exclusive). The pool has a fixed task queue size of 1000; report submissions are silently dropped when all threads are busy and the queue is full. Paired with `priority_exec_state_report_max_threads` for the high-priority pool. Increase this value when delayed or dropped exec-state reports are observed under high query concurrency.
- Introduced in: v4.1.0, v4.0.8, v3.5.15
file_descriptor_cache_capacity
- Default: 16384
- Type: Int
- Unit: -
- Is mutable: No
- Description: The number of file descriptors that can be cached.
- Introduced in: -
flamegraph_tool_dir
- Default: `${STARROCKS_HOME}/bin/flamegraph`
- Type: String
- Unit: -
- Is mutable: No
- Description: Directory of the flamegraph tool, which should contain pprof, stackcollapse-go.pl, and flamegraph.pl scripts for generating flame graphs from profile data.
- Introduced in: -
fragment_pool_queue_size
- Default: 2048
- Type: Int
- Unit: -
- Is mutable: No
- Description: The upper limit of the query number that can be processed on each BE node.
- Introduced in: -
fragment_pool_thread_num_max
- Default: 4096
- Type: Int
- Unit: -
- Is mutable: No
- Description: The maximum number of threads used for query.
- Introduced in: -
fragment_pool_thread_num_min
- Default: 64
- Type: Int
- Unit: -
- Is mutable: No
- Description: The minimum number of threads used for query.
- Introduced in: -
hdfs_client_enable_hedged_read
- Default: false
- Type: Boolean
- Unit: -
- Is mutable: No
- Description: Specifies whether to enable the hedged read feature.
- Introduced in: v3.0
hdfs_client_hedged_read_threadpool_size
- Default: 128
- Type: Int
- Unit: -
- Is mutable: No
- Description: Specifies the size of the hedged read thread pool on your HDFS client. The thread pool size limits the number of threads dedicated to running hedged reads in your HDFS client. It is equivalent to the `dfs.client.hedged.read.threadpool.size` parameter in the hdfs-site.xml file of your HDFS cluster.
- Introduced in: v3.0
hdfs_client_hedged_read_threshold_millis
- Default: 2500
- Type: Int
- Unit: Milliseconds
- Is mutable: No
- Description: Specifies the number of milliseconds to wait before starting a hedged read. For example, suppose you have set this parameter to `30`. In this situation, if a read from a block has not returned within 30 milliseconds, your HDFS client immediately starts a new read against a different block replica. It is equivalent to the `dfs.client.hedged.read.threshold.millis` parameter in the hdfs-site.xml file of your HDFS cluster.
- Introduced in: v3.0
io_coalesce_adaptive_lazy_active
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Based on the selectivity of predicates, adaptively determines whether to combine the I/O of predicate columns and non-predicate columns.
- Introduced in: v3.2
jit_lru_cache_size
- Default: 0
- Type: Int
- Unit: Bytes
- Is mutable: Yes
- Description: The LRU cache size for JIT compilation. If this item is set to a value greater than 0, it represents the actual size of the cache. If it is set to less than or equal to 0, the system adaptively sizes the cache using the formula `jit_lru_cache_size = min(mem_limit * 0.01, 1GB)`, where `mem_limit` of the node must be greater than or equal to 16 GB.
- Introduced in: -
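The adaptive sizing rule above works out as in this sketch (the function name is illustrative, and `mem_limit` stands in for the BE process memory limit in bytes):

```python
# Sketch of the adaptive JIT cache sizing rule described above.
# Assumption: simplified; the real logic lives in the BE, not here.
GB = 1024 ** 3

def effective_jit_cache_size(configured: int, mem_limit: int) -> int:
    if configured > 0:
        # An explicit positive value is used as-is.
        return configured
    # Adaptive: min(mem_limit * 0.01, 1 GB); per the rule above, this
    # applies only when the node has at least 16 GB of memory.
    return min(int(mem_limit * 0.01), GB)

print(effective_jit_cache_size(0, 64 * GB))    # 1% of 64 GB, under the cap
print(effective_jit_cache_size(0, 256 * GB))   # capped at 1 GB
```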
json_flat_column_max
- Default: 100
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum number of sub-fields that can be extracted by Flat JSON. This parameter takes effect only when `enable_json_flat` is set to `true`.
- Introduced in: v3.3.0
json_flat_create_zonemap
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Whether to create ZoneMaps for flattened JSON sub-columns during writes. This parameter takes effect only when `enable_json_flat` is set to `true`.
- Introduced in: -
json_flat_null_factor
- Default: 0.3
- Type: Double
- Unit: -
- Is mutable: Yes
- Description: The maximum proportion of NULL values in a column for Flat JSON extraction. A column is not extracted if its proportion of NULL values is higher than this threshold. This parameter takes effect only when `enable_json_flat` is set to `true`.
- Introduced in: v3.3.0
json_flat_sparsity_factor
- Default: 0.3
- Type: Double
- Unit: -
- Is mutable: Yes
- Description: The minimum proportion of columns with the same name for Flat JSON. Extraction is not performed if the proportion of columns with the same name is lower than this value. This parameter takes effect only when `enable_json_flat` is set to `true`.
- Introduced in: v3.3.0
lake_tablet_ignore_invalid_delete_predicate
- Default: false
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: A boolean value to control whether to ignore invalid delete predicates in tablet rowset metadata, which may be introduced by logical deletion on a Duplicate Key table after a column is renamed.
- Introduced in: v4.0
late_materialization_ratio
- Default: 10
- Type: Int
- Unit: -
- Is mutable: No
- Description: Integer ratio in the range [0, 1000] that controls the use of late materialization in the SegmentIterator (vector query engine). A value of `0` (or less) disables late materialization; `1000` (or greater) forces late materialization for all reads. Values between 0 and 1000 enable a conditional strategy in which both late and early materialization contexts are prepared, and the iterator selects the behavior based on predicate filter ratios (higher values favor late materialization). When a segment contains complex metric types, StarRocks uses `metric_late_materialization_ratio` instead. If `lake_io_opts.cache_file_only` is set, late materialization is disabled.
- Introduced in: v3.2.0
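The three regimes of this ratio can be summarized in a small sketch (the function and the returned labels are illustrative, not StarRocks identifiers):

```python
# Sketch of the strategy selection described above.
# Assumption: simplified; the real decision lives in the SegmentIterator.
def late_materialization_strategy(ratio: int) -> str:
    if ratio <= 0:
        return "disabled"       # never use late materialization
    if ratio >= 1000:
        return "forced"         # always use late materialization
    # In between, both contexts are prepared and the iterator decides
    # at runtime based on predicate filter ratios.
    return "conditional"

print(late_materialization_strategy(10))   # the default value
```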
max_hdfs_file_handle
- Default: 1000
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum number of HDFS file descriptors that can be opened.
- Introduced in: -
max_memory_sink_batch_count
- Default: 20
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum number of Scan Cache batches.
- Introduced in: -
max_pushdown_conditions_per_column
- Default: 1024
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum number of conditions that allow pushdown in each column. If the number of conditions exceeds this limit, the predicates are not pushed down to the storage layer.
- Introduced in: -
max_scan_key_num
- Default: 1024
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum number of scan keys segmented by each query.
- Introduced in: -
metric_late_materialization_ratio
- Default: 1000
- Type: Int
- Unit: -
- Is mutable: No
- Description: Controls when the late-materialization row access strategy is used for reads that include complex metric columns. Valid range: [0, 1000]. `0` disables late materialization; `1000` forces late materialization for all applicable reads. Values 1-999 enable a conditional strategy in which both late and early materialization contexts are prepared and chosen at runtime based on predicate selectivity. When complex metric types exist, `metric_late_materialization_ratio` overrides the general `late_materialization_ratio`. Note: the `cache_file_only` I/O mode disables late materialization regardless of this setting.
- Introduced in: v3.2.0
min_file_descriptor_number
- Default: 60000
- Type: Int
- Unit: -
- Is mutable: No
- Description: The minimum number of file descriptors in the BE process.
- Introduced in: -
object_storage_connect_timeout_ms
- Default: -1
- Type: Int
- Unit: Milliseconds
- Is mutable: No
- Description: Timeout duration to establish socket connections with object storage. `-1` indicates using the default timeout duration of the SDK configurations.
- Introduced in: v3.0.9
object_storage_request_timeout_ms
- Default: -1
- Type: Int
- Unit: Milliseconds
- Is mutable: No
- Description: Timeout duration to establish HTTP connections with object storage. `-1` indicates using the default timeout duration of the SDK configurations.
- Introduced in: v3.0.9
parquet_late_materialization_enable
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: No
- Description: A boolean value to control whether to enable late materialization in the Parquet reader to improve performance. `true` indicates enabling late materialization, and `false` indicates disabling it.
- Introduced in: -
parquet_page_index_enable
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: No
- Description: A boolean value to control whether to enable the page index of Parquet files to improve performance. `true` indicates enabling the page index, and `false` indicates disabling it.
- Introduced in: v3.3
parquet_reader_bloom_filter_enable
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: A boolean value to control whether to enable the Bloom filter of Parquet files to improve performance. `true` indicates enabling the Bloom filter, and `false` indicates disabling it. You can also control this behavior at the session level using the system variable `enable_parquet_reader_bloom_filter`. Bloom filters in Parquet are maintained at the column level within each row group. If a Parquet file contains Bloom filters for certain columns, queries can use predicates on those columns to efficiently skip row groups.
- Introduced in: v3.5
path_gc_check_step
- Default: 1000
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum number of files that can be scanned continuously each time.
- Introduced in: -
path_gc_check_step_interval_ms
- Default: 10
- Type: Int
- Unit: Milliseconds
- Is mutable: Yes
- Description: The time interval between file scans.
- Introduced in: -
path_scan_interval_second
- Default: 86400
- Type: Int
- Unit: Seconds
- Is mutable: Yes
- Description: The time interval at which GC cleans expired data.
- Introduced in: -
pipeline_connector_scan_thread_num_per_cpu
- Default: 8
- Type: Double
- Unit: -
- Is mutable: Yes
- Description: The number of scan threads assigned to Pipeline Connector per CPU core in the BE node. This configuration is changed to dynamic from v3.1.7 onwards.
- Introduced in: -
pipeline_poller_timeout_guard_ms
- Default: -1
- Type: Int
- Unit: Milliseconds
- Is mutable: Yes
- Description: When this item is set to a value greater than `0`, if a driver takes longer than `pipeline_poller_timeout_guard_ms` for a single dispatch in the poller, the information of the driver and operator is printed.
- Introduced in: -
pipeline_prepare_thread_pool_queue_size
- Default: 102400
- Type: Int
- Unit: -
- Is mutable: No
- Description: The maximum queue length of the PREPARE fragment thread pool for the Pipeline execution engine.
- Introduced in: -
pipeline_prepare_thread_pool_thread_num
- Default: 0
- Type: Int
- Unit: -
- Is mutable: No
- Description: Number of threads in the PREPARE fragment thread pool of the Pipeline execution engine. `0` indicates that the value equals the number of vCPU cores in the system.
- Introduced in: -
pipeline_prepare_timeout_guard_ms
- Default: -1
- Type: Int
- Unit: Milliseconds
- Is mutable: Yes
- Description: When this item is set to a value greater than `0`, if a plan fragment exceeds `pipeline_prepare_timeout_guard_ms` during the PREPARE process, a stack trace of the plan fragment is printed.
- Introduced in: -
pipeline_scan_thread_pool_queue_size
- Default: 102400
- Type: Int
- Unit: -
- Is mutable: No
- Description: The maximum task queue length of SCAN thread pool for Pipeline execution engine.
- Introduced in: -
pk_index_parallel_get_threadpool_size
- Default: 1048576
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: Sets the maximum queue size (number of pending tasks) for the "cloud_native_pk_index_get" thread pool used by PK index parallel get operations in shared-data (cloud-native/lake) mode. The actual thread count for that pool is controlled by `pk_index_parallel_get_threadpool_max_threads`; this setting only limits how many tasks may be queued awaiting execution. The very large default (2^20) effectively makes the queue unbounded; lowering it prevents excessive memory growth from queued tasks but may cause task submissions to block or fail when the queue is full. Tune together with `pk_index_parallel_get_threadpool_max_threads` based on workload concurrency and memory constraints.
- Introduced in: -
priority_exec_state_report_max_threads
- Default: 2
- Type: Int
- Unit: Threads
- Is mutable: Yes
- Description: Maximum number of threads for the high-priority exec-state-report thread pool. This pool is used by `ExecStateReporter` to asynchronously send high-priority execution status reports (such as urgent fragment failures) from BE to FE via RPC. Unlike the normal exec-state-report pool, this pool has an unbounded task queue. The actual pool size at startup is `max(1, priority_exec_state_report_max_threads)`. Changing this config at runtime triggers `update_max_threads` on the priority pool in every executor set (shared and exclusive). Paired with `exec_state_report_max_threads` for the normal pool. Increase this value when high-priority reports are delayed under heavy concurrent query loads.
- Introduced in: v4.1.0, v4.0.8, v3.5.15
priority_queue_remaining_tasks_increased_frequency
- Default: 512
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: Controls how often the BlockingPriorityQueue increases ("ages") the priority of all remaining tasks to avoid starvation. Each successful get/pop increments an internal `_upgrade_counter`; when `_upgrade_counter` exceeds `priority_queue_remaining_tasks_increased_frequency`, the queue increments every element's priority, rebuilds the heap, and resets the counter. Lower values cause more frequent priority aging (reducing starvation but increasing CPU cost due to iterating and re-heapifying); higher values reduce that overhead but delay priority adjustments. The value is a simple operation-count threshold, not a time duration.
- Introduced in: v3.2.0
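A toy model of this aging scheme may make the mechanics clearer. It is heavily simplified from the C++ BlockingPriorityQueue, and all names below are illustrative:

```python
# Toy model of the priority-aging behavior described above.
# Assumption: simplified; the real queue is a C++ BlockingPriorityQueue.
import heapq

class AgingPriorityQueue:
    def __init__(self, aging_threshold: int = 512):
        self._heap = []            # entries: (negated priority, seq, task)
        self._seq = 0
        self._upgrade_counter = 0
        self._aging_threshold = aging_threshold

    def put(self, priority: int, task):
        heapq.heappush(self._heap, (-priority, self._seq, task))
        self._seq += 1

    def get(self):
        neg_prio, _, task = heapq.heappop(self._heap)
        self._upgrade_counter += 1
        if self._upgrade_counter > self._aging_threshold:
            # Age every remaining task: bump its priority and rebuild the heap.
            self._heap = [(p - 1, s, t) for p, s, t in self._heap]
            heapq.heapify(self._heap)
            self._upgrade_counter = 0
        return task

q = AgingPriorityQueue(aging_threshold=2)
for prio, name in [(1, "low"), (5, "high"), (3, "mid")]:
    q.put(prio, name)
print([q.get() for _ in range(3)])
```

With a low threshold the aging pass runs frequently, which is exactly the starvation/CPU trade-off the description calls out.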
query_cache_capacity
- Default: 536870912
- Type: Int
- Unit: Bytes
- Is mutable: No
- Description: The size of the query cache in the BE. The default size is 512 MB. The size cannot be less than 4 MB. If the memory capacity of the BE is insufficient to provision your expected query cache size, you can increase the memory capacity of the BE.
- Introduced in: -
query_pool_spill_mem_limit_threshold
- Default: 1.0
- Type: Double
- Unit: -
- Is mutable: No
- Description: If automatic spilling is enabled, intermediate result spilling is triggered when the memory usage of all queries exceeds `query_pool memory limit * query_pool_spill_mem_limit_threshold`.
- Introduced in: v3.2.7
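The arithmetic behind the trigger is straightforward; in this sketch, `query_pool_limit` stands in for the query_pool memory limit in bytes, and the function name is invented:

```python
# Sketch of the spill trigger condition described above.
# Assumption: illustrative only; the check happens inside the BE.
def spill_triggered(total_query_mem: int, query_pool_limit: int,
                    threshold: float = 1.0) -> bool:
    return total_query_mem > query_pool_limit * threshold

GB = 1024 ** 3
print(spill_triggered(90 * GB, 100 * GB, threshold=0.8))  # 90 GB > 80 GB
```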
query_scratch_dirs
- Default: `${STARROCKS_HOME}`
- Type: String
- Unit: -
- Is mutable: No
- Description: Semicolon-separated list of writable scratch directories used by query execution to spill intermediate data (for example, external sorts, hash joins, and other operators). Specify one or more paths separated by `;` (e.g. `/mnt/ssd1/tmp;/mnt/ssd2/tmp`). Directories should be accessible and writable by the BE process and have sufficient free space; StarRocks picks among them to distribute spill I/O. Changes require a restart to take effect. If a directory is missing, not writable, or full, spilling may fail or query performance may degrade.
- Introduced in: v3.2.0
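As a sketch of the separator semantics only (StarRocks parses this value internally in C++; the helper below is illustrative):

```python
# Illustrative parsing of a semicolon-separated scratch-dir list.
def parse_scratch_dirs(value: str) -> list[str]:
    # Split on ';', trim whitespace, and drop empty entries.
    return [d.strip() for d in value.split(";") if d.strip()]

print(parse_scratch_dirs("/mnt/ssd1/tmp;/mnt/ssd2/tmp"))
```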
result_buffer_cancelled_interval_time
- Default: 300
- Type: Int
- Unit: Seconds
- Is mutable: Yes
- Description: The wait time before BufferControlBlock releases data.
- Introduced in: -
scan_context_gc_interval_min
- Default: 5
- Type: Int
- Unit: Minutes
- Is mutable: Yes
- Description: The time interval at which to clean the Scan Context.
- Introduced in: -
scanner_row_num
- Default: 16384
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum row count returned by each scan thread in a scan.
- Introduced in: -
scanner_thread_pool_queue_size
- Default: 102400
- Type: Int
- Unit: -
- Is mutable: No
- Description: The number of scan tasks supported by the storage engine.
- Introduced in: -
scanner_thread_pool_thread_num
- Default: 48
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The number of threads that the storage engine uses for concurrent storage volume scanning. All threads are managed in the thread pool.
- Introduced in: -
string_prefix_zonemap_prefix_len
- Default: 16
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: Prefix length used for string ZoneMap min/max values when `enable_string_prefix_zonemap` is enabled.
- Introduced in: -
udf_thread_pool_size
- Default: 1
- Type: Int
- Unit: Threads
- Is mutable: No
- Description: Sets the size of the UDF call PriorityThreadPool created in ExecEnv (used for executing user-defined functions / UDF-related tasks). The value is used as the pool thread count and also as the pool queue capacity when constructing the thread pool (PriorityThreadPool("udf", thread_num, queue_size)). Increase to allow more concurrent UDF executions; keep small to avoid excessive CPU and memory contention.
- Introduced in: v3.2.0
update_memory_limit_percent
- Default: 60
- Type: Int
- Unit: Percent
- Is mutable: No
- Description: Fraction of the BE process memory reserved for update-related memory and caches. During startup, `GlobalEnv` computes the `MemTracker` for updates as `process_mem_limit * clamp(update_memory_limit_percent, 0, 100) / 100`. `UpdateManager` also uses this percentage to size its primary-index/index-cache capacity (index cache capacity = `GlobalEnv::process_mem_limit * update_memory_limit_percent / 100`). The HTTP config update logic registers a callback that calls `update_primary_index_memory_limit` on the update managers, so changes would be applied to the update subsystem if the config were changed. Increasing this value gives more memory to update/primary-index paths (reducing memory available for other pools); decreasing it reduces update memory and cache capacity. Values are clamped to the range 0-100.
- Introduced in: v3.2.0
vector_chunk_size
- Default: 4096
- Type: Int
- Unit: Rows
- Is mutable: No
- Description: The number of rows per vectorized chunk (batch) used throughout the execution and storage code paths. This value controls Chunk and RuntimeState batch_size creation, affects operator throughput, memory footprint per operator, spill and sort buffer sizing, and I/O heuristics (for example, ORC writer natural write size). Increasing it can improve CPU and I/O efficiency for wide/CPU-bound workloads but raises peak memory usage and can increase latency for small-result queries. Tune only when profiling shows batch-size is a bottleneck; otherwise keep the default for balanced memory and performance.
- Introduced in: v3.2.0
Loading and unloading
clear_transaction_task_worker_count
- Default: 1
- Type: Int
- Unit: -
- Is mutable: No
- Description: The number of threads used for clearing transactions.
- Introduced in: -
column_mode_partial_update_insert_batch_size
- Default: 4096
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: Batch size for column-mode partial updates when processing inserted rows. If this item is set to `0` or a negative value, it is clamped to `1` to avoid an infinite loop. This item controls the number of newly inserted rows processed in each batch. Larger values can improve write performance but consume more memory.
- Introduced in: v3.5.10, v4.0.2
enable_load_spill_parallel_merge
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Specifies whether to enable parallel spill merge within a single tablet. Enabling this can improve the performance of spill merge during data loading.
- Introduced in: -
enable_parallel_memtable_finalize
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Specifies whether to enable parallel memtable finalize when loading data to lake tables (shared-data mode). When enabled, the memtable finalize operation (sort/aggregate) is moved from the write thread to the flush thread, allowing the write thread to continue inserting data into a new memtable while the previous one is being finalized and flushed in parallel. This can significantly improve load throughput by overlapping CPU-intensive finalize operations with I/O-bound flush operations. Note that this optimization is automatically disabled when auto-increment columns need to be filled, as auto-increment ID assignment must happen before the memtable is submitted for flush.
- Introduced in: -
allow_list_object_for_random_bucketing_on_cache_miss
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Controls whether to allow object-storage LIST fallback when the lake metadata cache misses during random bucketing size checks. `true` means falling back to LIST metadata files to compute the base size (historical behavior, more accurate size estimation). `false` means skipping LIST and using `base_size = 0`, which reduces LIST object requests but may delay immutable marking due to less accurate size estimation.
- Introduced in: v4.1.0, v4.0.7, v3.5.15
enable_stream_load_verbose_log
- Default: false
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Specifies whether to log the HTTP requests and responses for Stream Load jobs.
- Introduced in: v2.5.17, v3.0.9, v3.1.6, v3.2.1
flush_thread_num_per_store
- Default: 2
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: Number of threads that are used for flushing MemTable in each store.
- Introduced in: -
lake_flush_thread_num_per_store
- Default: 0
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: Number of threads that are used for flushing MemTable in each store in a shared-data cluster.
When this value is set to `0`, the system uses twice the CPU core count as the value. When this value is set to a value less than `0`, the system uses the product of its absolute value and the CPU core count as the value.
- Introduced in: v3.1.12, v3.2.7
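The three cases of this rule can be sketched as follows (`cpu_cores` is supplied by the caller here; the BE reads it from the system, and the function name is invented):

```python
# Sketch of the lake flush thread-count rule described above.
def lake_flush_threads(configured: int, cpu_cores: int) -> int:
    if configured == 0:
        return 2 * cpu_cores            # 0: twice the CPU core count
    if configured < 0:
        return abs(configured) * cpu_cores  # negative: |value| * cores
    return configured                    # positive: used as-is

print(lake_flush_threads(0, 8), lake_flush_threads(-4, 8), lake_flush_threads(3, 8))
```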
load_data_reserve_hours
- Default: 4
- Type: Int
- Unit: Hours
- Is mutable: No
- Description: The reservation time for the files produced by small-scale loadings.
- Introduced in: -
load_error_log_reserve_hours
- Default: 48
- Type: Int
- Unit: Hours
- Is mutable: Yes
- Description: The time for which data loading logs are reserved.
- Introduced in: -
load_process_max_memory_limit_bytes
- Default: 107374182400
- Type: Int
- Unit: Bytes
- Is mutable: No
- Description: The maximum size limit of memory resources that can be taken up by all load processes on a BE node.
- Introduced in: -
load_spill_memory_usage_per_merge
- Default: 1073741824
- Type: Int
- Unit: Bytes
- Is mutable: Yes
- Description: The maximum memory usage per merge operation during spill merge. Default is 1 GB (1073741824 bytes). This parameter controls the memory consumption of individual merge tasks during data loading spill merge to prevent excessive memory usage.
- Introduced in: -
max_consumer_num_per_group
- Default: 3
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum number of consumers in a consumer group of Routine Load.
- Introduced in: -
max_runnings_transactions_per_txn_map
- Default: 100
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum number of transactions that can run concurrently in each partition.
- Introduced in: -
number_tablet_writer_threads
- Default: 0
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The number of tablet writer threads used in ingestion, such as Stream Load, Broker Load and Insert. When the parameter is set to less than or equal to 0, the system uses half of the number of CPU cores, with a minimum of 16. When the parameter is set to greater than 0, the system uses that value. This configuration is changed to dynamic from v3.1.7 onwards.
- Introduced in: -
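The sizing rule for `number_tablet_writer_threads` can be sketched as follows (`cpu_cores` is supplied by the caller here; the function name is invented):

```python
# Sketch of the tablet-writer thread-count rule described above.
def tablet_writer_threads(configured: int, cpu_cores: int) -> int:
    if configured <= 0:
        # <= 0: half the CPU core count, with a minimum of 16.
        return max(cpu_cores // 2, 16)
    return configured  # positive: used as-is

print(tablet_writer_threads(0, 64), tablet_writer_threads(0, 8), tablet_writer_threads(24, 64))
```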
push_worker_count_high_priority
- Default: 3
- Type: Int
- Unit: -
- Is mutable: No
- Description: The number of threads used to handle a load task with HIGH priority.
- Introduced in: -
push_worker_count_normal_priority
- Default: 3
- Type: Int
- Unit: -
- Is mutable: No
- Description: The number of threads used to handle a load task with NORMAL priority.
- Introduced in: -
streaming_load_max_batch_size_mb
- Default: 100
- Type: Int
- Unit: MB
- Is mutable: Yes
- Description: The maximum size of a JSON file that can be streamed into StarRocks.
- Introduced in: -
streaming_load_max_mb
- Default: 102400
- Type: Int
- Unit: MB
- Is mutable: Yes
- Description: The maximum size of a file that can be streamed into StarRocks. From v3.0, the default value has been changed from `10240` to `102400`.
- Introduced in: -
streaming_load_rpc_max_alive_time_sec
- Default: 1200
- Type: Int
- Unit: Seconds
- Is mutable: No
- Description: The RPC timeout for Stream Load.
- Introduced in: -
transaction_publish_version_thread_pool_idle_time_ms
- Default: 60000
- Type: Int
- Unit: Milliseconds
- Is mutable: No
- Description: The idle time before a thread is reclaimed by the Publish Version thread pool.
- Introduced in: -
transaction_publish_version_worker_count
- Default: 0
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: The maximum number of threads used to publish a version. When this value is set to less than or equal to `0`, the system uses the CPU core count as the value, so as to avoid insufficient thread resources when import concurrency is high but only a fixed number of threads are used. From v2.5, the default value has been changed from `8` to `0`.
- Introduced in: -
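The CPU-count fallback described above amounts to the following one-liner; the function name is illustrative:

```python
def effective_publish_workers(configured: int, cpu_cores: int) -> int:
    """Illustrative sketch: <= 0 falls back to the CPU core count."""
    return cpu_cores if configured <= 0 else configured

print(effective_publish_workers(0, 16))  # default 0 -> 16 workers
print(effective_publish_workers(8, 16))  # explicit value wins: 8
```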
write_buffer_size
- Default: 104857600
- Type: Int
- Unit: Bytes
- Is mutable: Yes
- Description: The buffer size of a MemTable in memory. This configuration item is the threshold that triggers a flush.
- Introduced in: -
broker_write_timeout_seconds
- Default: 30
- Type: Int
- Unit: Seconds
- Is mutable: No
- Description: Timeout (in seconds) used by backend broker operations for write/IO RPCs. The value is multiplied by 1000 to produce millisecond timeouts and is passed as the default timeout_ms to BrokerFileSystem and BrokerServiceConnection instances (e.g., file export and snapshot upload/download). Increase this when brokers or network are slow or when transferring large files to avoid premature timeouts; decreasing it may cause broker RPCs to fail earlier. This value is defined in common/config and is applied at process start (not dynamically reloadable).
- Introduced in: v3.2.0
enable_load_channel_rpc_async
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: When enabled, handling of load-channel open RPCs (for example, `PTabletWriterOpen`) is offloaded from the BRPC worker to a dedicated thread pool: the request handler creates a `ChannelOpenTask` and submits it to the internal `_async_rpc_pool` instead of running `LoadChannelMgr::_open` inline. This reduces work and blocking inside BRPC threads and allows tuning concurrency via `load_channel_rpc_thread_pool_num` and `load_channel_rpc_thread_pool_queue_size`. If the thread pool submission fails (when the pool is full or shut down), the request is canceled and an error status is returned. The pool is shut down on `LoadChannelMgr::close()`, so consider capacity and lifecycle when enabling this feature to avoid request rejections or delayed processing.
- Introduced in: v3.5.0
enable_load_diagnose
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: When enabled, StarRocks will attempt an automated load diagnosis from BE OlapTableSink/NodeChannel after a brpc timeout matching "[E1008]Reached timeout". The code creates a `PLoadDiagnoseRequest` and sends an RPC to the remote LoadChannel to collect a profile and/or stack trace (controlled by `load_diagnose_rpc_timeout_profile_threshold_ms` and `load_diagnose_rpc_timeout_stack_trace_threshold_ms`). The diagnose RPC uses `load_diagnose_send_rpc_timeout_ms` as its timeout. Diagnosis is skipped if a diagnose request is already in progress. Enabling this produces additional RPCs and profiling work on target nodes; disable it on sensitive production workloads to avoid extra overhead.
- Introduced in: v3.5.0
enable_load_segment_parallel
- Default: false
- Type: Boolean
- Unit: -
- Is mutable: No
- Description: When enabled, rowset segment loading and rowset-level reads are performed concurrently using StarRocks background thread pools (`ExecEnv::load_segment_thread_pool` and `ExecEnv::load_rowset_thread_pool`). `Rowset::load_segments` and `TabletReader::get_segment_iterators` submit per-segment or per-rowset tasks to these pools, falling back to serial loading and logging a warning if submission fails. Enable this to reduce read/load latency for large rowsets at the cost of increased CPU/IO concurrency and memory pressure. Note: parallel loading can change the load completion order of segments and therefore prevents partial compaction (the code checks `_parallel_load` and disables partial compaction when it is enabled); consider the implications for operations that rely on segment order.
- Introduced in: v3.3.0, v3.4.0, v3.5.0
enable_streaming_load_thread_pool
- Default: true
- Type: Boolean
- Unit: -
- Is mutable: Yes
- Description: Controls whether streaming load scanners are submitted to the dedicated streaming load thread pool. When enabled and a query is a LOAD with `TLoadJobType::STREAM_LOAD`, ConnectorScanNode submits scanner tasks to the `streaming_load_thread_pool` (which is configured with INT32_MAX threads and queue sizes, i.e., effectively unbounded). When disabled, scanners use the general `thread_pool` and its `PriorityThreadPool` submission logic (priority computation, try_offer/offer behavior). Enabling this isolates streaming-load work from regular query execution to reduce interference; however, because the dedicated pool is effectively unbounded, enabling it may increase concurrent threads and resource usage under heavy streaming-load traffic. This option is on by default and typically does not require modification.
- Introduced in: v3.2.0
es_http_timeout_ms
- Default: 5000
- Type: Int
- Unit: Milliseconds
- Is mutable: No
- Description: HTTP connection timeout (in milliseconds) used by the ES network client in ESScanReader for Elasticsearch scroll requests. This value is applied via `network_client.set_timeout_ms()` before sending subsequent scroll POSTs and controls how long the client waits for an ES response during scrolling. Increase this value for slow networks or large queries to avoid premature timeouts; decrease it to fail faster on unresponsive ES nodes. This setting complements `es_scroll_keepalive`, which controls the scroll context keep-alive duration.
- Introduced in: v3.2.0
es_index_max_result_window
- Default: 10000
- Type: Int
- Unit: -
- Is mutable: No
- Description: Limits the maximum number of documents StarRocks will request from Elasticsearch in a single batch. StarRocks sets the ES request batch size to min(`es_index_max_result_window`, `chunk_size`) when building `KEY_BATCH_SIZE` for the ES reader. If an ES request exceeds the Elasticsearch index setting `index.max_result_window`, Elasticsearch returns HTTP 400 (Bad Request). Adjust this value when scanning large indexes, or increase `index.max_result_window` on the Elasticsearch side to permit larger single requests.
- Introduced in: v3.2.0
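The batch-size derivation above is a simple minimum of the two limits; this sketch uses illustrative names:

```python
def es_batch_size(max_result_window: int, chunk_size: int) -> int:
    """Illustrative sketch: KEY_BATCH_SIZE is the smaller of the two limits."""
    return min(max_result_window, chunk_size)

print(es_batch_size(10000, 4096))  # chunk_size is the binding limit: 4096
print(es_batch_size(2000, 4096))   # the window is the binding limit: 2000
```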
ignore_load_tablet_failure
- Default: false
- Type: Boolean
- Unit: -
- Is mutable: No
- Description: When this item is set to `false`, the system treats any tablet header load failure (non-NotFound and non-AlreadyExist errors) as fatal: the code logs the error and calls LOG(FATAL) to stop the BE process. When it is set to `true`, the BE continues startup despite such per-tablet load errors; failed tablet IDs are recorded and skipped while successful tablets are still loaded. Note that this parameter does NOT suppress fatal errors from the RocksDB meta scan itself, which always cause the process to quit.
- Introduced in: v3.2.0
load_channel_abort_clean_up_delay_seconds
- Default: 600
- Type: Int
- Unit: Seconds
- Is mutable: Yes
- Description: Controls how long (in seconds) the system keeps the load IDs of aborted load channels before removing them from `_aborted_load_channels`. When a load job is cancelled or fails, the load ID stays recorded so any late-arriving load RPCs can be rejected immediately; once the delay expires, the entry is cleaned during the periodic background sweep (the minimum sweep interval is 60 seconds). Setting the delay too low risks accepting stray RPCs after an abort, while setting it too high may retain state and consume resources longer than necessary. Tune this to balance correctness of late-request rejection against resource retention for aborted loads.
- Introduced in: v3.5.11, v4.0.4
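A simplified model of the sweep described above; the function name and the timestamp bookkeeping are illustrative, not the actual data structure:

```python
def sweep_aborted_load_ids(aborted: dict, delay_s: int, now: float) -> dict:
    """Illustrative sketch: keep only load IDs aborted less than delay_s ago."""
    return {load_id: t for load_id, t in aborted.items() if now - t < delay_s}

# "load-a" was aborted 650 s ago and is swept; "load-b" (100 s ago) is kept.
aborted = {"load-a": 100.0, "load-b": 650.0}
print(sweep_aborted_load_ids(aborted, 600, now=750.0))  # {'load-b': 650.0}
```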
load_channel_rpc_thread_pool_num
- Default: -1
- Type: Int
- Unit: Threads
- Is mutable: Yes
- Description: Maximum number of threads for the load-channel async RPC thread pool. When set to less than or equal to 0 (the default is `-1`), the pool size is auto-set to the number of CPU cores (`CpuInfo::num_cores()`). The configured value is used as ThreadPoolBuilder's max threads, and the pool's min threads is set to min(5, max_threads). The pool queue size is controlled separately by `load_channel_rpc_thread_pool_queue_size`. This setting was introduced to align the async RPC pool size with the brpc workers' default (`brpc_num_threads`) so behavior remains compatible after switching load RPC handling from synchronous to asynchronous. Changing this config at runtime triggers `ExecEnv::GetInstance()->load_channel_mgr()->async_rpc_pool()->update_max_threads(...)`.
- Introduced in: v3.5.0
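The sizing rule above can be sketched as follows; names are illustrative:

```python
def rpc_pool_sizes(configured: int, cpu_cores: int) -> tuple:
    """Illustrative sketch of the min/max thread derivation."""
    max_threads = cpu_cores if configured <= 0 else configured
    min_threads = min(5, max_threads)
    return min_threads, max_threads

print(rpc_pool_sizes(-1, 16))  # default -1 -> (5, 16)
print(rpc_pool_sizes(3, 16))   # small explicit value -> (3, 3)
```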
load_channel_rpc_thread_pool_queue_size
- Default: 1024000
- Type: Int
- Unit: Count
- Is mutable: No
- Description: Sets the maximum pending-task queue size for the load channel RPC thread pool created by LoadChannelMgr. This thread pool executes asynchronous `open` requests when `enable_load_channel_rpc_async` is enabled; the pool size is paired with `load_channel_rpc_thread_pool_num`. The large default (1024000) aligns with brpc workers' defaults to preserve behavior after switching from synchronous to asynchronous handling. If the queue is full, `ThreadPool::submit()` will fail and the incoming open RPC is cancelled with an error, causing the caller to receive a rejection. Increase this value to buffer larger bursts of concurrent `open` requests; reducing it tightens backpressure but may cause more rejections under load.
- Introduced in: v3.5.0
load_diagnose_rpc_timeout_profile_threshold_ms
- Default: 60000
- Type: Int
- Unit: Milliseconds
- Is mutable: Yes
- Description: When a load RPC times out (the error contains "[E1008]Reached timeout") and `enable_load_diagnose` is true, this threshold controls whether a full profiling diagnose is requested. If the request-level RPC timeout `_rpc_timeout_ms` is greater than `load_diagnose_rpc_timeout_profile_threshold_ms`, profiling is enabled for that diagnose. For smaller `_rpc_timeout_ms` values, profiling is sampled once every 20 timeouts to avoid frequent heavy diagnostics for real-time/short-timeout loads. This value affects the `profile` flag in the `PLoadDiagnoseRequest` sent; stack-trace behavior is controlled separately by `load_diagnose_rpc_timeout_stack_trace_threshold_ms`, and the send timeout by `load_diagnose_send_rpc_timeout_ms`.
- Introduced in: v3.5.0
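The decision logic above might look like this sketch; modeling the once-in-20 sampling with a running timeout counter is an assumption about the exact mechanism:

```python
def should_profile(rpc_timeout_ms: int, threshold_ms: int, timeout_count: int) -> bool:
    """Illustrative sketch: always profile long-timeout loads, sample the rest."""
    if rpc_timeout_ms > threshold_ms:
        return True
    # Hypothetical counter-based 1-in-20 sampling for short-timeout loads
    return timeout_count % 20 == 0

print(should_profile(120000, 60000, 1))   # long timeout -> True
print(should_profile(10000, 60000, 1))    # short timeout, not sampled -> False
print(should_profile(10000, 60000, 20))   # short timeout, 20th event -> True
```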
load_diagnose_rpc_timeout_stack_trace_threshold_ms
- Default: 600000
- Type: Int
- Unit: Milliseconds
- Is mutable: Yes
- Description: Threshold (in ms) used to decide when to request remote stack traces for long-running load RPCs. When a load RPC times out with a timeout error and the effective RPC timeout (`_rpc_timeout_ms`) exceeds this value, `OlapTableSink/NodeChannel` will include `stack_trace=true` in a `load_diagnose` RPC to the target BE so the BE can return stack traces for debugging. `LocalTabletsChannel::SecondaryReplicasWaiter` also triggers a best-effort stack-trace diagnose from the primary if waiting for secondary replicas exceeds this interval. This behavior requires `enable_load_diagnose` and uses `load_diagnose_send_rpc_timeout_ms` for the diagnose RPC timeout; profiling is gated separately by `load_diagnose_rpc_timeout_profile_threshold_ms`. Lowering this value increases how aggressively stack traces are requested.
- Introduced in: v3.5.0
load_diagnose_send_rpc_timeout_ms
- Default: 2000
- Type: Int
- Unit: Milliseconds
- Is mutable: Yes
- Description: Timeout (in milliseconds) applied to diagnosis-related brpc calls initiated by BE load paths. It is used to set the controller timeout for `load_diagnose` RPCs (sent by NodeChannel/OlapTableSink when a LoadChannel brpc call times out) and for replica-status queries (used by SecondaryReplicasWaiter/LocalTabletsChannel when checking primary replica state). Choose a value high enough to allow the remote side to respond with profile or stack-trace data, but not so high that failure handling is delayed. This parameter works together with `enable_load_diagnose`, `load_diagnose_rpc_timeout_profile_threshold_ms`, and `load_diagnose_rpc_timeout_stack_trace_threshold_ms`, which control when and what diagnostic information is requested.
- Introduced in: v3.5.0
load_fp_brpc_timeout_ms
- Default: -1
- Type: Int
- Unit: Milliseconds
- Is mutable: Yes
- Description: Overrides the per-channel brpc RPC timeout used by OlapTableSink when the `node_channel_set_brpc_timeout` fail point is triggered. If set to a positive value, NodeChannel will set its internal `_rpc_timeout_ms` to this value (in milliseconds), causing open/add-chunk/cancel RPCs to use the shorter timeout and enabling simulation of brpc timeouts that produce the "[E1008]Reached timeout" error. The default (`-1`) disables the override. Changing this value is intended for testing and fault injection; small values may produce false timeouts and trigger load diagnostics (see `enable_load_diagnose`, `load_diagnose_rpc_timeout_profile_threshold_ms`, `load_diagnose_rpc_timeout_stack_trace_threshold_ms`, and `load_diagnose_send_rpc_timeout_ms`).
- Introduced in: v3.5.0
load_fp_tablets_channel_add_chunk_block_ms
- Default: -1
- Type: Int
- Unit: Milliseconds
- Is mutable: Yes
- Description: When enabled (set to a positive value in milliseconds), this fail-point configuration makes TabletsChannel::add_chunk sleep for the specified time during load processing. It is used to simulate BRPC timeout errors (e.g., "[E1008]Reached timeout") and to emulate an expensive add_chunk operation that increases load latency. A value less than or equal to 0 (the default is `-1`) disables the injection. This is intended for testing fault handling, timeouts, and replica synchronization behavior; do not enable it in normal production workloads, as it delays write completion and can trigger upstream timeouts or replica aborts.
- Introduced in: v3.5.0
load_segment_thread_pool_num_max
- Default: 128
- Type: Int
- Unit: -
- Is mutable: No
- Description: Sets the maximum number of worker threads for BE load-related thread pools. This value is used by ThreadPoolBuilder to limit threads for both `load_rowset_pool` and `load_segment_pool` in exec_env.cpp, controlling concurrency for processing loaded rowsets and segments (e.g., decoding, indexing, writing) during streaming and batch loads. Increasing this value raises parallelism and can improve load throughput but also increases CPU and memory usage and potential contention; decreasing it limits concurrent load processing and may reduce throughput. Tune it together with `load_segment_thread_pool_queue_size` and `streaming_load_thread_pool_idle_time_ms`. Changes require a BE restart.
- Introduced in: v3.3.0, v3.4.0, v3.5.0
load_segment_thread_pool_queue_size
- Default: 10240
- Type: Int
- Unit: Tasks
- Is mutable: No
- Description: Sets the maximum queue length (number of pending tasks) for the load-related thread pools created as "load_rowset_pool" and "load_segment_pool". These pools use `load_segment_thread_pool_num_max` for their max thread count, and this configuration controls how many load segment/rowset tasks can be buffered before the ThreadPool's overflow policy takes effect (further submissions may be rejected or blocked depending on the ThreadPool implementation). Increase it to allow more pending load work (uses more memory and can raise latency); decrease it to limit buffered load concurrency and reduce memory usage.
- Introduced in: v3.3.0, v3.4.0, v3.5.0
max_pulsar_consumer_num_per_group
- Default: 10
- Type: Int
- Unit: -
- Is mutable: Yes
- Description: Controls the maximum number of Pulsar consumers that may be created in a single data consumer group for routine load on a BE. Because cumulative acknowledge is not supported for multi-topic subscriptions, each consumer subscribes to exactly one topic/partition; if the number of partitions in `pulsar_info->partitions` exceeds this value, group creation fails with an error advising to increase `max_pulsar_consumer_num_per_group` on the BE or add more BEs. This limit is enforced when constructing a PulsarDataConsumerGroup and prevents a BE from hosting more than this many consumers for one routine load group. For Kafka routine load, `max_consumer_num_per_group` is used instead.
- Introduced in: v3.2.0
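The partition-count check above can be sketched as follows; the function name and error wording are illustrative:

```python
def check_pulsar_group(partitions: list, limit: int) -> None:
    """Illustrative sketch: one consumer per partition, capped at the limit."""
    if len(partitions) > limit:
        raise ValueError(
            f"{len(partitions)} partitions exceed "
            f"max_pulsar_consumer_num_per_group={limit}; "
            "increase the limit on the BE or add more BEs")

check_pulsar_group([f"p{i}" for i in range(10)], 10)  # exactly at the limit: OK
```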
pull_load_task_dir
- Default: `${STARROCKS_HOME}/var/pull_load`
- Type: String
- Unit: -
- Is mutable: No
- Description: Filesystem path where the BE stores data and working files for "pull load" tasks (downloaded source files, task state, temporary output, etc.). The directory must be writable by the BE process and have sufficient disk space for incoming loads. The default is relative to `STARROCKS_HOME`.
- Introduced in: v3.2.0
routine_load_kafka_timeout_second
- Default: 10
- Type: Int
- Unit: Seconds
- Is mutable: No
- Description: Timeout in seconds used for Kafka-related routine load operations. When a client request does not specify a timeout, `routine_load_kafka_timeout_second` is used as the default RPC timeout (converted to milliseconds) for `get_info`. It is also used as the per-call consume poll timeout for the librdkafka consumer (converted to milliseconds and capped by the remaining runtime). Note: the internal `get_info` path reduces this value to 80% before passing it to librdkafka, to avoid FE-side timeout races. Set this to a value that balances timely failure reporting against sufficient time for network/broker responses; changes require a restart because the setting is not mutable.
- Introduced in: v3.2.0
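The 80% reduction and millisecond conversion on the `get_info` path described above amount to the following; the function name is illustrative:

```python
def kafka_get_info_timeout_ms(timeout_s: int) -> int:
    """Illustrative sketch: get_info passes 80% of the configured
    timeout, converted to milliseconds, to librdkafka."""
    return int(timeout_s * 1000 * 0.8)

print(kafka_get_info_timeout_ms(10))  # default 10 s -> 8000 ms
```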
routine_load_pulsar_timeout_second
- Default: 10
- Type: Int
- Unit: Seconds
- Is mutable: No
- Description: Default timeout (in seconds) that the BE uses for Pulsar-related routine load operations when the request does not supply an explicit timeout. Specifically, `PInternalServiceImplBase::get_pulsar_info` multiplies this value by 1000 to form the millisecond timeout passed to the routine load task executor methods that fetch Pulsar partition metadata and backlog. Increase it to allow slower Pulsar responses at the cost of longer failure detection; decrease it to fail faster on slow brokers. Analogous to `routine_load_kafka_timeout_second`, which is used for Kafka.
- Introduced in: v3.2.0
streaming_load_thread_pool_idle_time_ms
- Default: 2000
- Type: Int
- Unit: Milliseconds
- Is mutable: No
- Description: Sets the thread idle timeout (in milliseconds) for streaming-load related thread pools. The value is used as the idle timeout passed to ThreadPoolBuilder for the `stream_load_io` pool and also for `load_rowset_pool` and `load_segment_pool`. Threads in these pools are reclaimed when idle for this duration; lower values reduce idle resource usage but increase thread creation overhead, while higher values keep threads alive longer. The `stream_load_io` pool is used when `enable_streaming_load_thread_pool` is enabled.
- Introduced in: v3.2.0
streaming_load_thread_pool_num_min
- Default: 0
- Type: Int
- Unit: -
- Is mutable: No
- Description: Minimum number of threads for the streaming load IO thread pool ("stream_load_io") created during ExecEnv initialization. The pool is built with `set_max_threads(INT32_MAX)` and `set_max_queue_size(INT32_MAX)`, so it is effectively unbounded to avoid deadlocks for concurrent streaming loads. A value of 0 lets the pool start with no threads and grow on demand; setting a positive value reserves that many threads at startup. This pool is used when `enable_streaming_load_thread_pool` is true, and its idle timeout is controlled by `streaming_load_thread_pool_idle_time_ms`. Overall concurrency is still constrained by `fragment_pool_thread_num_max` and `webserver_num_workers`; changing this value is rarely necessary and may increase resource usage if set too high.
- Introduced in: v3.2.0