elastic_cluster_v2
~> The resource's API may change in subsequent versions to simplify user experience.
Deploy a multi-warehouse elastic CelerData cluster on AWS EC2 instances, Azure virtual machines, or GCP virtual machines.
This resource is one part of the overall cluster deployment procedure and depends on the implementation of a data credential, a deployment credential, and a network configuration. For detailed procedures of cluster deployments on AWS, Azure, and GCP, see Provision CelerData Cloud BYOC on AWS and Provision CelerData Cloud BYOC on Azure.
Supported Node Sizes
For information about the instance types supported by CelerData, see Supported instance types.
Example Usage
resource "celerdatabyoc_elastic_cluster_v2" "elastic_cluster_1" {
deployment_credential_id = "<deployment_credential_resource_ID>"
data_credential_id = "<data_credential_resource_ID>"
network_id = "<network_configuration_resource_ID>"
cluster_name = "<cluster_name>"
coordinator_node_size = "<coordinator_node_instance_type>"
coordinator_node_count = <coordinator_node_number>
// optional
coordinator_node_volume_config {
vol_size = <vol_size>
iops = <iops>
throughput = <throughput>
}
// optional
coordinator_node_configs = {
<key> = <value>
}
// The configuration for “default_warehouse” is required.
default_warehouse {
compute_node_size = "<compute_node_instance_type>"
compute_node_count = <compute_node_number>
// optional
compute_node_volume_config {
vol_number = <vol_number>
vol_size = <vol_size>
iops = <iops>
throughput = <throughput>
}
// optional
compute_node_configs = {
<key> = <value>
}
}
warehouse {
name = "<warehouse_name>"
compute_node_size = "<compute_node_instance_type>"
compute_node_count = <compute_node_number>
// When using an EBS-backed instance type, specify the following two parameters. Otherwise, delete them.
compute_node_ebs_disk_number = <compute_node_ebs_disk_number>
compute_node_ebs_disk_per_size = <compute_node_ebs_disk_per_size>
distribution_policy = "{specify_az | crossing_az}"
// specify_az = "us-west-2b"
// expected_state="Suspended"
// auto_scaling_policy = celerdatabyoc_auto_scaling_policy.policy_1.policy_json
// optional
compute_node_volume_config {
vol_number = <vol_number>
vol_size = <vol_size>
iops = <iops>
throughput = <throughput>
}
// optional
compute_node_configs = {
<key> = <value>
}
}
custom_ami {
ami = "<ami_id>"
// ami = "ami-09245d5773578a1d6"
os = "al2023"
}
// optional
scheduling_policy {
policy_name = "auto-resume-suspend"
description = "Auto resume/suspend"
active_days = ["TUESDAY"]
time_zone = "UTC" // IANA Time-Zone
resume_at = "09:00"
suspend_at = "18:00"
enable = true
}
default_admin_password = "<SQL_user_initial_password>"
expected_cluster_state = "{Suspended | Running}"
ldap_ssl_certs = [
"<ssl_cert_s3_path>"
]
ranger_certs_dir_path = "<ranger_config_s3_path>" // Example : "s3://your-bucket/ranger_config_dir"
resource_tags = {
celerdata = "<tag_name>"
}
csp = "{aws | azure}"
region = "<cloud_provider_region>"
init_scripts {
logs_dir = "<log_s3_path>"
script_path = "<script_s3_path>"
}
run_scripts_parallel = false
query_port = 9030
idle_suspend_interval = 60
}
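The example above uses placeholder IDs. In practice, you would typically reference the IDs exported by the credential and network resources directly. The following is a minimal sketch for an AWS deployment; the resource names (`deployment_role_credential`, `data_credential`, and `network`) are illustrative and should match the names used in your own configuration:

```terraform
resource "celerdatabyoc_elastic_cluster_v2" "elastic_cluster_1" {
  // Reference the IDs exported by the AWS credential and network resources
  // instead of hard-coding them. The resource names here are placeholders.
  deployment_credential_id = celerdatabyoc_aws_deployment_role_credential.deployment_role_credential.id
  data_credential_id       = celerdatabyoc_aws_data_credential.data_credential.id
  network_id               = celerdatabyoc_aws_network.network.id

  // ... other arguments as shown in the example above ...
}
```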
Argument Reference
~> This section explains only the arguments of the `celerdatabyoc_elastic_cluster_v2` resource. For the explanation of arguments of other resources, see the corresponding resource topics.
The `celerdatabyoc_elastic_cluster_v2` resource contains the following required arguments and optional arguments:
Required:

- `cluster_name`: (Not allowed to modify) The desired name for the cluster. Enter a unique name.
- `coordinator_node_size`: The instance type for coordinator nodes in the cluster. Select a coordinator node instance type from the table "Supported Node Sizes". For example, you can set this argument to `m6i.4xlarge`.
- `deployment_credential_id`: (Not allowed to modify) The ID of the deployment credential.
  - If you deploy the cluster on AWS, set this argument to `celerdatabyoc_aws_deployment_role_credential.<resource_name>.id` and replace `<resource_name>` with the name of the `celerdatabyoc_aws_deployment_role_credential` resource.
  - If you deploy the cluster on Azure, set this argument to `celerdatabyoc_azure_deployment_credential.<resource_name>.id` and replace `<resource_name>` with the name of the `celerdatabyoc_azure_deployment_credential` resource.
- `data_credential_id`: (Not allowed to modify) The ID of the data credential.
  - If you deploy the cluster on AWS, set this argument to `celerdatabyoc_aws_data_credential.<resource_name>.id` and replace `<resource_name>` with the name of the `celerdatabyoc_aws_data_credential` resource.
  - If you deploy the cluster on Azure, set this argument to `celerdatabyoc_azure_data_credential.<resource_name>.id` and replace `<resource_name>` with the name of the `celerdatabyoc_azure_data_credential` resource.
- `network_id`: (Not allowed to modify) The ID of the network configuration.
  - If you deploy the cluster on AWS, set this argument to `celerdatabyoc_aws_network.<resource_name>.id` and replace `<resource_name>` with the name of the `celerdatabyoc_aws_network` resource.
  - If you deploy the cluster on Azure, set this argument to `celerdatabyoc_azure_network.<resource_name>.id` and replace `<resource_name>` with the name of the `celerdatabyoc_azure_network` resource.
- `default_warehouse`: (List of Object) The default warehouse. The attributes of a default warehouse include:
  - `compute_node_size`: (Required) The instance type for compute nodes in the warehouse. Select a compute node instance type from the table "Supported Node Sizes". For example, you can set this argument to `r6id.4xlarge`.
  - `compute_node_count`: (Optional) The number of compute nodes in the warehouse. Valid values: any non-zero positive integer. Default value: `3`.
  - `compute_node_volume_config`: The compute node volume configuration.
    - `vol_number`: (Not allowed to modify) The number of disks for each compute node. Valid values: [1,24]. Default value: `2`.
    - `vol_size`: The size per disk for each compute node. Unit: GB. Default value: `100`. You can only increase the value of this parameter.
    - `iops`: (Available only for AWS) Disk IOPS.
    - `throughput`: (Available only for AWS) Disk throughput.

    ~> You can use the `vol_number` and `vol_size` arguments to specify the disk space. The total disk space provisioned to a compute node is equal to `vol_number` * `vol_size`.

  - `compute_node_configs`: The static configuration items you want to customize for the compute nodes in the warehouse.
  - `auto_scaling_policy`: (Optional) This policy automatically scales the number of compute nodes based on the CPU utilization of the warehouse. For more information, see Enable Auto Scaling for your warehouse. You can generate the `policy_json` value for this argument using the `celerdatabyoc_auto_scaling_policy` resource.
  - `distribution_policy`: (Optional, available only for AWS) The compute node distribution policy for the warehouse if you want to enable Multi-AZ deployment for the cluster. Valid values: `specify_az` (nodes are deployed in the primary availability zone) and `crossing_az` (nodes are deployed across three availability zones). For more information, see Multi-AZ Deployment. A Multi-AZ example follows this argument list.

    ~> To enable Multi-AZ deployment, you must deploy at least 3 coordinator nodes, that is, `coordinator_node_count` must be greater than or equal to `3`.

  - `specify_az`: (Optional, available only for AWS) The primary availability zone for node deployment. This argument is available only when `distribution_policy` is set to `specify_az`.
- `custom_ami`: (Optional, available only for AWS) The Amazon Machine Image (AMI) used to deploy the cluster. You can use a custom AMI for deployment. You can specify this argument only when creating the cluster. If this argument is not specified, the default AMI is used.
  - `ami`: The ID of the custom AMI.
  - `os`: The operating system (OS) of the AMI. Currently only `al2023` (Amazon Linux 2023) is supported. The value of this field must be consistent with the actual OS of the AMI. Otherwise, the deployment will fail.
- `default_admin_password`: (Not allowed to modify) The initial password of the cluster `admin` user.
- `expected_cluster_state`: When creating a cluster, you need to declare the state of the cluster you are creating. Cluster states are categorized as `Suspended` and `Running`. If you want the cluster to start after provisioning, set this argument to `Running`. If you do not do so, the cluster will be suspended after provisioning.
- `csp`: (Not allowed to modify) The cloud service provider of the cluster.
  - If you deploy the cluster on AWS, set this argument to `aws`.
  - If you deploy the cluster on Azure, set this argument to `azure`.
- `region`: (Not allowed to modify) The ID of the cloud provider region to which the network hosting the cluster belongs. See Supported cloud platforms and regions.
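As referenced in the `distribution_policy` description above, the following is a minimal sketch of a Multi-AZ configuration, assuming an AWS deployment. The instance types, warehouse name, and availability zone are placeholders, and the remaining required arguments are omitted:

```terraform
resource "celerdatabyoc_elastic_cluster_v2" "multi_az_example" {
  // Multi-AZ deployment requires at least 3 coordinator nodes.
  coordinator_node_size  = "m6i.4xlarge"
  coordinator_node_count = 3

  default_warehouse {
    compute_node_size  = "r6id.4xlarge"
    compute_node_count = 3

    // Spread compute nodes across three availability zones ...
    distribution_policy = "crossing_az"
  }

  warehouse {
    name               = "adhoc_wh" // placeholder warehouse name
    compute_node_size  = "r6id.4xlarge"
    compute_node_count = 3

    // ... or pin compute nodes to a primary availability zone.
    distribution_policy = "specify_az"
    specify_az          = "us-west-2b"
  }

  // Other required arguments (credentials, network, csp, region, and so on) are omitted here.
}
```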
Optional:
- `coordinator_node_count`: The number of coordinator nodes in the cluster. Valid values: `1`, `3`, and `5`. Default value: `1`. If you want to enable Multi-AZ deployment (available only for AWS), you must deploy at least 3 coordinator nodes, that is, `coordinator_node_count` must be greater than or equal to `3`.
- `coordinator_node_volume_config`: The coordinator node volume configuration.
  - `vol_size`: The size per disk for each coordinator node. Unit: GB. Default value: `150`. You can only increase the value of this parameter.
  - `iops`: (Available only for AWS) Disk IOPS.
  - `throughput`: (Available only for AWS) Disk throughput.
- `coordinator_node_configs`: The coordinator node static configuration.
- `warehouse`: (List of Object) The list of warehouses. The attributes of a warehouse include:
  - `name`: (Required) The warehouse name. It must be unique within the cluster and cannot be "default_warehouse".
  - `compute_node_size`: (Required) The instance type for compute nodes in the warehouse. Select a compute node instance type from the table "Supported Node Sizes". For example, you can set this argument to `r6id.4xlarge`.
  - `compute_node_count`: The number of compute nodes in the warehouse. Valid values: any non-zero positive integer. Default value: `3`.
  - `compute_node_volume_config`: The compute node volume configuration.
    - `vol_number`: (Not allowed to modify) The number of disks for each compute node. Valid values: [1,24]. Default value: `2`.
    - `vol_size`: The size per disk for each compute node. Unit: GB. Default value: `100`. You can only increase the value of this parameter.
    - `iops`: (Available only for AWS) Disk IOPS.
    - `throughput`: (Available only for AWS) Disk throughput.

    ~> You can use the `vol_number` and `vol_size` arguments to specify the disk space. The total disk space provisioned to a compute node is equal to `vol_number` * `vol_size`.

  - `compute_node_configs`: The compute node static configuration.
  - `expected_state`: When creating a non-default warehouse, you can declare the state of the warehouse. Warehouse states are categorized as `Suspended` and `Running`. If you want the warehouse to start after provisioning, set this argument to `Running`. If you set this argument to `Suspended`, the warehouse will be suspended after provisioning.
  - `idle_suspend_interval`: The amount of time (in minutes) during which the warehouse can stay idle. After the specified time period elapses, the warehouse is automatically suspended. To enable the Auto Suspend feature, set this argument to an integer within the range of 15 to 999999. To disable this feature again, remove this argument from your Terraform configuration.
  - `auto_scaling_policy`: This policy automatically scales the number of compute nodes (CN) based on the CPU utilization of the warehouse. For more information, see Enable Auto Scaling for your warehouse. You can generate the `policy_json` value for this argument using the `celerdatabyoc_auto_scaling_policy` resource.
  - `distribution_policy`: (Available only for AWS) The compute node distribution policy for the warehouse if you want to enable Multi-AZ deployment for the cluster. Valid values: `specify_az` (nodes are deployed in the primary availability zone) and `crossing_az` (nodes are deployed across three availability zones). For more information, see Multi-AZ Deployment.

    ~> To enable Multi-AZ deployment, you must deploy at least 3 coordinator nodes, that is, `coordinator_node_count` must be greater than or equal to `3`.

  - `specify_az`: (Available only for AWS) The primary availability zone for node deployment. This argument is available only when `distribution_policy` is set to `specify_az`.
- `global_session_variables`: Global session variables of the cluster. You can find all configurable variables by running `select VARIABLE_NAME from information_schema.global_variables;`.
- `ldap_ssl_certs`: (Available only for AWS) The paths in the AWS S3 bucket that store the LDAP SSL certificates. Multiple paths must be separated by commas (,). CelerData supports using LDAP over SSL by uploading the LDAP SSL certificates from S3. To allow CelerData to successfully fetch the certificates, you must grant the `ListObject` and `GetObject` permissions to CelerData. To delete the uploaded certificates, you only need to remove this argument.
- `ranger_certs_dir`: (Available only for AWS) The parent directory path in the AWS S3 bucket that stores the Ranger SSL certificates. CelerData supports using Ranger over SSL by uploading the Ranger SSL certificates from S3. To allow CelerData to successfully fetch the certificates, you must grant the `ListObject` and `GetObject` permissions to CelerData. To delete the uploaded certificates, you only need to remove this argument.

~> You can only upload or delete LDAP or Ranger SSL certificates while the cluster's `expected_cluster_state` is set to `Running`.
- `resource_tags`: The tags to be attached to the cluster. Note that `resource_tags` is a CelerData concept: for AWS and Azure, each entry is added as a tag to the corresponding cloud resources; for GCP, it is added as a label to the corresponding GCP resources.
- `init_scripts`: (Available only for AWS) The configuration block to specify the paths in which scripts and script execution results are stored. The maximum number of executable scripts is 20. For information about the formats supported by these arguments, see `scripts.logs_dir` and `scripts.script_path` in Run scripts. A usage sketch follows this argument list.
  - `logs_dir`: The path in the AWS S3 bucket to which script execution results are stored. This S3 bucket can be the same as or different from the S3 bucket you specify in the `celerdatabyoc_aws_data_credential` resource.
  - `script_path`: The path in the AWS S3 bucket that stores the scripts to run via Terraform. This S3 bucket must be the one you specify in the `celerdatabyoc_aws_data_credential` resource.
- `run_scripts_parallel`: Whether to execute the scripts in parallel. Valid values: `true` and `false`. Default value: `false`.
- `run_scripts_timeout`: The amount of time after which the script execution times out. Unit: seconds. Default value: `3600` (1 hour). Maximum value: `21600` (6 hours).
- `query_port`: The query port, which must be within the range of 1-65535, excluding 443. The default query port is port `9030`. Note that this argument can be specified only at cluster deployment and cannot be modified once it is set.
- `idle_suspend_interval`: The amount of time (in minutes) during which the cluster can stay idle. After the specified time period elapses, the cluster is automatically suspended. The Auto Suspend feature is disabled by default. To enable the Auto Suspend feature, set this argument to an integer within the range of 15 to 999999. To disable this feature again, remove this argument from your Terraform configuration.
- `scheduling_policy`: (Optional, List) When specified, CelerData automatically suspends the cluster to save the majority of costs on EC2 (only EBS costs are incurred) and resumes the cluster for usage as scheduled.
  - `policy_name`: (Required) The policy name.
  - `description`: (Optional) An explanation of this policy.
  - `active_days`: (Required) The days on which the cluster scheduling policy is triggered. Valid values: `MONDAY`, `TUESDAY`, `WEDNESDAY`, `THURSDAY`, `FRIDAY`, `SATURDAY`, `SUNDAY`.
  - `time_zone`: (Optional) Your IANA Time Zone. Default: `UTC`.
  - `resume_at`: (Optional) The time at which the cluster automatically resumes. `resume_at` and `suspend_at` cannot both be empty.
  - `suspend_at`: (Optional) The time at which the cluster automatically suspends.
  - `enable`: (Required) Whether to enable this scheduling policy. When set to `true`, the system performs cluster scheduling according to this policy.
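As referenced in the `init_scripts` description above, the following is a minimal sketch of the script-related arguments, assuming an AWS deployment. The bucket name and paths are placeholders, and the exact path format supported is documented in Run scripts:

```terraform
resource "celerdatabyoc_elastic_cluster_v2" "cluster_with_scripts" {
  // Scripts must be stored in the S3 bucket configured in the
  // celerdatabyoc_aws_data_credential resource; the logs bucket may differ.
  // Both paths below are placeholders.
  init_scripts {
    logs_dir    = "<log_s3_path>"
    script_path = "<script_s3_path>"
  }

  // Run the scripts sequentially and time out after 2 hours (7200 seconds).
  run_scripts_parallel = false
  run_scripts_timeout  = 7200

  // Other required arguments (credentials, network, node sizes, and so on) are omitted here.
}
```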
See Also
AWS
- AWS IAM
- Manage data credentials for AWS
- Manage deployment credentials for AWS
- Manage network configurations for AWS
Azure
- Manage data credentials for Azure
- Manage deployment credentials for Azure
- Manage network configurations for Azure
GCP
- Manage data credentials for GCP
- Manage deployment credentials for GCP
- Manage network configurations for GCP