elastic_cluster_v2

note

The resource's API may change in subsequent versions to simplify user experience.

Deploy a multi-warehouse elastic CelerData cluster on AWS EC2 instances, Azure virtual machines, or GCP virtual machines.

Creating this resource is one step of the overall cluster deployment procedure and depends on a data credential, a deployment credential, and a network configuration. For detailed procedures of cluster deployments on AWS, Azure, and GCP, see Provision CelerData Cloud BYOC on AWS and Provision CelerData Cloud BYOC on Azure.

Supported Node Sizes

For information about the instance types supported by CelerData, see Supported instance types.

Example Usage

resource "celerdatabyoc_elastic_cluster_v2" "elastic_cluster_1" {
  deployment_credential_id = "<deployment_credential_resource_ID>"
  data_credential_id = "<data_credential_resource_ID>"
  network_id = "<network_configuration_resource_ID>"

  cluster_name = "<cluster_name>"
  coordinator_node_size = "<coordinator_node_instance_type>"
  coordinator_node_count = <coordinator_node_number>

  // optional
  coordinator_node_volume_config {
    vol_size = <vol_size>
    iops = <iops>
    throughput = <throughput>
  }
  // optional
  coordinator_node_configs = {
    <key> = <value>
  }

  // The configuration for "default_warehouse" is required.
  default_warehouse {
    compute_node_size = "<compute_node_instance_type>"
    compute_node_count = <compute_node_number>

    // optional
    compute_node_volume_config {
      vol_number = <vol_number>
      vol_size = <vol_size>
      iops = <iops>
      throughput = <throughput>
    }
    // optional
    compute_node_configs = {
      <key> = <value>
    }
  }

  warehouse {
    name = "<warehouse_name>"
    compute_node_size = "<compute_node_instance_type>"
    compute_node_count = <compute_node_number>
    // When using an EBS-backed instance type, specify the following two parameters. Otherwise, delete them.
    compute_node_ebs_disk_number = <compute_node_ebs_disk_number>
    compute_node_ebs_disk_per_size = <compute_node_ebs_disk_per_size>
    distribution_policy = "{specify_az | crossing_az}"

    // specify_az = "us-west-2b"
    // expected_state = "Suspended"
    // auto_scaling_policy = celerdatabyoc_auto_scaling_policy.policy_1.policy_json

    // optional
    compute_node_volume_config {
      vol_number = <vol_number>
      vol_size = <vol_size>
      iops = <iops>
      throughput = <throughput>
    }
    // optional
    compute_node_configs = {
      <key> = <value>
    }
  }

  custom_ami {
    ami = "<ami_id>"
    // ami = "ami-09245d5773578a1d6"
    os = "al2023"
  }

  // optional
  scheduling_policy {
    policy_name = "auto-resume-suspend"
    description = "Auto resume/suspend"
    active_days = ["TUESDAY"]
    time_zone = "UTC" // IANA Time-Zone
    resume_at = "09:00"
    suspend_at = "18:00"
    enable = true
  }

  default_admin_password = "<SQL_user_initial_password>"
  expected_cluster_state = "{Suspended | Running}"
  ldap_ssl_certs = [
    "<ssl_cert_s3_path>"
  ]
  ranger_certs_dir_path = "<ranger_config_s3_path>" // Example: "s3://your-bucket/ranger_config_dir"
  resource_tags = {
    celerdata = "<tag_name>"
  }
  csp = "{aws | azure}"
  region = "<cloud_provider_region>"

  init_scripts {
    logs_dir = "<log_s3_path>"
    script_path = "<script_s3_path>"
  }
  run_scripts_parallel = false
  query_port = 9030
  idle_suspend_interval = 60
}
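
For orientation, here is one way the template above might look with the placeholders filled in for a small AWS deployment. This is a sketch under stated assumptions, not a verified configuration: the resource names deploy_cred, data_cred, and network, the cluster name, the admin_password variable, and the region are all hypothetical, and the three referenced resources are assumed to be defined elsewhere in the same configuration. The instance types are the examples given in the Argument Reference.

```hcl
# Hypothetical variable holding the initial admin password (an assumption of this sketch).
variable "admin_password" {
  type      = string
  sensitive = true
}

resource "celerdatabyoc_elastic_cluster_v2" "example" {
  # Assumed to be defined elsewhere; the resource names are hypothetical.
  deployment_credential_id = celerdatabyoc_aws_deployment_role_credential.deploy_cred.id
  data_credential_id       = celerdatabyoc_aws_data_credential.data_cred.id
  network_id               = celerdatabyoc_aws_network.network.id

  cluster_name          = "example-elastic-cluster"
  coordinator_node_size = "m6i.4xlarge"

  # Only the required default warehouse is declared here.
  default_warehouse {
    compute_node_size  = "r6id.4xlarge"
    compute_node_count = 3
  }

  default_admin_password = var.admin_password
  expected_cluster_state = "Running"
  csp                    = "aws"
  region                 = "us-west-2" # example value; use the region of your network
}
```

Passing the password through a sensitive variable rather than a literal keeps it out of the configuration files, although it is still written to the Terraform state.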

Argument Reference

note

This section explains only the arguments of the celerdatabyoc_elastic_cluster_v2 resource. For the explanation of arguments of other resources, see the corresponding resource topics.

The celerdatabyoc_elastic_cluster_v2 resource contains the following required arguments and optional arguments:

Required:

  • cluster_name: (Not allowed to modify) The desired name for the cluster. Enter a unique name.

  • coordinator_node_size: The instance type for coordinator nodes in the cluster. Select a coordinator node instance type from the table "Supported Node Sizes". For example, you can set this argument to m6i.4xlarge.

  • deployment_credential_id: (Not allowed to modify) The ID of the deployment credential.

    • If you deploy the cluster on AWS, set this argument to celerdatabyoc_aws_deployment_role_credential.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_aws_deployment_role_credential resource.
    • If you deploy the cluster on Azure, set this argument to celerdatabyoc_azure_deployment_credential.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_azure_deployment_credential resource.
  • data_credential_id: (Not allowed to modify) The ID of the data credential.

    • If you deploy the cluster on AWS, set this argument to celerdatabyoc_aws_data_credential.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_aws_data_credential resource.
    • If you deploy the cluster on Azure, set this argument to celerdatabyoc_azure_data_credential.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_azure_data_credential resource.
  • network_id: (Not allowed to modify) The ID of the network configuration.

    • If you deploy the cluster on AWS, set this argument to celerdatabyoc_aws_network.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_aws_network resource.
    • If you deploy the cluster on Azure, set this argument to celerdatabyoc_azure_network.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_azure_network resource.
  • default_warehouse: (List of Object) The default warehouse. The attributes of a default warehouse include:

    • compute_node_size: (Required) The instance type for compute nodes in the cluster. Select a compute node instance type from the table "Supported Node Sizes". For example, you can set this argument to r6id.4xlarge.

    • compute_node_count: (Optional) The number of compute nodes in the cluster. Valid values: any positive integer. Default value: 3.

    • compute_node_volume_config: The compute nodes volume configuration.

      • vol_number: (Not allowed to modify) The number of disks for each compute node. Valid values: [1,24]. Default value: 2.
      • vol_size: The size per disk for each compute node. Unit: GB. Default value: 100. You can only increase the value of this parameter.
      • iops: (Available only for AWS) Disk IOPS.
      • throughput: (Available only for AWS) Disk throughput.

      ~> You can use the vol_number and vol_size arguments to specify the disk space. The total disk space provisioned to a compute node is equal to vol_number * vol_size.
    • compute_node_configs: The static configuration items you want to customize for the compute nodes in the warehouse.

    • auto_scaling_policy: (Optional) This policy will automatically scale the number of compute nodes, based on CPU utilization of the warehouse. For more information, see Enable Auto Scaling for your warehouse. You can generate the policy_json value for this argument using the celerdatabyoc_auto_scaling_policy resource.

    • distribution_policy: (Optional, available only for AWS) The compute node distribution policy for the warehouse if you want to enable Multi-AZ deployment for the cluster. Valid values: specify_az (nodes are deployed in the primary availability zone) and crossing_az (nodes are deployed across three availability zones). For more information, see Multi-AZ Deployment.

      ~> To enable Multi-AZ Deployment, you must deploy at least 3 coordinator nodes, that is, coordinator_node_count must be greater than or equal to 3.

    • specify_az: (Optional, available only for AWS) The primary availability zone for node deployment. This argument is available only when distribution_policy is set to specify_az.

  • custom_ami: (Optional, available only for AWS) The custom Amazon Machine Image (AMI) used to deploy the cluster. You can specify this argument only when creating the cluster. If this argument is not specified, the default AMI is used.

    • ami: The ID of the custom AMI.
    • os: The operating system (OS) of the AMI. Currently only al2023 (Amazon Linux 2023) is supported. The value of this field must be consistent with the actual OS of the AMI. Otherwise, the deployment will fail.
  • default_admin_password: (Not allowed to modify) The initial password of the cluster admin user.

  • expected_cluster_state: When creating a cluster, you need to declare the status of the cluster you are creating. Cluster states are categorized as Suspended and Running. If you want the cluster to start after provisioning, set this argument to Running. If you do not do so, the cluster will be suspended after provisioning.

  • csp: (Not allowed to modify) The cloud service provider of the cluster.

    • If you deploy the cluster on AWS, set this argument to aws.
    • If you deploy the cluster on Azure, set this argument to azure.
  • region: (Not allowed to modify) The ID of the cloud provider region to which the network hosting the cluster belongs. See Supported cloud platforms and regions.
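
As a worked example of the vol_number * vol_size rule for default_warehouse, the fragment below computes the total disk space per compute node with Terraform locals. It is a minimal sketch: the local names and the values 4 and 200 are chosen for illustration only, and the other required arguments of the resource are left out.

```hcl
locals {
  # Example values chosen for illustration only.
  cn_vol_number = 4   # disks per compute node (valid range: 1 to 24)
  cn_vol_size   = 200 # size of each disk, in GB

  # Total disk space per compute node = vol_number * vol_size = 4 * 200 = 800 GB.
  cn_total_disk_gb = local.cn_vol_number * local.cn_vol_size
}

resource "celerdatabyoc_elastic_cluster_v2" "example" {
  # ... other required arguments omitted for brevity ...

  default_warehouse {
    compute_node_size  = "<compute_node_instance_type>"
    compute_node_count = 3

    compute_node_volume_config {
      vol_number = local.cn_vol_number
      vol_size   = local.cn_vol_size
    }
  }
}

# Exposes the computed total (800) so that it shows up in plan and apply output.
output "cn_total_disk_gb" {
  value = local.cn_total_disk_gb
}
```

Because vol_number cannot be modified after creation and vol_size can only be increased, it may be worth settling on the disk count up front and leaving headroom to grow the per-disk size later.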

Optional:

  • coordinator_node_count: The number of coordinator nodes in the cluster. Valid values: 1, 3, and 5. Default value: 1. If you want to enable Multi-AZ Deployment (available only for AWS), you must deploy at least 3 coordinator nodes, that is, coordinator_node_count must be greater than or equal to 3.

  • coordinator_node_volume_config: The coordinator nodes volume configuration.

    • vol_size: The size per disk for each coordinator node. Unit: GB. Default value: 150. You can only increase the value of this parameter.
    • iops: (Available only for AWS) Disk IOPS.
    • throughput: (Available only for AWS) Disk throughput.
  • coordinator_node_configs: The coordinator node static configuration.

  • warehouse: (List of Object) The list of warehouses. The attributes of a warehouse include:

    • name: (Required) The warehouse name must be unique within the cluster and cannot be named "default_warehouse".

    • compute_node_size: (Required) The instance type for compute nodes in the cluster. Select a compute node instance type from the table "Supported Node Sizes". For example, you can set this argument to r6id.4xlarge.

    • compute_node_count: The number of compute nodes in the cluster. Valid values: any positive integer. Default value: 3.

    • compute_node_volume_config: The compute nodes volume configuration.

      • vol_number: (Not allowed to modify) The number of disks for each compute node. Valid values: [1,24]. Default value: 2.
      • vol_size: The size per disk for each compute node. Unit: GB. Default value: 100. You can only increase the value of this parameter.
      • iops: (Available only for AWS) Disk IOPS.
      • throughput: (Available only for AWS) Disk throughput.

      ~> You can use the vol_number and vol_size arguments to specify the disk space. The total disk space provisioned to a compute node is equal to vol_number * vol_size.
    • compute_node_configs: The compute node static configuration.

    • expected_state: The state of the warehouse after provisioning, which you can declare when creating a non-default warehouse. Valid values: Suspended and Running. If you want the warehouse to start after provisioning, set this argument to Running. If you set this argument to Suspended, the warehouse will be suspended after provisioning.

    • idle_suspend_interval: The amount of time (in minutes) during which the warehouse can stay idle. After the specified time period elapses, the warehouse will be automatically suspended. To enable the Auto Suspend feature, set this argument to an integer in the range of 15 to 999999. To disable this feature again, remove this argument from your Terraform configuration.

    • auto_scaling_policy: This policy will automatically scale the number of Compute nodes (CN), based on CPU utilization of the warehouse. For more information, see Enable Auto Scaling for your warehouse. You can generate the policy_json value for this argument using the celerdatabyoc_auto_scaling_policy resource.

    • distribution_policy: (Available only for AWS) The compute node distribution policy for the warehouse if you want to enable Multi-AZ deployment for the cluster. Valid values: specify_az (nodes are deployed in the primary availability zone) and crossing_az (nodes are deployed across three availability zones). For more information, see Multi-AZ Deployment.

      ~> To enable Multi-AZ Deployment, you must deploy at least 3 coordinator nodes, that is, coordinator_node_count must be greater than or equal to 3.

    • specify_az: (Available only for AWS) The primary availability zone for node deployment. This argument is available only when distribution_policy is set to specify_az.

  • global_session_variables: Global session variables of the cluster. You can list all configurable variables by running select VARIABLE_NAME from information_schema.global_variables;.

  • ldap_ssl_certs: (Available only for AWS) The path in the AWS S3 bucket that stores the LDAP SSL certificates. Multiple paths must be separated by commas (,). CelerData supports using LDAP over SSL by uploading the LDAP SSL certificates from S3. To allow CelerData to successfully fetch the certificates, you must grant the ListObject and GetObject permissions to CelerData. To delete the certificates uploaded, you only need to remove this argument.

  • ranger_certs_dir: (Available only for AWS) The path of the parent directory in the AWS S3 bucket that stores the Ranger SSL certificates. CelerData supports using Ranger over SSL by uploading the Ranger SSL certificates from S3. To allow CelerData to successfully fetch the certificates, you must grant the ListObject and GetObject permissions to CelerData. To delete the certificates uploaded, you only need to remove this argument.

note

You can only upload or delete LDAP or Ranger SSL certificates while the cluster's expected_cluster_state is set to Running.

  • resource_tags: The tags to be attached to the cluster. Note that resource_tags is a CelerData concept: on AWS and Azure, each entry is added as a tag to the corresponding resources, and on GCP, it is added as a label to the corresponding GCP resources.

  • init_scripts: (Available only for AWS) The configuration block to specify the paths to which scripts and script execution results are stored. The maximum number of executable scripts is 20. For information about the formats supported by these arguments, see scripts.logs_dir and scripts.script_path in Run scripts.

    • logs_dir: The path in the AWS S3 bucket to which script execution results are stored. This S3 bucket can be the same as or different from the S3 bucket you specify in the celerdatabyoc_aws_data_credential resource.
    • script_path: The path in the AWS S3 bucket that stores the scripts to run via Terraform. This S3 bucket must be the one you specify in the celerdatabyoc_aws_data_credential resource.
  • run_scripts_parallel: Whether to execute the scripts in parallel. Valid values: true and false. Default value: false.

  • run_scripts_timeout: The amount of time after which the script execution times out. Unit: Seconds. Default: 3600 (1 hour). The maximum value of this item is 21600 (6 hours).

  • query_port: The query port, which must be within the range of 1-65535 excluding 443. The default query port is port 9030. Note that this argument can be specified only at cluster deployment, and cannot be modified once it is set.

  • idle_suspend_interval: The amount of time (in minutes) during which the cluster can stay idle. After the specified time period elapses, the cluster will be automatically suspended. The Auto Suspend feature is disabled by default. To enable the Auto Suspend feature, set this argument to an integer in the range of 15 to 999999. To disable this feature again, remove this argument from your Terraform configuration.

  • scheduling_policy: (Optional, List) When specified, CelerData automatically suspends the cluster on schedule to save most of the EC2 costs (only EBS costs are incurred while the cluster is suspended) and resumes it on schedule for use.

    • policy_name: (Required) Policy name.
    • description: (Optional) Explanation of this policy strategy.
    • active_days: (Required) The days of the week on which the cluster scheduling policy is triggered. Valid values:
      • MONDAY
      • TUESDAY
      • WEDNESDAY
      • THURSDAY
      • FRIDAY
      • SATURDAY
      • SUNDAY
    • time_zone: (Optional) Specify your IANA Time-Zone. Default: UTC.
    • resume_at: (Optional) Cluster auto resume time. resume_at and suspend_at cannot both be empty.
    • suspend_at: (Optional) Cluster auto suspend time.
    • enable: (Required) Whether to enable this scheduling policy. When specified as true, the system will perform cluster scheduling according to this policy.
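
To show how several of the optional arguments combine, the fragment below sketches a Multi-AZ cluster on AWS with one additional warehouse that suspends itself when idle, plus a weekday scheduling policy. The warehouse name etl_wh, the policy name, and the chosen values are hypothetical, and the required arguments are omitted, so treat it as a starting point rather than a complete configuration.

```hcl
resource "celerdatabyoc_elastic_cluster_v2" "example" {
  # ... required arguments omitted for brevity ...

  # Multi-AZ deployment needs at least 3 coordinator nodes (valid values: 1, 3, and 5).
  coordinator_node_count = 3

  warehouse {
    name               = "etl_wh" # hypothetical name; must not be "default_warehouse"
    compute_node_size  = "<compute_node_instance_type>"
    compute_node_count = 3

    # Deploy compute nodes across availability zones (AWS only).
    distribution_policy = "crossing_az"

    # Suspend this warehouse after 30 idle minutes (allowed range: 15 to 999999).
    idle_suspend_interval = 30
  }

  # Resume the cluster at 09:00 and suspend it at 18:00 UTC on weekdays.
  scheduling_policy {
    policy_name = "weekday-hours" # hypothetical name
    description = "Resume on weekday mornings, suspend in the evening"
    active_days = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY", "FRIDAY"]
    time_zone   = "UTC"
    resume_at   = "09:00"
    suspend_at  = "18:00"
    enable      = true
  }
}
```

To pin the warehouse to a single zone instead, distribution_policy could be set to "specify_az" together with specify_az (for example "us-west-2b", the value shown in the template).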

See Also

AWS

Azure

GCP

Warehouse