classic_cluster

note

The resource's API may change in subsequent versions to simplify the user experience. Classic CelerData clusters do not support Multi-AZ Deployment. For more information, see Multi-AZ Deployment.

Deploy a classic CelerData cluster on AWS EC2 instances, Azure virtual machines, or GCP virtual machines.

If you are setting up your first classic cluster, and it consists of one FE node and one BE node, both with instance types that offer up to 4 CPU cores and 16 GB RAM, then the cluster will be automatically created as a Free Developer Tier cluster. This allows you to explore and experience the features of CelerData Cloud BYOC at a minimal cost.
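As a rough sketch of the node settings that match this description, the block below pins both node counts to 1. The resource name `dev_tier` is hypothetical, the instance-type placeholders are left unfilled, and choosing types with at most 4 CPU cores and 16 GB RAM from the supported list is up to you. Because be_node_count defaults to 3, it has to be set explicitly.

```hcl
resource "celerdatabyoc_classic_cluster" "dev_tier" {
  # ... other required arguments omitted; see Example Usage

  # One FE node (this is also the default for fe_node_count)
  fe_instance_type = "<fe_node_instance_type>" # up to 4 CPU cores / 16 GB RAM
  fe_node_count    = 1

  # One BE node; the default would be 3, so set it explicitly
  be_instance_type = "<be_node_instance_type>" # up to 4 CPU cores / 16 GB RAM
  be_node_count    = 1
}
```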

The implementation of this resource is part of the overall cluster deployment procedure and depends on the implementation of a data credential, a deployment credential, and a network configuration. For detailed procedures of cluster deployments on AWS, Azure, and GCP, see Provision CelerData Cloud BYOC on AWS and Provision CelerData Cloud BYOC on Azure.

Supported Instance Types

For information about the instance types supported by CelerData, see Supported instance types.

Example Usage

resource "celerdatabyoc_classic_cluster" "classic_cluster_1" {
  deployment_credential_id = "<deployment_credential_resource_ID>"
  data_credential_id = "<data_credential_resource_ID>"
  network_id = "<network_configuration_resource_ID>"

  cluster_name = "<cluster_name>"
  fe_instance_type = "<fe_node_instance_type>"
  fe_node_count = 1
  
  // optional
  fe_volume_config {
    vol_size = <vol_size>
    iops = <iops>
    throughput = <throughput>
  }
  // optional
  fe_configs = {
    <key> = <value>
  }

  be_instance_type = "<be_node_instance_type>"
  be_node_count = 1
  
  // optional
  be_volume_config {
    vol_number = <vol_number>
    vol_size = <vol_size>
    iops = <iops>
    throughput = <throughput>
  }
  // optional
  be_configs = {
    <key> = <value>
  }

  // optional
  /*
  global_session_variables = {
    <key> = <value>
  }
  */

  // optional
  scheduling_policy {
    policy_name = "auto-resume-suspend"
    description = "Auto resume/suspend"
    active_days = ["TUESDAY"]
    time_zone = "UTC" // IANA Time-Zone
    resume_at   = "09:00"
    suspend_at  = "18:00"
    enable      = true
  }

  default_admin_password = "<SQL_user_initial_password>"

  expected_cluster_state = "{Suspended | Running}"
  ldap_ssl_certs = [
    "<ssl_cert_s3_path>"
  ]
  ranger_certs_dir_path = "<ranger_config_s3_path>" // Deprecated
  ranger_config_id = "<ranger_config_ID>"
  resource_tags = {
    celerdata = "<tag_name>"
  }
  csp = "{aws | azure}"
  region = "<cloud_provider_region>"

  init_scripts {
    logs_dir    = "<log_s3_path>"
    script_path = "<script_s3_path>"
  }
  
  # Unlike `init_scripts`, when the `rerun` attribute is true, the `scripts` will be executed once every time `terraform apply` is executed
  scripts {
    logs_dir    = "<log_s3_path>"
    script_path = "<script_s3_path>"
    rerun = false
  }

  run_scripts_parallel = false
  query_port = 9030
}

Argument Reference

note

This section explains only the arguments of the celerdatabyoc_classic_cluster resource. For the explanation of arguments of other resources, see the corresponding resource topics.

This resource contains the following required arguments and optional arguments:

Required:

cluster_name: (Not allowed to modify) The desired name for the cluster.
fe_instance_type: The instance type for FE nodes in the cluster. Select an FE instance type from the table "Supported Instance Types".
deployment_credential_id: (Not allowed to modify) The ID of the deployment credential.
- If you deploy the cluster on AWS, set this argument to celerdatabyoc_aws_deployment_role_credential.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_aws_deployment_role_credential resource.
- If you deploy the cluster on Azure, set this argument to celerdatabyoc_azure_deployment_credential.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_azure_deployment_credential resource.
data_credential_id: (Not allowed to modify) The ID of the data credential.
- If you deploy the cluster on AWS, set this argument to celerdatabyoc_aws_data_credential.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_aws_data_credential resource.
- If you deploy the cluster on Azure, set this argument to celerdatabyoc_azure_data_credential.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_azure_data_credential resource.
network_id: (Not allowed to modify) The ID of the network configuration.
- If you deploy the cluster on AWS, set this argument to celerdatabyoc_aws_network.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_aws_network resource.
- If you deploy the cluster on Azure, set this argument to celerdatabyoc_azure_network.<resource_name>.id and replace <resource_name> with the name of the celerdatabyoc_azure_network resource.
be_instance_type: The instance type for BE nodes in the cluster. Select a BE instance type from the table "Supported Instance Types".
default_admin_password: (Not allowed to modify) The initial password of the cluster admin user.
expected_cluster_state: The state that the cluster should be in after provisioning. Valid values: Suspended and Running. Set this argument to Running to start the cluster after provisioning, or to Suspended to leave the cluster suspended after provisioning.
csp: (Not allowed to modify) The cloud service provider of the cluster.
- If you deploy the cluster on AWS, set this argument to aws.
- If you deploy the cluster on Azure, set this argument to azure.
region: (Not allowed to modify) The ID of the cloud provider region to which the network hosting the cluster belongs. See Supported cloud platforms and regions.
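The required arguments above can be wired together as in the following sketch for an AWS deployment. It assumes that the credential and network resources are declared elsewhere in the same configuration under the hypothetical name `example`; the remaining placeholders are the same as in Example Usage.

```hcl
resource "celerdatabyoc_classic_cluster" "example" {
  csp    = "aws"
  region = "<cloud_provider_region>"

  # References to resources assumed to be defined elsewhere under the
  # hypothetical name "example"
  deployment_credential_id = celerdatabyoc_aws_deployment_role_credential.example.id
  data_credential_id       = celerdatabyoc_aws_data_credential.example.id
  network_id               = celerdatabyoc_aws_network.example.id

  cluster_name     = "<cluster_name>"
  fe_instance_type = "<fe_node_instance_type>"
  be_instance_type = "<be_node_instance_type>"

  default_admin_password = "<SQL_user_initial_password>"

  # Start the cluster once provisioning finishes
  expected_cluster_state = "Running"
}
```

For Azure, the same pattern would apply with csp set to azure and the celerdatabyoc_azure_* resources described above.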

Optional:

fe_node_count: The number of FE nodes in the cluster. Valid values: 1, 3, and 5. Default value: 1.
fe_volume_config: The volume configuration for FE nodes.
- vol_size: The size of each disk on an FE node. Unit: GB. Default value: 150. You can only increase the value of this argument.
- iops: The disk IOPS.
- throughput: The disk throughput.
fe_configs: The static configuration of the FE nodes.
be_node_count: The number of BE nodes in the cluster. Valid values: any positive integer. Default value: 3.
be_volume_config: The volume configuration for BE nodes.
- vol_number: (Not allowed to modify) The number of disks for each BE node. Valid values: [1,24]. Default value: 2.
- vol_size: The size of each disk on a BE node. Unit: GB. Default value: 100. You can only increase the value of this argument.
- iops: The disk IOPS.
- throughput: The disk throughput.

note

You can use the vol_number and vol_size arguments to specify the disk space. The total disk space provisioned to each BE node is equal to vol_number * vol_size.
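To make the arithmetic concrete, the following sketch uses illustrative values (the resource name `example` is hypothetical): 2 disks of 200 GB each give every BE node 2 * 200 = 400 GB, and with 3 BE nodes the cluster is provisioned with 3 * 400 = 1200 GB of BE disk space in total.

```hcl
resource "celerdatabyoc_classic_cluster" "example" {
  # ... other required arguments omitted; see Example Usage

  be_instance_type = "<be_node_instance_type>"
  be_node_count    = 3

  be_volume_config {
    vol_number = 2   # disks per BE node; cannot be modified after creation
    vol_size   = 200 # GB per disk; can only be increased later
  }
  # Disk space per BE node: 2 * 200 GB = 400 GB
}
```

Since vol_number cannot be modified and vol_size can only grow, it may be worth choosing the disk count with future capacity in mind and starting with a conservative vol_size.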

be_configs: The static configuration of the BE nodes.
global_session_variables: The global session variables of the cluster. To list all configurable variables, run SELECT VARIABLE_NAME FROM information_schema.global_variables;.
ldap_ssl_certs: (Available only for AWS) The path in the AWS S3 bucket that stores the LDAP SSL certificates. Multiple paths must be separated by commas (,). CelerData supports using LDAP over SSL by uploading the LDAP SSL certificates from S3. To allow CelerData to fetch the certificates, you must grant the ListObject and GetObject permissions to CelerData. To delete the uploaded certificates, remove this argument.
ranger_certs_dir_path: (Deprecated) The parent directory path in the AWS S3 bucket that stores the Ranger SSL certificates. CelerData supports using Ranger over SSL by uploading the Ranger SSL certificates from S3. To allow CelerData to fetch the certificates, you must grant the ListObject and GetObject permissions to CelerData. To delete the uploaded certificates, remove this argument.

note

You can only upload or delete LDAP or Ranger SSL certificates while the cluster's expected_cluster_state is set to Running.

ranger_config_id: The ID of the Ranger configuration you want to apply to the cluster. You can set this argument to celerdatabyoc_ranger_config.<resource_name>.id or a hard-coded ID value.
resource_tags: The tags to be attached to the cluster. Note that resource_tags is a CelerData concept. On AWS and Azure, each entry is added as a tag to the corresponding resources. On GCP, each entry is added as a label to the corresponding GCP resources.
init_scripts: (Available only for AWS) The configuration block that specifies where the scripts and the script execution results are stored. The maximum number of executable scripts is 20. For information about the formats supported by these arguments, see scripts.logs_dir and scripts.script_path in Run scripts.
- logs_dir: The path in the AWS S3 bucket to which script execution results are stored. This S3 bucket can be the same as or different from the S3 bucket you specify in the celerdatabyoc_aws_data_credential resource.
- script_path: The path in the AWS S3 bucket that stores the scripts to run via Terraform. This S3 bucket must be the one you specify in the celerdatabyoc_aws_data_credential resource.
scripts: (Available only for AWS) The configuration block that specifies where the scripts and the script execution results are stored. The maximum number of executable scripts is 20. For information about the formats supported by these arguments, see scripts.logs_dir and scripts.script_path in Run one-time scripts.
- logs_dir: The path in the AWS S3 bucket where the script execution results are stored. This S3 bucket can be the same as or different from the S3 bucket specified in the celerdatabyoc_aws_data_credential resource.
- script_path: The path in the AWS S3 bucket where the scripts run via Terraform are stored. This S3 bucket must be the one specified in the celerdatabyoc_aws_data_credential resource.
- rerun: If the value is true, the current script will be re-run every time terraform apply is executed.
run_scripts_parallel: Whether to execute the scripts in parallel. Valid values: true and false. Default value: false.
run_scripts_timeout: The amount of time after which the script execution times out. Unit: Seconds. Default: 3600 (1 hour). The maximum value of this item is 21600 (6 hours).
query_port: The query port, which must be within the range of 1-65535 excluding 443. The default query port is port 9030. Note that this argument can be specified only at cluster deployment, and cannot be modified once it is set.
idle_suspend_interval: The amount of time (in minutes) during which the cluster can stay idle. After the specified time period elapses, the cluster will be automatically suspended. The Auto Suspend feature is disabled by default, and therefore it is not included in the configuration example above. To enable the Auto Suspend feature, set this argument to an integer within the range of 15-999999. To disable this feature again, remove this argument from your Terraform configuration.
scheduling_policy: (Optional, List) When specified, CelerData automatically suspends the cluster on schedule to save most of the EC2 costs (only EBS costs are incurred while the cluster is suspended) and resumes it on schedule for use.
- policy_name: (Required) The name of the policy.
- description: (Optional) A description of the policy.
- active_days: (Required) The days of the week on which the scheduling policy takes effect. Valid values:
  - MONDAY
  - TUESDAY
  - WEDNESDAY
  - THURSDAY
  - FRIDAY
  - SATURDAY
  - SUNDAY
- time_zone: (Optional) The IANA time zone in which resume_at and suspend_at are interpreted. Default: UTC.
- resume_at: (Optional) The time at which the cluster is automatically resumed. resume_at and suspend_at cannot both be empty.
- suspend_at: (Optional) The time at which the cluster is automatically suspended.
- enable: (Required) Whether to enable this scheduling policy. When set to true, the system schedules the cluster according to this policy.
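As an illustration of the scheduling_policy arguments, the sketch below keeps a cluster running only during weekday business hours. The resource name, policy name, description, and time zone are hypothetical choices, and the HH:MM time format is assumed from the values shown in Example Usage.

```hcl
resource "celerdatabyoc_classic_cluster" "example" {
  # ... other required arguments omitted; see Example Usage

  scheduling_policy {
    policy_name = "weekday-business-hours" # hypothetical name
    description = "Resume at 09:00 and suspend at 18:00 on weekdays"
    active_days = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY", "FRIDAY"]
    time_zone   = "America/New_York" # any IANA time zone; defaults to UTC
    resume_at   = "09:00"
    suspend_at  = "18:00"
    enable      = true
  }
}
```

Setting enable to false would presumably keep the policy definition in place without applying it, which can be convenient during maintenance windows.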
