Deployment on GCP
CelerData provides a deployment wizard that breaks the deployment of a classic or elastic cluster on GCP into four steps:
- STEP1: Configure the cluster resources
- STEP2: Set up your GCP credentials
- STEP3: Configure access to the cluster
- STEP4: Deploy the cluster on GCP
Start the deployment wizard
Follow these steps to start the deployment wizard:
- Sign in to the CelerData Cloud BYOC console.
- On the Clusters page, click Create cluster.
- In the dialog box that is displayed, choose Classic cluster or Elastic cluster, choose GCP as your cloud provider, and then click Next.
NOTE
For GCP, CelerData supports only manual deployments.
Configure and run a manual deployment
Before you create the credentials and configurations described in this section, you must create a Google Cloud project and enable the necessary APIs.
To ensure a successful deployment, you must provide a data credential, a deployment credential, and a network configuration:
- Data credential
  A data credential declares read and write permissions on a Cloud Storage bucket, which is used to store query profiles. See Manage data credentials for GCP.
- Deployment credential
  A deployment credential references CelerData's public service account, which is granted the permissions needed to launch and manage cloud resources in your GCP project. It allows CelerData to launch the resources required for your deployment and for subsequent scaling. See Manage deployment credentials for GCP.
- Network configuration
  A network configuration enables connectivity between cluster nodes within your own VPC and between CelerData's VPC and your own VPC. See Manage network configurations for GCP.
After you start the deployment wizard (as explained in the preceding section "Start the deployment wizard"), you will be guided through four required steps (STEP1 to STEP4) for your deployment.
STEP1: Configure the cluster resources
Configure the cluster based on your business requirements. Optionally, click Add label to add one or more labels to the cluster. The labels you add here will be attached to the GCP cloud resources associated with the cluster. Then, click Next to continue.
NOTE
- CelerData provides a Free Developer Tier. To use it, you must select FE and BE instance types that provide 4 CPU cores and 16 GB of RAM.
- 4 CPU cores and 16 GB of RAM is also the minimum configuration for FEs and BEs.
- For a classic cluster, configure the following configuration items.

Parameter | Required | Description |
---|---|---|
Cluster name | Yes | Enter the name of the cluster. The name cannot be changed after the cluster is created. We recommend that you enter an informative name that helps you identify the cluster easily later. |
GCP region | Yes | Select the GCP region that hosts the cluster. For information about the regions supported by CelerData, see Supported cloud platforms and regions. |
FE HA mode | No | Enable or disable the FE HA mode. The FE HA mode is disabled by default. If the FE HA mode is disabled, only one FE will be deployed. This setting is recommended if you create a proof-of-concept cluster to learn about what CelerData can do for you, or if you create a small cluster just for testing purposes. If the FE HA mode is enabled, three FEs will be deployed. This setting is recommended if you create a cluster for a production-ready environment. With three FEs, the cluster can process many more highly concurrent queries while ensuring high availability. |
FE instance type | Yes | Select an instance type for the FE nodes in the cluster. For information about the instance types supported by CelerData, see Supported instance types. |
BE instance type | Yes | Select an instance type for the BE nodes in the cluster. For information about the instance types supported by CelerData, see Supported instance types. |
BE storage size | Yes | Specify the storage capacity that each BE node provides in the cluster. |
BE node count | Yes | Specify the number of BE nodes you want to deploy in the cluster. You can determine the number of BEs based on the amount of data to process. The default value is 3, because CelerData needs to store each table in three replicas on three different BEs. |

- For an elastic cluster, configure the following configuration items.

Parameter | Required | Description |
---|---|---|
Cluster name | Yes | Enter the name of the cluster. The name cannot be changed after the cluster is created. We recommend that you enter an informative name that helps you identify the cluster easily later. |
GCP region | Yes | Select the GCP region that hosts the cluster. For information about the regions supported by CelerData, see Supported cloud platforms and regions. |
Coordinator HA mode | No | Enable or disable the coordinator HA mode. The coordinator HA mode is disabled by default. If the coordinator HA mode is disabled, only one coordinator will be deployed. This setting is recommended if you create a proof-of-concept cluster to learn about what CelerData can do for you, or if you create a small cluster just for testing purposes. If the coordinator HA mode is enabled, three coordinators will be deployed. This setting is recommended if you create a cluster for a production-ready environment. With three coordinators, the cluster can process many more highly concurrent queries while ensuring high availability. |
Coordinator node size | Yes | Select an instance type for the coordinator nodes in the cluster. For information about the instance types supported by CelerData, see Supported instance types. |
Compute node size | Yes | Select an instance type for the compute nodes of the default warehouse in the cluster. For information about the instance types supported by CelerData, see Supported instance types. |
Compute storage size | No | Specify the storage size for the compute nodes of the default warehouse. You can also customize the number of volumes by ticking the box next to this field. This field is available for Hyperdisk-backed instance types only. |
Compute node count | Yes | Specify the number of compute nodes for the default warehouse in the cluster. You can determine the number of compute nodes based on the amount of data to process. The default value is 1. |

In Advanced Settings, you can further define a storage autoscaling policy for FE and BE nodes in classic clusters or coordinator nodes in elastic clusters. If the workload of your business is unpredictable and you cannot allocate a fixed number of storage volumes at cluster creation time, you can enable storage autoscaling for nodes in your CelerData cluster. With this feature enabled, CelerData automatically scales up the storage size when it detects that you are running out of the preset storage space.
Follow these steps:
- Turn on the switch next to FE storage, BE storage, or Coordinator Storage to enable storage autoscaling for the corresponding nodes.
- Set the storage usage threshold (in percentage) that triggers an autoscaling operation. You can set this threshold between 80% and 90%. When the storage usage of a node reaches this threshold and stays there for more than five minutes, CelerData scales up its storage by the step size you define in the next step.
- Set the step size of each autoscaling operation. You can set the step size as a fixed size (GB) or as a percentage, for example, 50 GB or 15% (of the original storage size).
- Set the maximum storage size of each node. CelerData stops scaling up the storage when its size reaches this limit.
NOTE
- Two scaling operations (whether manual scaling or autoscaling) must be at least six hours apart.
- The maximum size of each storage is 64 TB.
- Compute Nodes in Elastic clusters do not support autoscaling.
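The autoscaling rules above can be sketched as a small calculation. This is an illustrative model only, not CelerData's implementation: the names `AutoscalingPolicy` and `next_storage_size` are hypothetical, 1 TB is assumed to equal 1024 GB, and clamping the final step to the maximum size (rather than skipping it) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

MAX_STORAGE_GB = 64 * 1024   # 64 TB limit from the note; assumes 1 TB = 1024 GB
MIN_INTERVAL_HOURS = 6       # minimum gap between two scaling operations
SUSTAINED_MINUTES = 5        # usage must stay at the threshold longer than this


@dataclass
class AutoscalingPolicy:
    """Hypothetical container for the settings chosen in the wizard."""
    threshold_pct: float               # usage threshold, 80 to 90
    max_size_gb: float                 # maximum storage size
    step_gb: Optional[float] = None    # fixed step size in GB, or
    step_pct: Optional[float] = None   # step as a percentage of the original size

    def __post_init__(self) -> None:
        if not 80 <= self.threshold_pct <= 90:
            raise ValueError("threshold must be between 80% and 90%")
        if (self.step_gb is None) == (self.step_pct is None):
            raise ValueError("set exactly one of step_gb or step_pct")
        if not 0 < self.max_size_gb <= MAX_STORAGE_GB:
            raise ValueError("maximum size must be positive and at most 64 TB")


def next_storage_size(policy: AutoscalingPolicy, original_gb: float,
                      current_gb: float, usage_pct: float,
                      minutes_at_threshold: float,
                      hours_since_last_scaling: float) -> float:
    """Return the storage size after one evaluation (unchanged if no scaling)."""
    sustained = (usage_pct >= policy.threshold_pct
                 and minutes_at_threshold > SUSTAINED_MINUTES)
    if not sustained or hours_since_last_scaling < MIN_INTERVAL_HOURS:
        return current_gb
    if policy.step_gb is not None:
        step = policy.step_gb
    else:
        step = original_gb * policy.step_pct / 100
    # Assumption: the last step is clamped to the configured maximum.
    return min(current_gb + step, policy.max_size_gb)
```

For example, with a threshold of 85%, a step of 15%, and an original size of 200 GB, a node that stays at 90% usage for ten minutes would grow by 30 GB, provided the previous scaling operation happened at least six hours earlier.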
STEP2: Set up your GCP credentials
In this step, you either create a new data credential and a new deployment credential, or select existing ones that CelerData created automatically upon a previous successful deployment. After you complete the configurations, click Next to continue.
If you are new to CelerData, we recommend that you create a new data credential and a new deployment credential.
Choose to create new credentials
Before you create new credentials, you must create a Google Cloud project and enable the necessary APIs.
Select No, I need to manually create new credentials from scratch, as shown in the following figure. Then, create a data credential and a deployment credential.
Create a data credential
- Sign in to your project in the Google Cloud console and follow the instructions provided in Create a Service Account for your Compute Engine to create a service account that can access your Cloud Storage bucket. Then, copy the service account name from the Name field on the IAM & Admin > Service accounts page, copy the bucket name from the Name field on the Cloud Storage > Buckets page, and save both to a location that you can access later.
- Return to the CelerData Cloud BYOC console. In the Data credential section, enter the name of your bucket in the Bucket name field, and paste the service account name into the Instance service account field.
The following table describes the fields in the Data credential section.
Field | Required | Description |
---|---|---|
Bucket name | Yes | Enter the name of your bucket. NOTE When you create a cluster, you can only use a data credential that references a bucket located in the same region as the cluster. |
Instance service account | Yes | Enter the name of the instance service account that you have created to grant your CelerData cluster permission to access your bucket. |
Create a deployment credential
- In the CelerData Cloud BYOC console, click the copy button next to Grant access to to copy CelerData's public service account for cluster deployment.
- Sign in to your project in the Google Cloud console and follow the instructions provided in Grant Google Cloud Resource Permissions to CelerData to grant CelerData the necessary permissions for cluster deployment. Then, copy the Project ID from the Project info section of your project dashboard, and save it to a location that you can access later.
- Return to the CelerData Cloud BYOC console. In the Deployment credential section, paste your project ID into the Project ID field. Note that this is the project ID, not the project name.
The following table describes the fields in the Deployment credential section.
Field | Required | Description |
---|---|---|
Credential method | N/A | The type of credential that you use to control the permissions of CelerData to launch resources in GCP. The value is fixed at IAM role. |
Grant access to | N/A | CelerData's public service account used to launch and manage resources in your GCP project. You need to grant the necessary permissions to this service account. |
Project ID | Yes | Enter the ID of the Google Cloud project that you have created to deploy your CelerData cluster in GCP. Note that this is the project ID, not the project name. |
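Because the project ID and the project name are easy to confuse, a quick format check before pasting can help. The sketch below relies on the general Google Cloud rule that a project ID is 6 to 30 characters long, contains only lowercase letters, digits, and hyphens, starts with a letter, and does not end with a hyphen. The function name `looks_like_project_id` is hypothetical. It checks the format only: it cannot confirm that the project exists, and a project name that happens to follow the same format would also pass.

```python
import re

# Format rule for Google Cloud project IDs: 6 to 30 characters, lowercase
# letters, digits, and hyphens; starts with a letter; no trailing hyphen.
PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def looks_like_project_id(value: str) -> bool:
    """Return True if value has the shape of a project ID (format check only)."""
    return PROJECT_ID_RE.fullmatch(value) is not None


# A display name with spaces and capitals fails; a typical ID passes.
print(looks_like_project_id("My Analytics Project"))
print(looks_like_project_id("my-analytics-123456"))
```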
Choose to select existing credentials
Select Yes, I hope to reuse the previous credentials and cloud storage, as shown in the following figure. Then, select an existing data credential and an existing deployment credential.
Select a data credential
In the Data credential section, expand the Reuse data credential drop-down list and select a data credential that belongs to the same GCP region (namely, the one you selected in STEP1) as the cluster.
NOTE
The drop-down list displays all data credentials that you have manually created and those that are automatically created by CelerData upon previous successful deployments.
After you select a data credential, CelerData automatically fills in the Bucket name and Instance service account fields.
The following table describes the fields in the Data credential section.
Field | Required | Description |
---|---|---|
Bucket name | Yes | Enter the name of your bucket. NOTE When you create a cluster, you can only use a data credential that references a bucket located in the same region as the cluster. |
Instance service account | Yes | Enter the name of the instance service account that you have created to grant CelerData permission to access your bucket. |
Select a deployment credential
In the Deployment credential section, expand the Reuse deployment credential drop-down list and select a deployment credential that belongs to the same GCP region (namely, the one you selected in STEP1) as the cluster.
NOTE
The drop-down list displays all deployment credentials that you have manually created and those that are automatically created by CelerData upon previous successful deployments.
After you select a deployment credential, CelerData automatically fills in the Project ID field.
The following table describes the field in the Deployment credential section.
Field | Required | Description |
---|---|---|
Project ID | Yes | Enter the ID of the Google Cloud project that you have created to launch and manage resources in GCP. |
STEP3: Configure access to the cluster
In this step, you need to:
- Configure a network configuration. You can create a new network configuration or select an existing one that is automatically created by CelerData upon a previous successful deployment.
  NOTE
  CelerData allows you to reuse the same network configuration among multiple clusters, meaning that multiple clusters can share the same VPC and subnet.
- Configure the cluster credential.
- Test connectivity.
Configure a network configuration
In the Network configuration section, create a new network configuration or select an existing one.
If you are new to CelerData, we recommend that you create a new network configuration.
Create a new network configuration
- Sign in to your project in the Google Cloud console and follow the instructions provided in Create a VPC Network, a Subnet, and Firewall Rules for CelerData on GCP to create a VPC network, a subnet, and firewall rules. Then, copy the subnet name from the Subnets tab on the VPC network details page of your VPC network, copy the target tag that you set for the firewall rules from the Targets field on the Network Security > Firewall policies page, and save both to a location that you can access later.
  NOTE
  The subnet you use for cluster deployment must reside in the same region where the cluster is deployed (namely, the one you selected in STEP1).
- Return to the CelerData Cloud BYOC console. In the Network Configuration section, paste the subnet name into the Subnet name field and the target tag that you set for the firewall rules into the Network tag field.
If you want to enable End-to-End Private Link, tick Advanced security settings and paste your Private Service Connect connection ID into the PSC connection ID field. To create a Private Service Connect endpoint, follow the instructions provided in Create a Private Service Connect Endpoint. If you still want to access the cluster console over the public network when Private Link is enabled, tick Enable public access to the Cluster console. If public access is enabled, CelerData's VPC communicates with your own VPC over the Internet. For more information on CelerData's Private Link, see End-to-End Private Link Architecture and End-to-End Private Link Configuration and Deployment.
The following table describes the fields in the Network Configuration section.
Field | Required | Description |
---|---|---|
Subnet name | Yes | The name of the subnet that you use to deploy cluster nodes for data analysis. |
Network tag | Yes | The target tag that you set for the firewall rules that you use to enable connectivity between cluster nodes within your own VPC and between CelerData's VPC and your own VPC over TLS. |
PSC connection ID | No | The ID of the Private Service Connect Connection that you create to allow direct, secure connectivity between CelerData's VPC and your own VPC. For information about how to create a PSC Connection, see Create a Private Service Connect Endpoint. NOTE If PSC connection ID is not set, the Private Service Connect connection is not established, and CelerData's VPC communicates with your own VPC over the Internet. |
Select an existing network configuration
Expand the Reuse network configuration drop-down list and select a network configuration that belongs to the same GCP region (namely, the one you selected in STEP1) as the cluster.
NOTE
The drop-down list displays the network configurations that you have manually created and those that are automatically created by CelerData upon previous successful deployments.
After you select a network configuration, CelerData automatically fills in the Subnet name, Network tag, and (optional) PSC connection ID fields.
The following table describes the fields in the Network Configuration section.
Field | Required | Description |
---|---|---|
Subnet name | Yes | The name of the subnet that you use to deploy cluster nodes for data analysis. |
Network tag | Yes | The target tag that you set for the firewall rules that you use to enable connectivity between cluster nodes within your own VPC and between CelerData's VPC and your own VPC over TLS. |
PSC connection ID | No | The ID of the Private Service Connect endpoint that you create to allow direct, secure connectivity between CelerData's VPC and your own VPC. For information about how to create a PSC Connection, see Create a Private Service Connect Endpoint. NOTE If PSC connection ID is not set, the Private Service Connect connection is not established, and CelerData's VPC communicates with your own VPC over the Internet. |
Configure the cluster credential
In the Cluster credential section, enter a password for the admin account in the Admin password field or click Random next to the Admin password field to obtain a password that is generated by CelerData.
NOTE
- The admin account is the administrator of the cluster and has all privileges enabled within your CelerData cloud account.
- After the cluster is created, you can navigate to the cluster details page and in the upper-right corner of the page choose Manage > Reset password to reset the password of the admin account. Note that you can reset the password of the admin account only when the cluster is in the Running state.
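If you prefer to generate the admin password yourself rather than click Random, the sketch below shows one way to do it with Python's standard `secrets` module. CelerData's password rules are not stated in this section, so the default length and the requirement of at least one lowercase letter, one uppercase letter, and one digit are assumptions; adjust them to whatever the console accepts. The function name `generate_admin_password` is hypothetical.

```python
import secrets
import string


def generate_admin_password(length: int = 16) -> str:
    """Return a random password containing lowercase, uppercase, and digits.

    Assumption: letters and digits are accepted by the console; the actual
    password policy is not documented here.
    """
    if length < 8:
        raise ValueError("length must be at least 8")
    alphabet = string.ascii_letters + string.digits
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every assumed character class is present.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate
```

Store the generated password in a secrets manager or another safe location before you paste it into the Admin password field.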
Test connectivity
- Click Test connect to verify that CelerData's VPC can connect with your own VPC.
  After the connection passes the test, the Start create button is enabled.
- Click Start create to continue.
STEP4: Deploy the cluster on GCP
After you complete the preceding three steps, CelerData automatically launches cloud resources and deploys the cluster in your own VPC. This takes a few minutes.
When the deployment is complete, a message shown in the following figure appears.
You can click Preview Cluster in the message to view the cluster. You can also return to the Clusters page to view the cluster, which is in the Running state, upon successful deployment.
What's next
At any time, you can connect to the cluster from a JDBC driver or a MySQL client or by using the SQL Editor in the CelerData console. For more information, see Connect to a CelerData cluster.
You can also view and manage the cluster to your needs in the CelerData Cloud BYOC console: