End-to-End Private Link
For security reasons, you may want to secure all traffic between your client and the CelerData Cloud service, and between your CelerData cluster and the CelerData Cloud service, so that sensitive data and queries are protected. The End-to-End Private Link solution meets this requirement by keeping all traffic within the cloud provider's internal network.
The End-to-End Private Link solution depends on AWS PrivateLink or GCP Private Service Connect.
This topic compares the general architecture, which may expose certain traffic to the public Internet, with the End-to-End Private Link solution, which routes all traffic securely through the cloud provider's backbone network. The following sections use AWS as an example to illustrate the End-to-End Private Link solution. The setup on GCP is similar.
General architecture
In the standard setup, a CelerData cluster is created in a public subnet through the CelerData Cloud BYOC Console. Here, the public subnet means that the cluster can communicate with the CelerData Cloud service over the public internet and clients can access the cluster over the public internet too. The communication flows as follows:
- Cloud Console access: The user logs in to the CelerData Cloud Console over the public Internet (channel 5), selects a cluster, and opens it.
- Cluster Console access: The user accesses the Cluster Console over the public Internet (channel 3).
- SQL query submission:
  - The user submits SQL queries using the Cluster Console (channel 3). The Cluster Console forwards these queries to the cluster (channel 2). After the query is executed, the results are returned via the same channels.
  - Alternatively, the user submits SQL queries directly to the cluster via the MySQL protocol (channel 4). After the query is executed, the results are returned via the same channel.
- S3 data access: The cluster accesses query profiles and other data in S3 (or Google Cloud Storage on GCP) (channel 1).
The security concerns are:
- Channels 3, 4, and 5 expose traffic to the public Internet.
- Channels 1 and 2 use the cloud provider's backbone network if VPC endpoints (or PSC endpoints in GCP) are configured; otherwise, they may also traverse the public Internet.
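The exposure rules above can be summarized as a small model. The following is a minimal sketch for illustration only: the `Channel` class, the channel descriptions, and the `exposed_channels` helper are hypothetical names that are not part of any CelerData API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Channel:
    number: int
    description: str
    # True if the general architecture always sends this traffic
    # over the public Internet.
    always_public: bool


# Channels as described in the general architecture (descriptions are paraphrased).
CHANNELS = [
    Channel(1, "cluster <-> S3 or Google Cloud Storage", always_public=False),
    Channel(2, "Cluster Console <-> cluster", always_public=False),
    Channel(3, "user <-> Cluster Console", always_public=True),
    Channel(4, "user <-> cluster (MySQL protocol)", always_public=True),
    Channel(5, "user <-> Cloud Console", always_public=True),
]


def exposed_channels(endpoints_configured: bool) -> List[int]:
    """Return the channels that may traverse the public Internet.

    Channels 1 and 2 stay on the backbone network only when VPC (or PSC)
    endpoints are configured.
    """
    return [
        c.number
        for c in CHANNELS
        if c.always_public or not endpoints_configured
    ]
```

With endpoints configured, only channels 3, 4, and 5 remain exposed. Without them, every channel may cross the public Internet.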
End-to-End Private Link solution
The End-to-End Private Link solution secures traffic by routing it through the cloud provider's backbone network via VPC endpoints (or PSC endpoints in GCP). Here’s how it works:
- VPC endpoint (or PSC endpoint in GCP) for connection (Channel 2): You can configure a VPC endpoint for channel 2 to merge the traffic previously on channel 3 with channel 2. This secures communication between the Cluster Console and the cluster.
- PrivateLink for S3: You can also configure a PrivateLink for S3 to secure the cluster’s access to S3 data (channel 1).
  NOTE
  For GCP, you need to configure the network for private access to Google APIs and services.
- Secure client access (Channel 4): You can access the cluster over a VPN configured by your cluster administrator. This ensures that SQL queries submitted via the MySQL protocol and their results remain within your VPC.
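One way to sanity-check that client traffic on channel 4 stays inside the VPC is to confirm, from a machine on the VPN, that the cluster's hostname resolves only to private IP addresses. The following is a minimal sketch using the Python standard library. The helper names are hypothetical, and the host and port you pass in depend on your own cluster.

```python
import ipaddress
import socket
from typing import Iterable, List


def all_private(addresses: Iterable[str]) -> bool:
    """Return True only if every address is in a private range.

    An empty input returns False, because nothing was verified.
    """
    parsed = [ipaddress.ip_address(a) for a in addresses]
    return bool(parsed) and all(a.is_private for a in parsed)


def resolve(host: str, port: int) -> List[str]:
    """Resolve a hostname to its unique IP addresses (requires DNS access)."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Drop any IPv6 scope suffix such as "%eth0" before returning.
    return sorted({info[4][0].split("%")[0] for info in infos})
```

For example, `all_private(resolve(host, port))` should return `True` for the cluster's MySQL protocol endpoint when it is resolved from inside the VPC. Note that `is_private` also treats loopback and link-local addresses as private, so this is a coarse check rather than proof of the routing path.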
The key benefits of the End-to-End Private Link solution are:
- Public Internet access to the Cluster Console (channel 3-b) and direct MySQL protocol access (channel 4) are disabled, securing SQL queries and results.
- Temporary public access to the Cluster Console can be enabled for testing or special use cases but should remain disabled during normal operation.
- Channel 5 (Cloud Console access) is not secured by PrivateLink (or GCP Private Service Connect) but does not involve SQL queries or sensitive data.
Deployment and Configuration
Enable End-to-End Private Link
You can enable the End-to-End Private Link solution for a cluster only when deploying the cluster.
- In an AWS environment, first create a VPC endpoint for connection and copy the VPC endpoint ID. Then, after you have enabled Advanced security settings in STEP3 of the Deployment wizard, paste the ID into the VPC endpoint ID field. For more information, see Deployment on AWS.
- In a GCP environment, first create a Private Service Connect endpoint and copy the PSC Connection ID. Then, after you have enabled Advanced security settings in STEP3 of the Deployment wizard, paste the ID into the PSC Connection ID field. For more information, see Deployment on GCP.
You can also use a private subnet without internet access (which is typically the case) for your cluster.
- For AWS, ensure that you also create a VPC endpoint for S3. Without it, the cluster cannot store query profiles and other information in your S3 bucket, and it cannot read data from S3 when loading data or querying data in external catalogs.
- For GCP, besides creating a PSC endpoint, you also need to configure the network for private access to Google APIs and services.
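For AWS, the two endpoints mentioned above (the VPC endpoint for connection and the VPC endpoint for S3) can also be created programmatically. The following is a hedged sketch that builds the request parameters for the boto3 `create_vpc_endpoint` call. The helper names are hypothetical, the service name of the connection endpoint is a placeholder that you must take from the deployment instructions, and the S3 endpoint is assumed here to be a gateway endpoint, which this topic does not specify (an interface endpoint is also possible).

```python
from typing import Any, Dict, Iterable


def interface_endpoint_params(
    vpc_id: str,
    service_name: str,  # placeholder: use the service name given for your deployment
    subnet_ids: Iterable[str],
    security_group_ids: Iterable[str],
) -> Dict[str, Any]:
    """Build parameters for an interface VPC endpoint (the endpoint for connection)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": service_name,
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
    }


def s3_gateway_endpoint_params(
    vpc_id: str, region: str, route_table_ids: Iterable[str]
) -> Dict[str, Any]:
    """Build parameters for an S3 gateway endpoint (assumed endpoint type)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


def create_endpoint(params: Dict[str, Any], region: str) -> str:
    """Create the endpoint and return its ID (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders work without the AWS SDK

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]
```

The ID returned by `create_endpoint` for the connection endpoint is the value to paste into the VPC endpoint ID field of the Deployment wizard.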
If you are unfamiliar with the Private Link solution and network configuration, we recommend that you enable public access to the Cluster console for easier access after setup. When public access is enabled, cluster users can access the console via the public internet. For GCP, you must configure the network for Private Link before disabling public access. Otherwise, you will be unable to connect to the CelerData console.
Once the cluster is successfully deployed and the Private Link solution is implemented, you can disable public access to secure the connection. This will restrict all users to accessing the cluster console only through Private Link. For more information, see Manage public access with Private Link.
When End-to-End Private Link is enabled, cluster users can access the cluster console:
- only via the private subnet, by clicking the Open cluster button on the cluster details page, if public access is disabled.
- via both the private subnet and the public internet if public access is enabled.
  - To access via the public internet, click the Open cluster button directly on the cluster details page.
  - To access via the private subnet, select Open cluster from private link from the drop-down list next to the Open cluster button.
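The access rules above amount to a small decision table. The following sketch restates them in code for illustration only. The function name is hypothetical and does not correspond to any CelerData API.

```python
from typing import List, Tuple


def console_access_options(public_access_enabled: bool) -> List[Tuple[str, str]]:
    """Return (UI action, network path) pairs for a cluster with
    End-to-End Private Link enabled."""
    if not public_access_enabled:
        # Public access disabled: the Open cluster button uses the private subnet.
        return [("Open cluster", "private subnet")]
    # Public access enabled: both paths are available.
    return [
        ("Open cluster", "public internet"),
        ("Open cluster from private link", "private subnet"),
    ]
```

Note that the Open cluster button leads to a different network path depending on whether public access is enabled.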