End-to-End PrivateLink
AWS PrivateLink enables a secure connection between your VPC and the CelerData BYOC environment, preventing data exposure to the Internet. With End-to-End PrivateLink enabled, all traffic between your client and the CelerData BYOC environment is secured, ensuring that sensitive data and queries are protected.
This topic compares the general architecture, which may expose certain traffic to the public Internet, with the End-to-End PrivateLink solution, which routes all traffic securely through the AWS backbone network.
General architecture
In the standard setup, a CelerData cluster is created in an AWS public subnet through the CelerData Cloud BYOC Console. The communication flows as follows:
- Cloud Console access: The user logs in to the CelerData Cloud Console over the public Internet (channel 5), selects a cluster, and opens it.
- Cluster Console access: The user accesses the Cluster Console over the public Internet (channel 3).
- SQL query submission:
  - The user submits SQL queries using the Cluster Console (channel 3). The Cluster Console forwards these queries to the cluster (channel 2). After the query is executed, the results are returned via the same channels.
  - Alternatively, the user submits SQL queries directly to the cluster via the MySQL protocol (channel 4). After the query is executed, the results are returned via the same channel.
- S3 data access: The cluster accesses query profiles and other data in S3 (channel 1).
The security concerns are:
- Channels 3, 4, and 5 expose traffic to the public Internet.
- Channels 1 and 2 use the AWS backbone network if VPC endpoints are configured; otherwise, they may also traverse the public Internet.
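The channel layout and its exposure rules can be summarized in a small model. The following is only an illustrative sketch: the `Channel` class and the `possibly_public_channels` function are hypothetical names introduced here, and the channel numbers and descriptions are taken from the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Channel:
    number: int
    purpose: str
    # True if the channel uses the public Internet in the general architecture.
    always_public: bool


# Channel numbers and purposes follow the general architecture described above.
CHANNELS = [
    Channel(1, "cluster <-> S3 (query profiles and other data)", False),
    Channel(2, "Cluster Console <-> cluster", False),
    Channel(3, "user <-> Cluster Console", True),
    Channel(4, "user <-> cluster (MySQL protocol)", True),
    Channel(5, "user <-> Cloud Console", True),
]


def possibly_public_channels(vpc_endpoints_configured: bool) -> List[int]:
    """Return the channels that may traverse the public Internet."""
    exposed = []
    for channel in CHANNELS:
        # Channels 1 and 2 stay on the AWS backbone only if VPC endpoints exist.
        if channel.always_public or not vpc_endpoints_configured:
            exposed.append(channel.number)
    return exposed


print(possibly_public_channels(True))   # [3, 4, 5]
print(possibly_public_channels(False))  # [1, 2, 3, 4, 5]
```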
End-to-End PrivateLink solution
The End-to-End PrivateLink solution secures traffic by routing it through the AWS backbone network via VPC endpoints. Here’s how it works:
- VPC endpoint for connection (channel 2): You can configure a VPC endpoint for channel 2 so that the traffic previously carried on channel 3 is merged into channel 2. This secures communication between the Cluster Console and the cluster.
- PrivateLink for S3: You can also configure PrivateLink for S3 to secure the cluster’s access to S3 data (channel 1).
- Secure client access (channel 4): You can access the cluster over a VPN configured by your cluster administrator. This ensures that SQL queries submitted via the MySQL protocol and their results remain within your VPC.
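One rough way to sanity-check the client path (channel 4) is to confirm that the cluster's hostname resolves only to private IP addresses from inside the VPN. The sketch below uses only the Python standard library. The function names are hypothetical, and a private address is only a hint, not proof, that traffic stays off the public Internet (note that `is_private` also returns true for loopback and link-local addresses).

```python
import ipaddress
import socket
from typing import Iterable, List


def resolve_addresses(host: str, port: int) -> List[str]:
    """Resolve a hostname to the distinct IP addresses it maps to."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Drop any IPv6 scope suffix such as "%eth0" before returning.
    return sorted({info[4][0].split("%")[0] for info in infos})


def all_private(addresses: Iterable[str]) -> bool:
    """Return True only if there is at least one address and all are private."""
    addrs = list(addresses)
    return bool(addrs) and all(ipaddress.ip_address(a).is_private for a in addrs)


# Example usage (hostname and port are placeholders; replace with your own):
# addresses = resolve_addresses("your-cluster-endpoint.example.internal", 9030)
# print(addresses, all_private(addresses))
```

Run the check from a machine connected through the VPN, because the same hostname may resolve differently outside the VPC.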
The key benefits of the End-to-End PrivateLink solution are:
- Public Internet access to the Cluster Console (channel 3-b) and direct MySQL protocol access (channel 4) are disabled, securing SQL queries and results.
- Temporary public access to the Cluster Console can be enabled for testing or special use cases but should remain disabled during normal operation.
- Channel 5 (Cloud Console access) is not secured by PrivateLink but does not involve SQL queries or sensitive data.
Deployment and configuration
Enable End-to-End PrivateLink
You can enable the End-to-End PrivateLink solution for a cluster only when deploying the cluster.
First, create a VPC endpoint for connection and copy its VPC endpoint ID. Then, in STEP3 of the deployment wizard, enable Advanced security settings and paste the ID into the VPC endpoint ID field. For more information, see Deployment on AWS.
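If you prefer to script the endpoint creation instead of using the AWS console, a minimal sketch with boto3 might look like the following. It assumes boto3 is installed and AWS credentials are configured. The helper names are hypothetical, and the endpoint service name is a placeholder: use the value given in Deployment on AWS for your region.

```python
from typing import Any, Dict, Sequence


def interface_endpoint_params(
    vpc_id: str,
    service_name: str,
    subnet_ids: Sequence[str],
    security_group_ids: Sequence[str],
) -> Dict[str, Any]:
    """Build the keyword arguments for EC2 create_vpc_endpoint."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # Placeholder: the CelerData endpoint service name is not given here.
        "ServiceName": service_name,
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
    }


def create_endpoint(params: Dict[str, Any], region: str) -> str:
    """Create the endpoint and return its ID for the deployment wizard."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]
```

The ID returned by `create_endpoint` is the value to paste into the VPC endpoint ID field.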
If you want to use a private subnet without Internet access for your cluster (which is typically the case), make sure that you also create a VPC endpoint for S3. Without it, the cluster cannot store query profiles and other information in your S3 bucket, and it cannot read data from S3 when loading data or querying data in external catalogs.
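Before deploying, you may want to confirm that the VPC already has a usable S3 endpoint. The sketch below inspects the response of the EC2 `describe_vpc_endpoints` call. It assumes boto3 and configured credentials, the standard S3 service name format `com.amazonaws.<region>.s3`, and hypothetical helper names. It reads only the first page of results.

```python
from typing import Any, Dict


def s3_service_name(region: str) -> str:
    """Service name of the S3 endpoint in standard AWS regions (assumption)."""
    return f"com.amazonaws.{region}.s3"


def has_available_s3_endpoint(response: Dict[str, Any], region: str) -> bool:
    """Check a describe_vpc_endpoints response for an available S3 endpoint."""
    wanted = s3_service_name(region)
    for endpoint in response.get("VpcEndpoints", []):
        state = str(endpoint.get("State", "")).lower()
        if endpoint.get("ServiceName") == wanted and state == "available":
            return True
    return False


def fetch_endpoints(vpc_id: str, region: str) -> Dict[str, Any]:
    """Fetch the first page of VPC endpoints that belong to the given VPC."""
    import boto3  # imported lazily so the check above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.describe_vpc_endpoints(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )
```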
If you are unfamiliar with the PrivateLink solution and network configuration, we recommend that you enable public access to the cluster console for easier access after setup. When public access is enabled, cluster users can access the console via the public Internet.
Once the cluster is successfully deployed and the PrivateLink solution is implemented, you can disable public access to secure the connection. This will restrict all users to accessing the cluster console only through PrivateLink. For more information, see Manage public access with PrivateLink.
When End-to-End PrivateLink is enabled, cluster users can access the cluster console:
- Only via the private subnet if public access is disabled. To do so, click the Open cluster button on the cluster details page.
- Via both the private subnet and the public Internet if public access is enabled.
  - To access via the public Internet, click the Open cluster button directly on the cluster details page.
  - To access via the private subnet, select Open cluster from private link from the drop-down list next to the Open cluster button.
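The access rules above can be restated as a small lookup. This is only an illustrative sketch, and `console_access_options` is a hypothetical name. It maps each available network path to the UI action described above.

```python
from typing import Dict


def console_access_options(public_access_enabled: bool) -> Dict[str, str]:
    """Map each available network path to the UI action that uses it."""
    if public_access_enabled:
        return {
            "public Internet": "Open cluster",
            "private subnet": "Open cluster from private link",
        }
    # With public access disabled, Open cluster goes through the private subnet.
    return {"private subnet": "Open cluster"}


print(console_access_options(False))
print(console_access_options(True))
```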