End-to-End PrivateLink
AWS PrivateLink enables a secure connection between your VPC and the CelerData BYOC environment, preventing data exposure to the Internet. With End-to-End PrivateLink enabled, all traffic between your client and the CelerData BYOC environment is secured, ensuring that sensitive data and queries are protected.
This topic compares the general architecture, which may expose certain traffic to the public Internet, with the End-to-End PrivateLink solution, which routes all traffic securely through the AWS backbone network.
General architecture
In the standard setup, a CelerData cluster is created in an AWS public subnet through the CelerData Cloud BYOC Console. The communication flows as follows:
- Cloud Console access: The user logs in to the CelerData Cloud Console over the public Internet (channel 5), selects a cluster, and opens it.
- Cluster Console access: The user accesses the Cluster Console over the public Internet (channel 3).
- SQL query submission:
  - The user submits SQL queries using the Cluster Console (channel 3). The Cluster Console forwards these queries to the cluster (channel 2). After the query is executed, the results are returned along the same path.
  - Alternatively, the user submits SQL queries directly to the cluster via the MySQL protocol (channel 4). After the query is executed, the results are returned via the same channel.
- S3 data access: The cluster accesses query profiles and other data from S3 (channel 1).
The security concerns are:
- Channels 3, 4, and 5 expose traffic to the public Internet.
- Channels 1 and 2 use the AWS backbone network if VPC endpoints are configured; otherwise, they may also traverse the public Internet.
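The exposure rules above can be captured in a small model. This is only an illustrative sketch: the `Channel` class and the function names are hypothetical, not part of any CelerData or AWS API. The channel numbers and descriptions follow the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Channel:
    number: int
    description: str
    # True if the channel crosses the public Internet in the general
    # architecture regardless of whether VPC endpoints are configured.
    always_public: bool


# Hypothetical model of the five channels in the general architecture.
GENERAL_ARCHITECTURE = [
    Channel(1, "Cluster to S3", always_public=False),
    Channel(2, "Cluster Console to cluster", always_public=False),
    Channel(3, "User to Cluster Console", always_public=True),
    Channel(4, "User to cluster over the MySQL protocol", always_public=True),
    Channel(5, "User to Cloud Console", always_public=True),
]


def exposure(channel: Channel, vpc_endpoints_configured: bool) -> str:
    """Classify the path a channel takes in the general architecture."""
    if channel.always_public:
        return "public Internet"
    if vpc_endpoints_configured:
        return "AWS backbone"
    return "possibly public Internet"


def exposed_channels(vpc_endpoints_configured: bool) -> List[int]:
    """Return the channels that are not guaranteed to stay on the AWS backbone."""
    return [
        c.number
        for c in GENERAL_ARCHITECTURE
        if exposure(c, vpc_endpoints_configured) != "AWS backbone"
    ]


if __name__ == "__main__":
    for configured in (False, True):
        print(f"VPC endpoints configured={configured}: "
              f"exposed channels {exposed_channels(configured)}")
```

Even with VPC endpoints in place, channels 3, 4, and 5 remain exposed in this model, which is the gap the End-to-End PrivateLink solution addresses.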
End-to-End PrivateLink solution
The End-to-End PrivateLink solution secures traffic by routing it through the AWS backbone network via VPC endpoints. Here’s how it works:
- VPC endpoint for connection (channel 2): You can configure a VPC endpoint for channel 2 so that the traffic that previously used channel 3 is carried over channel 2 instead. This secures communication between the Cluster Console and the cluster.
- PrivateLink for S3: You can also configure PrivateLink for S3 to secure the cluster’s access to S3 data (channel 1).
- Secure client access (channel 4): You can access the cluster over a VPN configured by your cluster administrator. This ensures that SQL queries submitted via the MySQL protocol and their results remain within your VPC.
The key benefits of the End-to-End PrivateLink solution are:
- Public Internet access to the Cluster Console (channel 3-b) and to the cluster over the MySQL protocol (channel 4) is disabled, which keeps SQL queries and results off the public Internet.
- Temporary public access to the Cluster Console can be enabled for testing or special use cases but should remain disabled during normal operation.
- Channel 5 (Cloud Console access) is not secured by PrivateLink but does not involve SQL queries or sensitive data.
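As a rough illustration, the points above can be treated as a checklist for auditing a cluster's access settings. The `AccessSettings` class and its field names are hypothetical and do not correspond to a real CelerData API. They only mirror the items described in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AccessSettings:
    # All field names are hypothetical; they mirror the items in this section.
    connection_endpoint_configured: bool  # VPC endpoint for connection (channel 2)
    s3_endpoint_configured: bool          # PrivateLink for S3 (channel 1)
    public_cluster_console_enabled: bool  # public access to the Cluster Console (channel 3-b)
    public_mysql_access_enabled: bool     # public MySQL protocol access (channel 4)


def find_gaps(settings: AccessSettings) -> List[str]:
    """List the settings that would let SQL traffic leave the AWS backbone.

    Channel 5 (Cloud Console access) is deliberately not checked because
    it does not carry SQL queries or sensitive data.
    """
    gaps = []
    if not settings.s3_endpoint_configured:
        gaps.append("channel 1: PrivateLink for S3 is not configured")
    if not settings.connection_endpoint_configured:
        gaps.append("channel 2: VPC endpoint for connection is not configured")
    if settings.public_cluster_console_enabled:
        gaps.append("channel 3-b: public Cluster Console access is enabled")
    if settings.public_mysql_access_enabled:
        gaps.append("channel 4: public MySQL protocol access is enabled")
    return gaps


if __name__ == "__main__":
    # Example: public Cluster Console access was enabled temporarily for testing.
    testing = AccessSettings(True, True, True, False)
    for gap in find_gaps(testing):
        print("WARNING:", gap)
```

A check like this could flag a cluster where temporary public access to the Cluster Console was enabled for testing and never disabled afterwards.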
Usage
You can enable the End-to-End PrivateLink solution for a cluster only when deploying the cluster. You need to create a VPC endpoint for connection and a VPC endpoint for S3. For more information, see Deployment on AWS.
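For orientation only, the following sketch shows roughly what creating the two VPC endpoints could look like with the AWS SDK for Python (boto3). It is not the documented CelerData procedure. The service name for the connection endpoint is a placeholder that you would obtain from the deployment instructions, all IDs are made up, and the sketch assumes an interface endpoint for S3 (a gateway endpoint would take `RouteTableIds` instead of subnets and security groups). The helper function names are hypothetical.

```python
from typing import Any, Dict, List


def connection_endpoint_params(vpc_id: str, service_name: str,
                               subnet_ids: List[str],
                               security_group_ids: List[str]) -> Dict[str, Any]:
    """Build request parameters for the VPC endpoint for connection.

    service_name is a placeholder: the real endpoint service name is not
    given in this topic and must come from the deployment instructions.
    """
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": service_name,
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
    }


def s3_endpoint_params(vpc_id: str, region: str,
                       subnet_ids: List[str],
                       security_group_ids: List[str]) -> Dict[str, Any]:
    """Build request parameters for an S3 interface endpoint (an assumption)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
    }


def create_endpoint(params: Dict[str, Any]) -> str:
    """Create the endpoint and return its ID. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without boto3

    ec2 = boto3.client("ec2")
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]


if __name__ == "__main__":
    # Placeholder IDs for illustration; nothing is sent to AWS here.
    print(s3_endpoint_params("vpc-0123456789abcdef0", "us-west-2",
                             ["subnet-0aaa"], ["sg-0bbb"]))
```

Keeping parameter construction separate from the AWS call makes it possible to review the request before anything is created. Follow Deployment on AWS for the authoritative steps and values.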