Manage permissions with Apache Ranger

Apache Ranger provides a centralized security management framework that lets users define access policies through a web UI. These policies determine which roles can access which data, enabling fine-grained access control across components and services in the Hadoop ecosystem.

Apache Ranger provides the following core modules:

  • Ranger Admin: the core module of Ranger with a built-in web page. Users can create and update security policies on this page or through a REST interface. Plugins of various components of the Hadoop ecosystem poll and pull these policies on a regular basis.
  • Agent Plugin: plugins embedded in the components of the Hadoop ecosystem. These plugins pull security policies from Ranger Admin on a regular basis and store the policies in local files. When users access a component, the corresponding plugin evaluates the request against the configured security policies and returns the authorization result to the component.
  • User Sync: used to pull user and user group information, and synchronize the permission data of users and user groups to Ranger's database.

In addition to its built-in RBAC privilege system, StarRocks (the database engine that powers CelerData) can also integrate with Apache Ranger for access control. Currently, you can:

  • Create access policies, masking policies, and row-level filter policies through Apache Ranger.
  • Use Ranger audit logs.

Note that Ranger Servers that use Kerberos for authentication are not supported.

This topic describes the permission control methods with Ranger integration and the integration process. For information on how to create security policies on Ranger to manage data security, see the Apache Ranger official website.

For the resource objects and privileges you can configure, see privilege items.

Permission control method

StarRocks integrated with Apache Ranger provides the following permission control methods:

  • Global access control: Use Ranger to uniformly manage access to StarRocks internal tables, external tables, and other objects. Access control is performed according to the access policies configured in StarRocks Service.
  • Use Ranger to manage access to StarRocks internal tables and objects. For external catalogs, reuse the policies of the corresponding external service on Ranger (such as the Hive Service) for access control. StarRocks can match Ranger services with external catalogs.
  • Use StarRocks built-in RBAC system to manage access to StarRocks internal tables and objects. Reuse the policies of external services on Ranger to manage access to external data sources.

Authentication process

  • When users initiate a query, StarRocks parses the query to determine the required privileges and passes the user information and required privileges to Apache Ranger. Ranger checks whether the user holds those privileges based on the access policies configured in the corresponding service. If the user has access privileges, StarRocks returns the query data; if not, StarRocks returns an error.
  • You can also use LDAP for user authentication, then use Ranger to synchronize LDAP users and configure access rules for them. StarRocks can also complete user login authentication through LDAP.

Prerequisites

  • Apache Ranger 2.1.0 or later has been installed. For installation instructions, see Ranger quick start.

  • All FE nodes (or coordinator nodes for an elastic cluster) have network access to Apache Ranger. You can check this by running the following command on each node:

    telnet <ranger-ip> <ranger-port>

    If Connected to <ip> is displayed, the connection is successful.
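
On hosts where telnet is not installed, the same reachability check can be sketched with Bash's /dev/tcp redirection. This is a minimal sketch and not part of the StarRocks or Ranger tooling: the function name ranger_reachable and the 3-second default timeout are assumptions, and it requires bash and the coreutils timeout command on the node.

```shell
# Hypothetical helper: exits 0 if a TCP connection to host:port can be opened.
# Usage: ranger_reachable <ranger-ip> <ranger-port> [timeout-seconds]
ranger_reachable() {
    # Run the attempt in a child bash so /dev/tcp is available even if the
    # calling shell is not bash; timeout bounds how long the attempt may hang.
    timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example (not run here):
# ranger_reachable <ranger-ip> <ranger-port> && echo "connected"
```

A zero exit status corresponds to the "Connected to" message from telnet; any other status means the port could not be reached within the timeout.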

Integrate StarRocks with Ranger

(Optional) Install ranger-starrocks-plugin

The main purpose of this step is to use Ranger's resource name autocomplete feature. When authoring policies in Ranger Admin, users need to enter the name of the resources whose access needs to be protected. To make it easier for users to enter the resource names, Ranger Admin provides the autocomplete feature, which looks up the available resources in the service that match the input entered so far and automatically completes the resource name. If you do not have the permissions to operate the Ranger cluster or do not need this feature, you can skip this step.

  1. Create the starrocks folder in the Ranger Admin directory ews/webapp/WEB-INF/classes/ranger-plugins.

    mkdir {path-to-ranger}/ews/webapp/WEB-INF/classes/ranger-plugins/starrocks

  2. Download plugin-starrocks/target/ranger-starrocks-plugin-3.0.0-SNAPSHOT.jar and mysql-connector-j, and place them in the starrocks folder.

  3. Restart Ranger Admin.

    ranger-admin restart
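
Steps 1 and 2 above can be combined into a small function, which may help when several Ranger Admin hosts need the plugin. This is a sketch under assumptions: the function name install_starrocks_plugin and its argument order are hypothetical, and the two JAR paths are wherever you saved the downloaded files.

```shell
# Hypothetical helper that performs steps 1 and 2 above.
# Usage: install_starrocks_plugin <path-to-ranger> <plugin-jar> <mysql-connector-jar>
install_starrocks_plugin() {
    plugin_dir="$1/ews/webapp/WEB-INF/classes/ranger-plugins/starrocks"
    # Fail early if either JAR is missing, before touching the Ranger directory.
    for jar in "$2" "$3"; do
        [ -f "$jar" ] || { echo "missing file: $jar" >&2; return 1; }
    done
    # Create the starrocks folder (no error if it already exists) and copy both JARs.
    mkdir -p "$plugin_dir" && cp "$2" "$3" "$plugin_dir"/
}

# Step 3 is still manual: run `ranger-admin restart` afterwards.
```

Note that mkdir -p creates any missing parent directories, so double-check the Ranger path before running it.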

Configure StarRocks Service on Ranger Admin

This step configures the StarRocks Service on Ranger so that users can perform access control on StarRocks objects through Ranger.

  1. Download ranger-servicedef-starrocks.json to any directory on the FE machine or Ranger machine.

    wget https://raw.githubusercontent.com/StarRocks/ranger/master/agents-common/src/main/resources/service-defs/ranger-servicedef-starrocks.json

    Set implClass in the .json file according to whether you use Ranger's autocomplete feature:

    • If you did not install the ranger-starrocks-plugin (no autocomplete), you must set implClass to empty ("implClass": "",).
    • If you installed the ranger-starrocks-plugin (autocomplete enabled), you must set implClass to org.apache.ranger.services.starrocks.RangerServiceStarRocks ("implClass": "org.apache.ranger.services.starrocks.RangerServiceStarRocks",).

  2. Add StarRocks Service by running the following command as a Ranger administrator.

    curl -u <ranger_adminuser>:<ranger_adminpwd> \
    -X POST -H "Accept: application/json" \
    -H "Content-Type: application/json" http://<ranger-ip>:<ranger-port>/service/plugins/definitions -d@ranger-servicedef-starrocks.json

  3. Access http://<ranger-ip>:<ranger-port>/login.jsp to log in to the Apache Ranger page. The STARROCKS service appears on the page.

  4. Click the plus sign (+) after STARROCKS to configure StarRocks Service.

    • Service Name: You must enter a service name.
    • Display Name: The name you want to display for the service under STARROCKS. If it is not specified, Service Name will be displayed.
    • Username and Password: FE username and password, used to auto-complete object names when creating policies. The two parameters do not affect the connectivity between StarRocks and Ranger. If you want to use auto-completion, configure at least one user with the db_admin role activated.
    • jdbc.url: Enter the StarRocks FE IP address and port.

    The following figure shows a configuration example.

    [Figure: configuration example]

    The following figure shows the added service.

    [Figure: added service]

  5. Click Test connection to test the connectivity, and save it after the connection is successful.

  6. On each FE node, create ranger-starrocks-security.xml in the fe/conf folder and copy the template content into it. Then modify the following two parameters and save the file:

    • ranger.plugin.starrocks.service.name: Change to the name of the StarRocks Service you created in Step 4.
    • ranger.plugin.starrocks.policy.rest.url: Change to the address of the Ranger Admin.

    If you need to modify other configurations, refer to the official Apache Ranger documentation. For example, you can modify ranger.plugin.starrocks.policy.pollIntervalMs to change the interval for pulling policy changes.

    vim ranger-starrocks-security.xml
    
    ...
        <property>
            <name>ranger.plugin.starrocks.service.name</name>
            <value>starrocks</value> <!-- Change it to the StarRocks Service name. -->
            <description>
                Name of the Ranger service containing policies for this StarRocks instance
            </description>
        </property>
    ...
    
    ...
        <property>
            <name>ranger.plugin.starrocks.policy.rest.url</name>
            <value>http://localhost:6080</value> <!-- Change it to the Ranger Admin address. -->
            <description>
                URL to Ranger Admin
            </description>
        </property>
    ...

  7. (Optional) To use Ranger's audit log service, create the ranger-starrocks-audit.xml file in the fe/conf folder of each FE machine. Copy the template content into it, replace the value of xasecure.audit.solr.solr_url with your own Solr URL, and save the file.

  8. Add the configuration access_control = ranger to the configuration files of all FEs.

    vim fe.conf
    access_control=ranger

  9. Restart all FEs.

    # Switch to the FE folder.
    cd ..
    
    bin/stop_fe.sh
    bin/start_fe.sh
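
Two of the file edits in the steps above (setting implClass in step 1 and adding access_control = ranger in step 8) can be scripted so that they are applied the same way on every machine. The following is a minimal sketch, not an official tool: the function names set_impl_class and ensure_access_control_ranger are hypothetical, and both functions rewrite the target file in place through a temporary file next to it.

```shell
# Hypothetical helper: set "implClass" in a service definition JSON file.
# Usage: set_impl_class <json-file> <class-name-or-empty-string>
set_impl_class() {
    [ -f "$1" ] || return 1
    tmp_file="$1.tmp"
    # Replace whatever quoted string currently follows "implClass": with the new value.
    sed -E 's|("implClass"[[:space:]]*:[[:space:]]*)"[^"]*"|\1"'"$2"'"|' "$1" > "$tmp_file" \
        && mv "$tmp_file" "$1"
}

# Hypothetical helper: make sure fe.conf contains access_control = ranger exactly once.
# Usage: ensure_access_control_ranger <path-to-fe.conf>
ensure_access_control_ranger() {
    [ -f "$1" ] || return 1
    tmp_file="$1.tmp"
    # Drop any existing access_control line (grep exits 1 if nothing is left, which is fine),
    # then append the Ranger setting.
    grep -v -E '^[[:space:]]*access_control[[:space:]]*=' "$1" > "$tmp_file" || true
    echo 'access_control = ranger' >> "$tmp_file"
    mv "$tmp_file" "$1"
}

# Examples (not run here):
# set_impl_class ranger-servicedef-starrocks.json ""      # no autocomplete plugin
# ensure_access_control_ranger fe/conf/fe.conf
```

If you use set_impl_class, run it before posting the service definition to Ranger Admin in step 2. The FE restart in step 9 is still required after fe.conf changes.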

Reuse services on Ranger to control access to external data sources

For external catalogs, you can reuse the access policies of external services on Ranger (such as Hive Service) for access control. StarRocks supports matching external catalogs with Ranger external services. When users access an external table, the system implements access control by reusing the access policy of the Ranger Service that corresponds to that external table.

To use Ranger to control access to an external data source, you need to place Ranger-related configuration files ranger-hive-security.xml and ranger-hive-audit.xml in the cloud storage associated with the data credential of your cluster. The FE node (or coordinator node for elastic clusters) will automatically pull and apply the configuration from the files every time the cluster is resumed.
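
As an illustration of where the two files end up, the sketch below builds the S3 URI for each file and shows a possible upload with the AWS CLI. It is an assumption-laden example: the function name ranger_conf_uri is hypothetical, the bucket and prefix are placeholders, and it assumes the files are stored directly under bucket/prefix.

```shell
# Hypothetical helper: build the S3 URI for one Ranger configuration file.
# Usage: ranger_conf_uri <bucket> <prefix> <file-name>
ranger_conf_uri() {
    # Strip leading and trailing slashes from the prefix so the URI has no empty segments.
    prefix=$(printf '%s' "$2" | sed -E 's|^/+||; s|/+$||')
    if [ -n "$prefix" ]; then
        printf 's3://%s/%s/%s\n' "$1" "$prefix" "$3"
    else
        printf 's3://%s/%s\n' "$1" "$3"
    fi
}

# Example upload (not run here; requires the AWS CLI and write access to the bucket):
# for f in ranger-hive-security.xml ranger-hive-audit.xml; do
#     aws s3 cp "$f" "$(ranger_conf_uri <bucket> <prefix> "$f")"
# done
```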

  1. Place Ranger-related configuration files ranger-hive-security.xml and ranger-hive-audit.xml in the same S3 bucket associated with the cluster's data credential.

  2. In the CelerData Cloud BYOC console, choose Clusters from the left-side navigation bar and click your cluster in the cluster list.

    The cluster for which you want to apply the configuration must be in the Running state.

  3. On the cluster details page, click the Cluster parameters tab.

  4. In the Apache Ranger integration in external catalog section, click Apply configuration.

    If the cluster is not running, the Apply configuration and Clean configuration buttons will be grayed out. The Last updated and Configuration file location fields indicate the time when you previously updated the configuration files and the file location. Each successful Apply operation will update these two fields. If you apply the configuration for the first time, the two fields are empty.

  5. In the displayed Configure your source file location dialog box, enter the S3 bucket where the configuration files are stored and the file prefix, and click Test. If the test succeeds, click Apply to apply the configuration. If the test fails, check whether the file path is correct and whether your cluster has the required privilege to access the bucket.

    Then, the FE nodes (or coordinator nodes) will pull the configuration and restart.

  6. Configure the external catalog.

    • When you create an external catalog, add the property "ranger.plugin.hive.service.name".

      CREATE EXTERNAL CATALOG hive_catalog_1
      PROPERTIES (
          "type" = "hive",
          "hive.metastore.type" = "hive",
          "hive.metastore.uris" = "thrift://xx.xx.xx.xx:9083",
          "ranger.plugin.hive.service.name" = "<ranger_hive_service_name>"
      );

    • You can also add this property to an existing external catalog.

      ALTER CATALOG hive_catalog_1
      SET ("ranger.plugin.hive.service.name" = "<ranger_hive_service_name>");

      This operation switches the access control of an existing catalog to Ranger-based access control.
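
When several existing catalogs should reuse the same Ranger Hive service, the ALTER CATALOG statement shown above can be generated rather than typed by hand. This is a small sketch: the function name alter_catalog_sql is hypothetical, and the catalog and service names in the example are placeholders.

```shell
# Hypothetical helper: print an ALTER CATALOG statement that binds a catalog
# to a Ranger Hive service.
# Usage: alter_catalog_sql <catalog> <ranger-hive-service>
alter_catalog_sql() {
    printf 'ALTER CATALOG %s SET ("ranger.plugin.hive.service.name" = "%s");\n' "$1" "$2"
}

# Example: emit statements for several catalogs that share one Ranger service.
# for c in hive_catalog_1 hive_catalog_2; do alter_catalog_sql "$c" hive_service; done
```

The printed statements can then be run in any SQL client connected to the cluster.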

If you want to clean the configuration, click Clean configuration in the Apache Ranger integration in external catalog section. Note that this operation can only be performed on a running cluster. In the Confirm to clean the Ranger Configuration dialog box, click Confirm. This operation will remove Ranger-related configuration files for the cluster and restart FE nodes (or coordinator nodes) one by one. After the files are cleaned, the Configuration file location field is empty and the Last updated field displays the time of the clean operation.

What to do next

After adding a StarRocks Service, you can click the service to create access control policies for the service and assign different permissions to different users or user groups. When users access StarRocks data, access control will be implemented based on these policies.
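
Policies created for the service can also be inspected over Ranger's REST interface instead of the web page. The sketch below assumes the Ranger public v2 API path for listing the policies of a service and plain HTTP access; the function name ranger_policy_url is hypothetical, so verify the path against the API documentation of your Ranger version before relying on it.

```shell
# Hypothetical helper: build the Ranger REST URL that lists the policies of one service.
# Usage: ranger_policy_url <ranger-ip> <ranger-port> <service-name>
ranger_policy_url() {
    printf 'http://%s:%s/service/public/v2/api/service/%s/policy\n' "$1" "$2" "$3"
}

# Example (not run here; needs network access and Ranger admin credentials):
# curl -u <ranger_adminuser>:<ranger_adminpwd> -H "Accept: application/json" \
#     "$(ranger_policy_url <ranger-ip> <ranger-port> <service-name>)"
```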