////
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
////

= Securing Apache HBase

.Reporting Security Bugs
NOTE: To protect existing HBase installations from exploitation, please *do not* use JIRA to report security-related bugs. Instead, send your report to the mailing list private@hbase.apache.org, which allows anyone to send messages, but restricts who can read them. Someone on that list will contact you to follow up on your report.

HBase adheres to the Apache Software Foundation's policy on reported vulnerabilities, available at http://apache.org/security/.

If you wish to send an encrypted report, you can use the GPG details provided for the general ASF security list. This will likely increase the response time to your report.

HBase provides mechanisms to secure various components and aspects of HBase and how it relates to the rest of the Hadoop infrastructure, as well as clients and resources outside Hadoop.

=== Using Secure HTTP (HTTPS) for the Web UI

A default HBase install uses insecure HTTP connections for the Web UIs of the master and region servers.
To enable secure HTTP (HTTPS) connections instead, set `hbase.ssl.enabled` to `true` in _hbase-site.xml_. Prepare the SSL certificate and SSL configuration file in advance.
This does not change the port used by the Web UI.
To change the Web UI port for a given HBase component, configure that component's port setting in _hbase-site.xml_.
These settings are:

* `hbase.master.info.port`
* `hbase.regionserver.info.port`
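
As a minimal sketch, an _hbase-site.xml_ fragment that enables HTTPS and sets both Web UI ports explicitly might look like the following. The port values `16010` and `16030` are the usual defaults in HBase 1.0 and later and are shown only for illustration; adjust them to your deployment.

[source,xml]
----
<!-- Serve the Web UIs over HTTPS. The SSL certificate and SSL
     configuration file must already be in place. -->
<property>
  <name>hbase.ssl.enabled</name>
  <value>true</value>
</property>
<!-- Illustrative port values. HTTPS is served on these same ports. -->
<property>
  <name>hbase.master.info.port</name>
  <value>16010</value>
</property>
<property>
  <name>hbase.regionserver.info.port</name>
  <value>16030</value>
</property>
----
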
.If you enable HTTPS, clients should avoid using the non-secure HTTP connection.
If you enable secure HTTP, clients should connect to HBase using the `https://` URL.
Clients using the `http://` URL will receive an HTTP response of `200`, but will not receive any data.
The following exception is logged:

----
javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
----

This is because the same port is used for HTTP and HTTPS.

HBase uses Jetty for the Web UI.
Without modifying Jetty itself, it does not seem possible to configure Jetty to redirect one port to another on the same host.
See Nick Dimiduk's contribution on this link:http://stackoverflow.com/questions/20611815/redirect-from-http-to-https-in-jetty[Stack Overflow] thread for more information.
If you know how to fix this without opening a second port for HTTPS, patches are appreciated.

[[hbase.secure.spnego.ui]]
=== Using SPNEGO for Kerberos authentication with Web UIs

Kerberos authentication to the HBase Web UIs can be enabled by configuring SPNEGO with the `hbase.security.authentication.ui`
property in _hbase-site.xml_. Enabling this authentication requires that HBase is also configured to use Kerberos authentication
for RPCs (i.e., `hbase.security.authentication` = `kerberos`).

[source,xml]
----
<property>
  <name>hbase.security.authentication.ui</name>
  <value>kerberos</value>
  <description>Controls what kind of authentication should be used for the HBase web UIs.</description>
</property>
<property>
  <name>hbase.security.authentication</name>
  <value>kerberos</value>
  <description>Controls what kind of authentication should be used for HBase RPCs.</description>
</property>
----

A number of properties exist to configure SPNEGO authentication for the web server:

[source,xml]
----
<property>
  <name>hbase.security.authentication.spnego.kerberos.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
  <description>Required for SPNEGO, the Kerberos principal to use for SPNEGO authentication by the
  web server. The _HOST keyword will be automatically substituted with the node's
  hostname.</description>
</property>
<property>
  <name>hbase.security.authentication.spnego.kerberos.keytab</name>
  <value>/etc/security/keytabs/spnego.service.keytab</value>
  <description>Required for SPNEGO, the Kerberos keytab file to use for SPNEGO authentication by the
  web server.</description>
</property>
<property>
  <name>hbase.security.authentication.spnego.kerberos.name.rules</name>
  <value></value>
  <description>Optional, Hadoop-style `auth_to_local` rules which will be parsed and used in the
  handling of Kerberos principals</description>
</property>
<property>
  <name>hbase.security.authentication.signature.secret.file</name>
  <value></value>
  <description>Optional, a file whose contents will be used as a secret to sign the HTTP cookies
  as a part of the SPNEGO authentication handshake. If this is not provided, Java's `Random` library
  will be used for the secret.</description>
</property>
----

=== Defining administrators of the Web UI

In the previous section, we covered how to enable authentication for the Web UI via SPNEGO.
However, some portions of the Web UI could be used to impact the availability and performance
of an HBase cluster. As such, it is desirable to ensure that only those with proper authority
can interact with these sensitive endpoints.

HBase allows the administrators to be defined via a list of usernames or groups in _hbase-site.xml_:

[source,xml]
----
<property>
  <name>hbase.security.authentication.spnego.admin.users</name>
  <value></value>
</property>
<property>
  <name>hbase.security.authentication.spnego.admin.groups</name>
  <value></value>
</property>
----

The usernames are those which the Kerberos identity maps to, given the Hadoop `auth_to_local` rules
in _core-site.xml_. The groups here are the Unix groups associated with the mapped usernames.

Consider the following scenario to see how these configuration properties operate.
Three users are defined in the Kerberos KDC:

* `alice@COMPANY.COM`
* `bob@COMPANY.COM`
* `charlie@COMPANY.COM`

The default Hadoop `auth_to_local` rules map these principals to their "shortname":

* `alice`
* `bob`
* `charlie`

Unix group membership defines `alice` as a member of the group `admins`.
`bob` and `charlie` are not members of the `admins` group.

[source,xml]
----
<property>
  <name>hbase.security.authentication.spnego.admin.users</name>
  <value>charlie</value>
</property>
<property>
  <name>hbase.security.authentication.spnego.admin.groups</name>
  <value>admins</value>
</property>
----

Given the above configuration, `alice` is allowed to access sensitive endpoints in the Web UI
because she is a member of the `admins` group. `charlie` is also allowed to access sensitive endpoints
because he is explicitly listed as an admin in the configuration. `bob` is not allowed to access
sensitive endpoints because he is neither a member of the `admins` group nor listed as an explicit
admin user via `hbase.security.authentication.spnego.admin.users`, but he can still use any
non-sensitive endpoints in the Web UI.

To be explicit: non-authenticated users cannot access any part of the Web UI.

=== Other UI security-related configuration

While it is a clear anti-pattern for HBase developers, the developers acknowledge that the HBase
configuration (including Hadoop configuration files) may contain sensitive information. As such,
a user may not want to expose the HBase service-level configuration to all
authenticated users. They may configure HBase to require that a user be an admin to access
the service-level configuration via the HBase UI. This setting is *false* by default
(any authenticated user may access the configuration).

Users who wish to change this would set the following in their _hbase-site.xml_:

[source,xml]
----
<property>
  <name>hbase.security.authentication.ui.config.protected</name>
  <value>true</value>
</property>
----

[[hbase.secure.configuration]]
== Secure Client Access to Apache HBase

Newer releases of Apache HBase (>= 0.92) support optional SASL authentication of clients.
See also Matteo Bertozzi's article on link:https://blog.cloudera.com/blog/2012/09/understanding-user-authentication-and-authorization-in-apache-hbase/[Understanding User Authentication and Authorization in Apache HBase].

This section describes how to set up Apache HBase and clients for connection to secure HBase resources.

[[security.prerequisites]]
=== Prerequisites

Hadoop Authentication Configuration::
To run HBase RPC with strong authentication, you must set `hbase.security.authentication` to `kerberos`.
In this case, you must also set `hadoop.security.authentication` to `kerberos` in _core-site.xml_.
Otherwise, you would be using strong authentication for HBase but not for the underlying HDFS, which would cancel out any benefit.

Kerberos KDC::
You need to have a working Kerberos KDC.
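
As a minimal sketch of the Hadoop side of this prerequisite, the matching _core-site.xml_ entry would look like the following. Only the property named above is shown; a real secure HDFS deployment needs further Kerberos settings that this section does not cover.

[source,xml]
----
<!-- core-site.xml: make Hadoop itself authenticate with Kerberos,
     matching hbase.security.authentication=kerberos in hbase-site.xml. -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
----
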
=== Server-side Configuration for Secure Operation

First, refer to <<security.prerequisites,security.prerequisites>> and ensure that your underlying HDFS configuration is secure.

Add the following to the `hbase-site.xml` file on every server machine in the cluster:

[source,xml]
----
<property>
  <name>hbase.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider</value>
</property>
----

A full shutdown and restart of the HBase service is required when deploying these configuration changes.

=== Client-side Configuration for Secure Operation

First, refer to <<security.prerequisites>> and ensure that your underlying HDFS configuration is secure.

Add the following to the `hbase-site.xml` file on every client:

[source,xml]
----
<property>
  <name>hbase.security.authentication</name>
  <value>kerberos</value>
</property>
----

Before version 2.2.0, the client environment must be logged in to Kerberos, from the KDC or a keytab, via the `kinit` command before communication with the HBase cluster is possible.

Since 2.2.0, a client can instead specify the following configuration in `hbase-site.xml`:

[source,xml]
----
<property>
  <name>hbase.client.keytab.file</name>
  <value>/local/path/to/client/keytab</value>
</property>
<property>
  <name>hbase.client.keytab.principal</name>
  <value>foo@EXAMPLE.COM</value>
</property>
----

The application then performs login and credential renewal automatically, without client intervention.

This feature is optional. A client that upgrades to 2.2.0 can keep the login and credential renewal logic it used with older versions, as long as `hbase.client.keytab.file` and `hbase.client.keytab.principal` remain unset.

Be advised that if the `hbase.security.authentication` in the client- and server-side site files do not match, the client will not be able to communicate with the cluster.

Once HBase is configured for secure RPC, it is possible to optionally configure encrypted communication.
To do so, add the following to the `hbase-site.xml` file on every client:

[source,xml]
----
<property>
  <name>hbase.rpc.protection</name>
  <value>privacy</value>
</property>
----

This configuration property can also be set on a per-connection basis.
Set it in the `Configuration` used to create the `Connection`:

[source,java]
----
Configuration conf = HBaseConfiguration.create();
// Request encrypted RPC for connections created from this Configuration.
conf.set("hbase.rpc.protection", "privacy");
try (Connection connection = ConnectionFactory.createConnection(conf);
     Table table = connection.getTable(TableName.valueOf(tablename))) {
  // ... read from or write to the table ...
}
----

Expect a ~10% performance penalty for encrypted communication.

[[security.client.thrift]]
=== Client-side Configuration for Secure Operation - Thrift Gateway

Add the following to the `hbase-site.xml` file for every Thrift gateway:

[source,xml]
----
<property>
  <name>hbase.thrift.keytab.file</name>
  <value>/etc/hbase/conf/hbase.keytab</value>
</property>
<property>
  <name>hbase.thrift.kerberos.principal</name>
  <value>$USER/_HOST@HADOOP.LOCALDOMAIN</value>
  <!-- TODO: This may need to be HTTP/_HOST@<REALM> and _HOST may not work.
   You may have to put the concrete full hostname.
   -->
</property>
<!-- Add these if you need to configure a different DNS interface from the default -->
<property>
  <name>hbase.thrift.dns.interface</name>
  <value>default</value>
</property>
<property>
  <name>hbase.thrift.dns.nameserver</name>
  <value>default</value>
</property>
----

Substitute the appropriate credential for _$USER_, and set `hbase.thrift.keytab.file` to the path of the corresponding keytab.

In order to use the Thrift API principal to interact with HBase, it is also necessary to add the `hbase.thrift.kerberos.principal` to the `_acl_` table.
For example, to give the Thrift API principal, `thrift_server`, administrative access, a command such as this one will suffice:

----
grant 'thrift_server', 'RWCA'
----

For more information about ACLs, please see the <<hbase.accesscontrol.configuration>> section.

The Thrift gateway will authenticate with HBase using the supplied credential.
No authentication will be performed by the Thrift gateway itself.
All client access via the Thrift gateway will use the Thrift gateway's credential and have its privilege.

[[security.gateway.thrift]]
=== Configure the Thrift Gateway to Authenticate on Behalf of the Client

<<security.client.thrift>> describes how to authenticate a Thrift client to HBase using a fixed user.
As an alternative, you can configure the Thrift gateway to authenticate to HBase on the client's behalf, and to access HBase using a proxy user.
This was implemented in link:https://issues.apache.org/jira/browse/HBASE-11349[HBASE-11349] for Thrift 1, and link:https://issues.apache.org/jira/browse/HBASE-11474[HBASE-11474] for Thrift 2.

.Limitations with Thrift Framed Transport
If you use framed transport, you cannot yet take advantage of this feature, because SASL does not work with Thrift framed transport at this time.

To enable it, do the following.

. Be sure Thrift is running in secure mode, by following the procedure described in <<security.client.thrift>>.
. Be sure that HBase is configured to allow proxy users, as described in <<security.rest.gateway>>.
. In _hbase-site.xml_ for each cluster node running a Thrift gateway, set the property `hbase.thrift.security.qop` to one of the following three values:
+
* `privacy` - authentication, integrity, and confidentiality checking.
* `integrity` - authentication and integrity checking.
* `authentication` - authentication checking only.

. Restart the Thrift gateway processes for the changes to take effect.
If a node is running Thrift, the output of the `jps` command will list a `ThriftServer` process.
To stop Thrift on a node, run the command `bin/hbase-daemon.sh stop thrift`.
To start Thrift on a node, run the command `bin/hbase-daemon.sh start thrift`.
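
As an illustration of the third step above, a gateway that should use the strongest of the three listed protection levels could carry the following fragment in its _hbase-site.xml_. The choice of `privacy` is only an example; either of the other two values is set the same way.

[source,xml]
----
<!-- Quality of protection for the Thrift gateway:
     privacy = authentication + integrity + confidentiality. -->
<property>
  <name>hbase.thrift.security.qop</name>
  <value>privacy</value>
</property>
----
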
[[security.gateway.thrift.doas]]
=== Configure the Thrift Gateway to Use the `doAs` Feature

<<security.gateway.thrift>> describes how to configure the Thrift gateway to authenticate to HBase on the client's behalf, and to access HBase using a proxy user. The limitation of this approach is that after the client is initialized with a particular set of credentials, it cannot change these credentials during the session. The `doAs` feature provides a flexible way to impersonate multiple principals using the same client. This feature was implemented in link:https://issues.apache.org/jira/browse/HBASE-12640[HBASE-12640] for Thrift 1, but is currently not available for Thrift 2.

*To enable the `doAs` feature*, add the following to the _hbase-site.xml_ file for every Thrift gateway:

[source,xml]
----
<property>
  <name>hbase.regionserver.thrift.http</name>
  <value>true</value>
</property>
<property>
  <name>hbase.thrift.support.proxyuser</name>
  <value>true</value>
</property>
----

*To allow proxy users* when using `doAs` impersonation, add the following to the _hbase-site.xml_ file for every HBase node:

[source,xml]
----
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.proxyuser.$USER.groups</name>
  <value>$GROUPS</value>
</property>
<property>
  <name>hadoop.proxyuser.$USER.hosts</name>
  <value>$HOSTS</value>
</property>
----

Substitute the Thrift gateway proxy user for _$USER_, the allowed group list for _$GROUPS_, and the allowed host list for _$HOSTS_.

See the link:https://github.com/apache/hbase/blob/master/hbase-examples/src/main/java/org/apache/hadoop/hbase/thrift/HttpDoAsClient.java[demo client]
to get an overall idea of how to use this feature in your client.

=== Client-side Configuration for Secure Operation - REST Gateway

Add the following to the `hbase-site.xml` file for every REST gateway:

[source,xml]
----
<property>
  <name>hbase.rest.keytab.file</name>
  <value>$KEYTAB</value>
</property>
<property>
  <name>hbase.rest.kerberos.principal</name>
  <value>$USER/_HOST@HADOOP.LOCALDOMAIN</value>
</property>
----

Substitute the appropriate credential and keytab for _$USER_ and _$KEYTAB_ respectively.

The REST gateway will authenticate with HBase using the supplied credential.

In order to use the REST API principal to interact with HBase, it is also necessary to add the `hbase.rest.kerberos.principal` to the `_acl_` table.
For example, to give the REST API principal, `rest_server`, administrative access, a command such as this one will suffice:

----
grant 'rest_server', 'RWCA'
----

For more information about ACLs, please see the <<hbase.accesscontrol.configuration>> section.

The HBase REST gateway supports link:https://hadoop.apache.org/docs/stable/hadoop-auth/index.html[SPNEGO HTTP authentication] for client access to the gateway.
To enable REST gateway Kerberos authentication for client access, add the following to the `hbase-site.xml` file for every REST gateway.

[source,xml]
----
<property>
  <name>hbase.rest.support.proxyuser</name>
  <value>true</value>
</property>
<property>
  <name>hbase.rest.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>hbase.rest.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@HADOOP.LOCALDOMAIN</value>
</property>
<property>
  <name>hbase.rest.authentication.kerberos.keytab</name>
  <value>$KEYTAB</value>
</property>
<!-- Add these if you need to configure a different DNS interface from the default -->
<property>
  <name>hbase.rest.dns.interface</name>
  <value>default</value>
</property>
<property>
  <name>hbase.rest.dns.nameserver</name>
  <value>default</value>
</property>
----

Substitute the path to the keytab for the HTTP principal for _$KEYTAB_.

The HBase REST gateway supports two built-in values for `hbase.rest.authentication.type`: `simple` and `kerberos`.
You can also implement custom authentication by implementing the Hadoop `AuthenticationHandler` interface, and then specifying the full class name as the `hbase.rest.authentication.type` value.
For more information, refer to link:https://hadoop.apache.org/docs/stable/hadoop-auth/index.html[SPNEGO HTTP authentication].
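
As a sketch of the custom-handler option described above, the fragment below points the REST gateway at a handler class by its fully qualified name. The class `com.example.security.CustomAuthenticationHandler` is hypothetical and stands in for your own `AuthenticationHandler` implementation, which would also need to be on the REST gateway's classpath.

[source,xml]
----
<!-- Hypothetical class name, for illustration only: replace it with
     the fully qualified name of your own AuthenticationHandler. -->
<property>
  <name>hbase.rest.authentication.type</name>
  <value>com.example.security.CustomAuthenticationHandler</value>
</property>
----
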
[[security.rest.gateway]]
=== REST Gateway Impersonation Configuration

By default, the REST gateway doesn't support impersonation.
It accesses HBase on behalf of clients as the user configured in the previous section.
To the HBase server, all requests come from the REST gateway user, and the actual users are unknown.
You can turn on impersonation support.
With impersonation, the REST gateway user is a proxy user.
The HBase server knows the actual user of each request, so it can apply proper authorizations.

To turn on REST gateway impersonation, configure the HBase servers (masters and region servers) to allow proxy users, and configure the REST gateway to enable impersonation.

To allow proxy users, add the following to the `hbase-site.xml` file for every HBase server:

[source,xml]
----
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.proxyuser.$USER.groups</name>
  <value>$GROUPS</value>
</property>
<property>
  <name>hadoop.proxyuser.$USER.hosts</name>
  <value>$HOSTS</value>
</property>
----

Substitute the REST gateway proxy user for _$USER_, the allowed group list for _$GROUPS_, and the allowed host list for _$HOSTS_.
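
To make the substitution concrete, here is a hedged example with the placeholders filled in. The proxy user `rest_server`, the group `analysts`, and the host `rest-gateway.example.com` are all hypothetical names chosen for illustration; use the values that apply to your cluster.

[source,xml]
----
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
<!-- Hypothetical: rest_server may impersonate members of "analysts"... -->
<property>
  <name>hadoop.proxyuser.rest_server.groups</name>
  <value>analysts</value>
</property>
<!-- ...but only for requests arriving from this (hypothetical) host. -->
<property>
  <name>hadoop.proxyuser.rest_server.hosts</name>
  <value>rest-gateway.example.com</value>
</property>
----
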
To enable REST gateway impersonation, add the following to the `hbase-site.xml` file for every REST gateway.

[source,xml]
----
<property>
  <name>hbase.rest.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>hbase.rest.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@HADOOP.LOCALDOMAIN</value>
</property>
<property>
  <name>hbase.rest.authentication.kerberos.keytab</name>
  <value>$KEYTAB</value>
</property>
----

Substitute the path to the keytab for the HTTP principal for _$KEYTAB_.

[[hbase.secure.simpleconfiguration]]
== Simple User Access to Apache HBase

Newer releases of Apache HBase (>= 0.92) support optional SASL authentication of clients.
See also Matteo Bertozzi's article on link:https://blog.cloudera.com/blog/2012/09/understanding-user-authentication-and-authorization-in-apache-hbase/[Understanding User Authentication and Authorization in Apache HBase].

This section describes how to set up Apache HBase and clients for simple user access to HBase resources.

=== Simple versus Secure Access

The following section shows how to set up simple user access.
Simple user access is not a secure method of operating HBase.
This method is used to prevent users from making mistakes.
It can be used to mimic access control on a development system without having to set up Kerberos.

This method is not used to prevent malicious or hacking attempts.
To make HBase secure against these types of attacks, you must configure HBase for secure operation.
Refer to the section <<hbase.secure.configuration>> and complete all of the steps described there.

=== Server-side Configuration for Simple User Access Operation

Add the following to the `hbase-site.xml` file on every server machine in the cluster:

[source,xml]
----
<property>
  <name>hbase.security.authentication</name>
  <value>simple</value>
</property>
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
----

For 0.94, add the following to the `hbase-site.xml` file on every server machine in the cluster:

[source,xml]
----
<property>
  <name>hbase.rpc.engine</name>
  <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
----

A full shutdown and restart of the HBase service is required when deploying these configuration changes.

=== Client-side Configuration for Simple User Access Operation

Add the following to the `hbase-site.xml` file on every client:

[source,xml]
----
<property>
  <name>hbase.security.authentication</name>
  <value>simple</value>
</property>
----

For 0.94, add the following to the `hbase-site.xml` file on every client:

[source,xml]
----
<property>
  <name>hbase.rpc.engine</name>
  <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
----

Be advised that if the `hbase.security.authentication` in the client- and server-side site files do not match, the client will not be able to communicate with the cluster.

==== Client-side Configuration for Simple User Access Operation - Thrift Gateway

The Thrift gateway user will need access.
For example, to give the Thrift API user, `thrift_server`, administrative access, a command such as this one will suffice:

----
grant 'thrift_server', 'RWCA'
----

For more information about ACLs, please see the <<hbase.accesscontrol.configuration>> section.

The Thrift gateway will authenticate with HBase using the supplied credential.
No authentication will be performed by the Thrift gateway itself.
All client access via the Thrift gateway will use the Thrift gateway's credential and have its privilege.

==== Client-side Configuration for Simple User Access Operation - REST Gateway

The REST gateway will authenticate with HBase using the supplied credential.
No authentication will be performed by the REST gateway itself.
All client access via the REST gateway will use the REST gateway's credential and have its privilege.

The REST gateway user will need access.
For example, to give the REST API user, `rest_server`, administrative access, a command such as this one will suffice:

----
grant 'rest_server', 'RWCA'
----

For more information about ACLs, please see the <<hbase.accesscontrol.configuration>> section.

It should be possible for clients to authenticate with the HBase cluster through the REST gateway in a pass-through manner via SPNEGO HTTP authentication.

== Securing Access to HDFS and ZooKeeper
Secure HBase requires secure ZooKeeper and HDFS so that users cannot access or modify the metadata and data from under HBase. HBase uses HDFS (or the configured file system) to keep its data files as well as write-ahead logs (WALs) and other data. HBase uses ZooKeeper to store some metadata for operations (master address, table locks, recovery state, etc).

=== Securing ZooKeeper Data
ZooKeeper has a pluggable authentication mechanism to enable access from clients using different methods. ZooKeeper even allows authenticated and un-authenticated clients at the same time. The access to znodes can be restricted by providing Access Control Lists (ACLs) per znode. An ACL contains two components, the authentication method and the principal. ACLs are NOT enforced hierarchically. See link:https://zookeeper.apache.org/doc/r3.3.6/zookeeperProgrammers.html#sc_ZooKeeperPluggableAuthentication[ZooKeeper Programmers Guide] for details.

HBase daemons authenticate to ZooKeeper via SASL and Kerberos (see <<zk.sasl.auth>>). HBase sets up the znode ACLs so that only the HBase user and the configured HBase superuser (`hbase.superuser`) can access and modify the data. In cases where ZooKeeper is used for service discovery or sharing state with the client, the znodes created by HBase will also allow anyone (regardless of authentication) to read these znodes (clusterId, master address, meta location, etc), but only the HBase user can modify them.
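
For reference, the superuser list mentioned above is set in _hbase-site.xml_. The following is a sketch only: the user name `hbase_admin` and the group `hbase_ops` are hypothetical, and the fragment assumes the convention that group entries in this list are prefixed with `@`.

[source,xml]
----
<!-- Hypothetical names: one extra superuser plus one superuser group.
     Assumption: an "@" prefix marks an entry as a group. -->
<property>
  <name>hbase.superuser</name>
  <value>hbase_admin,@hbase_ops</value>
</property>
----
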
=== Securing File System (HDFS) Data
All of the data under management is kept under the root directory in the file system (`hbase.rootdir`). Access to the data and WAL files in the file system should be restricted so that users cannot bypass the HBase layer and peek at the underlying data files from the file system. HBase assumes the file system used (HDFS or other) enforces permissions hierarchically. If sufficient protection from the file system (both authorization and authentication) is not provided, HBase-level authorization control (ACLs, visibility labels, etc) is meaningless, since the user can always access the data from the file system.

HBase enforces the POSIX-like permissions 700 (`rwx------`) on its root directory. This means that only the HBase user can read or write the files in the file system. The default setting can be changed by configuring `hbase.rootdir.perms` in _hbase-site.xml_. A restart of the active master is needed for the new permissions to take effect. For versions before 1.2.0, you can check whether HBASE-13780 is committed, and if not, you can manually set the permissions for the root directory if needed. Using HDFS, the command would be:

----
sudo -u hdfs hadoop fs -chmod 700 /hbase
----

You should change `/hbase` if you are using a different `hbase.rootdir`.

In secure mode, SecureBulkLoadEndpoint should be configured and used to properly hand over user files created by MR jobs to the HBase daemons and the HBase user. The staging directory in the distributed file system used for bulk load (`hbase.bulkload.staging.dir`, defaults to `/tmp/hbase-staging`) should have mode 711 (`rwx--x--x`) so that users can access the staging directory created under that parent directory, but cannot do any other operation. See <<hbase.secure.bulkload>> for how to configure SecureBulkLoadEndpoint.
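
The two file system settings named in this section can be written out as follows. This is a sketch that restates the defaults given above (`700` and `/tmp/hbase-staging`), so it changes nothing by itself; it only shows where you would edit the values.

[source,xml]
----
<!-- Permissions applied to hbase.rootdir; 700 is the default described above. -->
<property>
  <name>hbase.rootdir.perms</name>
  <value>700</value>
</property>
<!-- Parent staging directory for bulk load; it should have mode 711. -->
<property>
  <name>hbase.bulkload.staging.dir</name>
  <value>/tmp/hbase-staging</value>
</property>
----
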
== Securing Access To Your Data

After you have configured secure authentication between HBase client and server processes and gateways, you need to consider the security of your data itself.
HBase provides several strategies for securing your data:

* Role-based Access Control (RBAC) controls which users or groups can read and write to a given HBase resource or execute a coprocessor endpoint, using the familiar paradigm of roles.
* Visibility Labels, which allow you to label cells and control access to labelled cells, to further restrict who can read or write to certain subsets of your data.
Visibility labels are stored as tags.
See <<hbase.tags,hbase.tags>> for more information.
* Transparent encryption of data at rest on the underlying filesystem, both in HFiles and in the WAL.
This protects your data at rest from an attacker who has access to the underlying filesystem, without the need to change the implementation of the client.
It can also protect against data leakage from improperly disposed disks, which can be important for legal and regulatory compliance.

Server-side configuration, administration, and implementation details of each of these features are discussed below, along with any performance trade-offs.
An example security configuration is given at the end, to show these features all used together, as they might be in a real-world scenario.

CAUTION: All aspects of security in HBase are in active development and evolving rapidly.
Any strategy you employ for security of your data should be thoroughly tested.
In addition, some of these features are still in the experimental stage of development.
To take advantage of many of these features, you must be running HBase 0.98+ and using the HFile v3 file format.

.Protecting Sensitive Files
Several procedures in this section require you to copy files between cluster nodes.
When copying keys, configuration files, or other files containing sensitive strings, use a secure method, such as `ssh`, to avoid leaking sensitive data.

[[security.data.basic.server.side]]
.Procedure: Basic Server-Side Configuration
. Enable HFile v3, by setting `hfile.format.version` to 3 in _hbase-site.xml_.
This is the default for HBase 1.0 and newer.
+
[source,xml]
----
<property>
  <name>hfile.format.version</name>
  <value>3</value>
</property>
----

. Enable SASL and Kerberos authentication for RPC and ZooKeeper, as described in <<security.prerequisites,security.prerequisites>> and <<zk.sasl.auth>>.

[[hbase.tags]]
=== Tags

[firstterm]_Tags_ are a feature of HFile v3.
A tag is a piece of metadata which is part of a cell, separate from the key, value, and version.
Tags are an implementation detail which provides a foundation for other security-related features such as cell-level ACLs and visibility labels.
Tags are stored in the HFiles themselves.
It is possible that in the future, tags will be used to implement other HBase features.
You don't need to know a lot about tags in order to use the security features they enable.

==== Implementation Details

Every cell can have zero or more tags.
Every tag has a type and the actual tag byte array.

Just as row keys, column families, qualifiers and values can be encoded (see <<data.block.encoding.types,data.block.encoding.types>>), tags can also be encoded.
You can enable or disable tag encoding at the level of the column family, and it is enabled by default.
Use the `HColumnDescriptor#setCompressionTags(boolean compressTags)` method to manage encoding settings on a column family.
You also need to enable the DataBlockEncoder for the column family, for encoding of tags to take effect.

You can enable compression of each tag in the WAL, if WAL compression is also enabled, by setting the value of `hbase.regionserver.wal.tags.enablecompression` to `true` in _hbase-site.xml_.
Tag compression uses dictionary encoding.

Coprocessors that run server-side on RegionServers can perform get and set operations on cell tags. Tags are stripped out at the RPC layer before the read response is sent back, so clients do not see these tags.
Tag compression is not supported when using WAL encryption.
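
As a sketch of the WAL setting described above, the following _hbase-site.xml_ fragment turns on tag compression together with WAL compression. The text does not name the WAL compression switch; this example assumes it is `hbase.regionserver.wal.enablecompression`, so verify that property against your HBase version before relying on it.

[source,xml]
----
<!-- Assumed name of the WAL compression switch; tag compression only
     takes effect when WAL compression itself is enabled. -->
<property>
  <name>hbase.regionserver.wal.enablecompression</name>
  <value>true</value>
</property>
<!-- Dictionary-encode tags in the WAL (not supported with WAL encryption). -->
<property>
  <name>hbase.regionserver.wal.tags.enablecompression</name>
  <value>true</value>
</property>
----
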
767 [[hbase.accesscontrol.configuration]]
768 === Access Control Labels (ACLs)
772 ACLs in HBase are based upon a user's membership in or exclusion from groups, and a given group's permissions to access a given resource.
773 ACLs are implemented as a coprocessor called AccessController.
775 HBase does not maintain a private group mapping, but relies on a [firstterm]_Hadoop group mapper_, which maps between entities in a directory such as LDAP or Active Directory, and HBase users.
776 Any supported Hadoop group mapper will work.
777 Users are then granted specific permissions (Read, Write, Execute, Create, Admin) against resources (global, namespaces, tables, cells, or endpoints).
779 NOTE: With Kerberos and Access Control enabled, client access to HBase is authenticated and user data is private unless access has been explicitly granted.
781 HBase has a simpler security model than relational databases, especially in terms of client operations.
782 No distinction is made between an insert (new record) and update (of existing record), for example, as both collapse down into a Put.
784 ===== Understanding Access Levels
786 HBase access levels are granted independently of each other and allow for different types of operations at a given scope.
788 * _Read \(R)_ - can read data at the given scope
789 * _Write (W)_ - can write data at the given scope
790 * _Execute (X)_ - can execute coprocessor endpoints at the given scope
791 * _Create \(C)_ - can create tables or drop tables (even those they did not create) at the given scope
792 * _Admin (A)_ - can perform cluster operations such as balancing the cluster or assigning regions at the given scope
794 The possible scopes are:
796 * _Superuser_ - superusers can perform any operation available in HBase, to any resource.
797 The user who runs HBase on your cluster is a superuser, as are any principals assigned to the configuration property `hbase.superuser` in _hbase-site.xml_ on the HMaster.
798 * _Global_ - permissions granted at _global_ scope allow the admin to operate on all tables of the cluster.
799 * _Namespace_ - permissions granted at _namespace_ scope apply to all tables within a given namespace.
800 * _Table_ - permissions granted at _table_ scope apply to data or metadata within a given table.
801 * _ColumnFamily_ - permissions granted at _ColumnFamily_ scope apply to cells within that ColumnFamily.
802 * _Cell_ - permissions granted at _cell_ scope apply to that exact cell coordinate (key, value, timestamp). This allows for policy evolution along with data.
804 To change an ACL on a specific cell, write an updated cell with new ACL to the precise coordinates of the original.
806 If you have a multi-versioned schema and want to update ACLs on all visible versions, you need to write new cells for all visible versions.
807 The application has complete control over policy evolution.
809 The exception to the above rule is `append` and `increment` processing.
810 Appends and increments can carry an ACL in the operation.
811 If one is included in the operation, then it will be applied to the result of the `append` or `increment`.
812 Otherwise, the ACL of the existing cell you are appending to or incrementing is preserved.
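The rule for `append` and `increment` can be summarized in a few lines of Java. This is a sketch only: ACLs are modelled as a plain map from user name to permission string, and `AppendAclSketch` and `resultAcl` are hypothetical names, not HBase classes or methods.

```java
import java.util.Map;
import java.util.Optional;

/** Illustrative only: which ACL ends up on the result of an append or increment. */
final class AppendAclSketch {

  /**
   * Returns the ACL carried by the operation if there is one;
   * otherwise the ACL of the existing cell is preserved.
   */
  static Map<String, String> resultAcl(Optional<Map<String, String>> operationAcl,
      Map<String, String> existingCellAcl) {
    return operationAcl.orElse(existingCellAcl);
  }
}
```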
815 The combination of access levels and scopes creates a matrix of possible access levels that can be granted to a user.
816 In a production environment, it is useful to think of access levels in terms of what is needed to do a specific job.
817 The following list describes appropriate access levels for some common types of HBase users.
818 It is important not to grant more access than is required for a given user to perform their required tasks.
820 * _Superusers_ - In a production system, only the HBase user should have superuser access.
821 In a development environment, an administrator may need superuser access in order to quickly control and manage the cluster.
822 However, this type of administrator should usually be a Global Admin rather than a superuser.
823 * _Global Admins_ - A global admin can perform tasks and access every table in HBase.
824 In a typical production environment, an admin should not have Read or Write permissions to data within tables.
825 * A global admin with Admin permissions can perform cluster-wide operations on the cluster, such as balancing, assigning or unassigning regions, or calling an explicit major compaction.
826 This is an operations role.
827 * A global admin with Create permissions can create or drop any table within HBase.
828 This is more of a DBA-type role.
830 In a production environment, it is likely that different users will have only one of Admin and Create permissions.
In the current implementation, a Global Admin with `Admin` permission can grant themselves `Read` and `Write` permissions on a table and gain access to that table's data.
For this reason, only grant `Global Admin` permissions to trusted users who actually need them.
837 Also be aware that a `Global Admin` with `Create` permission can perform a `Put` operation on the ACL table, simulating a `grant` or `revoke` and circumventing the authorization check for `Global Admin` permissions.
839 Due to these issues, be cautious with granting `Global Admin` privileges.
842 * _Namespace Admins_ - a namespace admin with `Create` permissions can create or drop tables within that namespace, and take and restore snapshots.
843 A namespace admin with `Admin` permissions can perform operations such as splits or major compactions on tables within that namespace.
844 * _Table Admins_ - A table admin can perform administrative operations only on that table.
845 A table admin with `Create` permissions can create snapshots from that table or restore that table from a snapshot.
846 A table admin with `Admin` permissions can perform operations such as splits or major compactions on that table.
847 * _Users_ - Users can read or write data, or both.
Users can also execute coprocessor endpoints, if given `Execute` permissions.
850 .Real-World Example of Access Levels
851 [cols="1,1,1,1", options="header"]
858 | Senior Administrator
861 | Manages the cluster and gives access to Junior Administrators.
863 | Junior Administrator
866 | Creates tables and gives access to Table Administrators.
868 | Table Administrator
871 | Maintains a table from an operations point of view.
876 | Creates reports from HBase data.
881 | Puts data into HBase and uses HBase data to perform operations.
885 For more details on how ACLs map to specific HBase operations and tasks, see <<appendix_acl_matrix,appendix acl matrix>>.
887 ===== Implementation Details
889 Cell-level ACLs are implemented using tags (see <<hbase.tags>>). In order to use cell-level ACLs, you must be using HFile v3 and HBase 0.98 or newer.
891 . Files created by HBase are owned by the operating system user running the HBase process.
892 To interact with HBase files, you should use the API or bulk load facility.
. HBase does not model "roles" internally.
894 Instead, group names can be granted permissions.
895 This allows external modeling of roles via group membership.
896 Groups are created and manipulated externally to HBase, via the Hadoop group mapping service.
898 ===== Server-Side Configuration
900 . As a prerequisite, perform the steps in <<security.data.basic.server.side>>.
901 . Install and configure the AccessController coprocessor, by setting the following properties in _hbase-site.xml_.
902 These properties take a list of classes.
904 NOTE: If you use the AccessController along with the VisibilityController, the AccessController must come first in the list, because with both components active, the VisibilityController will delegate access control on its system tables to the AccessController.
905 For an example of using both together, see <<security.example.config>>.
910 <name>hbase.security.authorization</name>
914 <name>hbase.coprocessor.region.classes</name>
915 <value>org.apache.hadoop.hbase.security.access.AccessController, org.apache.hadoop.hbase.security.token.TokenProvider</value>
918 <name>hbase.coprocessor.master.classes</name>
919 <value>org.apache.hadoop.hbase.security.access.AccessController</value>
922 <name>hbase.coprocessor.regionserver.classes</name>
923 <value>org.apache.hadoop.hbase.security.access.AccessController</value>
926 <name>hbase.security.exec.permission.checks</name>
931 Optionally, you can enable transport security, by setting `hbase.rpc.protection` to `privacy`.
932 This requires HBase 0.98.4 or newer.
934 . Set up the Hadoop group mapper in the Hadoop namenode's _core-site.xml_.
935 This is a Hadoop file, not an HBase file.
936 Customize it to your site's needs.
937 Following is an example.
942 <name>hadoop.security.group.mapping</name>
943 <value>org.apache.hadoop.security.LdapGroupsMapping</value>
947 <name>hadoop.security.group.mapping.ldap.url</name>
948 <value>ldap://server</value>
952 <name>hadoop.security.group.mapping.ldap.bind.user</name>
953 <value>Administrator@example-ad.local</value>
957 <name>hadoop.security.group.mapping.ldap.bind.password</name>
962 <name>hadoop.security.group.mapping.ldap.base</name>
963 <value>dc=example-ad,dc=local</value>
967 <name>hadoop.security.group.mapping.ldap.search.filter.user</name>
<value>(&amp;(objectClass=user)(sAMAccountName={0}))</value>
972 <name>hadoop.security.group.mapping.ldap.search.filter.group</name>
973 <value>(objectClass=group)</value>
977 <name>hadoop.security.group.mapping.ldap.search.attr.member</name>
978 <value>member</value>
982 <name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>
986 . Optionally, enable the early-out evaluation strategy.
987 Prior to HBase 0.98.0, if a user was not granted access to a column family, or at least a column qualifier, an AccessDeniedException would be thrown.
988 HBase 0.98.0 removed this exception in order to allow cell-level exceptional grants.
989 To restore the old behavior in HBase 0.98.0-0.98.6, set `hbase.security.access.early_out` to `true` in _hbase-site.xml_.
990 In HBase 0.98.6, the default has been returned to `true`.
991 . Distribute your configuration and restart your cluster for changes to take effect.
992 . To test your configuration, log into HBase Shell as a given user and use the `whoami` command to report the groups your user is part of.
993 In this example, the user is reported as being a member of the `services` group.
997 service (auth:KERBEROS)
1002 ===== Administration
1004 Administration tasks can be performed from HBase Shell or via an API.
1009 Many of the API examples below are taken from source files _hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java_ and _hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/SecureTestUtil.java_.
1011 Neither the examples, nor the source files they are taken from, are part of the public HBase API, and are provided for illustration only.
1012 Refer to the official API for usage instructions.
1016 . User and Group Administration
1018 Users and groups are maintained external to HBase, in your directory.
1020 . Granting Access To A Namespace, Table, Column Family, or Cell
1022 There are a few different types of syntax for grant statements.
1023 The first, and most familiar, is as follows, with the table and column family being optional:
1027 grant 'user', 'RWXCA', 'TABLE', 'CF', 'CQ'
1030 Groups and users are granted access in the same way, but groups are prefixed with an `@` symbol.
Likewise, tables and namespaces are specified in the same way, but namespaces are prefixed with an `@` symbol.
1033 It is also possible to grant multiple permissions against the same resource in a single statement, as in this example.
1034 The first sub-clause maps users to ACLs and the second sub-clause specifies the resource.
1036 NOTE: HBase Shell support for granting and revoking access at the cell level is for testing and verification support, and should not be employed for production use because it won't apply the permissions to cells that don't exist yet.
1037 The correct way to apply cell level permissions is to do so in the application code when storing the values.
1039 .ACL Granularity and Evaluation Order
1040 ACLs are evaluated from least granular to most granular, and when an ACL is reached that grants permission, evaluation stops.
1041 This means that cell ACLs do not override ACLs at less granularity.
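The evaluation order described above can be sketched in plain Java. This is a simplified model under stated assumptions: all grants are taken to apply to the same resource path, permissions are one-letter strings, and the `AclEvaluationSketch`, `Scope`, and `Grant` names are hypothetical. It is not the AccessController implementation.

```java
import java.util.List;

/** Illustrative only: least-granular-first ACL evaluation that stops at the first grant. */
final class AclEvaluationSketch {

  /** Scopes declared from least granular to most granular. */
  enum Scope { GLOBAL, NAMESPACE, TABLE, COLUMN_FAMILY, CELL }

  /** One grant: a user holds a permission string such as "RW" at a scope. */
  static final class Grant {
    final Scope scope;
    final String user;
    final String permissions;

    Grant(Scope scope, String user, String permissions) {
      this.scope = scope;
      this.user = user;
      this.permissions = permissions;
    }
  }

  /**
   * Walks the scopes in declaration order and returns the first scope at which
   * the user holds the permission, or null if no scope grants it.
   */
  static Scope firstGrantingScope(List<Grant> grants, String user, char permission) {
    for (Scope scope : Scope.values()) {
      for (Grant grant : grants) {
        if (grant.scope == scope && grant.user.equals(user)
            && grant.permissions.indexOf(permission) >= 0) {
          return scope; // evaluation stops at the first ACL that grants permission
        }
      }
    }
    return null;
  }
}
```

In this model, a user with `RW` at table scope and only `R` at cell scope is still allowed to write: the table-level grant is reached first, so the narrower cell ACL never restricts it.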
1048 hbase> grant '@admins', 'RWXCA'
1054 hbase> grant 'service', 'RWXCA', '@test-NS'
1060 hbase> grant 'service', 'RWXCA', 'user'
1066 hbase> grant '@developers', 'RW', 'user', 'i'
hbase> grant 'service', 'RW', 'user', 'i', 'foo'
Granting cell ACLs uses the following syntax:
1081 { '<user-or-group>' => \
1082 '<permissions>', ... }, \
1083 { <scanner-specification> }
1086 * _<user-or-group>_ is the user or group name, prefixed with `@` in the case of a group.
1087 * _<permissions>_ is a string containing any or all of "RWXCA", though only R and W are meaningful at cell scope.
1088 * _<scanner-specification>_ is the scanner specification syntax and conventions used by the 'scan' shell command.
1089 For some examples of scanner specifications, issue the following HBase Shell command.
To enable cell ACLs, the `hfile.format.version` option in _hbase-site.xml_ must be greater than or equal to 3, and the `hbase.security.access.early_out` option must be set to `false`. This example grants read access to the 'testuser' user and read/write access to the 'developers' group, on cells in the 'pii' column which match the filter.
1099 hbase> grant 'user', \
1100 { '@developers' => 'RW', 'testuser' => 'R' }, \
1101 { COLUMNS => 'pii', FILTER => "(PrefixFilter ('test'))" }
1104 The shell will run a scanner with the given criteria, rewrite the found cells with new ACLs, and store them back to their exact coordinates.
1110 The following example shows how to grant access at the table level.
1114 public static void grantOnTable(final HBaseTestingUtility util, final String user,
1115 final TableName table, final byte[] family, final byte[] qualifier,
1116 final Permission.Action... actions) throws Exception {
1117 SecureTestUtil.updateACLs(util, new Callable<Void>() {
1119 public Void call() throws Exception {
1120 try (Connection connection = ConnectionFactory.createConnection(util.getConfiguration())) {
1121 connection.getAdmin().grant(new UserPermission(user, Permission.newBuilder(table)
1122 .withFamily(family).withQualifier(qualifier).withActions(actions).build()),
1131 To grant permissions at the cell level, you can use the `Mutation.setACL` method:
1135 Mutation.setACL(String user, Permission perms)
1136 Mutation.setACL(Map<String, Permission> perms)
1139 Specifically, this example provides read permission to a user called `user1` on any cells contained in a particular Put operation:
put.setACL("user1", new Permission(Permission.Action.READ));
1147 . Revoking Access Control From a Namespace, Table, Column Family, or Cell
1149 The `revoke` command and API are twins of the grant command and API, and the syntax is exactly the same.
1150 The only exception is that you cannot revoke permissions at the cell level.
1151 You can only revoke access that has previously been granted, and a `revoke` statement is not the same thing as explicit denial to a resource.
1153 NOTE: HBase Shell support for granting and revoking access is for testing and verification support, and should not be employed for production use because it won't apply the permissions to cells that don't exist yet.
1154 The correct way to apply cell-level permissions is to do so in the application code when storing the values.
1156 .Revoking Access To a Table
1160 public static void revokeFromTable(final HBaseTestingUtility util, final String user,
1161 final TableName table, final byte[] family, final byte[] qualifier,
1162 final Permission.Action... actions) throws Exception {
1163 SecureTestUtil.updateACLs(util, new Callable<Void>() {
1165 public Void call() throws Exception {
1166 try (Connection connection = ConnectionFactory.createConnection(util.getConfiguration())) {
1167 connection.getAdmin().revoke(new UserPermission(user, Permission.newBuilder(table)
1168 .withFamily(family).withQualifier(qualifier).withActions(actions).build()));
1177 . Showing a User's Effective Permissions
1181 hbase> user_permission 'user'
1183 hbase> user_permission '.*'
1185 hbase> user_permission JAVA_REGEX
1192 public static void verifyAllowed(User user, AccessTestAction action, int count) throws Exception {
1194 Object obj = user.runAs(action);
1195 if (obj != null && obj instanceof List<?>) {
1196 List<?> results = (List<?>) obj;
1197 if (results != null && results.isEmpty()) {
fail("Empty non null results from action for user '" + user.getShortName() + "'");
1200 assertEquals(count, results.size());
1202 } catch (AccessDeniedException ade) {
fail("Expected action to pass for user '" + user.getShortName() + "' but was denied");
1209 [[hbase.visibility.labels]]
1210 === Visibility Labels
Visibility label control can be used to permit only users or principals associated with a given label to read or access cells with that label.
1213 For instance, you might label a cell `top-secret`, and only grant access to that label to the `managers` group.
1214 Visibility labels are implemented using Tags, which are a feature of HFile v3, and allow you to store metadata on a per-cell basis.
1215 A label is a string, and labels can be combined into expressions by using logical operators (&, |, or !), and using parentheses for grouping.
1216 HBase does not do any kind of validation of expressions beyond basic well-formedness.
1217 Visibility labels have no meaning on their own, and may be used to denote sensitivity level, privilege level, or any other arbitrary semantic meaning.
1219 If a user's labels do not match a cell's label or expression, the user is denied access to the cell.
1221 In HBase 0.98.6 and newer, UTF-8 encoding is supported for visibility labels and expressions.
1222 When creating labels using the `addLabels(conf, labels)` method provided by the `org.apache.hadoop.hbase.security.visibility.VisibilityClient` class and passing labels in Authorizations via Scan or Get, labels can contain UTF-8 characters, as well as the logical operators normally used in visibility labels, with normal Java notations, without needing any escaping method.
1223 However, when you pass a CellVisibility expression via a Mutation, you must enclose the expression with the `CellVisibility.quote()` method if you use UTF-8 characters or logical operators.
1224 See `TestExpressionParser` and the source file _hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestScan.java_.
1226 A user adds visibility expressions to a cell during a Put operation.
1227 In the default configuration, the user does not need to have access to a label in order to label cells with it.
1228 This behavior is controlled by the configuration option `hbase.security.visibility.mutations.checkauths`.
1229 If you set this option to `true`, the labels the user is modifying as part of the mutation must be associated with the user, or the mutation will fail.
1230 Whether a user is authorized to read a labelled cell is determined during a Get or Scan, and results which the user is not allowed to read are filtered out.
1231 This incurs the same I/O penalty as if the results were returned, but reduces load on the network.
1233 Visibility labels can also be specified during Delete operations.
1234 For details about visibility labels and Deletes, see link:https://issues.apache.org/jira/browse/HBASE-10885[HBASE-10885].
1236 The user's effective label set is built in the RPC context when a request is first received by the RegionServer.
1237 The way that users are associated with labels is pluggable.
1238 The default plugin passes through labels specified in Authorizations added to the Get or Scan and checks those against the calling user's authenticated labels list.
1239 When the client passes labels for which the user is not authenticated, the default plugin drops them.
You can pass a subset of user authenticated labels via the `Get#setAuthorizations(Authorizations(String,...))` and `Scan#setAuthorizations(Authorizations(String,...))` methods.
1242 Groups can be granted visibility labels the same way as users. Groups are prefixed with an @ symbol. When checking visibility labels of a user, the server will include the visibility labels of the groups of which the user is a member, together with the user's own labels.
When the visibility labels for a user are retrieved using the API `VisibilityClient#getAuths` or the Shell command `get_auths`, only the labels added specifically for that user are returned, not the group-level labels.
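The difference between a user's own labels and the labels the server takes into account can be sketched as follows. This is a plain-Java model, not HBase code: `GroupLabelSketch` and its methods are hypothetical, and label assignments are held in an in-memory map in which group principals carry the `@` prefix.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: how user-level and group-level labels combine. */
final class GroupLabelSketch {

  /** Labels stored per principal; group principals are prefixed with '@'. */
  private final Map<String, Set<String>> authsByPrincipal;

  GroupLabelSketch(Map<String, Set<String>> authsByPrincipal) {
    this.authsByPrincipal = authsByPrincipal;
  }

  /** Models a per-user lookup: only labels set directly on the user. */
  Set<String> userOnlyLabels(String user) {
    return authsByPrincipal.getOrDefault(user, Set.of());
  }

  /** Models the server-side check: the user's own labels plus those of each group. */
  Set<String> effectiveLabels(String user, List<String> groups) {
    Set<String> labels = new LinkedHashSet<>(userOnlyLabels(user));
    for (String group : groups) {
      labels.addAll(authsByPrincipal.getOrDefault("@" + group, Set.of()));
    }
    return labels;
  }
}
```

For a user with the label `test` who belongs to a group holding `developer`, `effectiveLabels` contains both labels while `userOnlyLabels` contains only `test`.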
1245 Visibility label access checking is performed by the VisibilityController coprocessor.
1246 You can use interface `VisibilityLabelService` to provide a custom implementation and/or control the way that visibility labels are stored with cells.
1247 See the source file _hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithCustomVisLabService.java_ for one example.
1249 Visibility labels can be used in conjunction with ACLs.
NOTE: Labels have to be explicitly defined before they can be used in visibility expressions. See below for an example of how this can be done.
1253 NOTE: There is currently no way to determine which labels have been applied to a cell. See link:https://issues.apache.org/jira/browse/HBASE-12470[HBASE-12470] for details.
1255 NOTE: Visibility labels are not currently applied for superusers.
1257 .Examples of Visibility Expressions
1258 [cols="l,1", options="header"]
1264 | Allow access to users associated with the fulltime label.
1267 | Allow access to users not associated with the public label.
1269 | ( secret \| topsecret ) & !probationary
1270 | Allow access to users associated with either the secret or topsecret label and not associated with the probationary label.
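To make the semantics of these expressions concrete, the following is a small recursive-descent evaluator written in plain Java. It is a sketch under stated assumptions, not HBase's own expression parser: the `VisibilityExpressionSketch` class is hypothetical, labels are assumed to contain no spaces or operator characters, and the precedence used here (`!` binds tightest, then `&`, then `|`) is an assumption of the sketch.

```java
import java.util.Set;

/** Illustrative only: evaluates a visibility expression against a user's labels. */
final class VisibilityExpressionSketch {
  private final String text;
  private final Set<String> userLabels;
  private int pos;

  private VisibilityExpressionSketch(String text, Set<String> userLabels) {
    this.text = text;
    this.userLabels = userLabels;
  }

  /** Returns true if a user holding userLabels satisfies the expression. */
  static boolean permits(String expression, Set<String> userLabels) {
    VisibilityExpressionSketch parser = new VisibilityExpressionSketch(expression, userLabels);
    boolean result = parser.parseOr();
    parser.skipSpaces();
    if (parser.pos != parser.text.length()) {
      throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
    }
    return result;
  }

  // or := and ('|' and)*
  private boolean parseOr() {
    boolean value = parseAnd();
    while (consume('|')) {
      boolean right = parseAnd(); // always parse the right side to advance the cursor
      value = value || right;
    }
    return value;
  }

  // and := not ('&' not)*
  private boolean parseAnd() {
    boolean value = parseNot();
    while (consume('&')) {
      boolean right = parseNot();
      value = value && right;
    }
    return value;
  }

  // not := '!' not | '(' or ')' | label
  private boolean parseNot() {
    if (consume('!')) {
      return !parseNot();
    }
    if (consume('(')) {
      boolean value = parseOr();
      if (!consume(')')) {
        throw new IllegalArgumentException("Missing ')' at position " + pos);
      }
      return value;
    }
    return userLabels.contains(parseLabel());
  }

  /** Reads a label: a run of characters up to the next operator, parenthesis, or space. */
  private String parseLabel() {
    skipSpaces();
    int start = pos;
    while (pos < text.length() && "&|!() ".indexOf(text.charAt(pos)) < 0) {
      pos++;
    }
    if (start == pos) {
      throw new IllegalArgumentException("Expected a label at position " + pos);
    }
    return text.substring(start, pos);
  }

  private boolean consume(char expected) {
    skipSpaces();
    if (pos < text.length() && text.charAt(pos) == expected) {
      pos++;
      return true;
    }
    return false;
  }

  private void skipSpaces() {
    while (pos < text.length() && text.charAt(pos) == ' ') {
      pos++;
    }
  }
}
```

With this model, a user holding only `secret` satisfies `( secret | topsecret ) & !probationary`, while a user holding both `secret` and `probationary` does not.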
1273 ==== Server-Side Configuration
1276 . As a prerequisite, perform the steps in <<security.data.basic.server.side>>.
1277 . Install and configure the VisibilityController coprocessor by setting the following properties in _hbase-site.xml_.
1278 These properties take a list of class names.
1283 <name>hbase.security.authorization</name>
1287 <name>hbase.coprocessor.region.classes</name>
1288 <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
1291 <name>hbase.coprocessor.master.classes</name>
1292 <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
1296 NOTE: If you use the AccessController and VisibilityController coprocessors together, the AccessController must come first in the list, because with both components active, the VisibilityController will delegate access control on its system tables to the AccessController.
1298 . Adjust Configuration
By default, users can label cells with any label, including labels they are not associated with, which means that a user can Put data that they cannot read.
1301 For example, a user could label a cell with the (hypothetical) 'topsecret' label even if the user is not associated with that label.
1302 If you only want users to be able to label cells with labels they are associated with, set `hbase.security.visibility.mutations.checkauths` to `true`.
1303 In that case, the mutation will fail if it makes use of labels the user is not associated with.
1305 . Distribute your configuration and restart your cluster for changes to take effect.
1309 Administration tasks can be performed using the HBase Shell or the Java API.
1310 For defining the list of visibility labels and associating labels with users, the HBase Shell is probably simpler.
1315 Many of the Java API examples in this section are taken from the source file _hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabels.java_.
1316 Refer to that file or the API documentation for more context.
1318 Neither these examples, nor the source file they were taken from, are part of the public HBase API, and are provided for illustration only.
1319 Refer to the official API for usage instructions.
1323 . Define the List of Visibility Labels
1327 hbase> add_labels [ 'admin', 'service', 'developer', 'test' ]
1334 public static void addLabels() throws Exception {
1335 PrivilegedExceptionAction<VisibilityLabelsResponse> action = new PrivilegedExceptionAction<VisibilityLabelsResponse>() {
1336 public VisibilityLabelsResponse run() throws Exception {
1337 String[] labels = { SECRET, TOPSECRET, CONFIDENTIAL, PUBLIC, PRIVATE, COPYRIGHT, ACCENT,
1338 UNICODE_VIS_TAG, UC1, UC2 };
1340 VisibilityClient.addLabels(conf, labels);
1341 } catch (Throwable t) {
1342 throw new IOException(t);
1347 SUPERUSER.runAs(action);
1352 . Associate Labels with Users
1356 hbase> set_auths 'service', [ 'service' ]
1360 hbase> set_auths 'testuser', [ 'test' ]
1364 hbase> set_auths 'qa', [ 'test', 'developer' ]
1368 hbase> set_auths '@qagroup', [ 'test' ]
1375 public void testSetAndGetUserAuths() throws Throwable {
1376 final String user = "user1";
1377 PrivilegedExceptionAction<Void> action = new PrivilegedExceptionAction<Void>() {
1378 public Void run() throws Exception {
1379 String[] auths = { SECRET, CONFIDENTIAL };
1381 VisibilityClient.setAuths(conf, auths, user);
1382 } catch (Throwable e) {
1390 . Clear Labels From Users
1394 hbase> clear_auths 'service', [ 'service' ]
1398 hbase> clear_auths 'testuser', [ 'test' ]
1402 hbase> clear_auths 'qa', [ 'test', 'developer' ]
1406 hbase> clear_auths '@qagroup', [ 'test', 'developer' ]
1414 auths = new String[] { SECRET, PUBLIC, CONFIDENTIAL };
1415 VisibilityLabelsResponse response = null;
1417 response = VisibilityClient.clearAuths(conf, auths, user);
1418 } catch (Throwable e) {
1419 fail("Should not have failed");
1425 . Apply a Label or Expression to a Cell
1427 The label is only applied when data is written.
1428 The label is associated with a given version of the cell.
1432 hbase> set_visibility 'user', 'admin|service|developer', { COLUMNS => 'i' }
1436 hbase> set_visibility 'user', 'admin|service', { COLUMNS => 'pii' }
1440 hbase> set_visibility 'user', 'test', { COLUMNS => [ 'i', 'pii' ], FILTER => "(PrefixFilter ('test'))" }
1443 NOTE: HBase Shell support for applying labels or permissions to cells is for testing and verification support, and should not be employed for production use because it won't apply the labels to cells that don't exist yet.
1444 The correct way to apply cell level labels is to do so in the application code when storing the values.
1450 static Table createTableAndWriteDataWithLabels(TableName tableName, String... labelExps)
1452 Configuration conf = HBaseConfiguration.create();
1453 Connection connection = ConnectionFactory.createConnection(conf);
1456 table = TEST_UTIL.createTable(tableName, fam);
1458 List<Put> puts = new ArrayList<Put>();
1459 for (String labelExp : labelExps) {
1460 Put put = new Put(Bytes.toBytes("row" + i));
1461 put.add(fam, qual, HConstants.LATEST_TIMESTAMP, value);
1462 put.setCellVisibility(new CellVisibility(labelExp));
1468 if (table != null) {
1469 table.flushCommits();
1475 [[reading_cells_with_labels]]
1476 ==== Reading Cells with Labels
1478 When you issue a Scan or Get, HBase uses your default set of authorizations to
1479 filter out cells that you do not have access to. A superuser can set the default
set of authorizations for a given user by using the `set_auths` HBase Shell command or the
1482 link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/security/visibility/VisibilityClient.html#setAuths-org.apache.hadoop.hbase.client.Connection-java.lang.String:A-java.lang.String-[VisibilityClient.setAuths()] method.
1484 You can specify a different authorization during the Scan or Get, by passing the
1485 AUTHORIZATIONS option in HBase Shell, or the
1486 link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setAuthorizations-org.apache.hadoop.hbase.security.visibility.Authorizations-[Scan.setAuthorizations()]
1487 method if you use the API. This authorization will be combined with your default
1488 set as an additional filter. It will further filter your results, rather than
1489 giving you additional authorization.
1493 hbase> get_auths 'myUser'
1494 hbase> scan 'table1', AUTHORIZATIONS => ['private']
1502 public Void run() throws Exception {
1503 String[] auths1 = { SECRET, CONFIDENTIAL };
1504 GetAuthsResponse authsResponse = null;
1506 VisibilityClient.setAuths(conf, auths1, user);
1508 authsResponse = VisibilityClient.getAuths(conf, user);
1509 } catch (Throwable e) {
1510 fail("Should not have failed");
1512 } catch (Throwable e) {
1514 List<String> authsList = new ArrayList<String>();
1515 for (ByteString authBS : authsResponse.getAuthList()) {
1516 authsList.add(Bytes.toString(authBS.toByteArray()));
1518 assertEquals(2, authsList.size());
1519 assertTrue(authsList.contains(SECRET));
1520 assertTrue(authsList.contains(CONFIDENTIAL));
1529 ==== Implementing Your Own Visibility Label Algorithm
1531 Interpreting the labels authenticated for a given get/scan request is a pluggable algorithm.
You can specify a custom plugin or plugins by using the property `hbase.regionserver.scan.visibility.label.generator.class`. The output of the first `ScanLabelGenerator` will be the input for the next one, until the end of the list.
1535 The default implementation, which was implemented in link:https://issues.apache.org/jira/browse/HBASE-12466[HBASE-12466], loads two plugins, `FeedUserAuthScanLabelGenerator` and `DefinedSetFilterScanLabelGenerator`. See <<reading_cells_with_labels>>.
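The chaining of generators can be sketched in plain Java. This is not the HBase `ScanLabelGenerator` interface: each stage is modelled as a simple function from a label list to a label list, and the two stages shown are hypothetical stand-ins whose behavior is an assumption based on the descriptions in this chapter (fall back to the user's own labels when none are requested, and drop labels the user is not associated with).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Illustrative only: a pipeline in which each stage refines the previous stage's labels. */
final class LabelGeneratorChainSketch {

  /** Runs each generator in order, feeding the output of one into the next. */
  static List<String> run(List<UnaryOperator<List<String>>> generators, List<String> requested) {
    List<String> labels = requested;
    for (UnaryOperator<List<String>> generator : generators) {
      labels = generator.apply(labels);
    }
    return labels;
  }

  /** Hypothetical stage: substitutes the user's own labels when the request carries none. */
  static UnaryOperator<List<String>> feedUserAuths(Set<String> userLabels) {
    return labels -> labels.isEmpty() ? new ArrayList<>(userLabels) : labels;
  }

  /** Hypothetical stage: drops any label the user is not associated with. */
  static UnaryOperator<List<String>> filterToDefinedSet(Set<String> userLabels) {
    return labels -> {
      List<String> kept = new ArrayList<>();
      for (String label : labels) {
        if (userLabels.contains(label)) {
          kept.add(label);
        }
      }
      return kept;
    };
  }
}
```

Running both stages for a user associated with `secret` and `confidential`, a request for `secret` and `topsecret` is reduced to `secret`, and a request with no authorizations falls back to the user's full label set.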
1537 ==== Replicating Visibility Tags as Strings
As mentioned in the sections above, the `VisibilityLabelService` interface can be used to implement a different way of storing visibility expressions in cells. Clusters with replication enabled must also replicate the visibility expressions to the peer cluster. If `DefaultVisibilityLabelServiceImpl` is used as the implementation of `VisibilityLabelService`, all visibility expressions are converted to corresponding expressions based on the ordinal of each visibility label stored in the `labels` table. During replication, cells are also replicated with the ordinal-based expression intact. The peer cluster may not have a `labels` table with the same ordinal mapping for the visibility labels, in which case the replicated ordinals are meaningless on the peer. It is better to transmit the visibility expressions as strings. To replicate visibility expressions as strings to the peer cluster, create a `RegionServerObserver` configuration which works based on the implementation of the `VisibilityLabelService` interface. The configuration below enables replication of visibility expressions to peer clusters as strings. See link:https://issues.apache.org/jira/browse/HBASE-11639[HBASE-11639] for more details.
1544 <name>hbase.security.authorization</name>
1548 <name>hbase.coprocessor.regionserver.classes</name>
1549 <value>org.apache.hadoop.hbase.security.visibility.VisibilityController$VisibilityReplication</value>
1553 [[hbase.encryption.server]]
1554 === Transparent Encryption of Data At Rest
1556 HBase provides a mechanism for protecting your data at rest, in HFiles and the WAL, which reside within HDFS or another distributed filesystem.
1557 A two-tier architecture is used for flexible and non-intrusive key rotation.
1558 "Transparent" means that no implementation changes are needed on the client side.
1559 When data is written, it is encrypted.
1560 When it is read, it is decrypted on demand.
1564 The administrator provisions a master key for the cluster, which is stored in a key provider accessible to every trusted HBase process, including the HMaster, RegionServers, and clients (such as HBase Shell) on administrative workstations.
1565 The default key provider is integrated with the Java KeyStore API and any key management systems with support for it.
1566 Other custom key provider implementations are possible.
1567 The key retrieval mechanism is configured in the _hbase-site.xml_ configuration file.
1568 The master key may be stored on the cluster servers, protected by a secure KeyStore file, or on an external keyserver, or in a hardware security module.
1569 This master key is resolved as needed by HBase processes through the configured key provider.
1571 Next, encryption use can be specified in the schema, per column family, by creating or modifying a column descriptor to include two additional attributes: the name of the encryption algorithm to use (currently only "AES" is supported), and optionally, a data key wrapped (encrypted) with the cluster master key.
1572 If a data key is not explicitly configured for a ColumnFamily, HBase will create a random data key per HFile.
This provides an incremental improvement in security over using a single explicitly configured data key for every HFile in the ColumnFamily.
1574 Unless you need to supply an explicit data key, such as in a case where you are generating encrypted HFiles for bulk import with a given data key, only specify the encryption algorithm in the ColumnFamily schema metadata and let HBase create data keys on demand.
1575 Per Column Family keys facilitate low impact incremental key rotation and reduce the scope of any external leak of key material.
1576 The wrapped data key is stored in the ColumnFamily schema metadata, and in each HFile for the Column Family, encrypted with the cluster master key.
1577 After the Column Family is configured for encryption, any new HFiles will be written encrypted.
1578 To ensure encryption of all HFiles, trigger a major compaction after enabling this feature.
1580 When the HFile is opened, the data key is extracted from the HFile, decrypted with the cluster master key, and used for decryption of the remainder of the HFile.
1581 The HFile will be unreadable if the master key is not available.
1582 If a remote user somehow acquires access to the HFile data because of some lapse in HDFS permissions, or from inappropriately discarded media, it will not be possible to decrypt either the data key or the file data.
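
The two-tier scheme described above can be sketched with the standard JCE API. This is a minimal illustration, not HBase's implementation: it assumes the JDK's `AESWrap` transformation as a stand-in for HBase's own key-wrapping format, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

class TwoTierKeySketch {

    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // Tier 1: the master key wraps (encrypts) the data key.
    static byte[] wrap(SecretKey master, SecretKey dataKey) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AESWrap"); // stand-in, not HBase's format
        c.init(Cipher.WRAP_MODE, master);
        return c.wrap(dataKey);
    }

    static SecretKey unwrap(SecretKey master, byte[] wrapped) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.UNWRAP_MODE, master);
        return (SecretKey) c.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    // Tier 2: the data key encrypts or decrypts the file contents.
    static byte[] crypt(int mode, SecretKey dataKey, byte[] iv, byte[] input)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(mode, dataKey, new IvParameterSpec(iv));
        return c.doFinal(input);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey master = newAesKey();
        SecretKey dataKey = newAesKey();
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);

        // What would be stored: the wrapped data key and the encrypted body.
        byte[] wrapped = wrap(master, dataKey);
        byte[] body = crypt(Cipher.ENCRYPT_MODE, dataKey, iv,
                "row1:value".getBytes(StandardCharsets.UTF_8));

        // Reading: unwrap the data key with the master key, then decrypt.
        SecretKey recovered = unwrap(master, wrapped);
        byte[] plain = crypt(Cipher.DECRYPT_MODE, recovered, iv, body);
        System.out.println(new String(plain, StandardCharsets.UTF_8));

        // Without the right master key the data key cannot be recovered.
        try {
            unwrap(newAesKey(), wrapped);
        } catch (GeneralSecurityException e) {
            System.out.println("unwrap failed without the master key");
        }
    }
}
```

Because only the small wrapped key depends on the master key, rotating the master key in a scheme like this means re-wrapping data keys rather than re-encrypting file contents.
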
1584 It is also possible to encrypt the WAL.
1585 Even though WALs are transient, it is necessary to encrypt the WALEdits to avoid circumventing HFile protections for encrypted column families, in the event that the underlying filesystem is compromised.
1586 When WAL encryption is enabled, all WALs are encrypted, regardless of whether the relevant HFiles are encrypted.
==== Enable or Disable the Feature
The "Transparent Encryption of Data At Rest" feature is enabled by default, meaning that users can
define tables with column families whose HFiles and WAL files are encrypted by HBase,
assuming the feature is properly configured (see <<hbase.encryption.server.configuration>>).

In some cases (e.g. due to custom security policies), the operator of the HBase cluster might wish
to rely only on an encryption at rest mechanism outside of HBase (e.g. those offered by HDFS) and
want to ensure that HBase's encryption at rest system is inactive. Since
link:https://issues.apache.org/jira/browse/HBASE-25181[HBASE-25181], it is possible to explicitly
disable HBase's own encryption by setting `hbase.crypto.enabled` to `false`. This configuration is
`true` by default. If it is set to `false`, users cannot create any table
(column family) with HFile and WAL file encryption, and the related create table shell (or API)
commands fail.
1604 [[hbase.encryption.server.configuration]]
1605 ==== Server-Side Configuration
1607 This procedure assumes you are using the default Java keystore implementation.
1608 If you are using a custom implementation, check its documentation and adjust accordingly.
. Create a secret key of appropriate length for AES encryption, using the `keytool` utility.
+
[source,bash]
----
$ keytool -keystore /path/to/hbase/conf/hbase.jks \
  -storetype jceks -storepass **** \
  -genseckey -keyalg AES -keysize 128 \
  -alias <alias>
----
+
Replace [replaceable]_****_ with the password for the keystore file and <alias> with the username of the HBase service account, or an arbitrary string.
If you use an arbitrary string, you must configure HBase to use it, as described below.
Specify an appropriate key size; the example uses 128 bits.
Do not specify a separate password for the key; press kbd:[Return] when prompted.
. Set appropriate permissions on the keyfile and distribute it to all the HBase servers.
+
The previous command created a file called _hbase.jks_ in the HBase _conf/_ directory.
Set the permissions and ownership on this file so that only the HBase service account user can read it, and securely distribute the file to all HBase servers.
. Configure the HBase daemons.
+
Set the following properties in _hbase-site.xml_ on the region servers to configure the HBase daemons to use a key provider backed by the KeyStore file for retrieving the cluster master key.
In the example below, replace [replaceable]_****_ with the password.
+
[source,xml]
----
<property>
  <name>hbase.crypto.keyprovider</name>
  <value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
</property>
<property>
  <name>hbase.crypto.keyprovider.parameters</name>
  <value>jceks:///path/to/hbase/conf/hbase.jks?password=****</value>
</property>
----
+
By default, the HBase service account name is used to resolve the cluster master key.
However, you can store the key under an arbitrary alias (in the `keytool` command). In that case, set the following property to the alias you used.
+
[source,xml]
----
<property>
  <name>hbase.crypto.master.key.name</name>
  <value>my-alias</value>
</property>
----
+
Your HFiles must also use HFile v3 in order to use transparent encryption.
This is the default configuration for HBase 1.0 onward.
For previous versions, set the following property in your _hbase-site.xml_ file.
+
[source,xml]
----
<property>
  <name>hfile.format.version</name>
  <value>3</value>
</property>
----
+
Optionally, you can use a different cipher provider, either a Java Cryptography Extension (JCE) algorithm provider or a custom HBase cipher implementation.
+
* JCE:
** Install a signed JCE provider (supporting `AES/CTR/NoPadding` mode with 128-bit keys).
** Add it with highest preference to the JCE site configuration file _$JAVA_HOME/lib/security/java.security_.
** Update the `hbase.crypto.algorithm.aes.provider` and `hbase.crypto.algorithm.rng.provider` options in _hbase-site.xml_.
* Custom HBase Cipher:
** Implement `org.apache.hadoop.hbase.io.crypto.CipherProvider`.
** Add the implementation to the server classpath.
** Update `hbase.crypto.cipherprovider` in _hbase-site.xml_.
. Configure WAL encryption.
+
Configure WAL encryption in every RegionServer's _hbase-site.xml_ by setting the following properties.
You can include these in the HMaster's _hbase-site.xml_ as well, but the HMaster does not have a WAL and will not use them.
+
[source,xml]
----
<property>
  <name>hbase.regionserver.hlog.reader.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
</property>
<property>
  <name>hbase.regionserver.hlog.writer.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
</property>
<property>
  <name>hbase.regionserver.wal.encryption</name>
  <value>true</value>
</property>
----
. (Optional) Configure the encryption key hash algorithm.
+
Since link:https://issues.apache.org/jira/browse/HBASE-25181[HBASE-25181], it is possible to use
a custom encryption key hash algorithm instead of the default MD5 algorithm. This hash is needed to
verify the secret key during decryption. The MD5 algorithm is considered weak and cannot be used
in some clusters (e.g. FIPS-compliant ones).
+
The hash is set via the configuration option `hbase.crypto.key.hash.algorithm`. It should be set to
a JDK `MessageDigest` algorithm such as "MD5", "SHA-384" or "SHA-512". The default is "MD5" for
backward compatibility. An example of this configuration parameter on a FIPS-compliant cluster:
+
[source,xml]
----
<property>
  <name>hbase.crypto.key.hash.algorithm</name>
  <value>SHA-384</value>
</property>
----
. Configure permissions on the _hbase-site.xml_ file.
+
Because the keystore password is stored in _hbase-site.xml_, ensure that only the HBase user can read the file, using file ownership and permissions.
. Restart your cluster.
+
Distribute the new configuration file to all nodes and restart your cluster.
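
To illustrate what the keystore and key hash settings in this procedure refer to, the following sketch uses only the standard Java `KeyStore` and `MessageDigest` APIs. It is not HBase's `KeyStoreKeyProvider`: it assumes an in-memory JCEKS keystore instead of the _hbase.jks_ file, a placeholder password, and a plain digest of the key bytes, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.KeyStore;
import java.security.MessageDigest;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

class KeyStoreRoundTripSketch {

    // Build a JCEKS keystore holding one secret key under the given alias.
    // The key is protected with the store password, mirroring the advice
    // not to set a separate key password.
    static byte[] createStore(String alias, char[] password, SecretKey key) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, password); // initialise an empty keystore
        ks.setEntry(alias, new KeyStore.SecretKeyEntry(key),
                new KeyStore.PasswordProtection(password));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ks.store(out, password);
        return out.toByteArray();
    }

    // Resolve the key by alias, as a key provider would do on demand.
    static SecretKey loadKey(byte[] store, String alias, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(new ByteArrayInputStream(store), password);
        return (SecretKey) ks.getKey(alias, password);
    }

    // Hex digest of the key bytes using a JDK MessageDigest algorithm name.
    static String keyHash(SecretKey key, String algorithm) throws Exception {
        byte[] digest = MessageDigest.getInstance(algorithm).digest(key.getEncoded());
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        char[] password = "changeit".toCharArray(); // placeholder password
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        SecretKey master = gen.generateKey();

        byte[] store = createStore("hbase", password, master);
        SecretKey loaded = loadKey(store, "hbase", password);
        System.out.println("key length (bits): " + loaded.getEncoded().length * 8);
        System.out.println("SHA-384 of key:    " + keyHash(loaded, "SHA-384"));
    }
}
```

A sketch like this can also serve as a quick check that the JDK in use supports the JCEKS store type and the digest algorithm you intend to configure.
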
1737 Administrative tasks can be performed in HBase Shell or the Java API.
1742 Java API examples in this section are taken from the source file _hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckEncryption.java_.
1745 Neither these examples, nor the source files they are taken from, are part of the public HBase API, and are provided for illustration only.
1746 Refer to the official API for usage instructions.
1749 Enable Encryption on a Column Family::
1750 To enable encryption on a column family, you can either use HBase Shell or the Java API.
1751 After enabling encryption, trigger a major compaction.
1752 When the major compaction completes, the compacted new HFiles will be encrypted.
However, depending on the compaction settings, a major compaction might not rewrite all the
HFiles, and some old unencrypted HFiles might remain.
Also note that snapshots are immutable, so snapshots taken before you enabled
encryption will still contain the unencrypted HFiles.
1758 Rotate the Data Key::
1759 To rotate the data key, first change the ColumnFamily key in the column descriptor, then trigger a major compaction.
1760 Until the compaction completes, the old HFiles will still be readable using the old key.
1761 During compaction, the compacted HFiles will be re-encrypted using the new data key.
However, depending on the compaction settings, a major compaction might not rewrite all the
HFiles, and some old HFiles encrypted with the old key might remain.
Also note that snapshots are immutable, so snapshots taken before you changed
the encryption key will still contain the HFiles written using the old key.
1767 Switching Between Using a Random Data Key and Specifying A Key::
1768 If you configured a column family to use a specific key and you want to return to the default behavior of using a randomly-generated key for that column family, use the Java API to alter the `HColumnDescriptor` so that no value is sent with the key `ENCRYPTION_KEY`.
1770 Rotate the Master Key::
1771 To rotate the master key, first generate and distribute the new key.
1772 Then update the KeyStore to contain a new master key, and keep the old master key in the KeyStore using a different alias.
1773 Next, configure fallback to the old master key in the _hbase-site.xml_ file.
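
The fallback idea can be sketched as follows. This is an illustration of the concept, not HBase's code: it assumes the JDK's `AESWrap` transformation as a stand-in for HBase's key-wrapping format, and the class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

class MasterKeyFallbackSketch {

    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    static byte[] wrap(SecretKey master, SecretKey dataKey) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AESWrap"); // stand-in, not HBase's format
        c.init(Cipher.WRAP_MODE, master);
        return c.wrap(dataKey);
    }

    static SecretKey unwrap(SecretKey master, byte[] wrapped) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.UNWRAP_MODE, master);
        return (SecretKey) c.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    // Try the current master key first; if it cannot unwrap the data key,
    // fall back to the old (alternate) master key.
    static SecretKey unwrapWithFallback(byte[] wrapped, SecretKey current, SecretKey alternate)
            throws GeneralSecurityException {
        try {
            return unwrap(current, wrapped);
        } catch (GeneralSecurityException e) {
            return unwrap(alternate, wrapped);
        }
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey oldMaster = newAesKey();
        SecretKey newMaster = newAesKey();
        SecretKey dataKey = newAesKey();

        // A data key wrapped before the rotation is still readable afterwards,
        // as long as the old master key is kept as the alternate.
        byte[] wrappedBeforeRotation = wrap(oldMaster, dataKey);
        SecretKey recovered = unwrapWithFallback(wrappedBeforeRotation, newMaster, oldMaster);
        System.out.println("recovered matches: "
                + Arrays.equals(recovered.getEncoded(), dataKey.getEncoded()));
    }
}
```

This is why the old master key has to stay in the KeyStore under a different alias: files written before the rotation still carry data keys wrapped with it.
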
1776 [[hbase.secure.bulkload]]
1777 === Secure Bulk Load
1779 Bulk loading in secure mode is a bit more involved than normal setup, since the client has to transfer the ownership of the files generated from the MapReduce job to HBase.
1780 Secure bulk loading is implemented by a coprocessor, named
1781 link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/security/access/SecureBulkLoadEndpoint.html[SecureBulkLoadEndpoint],
1782 which uses a staging directory configured by the configuration property `hbase.bulkload.staging.dir`, which defaults to
1783 _/tmp/hbase-staging/_.
1785 .Secure Bulk Load Algorithm
1787 * One time only, create a staging directory which is world-traversable and owned by the user which runs HBase (mode 711, or `rwx--x--x`). A listing of this directory will look similar to the following:
+
----
$ ls -ld /tmp/hbase-staging
drwx--x--x  2 hbase  hbase  68  3 Sep 14:54 /tmp/hbase-staging
----
1795 * A user writes out data to a secure output directory owned by that user.
1796 For example, _/user/foo/data_.
* Internally, HBase creates a secret staging directory which is globally readable/writable (mode 777, or `rwxrwxrwx`). For example, _/tmp/hbase-staging/averylongandrandomdirectoryname_.
The name and location of this directory are not exposed to the user.
1799 HBase manages creation and deletion of this directory.
1800 * The user makes the data world-readable and world-writable, moves it into the random staging directory, then calls the `SecureBulkLoadClient#bulkLoadHFiles` method.
1802 The strength of the security lies in the length and randomness of the secret directory.
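
As a rough illustration of what a long, random directory name means in practice, the sketch below derives a name from a cryptographically strong random number. It is not HBase's implementation: the 320-bit width, the base-32 encoding, and the class and method names are assumptions chosen for the example.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

class StagingDirNameSketch {

    // Encode `bits` random bits as a base-32 string (digits 0-9 and a-v).
    static String randomName(SecureRandom random, int bits) {
        return new BigInteger(bits, random).toString(32);
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        // 320 bits is an assumed width; a wider value makes guessing harder.
        String name = randomName(random, 320);
        System.out.println("/tmp/hbase-staging/" + name);
    }
}
```

Because the parent directory is traversable but not listable by other users (mode 711), a directory named this way can only be reached by someone who already knows the name.
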
1804 To enable secure bulk load, add the following properties to _hbase-site.xml_.

[source,xml]
----
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.bulkload.staging.dir</name>
  <value>/tmp/hbase-staging</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,
  org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint</value>
</property>
----

1823 [[hbase.secure.enable]]
The default value of `hbase.security.authorization` changed in HBase 2.x.
Before HBase 2.x it defaulted to `true`; in later HBase versions the
default is `false`.
To enable HBase authorization, the following property must be configured in _hbase-site.xml_.
See link:https://issues.apache.org/jira/browse/HBASE-19483[HBASE-19483].

[source,xml]
----
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
----

1839 [[security.example.config]]
1840 == Security Configuration Example
1842 This configuration example includes support for HFile v3, ACLs, Visibility Labels, and transparent encryption of data at rest and the WAL.
1843 All options have been discussed separately in the sections above.
1845 .Example Security Settings in _hbase-site.xml_
[source,xml]
----
<!-- HFile v3 Support -->
<property>
  <name>hfile.format.version</name>
  <value>3</value>
</property>
<!-- HBase Superuser -->
<property>
  <name>hbase.superuser</name>
  <value>hbase,admin</value>
</property>
<!-- Coprocessors for ACLs and Visibility Tags -->
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController,
  org.apache.hadoop.hbase.security.visibility.VisibilityController,
  org.apache.hadoop.hbase.security.token.TokenProvider</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController,
  org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
</property>
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<!-- Executable ACL for Coprocessor Endpoints -->
<property>
  <name>hbase.security.exec.permission.checks</name>
  <value>true</value>
</property>
<!-- Whether a user needs authorization for a visibility tag to set it on a cell -->
<property>
  <name>hbase.security.visibility.mutations.checkauth</name>
  <value>false</value>
</property>
<!-- Secure RPC Transport -->
<property>
  <name>hbase.rpc.protection</name>
  <value>privacy</value>
</property>
<!-- Transparent Encryption -->
<property>
  <name>hbase.crypto.keyprovider</name>
  <value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
</property>
<property>
  <name>hbase.crypto.keyprovider.parameters</name>
  <value>jceks:///path/to/hbase/conf/hbase.jks?password=***</value>
</property>
<property>
  <name>hbase.crypto.master.key.name</name>
  <value>hbase</value>
</property>
<!-- WAL Encryption -->
<property>
  <name>hbase.regionserver.hlog.reader.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
</property>
<property>
  <name>hbase.regionserver.hlog.writer.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
</property>
<property>
  <name>hbase.regionserver.wal.encryption</name>
  <value>true</value>
</property>
<!-- For key rotation -->
<property>
  <name>hbase.crypto.master.alternate.key.name</name>
  <value>hbase.old</value>
</property>
<!-- Secure Bulk Load -->
<property>
  <name>hbase.bulkload.staging.dir</name>
  <value>/tmp/hbase-staging</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,
  org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint</value>
</property>
----

1938 .Example Group Mapper in Hadoop _core-site.xml_
1940 Adjust these settings to suit your environment.
1945 <name>hadoop.security.group.mapping</name>
1946 <value>org.apache.hadoop.security.LdapGroupsMapping</value>
1949 <name>hadoop.security.group.mapping.ldap.url</name>
1950 <value>ldap://server</value>
1953 <name>hadoop.security.group.mapping.ldap.bind.user</name>
1954 <value>Administrator@example-ad.local</value>
1957 <name>hadoop.security.group.mapping.ldap.bind.password</name>
1958 <value>****</value> <!-- Replace with the actual password -->
1961 <name>hadoop.security.group.mapping.ldap.base</name>
1962 <value>dc=example-ad,dc=local</value>
1965 <name>hadoop.security.group.mapping.ldap.search.filter.user</name>
1966 <value>(&(objectClass=user)(sAMAccountName={0}))</value>
1969 <name>hadoop.security.group.mapping.ldap.search.filter.group</name>
1970 <value>(objectClass=group)</value>
1973 <name>hadoop.security.group.mapping.ldap.search.attr.member</name>
1974 <value>member</value>
1977 <name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>