<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Pluggable Authentication for HBase RPCs
As a distributed database, HBase must be able to authenticate users and HBase
services across an untrusted network. Clients and HBase services are treated
equivalently in terms of authentication (and this is the only time we will
draw such a distinction).

HBase currently supports three modes of authentication, configured via the
property `hbase.security.authentication`:

1. `SIMPLE`
2. `KERBEROS`
3. `TOKEN`

`SIMPLE` authentication is effectively no authentication; HBase assumes the user
is who they claim to be. `KERBEROS` authenticates clients via the Kerberos V5
protocol using the GSSAPI mechanism of Java's Simple Authentication and Security
Layer (SASL) implementation. `TOKEN` is a username-password based authentication protocol
which uses short-lived passwords that can only be obtained via a `KERBEROS` authenticated
request. `TOKEN` authentication is synonymous with Hadoop-style [Delegation Tokens](https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/hadoop_tokens.html#delegation-tokens). `TOKEN` authentication uses the `DIGEST-MD5`
SASL mechanism.
[SASL](https://docs.oracle.com/javase/8/docs/technotes/guides/security/sasl/sasl-refguide.html)
is a framework which specifies a network protocol that can authenticate a client
and a server using an arbitrary mechanism. SASL ships with a [number of mechanisms](https://www.iana.org/assignments/sasl-mechanisms/sasl-mechanisms.xhtml)
out of the box, and it is possible to implement custom mechanisms. SASL effectively
decouples the RPC client-server model from the mechanism used to authenticate those
requests (e.g. the RPC code is identical whether username-password, Kerberos, or any
other method is used to authenticate the request).

RFCs define what [SASL mechanisms exist](https://www.iana.org/assignments/sasl-mechanisms/sasl-mechanisms.xml),
but the mechanisms defined by RFCs are a superset of those that are
[implemented in Java](https://docs.oracle.com/javase/8/docs/technotes/guides/security/sasl/sasl-refguide.html#SUN).
This document limits discussion to SASL mechanisms in the abstract, focusing on those which are well-defined and
implemented by the JDK itself today. However, a developer can implement
and register their own SASL mechanism. Writing a custom mechanism is outside the scope of this document, but
not outside the realm of possibility.
The `SIMPLE` implementation does not use SASL, but instead has its own RPC logic
built into the HBase RPC protocol. `KERBEROS` and `TOKEN` both use SASL to authenticate,
relying on the `Token` interface that is intertwined with the Hadoop `UserGroupInformation`
class. SASL decouples an RPC from the mechanism used to authenticate that request.
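To make the decoupling concrete, the following is a minimal sketch of a SASL exchange using only the JDK's `javax.security.sasl` API and the `DIGEST-MD5` mechanism. It is not HBase code: the class name, the `hbase` protocol string, the `localhost` server name, and the in-process "network" are all assumptions made for illustration. The point is that the loop which shuttles bytes between client and server contains nothing specific to the mechanism; only the callback handlers know about names and passwords.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.AuthorizeCallback;
import javax.security.sasl.RealmCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;
import javax.security.sasl.SaslServer;

class SaslHandshakeSketch {
  // Hypothetical values for this sketch; both sides must agree on them.
  static final String MECHANISM = "DIGEST-MD5";
  static final String PROTOCOL = "hbase";
  static final String SERVER_NAME = "localhost";

  /** Returns true if an in-process client and server complete the SASL exchange. */
  static boolean handshakeSucceeds(String user, String clientPassword, String serverPassword) {
    CallbackHandler clientHandler = callbacks -> {
      for (Callback cb : callbacks) {
        if (cb instanceof NameCallback) {
          ((NameCallback) cb).setName(user);
        } else if (cb instanceof PasswordCallback) {
          ((PasswordCallback) cb).setPassword(clientPassword.toCharArray());
        } else if (cb instanceof RealmCallback) {
          // Accept whichever realm the server advertised.
          RealmCallback realm = (RealmCallback) cb;
          realm.setText(realm.getDefaultText());
        }
      }
    };
    CallbackHandler serverHandler = callbacks -> {
      for (Callback cb : callbacks) {
        if (cb instanceof PasswordCallback) {
          // A real server would look up the secret for the presented user name.
          ((PasswordCallback) cb).setPassword(serverPassword.toCharArray());
        } else if (cb instanceof AuthorizeCallback) {
          // Only allow a user to act as themselves.
          AuthorizeCallback authz = (AuthorizeCallback) cb;
          authz.setAuthorized(authz.getAuthenticationID().equals(authz.getAuthorizationID()));
        }
      }
    };
    Map<String, Object> props = new HashMap<>();
    try {
      SaslClient client = Sasl.createSaslClient(
          new String[] {MECHANISM}, null, PROTOCOL, SERVER_NAME, props, clientHandler);
      SaslServer server = Sasl.createSaslServer(
          MECHANISM, PROTOCOL, SERVER_NAME, props, serverHandler);
      if (client == null || server == null) {
        return false; // mechanism not available in this JVM
      }
      // In a real RPC these byte arrays would travel over the network.
      byte[] token = client.hasInitialResponse()
          ? client.evaluateChallenge(new byte[0]) : new byte[0];
      while (!server.isComplete()) {
        byte[] challenge = server.evaluateResponse(token);
        if (challenge != null && !client.isComplete()) {
          token = client.evaluateChallenge(challenge);
        }
      }
      return client.isComplete() && user.equals(server.getAuthorizationID());
    } catch (SaslException e) {
      return false; // e.g. the client's digest did not match the server's secret
    }
  }

  public static void main(String[] args) {
    System.out.println(handshakeSucceeds("alice", "secret", "secret"));
    System.out.println(handshakeSucceeds("alice", "wrong", "secret"));
  }
}
```

Swapping `MECHANISM` for another registered mechanism would, in principle, leave the exchange loop untouched and change only the callbacks each side must answer.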
Despite HBase already shipping authentication implementations which leverage SASL,
it is (effectively) impossible to add a new authentication implementation to HBase. The
`org.apache.hadoop.hbase.security.AuthMethod` enum fixes the set of authentication
methods at compile time, so a new method cannot be defined without changing HBase itself.
Also, the RPC implementation is written
to only use the methods that are expressly shipped in HBase. Adding a new authentication
method would require copying and modifying the RpcClient implementation, in addition
to modifying the RpcServer to invoke the correct authentication check.

While it is technically possible to add a new authentication method to HBase, it cannot be done
cleanly or sustainably. This is what is meant by "impossible".
HBase should expose interfaces which allow for pluggable authentication mechanisms
such that HBase can authenticate against external systems. Because the RPC implementation
can already support SASL, HBase can standardize on SASL, allowing any authentication method
which is capable of using SASL to negotiate authentication. `KERBEROS` and `TOKEN` methods
will naturally fit into these new interfaces, but `SIMPLE` authentication will not (see the following
section for a tangent on `SIMPLE` authentication today).
### Tangent: on SIMPLE authentication
`SIMPLE` authentication in HBase today is treated as a special case. My impression is that
this stems from HBase not originally shipping an RPC solution that had any authentication.

Re-implementing `SIMPLE` authentication such that it also flows through SASL (e.g. via
the `PLAIN` SASL mechanism) would simplify the HBase codebase such that all authentication
occurs via SASL. This was not done for the initial implementation to reduce the scope
of the changeset. Changing `SIMPLE` authentication to use SASL may also have some
performance impact when setting up a new RPC. The same conditional
`if (sasl) ... else SIMPLE` logic is therefore propagated in this implementation.
## Implementation Overview
HBASE-23347 includes a refactoring of HBase RPC authentication where all current methods
are ported to a new set of interfaces, and all RPC implementations are updated to use
the new interfaces. In the spirit of SASL, the expectation is that users can provide
their own authentication methods at runtime, and HBase should be capable of negotiating
with a client that tries to authenticate via that custom authentication method. The implementation
refers to this "bundle" of client and server logic as an "authentication provider".
One authentication provider includes the following pieces:

1. Client-side logic (providing a credential)
2. Server-side logic (validating a credential from a client)
3. Client selection logic to choose a provider (from many that may be available)

A provider's client-side and server-side logic are considered to be one-to-one. A `Foo` client-side provider
should never be used to authenticate against a `Bar` server-side provider.
We do expect that both clients and servers will have access to multiple providers. A server may
be capable of authenticating via methods which a client is unaware of. Conversely, a client may attempt
to authenticate via a method which the server does not know how to process. In both cases, the RPC
should fail because the client and server do not have matching providers. The server identifies
the client's authentication mechanism via a `byte authCode` (which is already sent today with HBase RPCs).
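As a rough illustration of the server-side lookup described above, the sketch below keeps a registry keyed by the auth code byte and rejects any code it does not recognize. The class name, method names, and the codes used in `main` are hypothetical and chosen only for this example; they are not taken from the HBase implementation.

```java
import java.util.HashMap;
import java.util.Map;

class AuthCodeRegistrySketch {
  // Maps the byte sent by the client to the name of a server-side provider.
  private final Map<Byte, String> providersByCode = new HashMap<>();

  /** Registers a provider; two providers may not share an auth code. */
  void register(byte authCode, String providerName) {
    String previous = providersByCode.putIfAbsent(authCode, providerName);
    if (previous != null) {
      throw new IllegalArgumentException(
          "Auth code " + (authCode & 0xff) + " is already used by " + previous);
    }
  }

  /** Resolves the provider for a client's auth code, failing if none matches. */
  String resolve(byte authCode) {
    String name = providersByCode.get(authCode);
    if (name == null) {
      // No matching provider on the server: the RPC should fail.
      throw new IllegalStateException("No provider for auth code " + (authCode & 0xff));
    }
    return name;
  }

  public static void main(String[] args) {
    AuthCodeRegistrySketch registry = new AuthCodeRegistrySketch();
    registry.register((byte) 81, "KERBEROS"); // illustrative code values
    registry.register((byte) 82, "TOKEN");
    System.out.println(registry.resolve((byte) 82));
    try {
      registry.resolve((byte) 99);
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```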
A client may also have multiple providers available for it to use in authenticating against
HBase. The client must have some logic to select which provider to use. Because we are
allowing custom providers, we must also allow a custom selection logic such that the
correct provider can be chosen. This is a formalization of the logic already present
in `org.apache.hadoop.hbase.security.token.AuthenticationTokenSelector`.
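One plausible shape for such selection logic is sketched below. The preference order (a token of the expected kind first, then Kerberos credentials, then `SIMPLE`) and every name in the class are assumptions made for illustration; the actual selector may weigh other inputs, such as whether security is enabled on the cluster.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class ProviderSelectorSketch {
  // Token kind the hypothetical TOKEN provider understands.
  static final String HBASE_TOKEN_KIND = "HBASE_AUTH_TOKEN";

  /**
   * Picks a provider name from those available, based on what credentials the
   * user holds. Returns empty when nothing fits, in which case the RPC cannot
   * be authenticated.
   */
  static Optional<String> select(
      List<String> availableProviders, Set<String> userTokenKinds, boolean hasKerberosCredentials) {
    if (availableProviders.contains("TOKEN") && userTokenKinds.contains(HBASE_TOKEN_KIND)) {
      return Optional.of("TOKEN");
    }
    if (availableProviders.contains("KERBEROS") && hasKerberosCredentials) {
      return Optional.of("KERBEROS");
    }
    if (availableProviders.contains("SIMPLE")) {
      return Optional.of("SIMPLE");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<String> secure = Arrays.asList("KERBEROS", "TOKEN");
    System.out.println(select(secure, Collections.singleton(HBASE_TOKEN_KIND), true));
    System.out.println(select(secure, Collections.emptySet(), true));
    System.out.println(select(secure, Collections.emptySet(), false));
  }
}
```

A custom selector would replace the body of `select` with whatever rule fits the custom providers it ships with.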
To enable the above, there are new interfaces to support user extensibility:

1. `interface SaslAuthenticationProvider`
2. `interface SaslClientAuthenticationProvider extends SaslAuthenticationProvider`
3. `interface SaslServerAuthenticationProvider extends SaslAuthenticationProvider`
4. `interface AuthenticationProviderSelector`

The `SaslAuthenticationProvider` holds logic which is common to the client and the
server (though it is up to the developer to guarantee this). The client and server
interfaces each have logic specific to the HBase RPC client and HBase RPC server
codebases, as their names imply. As described above, an implementation
of one `SaslClientAuthenticationProvider` must match exactly one implementation of
`SaslServerAuthenticationProvider`. Each Authentication Provider implementation is
a singleton and is intended to be shared across all RPCs. A provider selector is
chosen per client based on that client's configuration.
A client authentication provider is uniquely identified among other providers
by the following characteristics:

1. A name, e.g. "KERBEROS", "TOKEN"
2. A byte (a value between 0 and 255)

In addition to these attributes, a provider must also define the following attributes:

3. The SASL mechanism being used.
4. The Hadoop AuthenticationMethod, e.g. "TOKEN", "KERBEROS", "CERTIFICATE"
5. The Token "kind", the name used to identify a TokenIdentifier, e.g. `HBASE_AUTH_TOKEN`

It is allowed (even expected) that there may be multiple providers that use `TOKEN` authentication.
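The five attributes above can be pictured as a small value object with a uniqueness check over the first two. This is only a sketch: `Descriptor`, `requireUnique`, the `MY_TOKEN` provider, and the specific code values are hypothetical, and the real interfaces may expose these attributes differently. Note that a Java `byte` is signed, so a code above 127 needs a cast on the way in and `& 0xff` on the way out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ProviderDescriptorSketch {
  /** Hypothetical holder for the attributes a provider must define. */
  static final class Descriptor {
    final String name;          // unique, e.g. "KERBEROS"
    final byte code;            // unique, 0..255 on the wire
    final String saslMechanism; // e.g. "GSSAPI", "DIGEST-MD5"
    final String authMethod;    // Hadoop AuthenticationMethod name
    final String tokenKind;     // Token kind, or null if tokens are not used

    Descriptor(String name, byte code, String saslMechanism, String authMethod, String tokenKind) {
      this.name = name;
      this.code = code;
      this.saslMechanism = saslMechanism;
      this.authMethod = authMethod;
      this.tokenKind = tokenKind;
    }
  }

  /** Fails if two providers share a name or a code; sharing an authMethod is fine. */
  static void requireUnique(List<Descriptor> providers) {
    Set<String> names = new HashSet<>();
    Set<Byte> codes = new HashSet<>();
    for (Descriptor d : providers) {
      if (!names.add(d.name)) {
        throw new IllegalArgumentException("Duplicate provider name: " + d.name);
      }
      if (!codes.add(d.code)) {
        throw new IllegalArgumentException("Duplicate provider code: " + (d.code & 0xff));
      }
    }
  }

  public static void main(String[] args) {
    List<Descriptor> providers = Arrays.asList(
        new Descriptor("KERBEROS", (byte) 81, "GSSAPI", "KERBEROS", null),
        new Descriptor("TOKEN", (byte) 82, "DIGEST-MD5", "TOKEN", "HBASE_AUTH_TOKEN"),
        // A second TOKEN-based provider is allowed as long as name and code differ.
        new Descriptor("MY_TOKEN", (byte) 200, "DIGEST-MD5", "TOKEN", "MY_TOKEN_KIND"));
    requireUnique(providers);
    System.out.println("all providers are uniquely identified");
  }
}
```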
N.b. Hadoop requires all `TokenIdentifier` implementations to have a no-args constructor and a `ServiceLoader`
entry in their packaging JAR file (e.g. `META-INF/services/org.apache.hadoop.security.token.TokenIdentifier`).
Otherwise, parsing the `TokenIdentifier` on the server-side end of an RPC from a Hadoop `Token` will return
`null` to the caller (often, in the `CallbackHandler` implementation).
To ease development with this unknown set of providers, there are two classes which
find, instantiate, and cache the provider singletons:

1. Client side: `class SaslClientAuthenticationProviders`
2. Server side: `class SaslServerAuthenticationProviders`

These classes use [Java ServiceLoader](https://docs.oracle.com/javase/8/docs/api/java/util/ServiceLoader.html)
to find implementations available on the classpath. The three out-of-the-box HBase
providers all register themselves via the `ServiceLoader`.

Each class also enables providers to be added via explicit configuration in hbase-site.xml.
This enables unit tests to define custom implementations that may be toy/naive/unsafe without
any worry that these may be inadvertently deployed onto a production HBase cluster.
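The two discovery paths can be sketched with plain JDK APIs as follows. This is an assumption-laden illustration, not the HBase classes: `Provider`, `ToyProvider`, and the comma-separated class-name string standing in for a configuration value are all hypothetical. Providers found via `ServiceLoader` and providers named explicitly are instantiated once in the constructor and then served from the cached list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.ServiceLoader;

class ProviderDiscoverySketch {
  /** Hypothetical stand-in for a provider interface. */
  public interface Provider {
    String name();
  }

  /** A toy provider that is never registered with ServiceLoader. */
  public static final class ToyProvider implements Provider {
    public ToyProvider() {}

    @Override
    public String name() {
      return "TOY";
    }
  }

  // Instantiated once and shared, mirroring the "cache the singletons" idea.
  private final List<Provider> providers;

  /** @param extraClassNames comma-separated class names, as a config value might supply */
  ProviderDiscoverySketch(String extraClassNames) {
    List<Provider> found = new ArrayList<>();
    // Path 1: anything registered under META-INF/services on the classpath.
    for (Provider p : ServiceLoader.load(Provider.class)) {
      found.add(p);
    }
    // Path 2: classes named explicitly, e.g. by a unit test.
    if (extraClassNames != null && !extraClassNames.trim().isEmpty()) {
      for (String className : extraClassNames.split(",")) {
        try {
          Class<? extends Provider> clazz =
              Class.forName(className.trim()).asSubclass(Provider.class);
          found.add(clazz.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
          throw new IllegalArgumentException("Cannot load provider " + className, e);
        }
      }
    }
    this.providers = Collections.unmodifiableList(found);
  }

  List<Provider> providers() {
    return providers;
  }

  public static void main(String[] args) {
    ProviderDiscoverySketch discovery = new ProviderDiscoverySketch(ToyProvider.class.getName());
    for (Provider p : discovery.providers()) {
      System.out.println(p.name());
    }
  }
}
```

Because `ToyProvider` has no `META-INF/services` entry, it is only ever loaded when a caller names it explicitly, which is the property that keeps test-only providers out of production deployments.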