 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
== 0.95 RPC Specification
In 0.95, all client/server communication is done with link:https://developers.google.com/protocol-buffers/[protobuf'ed] Messages rather than with link:https://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/Writable.html[Hadoop Writables].
Our RPC wire format therefore changes.
This document describes the client/server request/response protocol and our new RPC wire-format.

For what RPC is like in 0.94 and previous, see Benoît/Tsuna's link:https://github.com/OpenTSDB/asynchbase/blob/master/src/HBaseRpc.java#L164[Unofficial Hadoop/HBase RPC protocol documentation].
For more background on how we arrived at this spec, see link:https://docs.google.com/document/d/1WCKwgaLDqBw2vpux0jPsAu2WPTRISob7HGCO8YhfDTA/edit#[HBase RPC: WIP].

=== Goals
. A wire-format we can evolve
. A format that does not require rewriting the server core or radically changing its current architecture (for later).

=== TODO
. List of problems with the currently specified format and where we would like to go in a version2, etc.
For example, what would we have to change, if anything, to move the server async or to support streaming/chunking?
. Diagram on how it works
. A grammar that succinctly describes the wire-format.
Currently we have these words and the content of the rpc protobuf idl, but a grammar for the back and forth would help with grokking rpc.
Also, a little state machine on client/server interactions would help with understanding (and ensuring correct implementation).

=== RPC
The client will send setup information on connection establishment.
Thereafter, the client invokes methods against the remote server, sending a protobuf Message and receiving a protobuf Message in response.
Communication is synchronous.
All back and forth is preceded by an int that has the total length of the request/response.
Optionally, Cells (KeyValues) can be passed outside of protobufs in follow-behind Cell blocks (because link:https://docs.google.com/document/d/1WEtrq-JTIUhlnlnvA0oYRLp0F8MKpEBeBSCFcQiacdw/edit#[we can't protobuf megabytes of KeyValues] or Cells). These CellBlocks are encoded and optionally compressed.
For more detail on the protobufs involved, see the
link:https://github.com/apache/hbase/blob/master/hbase-protocol/src/main/protobuf/RPC.proto[RPC.proto] file in master.

==== Connection Setup
Client initiates connection.

On connection setup, client sends a preamble followed by a connection header.
.<preamble>
[source]
----
<MAGIC 4 byte integer> <1 byte RPC Format Version> <1 byte auth type>
----
We need the auth method spec here so the connection header is encoded if auth is enabled.

E.g.: HBas0x000x50 -- 4 bytes of MAGIC -- `HBas' -- plus one byte of version, 0 in this case, and one byte, 0x50 (SIMPLE), of an auth type.
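The six preamble bytes in the example above can be sketched as follows (illustrative Python, not HBase client code; the `0x50` SIMPLE auth constant is taken from the example, and the helper name is made up for this sketch):

```python
import struct

MAGIC = b"HBas"      # 4 bytes of MAGIC
RPC_VERSION = 0      # the 1-byte RPC format version described in this spec
AUTH_SIMPLE = 0x50   # 1-byte auth type; SIMPLE per the example above

def build_preamble(version=RPC_VERSION, auth=AUTH_SIMPLE):
    """Pack the 6-byte connection preamble: magic, version, auth type."""
    return MAGIC + struct.pack("BB", version, auth)

preamble = build_preamble()
assert preamble == b"HBas\x00\x50"
assert len(preamble) == 6
```

If the server dislikes any of these bytes, it answers with a FatalConnectionException and disconnects, as described below.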
.<Protobuf ConnectionHeader Message>
Has user info, and ``protocol'', as well as the encoders and compression the client will use sending CellBlocks.
CellBlock encoders and compressors are for the life of the connection.
CellBlock encoders implement org.apache.hadoop.hbase.codec.Codec.
CellBlocks may then also be compressed.
Compressors implement org.apache.hadoop.io.compress.CompressionCodec.
This protobuf is written using writeDelimited, so it is prefaced by a pb varint holding its serialized length.
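writeDelimited frames a message by prefixing its serialized bytes with the length encoded as a protobuf base-128 varint. A minimal sketch of that framing (illustrative Python; a real client would use a protobuf library's `writeDelimitedTo`/`parseDelimitedFrom` rather than hand-rolling this):

```python
def encode_varint(n):
    """Encode a non-negative int as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def write_delimited(message_bytes):
    """Preface serialized pb Message bytes with their varint length."""
    return encode_varint(len(message_bytes)) + message_bytes

assert encode_varint(1) == b"\x01"
assert encode_varint(300) == b"\xac\x02"
assert write_delimited(b"\x0a\x03abc") == b"\x05\x0a\x03abc"
```

Lengths under 128 cost a single prefix byte, which is why the small header messages stay cheap on the wire.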
After the client sends the preamble and connection header, the server does not respond if connection setup succeeded.
No response means the server is READY to accept requests and to give out responses.
If the version or authentication in the preamble is not agreeable, or the server has trouble parsing the preamble, it will throw an org.apache.hadoop.hbase.ipc.FatalConnectionException explaining the error and will then disconnect.
If the client in the connection header -- i.e. the protobuf'd Message that comes after the connection preamble -- asks for a Service the server does not support or a codec the server does not have, again we throw a FatalConnectionException with explanation.

==== Request
After a Connection has been set up, client makes requests.

A request is made up of a protobuf RequestHeader followed by a protobuf Message parameter.
The header includes the method name and, optionally, metadata on the optional CellBlock that may be following.
The parameter type suits the method being invoked: i.e. if we are doing a getRegionInfo request, the protobuf Message param will be an instance of GetRegionInfoRequest.
The response will be a GetRegionInfoResponse.
The CellBlock is optionally used to ferry the bulk of the RPC data: i.e. Cells/KeyValues.
The request is prefaced by an int that holds the total length of what follows.

.<Protobuf RequestHeader Message>
Will have call.id, trace.id, and method name, etc., including optional metadata on the CellBlock IFF one is following.
Data is protobuf'd inline in this pb Message, or optionally comes in the following CellBlock.
.<Protobuf Param Message>
If the method being invoked is getRegionInfo, then, as the Service descriptor for the client-to-regionserver protocol shows, the request sends a GetRegionInfoRequest protobuf Message param in this position.
.<CellBlock>
An encoded and optionally compressed Cell block.
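Assembling the request parts described above might look like this (illustrative Python; the header/param byte strings are stand-ins, not real serialized protobufs, and the varint helper is repeated so the sketch stands alone). The total-length prefix is written as a 4-byte big-endian int, matching Java's `DataOutputStream.writeInt`:

```python
import struct

def encode_varint(n):
    """Protobuf base-128 varint for a non-negative int."""
    out = bytearray()
    while True:
        byte, n = n & 0x7F, n >> 7
        out.append(byte | 0x80 if n else byte)
        if not n:
            return bytes(out)

def build_request(header_pb, param_pb, cellblock=b""):
    """Frame: <Total Length><delimited RequestHeader><delimited Param><CellBlock>."""
    body = (encode_varint(len(header_pb)) + header_pb
            + encode_varint(len(param_pb)) + param_pb
            + cellblock)  # the CellBlock is raw bytes, not varint-delimited
    return struct.pack(">i", len(body)) + body

# Stand-in bytes; a real request carries serialized RequestHeader/param Messages.
frame = build_request(b"HDR", b"PARAM", b"CELLS")
assert struct.unpack(">i", frame[:4])[0] == len(frame) - 4
```

Note the CellBlock needs no length prefix of its own: it is whatever remains of the total length after the two delimited messages are consumed.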
==== Response

Same as Request: a protobuf ResponseHeader followed by a protobuf Message response, where the Message response type suits the method invoked.
Bulk of the data may come in a following CellBlock.
The response is prefaced by an int that holds the total length of what follows.

.<Protobuf ResponseHeader Message>
Will have call.id, etc.
Will include the exception if processing failed.
Optionally includes metadata on the CellBlock IFF one is following.
.<Protobuf Response Message>
The return value; may be empty if the call threw an exception.
If the method being invoked is getRegionInfo, then, as the Service descriptor for the client-to-regionserver protocol shows, the response sends a GetRegionInfoResponse protobuf Message in this position.
.<CellBlock>
An encoded and optionally compressed Cell block.
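The read side mirrors the request framing: take the 4-byte total length, peel off the two varint-delimited messages, and treat whatever is left as the CellBlock (illustrative Python with stand-in bytes; a real client hands the message bytes to a protobuf parser):

```python
import struct

def decode_varint(buf, pos):
    """Decode a protobuf varint from buf at pos; return (value, new_pos)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7

def read_response(frame):
    """Split <Total Length><delimited header><delimited response><CellBlock>."""
    total = struct.unpack(">i", frame[:4])[0]
    body, pos = frame[4:4 + total], 0
    n, pos = decode_varint(body, pos)
    header, pos = body[pos:pos + n], pos + n
    n, pos = decode_varint(body, pos)
    response, pos = body[pos:pos + n], pos + n
    return header, response, body[pos:]  # remainder is the CellBlock

frame = struct.pack(">i", 12) + b"\x03HDR\x02OK" + b"CELLS"
assert read_response(frame) == (b"HDR", b"OK", b"CELLS")
```

Because the total length arrives first, a reader can pull one whole call off the socket before doing any protobuf parsing.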
==== Exceptions

There are two distinct types.
There is the request failure, which is encapsulated inside the response header for the response; the connection stays open to receive new requests.
The second type, the FatalConnectionException, kills the connection.
Exceptions can carry extra information.
See the ExceptionResponse protobuf type.
It has a flag to indicate do-not-retry as well as other miscellaneous payload to help improve client responsiveness.
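A client's handling of the two exception types might be sketched like this (illustrative Python; `send_request` is a hypothetical callable and `do_not_retry` stands in for the flag carried by the ExceptionResponse protobuf, so none of these names are real HBase API):

```python
class FatalConnectionException(Exception):
    """Connection-killing error: setup failure or a fatal protocol error."""

def handle_call(send_request, max_retries=3):
    """Retry plain request failures; surface fatal and do-not-retry ones.

    `send_request` is a hypothetical callable returning (response, exception),
    where `exception` mimics an ExceptionResponse pb (None means success).
    """
    for _ in range(max_retries):
        try:
            response, exception = send_request()
        except FatalConnectionException:
            raise  # the connection is dead; the caller must reconnect
        if exception is None:
            return response
        if getattr(exception, "do_not_retry", False):
            raise RuntimeError("server flagged do-not-retry")
        # plain request failure: the connection stays open, so just retry
    raise RuntimeError("out of retries")
```

The key distinction is that only the fatal path forces a reconnect; an in-header exception costs nothing but the failed call.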
==== CellBlocks and Compression

CellBlock codecs and compressors are not versioned.
The server can do a codec or it cannot.
If a new version of a codec with, say, tighter encoding is needed, give it a new class name.
Codecs will live on the server for all time so old clients can connect.
=== Notes

.Constraints
In some part, the current wire-format -- i.e. all requests and responses preceded by a length -- has been dictated by the current server's non-async architecture.
.One fat pb request or header+param
We went with a pb header followed by a pb param making a request, and a pb header followed by a pb response, for now.
Doing header+param rather than a single protobuf Message with both header and param content:

. Is closer to what we currently have
. Having a single fat pb requires extra copying, putting the already-pb'd param into the body of the fat request pb (and the same when making the result)
. We can decide whether to accept the request or not before we read the param; for example, the request might be low priority.
As is, we read header+param in one go as the server is currently implemented, so this is a TODO.

The advantages are minor.
If, later, the fat request shows a clear advantage, we can roll out a v2 then.
==== RPC Configurations
.CellBlock Codecs
To enable a codec other than the default `KeyValueCodec`, set `hbase.client.rpc.codec` to the name of the Codec class to use.
Codec must implement hbase's `Codec` Interface.
After connection setup, all passed cellblocks will be sent with this codec.
The server will return cellblocks using this same codec as long as the codec is on the server's CLASSPATH (else you will get `UnsupportedCellCodecException`).
To change the default codec, set `hbase.client.default.rpc.codec`.

To disable cellblocks completely and go pure protobuf, set the default to the empty String and do not specify a codec in your Configuration: that is, set `hbase.client.default.rpc.codec` to the empty string and do not set `hbase.client.rpc.codec`.
This will cause the client to connect to the server with no codec specified.
If a server sees no codec, it will return all responses in pure protobuf.
Running pure protobuf all the time will be slower than running with cellblocks.
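The fallback between the two settings can be modeled as follows (illustrative Python over a plain dict; the real lookup happens in the Java client against its Configuration, and the fully qualified default class name here is an assumption based on the `KeyValueCodec` name above):

```python
# Assumed full class name for the default codec mentioned in this section.
DEFAULT_CODEC = "org.apache.hadoop.hbase.codec.KeyValueCodec"

def resolve_codec(conf):
    """Mimic the client's codec choice: the explicit setting wins, then the
    overridable default; an empty string means no codec (pure protobuf)."""
    codec = conf.get("hbase.client.rpc.codec")
    if codec is None:
        codec = conf.get("hbase.client.default.rpc.codec", DEFAULT_CODEC)
    return codec or None  # empty string -> no codec, pure-protobuf responses

assert resolve_codec({}) == DEFAULT_CODEC
assert resolve_codec({"hbase.client.default.rpc.codec": ""}) is None
assert resolve_codec({"hbase.client.rpc.codec": "my.Codec"}) == "my.Codec"
```

A `None` result corresponds to the pure-protobuf mode described above: the client connects with no codec specified.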
.Compression
Uses hadoop's compression codecs.
To enable compressing of passed CellBlocks, set `hbase.client.rpc.compressor` to the name of the Compressor to use.
Compressor must implement Hadoop's CompressionCodec Interface.
After connection setup, all passed cellblocks will be sent compressed.
The server will return cellblocks compressed using this same compressor as long as the compressor is on its CLASSPATH (else you will get `UnsupportedCompressionCodecException`).