src/main/asciidoc/_chapters/external_apis.adoc

   1 ////
   2 /**
   3  *
   4  * Licensed to the Apache Software Foundation (ASF) under one
   5  * or more contributor license agreements.  See the NOTICE file
   6  * distributed with this work for additional information
   7  * regarding copyright ownership.  The ASF licenses this file
   8  * to you under the Apache License, Version 2.0 (the
   9  * "License"); you may not use this file except in compliance
  10  * with the License.  You may obtain a copy of the License at
  11  *
  12  *     http://www.apache.org/licenses/LICENSE-2.0
  13  *
  14  * Unless required by applicable law or agreed to in writing, software
  15  * distributed under the License is distributed on an "AS IS" BASIS,
  16  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  17  * See the License for the specific language governing permissions and
  18  * limitations under the License.
  19  */
  20 ////
  21
  22 [[external_apis]]
  23 = Apache HBase External APIs
  24 :doctype: book
  25 :numbered:
  26 :toc: left
  27 :icons: font
  28 :experimental:
  29
This chapter covers access to Apache HBase through non-Java languages and
through custom protocols. For information on using the native HBase APIs, refer to
link:https://hbase.apache.org/apidocs/index.html[User API Reference] and the
<<hbase_apis,HBase APIs>> chapter.
  34
  35 == REST
  36
  37 Representational State Transfer (REST) was introduced in 2000 in the doctoral
  38 dissertation of Roy Fielding, one of the principal authors of the HTTP specification.
  39
  40 REST itself is out of the scope of this documentation, but in general, REST allows
  41 client-server interactions via an API that is tied to the URL itself. This section
  42 discusses how to configure and run the REST server included with HBase, which exposes
HBase tables, rows, cells, and metadata as URL-specified resources.
There is also a series of blog posts on
  45 link:http://blog.cloudera.com/blog/2013/03/how-to-use-the-apache-hbase-rest-interface-part-1/[How-to: Use the Apache HBase REST Interface]
  46 by Jesse Anderson.
  47
  48 === Starting and Stopping the REST Server
  49
  50 The included REST server can run as a daemon which starts an embedded Jetty
  51 servlet container and deploys the servlet into it. Use one of the following commands
  52 to start the REST server in the foreground or background. The port is optional, and
  53 defaults to 8080.
  54
  55 [source, bash]
  56 ----
  57 # Foreground
  58 $ bin/hbase rest start -p <port>
  59
# Background, logging to a file in $HBASE_LOG_DIR
  61 $ bin/hbase-daemon.sh start rest -p <port>
  62 ----
  63
  64 To stop the REST server, use Ctrl-C if you were running it in the foreground, or the
  65 following command if you were running it in the background.
  66
  67 [source, bash]
  68 ----
  69 $ bin/hbase-daemon.sh stop rest
  70 ----
  71
  72 === Configuring the REST Server and Client
  73
  74 For information about configuring the REST server and client for SSL, as well as `doAs`
  75 impersonation for the REST server, see <<security.gateway.thrift>> and other portions
  76 of the <<security>> chapter.
  77
  78 === Using REST Endpoints
  79
The following examples use the placeholder server pass:[http://example.com:8000], and
all of the commands can be run using `curl` or `wget`. To choose the output format, send
no `Accept` header for plain text (the default), `Accept: text/xml` for XML,
`Accept: application/json` for JSON, or `Accept: application/x-protobuf` for
protocol buffers.
  85
  86 NOTE: Unless specified, use `GET` requests for queries, `PUT` or `POST` requests for
  87 creation or mutation, and `DELETE` for deletion.
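
As a minimal sketch of the header choices above, the following helper maps a short format name to the matching `Accept` header. The function names `accept_header` and `rest_get` are hypothetical and not part of HBase. `rest_get` assumes `curl` is installed and targets the placeholder server.

```shell
# Hypothetical helper: print the Accept header for a short format name.
accept_header() {
  case "$1" in
    xml)      echo "Accept: text/xml" ;;
    json)     echo "Accept: application/json" ;;
    protobuf) echo "Accept: application/x-protobuf" ;;
    *)        echo "unknown format: $1" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper: GET a path from the placeholder server in the given format.
# Arguments: format name, URL path (for example /version/cluster).
rest_get() {
  header="$(accept_header "$1")" || return 1
  curl -s -H "$header" "http://example.com:8000$2"
}
```

For example, `rest_get json /version/cluster` would request the cluster version as JSON.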
  88
  89 .Cluster-Wide Endpoints
  90 [options="header", cols="2m,m,3d,6l"]
  91 |===
  92 |Endpoint
  93 |HTTP Verb
  94 |Description
  95 |Example
  96
  97 |/version/cluster
  98 |GET
  99 |Version of HBase running on this cluster
 100 |curl -vi -X GET \
 101   -H "Accept: text/xml" \
 102   "http://example.com:8000/version/cluster"
 103
 104 |/status/cluster
 105 |GET
 106 |Cluster status
 107 |curl -vi -X GET \
 108   -H "Accept: text/xml" \
 109   "http://example.com:8000/status/cluster"
 110
 111 |/
 112 |GET
 113 |List of all non-system tables
 114 |curl -vi -X GET \
 115   -H "Accept: text/xml" \
 116   "http://example.com:8000/"
 117
 118 |===
 119
 120 .Namespace Endpoints
 121 [options="header", cols="2m,m,3d,6l"]
 122 |===
 123 |Endpoint
 124 |HTTP Verb
 125 |Description
 126 |Example
 127
 128 |/namespaces
 129 |GET
 130 |List all namespaces
 131 |curl -vi -X GET \
 132   -H "Accept: text/xml" \
 133   "http://example.com:8000/namespaces/"
 134
 135 |/namespaces/_namespace_
 136 |GET
 137 |Describe a specific namespace
 138 |curl -vi -X GET \
 139   -H "Accept: text/xml" \
 140   "http://example.com:8000/namespaces/special_ns"
 141
 142 |/namespaces/_namespace_
 143 |POST
 144 |Create a new namespace
 145 |curl -vi -X POST \
 146   -H "Accept: text/xml" \
  "http://example.com:8000/namespaces/special_ns"
 148
 149 |/namespaces/_namespace_/tables
 150 |GET
 151 |List all tables in a specific namespace
 152 |curl -vi -X GET \
 153   -H "Accept: text/xml" \
 154   "http://example.com:8000/namespaces/special_ns/tables"
 155
 156 |/namespaces/_namespace_
 157 |PUT
 158 |Alter an existing namespace. Currently not used.
 159 |curl -vi -X PUT \
 160   -H "Accept: text/xml" \
  "http://example.com:8000/namespaces/special_ns"
 162
 163 |/namespaces/_namespace_
 164 |DELETE
 165 |Delete a namespace. The namespace must be empty.
 166 |curl -vi -X DELETE \
 167   -H "Accept: text/xml" \
  "http://example.com:8000/namespaces/special_ns"
 169
 170 |===
 171
 172 .Table Endpoints
 173 [options="header", cols="2m,m,3d,6l"]
 174 |===
 175 |Endpoint
 176 |HTTP Verb
 177 |Description
 178 |Example
 179
 180 |/_table_/schema
 181 |GET
 182 |Describe the schema of the specified table.
 183 |curl -vi -X GET \
 184   -H "Accept: text/xml" \
 185   "http://example.com:8000/users/schema"
 186
 187 |/_table_/schema
 188 |POST
 189 |Update an existing table with the provided schema fragment
 190 |curl -vi -X POST \
 191   -H "Accept: text/xml" \
 192   -H "Content-Type: text/xml" \
  -d '<?xml version="1.0" encoding="UTF-8"?><TableSchema name="users"><ColumnSchema name="cf" KEEP_DELETED_CELLS="true" /></TableSchema>' \
 194   "http://example.com:8000/users/schema"
 195
 196 |/_table_/schema
 197 |PUT
 198 |Create a new table, or replace an existing table's schema
 199 |curl -vi -X PUT \
 200   -H "Accept: text/xml" \
 201   -H "Content-Type: text/xml" \
  -d '<?xml version="1.0" encoding="UTF-8"?><TableSchema name="users"><ColumnSchema name="cf" /></TableSchema>' \
 203   "http://example.com:8000/users/schema"
 204
 205 |/_table_/schema
 206 |DELETE
 207 |Delete the table. You must use the `/_table_/schema` endpoint, not just `/_table_/`.
 208 |curl -vi -X DELETE \
 209   -H "Accept: text/xml" \
 210   "http://example.com:8000/users/schema"
 211
 212 |/_table_/regions
 213 |GET
 214 |List the table regions
 215 |curl -vi -X GET \
 216   -H "Accept: text/xml" \
  "http://example.com:8000/users/regions"
 218 |===
 219
 220 .Endpoints for `Get` Operations
 221 [options="header", cols="2m,m,3d,6l"]
 222 |===
 223 |Endpoint
 224 |HTTP Verb
 225 |Description
 226 |Example
 227
 228 |/_table_/_row_
 229 |GET
 230 |Get all columns of a single row. Values are Base-64 encoded. This requires the "Accept" request header with a type that can hold multiple columns (like xml, json or protobuf).
 231 |curl -vi -X GET \
 232   -H "Accept: text/xml" \
 233   "http://example.com:8000/users/row1"
 234
 235 |/_table_/_row_/_column:qualifier_/_timestamp_
 236 |GET
 237 |Get the value of a single column. Values are Base-64 encoded.
 238 |curl -vi -X GET \
 239   -H "Accept: text/xml" \
 240   "http://example.com:8000/users/row1/cf:a/1458586888395"
 241
 242 |/_table_/_row_/_column:qualifier_
 243 |GET
 244 |Get the value of a single column. Values are Base-64 encoded.
 245 |curl -vi -X GET \
 246   -H "Accept: text/xml" \
 247   "http://example.com:8000/users/row1/cf:a"
 248
 249 curl -vi -X GET \
 250   -H "Accept: text/xml" \
 251    "http://example.com:8000/users/row1/cf:a/"
 252
 253 |/_table_/_row_/_column:qualifier_/?v=_number_of_versions_
 254 |GET
 255 |Multi-Get a specified number of versions of a given cell. Values are Base-64 encoded.
 256 |curl -vi -X GET \
 257   -H "Accept: text/xml" \
 258   "http://example.com:8000/users/row1/cf:a?v=2"
 259
 260 |===
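
Because values come back Base-64 encoded, a client has to decode them before use. The sketch below assumes the `base64` utility is available and accepts `-d` for decoding. `b64dec` is a hypothetical helper name, and the encoded strings are taken from examples elsewhere in this chapter.

```shell
# Hypothetical helper: decode one Base-64 string to its raw bytes.
b64dec() {
  printf '%s' "$1" | base64 -d
}

b64dec "T2xkR3V5"; echo        # prints OldGuy
b64dec "Y2ZhOmFsaWFz"; echo    # prints cfa:alias
```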
 261
 262 .Endpoints for `Scan` Operations
 263 [options="header", cols="2m,m,3d,6l"]
 264 |===
 265 |Endpoint
 266 |HTTP Verb
 267 |Description
 268 |Example
 269
 270 |/_table_/scanner/
 271 |PUT
 272 |Get a Scanner object. Required by all other Scan operations. Adjust the batch parameter
 273 to the number of rows the scan should return in a batch. See the next example for
 274 adding filters to your scanner. The scanner endpoint URL is returned as the `Location`
 275 in the HTTP response. The other examples in this table assume that the scanner endpoint
 276 is `\http://example.com:8000/users/scanner/145869072824375522207`.
 277 |curl -vi -X PUT \
 278   -H "Accept: text/xml" \
 279   -H "Content-Type: text/xml" \
 280   -d '<Scanner batch="1"/>' \
 281   "http://example.com:8000/users/scanner/"
 282
 283 |/_table_/scanner/
 284 |PUT
 285 |To supply filters to the Scanner object or configure the
 286 Scanner in any other way, you can create a text file and add
 287 your filter to the file. For example, to return only rows for
which keys start with `u123` and use a batch size
 289 of 100, the filter file would look like this:
 290
 291 [source,xml]
 292 ----
 293 <Scanner batch="100">
 294   <filter>
 295     {
 296       "type": "PrefixFilter",
 297       "value": "u123"
 298     }
 299   </filter>
 300 </Scanner>
 301 ----
 302
 303 Pass the file to the `-d` argument of the `curl` request.
 304 |curl -vi -X PUT \
 305   -H "Accept: text/xml" \
  -H "Content-Type: text/xml" \
 307   -d @filter.txt \
 308   "http://example.com:8000/users/scanner/"
 309
 310 |/_table_/scanner/_scanner-id_
 311 |GET
 312 |Get the next batch from the scanner. Cell values are byte-encoded. If the scanner
 313 has been exhausted, HTTP status `204` is returned.
 314 |curl -vi -X GET \
 315   -H "Accept: text/xml" \
 316   "http://example.com:8000/users/scanner/145869072824375522207"
 317
|/_table_/scanner/_scanner-id_
 319 |DELETE
 320 |Deletes the scanner and frees the resources it used.
 321 |curl -vi -X DELETE \
 322   -H "Accept: text/xml" \
 323   "http://example.com:8000/users/scanner/145869072824375522207"
 324
 325 |===
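
The scanner workflow above depends on reading the `Location` header from the first response. As a hedged sketch, the hypothetical `scanner_location` function below extracts that value from raw response headers on standard input, such as those printed by `curl -i`. It assumes a single `Location` header and does no validation. The sample headers are illustrative only.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and
# print the value of the first Location header.
scanner_location() {
  tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Sample headers shaped like a scanner-creation response (illustrative only).
printf 'HTTP/1.1 201 Created\r\nLocation: http://example.com:8000/users/scanner/145869072824375522207\r\nContent-Length: 0\r\n\r\n' \
  | scanner_location
```

In practice the headers would come from the `PUT` to `/users/scanner/` shown in the table. The URL that the function prints is then the target of the later `GET` and `DELETE` requests.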
 326
 327 .Endpoints for `Put` Operations
 328 [options="header", cols="2m,m,3d,6l"]
 329 |===
 330 |Endpoint
 331 |HTTP Verb
 332 |Description
 333 |Example
 334
 335 |/_table_/_row_key_
 336 |PUT
 337 |Write a row to a table. The row, column qualifier, and value must each be Base-64
 338 encoded. To encode a string, use the `base64` command-line utility. To decode the
 339 string, use `base64 -d`. The payload is in the `--data` argument, and the `/users/fakerow`
 340 value is a placeholder. Insert multiple rows by adding them to the `<CellSet>`
 341 element. You can also save the data to be inserted to a file and pass it to the `-d`
 342 parameter with syntax like `-d @filename.txt`.
 343 |curl -vi -X PUT \
 344   -H "Accept: text/xml" \
 345   -H "Content-Type: text/xml" \
 346   -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93NQo="><Cell column="Y2Y6ZQo=">dmFsdWU1Cg==</Cell></Row></CellSet>' \
 347   "http://example.com:8000/users/fakerow"
 348
curl -vi -X PUT \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{"Row":[{"key":"cm93NQo=", "Cell": [{"column":"Y2Y6ZQo=", "$":"dmFsdWU1Cg=="}]}]}' \
  "http://example.com:8000/users/fakerow"
 354
 355 |===
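
When preparing a payload, note that `echo` appends a newline that becomes part of the encoded value. The sample key `cm93NQo=` above decodes to `row5` followed by a newline. The sketch below uses `printf '%s'` to avoid that. The helper names `b64` and `cellset_xml` are hypothetical, and the sketch assumes one cell per request.

```shell
# Hypothetical helper: Base-64 encode a string without a trailing newline.
# tr removes the line wrapping that some base64 implementations add.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Hypothetical helper: build a single-cell CellSet document.
# Arguments: row key, column (family:qualifier), value.
cellset_xml() {
  printf '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="%s"><Cell column="%s">%s</Cell></Row></CellSet>' \
    "$(b64 "$1")" "$(b64 "$2")" "$(b64 "$3")"
}

cellset_xml "row5" "cf:e" "value5"; echo
```

The result can be passed to `curl` with `-d "$(cellset_xml row5 cf:e value5)"`.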
 356
 357 .Endpoints for `Check-And-Put` Operations
 358 [options="header", cols="2m,m,3d,6l"]
 359 |===
 360 |Endpoint
 361 |HTTP Verb
 362 |Description
 363 |Example
 364
 365 |/_table_/_row_key_/?check=put
 366 |PUT
|Conditional Put: Compare the current (latest) version value of a cell (`current-version-value`) with the `check-value`, and if `current-version-value` == `check-value`, write new data (the `new-value`) into the cell as the latest version. The row, column qualifier, and value must each be Base-64 encoded. To encode a string, use the `base64` command-line utility. To decode the string, use `base64 -d`. The payload is in the `--data` or `-d` argument. Within the same row key, list the cell to write (column family:column name and new value) first, and the check cell (column family:column name and check value) last, immediately after it. You can also save the data to be inserted to a file and pass it to the `-d` parameter with syntax like `-d @filename.txt`.
 368 |curl -vi -X PUT \
 369   -H "Accept: text/xml" \
 370   -H "Content-Type: text/xml" \
 371   -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93MQ=="><Cell column="Y2ZhOmFsaWFz">T2xkR3V5</Cell><Cell column="Y2ZhOmFsaWFz">TmV3R3V5</Cell></Row></CellSet>' \
 372   "http://example.com:8000/users/row1/?check=put"
 373
 374 curl -vi -X PUT \
 375   -H "Accept: application/json" \
 376   -H "Content-Type: application/json" \
 377   -d '{"Row":[{"key":"cm93MQ==","Cell":[{"column":"Y2ZhOmFsaWFz","$":"T2xkR3V5"},{"column":"Y2ZhOmFsaWFz", "$":"TmV3R3V5"}] }]}' \
 378   "http://example.com:8000/users/row1/?check=put"
|===

Detailed Explanation:

In the JSON example above:

. `{"column":"Y2ZhOmFsaWFz", "$":"TmV3R3V5"}`, the last cell in the `-d` payload, is the check cell name and check value in Base-64: `Y2ZhOmFsaWFz` for `cfa:alias` and `TmV3R3V5` for `NewGuy`.
. `{"column":"Y2ZhOmFsaWFz","$":"T2xkR3V5"}` is the cell name and value to write, in Base-64: `Y2ZhOmFsaWFz` for `cfa:alias` and `T2xkR3V5` for `OldGuy`.
. `"cm93MQ=="` is the Base-64 encoding of `row1`, the checkAndPut row key.
. `/?check=put` after the row key in the request URL is required for the checkAndPut WebHBase operation to work.
. The row key in the request URL should be URL-encoded. For example, `david%20chen` and `row1` are the URL-encoded forms of the row keys `david chen` and `row1`, respectively.

NOTE: In the non-Base-64-encoded cell name, `cfa` is the column family name and `alias` is the column (qualifier) name.

The XML example is equivalent to the JSON example and is not explained separately.
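
The ordering rule above, value to write first and check cell last, can be sketched as a small payload builder. `check_and_put_json` and `b64` are hypothetical helper names, and the sketch assumes both cells target the same column, as in the example.

```shell
# Hypothetical helper: Base-64 encode a string without a trailing newline.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Hypothetical helper: build a checkAndPut JSON payload.
# Arguments: row key, column (family:qualifier), value to write, check value.
# The cell to write comes first and the check cell comes last.
check_and_put_json() {
  printf '{"Row":[{"key":"%s","Cell":[{"column":"%s","$":"%s"},{"column":"%s","$":"%s"}]}]}' \
    "$(b64 "$1")" "$(b64 "$2")" "$(b64 "$3")" "$(b64 "$2")" "$(b64 "$4")"
}

check_and_put_json "row1" "cfa:alias" "OldGuy" "NewGuy"; echo
```

With these arguments the output matches the JSON payload in the table, apart from whitespace.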
 392
 393 .Endpoints for `Check-And-Delete` Operations
 394 [options="header", cols="2m,m,3d,6l"]
 395 |===
 396 |Endpoint
 397 |HTTP Verb
 398 |Description
 399 |Example
 400
 401 |/_table_/_row_key_/?check=delete
 402 |DELETE
|Conditional Deleting a Row: Compare the value of any version of a cell (`any-version-value`) with the `check-value`, and if `any-version-value` == `check-value`, delete the row specified by the `row_key` in the request URL. The row, column qualifier, and value for checking in the payload must each be Base-64 encoded. To encode a string, use the `base64` command-line utility. To decode the string, use `base64 -d`. The payload is in the `--data` argument. You can also save the data to be checked to a file and pass it to the `-d` parameter with syntax like `-d @filename.txt`.
 404 |curl -vi -X DELETE \
 405   -H "Accept: text/xml" \
 406   -H "Content-Type: text/xml" \
 407   -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93MQ=="><Cell column="Y2ZhOmFsaWFz">TmV3R3V5</Cell></Row></CellSet>' \
 408   "http://example.com:8000/users/row1/?check=delete"
 409
 410 curl -vi -X DELETE \
 411   -H "Accept: application/json" \
 412   -H "Content-Type: application/json" \
 413   -d '{"Row":[{"key":"cm93MQ==","Cell":[{"column":"Y2ZhOmFsaWFz","$":"TmV3R3V5"}]}]}' \
 414   "http://example.com:8000/users/row1/?check=delete"
 415
 416 |/_table_/_row_key_
 417 /_column_family_
 418 /?check=delete
 419 |DELETE
|Conditional Deleting a Column Family of a Row: Compare the value of any version of a cell (`any-version-value`) with the `check-value`, and if `any-version-value` == `check-value`, delete the column family of the row specified by the `row_key/column_family` in the request URL. Everything else is the same as in `Conditional Deleting a Row`.
 421 |curl -vi -X DELETE \
 422   -H "Accept: text/xml" \
 423   -H "Content-Type: text/xml" \
 424   -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93MQ=="><Cell column="Y2ZhOmFsaWFz">TmV3R3V5</Cell></Row></CellSet>' \
 425   "http://example.com:8000/users/row1/cfa/?check=delete"
 426
 427 curl -vi -X DELETE \
 428   -H "Accept: application/json" \
 429   -H "Content-Type: application/json" \
 430   -d '{"Row":[{"key":"cm93MQ==","Cell":[{"column":"Y2ZhOmFsaWFz","$":"TmV3R3V5"}]}]}' \
 431   "http://example.com:8000/users/row1/cfa/?check=delete"
 432
 433 |/_table_/_row_key_
 434 /_column:qualifier_
 435 /?check=delete
 436 |DELETE
|Conditional Deleting All Versions of a Column of a Row: Compare the value of any version of a cell (`any-version-value`) with the `check-value`, and if `any-version-value` == `check-value`, delete the column of the row specified by the `row_key/column:qualifier` in the request URL. The `column:qualifier` in the request URL is the `column_family:column_name`. Everything else is the same as in `Conditional Deleting a Row`.
 438 |curl -vi -X DELETE \
 439   -H "Accept: text/xml" \
 440   -H "Content-Type: text/xml" \
 441   -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93MQ=="><Cell column="Y2ZhOmFsaWFz">TmV3R3V5</Cell></Row></CellSet>' \
 442   "http://example.com:8000/users/row1/cfa:alias/?check=delete"
 443
 444 curl -vi -X DELETE \
 445   -H "Accept: application/json" \
 446   -H "Content-Type: application/json" \
 447   -d '{"Row":[{"key":"cm93MQ==","Cell":[{"column":"Y2ZhOmFsaWFz","$":"TmV3R3V5"}]}]}' \
 448   "http://example.com:8000/users/row1/cfa:alias/?check=delete"
 449
 450 |/_table_/_row_key_
 451 /_column:qualifier_
 452 /_version_id_/?check=delete
 453 |DELETE
|Conditional Deleting a Single Version of a Column of a Row: Compare the value of any version of a cell (`any-version-value`) with the `check-value`, and if `any-version-value` == `check-value`, delete the version of the column of the row specified by the `row_key/column:qualifier/version_id` in the request URL. The `column:qualifier` in the request URL is the `column_family:column_name`. The `version_id` in the request URL is a number equal to the timestamp of the targeted version + 1. Everything else is the same as in `Conditional Deleting a Row`.
 455 |curl -vi -X DELETE \
 456   -H "Accept: text/xml" \
 457   -H "Content-Type: text/xml" \
 458   -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93MQ=="><Cell column="Y2ZhOmFsaWFz">TmV3R3V5</Cell></Row></CellSet>' \
 459   "http://example.com:8000/users/row1/cfa:alias/1519423552160/?check=delete"
 460
 461 curl -vi -X DELETE \
 462   -H "Accept: application/json" \
 463   -H "Content-Type: application/json" \
 464   -d '{"Row":[{"key":"cm93MQ==","Cell":[{"column":"Y2ZhOmFsaWFz","$":"TmV3R3V5"}]}]}' \
 465   "http://example.com:8000/users/row1/cfa:alias/1519423552160/?check=delete"
|===

Detailed Explanation:

In the four JSON examples above:

. `{"column":"Y2ZhOmFsaWFz", "$":"TmV3R3V5"}` in the `-d` payload is the check cell name and check value in Base-64: `Y2ZhOmFsaWFz` for `cfa:alias` and `TmV3R3V5` for `NewGuy`.
. `"cm93MQ=="` is the Base-64 encoding of `row1`, the checkAndDelete row key.
. `/?check=delete` at the end of the request URL is required for the checkAndDelete WebHBase operation to work.
. The `version_id` in the request URL of the last JSON example should equal the timestamp of the targeted version plus 1.
. The row key, column family, cell name (`column family:column name`), and `version_id` in the request URL of a checkAndDelete WebHBase operation should be URL-encoded. In the examples, `row1`, `cfa`, `cfa:alias`, and `1519423552160` are already the URL-encoded row key, column family, cell name, and `version_id`, respectively.

The four XML examples are equivalent to the corresponding JSON examples and are not explained separately.
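
To illustrate the `version_id` rule, the hypothetical `check_delete_url` function below builds the request URL for deleting a single version from a cell timestamp. It assumes the row key and column contain only characters that need no URL-encoding. The timestamp `1519423552159` is not from the examples. It is simply the value that yields the `version_id` used above.

```shell
# Hypothetical helper: build a single-version checkAndDelete URL.
# Arguments: base URL, table, row key, column (family:qualifier), cell timestamp.
# The version_id in the URL is the cell timestamp plus 1.
check_delete_url() {
  printf '%s/%s/%s/%s/%s/?check=delete\n' "$1" "$2" "$3" "$4" "$(($5 + 1))"
}

check_delete_url "http://example.com:8000" users row1 cfa:alias 1519423552159
# prints http://example.com:8000/users/row1/cfa:alias/1519423552160/?check=delete
```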
 477
 478 [[xml_schema]]
 479 === REST XML Schema
 480
 481 [source,xml]
 482 ----
 483 <schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="RESTSchema">
 484
 485   <element name="Version" type="tns:Version"></element>
 486
 487   <complexType name="Version">
 488     <attribute name="REST" type="string"></attribute>
 489     <attribute name="JVM" type="string"></attribute>
 490     <attribute name="OS" type="string"></attribute>
 491     <attribute name="Server" type="string"></attribute>
 492     <attribute name="Jersey" type="string"></attribute>
 493   </complexType>
 494
 495   <element name="TableList" type="tns:TableList"></element>
 496
 497   <complexType name="TableList">
 498     <sequence>
 499       <element name="table" type="tns:Table" maxOccurs="unbounded" minOccurs="1"></element>
 500     </sequence>
 501   </complexType>
 502
 503   <complexType name="Table">
 504     <sequence>
 505       <element name="name" type="string"></element>
 506     </sequence>
 507   </complexType>
 508
 509   <element name="TableInfo" type="tns:TableInfo"></element>
 510
 511   <complexType name="TableInfo">
 512     <sequence>
 513       <element name="region" type="tns:TableRegion" maxOccurs="unbounded" minOccurs="1"></element>
 514     </sequence>
 515     <attribute name="name" type="string"></attribute>
 516   </complexType>
 517
 518   <complexType name="TableRegion">
 519     <attribute name="name" type="string"></attribute>
 520     <attribute name="id" type="int"></attribute>
 521     <attribute name="startKey" type="base64Binary"></attribute>
 522     <attribute name="endKey" type="base64Binary"></attribute>
 523     <attribute name="location" type="string"></attribute>
 524   </complexType>
 525
 526   <element name="TableSchema" type="tns:TableSchema"></element>
 527
 528   <complexType name="TableSchema">
 529     <sequence>
 530       <element name="column" type="tns:ColumnSchema" maxOccurs="unbounded" minOccurs="1"></element>
 531     </sequence>
 532     <attribute name="name" type="string"></attribute>
 533     <anyAttribute></anyAttribute>
 534   </complexType>
 535
 536   <complexType name="ColumnSchema">
 537     <attribute name="name" type="string"></attribute>
 538     <anyAttribute></anyAttribute>
 539   </complexType>
 540
 541   <element name="CellSet" type="tns:CellSet"></element>
 542
 543   <complexType name="CellSet">
 544     <sequence>
 545       <element name="row" type="tns:Row" maxOccurs="unbounded" minOccurs="1"></element>
 546     </sequence>
 547   </complexType>
 548
 549   <element name="Row" type="tns:Row"></element>
 550
 551   <complexType name="Row">
 552     <sequence>
 553       <element name="key" type="base64Binary"></element>
 554       <element name="cell" type="tns:Cell" maxOccurs="unbounded" minOccurs="1"></element>
 555     </sequence>
 556   </complexType>
 557
 558   <element name="Cell" type="tns:Cell"></element>
 559
 560   <complexType name="Cell">
 561     <sequence>
 562       <element name="value" maxOccurs="1" minOccurs="1">
        <simpleType><restriction base="base64Binary" />
        </simpleType>
 565       </element>
 566     </sequence>
 567     <attribute name="column" type="base64Binary" />
 568     <attribute name="timestamp" type="int" />
 569   </complexType>
 570
 571   <element name="Scanner" type="tns:Scanner"></element>
 572
 573   <complexType name="Scanner">
 574     <sequence>
 575       <element name="column" type="base64Binary" minOccurs="0" maxOccurs="unbounded"></element>
 576     </sequence>
 577     <sequence>
 578       <element name="filter" type="string" minOccurs="0" maxOccurs="1"></element>
 579     </sequence>
 580     <attribute name="startRow" type="base64Binary"></attribute>
 581     <attribute name="endRow" type="base64Binary"></attribute>
 582     <attribute name="batch" type="int"></attribute>
 583     <attribute name="startTime" type="int"></attribute>
 584     <attribute name="endTime" type="int"></attribute>
 585   </complexType>
 586
 587   <element name="StorageClusterVersion" type="tns:StorageClusterVersion" />
 588
 589   <complexType name="StorageClusterVersion">
 590     <attribute name="version" type="string"></attribute>
 591   </complexType>
 592
 593   <element name="StorageClusterStatus"
 594     type="tns:StorageClusterStatus">
 595   </element>
 596
 597   <complexType name="StorageClusterStatus">
 598     <sequence>
 599       <element name="liveNode" type="tns:Node"
 600         maxOccurs="unbounded" minOccurs="0">
 601       </element>
 602       <element name="deadNode" type="string" maxOccurs="unbounded"
 603         minOccurs="0">
 604       </element>
 605     </sequence>
 606     <attribute name="regions" type="int"></attribute>
 607     <attribute name="requests" type="int"></attribute>
 608     <attribute name="averageLoad" type="float"></attribute>
 609   </complexType>
 610
 611   <complexType name="Node">
 612     <sequence>
 613       <element name="region" type="tns:Region"
        maxOccurs="unbounded" minOccurs="0">
 615       </element>
 616     </sequence>
 617     <attribute name="name" type="string"></attribute>
 618     <attribute name="startCode" type="int"></attribute>
 619     <attribute name="requests" type="int"></attribute>
 620     <attribute name="heapSizeMB" type="int"></attribute>
 621     <attribute name="maxHeapSizeMB" type="int"></attribute>
 622   </complexType>
 623
 624   <complexType name="Region">
 625     <attribute name="name" type="base64Binary"></attribute>
 626     <attribute name="stores" type="int"></attribute>
 627     <attribute name="storefiles" type="int"></attribute>
 628     <attribute name="storefileSizeMB" type="int"></attribute>
 629     <attribute name="memstoreSizeMB" type="int"></attribute>
 630     <attribute name="storefileIndexSizeMB" type="int"></attribute>
 631   </complexType>
 632
 633 </schema>
 634 ----
 635
 636 [[protobufs_schema]]
 637 === REST Protobufs Schema
 638
[source,protobuf]
 640 ----
 641 message Version {
 642   optional string restVersion = 1;
 643   optional string jvmVersion = 2;
 644   optional string osVersion = 3;
 645   optional string serverVersion = 4;
 646   optional string jerseyVersion = 5;
 647 }
 648
 649 message StorageClusterStatus {
 650   message Region {
 651     required bytes name = 1;
 652     optional int32 stores = 2;
 653     optional int32 storefiles = 3;
 654     optional int32 storefileSizeMB = 4;
 655     optional int32 memstoreSizeMB = 5;
 656     optional int32 storefileIndexSizeMB = 6;
 657   }
 658   message Node {
 659     required string name = 1;    // name:port
 660     optional int64 startCode = 2;
 661     optional int32 requests = 3;
 662     optional int32 heapSizeMB = 4;
 663     optional int32 maxHeapSizeMB = 5;
 664     repeated Region regions = 6;
 665   }
 666   // node status
 667   repeated Node liveNodes = 1;
 668   repeated string deadNodes = 2;
 669   // summary statistics
 670   optional int32 regions = 3;
 671   optional int32 requests = 4;
 672   optional double averageLoad = 5;
 673 }
 674
 675 message TableList {
 676   repeated string name = 1;
 677 }
 678
 679 message TableInfo {
 680   required string name = 1;
 681   message Region {
 682     required string name = 1;
 683     optional bytes startKey = 2;
 684     optional bytes endKey = 3;
 685     optional int64 id = 4;
 686     optional string location = 5;
 687   }
 688   repeated Region regions = 2;
 689 }
 690
 691 message TableSchema {
 692   optional string name = 1;
 693   message Attribute {
 694     required string name = 1;
 695     required string value = 2;
 696   }
 697   repeated Attribute attrs = 2;
 698   repeated ColumnSchema columns = 3;
 699   // optional helpful encodings of commonly used attributes
 700   optional bool inMemory = 4;
 701   optional bool readOnly = 5;
 702 }
 703
 704 message ColumnSchema {
 705   optional string name = 1;
 706   message Attribute {
 707     required string name = 1;
 708     required string value = 2;
 709   }
 710   repeated Attribute attrs = 2;
 711   // optional helpful encodings of commonly used attributes
 712   optional int32 ttl = 3;
 713   optional int32 maxVersions = 4;
 714   optional string compression = 5;
 715 }
 716
 717 message Cell {
 718   optional bytes row = 1;       // unused if Cell is in a CellSet
 719   optional bytes column = 2;
 720   optional int64 timestamp = 3;
 721   optional bytes data = 4;
 722 }
 723
 724 message CellSet {
 725   message Row {
 726     required bytes key = 1;
 727     repeated Cell values = 2;
 728   }
 729   repeated Row rows = 1;
 730 }
 731
 732 message Scanner {
 733   optional bytes startRow = 1;
 734   optional bytes endRow = 2;
 735   repeated bytes columns = 3;
 736   optional int32 batch = 4;
 737   optional int64 startTime = 5;
 738   optional int64 endTime = 6;
 739 }
 740 ----
 741
 742 == Thrift
 743
 744 Documentation about Thrift has moved to <<thrift>>.
 745
 746 [[c]]
 747 == C/C++ Apache HBase Client
 748
Chip Turner of Facebook wrote a pure C/C++ client.
link:https://github.com/hinaria/native-cpp-hbase-client[Check it out].

For the C++ client implementation, see link:https://issues.apache.org/jira/browse/HBASE-14850[HBASE-14850].
 753
[[jdo]]
== Using Java Data Objects (JDO) with HBase
 757
 758 link:https://db.apache.org/jdo/[Java Data Objects (JDO)] is a standard way to
 759 access persistent data in databases, using plain old Java objects (POJO) to
 760 represent persistent data.
 761
 762 .Dependencies
 763 This code example has the following dependencies:
 764
 765 . HBase 0.90.x or newer
 766 . commons-beanutils.jar (https://commons.apache.org/)
 767 . commons-pool-1.5.5.jar (https://commons.apache.org/)
 768 . transactional-tableindexed for HBase 0.90 (https://github.com/hbase-trx/hbase-transactional-tableindexed)
 769
 770 .Download `hbase-jdo`
 771 Download the code from http://code.google.com/p/hbase-jdo/.
 772
 773 .JDO Example
 774 ====
 775
 776 This example uses JDO to create a table and an index, insert a row into a table, get
 777 a row, get a column value, perform a query, and do some additional HBase operations.
 778
[source, java]
----
package com.apache.hadoop.hbase.client.jdo.examples;

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Hashtable;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.tableindexed.IndexedTable;

import com.apache.hadoop.hbase.client.jdo.AbstractHBaseDBO;
import com.apache.hadoop.hbase.client.jdo.HBaseBigFile;
import com.apache.hadoop.hbase.client.jdo.HBaseDBOImpl;
import com.apache.hadoop.hbase.client.jdo.query.DeleteQuery;
import com.apache.hadoop.hbase.client.jdo.query.HBaseOrder;
import com.apache.hadoop.hbase.client.jdo.query.HBaseParam;
import com.apache.hadoop.hbase.client.jdo.query.InsertQuery;
import com.apache.hadoop.hbase.client.jdo.query.QSearch;
import com.apache.hadoop.hbase.client.jdo.query.SelectQuery;
import com.apache.hadoop.hbase.client.jdo.query.UpdateQuery;

/**
 * HBase JDO example.
 *
 * Required libraries:
 * - commons-beanutils.jar
 * - commons-pool-1.5.5.jar
 * - hbase0.90.0-transactionl.jar
 *
 * You can extend the Delete, Select, Update, and Insert query classes.
 *
 * UserBean is assumed to be a bean class in this package; it is not shown here.
 */
public class HBaseExample {
  public static void main(String[] args) throws Exception {
    AbstractHBaseDBO dbo = new HBaseDBOImpl();

    // Drop the table if it already exists.
    if (dbo.isTableExist("user")) {
      dbo.deleteTable("user");
    }

    // Create the table.
    dbo.createTableIfNotExist("user", HBaseOrder.DESC, "account");
    // dbo.createTableIfNotExist("user", HBaseOrder.ASC, "account");

    // Create an index.
    String[] cols = {"id", "name"};
    dbo.addIndexExistingTable("user", "account", cols);

    // Insert a row.
    InsertQuery insert = dbo.createInsertQuery("user");
    UserBean bean = new UserBean();
    bean.setFamily("account");
    bean.setAge(20);
    bean.setEmail("ncanis@gmail.com");
    bean.setId("ncanis");
    bean.setName("ncanis");
    bean.setPassword("1111");
    insert.insert(bean);

    // Select one row.
    SelectQuery select = dbo.createSelectQuery("user");
    UserBean resultBean = (UserBean) select.select(bean.getRow(), UserBean.class);

    // Select a single column value.
    String value = (String) select.selectColumn(bean.getRow(), "account", "id", String.class);

    // Search with options (QSearch has EQUAL, NOT_EQUAL, LIKE).
    // select id,password,name,email from account where id='ncanis' limit startRow,20
    HBaseParam param = new HBaseParam();
    param.setPage(bean.getRow(), 20);
    param.addColumn("id", "password", "name", "email");
    param.addSearchOption("id", "ncanis", QSearch.EQUAL);
    select.search("account", param, UserBean.class);

    // Check whether a column value exists.
    boolean isExist = select.existColumnValue("account", "id", "ncanis".getBytes());

    // Update the password.
    UpdateQuery update = dbo.createUpdateQuery("user");
    Hashtable<String, byte[]> colsTable = new Hashtable<String, byte[]>();
    colsTable.put("password", "2222".getBytes());
    update.update(bean.getRow(), "account", colsTable);

    // Delete the row.
    DeleteQuery delete = dbo.createDeleteQuery("user");
    delete.deleteRow(resultBean.getRow());

    ////////////////////////////////////
    // Additional operations

    // HTable pool backed by Apache Commons Pool: borrow and release.
    // See HBasePoolManager (maxActive, minIdle, etc.).
    IndexedTable table = dbo.getPool().borrow("user");
    dbo.getPool().release(table);

    // Upload a big file directly through Hadoop.
    HBaseBigFile bigFile = new HBaseBigFile();
    File file = new File("doc/movie.avi");
    FileInputStream fis = new FileInputStream(file);
    Path rootPath = new Path("/files/");
    String filename = "movie.avi";
    bigFile.uploadFile(rootPath, filename, fis, true);

    // Receive the file as a stream from Hadoop.
    Path p = new Path(rootPath, filename);
    InputStream is = bigFile.path2Stream(p, 4096);

    // Close the streams when done.
    is.close();
    fis.close();
  }
}
----
 892 ====
 893
 894 [[scala]]
 895 == Scala
 896
 897 === Setting the Classpath
 898
To use Scala with HBase, your CLASSPATH must include HBase's classpath as well as
the Scala JARs required by your code. First, run the following command on a server
running the HBase RegionServer process. It extracts the `java.library.path` value
from the RegionServer's command line, which gives the HBase paths used in the next step.
 902
[source, bash]
----
$ ps aux | grep regionserver | awk -F 'java.library.path=' '{print $2}' | awk '{print $1}'

/usr/lib/hadoop/lib/native:/usr/lib/hbase/lib/native/Linux-amd64-64
----
 909
 910 Set the `$CLASSPATH` environment variable to include the path you found in the previous
 911 step, plus the path of `scala-library.jar` and each additional Scala-related JAR needed for
 912 your project.
 913
 914 [source, bash]
 915 ----
 916 $ export CLASSPATH=$CLASSPATH:/usr/lib/hadoop/lib/native:/usr/lib/hbase/lib/native/Linux-amd64-64:/path/to/scala-library.jar
 917 ----
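
The `ps` pipeline above reports the RegionServer's `java.library.path`. As a complementary sketch, the `hbase classpath` command prints the Java classpath that the `hbase` launcher script assembles, and that output can be combined with the Scala JAR. The `SCALA_LIB` and `HBASE_CP` variables and the default JAR path are placeholders introduced for illustration, and the sketch assumes that `hbase` may or may not be on your `PATH`.

```shell
# Placeholder location of the Scala runtime JAR; adjust for your installation.
SCALA_LIB="${SCALA_LIB:-/path/to/scala-library.jar}"

# `hbase classpath` prints the classpath used by the hbase launcher script.
# Fall back to an empty value when the command is not available.
if command -v hbase >/dev/null 2>&1; then
  HBASE_CP="$(hbase classpath)"
else
  HBASE_CP=""
  echo "hbase not found on PATH; only the Scala JAR will be added" >&2
fi

# Append to any existing CLASSPATH, skipping empty components.
CLASSPATH="${CLASSPATH:+$CLASSPATH:}${HBASE_CP:+$HBASE_CP:}$SCALA_LIB"
export CLASSPATH
echo "$CLASSPATH"
```

Appending with `${VAR:+...}` avoids leading or doubled colons, which a JVM would otherwise treat as empty classpath entries.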
 918
 919 === Scala SBT File
 920
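Your `build.sbt` file needs the following `resolvers` and `libraryDependencies` to work
with HBase. The versions shown come from an old release; use the Hadoop and HBase
versions that match your cluster. Note that the `ConnectionFactory` API used in the
example code below requires HBase 1.0 or newer.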
 923
 924 ----
 925 resolvers += "Apache HBase" at "https://repository.apache.org/content/repositories/releases"
 926
 927 resolvers += "Thrift" at "https://people.apache.org/~rawson/repo/"
 928
 929 libraryDependencies ++= Seq(
 930     "org.apache.hadoop" % "hadoop-core" % "0.20.2",
 931     "org.apache.hbase" % "hbase" % "0.90.4"
 932 )
 933 ----
 934
 935 === Example Scala Code
 936
This example lists HBase tables, then adds a row to a table named `mytable` and reads the value back.
It assumes that `mytable` already exists with a column family named `ids`.
 938
[source, scala]
----
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

val conf = HBaseConfiguration.create()
val connection = ConnectionFactory.createConnection(conf)
val admin = connection.getAdmin()

// list the tables
val listtables = admin.listTables()
listtables.foreach(println)

// insert some data into the existing table 'mytable' and get the row back
val table = connection.getTable(TableName.valueOf("mytable"))

val theput = new Put(Bytes.toBytes("rowkey1"))
theput.addColumn(Bytes.toBytes("ids"), Bytes.toBytes("id1"), Bytes.toBytes("one"))
table.put(theput)

val theget = new Get(Bytes.toBytes("rowkey1"))
val result = table.get(theget)
val value = result.value()
println(Bytes.toString(value))

// release client resources
table.close()
admin.close()
connection.close()
----
 966
 967 [[jython]]
 968 == Jython
 969
 970
 971 === Setting the Classpath
 972
 973 To use Jython with HBase, your CLASSPATH must include HBase's classpath as well as
 974 the Jython JARs required by your code.
 975
Locate the directory containing `jython.jar` and each additional Jython-related JAR needed for
your project, for example under `$JYTHON_HOME`. Then export `HBASE_CLASSPATH` so that it points to those JARs.
 978
 979 [source, bash]
 980 ----
 981 $ export HBASE_CLASSPATH=/directory/jython.jar
 982 ----
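
The export above can also be derived from `$JYTHON_HOME`, which the preceding paragraph mentions. The following is a minimal sketch, assuming `jython.jar` sits directly under `$JYTHON_HOME`; the `/directory` default and the `JYTHON_JAR` variable are placeholders introduced for illustration.

```shell
# Placeholder default; point JYTHON_HOME at your Jython installation.
JYTHON_HOME="${JYTHON_HOME:-/directory}"
JYTHON_JAR="$JYTHON_HOME/jython.jar"

# Warn early if the JAR is missing, since the Jython entry point
# could not be loaded from a classpath entry that does not exist.
if [ ! -f "$JYTHON_JAR" ]; then
  echo "warning: $JYTHON_JAR does not exist" >&2
fi

# Append to any existing HBASE_CLASSPATH instead of overwriting it.
HBASE_CLASSPATH="${HBASE_CLASSPATH:+$HBASE_CLASSPATH:}$JYTHON_JAR"
export HBASE_CLASSPATH
echo "$HBASE_CLASSPATH"
```

Appending rather than assigning keeps any entries that were already placed in `HBASE_CLASSPATH`, for example by `hbase-env.sh`.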
 983
Start a Jython shell with the HBase and Hadoop JARs in the classpath by running
`bin/hbase org.python.util.jython`.
 986
 987 === Jython Code Examples
 988
 989 .Table Creation, Population, Get, and Delete with Jython
 990 ====
The following Jython code example checks whether a table exists and,
if it does, deletes it. It then creates the table,
populates it with data, and fetches the data.
 994
[source,jython]
----
import java.lang
from org.apache.hadoop.hbase import HBaseConfiguration, HTableDescriptor, HColumnDescriptor, TableName
from org.apache.hadoop.hbase.client import ConnectionFactory, Get, Put

# First get a conf object. This reads the configuration found in your
# hbase-*.xml files, such as the location of the HBase master node.
conf = HBaseConfiguration.create()
connection = ConnectionFactory.createConnection(conf)
admin = connection.getAdmin()

# Describe a table named 'test' that has a column family
# named 'content'.
tableName = TableName.valueOf("test")
desc = HTableDescriptor(tableName)
desc.addFamily(HColumnDescriptor("content"))

# Drop the table if it already exists, then create it.
if admin.tableExists(tableName):
    admin.disableTable(tableName)
    admin.deleteTable(tableName)

admin.createTable(desc)
table = connection.getTable(tableName)

# Add content to 'content:qual' on a row named 'row_x'.
row = 'row_x'
put = Put(row)
put.addColumn("content", "qual", "some content")
table.put(put)

# Now fetch the content just added; getValue returns a byte[].
get = Get(row)
result = table.get(get)
data = java.lang.String(result.getValue("content", "qual"), "UTF8")

print "The fetched row contains the value '%s'" % data

# Release client resources.
table.close()
admin.close()
connection.close()
----
1038 ====
1039
1040 .Table Scan Using Jython
1041 ====
This example scans the `title` column family of a table and prints each row key together with the value of the `attr` qualifier.
1043
[source, jython]
----
import java.lang
from org.apache.hadoop.hbase import TableName, HBaseConfiguration
from org.apache.hadoop.hbase.client import ConnectionFactory

conf = HBaseConfiguration.create()
connection = ConnectionFactory.createConnection(conf)
tableName = TableName.valueOf('wiki')
table = connection.getTable(tableName)

cf = "title"
attr = "attr"

# Scan only the 'title' column family.
scanner = table.getScanner(cf)
while True:
    result = scanner.next()
    # next() returns null (None) once the scan is exhausted.
    if result is None:
        break
    # Print the row key and the value stored under title:attr.
    print java.lang.String(result.row), java.lang.String(result.getValue(cf, attr))

# Release client resources.
scanner.close()
table.close()
connection.close()
----
1065 ====