* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
* http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
= The Apache HBase Shell

The Apache HBase Shell is link:http://jruby.org[(J)Ruby]'s IRB with some HBase-specific commands added.
Anything you can do in IRB, you should be able to do in the HBase Shell.

To run the HBase shell, do as follows:

$ ./bin/hbase shell

Type `help` and then `<RETURN>` to see a listing of shell commands and options.
Browse at least the paragraphs at the end of the help output for the gist of how variables and command arguments are entered into the HBase shell; in particular, note how table names, rows, columns, and so on must be quoted.

See <<shell_exercises,shell exercises>> for examples of basic shell operation.

Here is a nicely formatted listing of link:http://learnhbase.wordpress.com/2013/03/02/hbase-shell-commands/[all shell commands] by Rajeshbabu Chintaguntla.
== Scripting with Ruby

For examples of scripting Apache HBase, look in the HBase _bin_ directory.
Look at the files that end in _*.rb_.
To run one of these files, do as follows:

$ ./bin/hbase org.jruby.Main PATH_TO_SCRIPT
== Running the Shell in Non-Interactive Mode

A new non-interactive mode has been added to the HBase Shell (link:https://issues.apache.org/jira/browse/HBASE-11658[HBASE-11658]).
Non-interactive mode captures the exit status (success or failure) of HBase Shell commands and passes that status back to the command interpreter.
If you use the normal interactive mode, the HBase Shell will only ever return its own exit status, which will nearly always be `0` for success.

To invoke non-interactive mode, pass the `-n` or `--non-interactive` option to HBase Shell.
[[hbase.shell.noninteractive]]
== HBase Shell in OS Scripts

You can use the HBase shell from within operating system script interpreters such as Bash, which is the default command interpreter for most Linux and UNIX distributions.
The following guidelines use Bash syntax, but could be adjusted to work with C-style shells such as csh or tcsh, and could probably be modified to work with the Microsoft Windows script interpreter as well. Submissions are welcome.

NOTE: Spawning HBase Shell commands in this way is slow, so keep that in mind when deciding whether it is appropriate to combine HBase operations with the operating system command line.

.Passing Commands to the HBase Shell
You can pass commands to the HBase Shell in non-interactive mode (see <<hbase.shell.noninteractive,hbase.shell.noninteractive>>) using the `echo` command and the `|` (pipe) operator.
Be sure to escape characters in the HBase commands which would otherwise be interpreted by the shell.
Some debug-level output has been truncated from the example below.
$ echo "describe 'test1'" | ./hbase shell -n

Version 0.98.3-hadoop2, rd5e65a9144e315bb0a964e7730871af32f5018d5, Sat May 31 19:56:09 PDT 2014

'test1', {NAME => 'cf', DATA_BLOCK_ENCODING => 'NON true
E', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0',
VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIO
NS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS =>
'false', BLOCKSIZE => '65536', IN_MEMORY => 'false'
, BLOCKCACHE => 'true'}
1 row(s) in 3.2410 seconds
To suppress all output, redirect it to _/dev/null_:

$ echo "describe 'test'" | ./hbase shell -n > /dev/null 2>&1
.Checking the Result of a Scripted Command
Since scripts are not designed to be run interactively, you need a way to check whether your command failed or succeeded.
The HBase shell uses the standard convention of returning a value of `0` for successful commands, and some non-zero value for failed commands.
Bash stores the exit status of the most recently run command in the special parameter `$?`.
Because that value is overwritten each time the shell runs any command, you should store the result in a different, script-defined variable.

This is a naive script that shows one way to store the return value and make a decision based upon it.
#!/bin/bash
echo "describe 'test'" | ./hbase shell -n > /dev/null 2>&1
status=$?   # save the exit status before another command overwrites $?
echo "The status was $status"
if [ "$status" -eq 0 ]; then
    echo "The command succeeded"
else
    echo "The command may have failed."
fi
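
The same check can be driven from Ruby, the language of the HBase Shell itself. The following is a minimal sketch rather than anything shipped with HBase: `run_with_input` is a hypothetical helper name, the stand-in `sh` commands only simulate exit codes, and the `./hbase shell -n` call in the final comment assumes the same working directory as the Bash example.

```ruby
require 'open3'

# Hypothetical helper (not part of HBase): feed `input` to a command on stdin,
# discard its combined stdout/stderr, and return the numeric exit status.
def run_with_input(input, *command)
  _output, status = Open3.capture2e(*command, stdin_data: input)
  status.exitstatus
end

# Stand-in commands so the sketch runs without a cluster.
ok     = run_with_input("ignored\n", 'sh', '-c', 'cat > /dev/null; exit 0')
failed = run_with_input("ignored\n", 'sh', '-c', 'cat > /dev/null; exit 3')
puts "ok=#{ok} failed=#{failed}"

# Against HBase, the call might look like this (path assumed):
#   status = run_with_input("describe 'test'\n", './hbase', 'shell', '-n')
```

Returning the status as a value avoids the problem of `$?` being overwritten by a later command.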
=== Checking for Success or Failure In Scripts

Getting an exit code of `0` means that the command you scripted definitely succeeded.
However, getting a non-zero exit code does not necessarily mean the command failed.
The command could have succeeded, but the client lost connectivity, or some other event obscured its success.
This is because RPC commands are stateless.
The only way to be sure of the status of an operation is to check.
For instance, if your script creates a table but the shell returns a non-zero exit value, you should check whether the table was actually created before trying to create it again.
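
The check-before-retry rule can be sketched in a few lines of Ruby. This is an illustration under stated assumptions, not an HBase API: `ensure_done` is a hypothetical helper, and the two lambdas stand in for whatever your script uses to run the operation and to verify its effect (for example, piping a `create` and then an existence check through `hbase shell -n`).

```ruby
# Hypothetical helper: retry an operation only when a follow-up check shows
# that it really did not take effect.
#   attempt - callable that runs the operation; true means exit status 0
#   verify  - callable that independently checks for the desired end state
def ensure_done(attempt:, verify:, max_tries: 3)
  max_tries.times do
    return true if attempt.call # exit status 0: definitely succeeded
    return true if verify.call  # non-zero status, but the work was done anyway
  end
  false
end

# Simulated scenario: the table is created, but the client reports a failure.
tables = []
flaky_create = lambda do
  tables << 't1' # the server-side effect happens...
  false          # ...but the status seen by the script is a failure
end
table_exists = -> { tables.include?('t1') }

result = ensure_done(attempt: flaky_create, verify: table_exists)
puts "done=#{result} tables=#{tables.inspect}"
```

Because the verification runs before any retry, the simulated table is created only once even though the first attempt reported a failure.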
== Read HBase Shell Commands from a Command File

You can enter HBase Shell commands into a text file, one command per line, and pass that file to the HBase Shell.

.Example Command File
put 'test', 'row1', 'cf:a', 'value1'
put 'test', 'row2', 'cf:b', 'value2'
put 'test', 'row3', 'cf:c', 'value3'
put 'test', 'row4', 'cf:d', 'value4'
.Directing HBase Shell to Execute the Commands
Pass the path to the command file as the only argument to the `hbase shell` command.
Each command is executed and its output is shown.
If you do not include the `exit` command in your script, you are returned to the HBase shell prompt.
There is no way to programmatically check each individual command for success or failure.
Also, although you see the output of each command, the commands themselves are not echoed to the screen, so it can be difficult to match each command with its output.
$ ./hbase shell ./sample_commands.txt
0 row(s) in 3.4170 seconds

1 row(s) in 0.0590 seconds

0 row(s) in 0.1540 seconds

0 row(s) in 0.0080 seconds

0 row(s) in 0.0060 seconds

0 row(s) in 0.0060 seconds

row1 column=cf:a, timestamp=1407130286968, value=value1
row2 column=cf:b, timestamp=1407130286997, value=value2
row3 column=cf:c, timestamp=1407130287007, value=value3
row4 column=cf:d, timestamp=1407130287015, value=value4
4 row(s) in 0.0420 seconds

cf:a timestamp=1407130286968, value=value1
1 row(s) in 0.0110 seconds

0 row(s) in 1.5630 seconds

0 row(s) in 0.4360 seconds
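
Command files are plain text, so they can be generated. The sketch below is one possible approach, not an HBase utility: `write_command_file` is a hypothetical helper, the file is written to the system temporary directory, and values are interpolated as-is, so they are assumed not to contain single quotes. It appends `exit` so that the shell does not stay at its prompt after the last command.

```ruby
require 'tmpdir'

# Hypothetical helper: write one `put` per cell, followed by `exit`,
# to a command file that can be passed to `hbase shell`.
def write_command_file(path, table, family, cells)
  lines = cells.map do |row, (qualifier, value)|
    "put '#{table}', '#{row}', '#{family}:#{qualifier}', '#{value}'"
  end
  lines << 'exit' # without this, the shell returns to its prompt
  File.write(path, lines.join("\n") + "\n")
  lines
end

cells = {
  'row1' => %w[a value1],
  'row2' => %w[b value2]
}
path = File.join(Dir.tmpdir, 'sample_commands.txt')
commands = write_command_file(path, 'test', 'cf', cells)
puts commands
```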
== Passing VM Options to the Shell

You can pass VM options to the HBase Shell using the `HBASE_SHELL_OPTS` environment variable.
You can set this in your environment, for instance by editing _~/.bashrc_, or set it as part of the command to launch HBase Shell.
The following example sets several garbage-collection-related options, just for the lifetime of the VM running the HBase Shell.
The command should be run all on a single line, but is broken by the `\` character, for readability.

$ HBASE_SHELL_OPTS="-verbose:gc -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps \
  -XX:+PrintGCDetails -Xloggc:$HBASE_HOME/logs/gc-hbase.log" ./bin/hbase shell
HBase 0.95 adds shell commands that provide JRuby-style object-oriented references for tables.
Previously, all of the shell commands that act upon a table had a procedural style that always took the name of the table as an argument.
HBase 0.95 introduces the ability to assign a table to a JRuby variable.
The table reference can be used to perform data read and write operations such as puts, scans, and gets, as well as admin functionality such as disabling, dropping, and describing tables.

For example, previously you would always specify a table name:
hbase(main):000:0> create 't', 'f'
0 row(s) in 1.0970 seconds
hbase(main):001:0> put 't', 'rold', 'f', 'v'
0 row(s) in 0.0080 seconds

hbase(main):002:0> scan 't'
rold column=f:, timestamp=1378473207660, value=v
1 row(s) in 0.0130 seconds

hbase(main):003:0> describe 't'
't', {NAME => 'f', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_ true
SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2
147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false
', BLOCKCACHE => 'true'}
1 row(s) in 1.4430 seconds

hbase(main):004:0> disable 't'
0 row(s) in 14.8700 seconds

hbase(main):005:0> drop 't'
0 row(s) in 23.1670 seconds
Now you can assign the table to a variable and use the results in JRuby shell code.

hbase(main):007 > t = create 't', 'f'
0 row(s) in 1.0970 seconds

hbase(main):008 > t.put 'r', 'f', 'v'
0 row(s) in 0.0640 seconds
hbase(main):009 > t.scan
r column=f:, timestamp=1331865816290, value=v
1 row(s) in 0.0110 seconds
hbase(main):010:0> t.describe
't', {NAME => 'f', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_ true
SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2
147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false
', BLOCKCACHE => 'true'}
1 row(s) in 0.0210 seconds
hbase(main):038:0> t.disable
0 row(s) in 6.2350 seconds
hbase(main):039:0> t.drop
0 row(s) in 0.2340 seconds
If the table has already been created, you can assign a Table to a variable by using the `get_table` method:

hbase(main):011 > create 't','f'
0 row(s) in 1.2500 seconds

hbase(main):012:0> tab = get_table 't'
0 row(s) in 0.0010 seconds

hbase(main):013:0> tab.put 'r1', 'f', 'v'
0 row(s) in 0.0100 seconds
hbase(main):014:0> tab.scan
r1 column=f:, timestamp=1378473876949, value=v
1 row(s) in 0.0240 seconds
The `list` functionality has also been extended so that it returns a list of table names as strings.
You can then use JRuby to script table operations based on these names.
The `list_snapshots` command behaves similarly.

hbase(main):016 > tables = list('t.*')
1 row(s) in 0.1040 seconds

=> #<#<Class:0x7677ce29>:0x21d377a4>
hbase(main):017:0> tables.map { |t| disable t ; drop t}
0 row(s) in 2.2510 seconds
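
Because the names are ordinary Ruby strings, the usual collection methods apply. The following sketch runs outside the shell and uses a made-up array of names in place of the result of `list`; `each_matching` is a hypothetical helper, not a shell command.

```ruby
# Hypothetical helper: yield each name that matches `pattern`.
def each_matching(names, pattern)
  names.grep(pattern).each { |name| yield name }
end

# Stand-in for the array of table names that `list` returns in the shell.
tables = %w[t1 t2 users]

selected = []
each_matching(tables, /\At\d+\z/) { |name| selected << name }
puts selected.inspect

# Inside the HBase Shell, the block could instead run: disable name; drop name
```

Filtering with an anchored regular expression before disabling or dropping makes it harder to remove a table by accident.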
Create an _.irbrc_ file for yourself in your home directory.

A useful customization is command history, so that commands are saved across Shell invocations:

require 'irb/ext/save-history'
IRB.conf[:SAVE_HISTORY] = 100
IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.irb-save-history"

To avoid printing the result of evaluating each expression to stderr (for example, the array of tables returned by the `list` command), turn off IRB's echo:

$ echo "IRB.conf[:ECHO] = false" >>~/.irbrc

See the `ruby` documentation of _.irbrc_ to learn about other possible configurations.
=== LOG data to timestamp

To convert the date '08/08/16 20:56:29' from an HBase log into a timestamp, do:

hbase(main):021:0> import java.text.SimpleDateFormat
hbase(main):022:0> import java.text.ParsePosition
hbase(main):023:0> SimpleDateFormat.new("yy/MM/dd HH:mm:ss").parse("08/08/16 20:56:29", ParsePosition.new(0)).getTime() => 1218920189000

To go the other direction:

hbase(main):021:0> import java.util.Date
hbase(main):022:0> Date.new(1218920189000).toString() => "Sat Aug 16 20:56:29 UTC 2008"

Producing output that exactly matches the HBase log format takes some experimentation with link:http://download.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html[SimpleDateFormat].
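
As an alternative to the Java classes, the round trip can be written with Ruby's own `Time` class, which also works outside the shell. This is a sketch under assumptions: the helper names and `LOG_FORMAT` constant are made up for illustration, and the log time is interpreted as UTC, whereas `SimpleDateFormat` uses the JVM's default time zone unless told otherwise.

```ruby
require 'time'

# strftime/strptime equivalent of the Java pattern "yy/MM/dd HH:mm:ss".
LOG_FORMAT = '%y/%m/%d %H:%M:%S'

# Log timestamp string (interpreted as UTC) to milliseconds since the epoch.
def log_time_to_millis(text)
  Time.strptime("#{text} +0000", "#{LOG_FORMAT} %z").to_i * 1000
end

# Milliseconds since the epoch back to a log-style string, in UTC.
def millis_to_log_time(millis)
  Time.at(millis / 1000).utc.strftime(LOG_FORMAT)
end

millis = log_time_to_millis('08/08/16 20:56:29')
puts millis
puts millis_to_log_time(millis)
```

Using one format string for both directions keeps the output in the same shape as the log entry it was parsed from.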
=== Query Shell Configuration

hbase(main):001:0> @shell.hbase.configuration.get("hbase.rpc.timeout")

To set a config in the shell:

hbase(main):005:0> @shell.hbase.configuration.setInt("hbase.rpc.timeout", 61010)
hbase(main):006:0> @shell.hbase.configuration.get("hbase.rpc.timeout")
=== Pre-splitting tables with the HBase Shell
You can use a variety of options to pre-split tables when creating them via the HBase Shell `create` command.

The simplest approach is to specify an array of split points when creating the table. Note that string literals given as split points are interpreted by their underlying byte representation. So a split point of '10' actually specifies the byte split point '\x31\x30'.

The split points will define `n+1` regions where `n` is the number of split points. The lowest region will contain all keys from the lowest possible key up to but not including the first split point key.
The next region will contain keys from the first split point up to, but not including, the next split point key.
This will continue for all split points up to the last. The last region will be defined from the last split point up to the maximum possible key.

hbase>create 't1','f',SPLITS => ['10','20','30']

In the above example, the table 't1' will be created with column family 'f', pre-split to four regions. Note the first region will contain all keys from '\x00' up to, but not including, '\x31\x30' (as '\x31' is the ASCII code for '1' and '\x30' is the ASCII code for '0').
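
The byte-wise ordering is easy to get wrong when keys look like numbers, so it can help to check it in plain Ruby first. The sketch below is illustrative only: `hex_bytes` and `region_index` are hypothetical helpers that model the rule described above (a key belongs to the region that starts at the last split point less than or equal to it), and they do not call HBase.

```ruby
# Render a string as the \xNN byte sequence that is actually compared.
def hex_bytes(text)
  text.bytes.map { |byte| format('\\x%02x', byte) }.join
end

# 0-based index of the region that would hold `key`, given sorted split points.
# Strings are compared byte by byte, not numerically.
def region_index(split_points, key)
  split_points.count { |split| split.b <= key.b }
end

splits = %w[10 20 30]
puts hex_bytes('10')
%w[05 1 10 100 2 9].each do |key|
  puts "#{key.inspect} -> region #{region_index(splits, key)}"
end
```

Note that '100' sorts between '10' and '20', and '9' sorts after '30', which is why numeric row keys are usually zero-padded or stored in a fixed-width binary form.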
You can pass the split points in a file using the following variation. In this example, the splits are read from a file at the given path on the local filesystem. Each line in the file specifies a split point key.

hbase>create 't14','f',SPLITS_FILE=>'splits.txt'

The other options are to automatically compute splits based on a desired number of regions and a splitting algorithm.
HBase supplies algorithms for splitting the key range based on uniform splits or based on hexadecimal keys, but you can provide your own splitting algorithm to subdivide the key range.

# create table with four regions based on random bytes keys
hbase>create 't2','f1', { NUMREGIONS => 4 , SPLITALGO => 'UniformSplit' }

# create table with five regions based on hex keys
hbase>create 't3','f1', { NUMREGIONS => 5, SPLITALGO => 'HexStringSplit' }
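
To get a feel for what splitting on hexadecimal keys means, the sketch below computes evenly spaced, fixed-width hex split points in plain Ruby. It is an illustration only and not HBase's `HexStringSplit` implementation: the function name, the 8-digit key width, and the rounding are all assumptions, and the keys HBase generates may differ.

```ruby
# Illustrative only: evenly spaced, fixed-width hexadecimal split keys.
# `digits` is the assumed width of the row keys in hex characters.
def hex_split_points(num_regions, digits: 8)
  key_space = 16**digits
  (1...num_regions).map do |i|
    format("%0#{digits}x", key_space * i / num_regions)
  end
end

puts hex_split_points(5).inspect
```

An array produced this way could be passed to `create` through the `SPLITS` option shown earlier.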
As the HBase Shell is effectively a Ruby environment, you can use simple Ruby scripts to compute splits algorithmically.

# generate splits for long (Ruby fixnum) key range from start to end key
hbase(main):070:0> def gen_splits(start_key,end_key,num_regions)
hbase(main):071:1> results=[]
hbase(main):072:1> range=end_key-start_key
hbase(main):073:1> incr=(range/num_regions).floor
hbase(main):074:1> for i in 1 .. num_regions-1
hbase(main):075:2> results.push([i*incr+start_key].pack("N"))
hbase(main):076:2> end
hbase(main):077:1> return results
hbase(main):078:1> end

hbase(main):080:0> splits=gen_splits(1,2000000,10)
=> ["\000\003\r@", "\000\006\032\177", "\000\t'\276", "\000\f4\375", "\000\017B<", "\000\022O{", "\000\025\\\272", "\000\030i\371", "\000\ew8"]
hbase(main):081:0> create 'test_splits','f',SPLITS=>splits
0 row(s) in 0.2670 seconds

=> Hbase::Table - test_splits
Note that the HBase Shell command `truncate` effectively drops and recreates the table with default options, which discards any pre-splitting.
If you need to truncate a pre-split table, you must drop and recreate the table explicitly to re-specify custom split options.
==== Shell debug switch

You can set a debug switch in the shell to see more output, such as more of the stack trace on an exception, when you run a command:

hbase> debug <RETURN>

To enable DEBUG level logging in the shell, launch it with the `-d` option.

$ ./bin/hbase shell -d
The `count` command returns the number of rows in a table.
It's quite fast when configured with the right CACHE setting:

hbase> count '<tablename>', CACHE => 1000

The above count fetches 1000 rows at a time.
Set CACHE lower if your rows are big.
The default is to fetch one row at a time.
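
A back-of-the-envelope calculation shows why the setting matters. The sketch below uses a deliberately simplified model, stated here as an assumption rather than a description of HBase internals: each fetch returns up to `cache` rows, so the number of fetches is the row count divided by the cache size, rounded up. `fetches_needed` is a hypothetical helper.

```ruby
# Simplified model (an assumption, not HBase internals):
# one fetch returns at most `cache` rows.
def fetches_needed(row_count, cache)
  (row_count + cache - 1) / cache # integer ceiling division
end

puts fetches_needed(1_000_000, 1)
puts fetches_needed(1_000_000, 1000)
```

Under this model, raising CACHE from 1 to 1000 cuts the number of fetches for a million-row table by a factor of a thousand, at the cost of holding more rows in memory per fetch.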