src/main/asciidoc/_chapters/configuration.adoc

   1 ////
   2 /**
   3  *
   4  * Licensed to the Apache Software Foundation (ASF) under one
   5  * or more contributor license agreements.  See the NOTICE file
   6  * distributed with this work for additional information
   7  * regarding copyright ownership.  The ASF licenses this file
   8  * to you under the Apache License, Version 2.0 (the
   9  * "License"); you may not use this file except in compliance
  10  * with the License.  You may obtain a copy of the License at
  11  *
  12  *     http://www.apache.org/licenses/LICENSE-2.0
  13  *
  14  * Unless required by applicable law or agreed to in writing, software
  15  * distributed under the License is distributed on an "AS IS" BASIS,
  16  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  17  * See the License for the specific language governing permissions and
  18  * limitations under the License.
  19  */
  20 ////
  21
  22 [[configuration]]
  23 = Apache HBase Configuration
  24 :doctype: book
  25 :numbered:
  26 :toc: left
  27 :icons: font
  28 :experimental:
  29
  30 This chapter expands upon the <<getting_started>> chapter to further explain configuration of Apache HBase.
  31 Please read this chapter carefully, especially the <<basic.prerequisites,Basic Prerequisites>>
  32 to ensure that your HBase testing and deployment goes smoothly.
  33 Familiarize yourself with <<hbase_supported_tested_definitions>> as well.
  34
  35 == Configuration Files
  36 Apache HBase uses the same configuration system as Apache Hadoop.
  37 All configuration files are located in the _conf/_ directory, which needs to be kept in sync for each node on your cluster.
  38
  39 .HBase Configuration File Descriptions
  40 _backup-masters_::
  41   Not present by default.
  42   A plain-text file which lists hosts on which the Master should start a backup Master process, one host per line.
  43
  44 _hadoop-metrics2-hbase.properties_::
   45   Used to connect HBase to Hadoop's Metrics2 framework.
  46   See the link:https://cwiki.apache.org/confluence/display/HADOOP2/HADOOP-6728-MetricsV2[Hadoop Wiki entry] for more information on Metrics2.
  47   Contains only commented-out examples by default.
  48
  49 _hbase-env.cmd_ and _hbase-env.sh_::
  50   Script for Windows and Linux / Unix environments to set up the working environment for HBase, including the location of Java, Java options, and other environment variables.
  51   The file contains many commented-out examples to provide guidance.
  52
  53 _hbase-policy.xml_::
  54   The default policy configuration file used by RPC servers to make authorization decisions on client requests.
  55   Only used if HBase <<security,security>> is enabled.
  56
  57 _hbase-site.xml_::
  58   The main HBase configuration file.
  59   This file specifies configuration options which override HBase's default configuration.
  60   You can view (but do not edit) the default configuration file at _docs/hbase-default.xml_.
  61   You can also view the entire effective configuration for your cluster (defaults and overrides) in the [label]#HBase Configuration# tab of the HBase Web UI.
  62
  63 _log4j.properties_::
  64   Configuration file for HBase logging via `log4j`.
  65
  66 _regionservers_::
  67   A plain-text file containing a list of hosts which should run a RegionServer in your HBase cluster.
  68   By default this file contains the single entry `localhost`.
  69   It should contain a list of hostnames or IP addresses, one per line, and should only contain `localhost` if each node in your cluster will run a RegionServer on its `localhost` interface.
  70
  71 .Checking XML Validity
  72 [TIP]
  73 ====
  74 When you edit XML, it is a good idea to use an XML-aware editor to be sure that your syntax is correct and your XML is well-formed.
  75 You can also use the `xmllint` utility to check that your XML is well-formed.
  76 By default, `xmllint` re-flows and prints the XML to standard output.
  77 To check for well-formedness and only print output if errors exist, use the command `xmllint -noout filename.xml`.
  78 ====
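
The well-formedness check described in the tip can be applied to every XML file in a configuration directory. The following is a minimal sketch, assuming `xmllint` is installed and on the `PATH`; the function name `check_xml_conf` and the example path are illustrative and not part of HBase.

[source,shell]
----
#!/usr/bin/env bash
# Check every XML file in a configuration directory for well-formedness.
# check_xml_conf is a hypothetical helper name, not an HBase script.
check_xml_conf() {
  local conf_dir="$1" status=0 f
  for f in "$conf_dir"/*.xml; do
    [ -e "$f" ] || continue            # the directory holds no XML files
    # --noout (long spelling of -noout) suppresses the re-flowed document,
    # so only parse errors are printed.
    if ! xmllint --noout "$f"; then
      echo "NOT well-formed: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Example call (the path is an assumption about your install location):
#   check_xml_conf /opt/hbase/conf
----
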
  79 .Keep Configuration In Sync Across the Cluster
  80 [WARNING]
  81 ====
  82 When running in distributed mode, after you make an edit to an HBase configuration, make sure you copy the contents of the _conf/_ directory to all nodes of the cluster.
  83 HBase will not do this for you.
  84 Use `rsync`, `scp`, or another secure mechanism for copying the configuration files to your nodes.
   85 For most configurations, servers must be restarted to pick up changes. Dynamic configuration is the exception and is described later in this chapter.
  86 ====
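
One way to push the _conf/_ directory to every node is a small loop around `rsync`. This is a sketch only: the function name `sync_conf`, the `DRY_RUN` variable, and the example paths are assumptions made for illustration, and the host list is assumed to contain one hostname per line.

[source,shell]
----
#!/usr/bin/env bash
# Copy a configuration directory to every host listed in a file.
# sync_conf and DRY_RUN are hypothetical names, not part of HBase.
sync_conf() {
  local conf_dir="$1" hosts_file="$2" host
  # Read the host list on file descriptor 3 so that rsync/ssh cannot
  # consume the remaining lines from standard input.
  while IFS= read -r host <&3 || [ -n "$host" ]; do
    [ -n "$host" ] || continue         # skip blank lines
    # With DRY_RUN set, print the command instead of running it.
    ${DRY_RUN:+echo} rsync -az "$conf_dir/" "$host:$conf_dir/"
  done 3< "$hosts_file"
}

# Example call (paths are assumptions about your install location):
#   DRY_RUN=1 sync_conf /opt/hbase/conf /opt/hbase/conf/regionservers
----

The _regionservers_ file does not necessarily list the Master or backup Master hosts, so those may need to be synchronized separately.
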
  87
  88 [[basic.prerequisites]]
  89 == Basic Prerequisites
  90
  91 This section lists required services and some required system configuration.
  92
  93 [[java]]
  94 .Java
  95
   96 The following table summarizes the recommendations of the HBase community with respect to deploying on various Java versions.
  97 A icon:check-circle[role="green"] symbol is meant to indicate a base level of testing and willingness to help diagnose and address issues you might run into.
  98 Similarly, an entry of icon:exclamation-circle[role="yellow"] or icon:times-circle[role="red"] generally means that should you run into an issue the community is likely to ask you to change the Java environment before proceeding to help.
  99 In some cases, specific guidance on limitations (e.g. whether compiling / unit tests work, specific operational issues, etc) will also be noted.
 100
 101 .Long Term Support JDKs are recommended
 102 [TIP]
 103 ====
  104 HBase recommends that downstream users rely on JDK releases that are marked as Long-Term Support (LTS), either from the OpenJDK project or from vendors. As of March 2018, that means Java 8 is the only applicable version, and the next version likely to see testing will be Java 11, near Q3 2018.
 105 ====
 106
 107 .Java support by release line
 108 [cols="6*^.^", options="header"]
 109 |===
 110 |HBase Version
 111 |JDK 7
 112 |JDK 8
 113 |JDK 9 (Non-LTS)
 114 |JDK 10 (Non-LTS)
 115 |JDK 11
 116
 117 |2.1+
 118 |icon:times-circle[role="red"]
 119 |icon:check-circle[role="green"]
 120 v|icon:exclamation-circle[role="yellow"]
 121 link:https://issues.apache.org/jira/browse/HBASE-20264[HBASE-20264]
 122 v|icon:exclamation-circle[role="yellow"]
 123 link:https://issues.apache.org/jira/browse/HBASE-20264[HBASE-20264]
 124 v|icon:exclamation-circle[role="yellow"]
 125 link:https://issues.apache.org/jira/browse/HBASE-21110[HBASE-21110]
 126
 127 |1.3+
 128 |icon:check-circle[role="green"]
 129 |icon:check-circle[role="green"]
 130 v|icon:exclamation-circle[role="yellow"]
 131 link:https://issues.apache.org/jira/browse/HBASE-20264[HBASE-20264]
 132 v|icon:exclamation-circle[role="yellow"]
 133 link:https://issues.apache.org/jira/browse/HBASE-20264[HBASE-20264]
 134 v|icon:exclamation-circle[role="yellow"]
 135 link:https://issues.apache.org/jira/browse/HBASE-21110[HBASE-21110]
 136
 137 |===
 138
 139 NOTE: HBase will neither build nor run with Java 6.
 140
 141 NOTE: You must set `JAVA_HOME` on each node of your cluster. _hbase-env.sh_ provides a handy mechanism to do this.
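
As an illustration, `JAVA_HOME` can be set with a single line in _conf/hbase-env.sh_. The path below is an example only and depends on where the JDK is installed on your nodes.

[source,shell]
----
# conf/hbase-env.sh
# Example path only; substitute the root of the JDK installed on each node.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
----
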
 142
 143 [[os]]
 144 .Operating System Utilities
 145 ssh::
 146   HBase uses the Secure Shell (ssh) command and utilities extensively to communicate between cluster nodes. Each server in the cluster must be running `ssh` so that the Hadoop and HBase daemons can be managed. You must be able to connect to all nodes via SSH, including the local node, from the Master as well as any backup Master, using a shared key rather than a password. You can see the basic methodology for such a set-up in Linux or Unix systems at "<<passwordless.ssh.quickstart>>". If your cluster nodes use OS X, see the section, link:https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120730246#RunningHadoopOnOSX10.564-bit(Single-NodeCluster)-SSH:SettingupRemoteDesktopandEnablingSelf-Login[SSH: Setting up Remote Desktop and Enabling Self-Login] on the Hadoop wiki.
 147
 148 DNS::
 149   HBase uses the local hostname to self-report its IP address.
 150
 151 NTP::
 152   The clocks on cluster nodes should be synchronized. A small amount of variation is acceptable, but larger amounts of skew can cause erratic and unexpected behavior. Time synchronization is one of the first things to check if you see unexplained problems in your cluster. It is recommended that you run a Network Time Protocol (NTP) service, or another time-synchronization mechanism on your cluster and that all nodes look to the same service for time synchronization. See the link:http://www.tldp.org/LDP/sag/html/basic-ntp-config.html[Basic NTP Configuration] at [citetitle]_The Linux Documentation Project (TLDP)_ to set up NTP.
 153
 154 [[ulimit]]
 155 Limits on Number of Files and Processes (ulimit)::
 156   Apache HBase is a database. It requires the ability to open a large number of files at once. Many Linux distributions limit the number of files a single user is allowed to open to `1024` (or `256` on older versions of OS X). You can check this limit on your servers by running the command `ulimit -n` when logged in as the user which runs HBase. See <<trouble.rs.runtime.filehandles,the Troubleshooting section>> for some of the problems you may experience if the limit is too low. You may also notice errors such as the following:
 157 +
 158 ----
  159 2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
 160 2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901
 161 ----
 162 +
 163 It is recommended to raise the ulimit to at least 10,000, but more likely 10,240, because the value is usually expressed in multiples of 1024. Each ColumnFamily has at least one StoreFile, and possibly more than six StoreFiles if the region is under load. The number of open files required depends upon the number of ColumnFamilies and the number of regions. The following is a rough formula for calculating the potential number of open files on a RegionServer.
 164 +
 165 .Calculate the Potential Number of Open Files
 166 ----
  167 (ColumnFamilies per region) x (StoreFiles per ColumnFamily) x (regions per RegionServer)
 168 ----
 169 +
 170 For example, assuming that a schema had 3 ColumnFamilies per region with an average of 3 StoreFiles per ColumnFamily, and there are 100 regions per RegionServer, the JVM will open `3 * 3 * 100 = 900` file descriptors, not counting open JAR files, configuration files, and others. Opening a file does not take many resources, and the risk of allowing a user to open too many files is minimal.
 171 +
  172 Another related setting is the number of processes a user is allowed to run at once. In Linux and Unix, the number of processes is set using the `ulimit -u` command. This should not be confused with the `nproc` command, which reports the number of processing units available to the current process. Under load, a `ulimit -u` that is too low can cause OutOfMemoryError exceptions.
 173 +
 174 Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user's ulimit configuration, look at the first line of the HBase log for that instance.
 175 +
 176 .`ulimit` Settings on Ubuntu
 177 ====
 178 To configure ulimit settings on Ubuntu, edit _/etc/security/limits.conf_, which is a space-delimited file with four columns. Refer to the man page for _limits.conf_ for details about the format of this file. In the following example, the first line sets both soft and hard limits for the number of open files (nofile) to 32768 for the operating system user with the username hadoop. The second line sets the number of processes to 32000 for the same user.
 179 ----
 180 hadoop  -       nofile  32768
 181 hadoop  -       nproc   32000
 182 ----
 183 The settings are only applied if the Pluggable Authentication Module (PAM) environment is directed to use them. To configure PAM to use these limits, be sure that the _/etc/pam.d/common-session_ file contains the following line:
 184 ----
 185 session required  pam_limits.so
 186 ----
 187 ====
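+
Following the worked example above (3 ColumnFamilies, 3 StoreFiles each, 100 regions), the estimate can be computed and compared with the current limit. This is a minimal sketch; the function name `estimate_open_files` and the sample numbers passed to it are illustrative.
+
[source,shell]
----
#!/usr/bin/env bash
# Rough estimate of the StoreFile descriptors a RegionServer may hold open.
# Arguments: ColumnFamilies per region, StoreFiles per ColumnFamily,
#            regions per RegionServer.
# estimate_open_files is a hypothetical helper name, not an HBase tool.
estimate_open_files() {
  echo $(( $1 * $2 * $3 ))
}

needed=$(estimate_open_files 3 3 100)
limit=$(ulimit -n)                     # soft limit for the current user
echo "estimated StoreFile descriptors: $needed, current ulimit -n: $limit"
----
+
Run the comparison as the user that runs HBase, since limits are applied per user. The estimate excludes JAR files, configuration files, and sockets.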
 188
 189 Linux Shell::
 190   All of the shell scripts that come with HBase rely on the link:http://www.gnu.org/software/bash[GNU Bash] shell.
 191
 192 Windows::
 193   Running production systems on Windows machines is not recommended.
 194
 195
 196 [[hadoop]]
 197 === link:https://hadoop.apache.org[Hadoop](((Hadoop)))
 198
 199 The following table summarizes the versions of Hadoop supported with each version of HBase. Older versions not appearing in this table are considered unsupported and likely missing necessary features, while newer versions are untested but may be suitable.
 200
 201 Based on the version of HBase, you should select the most appropriate version of Hadoop.
 202 You can use Apache Hadoop, or a vendor's distribution of Hadoop.
 203 No distinction is made here.
 204 See link:https://cwiki.apache.org/confluence/display/HADOOP2/Distributions+and+Commercial+Support[the Hadoop wiki] for information about vendors of Hadoop.
 205
 206 .Hadoop 2.x is recommended.
 207 [TIP]
 208 ====
 209 Hadoop 2.x is faster and includes features, such as short-circuit reads (see <<perf.hdfs.configs.localread>>),
 210 which will help improve your HBase random read profile.
 211 Hadoop 2.x also includes important bug fixes that will improve your overall HBase experience. HBase does not support running with
 212 earlier versions of Hadoop. See the table below for requirements specific to different HBase versions.
 213
 214 Hadoop 3.x is still in early access releases and has not yet been sufficiently tested by the HBase community for production use cases.
 215 ====
 216
 217 Use the following legend to interpret this table:
 218
 219 .Hadoop version support matrix
 220
 221 * icon:check-circle[role="green"] = Tested to be fully-functional
 222 * icon:times-circle[role="red"] = Known to not be fully-functional, or there are link:https://hadoop.apache.org/cve_list.html[CVEs] so we drop the support in newer minor releases
 223 * icon:exclamation-circle[role="yellow"] = Not tested, may/may-not function
 224
 225 [cols="1,6*^.^", options="header"]
 226 |===
 227 | | HBase-1.3.x | HBase-1.4.x | HBase-1.5.x | HBase-2.1.x | HBase-2.2.x | HBase-2.3.x
 228 |Hadoop-2.4.x | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 229 |Hadoop-2.5.x | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 230 |Hadoop-2.6.0 | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 231 |Hadoop-2.6.1+ | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 232 |Hadoop-2.7.0 | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 233 |Hadoop-2.7.1+ | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 234 |Hadoop-2.8.[0-2] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 235 |Hadoop-2.8.[3-4] | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 236 |Hadoop-2.8.5+ | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:times-circle[role="red"]
 237 |Hadoop-2.9.[0-1] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 238 |Hadoop-2.9.2+ | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | icon:times-circle[role="red"]
 239 |Hadoop-2.10.0 | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"] | icon:exclamation-circle[role="yellow"] | icon:exclamation-circle[role="yellow"] | icon:check-circle[role="green"]
 240 |Hadoop-3.0.[0-2] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 241 |Hadoop-3.0.3+ | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 242 |Hadoop-3.1.0 | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"]
 243 |Hadoop-3.1.1+ | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:times-circle[role="red"] | icon:check-circle[role="green"] | icon:check-circle[role="green"] | icon:check-circle[role="green"]
 244 |===
 245
 246 .Hadoop Pre-2.6.1 and JDK 1.8 Kerberos
 247 [TIP]
 248 ====
  249 When using pre-2.6.1 Hadoop versions and JDK 1.8 in a Kerberos environment, the HBase server can fail
  250 and abort due to a Kerberos keytab relogin error. A late version of JDK 1.7 (1.7.0_80) has the problem too.
 251 Refer to link:https://issues.apache.org/jira/browse/HADOOP-10786[HADOOP-10786] for additional details.
 252 Consider upgrading to Hadoop 2.6.1+ in this case.
 253 ====
 254
 255 .Hadoop 2.6.x
 256 [TIP]
 257 ====
 258 Hadoop distributions based on the 2.6.x line *must* have
 259 link:https://issues.apache.org/jira/browse/HADOOP-11710[HADOOP-11710] applied if you plan to run
 260 HBase on top of an HDFS Encryption Zone. Failure to do so will result in cluster failure and
 261 data loss. This patch is present in Apache Hadoop releases 2.6.1+.
 262 ====
 263
 264 .Hadoop 2.y.0 Releases
 265 [TIP]
 266 ====
 267 Starting around the time of Hadoop version 2.7.0, the Hadoop PMC got into the habit of calling out new minor releases on their major version 2 release line as not stable / production ready. As such, HBase expressly advises downstream users to avoid running on top of these releases. Note that additionally the 2.8.1 release was given the same caveat by the Hadoop PMC. For reference, see the release announcements for link:https://s.apache.org/hadoop-2.7.0-announcement[Apache Hadoop 2.7.0], link:https://s.apache.org/hadoop-2.8.0-announcement[Apache Hadoop 2.8.0], link:https://s.apache.org/hadoop-2.8.1-announcement[Apache Hadoop 2.8.1], and link:https://s.apache.org/hadoop-2.9.0-announcement[Apache Hadoop 2.9.0].
 268 ====
 269
 270 .Hadoop 3.0.x Releases
 271 [TIP]
 272 ====
 273 Hadoop distributions that include the Application Timeline Service feature may cause unexpected versions of HBase classes to be present in the application classpath. Users planning on running MapReduce applications with HBase should make sure that link:https://issues.apache.org/jira/browse/YARN-7190[YARN-7190] is present in their YARN service (currently fixed in 2.9.1+ and 3.1.0+).
 274 ====
 275
 276 .Hadoop 3.1.0 Release
 277 [TIP]
 278 ====
 279 The Hadoop PMC called out the 3.1.0 release as not stable / production ready. As such, HBase expressly advises downstream users to avoid running on top of this release. For reference, see the link:https://s.apache.org/hadoop-3.1.0-announcement[release announcement for Hadoop 3.1.0].
 280 ====
 281
 282 .Replace the Hadoop Bundled With HBase!
 283 [NOTE]
 284 ====
 285 Because HBase depends on Hadoop, it bundles Hadoop jars under its _lib_ directory.
 286 The bundled jars are ONLY for use in standalone mode.
  287 In distributed mode, it is _critical_ that the version of Hadoop running on your cluster matches the version bundled under HBase.
  288 Replace the Hadoop jars found in the HBase _lib_ directory with the equivalent Hadoop jars from the version you are running
 289 on your cluster to avoid version mismatch issues.
 290 Make sure you replace the jars under HBase across your whole cluster.
 291 Hadoop version mismatch issues have various manifestations. Check for mismatch if
 292 HBase appears hung.
 293 ====
 294
 295 [[dfs.datanode.max.transfer.threads]]
 296 ==== `dfs.datanode.max.transfer.threads` (((dfs.datanode.max.transfer.threads)))
 297
 298 An HDFS DataNode has an upper bound on the number of files that it will serve at any one time.
 299 Before doing any loading, make sure you have configured Hadoop's _conf/hdfs-site.xml_, setting the `dfs.datanode.max.transfer.threads` value to at least the following:
 300
 301 [source,xml]
 302 ----
 303
 304 <property>
 305   <name>dfs.datanode.max.transfer.threads</name>
 306   <value>4096</value>
 307 </property>
 308 ----
 309
 310 Be sure to restart your HDFS after making the above configuration.
 311
 312 Not having this configuration in place makes for strange-looking failures.
 313 One manifestation is a complaint about missing blocks.
 314 For example:
 315
 316 ----
 317 10/12/08 20:10:31 INFO hdfs.DFSClient: Could not obtain block
 318           blk_XXXXXXXXXXXXXXXXXXXXXX_YYYYYYYY from any node: java.io.IOException: No live nodes
 319           contain current block. Will get new block locations from namenode and retry...
 320 ----
 321
 322 See also <<casestudies.max.transfer.threads,casestudies.max.transfer.threads>> and note that this property was previously known as `dfs.datanode.max.xcievers` (e.g. link:http://ccgtech.blogspot.com/2010/02/hadoop-hdfs-deceived-by-xciever.html[Hadoop HDFS: Deceived by Xciever]).
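
To confirm the value that the Hadoop configuration on a node resolves for this property, `hdfs getconf -confKey` can be combined with a small check. This is a sketch under assumptions: `check_transfer_threads` is a hypothetical helper, and `hdfs getconf` reads the local configuration files, so it does not prove that a running DataNode has been restarted with the new value.

[source,shell]
----
#!/usr/bin/env bash
# Compare a configured value against the minimum recommended above (4096).
# check_transfer_threads is a hypothetical helper name.
check_transfer_threads() {
  local value="$1" minimum=4096
  if [ "$value" -ge "$minimum" ] 2>/dev/null; then
    echo "ok: dfs.datanode.max.transfer.threads=$value"
  else
    echo "too low or unreadable: '$value' (expected at least $minimum)" >&2
    return 1
  fi
}

# On a node with the Hadoop client installed:
#   check_transfer_threads "$(hdfs getconf -confKey dfs.datanode.max.transfer.threads)"
----
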
 323
 324 [[zookeeper.requirements]]
 325 === ZooKeeper Requirements
 326
 327 ZooKeeper 3.4.x is required.
 328
 329 [[standalone_dist]]
 330 == HBase run modes: Standalone and Distributed
 331
 332 HBase has two run modes: <<standalone,standalone>> and <<distributed,distributed>>.
 333 Out of the box, HBase runs in standalone mode.
 334 Whatever your mode, you will need to configure HBase by editing files in the HBase _conf_ directory.
  335 At a minimum, you must edit _conf/hbase-env.sh_ to tell HBase which `java` to use.
  336 In this file you set HBase environment variables such as the heapsize and other options for the `JVM`, the preferred location for log files, etc.
  337 Set `JAVA_HOME` to point at the root of your `java` install.
 338
 339 [[standalone]]
 340 === Standalone HBase
 341
 342 This is the default mode.
 343 Standalone mode is what is described in the <<quickstart,quickstart>> section.
  344 In standalone mode, HBase does not use HDFS; it uses the local filesystem instead, and it runs all HBase daemons and a local ZooKeeper in the same JVM.
 345 ZooKeeper binds to a well known port so clients may talk to HBase.
 346
 347 [[standalone.over.hdfs]]
 348 ==== Standalone HBase over HDFS
  349 A sometimes useful variation on standalone HBase has all daemons running inside
  350 one JVM but, rather than persisting to the local filesystem,
  351 they persist to an HDFS instance.
 352
  353 You might consider this profile when you want
  354 a simple deployment and the load is light, but the
  355 data must persist across node comings and goings. Writing to
  356 HDFS, where data is replicated, ensures the latter.
 357
  358 To configure this standalone variant, edit your _hbase-site.xml_,
  359 setting `hbase.rootdir` to point at a directory in your
  360 HDFS instance but then set `hbase.cluster.distributed`
  361 to `false`. For example:
 362
 363 [source,xml]
 364 ----
 365 <configuration>
 366   <property>
 367     <name>hbase.rootdir</name>
 368     <value>hdfs://namenode.example.org:8020/hbase</value>
 369   </property>
 370   <property>
 371     <name>hbase.cluster.distributed</name>
 372     <value>false</value>
 373   </property>
 374 </configuration>
 375 ----
 376
 377 [[distributed]]
 378 === Distributed
 379
  380 Distributed mode can be subdivided into _pseudo-distributed_ mode, where all daemons run on a single node, and _fully-distributed_ mode, where the daemons are spread across all nodes in the cluster.
 381 The _pseudo-distributed_ vs. _fully-distributed_ nomenclature comes from Hadoop.
 382
 383 Pseudo-distributed mode can run against the local filesystem or it can run against an instance of the _Hadoop Distributed File System_ (HDFS). Fully-distributed mode can ONLY run on HDFS.
 384 See the Hadoop link:https://hadoop.apache.org/docs/current/[documentation] for how to set up HDFS.
 385 A good walk-through for setting up HDFS on Hadoop 2 can be found at http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide.
 386
 387 [[pseudo]]
 388 ==== Pseudo-distributed
 389
 390 .Pseudo-Distributed Quickstart
 391 [NOTE]
 392 ====
 393 A quickstart has been added to the <<quickstart,quickstart>> chapter.
 394 See <<quickstart_pseudo,quickstart-pseudo>>.
 395 Some of the information that was originally in this section has been moved there.
 396 ====
 397
 398 A pseudo-distributed mode is simply a fully-distributed mode run on a single host.
 399 Use this HBase configuration for testing and prototyping purposes only.
 400 Do not use this configuration for production or for performance evaluation.
 401
 402 [[fully_dist]]
 403 === Fully-distributed
 404
 405 By default, HBase runs in standalone mode.
 406 Both standalone mode and pseudo-distributed mode are provided for the purposes of small-scale testing.
 407 For a production environment, distributed mode is advised.
 408 In distributed mode, multiple instances of HBase daemons run on multiple servers in the cluster.
 409
 410 Just as in pseudo-distributed mode, a fully distributed configuration requires that you set the `hbase.cluster.distributed` property to `true`.
 411 Typically, the `hbase.rootdir` is configured to point to a highly-available HDFS filesystem.
 412
 413 In addition, the cluster is configured so that multiple cluster nodes enlist as RegionServers, ZooKeeper QuorumPeers, and backup HMaster servers.
 414 These configuration basics are all demonstrated in <<quickstart_fully_distributed,quickstart-fully-distributed>>.
 415
 416 .Distributed RegionServers
 417 Typically, your cluster will contain multiple RegionServers all running on different servers, as well as primary and backup Master and ZooKeeper daemons.
 418 The _conf/regionservers_ file on the master server contains a list of hosts whose RegionServers are associated with this cluster.
 419 Each host is on a separate line.
 420 All hosts listed in this file will have their RegionServer processes started and stopped when the master server starts or stops.
 421
 422 .ZooKeeper and HBase
 423 See the <<zookeeper,ZooKeeper>> section for ZooKeeper setup instructions for HBase.
 424
 425 .Example Distributed HBase Cluster
 426 ====
 427 This is a bare-bones _conf/hbase-site.xml_ for a distributed HBase cluster.
 428 A cluster that is used for real-world work would contain more custom configuration parameters.
 429 Most HBase configuration directives have default values, which are used unless the value is overridden in the _hbase-site.xml_.
 430 See "<<config.files,Configuration Files>>" for more information.
 431
 432 [source,xml]
 433 ----
 434
 435 <configuration>
 436   <property>
 437     <name>hbase.rootdir</name>
 438     <value>hdfs://namenode.example.org:8020/hbase</value>
 439   </property>
 440   <property>
 441     <name>hbase.cluster.distributed</name>
 442     <value>true</value>
 443   </property>
 444   <property>
 445     <name>hbase.zookeeper.quorum</name>
 446     <value>node-a.example.com,node-b.example.com,node-c.example.com</value>
 447   </property>
 448 </configuration>
 449 ----
 450
 451 This is an example _conf/regionservers_ file, which contains a list of nodes that should run a RegionServer in the cluster.
  452 These nodes need HBase installed and they need to use the same contents of the _conf/_ directory as the Master server.
 453
 454 [source]
 455 ----
 456
 457 node-a.example.com
 458 node-b.example.com
 459 node-c.example.com
 460 ----
 461
 462 This is an example _conf/backup-masters_ file, which contains a list of each node that should run a backup Master instance.
 463 The backup Master instances will sit idle unless the main Master becomes unavailable.
 464
 465 [source]
 466 ----
 467
 468 node-b.example.com
 469 node-c.example.com
 470 ----
 471 ====
 472
 473 .Distributed HBase Quickstart
 474 See <<quickstart_fully_distributed,quickstart-fully-distributed>> for a walk-through of a simple three-node cluster configuration with multiple ZooKeeper, backup HMaster, and RegionServer instances.
 475
 476 .Procedure: HDFS Client Configuration
 477 . Of note, if you have made HDFS client configuration changes on your Hadoop cluster, such as configuration directives for HDFS clients, as opposed to server-side configurations, you must use one of the following methods to enable HBase to see and use these configuration changes:
 478 +
 479 a. Add a pointer to your `HADOOP_CONF_DIR` to the `HBASE_CLASSPATH` environment variable in _hbase-env.sh_.
 480 b. Add a copy of _hdfs-site.xml_ (or _hadoop-site.xml_) or, better, symlinks, under _${HBASE_HOME}/conf_, or
 481 c. if only a small set of HDFS client configurations, add them to _hbase-site.xml_.
 482
 483
 484 An example of such an HDFS client configuration is `dfs.replication`.
  485 If, for example, you want to run with a replication factor of 5, HBase will create files with the default of 3 unless you do the above to make the configuration available to HBase.
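
The first two methods listed in the procedure might look like the following in practice. This is a sketch; _/etc/hadoop/conf_ is an example location and should be replaced with the directory that holds your _hdfs-site.xml_.

[source,shell]
----
# conf/hbase-env.sh
# Method (a): put the Hadoop client configuration directory on the HBase classpath.
# /etc/hadoop/conf is an example location, not a required path.
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HBASE_CLASSPATH="${HBASE_CLASSPATH:+${HBASE_CLASSPATH}:}${HADOOP_CONF_DIR}"

# Method (b), run once on each node instead of (a):
# symlink the file into the HBase conf directory.
# ln -s "${HADOOP_CONF_DIR}/hdfs-site.xml" "${HBASE_HOME}/conf/hdfs-site.xml"
----
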
 486
 487 [[confirm]]
 488 == Running and Confirming Your Installation
 489
 490 Make sure HDFS is running first.
  491 Start the Hadoop HDFS daemons by running _sbin/start-dfs.sh_ in the `HADOOP_HOME` directory (_sbin/stop-dfs.sh_ stops them).
 492 You can ensure it started properly by testing the `put` and `get` of files into the Hadoop filesystem.
 493 HBase does not normally use the MapReduce or YARN daemons. These do not need to be started.
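
The `put` and `get` test mentioned above can be scripted with `hdfs dfs`. The following is a minimal sketch: the function name `hdfs_smoke_test`, the test file name, and the default HDFS directory _/tmp_ are assumptions, and the user running it needs write access to that directory.

[source,shell]
----
#!/usr/bin/env bash
# Write a small file to HDFS, read it back, and compare the two copies.
# hdfs_smoke_test is a hypothetical helper name; /tmp is an assumed HDFS path.
hdfs_smoke_test() {
  local remote_dir="${1:-/tmp}" work rc=0
  work=$(mktemp -d) || return 1
  local remote_file="$remote_dir/hbase-smoke-test-$$.txt"
  echo "hello hdfs" > "$work/in.txt"
  # put, get, and compare; any failing step marks the test as failed.
  hdfs dfs -put "$work/in.txt" "$remote_file" &&
    hdfs dfs -get "$remote_file" "$work/out.txt" &&
    cmp -s "$work/in.txt" "$work/out.txt" || rc=1
  # Best-effort cleanup of the remote and local files.
  hdfs dfs -rm -f "$remote_file" >/dev/null 2>&1 || true
  rm -rf "$work"
  return "$rc"
}

# Example call:
#   hdfs_smoke_test /tmp && echo "HDFS put/get works"
----
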
 494
  495 _If_ you are managing your own ZooKeeper, start it and confirm that it is running; otherwise, HBase will start up ZooKeeper for you as part of its start process.
 496
 497 Start HBase with the following command:
 498
 499 ----
 500 bin/start-hbase.sh
 501 ----
 502
 503 Run the above from the `HBASE_HOME` directory.
 504
 505 You should now have a running HBase instance.
 506 HBase logs can be found in the _logs_ subdirectory.
 507 Check them out especially if HBase had trouble starting.
 508
 509 HBase also puts up a UI listing vital attributes.
 510 By default it's deployed on the Master host at port 16010 (HBase RegionServers listen on port 16020 by default and put up an informational HTTP server at port 16030). If the Master is running on a host named `master.example.org` on the default port, point your browser at pass:[http://master.example.org:16010] to see the web interface.
 511
 512 Once HBase has started, see the <<shell_exercises,shell exercises>> section for how to create tables, add data, scan your insertions, and finally disable and drop your tables.
 513
To stop HBase after exiting the HBase shell, enter:
 515
 516 ----
 517 $ ./bin/stop-hbase.sh
 518 stopping hbase...............
 519 ----
 520
 521 Shutdown can take a moment to complete.
 522 It can take longer if your cluster is comprised of many machines.
 523 If you are running a distributed operation, be sure to wait until HBase has shut down completely before stopping the Hadoop daemons.
 524
 525 [[config.files]]
 526 == Default Configuration
 527
 528 [[hbase.site]]
 529 === _hbase-site.xml_ and _hbase-default.xml_
 530
Just as in Hadoop, where you add site-specific HDFS configuration to the _hdfs-site.xml_ file, site-specific HBase customizations go into the file _conf/hbase-site.xml_.
 532 For the list of configurable properties, see <<hbase_default_configurations,hbase default configurations>> below or view the raw _hbase-default.xml_ source file in the HBase source code at _src/main/resources_.
 533
Not all configuration options make it out to _hbase-default.xml_.
Some configurations appear only in source code; the only way to identify them is through code review.
 536
 537 Currently, changes here will require a cluster restart for HBase to notice the change.
 538 // hbase/src/main/asciidoc
 539 //
 540 include::{docdir}/../../../target/asciidoc/hbase-default.adoc[]
 541
 542
 543 [[hbase.env.sh]]
 544 === _hbase-env.sh_
 545
 546 Set HBase environment variables in this file.
Examples include options to pass to the JVM when an HBase daemon starts, such as heap size and garbage collector configs.
You can also set the locations of the HBase configuration and log directories, niceness, ssh options, where to locate process pid files, etc.
 549 Open the file at _conf/hbase-env.sh_ and peruse its content.
 550 Each option is fairly well documented.
 551 Add your own environment variables here if you want them read by HBase daemons on startup.
 552
 553 Changes here will require a cluster restart for HBase to notice the change.
 554
 555 [[log4j]]
 556 === _log4j.properties_
 557
Edit this file to change the rate at which HBase log files are rolled and to change the level at which HBase logs messages.

Changes here will require a cluster restart for HBase to notice the change, though log levels can be changed for particular daemons via the HBase UI.
 561
 562 [[client_dependencies]]
 563 === Client configuration and dependencies connecting to an HBase cluster
 564
If you are running HBase in standalone mode, you don't need to configure anything for your client to work, provided that the client and HBase are on the same machine.
 566
Starting with release 3.0.0, the default connection registry has been switched to a master-based implementation. Refer to <<client.masterregistry>> for more details about
what a connection registry is and the implications of this change. Depending on your HBase version, the following is the expected minimal client configuration.
 569
 570 ==== Up until 2.x.y releases
In 2.x.y releases, the default connection registry was based on ZooKeeper as the source of truth. This means that clients always looked up ZooKeeper znodes to fetch
the required metadata. For example, if the active master crashed and a new master was elected, clients looked up the master znode to fetch
the active master address (and similarly for meta locations). This meant that clients needed access to ZooKeeper and needed to know
the ZooKeeper ensemble information before they could do anything. This can be configured in the client configuration XML as follows:
 575
 576 [source,xml]
 577 ----
 578 <?xml version="1.0"?>
 579 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 580 <configuration>
 581   <property>
 582     <name>hbase.zookeeper.quorum</name>
 583     <value>example1,example2,example3</value>
 584     <description> Zookeeper ensemble information</description>
 585   </property>
 586 </configuration>
 587 ----
 588
 589 ==== Starting 3.0.0 release
 590
The default implementation was switched to a master-based connection registry. With this implementation, clients always contact the active or
stand-by master RPC end points to fetch the connection registry information. This means that clients should have access to the list of active and stand-by master
end points before they can do anything. This can be configured in the client configuration XML as follows:
 594
 595 [source,xml]
 596 ----
 597 <?xml version="1.0"?>
 598 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 599 <configuration>
 600   <property>
 601     <name>hbase.masters</name>
 602     <value>example1,example2,example3</value>
 603     <description>List of master rpc end points for the hbase cluster.</description>
 604   </property>
 605 </configuration>
 606 ----
 607
The configuration value for _hbase.masters_ is a comma-separated list of _host:port_ values. If no port value is specified, the default of _16000_ is assumed.
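
For illustration, a value that spells out the port for each end point might look like the following. The host names are the placeholders used above, and 16000 is simply the default port written out explicitly:

[source,xml]
----
<property>
  <name>hbase.masters</name>
  <value>example1:16000,example2:16000,example3:16000</value>
</property>
----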
 609
Usually this configuration is kept in _hbase-site.xml_ and is picked up by the client from the `CLASSPATH`.
 611
 612 If you are configuring an IDE to run an HBase client, you should include the _conf/_ directory on your classpath so _hbase-site.xml_ settings can be found (or add _src/test/resources_ to pick up the hbase-site.xml used by tests).
 613
For Java applications using Maven, the hbase-shaded-client module is the recommended dependency when connecting to a cluster:

 615 [source,xml]
 616 ----
 617 <dependency>
 618   <groupId>org.apache.hbase</groupId>
 619   <artifactId>hbase-shaded-client</artifactId>
 620   <version>2.0.0</version>
 621 </dependency>
 622 ----
 623
 624 [[java.client.config]]
 625 ==== Java client configuration
 626
 627 The configuration used by a Java client is kept in an link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HBaseConfiguration[HBaseConfiguration] instance.
 628
When invoked, the factory method `HBaseConfiguration.create()` reads in the content of the first _hbase-site.xml_ found on the client's `CLASSPATH`, if one is present. (Invocation also factors in any _hbase-default.xml_ found; an _hbase-default.xml_ ships inside the _hbase.X.X.X.jar_.) It is also possible to specify configuration directly without having to read from an _hbase-site.xml_.
For example, to set the ZooKeeper ensemble or the master end points for the cluster programmatically, do as follows:
 631
 632 [source,java]
 633 ----
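import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Loads hbase-default.xml and any hbase-site.xml found on the CLASSPATH.
Configuration config = HBaseConfiguration.create();
config.set("hbase.zookeeper.quorum", "localhost");  // Until 2.x.y versions
// ---- or ----
config.set("hbase.masters", "localhost:1234"); // Starting 3.0.0 version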
 638 ----
 639
 640 [[config_timeouts]]
 641 === Timeout settings
 642
 643 HBase provides a wide variety of timeout settings to limit the execution time of various remote operations.
 644
 645 * hbase.rpc.timeout
 646 * hbase.rpc.read.timeout
 647 * hbase.rpc.write.timeout
 648 * hbase.client.operation.timeout
 649 * hbase.client.meta.operation.timeout
 650 * hbase.client.scanner.timeout.period
 651
 652 The `hbase.rpc.timeout` property limits how long a single RPC call can run before timing out.
To fine-tune read-related or write-related RPC timeouts, set the `hbase.rpc.read.timeout` and `hbase.rpc.write.timeout` configuration properties.
In the absence of these properties, `hbase.rpc.timeout` is used.

A higher-level timeout is `hbase.client.operation.timeout`, which applies to each client call.
When an RPC call fails, for instance because of a timeout due to `hbase.rpc.timeout`, it is retried until `hbase.client.operation.timeout` is reached.
The client operation timeout for system tables can be fine-tuned by setting the `hbase.client.meta.operation.timeout` configuration value.
When this is not set, `hbase.client.operation.timeout` is used instead.

Timeouts for scan operations are controlled differently. Use the `hbase.client.scanner.timeout.period` property to set this timeout.
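
The following sketch shows how these properties might be combined in a client-side _hbase-site.xml_. The values (in milliseconds) are illustrative assumptions, not recommendations; the only relationship they are meant to show is a per-RPC timeout that is shorter than the overall operation timeout, so that a failed RPC can be retried before the operation gives up.

[source,xml]
----
<property>
  <name>hbase.rpc.timeout</name>
  <value>30000</value>
  <description>Illustrative: a single RPC may run for up to 30 seconds.</description>
</property>
<property>
  <name>hbase.client.operation.timeout</name>
  <value>120000</value>
  <description>Illustrative: a client call, including retries, may take up to 2 minutes.</description>
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>60000</value>
  <description>Illustrative: timeout applied to scan operations.</description>
</property>
----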
 662
 663 [[example_config]]
 664 == Example Configurations
 665
 666 === Basic Distributed HBase Install
 667
Here is a basic configuration example for a distributed ten node cluster:

* The nodes are named `example0`, `example1`, etc., through node `example9` in this example.
 670 * The HBase Master and the HDFS NameNode are running on the node `example0`.
 671 * RegionServers run on nodes `example1`-`example9`.
 672 * A 3-node ZooKeeper ensemble runs on `example1`, `example2`, and `example3` on the default ports.
 673 * ZooKeeper data is persisted to the directory _/export/zookeeper_.
 674
 675 Below we show what the main configuration files -- _hbase-site.xml_, _regionservers_, and _hbase-env.sh_ -- found in the HBase _conf_ directory might look like.
 676
 677 [[hbase_site]]
 678 ==== _hbase-site.xml_
 679
 680 [source,xml]
 681 ----
 682 <?xml version="1.0"?>
 683 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 684 <configuration>
 685   <property>
 686     <name>hbase.zookeeper.quorum</name>
 687     <value>example1,example2,example3</value>
    <description>Comma-separated list of servers in the ZooKeeper ensemble.
    </description>
 690   </property>
 691   <property>
 692     <name>hbase.zookeeper.property.dataDir</name>
 693     <value>/export/zookeeper</value>
 694     <description>Property from ZooKeeper config zoo.cfg.
 695     The directory where the snapshot is stored.
 696     </description>
 697   </property>
 698   <property>
 699     <name>hbase.rootdir</name>
 700     <value>hdfs://example0:8020/hbase</value>
 701     <description>The directory shared by RegionServers.
 702     </description>
 703   </property>
 704   <property>
 705     <name>hbase.cluster.distributed</name>
 706     <value>true</value>
 707     <description>The mode the cluster will be in. Possible values are
 708       false: standalone and pseudo-distributed setups with managed ZooKeeper
 709       true: fully-distributed with unmanaged ZooKeeper Quorum (see hbase-env.sh)
 710     </description>
 711   </property>
 712 </configuration>
 713 ----
 714
 715 [[regionservers]]
 716 ==== _regionservers_
 717
 718 In this file you list the nodes that will run RegionServers.
 719 In our case, these nodes are `example1`-`example9`.
 720
 721 [source]
 722 ----
 723 example1
 724 example2
 725 example3
 726 example4
 727 example5
 728 example6
 729 example7
 730 example8
 731 example9
 732 ----
 733
 734 [[hbase_env]]
 735 ==== _hbase-env.sh_
 736
The following lines in the _hbase-env.sh_ file show how to set the `JAVA_HOME` environment variable (required for HBase) and set the heap to 4 GB (by default, the heap size is left to the JVM's default). If you copy and paste this example, be sure to adjust the `JAVA_HOME` to suit your environment.
 738
 739 ----
 740 # The java implementation to use.
 741 export JAVA_HOME=/usr/java/jdk1.8.0/
 742
 743 # The maximum amount of heap to use. Default is left to JVM default.
 744 export HBASE_HEAPSIZE=4G
 745 ----
 746
 747 Use +rsync+ to copy the content of the _conf_ directory to all nodes of the cluster.
 748
 749 [[important_configurations]]
 750 == The Important Configurations
 751
 752 Below we list some _important_ configurations.
 753 We've divided this section into required configuration and worth-a-look recommended configs.
 754
 755 [[required_configuration]]
 756 === Required Configurations
 757
 758 Review the <<os,os>> and <<hadoop,hadoop>> sections.
 759
 760 [[big.cluster.config]]
 761 ==== Big Cluster Configurations
 762
If you have a cluster with a lot of regions, it is possible that a RegionServer checks in briefly after the Master starts while all the remaining RegionServers lag behind. This first server to check in will be assigned all regions, which is not optimal.
To prevent this scenario, increase the `hbase.master.wait.on.regionservers.mintostart` property from its default value of 1.
See link:https://issues.apache.org/jira/browse/HBASE-6389[HBASE-6389 Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments] for more detail.
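
As a sketch, the property could be raised in _hbase-site.xml_ as follows. The value 5 is a hypothetical choice for illustration; pick a number that reflects how many RegionServers you expect to be up before assignment should begin.

[source,xml]
----
<property>
  <name>hbase.master.wait.on.regionservers.mintostart</name>
  <value>5</value>
  <description>Hypothetical value: the Master waits for at least this many
  RegionServers to check in before it starts assigning regions.</description>
</property>
----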
 768
 769 [[recommended_configurations]]
 770 === Recommended Configurations
 771
 772 [[recommended_configurations.zk]]
 773 ==== ZooKeeper Configuration
 774
 775 [[sect.zookeeper.session.timeout]]
 776 ===== `zookeeper.session.timeout`
 777
 778 The default timeout is 90 seconds (specified in milliseconds). This means that if a server crashes, it will be 90 seconds before the Master notices the crash and starts recovery.
 779 You might need to tune the timeout down to a minute or even less so the Master notices failures sooner.
Before changing this value, be sure you have your JVM garbage collection configuration under control; otherwise, a long garbage collection that lasts beyond the ZooKeeper session timeout will take out your RegionServer. (You might be fine with this: you probably want recovery to start on the server if a RegionServer has been in GC for a long period of time.)
 781
 782 To change this configuration, edit _hbase-site.xml_, copy the changed file across the cluster and restart.
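
For example, a minimal sketch of lowering the timeout to one minute, as suggested above, might look like this in _hbase-site.xml_ (the value is in milliseconds):

[source,xml]
----
<property>
  <name>zookeeper.session.timeout</name>
  <value>60000</value>
  <description>Lowered from the 90 second default so that the Master notices
  a crashed server sooner. Only do this once JVM GC pauses are under control.</description>
</property>
----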
 783
We set this value high to avoid having to field questions on the mailing lists asking why a RegionServer went down during a massive import.
The usual cause is that the JVM is untuned and is running into long GC pauses.
Our thinking is that while users are getting familiar with HBase, we'd spare them having to know all of its intricacies.
Later, when they've built some confidence, they can play with configuration such as this.
 788
 789 [[zookeeper.instances]]
 790 ===== Number of ZooKeeper Instances
 791
 792 See <<zookeeper,zookeeper>>.
 793
 794 [[recommended.configurations.hdfs]]
 795 ==== HDFS Configurations
 796
 797 [[dfs.datanode.failed.volumes.tolerated]]
 798 ===== `dfs.datanode.failed.volumes.tolerated`
 799
 800 This is the "...number of volumes that are allowed to fail before a DataNode stops offering service.
 801 By default any volume failure will cause a datanode to shutdown" from the _hdfs-default.xml_ description.
You might want to set this to about half the number of your available disks.
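
As an illustration of that rule of thumb, assume a hypothetical DataNode with 12 data disks. Half of that is 6, so the entry in _hdfs-site.xml_ on the DataNodes might look like this:

[source,xml]
----
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>6</value>
  <description>Hypothetical 12-disk DataNode: keep serving until more than
  6 volumes have failed.</description>
</property>
----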
 803
 804 [[hbase.regionserver.handler.count]]
 805 ===== `hbase.regionserver.handler.count`
 806
 807 This setting defines the number of threads that are kept open to answer incoming requests to user tables.
The rule of thumb is to keep this number low when the payload per request approaches a megabyte (big puts, scans using a large cache) and high when the payload is small (gets, small puts, ICVs, deletes). The total size of the queries in progress is limited by the setting `hbase.ipc.server.max.callqueue.size`.
 809
 810 It is safe to set that number to the maximum number of incoming clients if their payload is small, the typical example being a cluster that serves a website since puts aren't typically buffered and most of the operations are gets.
 811
 812 The reason why it is dangerous to keep this setting high is that the aggregate size of all the puts that are currently happening in a region server may impose too much pressure on its memory, or even trigger an OutOfMemoryError.
 813 A RegionServer running on low memory will trigger its JVM's garbage collector to run more frequently up to a point where GC pauses become noticeable (the reason being that all the memory used to keep all the requests' payloads cannot be trashed, no matter how hard the garbage collector tries). After some time, the overall cluster throughput is affected since every request that hits that RegionServer will take longer, which exacerbates the problem even more.
 814
You can get a sense of whether you have too few or too many handlers by enabling <<rpc.logging,rpc.logging>> on an individual RegionServer and then tailing its logs (queued requests consume memory).
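
A sketch of how the handler count might be set in _hbase-site.xml_ follows. The value 100 is a hypothetical figure for a workload dominated by small gets and puts; a workload with large payloads would call for a lower number, as discussed above.

[source,xml]
----
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>100</value>
  <description>Hypothetical value for a small-payload workload. Lower it if
  requests carry payloads approaching a megabyte.</description>
</property>
----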
 816
 817 [[big_memory]]
 818 ==== Configuration for large memory machines
 819
 820 HBase ships with a reasonable, conservative configuration that will work on nearly all machine types that people might want to test with.
 821 If you have larger machines -- HBase has 8G and larger heap -- you might find the following configuration options helpful.
 822 TODO.
 823
 824 [[config.compression]]
 825 ==== Compression
 826
 827 You should consider enabling ColumnFamily compression.
There are several options that are near-frictionless and in almost all cases boost performance by reducing the size of StoreFiles and thus reducing I/O.
 829
 830 See <<compression,compression>> for more information.
 831
 832 [[config.wals]]
 833 ==== Configuring the size and number of WAL files
 834
HBase uses <<wal,wal>> to recover the memstore data that has not been flushed to disk in case of a RegionServer failure.
These WAL files should be configured to be slightly smaller than an HDFS block (by default an HDFS block is 64 MB and a WAL file is ~60 MB).

HBase also has a limit on the number of WAL files, designed to ensure there's never too much data that needs to be replayed during recovery.
This limit needs to be set according to memstore configuration, so that all the necessary data would fit.
It is recommended to allocate enough WAL files to store at least that much data (when all memstores are close to full). For example, with a 16 GB RegionServer heap, default memstore settings (0.4), and the default WAL file size (~60 MB), the starting point for the WAL file count is 16 GB * 0.4 / 60 MB, or ~109.
However, as all memstores are not expected to be full all the time, fewer WAL files can be allocated.
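
As a sketch of where the figure from this calculation might go, the fragment below assumes that your HBase version honors the `hbase.regionserver.maxlogs` property as the cap on the number of WAL files; that property name is not taken from this section, so check _hbase-default.xml_ for your release before relying on it.

[source,xml]
----
<!-- Worked arithmetic for the example above:
     16 GB heap * 0.4 memstore fraction = 6.4 GB = 6553.6 MB
     6553.6 MB / 60 MB per WAL file     = ~109 WAL files -->
<property>
  <name>hbase.regionserver.maxlogs</name>
  <value>109</value>
</property>
----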
 842
 843 [[disable.splitting]]
 844 ==== Managed Splitting
 845
HBase generally handles splitting of your regions based upon the settings in your _hbase-default.xml_ and _hbase-site.xml_ configuration files.
 847 Important settings include `hbase.regionserver.region.split.policy`, `hbase.hregion.max.filesize`, `hbase.regionserver.regionSplitLimit`.
 848 A simplistic view of splitting is that when a region grows to `hbase.hregion.max.filesize`, it is split.
 849 For most usage patterns, you should use automatic splitting.
 850 See <<manual_region_splitting_decisions,manual region splitting decisions>> for more information about manual region splitting.
 851
 852 Instead of allowing HBase to split your regions automatically, you can choose to manage the splitting yourself.
Manually managing splits works if you know your keyspace well; otherwise, let HBase figure out where to split for you.
Manual splitting can mitigate region creation and movement under load.
It also makes region boundaries known and invariant (if you disable region splitting). If you use manual splits, it is easier to do staggered, time-based major compactions to spread out your network IO load.
 856
 857 .Disable Automatic Splitting
To disable automatic splitting, set the region split policy, in either the cluster configuration or the table configuration, to `org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy`.
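
A minimal sketch of the cluster-wide variant, using the `hbase.regionserver.region.split.policy` setting named earlier in this section, might look like this in _hbase-site.xml_:

[source,xml]
----
<property>
  <name>hbase.regionserver.region.split.policy</name>
  <value>org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy</value>
  <description>Turns off automatic region splitting cluster-wide.
  Regions must then be split manually.</description>
</property>
----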
 859
 860 .Automatic Splitting Is Recommended
 861 [NOTE]
 862 ====
 863 If you disable automatic splits to diagnose a problem or during a period of fast data growth, it is recommended to re-enable them when your situation becomes more stable.
 864 The potential benefits of managing region splits yourself are not undisputed.
 865 ====
 866
 867 .Determine the Optimal Number of Pre-Split Regions
 868 The optimal number of pre-split regions depends on your application and environment.
 869 A good rule of thumb is to start with 10 pre-split regions per server and watch as data grows over time.
 870 It is better to err on the side of too few regions and perform rolling splits later.
 871 The optimal number of regions depends upon the largest StoreFile in your region.
 872 The size of the largest StoreFile will increase with time if the amount of data grows.
 873 The goal is for the largest region to be just large enough that the compaction selection algorithm only compacts it during a timed major compaction.
 874 Otherwise, the cluster can be prone to compaction storms with a large number of regions under compaction at the same time.
 875 It is important to understand that the data growth causes compaction storms and not the manual split decision.
 876
If the regions are split into too many large regions, you can increase the major compaction interval by configuring `HConstants.MAJOR_COMPACTION_PERIOD` (the `hbase.hregion.majorcompaction` property).
 878 The `org.apache.hadoop.hbase.util.RegionSplitter` utility also provides a network-IO-safe rolling split of all regions.
 879
 880 [[managed.compactions]]
 881 ==== Managed Compactions
 882
 883 By default, major compactions are scheduled to run once in a 7-day period.
 884
 885 If you need to control exactly when and how often major compaction runs, you can disable managed major compactions.
 886 See the entry for `hbase.hregion.majorcompaction` in the <<compaction.parameters,compaction.parameters>> table for details.
 887
 888 .Do Not Disable Major Compactions
 889 [WARNING]
 890 ====
 891 Major compactions are absolutely necessary for StoreFile clean-up.
 892 Do not disable them altogether.
 893 You can run major compactions manually via the HBase shell or via the link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact-org.apache.hadoop.hbase.TableName-[Admin API].
 894 ====
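
If you do take over scheduling yourself, the following sketch shows how the time-based trigger might be turned off in _hbase-site.xml_. It assumes, as described for `hbase.hregion.majorcompaction` in the compaction parameters table, that a value of 0 disables automatic time-based major compactions; you would then need to run major compactions manually on your own schedule.

[source,xml]
----
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
  <description>Assumption: 0 disables time-based major compactions.
  Major compactions must then be triggered manually.</description>
</property>
----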
 895
For more information about compactions and the compaction file selection process, see <<compaction,compaction>>.
 897
 898 [[spec.ex]]
 899 ==== Speculative Execution
 900
 901 Speculative Execution of MapReduce tasks is on by default, and for HBase clusters it is generally advised to turn off Speculative Execution at a system-level unless you need it for a specific case, where it can be configured per-job.
 902 Set the properties `mapreduce.map.speculative` and `mapreduce.reduce.speculative` to false.
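
For example, the two properties could be set system-wide in the MapReduce configuration (typically _mapred-site.xml_) as sketched below:

[source,xml]
----
<property>
  <name>mapreduce.map.speculative</name>
  <value>false</value>
</property>
<property>
  <name>mapreduce.reduce.speculative</name>
  <value>false</value>
</property>
----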
 903
 904 [[other_configuration]]
 905 === Other Configurations
 906
 907 [[balancer_config]]
 908 ==== Balancer
 909
 910 The balancer is a periodic operation which is run on the master to redistribute regions on the cluster.
It is configured via `hbase.balancer.period` and defaults to 300000 milliseconds (5 minutes).
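
As a sketch, running the balancer every 10 minutes instead would look like this in _hbase-site.xml_. The value is an arbitrary illustration, not a recommendation:

[source,xml]
----
<property>
  <name>hbase.balancer.period</name>
  <value>600000</value>
  <description>Illustrative: run the balancer every 10 minutes (value in milliseconds).</description>
</property>
----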
 912
 913 See <<master.processes.loadbalancer,master.processes.loadbalancer>> for more information on the LoadBalancer.
 914
 915 [[disabling.blockcache]]
 916 ==== Disabling Blockcache
 917
Do not turn off the block cache (you'd do it by setting `hfile.block.cache.size` to zero). Currently, HBase does not do well if you do this because the RegionServer will spend all its time loading HFile indices over and over again.
 919 If your working set is such that block cache does you no good, at least size the block cache such that HFile indices will stay up in the cache (you can get a rough idea on the size you need by surveying RegionServer UIs; you'll see index block size accounted near the top of the webpage).
 920
 921 [[nagles]]
==== link:http://en.wikipedia.org/wiki/Nagle's_algorithm[Nagle's] or the small packet problem
 923
If an occasional delay of 40ms or so is seen in operations against HBase, try changing the Nagle's algorithm setting.
For example, see the user mailing list thread, link:http://search-hadoop.com/m/pduLg2fydtE/Inconsistent+scan+performance+with+caching+set+&subj=Re+Inconsistent+scan+performance+with+caching+set+to+1[Inconsistent scan performance with caching set to 1] and the issue cited therein where setting `tcpnodelay` improved scan speeds.
You might also see the graphs at the tail of link:https://issues.apache.org/jira/browse/HBASE-7008[HBASE-7008 Set scanner caching to a better default] where Lars Hofhansl tries various data sizes with Nagle's on and off, measuring the effect.
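
A sketch of what such a setting could look like in _hbase-site.xml_ follows. It assumes that your release exposes the `hbase.ipc.client.tcpnodelay` property; that name is not given in this section, so verify it against _hbase-default.xml_ for your version. A value of `true` asks the RPC socket not to delay small packets, which in effect turns Nagle's algorithm off for client connections.

[source,xml]
----
<property>
  <name>hbase.ipc.client.tcpnodelay</name>
  <value>true</value>
  <description>Assumed property name: send small packets immediately on
  client RPC connections instead of letting Nagle's algorithm batch them.</description>
</property>
----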
 927
 928 [[mttr]]
 929 ==== Better Mean Time to Recover (MTTR)
 930
This section is about configurations that will make servers come back faster after a failure.
See the Devaraj Das and Nicolas Liochon blog post link:http://hortonworks.com/blog/introduction-to-hbase-mean-time-to-recover-mttr/[Introduction to HBase Mean Time to Recover (MTTR)] for a brief introduction.
 933
The issue link:https://issues.apache.org/jira/browse/HBASE-8389[HBASE-8389 (HBASE-8354 forces Namenode into loop with lease recovery requests)] is messy but has a bunch of good discussion toward the end on low timeouts and how to cause faster recovery, including citation of fixes added to HDFS. Read the Varun Sharma comments.
The configurations suggested below are Varun's suggestions, distilled and tested.
Make sure you are running on a late-version HDFS so you have the fixes he refers to, and himself added to HDFS, that help HBase MTTR (e.g. HDFS-3703, HDFS-3712, and HDFS-4791; Hadoop 2 for sure has them and late Hadoop 1 has some). Set the following in the RegionServer.
 938
 939 [source,xml]
 940 ----
 941 <property>
 942   <name>hbase.lease.recovery.dfs.timeout</name>
 943   <value>23000</value>
 944   <description>How much time we allow elapse between calls to recover lease.
 945   Should be larger than the dfs timeout.</description>
 946 </property>
 947 <property>
 948   <name>dfs.client.socket-timeout</name>
 949   <value>10000</value>
 950   <description>Down the DFS timeout from 60 to 10 seconds.</description>
 951 </property>
 952 ----
 953
 954 And on the NameNode/DataNode side, set the following to enable 'staleness' introduced in HDFS-3703, HDFS-3912.
 955
 956 [source,xml]
 957 ----
 958 <property>
 959   <name>dfs.client.socket-timeout</name>
 960   <value>10000</value>
 961   <description>Down the DFS timeout from 60 to 10 seconds.</description>
 962 </property>
 963 <property>
 964   <name>dfs.datanode.socket.write.timeout</name>
 965   <value>10000</value>
 966   <description>Down the DFS timeout from 8 * 60 to 10 seconds.</description>
 967 </property>
 968 <property>
 969   <name>ipc.client.connect.timeout</name>
 970   <value>3000</value>
 971   <description>Down from 60 seconds to 3.</description>
 972 </property>
 973 <property>
 974   <name>ipc.client.connect.max.retries.on.timeouts</name>
 975   <value>2</value>
  <description>Down from the default of 45 retries to 2 (that is, 3 connection attempts).</description>
 977 </property>
 978 <property>
 979   <name>dfs.namenode.avoid.read.stale.datanode</name>
 980   <value>true</value>
 981   <description>Enable stale state in hdfs</description>
 982 </property>
 983 <property>
 984   <name>dfs.namenode.stale.datanode.interval</name>
 985   <value>20000</value>
 986   <description>Down from default 30 seconds</description>
 987 </property>
 988 <property>
 989   <name>dfs.namenode.avoid.write.stale.datanode</name>
 990   <value>true</value>
 991   <description>Enable stale state in hdfs</description>
 992 </property>
 993 ----
 994
 995 [[jmx_config]]
 996 ==== JMX
 997
 998 JMX (Java Management Extensions) provides built-in instrumentation that enables you to monitor and manage the Java VM.
To enable monitoring and management from remote systems, you need to set the system property `com.sun.management.jmxremote.port` (the port number through which you want to enable JMX RMI connections) when you start the Java VM.
See the link:http://docs.oracle.com/javase/8/docs/technotes/guides/management/agent.html[official documentation] for more information.
Historically, besides the port mentioned above, JMX opens two additional random TCP listening ports, which could lead to port conflicts. (See link:https://issues.apache.org/jira/browse/HBASE-10289[HBASE-10289] for details.)
1002
1003 As an alternative, you can use the coprocessor-based JMX implementation provided by HBase.
To enable it, add the property below to _hbase-site.xml_:
1005
1006 [source,xml]
1007 ----
1008 <property>
1009   <name>hbase.coprocessor.regionserver.classes</name>
1010   <value>org.apache.hadoop.hbase.JMXListener</value>
1011 </property>
1012 ----
1013
NOTE: DO NOT set `com.sun.management.jmxremote.port` for the Java VM at the same time.

Currently it supports the Master and RegionServer Java VMs.
By default, JMX listens on TCP port 10102; you can further configure the port using the properties below:
1018
1019 [source,xml]
1020 ----
1021 <property>
1022   <name>regionserver.rmi.registry.port</name>
1023   <value>61130</value>
1024 </property>
1025 <property>
1026   <name>regionserver.rmi.connector.port</name>
1027   <value>61140</value>
1028 </property>
1029 ----
1030
The registry port can be shared with the connector port in most cases, so you only need to configure `regionserver.rmi.registry.port`.
However, if you want to use SSL communication, the two ports must be configured to different values.

By default, password authentication and SSL communication are disabled.
To enable password authentication, you need to update _hbase-env.sh_ like below:
1036 [source,bash]
1037 ----
1038 export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.authenticate=true                  \
1039                        -Dcom.sun.management.jmxremote.password.file=your_password_file   \
1040                        -Dcom.sun.management.jmxremote.access.file=your_access_file"
1041
1042 export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE "
1043 export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE "
1044 ----
1045
See the example password and access files under _$JRE_HOME/lib/management_.

To enable SSL communication with password authentication, follow the steps below:
1049
1050 [source,bash]
1051 ----
1052 #1. generate a key pair, stored in myKeyStore
1053 keytool -genkey -alias jconsole -keystore myKeyStore
1054
1055 #2. export it to file jconsole.cert
1056 keytool -export -alias jconsole -keystore myKeyStore -file jconsole.cert
1057
1058 #3. copy jconsole.cert to jconsole client machine, import it to jconsoleKeyStore
1059 keytool -import -alias jconsole -keystore jconsoleKeyStore -file jconsole.cert
1060 ----
1061
1062 And then update _hbase-env.sh_ like below:
1063
1064 [source,bash]
1065 ----
export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=true                         \
                       -Djavax.net.ssl.keyStore=/home/tianq/myKeyStore                 \
                       -Djavax.net.ssl.keyStorePassword=your_password_in_step_1       \
                       -Dcom.sun.management.jmxremote.authenticate=true                \
                       -Dcom.sun.management.jmxremote.password.file=your_password_file \
                       -Dcom.sun.management.jmxremote.access.file=your_access_file"
1072
1073 export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE "
1074 export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE "
1075 ----
1076
Finally, start `jconsole` on the client using the key store:
1078
1079 [source,bash]
1080 ----
1081 jconsole -J-Djavax.net.ssl.trustStore=/home/tianq/jconsoleKeyStore
1082 ----
1083
NOTE: To enable the HBase JMX implementation on the Master, you also need to add the property below to _hbase-site.xml_:
1085
1086 [source,xml]
1087 ----
1088 <property>
1089   <name>hbase.coprocessor.master.classes</name>
1090   <value>org.apache.hadoop.hbase.JMXListener</value>
1091 </property>
1092 ----
1093
The corresponding properties for port configuration are `master.rmi.registry.port` (by default 10101) and `master.rmi.connector.port` (by default the same as `master.rmi.registry.port`).
1095
1096 [[dyn_config]]
1097 == Dynamic Configuration
1098
1099 It is possible to change a subset of the configuration without requiring a server restart.
1100 In the HBase shell, the operations `update_config` and `update_all_config` will prompt a server or all servers to reload configuration.
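
As a sketch of the workflow, you might first edit one of the dynamically configurable keys listed in this section in _hbase-site.xml_ on the affected servers, and then run `update_all_config` (or `update_config` for a single server) from the HBase shell so that the running servers reload the file. The key is taken from the table in this section; the value 4 is a hypothetical choice for illustration.

[source,xml]
----
<property>
  <name>hbase.regionserver.thread.compaction.small</name>
  <value>4</value>
  <description>Hypothetical value. After saving the file on each server,
  run update_all_config in the HBase shell to apply it without a restart.</description>
</property>
----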
1101
Only the configurations listed below can currently be changed in a running server:
1104
.Configurations that support dynamic change
1106 [cols="1",options="header"]
1107 |===
1108 | Key
1109 | hbase.ipc.server.fallback-to-simple-auth-allowed
1110 | hbase.cleaner.scan.dir.concurrent.size
1111 | hbase.regionserver.thread.compaction.large
1112 | hbase.regionserver.thread.compaction.small
1113 | hbase.regionserver.thread.split
1114 | hbase.regionserver.throughput.controller
1115 | hbase.regionserver.thread.hfilecleaner.throttle
1116 | hbase.regionserver.hfilecleaner.large.queue.size
1117 | hbase.regionserver.hfilecleaner.small.queue.size
1118 | hbase.regionserver.hfilecleaner.large.thread.count
1119 | hbase.regionserver.hfilecleaner.small.thread.count
1120 | hbase.regionserver.hfilecleaner.thread.timeout.msec
1121 | hbase.regionserver.hfilecleaner.thread.check.interval.msec
1122 | hbase.regionserver.flush.throughput.controller
1123 | hbase.hstore.compaction.max.size
1124 | hbase.hstore.compaction.max.size.offpeak
1125 | hbase.hstore.compaction.min.size
1126 | hbase.hstore.compaction.min
1127 | hbase.hstore.compaction.max
1128 | hbase.hstore.compaction.ratio
1129 | hbase.hstore.compaction.ratio.offpeak
1130 | hbase.regionserver.thread.compaction.throttle
1131 | hbase.hregion.majorcompaction
1132 | hbase.hregion.majorcompaction.jitter
1133 | hbase.hstore.min.locality.to.skip.major.compact
1134 | hbase.hstore.compaction.date.tiered.max.storefile.age.millis
1135 | hbase.hstore.compaction.date.tiered.incoming.window.min
1136 | hbase.hstore.compaction.date.tiered.window.policy.class
1137 | hbase.hstore.compaction.date.tiered.single.output.for.minor.compaction
1138 | hbase.hstore.compaction.date.tiered.window.factory.class
1139 | hbase.offpeak.start.hour
1140 | hbase.offpeak.end.hour
1141 | hbase.oldwals.cleaner.thread.size
1142 | hbase.oldwals.cleaner.thread.timeout.msec
1143 | hbase.oldwals.cleaner.thread.check.interval.msec
1144 | hbase.procedure.worker.keep.alive.time.msec
1145 | hbase.procedure.worker.add.stuck.percentage
1146 | hbase.procedure.worker.monitor.interval.msec
1147 | hbase.procedure.worker.stuck.threshold.msec
1148 | hbase.regions.slop
1149 | hbase.regions.overallSlop
1150 | hbase.balancer.tablesOnMaster
1151 | hbase.balancer.tablesOnMaster.systemTablesOnly
1152 | hbase.util.ip.to.rack.determiner
1153 | hbase.ipc.server.max.callqueue.length
1154 | hbase.ipc.server.priority.max.callqueue.length
1155 | hbase.ipc.server.callqueue.type
1156 | hbase.ipc.server.callqueue.codel.target.delay
1157 | hbase.ipc.server.callqueue.codel.interval
1158 | hbase.ipc.server.callqueue.codel.lifo.threshold
1159 | hbase.master.balancer.stochastic.maxSteps
1160 | hbase.master.balancer.stochastic.stepsPerRegion
1161 | hbase.master.balancer.stochastic.maxRunningTime
1162 | hbase.master.balancer.stochastic.runMaxSteps
1163 | hbase.master.balancer.stochastic.numRegionLoadsToRemember
1164 | hbase.master.loadbalance.bytable
1165 | hbase.master.balancer.stochastic.minCostNeedBalance
1166 | hbase.master.balancer.stochastic.localityCost
1167 | hbase.master.balancer.stochastic.rackLocalityCost
1168 | hbase.master.balancer.stochastic.readRequestCost
1169 | hbase.master.balancer.stochastic.writeRequestCost
1170 | hbase.master.balancer.stochastic.memstoreSizeCost
1171 | hbase.master.balancer.stochastic.storefileSizeCost
1172 | hbase.master.balancer.stochastic.regionReplicaHostCostKey
1173 | hbase.master.balancer.stochastic.regionReplicaRackCostKey
1174 | hbase.master.balancer.stochastic.regionCountCost
1175 | hbase.master.balancer.stochastic.primaryRegionCountCost
1176 | hbase.master.balancer.stochastic.moveCost
1177 | hbase.master.balancer.stochastic.maxMovePercent
1178 | hbase.master.balancer.stochastic.tableSkewCost
1179 | hbase.master.regions.recovery.check.interval
1180 | hbase.regions.recovery.store.file.ref.count
1181 |===
1182
1183 ifdef::backend-docbook[]
1184 [index]
1185 == Index
1186 // Generated automatically by the DocBook toolchain.
1187 endif::backend-docbook[]