////
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
////

[[slow_log_responses_from_systable]]
==== Get Slow/Large Response Logs from System table hbase:slowlog

The above section provides details about Admin APIs:

* get_slowlog_responses
* get_largelog_responses
* clear_slowlog_responses

All of the above APIs read the online in-memory ring buffers of
individual RegionServers and aggregate the logs from those buffers for display
to the end user. However, because the logs are held in memory, restarting a
RegionServer discards everything in that RegionServer's memory, and the
previous logs are lost. What if we want to persist all these logs?
What if we want to store them so that an operator can retrieve historical
records with filters, for example all large/slow RPC logs that were triggered by
`user1` and relate to the region
`cluster_test,cccccccc,1589635796466.aa45e1571d533f5ed0bb31cdccaaf9cf.`?

A system table that stores such logs in roughly (though not strictly) increasing
order of time can help operators debug historical events
(scan, get, put, compaction, flush, etc.) with detailed inputs.

The config that enables the system table to be created and to store all log events is
`hbase.regionserver.slowlog.systable.enabled`.

The default value for this config is `false`. If it is set to `true`
(note: `hbase.regionserver.slowlog.buffer.enabled` must also be `true`),
a periodic job (chore) running in every RegionServer persists the slow/large logs into
the table hbase:slowlog. By default the chore runs every 10 minutes. The interval can be
configured with the key `hbase.slowlog.systable.chore.duration`. By default, each
RegionServer stores up to 1000 (config key: `hbase.regionserver.slowlog.systable.queue.size`)
slow/large logs in an internal queue, and the chore retrieves these logs
from the queue and inserts them into hbase:slowlog in batches.
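
As a minimal sketch of the settings described above, and assuming they are placed in `hbase-site.xml` like other RegionServer properties (the file name is an assumption, this section names only the keys), enabling persistence might look like the following. The queue size is shown at its documented default. `hbase.slowlog.systable.chore.duration` is left out because this section does not state the unit of its value.

[source,xml]
----
<!-- Sketch only: both flags must be true for logs to reach hbase:slowlog. -->
<property>
  <name>hbase.regionserver.slowlog.buffer.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hbase.regionserver.slowlog.systable.enabled</name>
  <value>true</value>
</property>
<!-- Optional: size of the internal queue drained by the chore (documented default 1000). -->
<property>
  <name>hbase.regionserver.slowlog.systable.queue.size</name>
  <value>1000</value>
</property>
----
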

hbase:slowlog has a single ColumnFamily: `info`.
`info` contains multiple qualifiers, which are the same attributes that are
part of the `get_slowlog_responses` API response.

* info:call_details
* info:client_address
* info:method_name
* info:param
* info:processing_time
* info:queue_time
* info:region_name
* info:response_size
* info:server_class
* info:start_time
* info:type
* info:username

An example of 2 rows from an hbase:slowlog scan result:
[source]
----
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:call_details, timestamp=2020-05-16T14:58:14.211Z, value=Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:client_address, timestamp=2020-05-16T14:58:14.211Z, value=172.20.10.2:57347
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:method_name, timestamp=2020-05-16T14:58:14.211Z, value=Scan
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:param, timestamp=2020-05-16T14:58:14.211Z, value=region { type: REGION_NAME value: "hbase:meta,,1" } scan { column { family: "info" } attribute { name: "_isolationle
                                                             vel_" value: "\x5C000" } start_row: "cluster_test,33333333,99999999999999" stop_row: "cluster_test,," time_range { from: 0 to: 9223372036854775807 } max_versions: 1 cache_blocks
                                                             : true max_result_size: 2097152 reversed: true caching: 10 include_stop_row: true readType: PREAD } number_of_rows: 10 close_scanner: false client_handles_partials: true client_
                                                             handles_heartbeats: true track_scan_metrics: false
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:processing_time, timestamp=2020-05-16T14:58:14.211Z, value=18
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:queue_time, timestamp=2020-05-16T14:58:14.211Z, value=0
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:region_name, timestamp=2020-05-16T14:58:14.211Z, value=hbase:meta,,1
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:response_size, timestamp=2020-05-16T14:58:14.211Z, value=1575
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:server_class, timestamp=2020-05-16T14:58:14.211Z, value=HRegionServer
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:start_time, timestamp=2020-05-16T14:58:14.211Z, value=1589640743732
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:type, timestamp=2020-05-16T14:58:14.211Z, value=ALL
 \x024\xC1\x03\xE9\x04\xF5@                                  column=info:username, timestamp=2020-05-16T14:58:14.211Z, value=user2
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:call_details, timestamp=2020-05-16T14:59:58.764Z, value=Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:client_address, timestamp=2020-05-16T14:59:58.764Z, value=172.20.10.2:57348
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:method_name, timestamp=2020-05-16T14:59:58.764Z, value=Scan
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:param, timestamp=2020-05-16T14:59:58.764Z, value=region { type: REGION_NAME value: "cluster_test,cccccccc,1589635796466.aa45e1571d533f5ed0bb31cdccaaf9cf." } scan { a
                                                             ttribute { name: "_isolationlevel_" value: "\x5C000" } start_row: "cccccccc" time_range { from: 0 to: 9223372036854775807 } max_versions: 1 cache_blocks: true max_result_size: 2
                                                             097152 caching: 2147483647 include_stop_row: false } number_of_rows: 2147483647 close_scanner: false client_handles_partials: true client_handles_heartbeats: true track_scan_met
                                                             rics: false
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:processing_time, timestamp=2020-05-16T14:59:58.764Z, value=24
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:queue_time, timestamp=2020-05-16T14:59:58.764Z, value=0
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:region_name, timestamp=2020-05-16T14:59:58.764Z, value=cluster_test,cccccccc,1589635796466.aa45e1571d533f5ed0bb31cdccaaf9cf.
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:response_size, timestamp=2020-05-16T14:59:58.764Z, value=211227
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:server_class, timestamp=2020-05-16T14:59:58.764Z, value=HRegionServer
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:start_time, timestamp=2020-05-16T14:59:58.764Z, value=1589640743932
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:type, timestamp=2020-05-16T14:59:58.764Z, value=ALL
 \x024\xC1\x06X\x81\xF6\xEC                                  column=info:username, timestamp=2020-05-16T14:59:58.764Z, value=user1
----

Operators can use ColumnValueFilter to filter records based on region_name, username,
client_address, etc.
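
As a hedged sketch of such a filter, the Ruby snippet below builds a filter-language string that matches a single `info` qualifier against an exact value. It assumes that ColumnValueFilter is registered with the shell's filter parser in your HBase version and that it takes family, qualifier, operator, and comparator arguments in that order. The helper name `slowlog_filter` is hypothetical.

[source,ruby]
----
# Build a filter-language string for a ColumnValueFilter on one info: qualifier.
# slowlog_filter is a hypothetical helper, not part of HBase. Values that contain
# a single quote are not handled in this sketch.
def slowlog_filter(qualifier, value)
  "ColumnValueFilter('info', '#{qualifier}', =, 'binary:#{value}')"
end

user_filter = slowlog_filter('username', 'user1')
puts user_filter

# Inside the HBase shell (which is JRuby), the string could then be passed to scan:
#   scan 'hbase:slowlog', { FILTER => user_filter }
----

The same helper could be pointed at `region_name` or `client_address` by changing the qualifier argument.
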

Time-range-based queries are also useful.
Example:
[source]
----
scan 'hbase:slowlog', { TIMERANGE => [1589621394000, 1589637999999] }
----
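
The TIMERANGE bounds above look like epoch milliseconds, which is consistent with the cell timestamps and `info:start_time` values in the scan output. Under that assumption, the following Ruby sketch converts ISO 8601 timestamps into such bounds and prints the matching scan command. The helper name `epoch_millis` is hypothetical.

[source,ruby]
----
require 'time'

# Convert an ISO 8601 timestamp to epoch milliseconds.
# epoch_millis is a hypothetical helper, not part of HBase.
def epoch_millis(iso8601)
  t = Time.iso8601(iso8601)
  # Whole seconds scaled to milliseconds, plus the millisecond part of the fraction.
  t.to_i * 1000 + t.nsec / 1_000_000
end

from = epoch_millis('2020-05-16T09:29:54Z')
to   = epoch_millis('2020-05-16T14:06:39.999Z')

# Print a scan command that can be pasted into the HBase shell.
puts "scan 'hbase:slowlog', { TIMERANGE => [#{from}, #{to}] }"
----

With these two inputs the printed command has the same bounds as the example scan above.
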