2 Licensed to the Apache Software Foundation (ASF) under one
3 or more contributor license agreements. See the NOTICE file
4 distributed with this work for additional information
5 regarding copyright ownership. The ASF licenses this file
6 to you under the Apache License, Version 2.0 (the
7 "License"); you may not use this file except in compliance
8 with the License. You may obtain a copy of the License at
10 http://www.apache.org/licenses/LICENSE-2.0
12 Unless required by applicable law or agreed to in writing,
13 software distributed under the License is distributed on an
14 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
15 KIND, either express or implied. See the License for the
16 specific language governing permissions and limitations
20 # Git / JIRA Release Audit
22 This is an application for performing an audit between the histories on our git
23 branches and the `fixVersion` field set on issues in JIRA. It does this by
24 building a SQLite database from the commits found on each git branch,
25 identifying Jira IDs and release tags, and then requesting information about
26 those issues from Jira. Once both sources have been collected, queries can be
27 performed against the database to look for discrepancies between the sources of
28 truth (and, possibly, bugs in this script).
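
The first half of that pipeline can be illustrated with a minimal sketch. It is not the script's actual implementation: the `HBASE-\d+` pattern, the helper names, the reduced three-column `git_commits` schema, and the sample shas are all assumptions made for illustration.

```python
import re
import sqlite3

# Assumed pattern: HBase-style issue keys such as "HBASE-12345".
JIRA_ID_PATTERN = re.compile(r"\bHBASE-\d+\b")


def extract_jira_ids(commit_message):
    """Return the issue keys in a commit message, in order, without duplicates."""
    seen = []
    for key in JIRA_ID_PATTERN.findall(commit_message):
        if key not in seen:
            seen.append(key)
    return seen


def record_commits(conn, branch, commits):
    """Store one row per issue key found in each (sha, message) pair."""
    # Simplified, assumed schema; the real audit database has more columns.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS git_commits "
        "(jira_id TEXT, branch TEXT, git_sha TEXT)"
    )
    for sha, message in commits:
        for jira_id in extract_jira_ids(message):
            conn.execute(
                "INSERT INTO git_commits VALUES (?, ?, ?)",
                (jira_id, branch, sha),
            )
    conn.commit()


# Made-up shas paired with issue summaries that appear later in this README.
conn = sqlite3.connect(":memory:")
record_commits(conn, "origin/branch-2.3", [
    ("abc123", "HBASE-21070 SnapshotFileCache won't update for snapshots stored in S3"),
    ("def456", "HBASE-22057 Impose upper-bound on size of ZK ops sent in a single multi()"),
    ("0a1b2c", "Update POM version"),  # no issue key, so no row is written
])
rows = conn.execute(
    "SELECT jira_id, branch FROM git_commits ORDER BY jira_id"
).fetchall()
print(rows)
```

Commits without a recognizable issue key are simply skipped in this sketch; the real tool may handle them differently, for example through the fallback actions file described below.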
32 The system prerequisites are Python 3 with VirtualEnv available and SQLite. Also,
33 you'll need the content of this directory and a local checkout of the git repository.
35 Build a VirtualEnv with the script's dependencies with:
40 $ python3 -m venv ./venv
41 $ ./venv/bin/pip install -r ./requirements.txt
43 Successfully installed...
48 The tool provides basic help docs.
51 $ ./venv/bin/python ./git_jira_release_audit.py --help
52 usage: git_jira_release_audit.py [-h] [--populate-from-git POPULATE_FROM_GIT]
53 [--populate-from-jira POPULATE_FROM_JIRA]
55 [--initialize-db INITIALIZE_DB]
56 [--report-new-for-release-line REPORT_NEW_FOR_RELEASE_LINE]
57 [--report-new-for-release-branch REPORT_NEW_FOR_RELEASE_BRANCH]
58 [--git-repo-path GIT_REPO_PATH]
59 [--remote-name REMOTE_NAME]
60 [--development-branch DEVELOPMENT_BRANCH]
61 [--development-branch-fix-version DEVELOPMENT_BRANCH_FIX_VERSION]
62 [--release-line-regexp RELEASE_LINE_REGEXP]
63 [--parse-release-tags PARSE_RELEASE_TAGS]
64 [--fallback-actions-path FALLBACK_ACTIONS_PATH]
65 [--branch-filter-regexp BRANCH_FILTER_REGEXP]
66 [--jira-url JIRA_URL] --branch-1-fix-version
67 BRANCH_1_FIX_VERSION --branch-2-fix-version
71 -h, --help show this help message and exit
73 Building the audit database:
74 --populate-from-git POPULATE_FROM_GIT
75 When true, populate the audit database from the Git
76 repository. (default: True)
77 --populate-from-jira POPULATE_FROM_JIRA
78 When true, populate the audit database from Jira.
80 --db-path DB_PATH Path to the database file, or leave unspecified for a
81 transient db. (default: audit.db)
82 --initialize-db INITIALIZE_DB
83 When true, initialize the database tables. This is
84 destructive to the contents of an existing database.
88 --report-new-for-release-line REPORT_NEW_FOR_RELEASE_LINE
89 Builds a report of the Jira issues that are new on the
90 target release line, not present on any of the
91 associated release branches. (i.e., on branch-2 but
92 not branch-{2.0,2.1,...}) (default: None)
93 --report-new-for-release-branch REPORT_NEW_FOR_RELEASE_BRANCH
94 Builds a report of the Jira issues that are new on the
95 target release branch, not present on any of the
96 previous release branches. (i.e., on branch-2.3 but
97 not branch-{2.0,2.1,...}) (default: None)
99 Interactions with the Git repo:
100 --git-repo-path GIT_REPO_PATH
101 Path to the git repo, or leave unspecified to infer
102 from the current file's path. (default:
103 ./git_jira_release_audit.py)
104 --remote-name REMOTE_NAME
105 The name of the git remote to use when identifying
106 branches. Default: 'origin' (default: origin)
107 --development-branch DEVELOPMENT_BRANCH
108 The name of the branch from which all release lines
109 originate. Default: 'master' (default: master)
110 --development-branch-fix-version DEVELOPMENT_BRANCH_FIX_VERSION
111 The Jira fixVersion used to indicate an issue is
112 committed to the development branch. (default: 3.0.0)
113 --release-line-regexp RELEASE_LINE_REGEXP
114 A regexp used to identify release lines. (default:
116 --parse-release-tags PARSE_RELEASE_TAGS
117 When true, look for release tags and annotate commits
118 according to their release version. An Expensive
119 calculation, disabled by default. (default: False)
120 --fallback-actions-path FALLBACK_ACTIONS_PATH
121 Path to a file containing _DB.Actions applicable to
122 specific git shas. (default: fallback_actions.csv)
123 --branch-filter-regexp BRANCH_FILTER_REGEXP
124 Limit repo parsing to branch names that match this
125 filter expression. (default: .*)
126 --branch-1-fix-version BRANCH_1_FIX_VERSION
127 The Jira fixVersion used to indicate an issue is
128 committed to the specified release line branch
130 --branch-2-fix-version BRANCH_2_FIX_VERSION
131 The Jira fixVersion used to indicate an issue is
132 committed to the specified release line branch
135 Interactions with Jira:
136 --jira-url JIRA_URL A URL locating the target JIRA instance. (default:
137 https://issues.apache.org/jira)
142 This invocation will build a "simple" database, correlating commits to
143 branches. It omits gathering the detailed release tag data, so it runs pretty quickly.
149 $ ./venv/bin/python3 ./git_jira_release_audit.py \
151 --development-branch-fix-version=3.0.0 \
152 --branch-1-fix-version=1.7.0 \
153 --branch-2-fix-version=2.4.0
154 INFO:git_jira_release_audit.py:origin/branch-1.0 has 1433 commits since its origin at 0167558eb31ff48308d592ef70b6d005ba6d21fb.
155 INFO:git_jira_release_audit.py:origin/branch-1.1 has 2111 commits since its origin at 0167558eb31ff48308d592ef70b6d005ba6d21fb.
156 INFO:git_jira_release_audit.py:origin/branch-1.2 has 2738 commits since its origin at 0167558eb31ff48308d592ef70b6d005ba6d21fb.
157 INFO:git_jira_release_audit.py:origin/branch-1.3 has 3296 commits since its origin at 0167558eb31ff48308d592ef70b6d005ba6d21fb.
158 INFO:git_jira_release_audit.py:origin/branch-1.4 has 3926 commits since its origin at 0167558eb31ff48308d592ef70b6d005ba6d21fb.
159 INFO:git_jira_release_audit.py:origin/branch-2 has 3325 commits since its origin at 0d0c330401ade938bf934aafd79ec23705edcc60.
160 INFO:git_jira_release_audit.py:origin/branch-2.0 has 2198 commits since its origin at 0d0c330401ade938bf934aafd79ec23705edcc60.
161 INFO:git_jira_release_audit.py:origin/branch-2.1 has 2749 commits since its origin at 0d0c330401ade938bf934aafd79ec23705edcc60.
162 INFO:git_jira_release_audit.py:origin/branch-2.2 has 2991 commits since its origin at 0d0c330401ade938bf934aafd79ec23705edcc60.
163 INFO:git_jira_release_audit.py:origin/branch-2.3 has 3312 commits since its origin at 0d0c330401ade938bf934aafd79ec23705edcc60.
164 INFO:git_jira_release_audit.py:retrieving 5850 jira_ids from the issue tracker
166 origin/branch-1 100%|████████████████████████████████████| 4084/4084 [00:00<00:00, 9805.33 commit/s]
167 origin/branch-1.0 100%|█████████████████████████████████| 1433/1433 [00:00<00:00, 10479.89 commit/s]
168 origin/branch-1.1 100%|█████████████████████████████████| 2111/2111 [00:00<00:00, 10280.60 commit/s]
169 origin/branch-1.2 100%|██████████████████████████████████| 2738/2738 [00:00<00:00, 8833.51 commit/s]
170 origin/branch-1.3 100%|██████████████████████████████████| 3296/3296 [00:00<00:00, 9746.93 commit/s]
171 origin/branch-1.4 100%|██████████████████████████████████| 3926/3926 [00:00<00:00, 9750.96 commit/s]
172 origin/branch-2 100%|████████████████████████████████████| 3325/3325 [00:00<00:00, 9688.14 commit/s]
173 origin/branch-2.0 100%|██████████████████████████████████| 2198/2198 [00:00<00:00, 8804.18 commit/s]
174 origin/branch-2.1 100%|██████████████████████████████████| 2749/2749 [00:00<00:00, 9328.67 commit/s]
175 origin/branch-2.2 100%|██████████████████████████████████| 2991/2991 [00:00<00:00, 9215.56 commit/s]
176 origin/branch-2.3 100%|██████████████████████████████████| 3312/3312 [00:00<00:00, 9063.19 commit/s]
177 fetch from Jira 100%|████████████████████████████████████████| 5850/5850 [10:40<00:00, 9.14 issue/s]
180 Optionally, the database can be built to include release tags, by specifying
181 `--parse-release-tags=true`. This is more time-consuming, but is necessary for
182 auditing discrepancies between git and Jira. Optionally, limit the branches
183 under consideration by specifying a regex filter with `--branch-filter-regexp`.
184 Running the same command but including this flag looks like this:
187 origin/branch-1 100%|███████████████████████████████████████| 4084/4084 [08:58<00:00, 7.59 commit/s]
188 origin/branch-1.0 100%|█████████████████████████████████████| 1433/1433 [03:54<00:00, 6.13 commit/s]
189 origin/branch-1.1 100%|█████████████████████████████████████| 2111/2111 [41:26<00:00, 0.85 commit/s]
190 origin/branch-1.2 100%|█████████████████████████████████████| 2738/2738 [07:10<00:00, 6.37 commit/s]
191 origin/branch-1.3 100%|██████████████████████████████████| 3296/3296 [2h 33:13<00:00, 0.36 commit/s]
192 origin/branch-1.4 100%|██████████████████████████████████| 3926/3926 [7h 22:41<00:00, 0.15 commit/s]
193 origin/branch-2 100%|████████████████████████████████████| 3325/3325 [2h 05:43<00:00, 0.44 commit/s]
194 origin/branch-2.0 100%|█████████████████████████████████████| 2198/2198 [52:18<00:00, 0.70 commit/s]
195 origin/branch-2.1 100%|█████████████████████████████████████| 2749/2749 [17:09<00:00, 2.67 commit/s]
196 origin/branch-2.2 100%|█████████████████████████████████████| 2991/2991 [52:15<00:00, 0.95 commit/s]
197 origin/branch-2.3 100%|████████████████████████████████████| 3312/3312 [05:08<00:00, 10.74 commit/s]
198 fetch from Jira 100%|████████████████████████████████████████| 5850/5850 [10:46<00:00, 9.06 issue/s]
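
To preview which branches a `--branch-filter-regexp` value would select before committing to a long run, a sketch like the following may help. It assumes the expression is applied to the remote-qualified branch name with `re.match` semantics (anchored at the start of the name); the script's actual matching rules may differ, and `filter_branches` is a hypothetical helper.

```python
import re


def filter_branches(branch_names, filter_regexp=".*"):
    """Keep the branch names that the filter expression matches from the start."""
    pattern = re.compile(filter_regexp)
    return [name for name in branch_names if pattern.match(name)]


branches = [
    "origin/branch-1.4",
    "origin/branch-2",
    "origin/branch-2.2",
    "origin/branch-2.3",
]

# Restrict parsing to the two most recent branch-2 release branches.
selected = filter_branches(branches, r"origin/branch-2\.[23]$")
print(selected)
```

With the default expression `.*`, every branch is kept, which matches the default shown in the help output.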
203 With a database populated with branch information, the built-in reports can be run.
206 `--report-new-for-release-line`
207 > Builds a report of the Jira issues that are new on the target release line,
208 not present on any of the associated release branches. (i.e., on branch-2 but
209 not branch-{2.0,2.1,...})
211 `--report-new-for-release-branch`
212 > Builds a report of the Jira issues that are new on the target release branch,
213 not present on any of the previous release branches. (i.e., on branch-2.3 but
214 not branch-{2.0,2.1,...})
216 Either way, the output is a CSV file containing a summary of each JIRA id found
217 matching the report criteria.
222 $ ./venv/bin/python3.7 ./git_jira_release_audit.py \
223 --populate-from-git=false \
224 --populate-from-jira=false \
225 --branch-1-fix-version=1.7.0 \
226 --branch-2-fix-version=2.4.0 \
227 --report-new-for-release-branch=origin/branch-2.3
228 INFO:git_jira_release_audit.py:retrieving 292 jira_ids from the issue tracker
229 INFO:git_jira_release_audit.py:generated report at new_for_origin-branch-2.3.csv
231 fetch from Jira 100%|████████████████████████████████████████| 292/292 [00:03<00:00, 114.01 issue/s]
232 $ head -n5 new_for_origin-branch-2.3.csv
233 key,issue_type,priority,summary,resolution,components
234 HBASE-21070,Bug,Critical,SnapshotFileCache won't update for snapshots stored in S3,Fixed,['snapshots']
235 HBASE-21773,Bug,Critical,rowcounter utility should respond to pleas for help,Fixed,['tooling']
236 HBASE-21505,Bug,Major,Several inconsistencies on information reported for Replication Sources by hbase shell status 'replication' command.,Fixed,['Replication']
237 HBASE-22057,Bug,Major,Impose upper-bound on size of ZK ops sent in a single multi(),Fixed,[]
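
A report in this shape can be post-processed with a few lines of Python. The sketch below assumes the column names shown in the header above and that the `components` column holds a Python-style list literal, as in the sample rows; `summarize_report` is a hypothetical helper, not part of the tool.

```python
import ast
import csv
import io
from collections import Counter


def summarize_report(csv_file):
    """Count issues per priority and per component in a report CSV."""
    priorities = Counter()
    components = Counter()
    for row in csv.DictReader(csv_file):
        priorities[row["priority"]] += 1
        # The components column looks like "['tooling']" or "[]".
        for component in ast.literal_eval(row["components"]):
            components[component] += 1
    return priorities, components


# Two rows copied from the sample output above, held in memory for the demo.
sample = io.StringIO(
    "key,issue_type,priority,summary,resolution,components\n"
    "HBASE-21773,Bug,Critical,rowcounter utility should respond to pleas for help,Fixed,['tooling']\n"
    "HBASE-22057,Bug,Major,Impose upper-bound on size of ZK ops sent in a single multi(),Fixed,[]\n"
)
priorities, components = summarize_report(sample)
print(priorities)
print(components)
```

To run it against a real report, pass an open file instead of the in-memory sample, for example `open("new_for_origin-branch-2.3.csv", newline="")`.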
240 ### Explore the Database
242 Once the database is populated, query it with sqlite:
246 SQLite version 3.24.0 2018-06-04 14:10:15
247 Enter ".help" for usage hints.
248 sqlite> -- count the number of distinct jira_ids on each release branch
249 sqlite> select count(distinct jira_id), branch from git_commits group by branch;
251 1189|origin/branch-1.0
252 1728|origin/branch-1.1
253 2289|origin/branch-1.2
254 2788|origin/branch-1.3
255 3289|origin/branch-1.4
257 1813|origin/branch-2.0
258 2327|origin/branch-2.1
259 2566|origin/branch-2.2
260 2839|origin/branch-2.3
262 sqlite> -- find the issues for which the git commit record and JIRA fixVersion disagree
263 sqlite> -- this query requires the database be built with --parse-release-tags
264 sqlite> select g.jira_id, g.git_tag, j.fix_version
266 inner join jira_versions j
267 on g.jira_id = j.jira_id
268 and g.branch = 'origin/branch-2.2'
269 and g.git_tag is not null
270 and j.fix_version like '2.2.%'
271 and g.git_tag != j.fix_version;
272 HBASE-22941|2.2.2|2.2.1
274 sqlite> -- show jira fixVersions for all issues on branch-2.3 but not on any earlier
275 sqlite> -- branch; i.e., issues that are missing a fixVersion or are marked for
276 sqlite> -- a release other than the expected (3.0.0, 2.3.0).
277 sqlite> -- this query requires the database be built with --parse-release-tags
278 sqlite> select jira_id, fix_version
281 SELECT distinct jira_id
283 WHERE branch = 'origin/branch-2.3'
284 EXCEPT SELECT distinct jira_id
287 SELECT distinct branch
289 WHERE branch != 'origin/branch-2.3'))
290 AND fix_version NOT IN ('3.0.0', '2.3.0')
296 HBASE-23032|connector-1.0.1
297 HBASE-23032|hbase-filesystem-1.0.0-alpha2
298 HBASE-23604|HBASE-18095
300 HBASE-23647|HBASE-18095
301 HBASE-23648|HBASE-18095
302 HBASE-23731|HBASE-18095
304 HBASE-23752|HBASE-18095
305 HBASE-23804|HBASE-18095
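
The same kind of query can be run from Python with the standard `sqlite3` module. The sketch below is an illustration under assumptions: it builds a throwaway in-memory database with only the columns referenced by the queries above (`jira_id`, `branch`, `git_tag`, `fix_version`), and `HBASE-99999` is an invented row added only to show a non-matching case. The real `audit.db` schema may contain more columns.

```python
import sqlite3

# Throwaway database with an assumed, reduced schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE git_commits (jira_id TEXT, branch TEXT, git_tag TEXT);
    CREATE TABLE jira_versions (jira_id TEXT, fix_version TEXT);
""")
conn.executemany("INSERT INTO git_commits VALUES (?, ?, ?)", [
    ("HBASE-22941", "origin/branch-2.2", "2.2.2"),  # tag and fixVersion disagree
    ("HBASE-99999", "origin/branch-2.2", "2.2.1"),  # invented row: they agree
])
conn.executemany("INSERT INTO jira_versions VALUES (?, ?)", [
    ("HBASE-22941", "2.2.1"),
    ("HBASE-99999", "2.2.1"),
])


def find_tag_mismatches(conn, branch, version_prefix):
    """Return (jira_id, git_tag, fix_version) rows where git and Jira disagree."""
    return conn.execute(
        """
        SELECT g.jira_id, g.git_tag, j.fix_version
        FROM git_commits g
        INNER JOIN jira_versions j ON g.jira_id = j.jira_id
        WHERE g.branch = ?
          AND g.git_tag IS NOT NULL
          AND j.fix_version LIKE ?
          AND g.git_tag != j.fix_version
        ORDER BY g.jira_id
        """,
        (branch, version_prefix + "%"),
    ).fetchall()


mismatches = find_tag_mismatches(conn, "origin/branch-2.2", "2.2.")
print(mismatches)
```

Against a real database, replace the in-memory connection with `sqlite3.connect("audit.db")` and skip the table setup. As noted in the session above, the `git_tag` column is only populated when the database was built with `--parse-release-tags`.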