Autogenerated HTML docs for v2.41.0-191-g6ff334
1 <?xml version="1.0" encoding="UTF-8"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
3 "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
4 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
5 <head>
6 <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
7 <meta name="generator" content="AsciiDoc 10.2.0" />
8 <title>gitformat-pack(5)</title>
9 <style type="text/css">
10 /* Shared CSS for AsciiDoc xhtml11 and html5 backends */
12 /* Default font. */
13 body {
14 font-family: Georgia,serif;
17 /* Title font. */
18 h1, h2, h3, h4, h5, h6,
19 div.title, caption.title,
20 thead, p.table.header,
21 #toctitle,
22 #author, #revnumber, #revdate, #revremark,
23 #footer {
24 font-family: Arial,Helvetica,sans-serif;
27 body {
28 margin: 1em 5% 1em 5%;
31 a {
32 color: blue;
33 text-decoration: underline;
35 a:visited {
36 color: fuchsia;
39 em {
40 font-style: italic;
41 color: navy;
44 strong {
45 font-weight: bold;
46 color: #083194;
49 h1, h2, h3, h4, h5, h6 {
50 color: #527bbd;
51 margin-top: 1.2em;
52 margin-bottom: 0.5em;
53 line-height: 1.3;
56 h1, h2, h3 {
57 border-bottom: 2px solid silver;
59 h2 {
60 padding-top: 0.5em;
62 h3 {
63 float: left;
65 h3 + * {
66 clear: left;
68 h5 {
69 font-size: 1.0em;
72 div.sectionbody {
73 margin-left: 0;
76 hr {
77 border: 1px solid silver;
80 p {
81 margin-top: 0.5em;
82 margin-bottom: 0.5em;
85 ul, ol, li > p {
86 margin-top: 0;
88 ul > li { color: #aaa; }
89 ul > li > * { color: black; }
91 .monospaced, code, pre {
92 font-family: "Courier New", Courier, monospace;
93 font-size: inherit;
94 color: navy;
95 padding: 0;
96 margin: 0;
98 pre {
99 white-space: pre-wrap;
102 #author {
103 color: #527bbd;
104 font-weight: bold;
105 font-size: 1.1em;
107 #email {
109 #revnumber, #revdate, #revremark {
112 #footer {
113 font-size: small;
114 border-top: 2px solid silver;
115 padding-top: 0.5em;
116 margin-top: 4.0em;
118 #footer-text {
119 float: left;
120 padding-bottom: 0.5em;
122 #footer-badges {
123 float: right;
124 padding-bottom: 0.5em;
127 #preamble {
128 margin-top: 1.5em;
129 margin-bottom: 1.5em;
131 div.imageblock, div.exampleblock, div.verseblock,
132 div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock,
133 div.admonitionblock {
134 margin-top: 1.0em;
135 margin-bottom: 1.5em;
137 div.admonitionblock {
138 margin-top: 2.0em;
139 margin-bottom: 2.0em;
140 margin-right: 10%;
141 color: #606060;
144 div.content { /* Block element content. */
145 padding: 0;
148 /* Block element titles. */
149 div.title, caption.title {
150 color: #527bbd;
151 font-weight: bold;
152 text-align: left;
153 margin-top: 1.0em;
154 margin-bottom: 0.5em;
156 div.title + * {
157 margin-top: 0;
160 td div.title:first-child {
161 margin-top: 0.0em;
163 div.content div.title:first-child {
164 margin-top: 0.0em;
166 div.content + div.title {
167 margin-top: 0.0em;
170 div.sidebarblock > div.content {
171 background: #ffffee;
172 border: 1px solid #dddddd;
173 border-left: 4px solid #f0f0f0;
174 padding: 0.5em;
177 div.listingblock > div.content {
178 border: 1px solid #dddddd;
179 border-left: 5px solid #f0f0f0;
180 background: #f8f8f8;
181 padding: 0.5em;
184 div.quoteblock, div.verseblock {
185 padding-left: 1.0em;
186 margin-left: 1.0em;
187 margin-right: 10%;
188 border-left: 5px solid #f0f0f0;
189 color: #888;
192 div.quoteblock > div.attribution {
193 padding-top: 0.5em;
194 text-align: right;
197 div.verseblock > pre.content {
198 font-family: inherit;
199 font-size: inherit;
201 div.verseblock > div.attribution {
202 padding-top: 0.75em;
203 text-align: left;
205 /* DEPRECATED: Pre version 8.2.7 verse style literal block. */
206 div.verseblock + div.attribution {
207 text-align: left;
210 div.admonitionblock .icon {
211 vertical-align: top;
212 font-size: 1.1em;
213 font-weight: bold;
214 text-decoration: underline;
215 color: #527bbd;
216 padding-right: 0.5em;
218 div.admonitionblock td.content {
219 padding-left: 0.5em;
220 border-left: 3px solid #dddddd;
223 div.exampleblock > div.content {
224 border-left: 3px solid #dddddd;
225 padding-left: 0.5em;
228 div.imageblock div.content { padding-left: 0; }
229 span.image img { border-style: none; vertical-align: text-bottom; }
230 a.image:visited { color: white; }
232 dl {
233 margin-top: 0.8em;
234 margin-bottom: 0.8em;
236 dt {
237 margin-top: 0.5em;
238 margin-bottom: 0;
239 font-style: normal;
240 color: navy;
242 dd > *:first-child {
243 margin-top: 0.1em;
246 ul, ol {
247 list-style-position: outside;
249 ol.arabic {
250 list-style-type: decimal;
252 ol.loweralpha {
253 list-style-type: lower-alpha;
255 ol.upperalpha {
256 list-style-type: upper-alpha;
258 ol.lowerroman {
259 list-style-type: lower-roman;
261 ol.upperroman {
262 list-style-type: upper-roman;
265 div.compact ul, div.compact ol,
266 div.compact p, div.compact p,
267 div.compact div, div.compact div {
268 margin-top: 0.1em;
269 margin-bottom: 0.1em;
272 tfoot {
273 font-weight: bold;
275 td > div.verse {
276 white-space: pre;
279 div.hdlist {
280 margin-top: 0.8em;
281 margin-bottom: 0.8em;
283 div.hdlist tr {
284 padding-bottom: 15px;
286 dt.hdlist1.strong, td.hdlist1.strong {
287 font-weight: bold;
289 td.hdlist1 {
290 vertical-align: top;
291 font-style: normal;
292 padding-right: 0.8em;
293 color: navy;
295 td.hdlist2 {
296 vertical-align: top;
298 div.hdlist.compact tr {
299 margin: 0;
300 padding-bottom: 0;
303 .comment {
304 background: yellow;
307 .footnote, .footnoteref {
308 font-size: 0.8em;
311 span.footnote, span.footnoteref {
312 vertical-align: super;
315 #footnotes {
316 margin: 20px 0 20px 0;
317 padding: 7px 0 0 0;
320 #footnotes div.footnote {
321 margin: 0 0 5px 0;
324 #footnotes hr {
325 border: none;
326 border-top: 1px solid silver;
327 height: 1px;
328 text-align: left;
329 margin-left: 0;
330 width: 20%;
331 min-width: 100px;
334 div.colist td {
335 padding-right: 0.5em;
336 padding-bottom: 0.3em;
337 vertical-align: top;
339 div.colist td img {
340 margin-top: 0.3em;
343 @media print {
344 #footer-badges { display: none; }
347 #toc {
348 margin-bottom: 2.5em;
351 #toctitle {
352 color: #527bbd;
353 font-size: 1.1em;
354 font-weight: bold;
355 margin-top: 1.0em;
356 margin-bottom: 0.1em;
359 div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 {
360 margin-top: 0;
361 margin-bottom: 0;
363 div.toclevel2 {
364 margin-left: 2em;
365 font-size: 0.9em;
367 div.toclevel3 {
368 margin-left: 4em;
369 font-size: 0.9em;
371 div.toclevel4 {
372 margin-left: 6em;
373 font-size: 0.9em;
376 span.aqua { color: aqua; }
377 span.black { color: black; }
378 span.blue { color: blue; }
379 span.fuchsia { color: fuchsia; }
380 span.gray { color: gray; }
381 span.green { color: green; }
382 span.lime { color: lime; }
383 span.maroon { color: maroon; }
384 span.navy { color: navy; }
385 span.olive { color: olive; }
386 span.purple { color: purple; }
387 span.red { color: red; }
388 span.silver { color: silver; }
389 span.teal { color: teal; }
390 span.white { color: white; }
391 span.yellow { color: yellow; }
393 span.aqua-background { background: aqua; }
394 span.black-background { background: black; }
395 span.blue-background { background: blue; }
396 span.fuchsia-background { background: fuchsia; }
397 span.gray-background { background: gray; }
398 span.green-background { background: green; }
399 span.lime-background { background: lime; }
400 span.maroon-background { background: maroon; }
401 span.navy-background { background: navy; }
402 span.olive-background { background: olive; }
403 span.purple-background { background: purple; }
404 span.red-background { background: red; }
405 span.silver-background { background: silver; }
406 span.teal-background { background: teal; }
407 span.white-background { background: white; }
408 span.yellow-background { background: yellow; }
410 span.big { font-size: 2em; }
411 span.small { font-size: 0.6em; }
413 span.underline { text-decoration: underline; }
414 span.overline { text-decoration: overline; }
415 span.line-through { text-decoration: line-through; }
417 div.unbreakable { page-break-inside: avoid; }
421 * xhtml11 specific
423 * */
425 div.tableblock {
426 margin-top: 1.0em;
427 margin-bottom: 1.5em;
429 div.tableblock > table {
430 border: 3px solid #527bbd;
432 thead, p.table.header {
433 font-weight: bold;
434 color: #527bbd;
436 p.table {
437 margin-top: 0;
439 /* Because the table frame attribute is overridden by CSS in most browsers. */
440 div.tableblock > table[frame="void"] {
441 border-style: none;
443 div.tableblock > table[frame="hsides"] {
444 border-left-style: none;
445 border-right-style: none;
447 div.tableblock > table[frame="vsides"] {
448 border-top-style: none;
449 border-bottom-style: none;
454 * html5 specific
456 * */
458 table.tableblock {
459 margin-top: 1.0em;
460 margin-bottom: 1.5em;
462 thead, p.tableblock.header {
463 font-weight: bold;
464 color: #527bbd;
466 p.tableblock {
467 margin-top: 0;
469 table.tableblock {
470 border-width: 3px;
471 border-spacing: 0px;
472 border-style: solid;
473 border-color: #527bbd;
474 border-collapse: collapse;
476 th.tableblock, td.tableblock {
477 border-width: 1px;
478 padding: 4px;
479 border-style: solid;
480 border-color: #527bbd;
483 table.tableblock.frame-topbot {
484 border-left-style: hidden;
485 border-right-style: hidden;
487 table.tableblock.frame-sides {
488 border-top-style: hidden;
489 border-bottom-style: hidden;
491 table.tableblock.frame-none {
492 border-style: hidden;
495 th.tableblock.halign-left, td.tableblock.halign-left {
496 text-align: left;
498 th.tableblock.halign-center, td.tableblock.halign-center {
499 text-align: center;
501 th.tableblock.halign-right, td.tableblock.halign-right {
502 text-align: right;
505 th.tableblock.valign-top, td.tableblock.valign-top {
506 vertical-align: top;
508 th.tableblock.valign-middle, td.tableblock.valign-middle {
509 vertical-align: middle;
511 th.tableblock.valign-bottom, td.tableblock.valign-bottom {
512 vertical-align: bottom;
517 * manpage specific
519 * */
521 body.manpage h1 {
522 padding-top: 0.5em;
523 padding-bottom: 0.5em;
524 border-top: 2px solid silver;
525 border-bottom: 2px solid silver;
527 body.manpage h2 {
528 border-style: none;
530 body.manpage div.sectionbody {
531 margin-left: 3em;
534 @media print {
535 body.manpage div#toc { display: none; }
539 </style>
540 <script type="text/javascript">
541 /*<![CDATA[*/
542 var asciidoc = { // Namespace.
544 /////////////////////////////////////////////////////////////////////
545 // Table Of Contents generator
546 /////////////////////////////////////////////////////////////////////
548 /* Author: Mihai Bazon, September 2002
549 * http://students.infoiasi.ro/~mishoo
551 * Table Of Content generator
552 * Version: 0.4
554 * Feel free to use this script under the terms of the GNU General Public
555 * License, as long as you do not remove or alter this notice.
558 /* modified by Troy D. Hanson, September 2006. License: GPL */
559 /* modified by Stuart Rackham, 2006, 2009. License: GPL */
561 // toclevels = 1..4.
562 toc: function (toclevels) {
564 function getText(el) {
565 var text = "";
566 for (var i = el.firstChild; i != null; i = i.nextSibling) {
567 if (i.nodeType == 3 /* Node.TEXT_NODE */) // IE doesn't speak constants.
568 text += i.data;
569 else if (i.firstChild != null)
570 text += getText(i);
572 return text;
575 function TocEntry(el, text, toclevel) {
576 this.element = el;
577 this.text = text;
578 this.toclevel = toclevel;
581 function tocEntries(el, toclevels) {
582 var result = new Array;
583 var re = new RegExp('[hH]([1-'+(toclevels+1)+'])');
584 // Function that scans the DOM tree for header elements (the DOM2
585 // nodeIterator API would be a better technique but not supported by all
586 // browsers).
587 var iterate = function (el) {
588 for (var i = el.firstChild; i != null; i = i.nextSibling) {
589 if (i.nodeType == 1 /* Node.ELEMENT_NODE */) {
590 var mo = re.exec(i.tagName);
591 if (mo && (i.getAttribute("class") || i.getAttribute("className")) != "float") {
592 result[result.length] = new TocEntry(i, getText(i), mo[1]-1);
594 iterate(i);
598 iterate(el);
599 return result;
602 var toc = document.getElementById("toc");
603 if (!toc) {
604 return;
607 // Delete existing TOC entries in case we're reloading the TOC.
608 var tocEntriesToRemove = [];
609 var i;
610 for (i = 0; i < toc.childNodes.length; i++) {
611 var entry = toc.childNodes[i];
612 if (entry.nodeName.toLowerCase() == 'div'
613 && entry.getAttribute("class")
614 && entry.getAttribute("class").match(/^toclevel/))
615 tocEntriesToRemove.push(entry);
617 for (i = 0; i < tocEntriesToRemove.length; i++) {
618 toc.removeChild(tocEntriesToRemove[i]);
621 // Rebuild TOC entries.
622 var entries = tocEntries(document.getElementById("content"), toclevels);
623 for (var i = 0; i < entries.length; ++i) {
624 var entry = entries[i];
625 if (entry.element.id == "")
626 entry.element.id = "_toc_" + i;
627 var a = document.createElement("a");
628 a.href = "#" + entry.element.id;
629 a.appendChild(document.createTextNode(entry.text));
630 var div = document.createElement("div");
631 div.appendChild(a);
632 div.className = "toclevel" + entry.toclevel;
633 toc.appendChild(div);
635 if (entries.length == 0)
636 toc.parentNode.removeChild(toc);
640 /////////////////////////////////////////////////////////////////////
641 // Footnotes generator
642 /////////////////////////////////////////////////////////////////////
644 /* Based on footnote generation code from:
645 * http://www.brandspankingnew.net/archive/2005/07/format_footnote.html
648 footnotes: function () {
// Delete existing footnote entries in case we're reloading the footnotes.
650 var i;
651 var noteholder = document.getElementById("footnotes");
652 if (!noteholder) {
653 return;
655 var entriesToRemove = [];
656 for (i = 0; i < noteholder.childNodes.length; i++) {
657 var entry = noteholder.childNodes[i];
658 if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote")
659 entriesToRemove.push(entry);
661 for (i = 0; i < entriesToRemove.length; i++) {
662 noteholder.removeChild(entriesToRemove[i]);
665 // Rebuild footnote entries.
666 var cont = document.getElementById("content");
667 var spans = cont.getElementsByTagName("span");
668 var refs = {};
669 var n = 0;
670 for (i=0; i<spans.length; i++) {
671 if (spans[i].className == "footnote") {
672 n++;
673 var note = spans[i].getAttribute("data-note");
674 if (!note) {
675 // Use [\s\S] in place of . so multi-line matches work.
676 // Because JavaScript has no s (dotall) regex flag.
677 note = spans[i].innerHTML.match(/\s*\[([\s\S]*)]\s*/)[1];
678 spans[i].innerHTML =
679 "[<a id='_footnoteref_" + n + "' href='#_footnote_" + n +
680 "' title='View footnote' class='footnote'>" + n + "</a>]";
681 spans[i].setAttribute("data-note", note);
683 noteholder.innerHTML +=
684 "<div class='footnote' id='_footnote_" + n + "'>" +
685 "<a href='#_footnoteref_" + n + "' title='Return to text'>" +
686 n + "</a>. " + note + "</div>";
687 var id =spans[i].getAttribute("id");
688 if (id != null) refs["#"+id] = n;
691 if (n == 0)
692 noteholder.parentNode.removeChild(noteholder);
693 else {
694 // Process footnoterefs.
695 for (i=0; i<spans.length; i++) {
696 if (spans[i].className == "footnoteref") {
697 var href = spans[i].getElementsByTagName("a")[0].getAttribute("href");
698 href = href.match(/#.*/)[0]; // Because IE return full URL.
699 n = refs[href];
700 spans[i].innerHTML =
701 "[<a href='#_footnote_" + n +
702 "' title='View footnote' class='footnote'>" + n + "</a>]";
708 install: function(toclevels) {
709 var timerId;
711 function reinstall() {
712 asciidoc.footnotes();
713 if (toclevels) {
714 asciidoc.toc(toclevels);
718 function reinstallAndRemoveTimer() {
719 clearInterval(timerId);
720 reinstall();
723 timerId = setInterval(reinstall, 500);
724 if (document.addEventListener)
725 document.addEventListener("DOMContentLoaded", reinstallAndRemoveTimer, false);
726 else
727 window.onload = reinstallAndRemoveTimer;
731 asciidoc.install();
732 /*]]>*/
733 </script>
734 </head>
735 <body class="manpage">
736 <div id="header">
737 <h1>
738 gitformat-pack(5) Manual Page
739 </h1>
740 <h2>NAME</h2>
741 <div class="sectionbody">
742 <p>gitformat-pack -
743 Git pack format
744 </p>
745 </div>
746 </div>
747 <div id="content">
748 <div class="sect1">
749 <h2 id="_synopsis">SYNOPSIS</h2>
750 <div class="sectionbody">
751 <div class="verseblock">
<pre class="content">$GIT_DIR/objects/pack/pack-*.{pack,idx}
$GIT_DIR/objects/pack/pack-*.rev
$GIT_DIR/objects/pack/pack-*.mtimes
$GIT_DIR/objects/pack/multi-pack-index</pre>
756 <div class="attribution">
757 </div></div>
758 </div>
759 </div>
760 <div class="sect1">
761 <h2 id="_description">DESCRIPTION</h2>
762 <div class="sectionbody">
<div class="paragraph"><p>The Git pack format is how Git stores most of its primary repository
data. Over the lifetime of a repository, loose objects (if any) and
smaller packs are consolidated into larger pack(s). See
<a href="git-gc.html">git-gc(1)</a> and <a href="git-pack-objects.html">git-pack-objects(1)</a>.</p></div>
767 <div class="paragraph"><p>The pack format is also used over-the-wire, see
768 e.g. <a href="gitprotocol-v2.html">gitprotocol-v2(5)</a>, as well as being a part of
769 other container formats in the case of <a href="gitformat-bundle.html">gitformat-bundle(5)</a>.</p></div>
770 </div>
771 </div>
772 <div class="sect1">
773 <h2 id="_checksums_and_object_ids">Checksums and object IDs</h2>
774 <div class="sectionbody">
775 <div class="paragraph"><p>In a repository using the traditional SHA-1, pack checksums, index checksums,
776 and object IDs (object names) mentioned below are all computed using SHA-1.
777 Similarly, in SHA-256 repositories, these values are computed using SHA-256.</p></div>
778 </div>
779 </div>
780 <div class="sect1">
781 <h2 id="_pack_pack_files_have_the_following_format">pack-*.pack files have the following format:</h2>
782 <div class="sectionbody">
783 <div class="ulist"><ul>
<li>
<p>
A header appears at the beginning and consists of the following:
</p>
788 <div class="literalblock">
789 <div class="content">
790 <pre><code>4-byte signature:
791 The signature is: {'P', 'A', 'C', 'K'}</code></pre>
792 </div></div>
793 <div class="literalblock">
794 <div class="content">
795 <pre><code>4-byte version number (network byte order):
796 Git currently accepts version number 2 or 3 but
797 generates version 2 only.</code></pre>
798 </div></div>
799 <div class="literalblock">
800 <div class="content">
801 <pre><code>4-byte number of objects contained in the pack (network byte order)</code></pre>
802 </div></div>
803 <div class="literalblock">
804 <div class="content">
805 <pre><code>Observation: we cannot have more than 4G versions ;-) and
806 more than 4G objects in a pack.</code></pre>
807 </div></div>
808 </li>
<li>
<p>
The header is followed by a number of object entries, each of
which looks like this:
</p>
814 <div class="literalblock">
815 <div class="content">
816 <pre><code>(undeltified representation)
817 n-byte type and length (3-bit type, (n-1)*7+4-bit length)
818 compressed data</code></pre>
819 </div></div>
820 <div class="literalblock">
821 <div class="content">
822 <pre><code>(deltified representation)
823 n-byte type and length (3-bit type, (n-1)*7+4-bit length)
824 base object name if OBJ_REF_DELTA or a negative relative
825 offset from the delta object's position in the pack if this
826 is an OBJ_OFS_DELTA object
827 compressed delta data</code></pre>
828 </div></div>
829 <div class="literalblock">
830 <div class="content">
831 <pre><code>Observation: length of each object is encoded in a variable
832 length format and is not constrained to 32-bit or anything.</code></pre>
833 </div></div>
834 </li>
<li>
<p>
The trailer records a pack checksum of all of the above.
</p>
839 </li>
840 </ul></div>
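<div class="paragraph"><p>The header and trailer described above can be checked with a few lines of
Python. This is only a sketch under stated assumptions: the function names
<code>parse_pack_header</code> and <code>trailer_matches</code> are made up for this example,
and the sample pack is a hand-built one with zero objects, not output from Git.</p></div>
<div class="listingblock">
<div class="content">
<pre><code><![CDATA[
```python
import hashlib
import struct


def parse_pack_header(data: bytes) -> tuple[int, int]:
    """Return (version, object_count) from the 12-byte pack header."""
    if len(data) < 12:
        raise ValueError("a pack header is 12 bytes long")
    # 4-byte signature, then two 4-byte integers in network byte order.
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("bad pack signature")
    if version not in (2, 3):
        raise ValueError("unsupported pack version")
    return version, count


def trailer_matches(pack: bytes, hash_name: str = "sha1") -> bool:
    """Check that the trailer is the checksum of everything before it."""
    digest_size = hashlib.new(hash_name).digest_size
    body, trailer = pack[:-digest_size], pack[-digest_size:]
    return hashlib.new(hash_name, body).digest() == trailer


# Hypothetical pack with no objects: header followed directly by the trailer.
header = b"PACK" + struct.pack(">II", 2, 0)
empty_pack = header + hashlib.sha1(header).digest()
print(parse_pack_header(empty_pack), trailer_matches(empty_pack))
```
]]></code></pre>
</div></div>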
841 <div class="sect2">
842 <h3 id="_object_types">Object types</h3>
843 <div class="paragraph"><p>Valid object types are:</p></div>
844 <div class="ulist"><ul>
<li>
<p>
OBJ_COMMIT (1)
</p>
</li>
<li>
<p>
OBJ_TREE (2)
</p>
</li>
<li>
<p>
OBJ_BLOB (3)
</p>
</li>
<li>
<p>
OBJ_TAG (4)
</p>
</li>
<li>
<p>
OBJ_OFS_DELTA (6)
</p>
</li>
<li>
<p>
OBJ_REF_DELTA (7)
</p>
</li>
875 </ul></div>
876 <div class="paragraph"><p>Type 5 is reserved for future expansion. Type 0 is invalid.</p></div>
877 </div>
878 <div class="sect2">
879 <h3 id="_size_encoding">Size encoding</h3>
880 <div class="paragraph"><p>This document uses the following "size encoding" of non-negative
881 integers: From each byte, the seven least significant bits are
882 used to form the resulting integer. As long as the most significant
883 bit is 1, this process continues; the byte with MSB 0 provides the
884 last seven bits. The seven-bit chunks are concatenated. Later
885 values are more significant.</p></div>
886 <div class="paragraph"><p>This size encoding should not be confused with the "offset encoding",
887 which is also used in this document.</p></div>
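<div class="paragraph"><p>As an illustration of the size encoding, the following sketch decodes and
encodes such integers. The names <code>decode_size</code> and <code>encode_size</code> are
hypothetical helpers chosen for this example; they are not taken from Git.</p></div>
<div class="listingblock">
<div class="content">
<pre><code><![CDATA[
```python
def decode_size(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one size-encoded integer; return (value, next_position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # later bytes are more significant
        shift += 7
        if not byte & 0x80:              # MSB 0 marks the last byte
            return value, pos


def encode_size(value: int) -> bytes:
    """Encode a non-negative integer, seven bits per byte, low bits first."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)      # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# 300 = 0b10_0101100: low chunk 0101100 with MSB set, then chunk 10.
print(encode_size(300).hex(), decode_size(b"\xac\x02"))
```
]]></code></pre>
</div></div>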
888 </div>
889 <div class="sect2">
890 <h3 id="_deltified_representation">Deltified representation</h3>
<div class="paragraph"><p>Conceptually there are only four object types: commit, tree, tag and
blob. However, to save space, an object could be stored as a "delta" of
another "base" object. These representations are assigned new types
ofs-delta and ref-delta, which are only valid in a pack file.</p></div>
<div class="paragraph"><p>Both ofs-delta and ref-delta store the "delta" to be applied to
another object (called <em>base object</em>) to reconstruct the object. The
difference between them is that ref-delta directly encodes the base object
name. If the base object is in the same pack, ofs-delta encodes
the offset of the base object in the pack instead.</p></div>
<div class="paragraph"><p>The base object could also be deltified if it&#8217;s in the same pack.
Ref-delta can also refer to an object outside the pack (i.e. the
so-called "thin pack"). When stored on disk, however, the pack should
be self-contained to avoid cyclic dependency.</p></div>
<div class="paragraph"><p>The delta data starts with the size of the base object and the
size of the object to be reconstructed. These sizes are
encoded using the size encoding from above. The remainder of
the delta data is a sequence of instructions to reconstruct the object
from the base object. If the base object is deltified, it must be
converted to canonical form first. Each instruction appends more and
more data to the target object until it&#8217;s complete. There are two
supported instructions so far: one for copying a byte range from the
source object and one for inserting new data embedded in the
instruction itself.</p></div>
<div class="paragraph"><p>Each instruction has variable length. The instruction type is determined
by the most significant bit (bit 7, counting from zero) of the first
octet. The following diagrams follow the convention in RFC 1951
(Deflate compressed data format).</p></div>
917 <div class="sect3">
918 <h4 id="_instruction_to_copy_from_base_object">Instruction to copy from base object</h4>
919 <div class="literalblock">
920 <div class="content">
921 <pre><code>+----------+---------+---------+---------+---------+-------+-------+-------+
922 | 1xxxxxxx | offset1 | offset2 | offset3 | offset4 | size1 | size2 | size3 |
923 +----------+---------+---------+---------+---------+-------+-------+-------+</code></pre>
924 </div></div>
925 <div class="paragraph"><p>This is the instruction format to copy a byte range from the source
926 object. It encodes the offset to copy from and the number of bytes to
927 copy. Offset and size are in little-endian order.</p></div>
<div class="paragraph"><p>All offset and size bytes are optional. This is to reduce the
instruction size when encoding small offsets or sizes. The lower seven
bits of the first octet determine which of the next seven octets are
present. If bit zero is set, offset1 is present. If bit one is set,
offset2 is present, and so on.</p></div>
<div class="paragraph"><p>Note that a more compact instruction does not change offset and size
encoding. For example, if only offset2 is omitted like below, offset3
still contains bits 16-23. It does not become offset2 and contain
bits 8-15 even if it&#8217;s right next to offset1.</p></div>
937 <div class="literalblock">
938 <div class="content">
939 <pre><code>+----------+---------+---------+
940 | 10000101 | offset1 | offset3 |
941 +----------+---------+---------+</code></pre>
942 </div></div>
<div class="paragraph"><p>In its most compact form, this instruction takes up only one byte
(0x80), with both offset and size omitted; omitted bytes default to
zero. There is one exception: a size of zero is automatically
converted to 0x10000.</p></div>
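<div class="paragraph"><p>To make the optional bytes concrete, here is a small encoder sketch for the
copy instruction. The function name <code>encode_copy</code> is an assumption of this
example, and choosing the one-byte form for a size of 0x10000 is a design
choice of the sketch (it relies on the zero-size rule above), not something
the format requires.</p></div>
<div class="listingblock">
<div class="content">
<pre><code><![CDATA[
```python
def encode_copy(offset: int, size: int) -> bytes:
    """Encode a copy instruction, omitting every zero offset or size byte."""
    if not 0 <= offset < 1 << 32:
        raise ValueError("offset must fit in four bytes")
    if not 0 < size < 1 << 24:
        raise ValueError("size must be non-zero and fit in three bytes")
    if size == 0x10000:
        size = 0                       # a size of zero is read back as 0x10000
    first = 0x80                       # MSB set marks a copy instruction
    tail = bytearray()
    for i in range(4):                 # offset1..offset4 use bits 0..3
        byte = (offset >> (8 * i)) & 0xFF
        if byte:
            first |= 1 << i
            tail.append(byte)
    for i in range(3):                 # size1..size3 use bits 4..6
        byte = (size >> (8 * i)) & 0xFF
        if byte:
            first |= 1 << (4 + i)
            tail.append(byte)
    return bytes([first]) + bytes(tail)


# Offset 0x120034 has a zero second byte, so only offset1 and offset3 are
# emitted and the first octet is 10000101, as in the diagram above.
print(encode_copy(0x120034, 0x10000).hex())
```
]]></code></pre>
</div></div>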
947 </div>
948 <div class="sect3">
949 <h4 id="_instruction_to_add_new_data">Instruction to add new data</h4>
950 <div class="literalblock">
951 <div class="content">
952 <pre><code>+----------+============+
953 | 0xxxxxxx | data |
954 +----------+============+</code></pre>
955 </div></div>
<div class="paragraph"><p>This is the instruction to construct the target object without the base
object. The data that follows is appended to the target object. The lower
seven bits of the first octet determine the size of the data in
bytes. The size must be non-zero.</p></div>
960 </div>
961 <div class="sect3">
962 <h4 id="_reserved_instruction">Reserved instruction</h4>
963 <div class="literalblock">
964 <div class="content">
965 <pre><code>+----------+============
966 | 00000000 |
967 +----------+============</code></pre>
968 </div></div>
969 <div class="paragraph"><p>This is the instruction reserved for future expansion.</p></div>
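<div class="paragraph"><p>Putting the three instruction forms together, a delta could be applied
roughly as follows. This is a sketch, not Git's implementation: the names
<code>read_size</code> and <code>apply_delta</code> are hypothetical, the delta is assumed
to be already inflated, and the sample delta is written by hand.</p></div>
<div class="listingblock">
<div class="content">
<pre><code><![CDATA[
```python
def read_size(buf: bytes, pos: int) -> tuple[int, int]:
    """Read one size-encoded integer; return (value, next_position)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Rebuild the target object from a base object and inflated delta data."""
    base_size, pos = read_size(delta, 0)
    target_size, pos = read_size(delta, pos)
    if base_size != len(base):
        raise ValueError("base size mismatch")
    out = bytearray()
    while pos < len(delta):
        op = delta[pos]
        pos += 1
        if op & 0x80:                      # copy a byte range from the base
            offset = size = 0
            for i in range(4):             # bits 0..3 select offset1..offset4
                if op & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):             # bits 4..6 select size1..size3
                if op & (1 << (4 + i)):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            if size == 0:
                size = 0x10000
            if offset + size > len(base):
                raise ValueError("copy range outside the base object")
            out += base[offset:offset + size]
        elif op:                           # append `op` bytes of new data
            out += delta[pos:pos + op]
            pos += op
        else:
            raise ValueError("reserved instruction 0x00")
    if len(out) != target_size:
        raise ValueError("target size mismatch")
    return bytes(out)


# Hand-written delta: copy "hello", insert ", git", copy " world".
base = b"hello world"
delta = bytes([0x0B, 0x10, 0x90, 0x05, 0x05]) + b", git" + bytes([0x91, 0x05, 0x06])
print(apply_delta(base, delta))
```
]]></code></pre>
</div></div>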
970 </div>
971 </div>
972 </div>
973 </div>
974 <div class="sect1">
975 <h2 id="_original_version_1_pack_idx_files_have_the_following_format">Original (version 1) pack-*.idx files have the following format:</h2>
976 <div class="sectionbody">
977 <div class="ulist"><ul>
<li>
<p>
The header consists of 256 4-byte network byte order
integers. The N-th entry of this table records the number of
objects in the corresponding pack, the first byte of whose
object name is less than or equal to N. This is called the
<em>first-level fan-out</em> table.
</p>
</li>
<li>
<p>
The header is followed by sorted 24-byte entries, one entry
per object in the pack. Each entry is:
</p>
992 <div class="literalblock">
993 <div class="content">
994 <pre><code>4-byte network byte order integer, recording where the
995 object is stored in the packfile as the offset from the
996 beginning.</code></pre>
997 </div></div>
998 <div class="literalblock">
999 <div class="content">
1000 <pre><code>one object name of the appropriate size.</code></pre>
1001 </div></div>
1002 </li>
<li>
<p>
The file is concluded with a trailer:
</p>
1007 <div class="literalblock">
1008 <div class="content">
1009 <pre><code>A copy of the pack checksum at the end of the corresponding
1010 packfile.</code></pre>
1011 </div></div>
1012 <div class="literalblock">
1013 <div class="content">
1014 <pre><code>Index checksum of all of the above.</code></pre>
1015 </div></div>
1016 </li>
1017 </ul></div>
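<div class="paragraph"><p>The fan-out table narrows a lookup to the entries that share the first byte
of the object name, after which a binary search finishes the job. The sketch
below assumes a version 1 index held in memory; <code>find_offset_v1</code> and the
two-object sample index are invented for this example.</p></div>
<div class="listingblock">
<div class="content">
<pre><code><![CDATA[
```python
import struct
from typing import Optional


def find_offset_v1(idx: bytes, name: bytes) -> Optional[int]:
    """Return the pack offset of object `name` in a v1 index, or None."""
    fanout = struct.unpack(">256I", idx[:1024])
    first = name[0]
    lo = fanout[first - 1] if first else 0   # entries with a smaller first byte
    hi = fanout[first]                       # entries with first byte <= first
    entry_size = 4 + len(name)               # 24 bytes when names are SHA-1
    while lo < hi:
        mid = (lo + hi) // 2
        start = 1024 + mid * entry_size
        (offset,) = struct.unpack(">I", idx[start:start + 4])
        entry_name = idx[start + 4:start + entry_size]
        if entry_name == name:
            return offset
        if entry_name < name:
            lo = mid + 1
        else:
            hi = mid
    return None


# Hypothetical index with two objects (the trailer is left out for brevity).
names = [bytes([0x00]) + b"\x11" * 19, bytes([0x01]) + b"\x22" * 19]
pack_offsets = [12, 345]
fanout = [sum(1 for n in names if n[0] <= i) for i in range(256)]
idx = struct.pack(">256I", *fanout)
for offset, name in zip(pack_offsets, names):
    idx += struct.pack(">I", offset) + name
print(find_offset_v1(idx, names[1]))
```
]]></code></pre>
</div></div>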
1018 <div class="paragraph"><p>Pack Idx file:</p></div>
1019 <div class="literalblock">
1020 <div class="content">
1021 <pre><code> -- +--------------------------------+
1022 fanout | fanout[0] = 2 (for example) |-.
1023 table +--------------------------------+ |
1024 | fanout[1] | |
1025 +--------------------------------+ |
1026 | fanout[2] | |
1027 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
1028 | fanout[255] = total objects |---.
1029 -- +--------------------------------+ | |
1030 main | offset | | |
1031 index | object name 00XXXXXXXXXXXXXXXX | | |
1032 table +--------------------------------+ | |
1033 | offset | | |
1034 | object name 00XXXXXXXXXXXXXXXX | | |
1035 +--------------------------------+&lt;+ |
1036 .-| offset | |
1037 | | object name 01XXXXXXXXXXXXXXXX | |
1038 | +--------------------------------+ |
1039 | | offset | |
1040 | | object name 01XXXXXXXXXXXXXXXX | |
1041 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
1042 | | offset | |
1043 | | object name FFXXXXXXXXXXXXXXXX | |
1044 --| +--------------------------------+&lt;--+
1045 trailer | | packfile checksum |
1046 | +--------------------------------+
1047 | | idxfile checksum |
1048 | +--------------------------------+
1049 .-------.
1051 Pack file entry: &lt;+</code></pre>
1052 </div></div>
1053 <div class="literalblock">
1054 <div class="content">
1055 <pre><code>packed object header:
1056 1-byte size extension bit (MSB)
1057 type (next 3 bit)
1058 size0 (lower 4-bit)
1059 n-byte sizeN (as long as MSB is set, each 7-bit)
1060 size0..sizeN form 4+7+7+..+7 bit integer, size0
1061 is the least significant part, and sizeN is the
1062 most significant part.
1063 packed object data:
1064 If it is not DELTA, then deflated bytes (the size above
1065 is the size before compression).
1066 If it is REF_DELTA, then
1067 base object name (the size above is the
1068 size of the delta data that follows).
1069 delta data, deflated.
1070 If it is OFS_DELTA, then
1071 n-byte offset (see below) interpreted as a negative
1072 offset from the type-byte of the header of the
1073 ofs-delta entry (the size above is the size of
1074 the delta data that follows).
1075 delta data, deflated.</code></pre>
1076 </div></div>
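<div class="paragraph"><p>The packed object header above can be decoded as sketched here, together
with inflating an undeltified entry. The names <code>parse_entry_header</code> and
<code>read_plain_object</code> are assumptions of this example, and the sample entry
is built by hand with <code>zlib</code> rather than read from a real pack.</p></div>
<div class="listingblock">
<div class="content">
<pre><code><![CDATA[
```python
import zlib

OBJ_TYPES = {1: "commit", 2: "tree", 3: "blob", 4: "tag",
             6: "ofs-delta", 7: "ref-delta"}


def parse_entry_header(buf: bytes, pos: int = 0) -> tuple[int, int, int]:
    """Return (type, size, next_position) for one packed object header."""
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x07      # three bits after the extension bit
    size = byte & 0x0F                 # size0: the lowest four bits
    shift = 4
    while byte & 0x80:                 # MSB set: another size byte follows
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    if obj_type not in OBJ_TYPES:
        raise ValueError("invalid or reserved object type")
    return obj_type, size, pos


def read_plain_object(buf: bytes, pos: int = 0) -> tuple[str, bytes, int]:
    """Inflate one undeltified entry; return (type_name, data, next_position)."""
    obj_type, size, pos = parse_entry_header(buf, pos)
    if obj_type in (6, 7):
        raise ValueError("deltified entry; a base object is needed")
    inflater = zlib.decompressobj()
    data = inflater.decompress(buf[pos:])
    if len(data) != size:              # size is the length before compression
        raise ValueError("size mismatch")
    return OBJ_TYPES[obj_type], data, len(buf) - len(inflater.unused_data)


# Blob of 300 bytes: first byte 1|011|1100 = 0xBC, then 300 >> 4 = 0x12.
payload = b"x" * 300
entry = b"\xbc\x12" + zlib.compress(payload) + b"NEXT"
kind, data, next_pos = read_plain_object(entry)
print(kind, len(data), entry[next_pos:])
```
]]></code></pre>
</div></div>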
1077 <div class="literalblock">
1078 <div class="content">
1079 <pre><code>offset encoding:
1080 n bytes with MSB set in all but the last one.
1081 The offset is then the number constructed by
1082 concatenating the lower 7 bit of each byte, and
1083 for n &gt;= 2 adding 2^7 + 2^14 + ... + 2^(7*(n-1))
1084 to the result.</code></pre>
1085 </div></div>
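<div class="paragraph"><p>The offset encoding differs from the size encoding in two ways: the first
byte is the most significant, and one is added before each shift, which
accounts for the 2^7 + 2^14 + ... terms. A sketch, with the hypothetical name
<code>decode_ofs_delta_offset</code>:</p></div>
<div class="listingblock">
<div class="content">
<pre><code><![CDATA[
```python
def decode_ofs_delta_offset(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode an OFS_DELTA offset; return (offset, next_position)."""
    byte = buf[pos]
    pos += 1
    offset = byte & 0x7F
    while byte & 0x80:                 # MSB set in all but the last byte
        byte = buf[pos]
        pos += 1
        # Adding 1 before the shift contributes 2^7, 2^14, ... in total.
        offset = ((offset + 1) << 7) | (byte & 0x7F)
    return offset, pos


# One byte covers 0..127; the smallest two-byte value is therefore 128.
for raw in (b"\x7f", b"\x80\x00", b"\x81\x00"):
    print(raw.hex(), decode_ofs_delta_offset(raw)[0])
```
]]></code></pre>
</div></div>
<div class="paragraph"><p>The resulting value is subtracted from the position of the ofs-delta
entry to find its base object.</p></div>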
1086 </div>
1087 </div>
1088 <div class="sect1">
<h2 id="_version_2_pack_idx_files_support_packs_larger_than_4_gib_and">Version 2 pack-*.idx files have the following format:</h2>
<div class="sectionbody">
<div class="paragraph"><p>Version 2 pack-*.idx files support packs larger than 4 GiB, and
have some other reorganizations. They have the format:</p></div>
1095 <div class="ulist"><ul>
<li>
<p>
A 4-byte magic number <em>\377tOc</em> which is an unreasonable
fanout[0] value.
</p>
</li>
<li>
<p>
A 4-byte version number (= 2)
</p>
</li>
<li>
<p>
A 256-entry fan-out table just like v1.
</p>
</li>
<li>
<p>
A table of sorted object names. These are packed together
without offset values to reduce the cache footprint of the
binary search for a specific object name.
</p>
</li>
<li>
<p>
A table of 4-byte CRC32 values of the packed object data.
This is new in v2 so compressed data can be copied directly
from pack to pack during repacking without undetected
data corruption.
</p>
</li>
<li>
<p>
A table of 4-byte offset values (in network byte order).
These are usually 31-bit pack file offsets, but large
offsets are encoded as an index into the next table with
the msbit set.
</p>
</li>
<li>
<p>
A table of 8-byte offset entries (empty for pack files less
than 2 GiB). Pack files are organized with heavily used
objects toward the front, so most object references should
not need to refer to this table.
</p>
</li>
<li>
<p>
The same trailer as a v1 pack index file:
</p>
1147 <div class="literalblock">
1148 <div class="content">
1149 <pre><code>A copy of the pack checksum at the end of
1150 corresponding packfile.</code></pre>
1151 </div></div>
1152 <div class="literalblock">
1153 <div class="content">
1154 <pre><code>Index checksum of all of the above.</code></pre>
1155 </div></div>
1156 </li>
1157 </ul></div>
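<div class="paragraph"><p>The table order above fixes where each table starts once the object count is known. A minimal Python sketch of that layout follows. It assumes SHA-1 (20-byte object names and checksums) and that, as in v1, the last fan-out entry holds the total number of objects. The function names and the in-memory demo index are hypothetical.</p></div>

```python
import struct

HASH_LEN = 20  # assumption: SHA-1 object names and checksums


def idx_v2_layout(data: bytes) -> dict[str, int]:
    """Return the object count and the byte offset of each v2 table."""
    magic, version = struct.unpack_from(">4sI", data, 0)
    if magic != b"\377tOc" or version != 2:
        raise ValueError("not a version 2 pack index")
    fanout = struct.unpack_from(">256I", data, 8)
    count = fanout[255]                    # last fan-out entry: total objects
    names = 8 + 256 * 4
    crcs = names + count * HASH_LEN
    small = crcs + count * 4
    large = small + count * 4
    return {"count": count, "names": names, "crc32": crcs,
            "offsets": small, "large_offsets": large}


def object_offset(data: bytes, layout: dict[str, int], i: int) -> int:
    """Pack offset of the i-th object in index (name) order."""
    (entry,) = struct.unpack_from(">I", data, layout["offsets"] + 4 * i)
    if entry >= 0x80000000:                # msbit set: row of the 8-byte table
        row = entry - 0x80000000
        (entry,) = struct.unpack_from(
            ">Q", data, layout["large_offsets"] + 8 * row)
    return entry


# Hypothetical two-object index; CRC32 values and checksums are dummies.
fanout = [0] + [1] + [2] * 254
demo = (b"\377tOc" + struct.pack(">I", 2) + struct.pack(">256I", *fanout)
        + b"\x01" * 20 + b"\x02" * 20          # sorted object names
        + struct.pack(">2I", 0, 0)              # CRC32 table
        + struct.pack(">2I", 12, 0x80000000)    # 4-byte offset table
        + struct.pack(">Q", 2**32 + 5)          # 8-byte offset table
        + bytes(2 * HASH_LEN))                  # trailer
layout = idx_v2_layout(demo)
print(layout["count"], object_offset(demo, layout, 0),
      object_offset(demo, layout, 1))
```

<div class="paragraph"><p>The second object in the demo has the msbit set in its 4-byte entry, so its real offset is read from row 0 of the 8-byte table.</p></div>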
1158 </div>
1159 </div>
1160 <div class="sect1">
1161 <h2 id="_pack_rev_files_have_the_format">pack-*.rev files have the format:</h2>
1162 <div class="sectionbody">
1163 <div class="ulist"><ul>
1164 <li>
1166 A 4-byte magic number <em>0x52494458</em> (<em>RIDX</em>).
1167 </p>
1168 </li>
1169 <li>
1171 A 4-byte version identifier (= 1).
1172 </p>
1173 </li>
1174 <li>
1176 A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256).
1177 </p>
1178 </li>
1179 <li>
1181 A table of index positions (one per packed object, num_objects in
1182 total, each a 4-byte unsigned integer in network order), sorted by
1183 their corresponding offsets in the packfile.
1184 </p>
1185 </li>
1186 <li>
1188 A trailer, containing a:
1189 </p>
1190 <div class="literalblock">
1191 <div class="content">
1192 <pre><code>checksum of the corresponding packfile, and</code></pre>
1193 </div></div>
1194 <div class="literalblock">
1195 <div class="content">
1196 <pre><code>a checksum of all of the above.</code></pre>
1197 </div></div>
1198 </li>
1199 </ul></div>
1200 <div class="paragraph"><p>All 4-byte numbers are in network order.</p></div>
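<div class="paragraph"><p>To illustrate what the table holds, the Python sketch below derives it from a list of pack offsets given in index order, then serializes the header and table. The function names are hypothetical, and the trailer is left out because computing real checksums is beyond the scope of the example.</p></div>

```python
import struct

RIDX_MAGIC = 0x52494458  # "RIDX"


def build_rev_table(offsets_in_index_order: list[int]) -> list[int]:
    """Index positions sorted by the pack offset of each object."""
    positions = range(len(offsets_in_index_order))
    return sorted(positions, key=lambda pos: offsets_in_index_order[pos])


def encode_rev_body(table: list[int], hash_id: int = 1) -> bytes:
    """Header plus table in network order; the trailer is left out."""
    header = struct.pack(">3I", RIDX_MAGIC, 1, hash_id)
    return header + struct.pack(f">{len(table)}I", *table)


# Three objects in index order, stored at pack offsets 300, 12 and 150.
table = build_rev_table([300, 12, 150])
body = encode_rev_body(table)
print(table, body[:4], len(body))
```

<div class="paragraph"><p>Entry <code>i</code> of the resulting table is the index position of the object that comes <code>i</code>th in the packfile.</p></div>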
1201 </div>
1202 </div>
1203 <div class="sect1">
1204 <h2 id="_pack_mtimes_files_have_the_format">pack-*.mtimes files have the format:</h2>
1205 <div class="sectionbody">
1206 <div class="paragraph"><p>All 4-byte numbers are in network byte order.</p></div>
1207 <div class="ulist"><ul>
1208 <li>
1210 A 4-byte magic number <em>0x4d544d45</em> (<em>MTME</em>).
1211 </p>
1212 </li>
1213 <li>
1215 A 4-byte version identifier (= 1).
1216 </p>
1217 </li>
1218 <li>
1220 A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256).
1221 </p>
1222 </li>
1223 <li>
1225 A table of 4-byte unsigned integers. The ith value is the
1226 modification time (mtime) of the ith object in the corresponding
1227 pack by lexicographic (index) order. The mtimes count standard
1228 epoch seconds.
1229 </p>
1230 </li>
1231 <li>
1233 A trailer, containing a checksum of the corresponding packfile,
1234 and a checksum of all of the above (each having length according
1235 to the specified hash function).
1236 </p>
1237 </li>
1238 </ul></div>
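<div class="paragraph"><p>A reader for this layout might look like the following Python sketch. The function names are hypothetical. The checksum lengths of 20 bytes for SHA-1 and 32 bytes for SHA-256 follow from the hash functions themselves, and the demo contents use zero bytes in place of real checksums.</p></div>

```python
import struct

MTME_MAGIC = 0x4D544D45      # "MTME"
HASH_LEN = {1: 20, 2: 32}    # checksum bytes for SHA-1 and SHA-256


def parse_mtimes(data: bytes) -> list[int]:
    """Return the per-object mtimes, in index order, from .mtimes contents."""
    magic, version, hash_id = struct.unpack_from(">3I", data, 0)
    if magic != MTME_MAGIC or version != 1:
        raise ValueError("not a version 1 .mtimes file")
    trailer = 2 * HASH_LEN[hash_id]          # pack checksum + file checksum
    count = (len(data) - 12 - trailer) // 4
    return list(struct.unpack_from(f">{count}I", data, 12))


def older_than(mtimes: list[int], cutoff: int) -> list[int]:
    """Index positions of objects whose mtime is before the cutoff."""
    return [pos for pos, mtime in enumerate(mtimes) if cutoff > mtime]


demo = (struct.pack(">3I", MTME_MAGIC, 1, 2)                 # SHA-256
        + struct.pack(">2I", 1_700_000_000, 1_600_000_000)   # two mtimes
        + bytes(64))                                         # dummy trailer
mtimes = parse_mtimes(demo)
print(mtimes, older_than(mtimes, 1_650_000_000))
```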
1239 </div>
1240 </div>
1241 <div class="sect1">
1242 <h2 id="_multi_pack_index_midx_files_have_the_following_format">multi-pack-index (MIDX) files have the following format:</h2>
1243 <div class="sectionbody">
1244 <div class="paragraph"><p>The multi-pack-index files refer to multiple pack-files and loose objects.</p></div>
1245 <div class="paragraph"><p>In order to allow extensions that add extra data to the MIDX, we organize
1246 the body into "chunks" and provide a lookup table at the beginning of the
1247 body. The header includes certain length values, such as the number of packs,
1248 the number of base MIDX files, hash lengths and types.</p></div>
1249 <div class="paragraph"><p>All 4-byte numbers are in network order.</p></div>
1250 <div class="paragraph"><p>HEADER:</p></div>
1251 <div class="literalblock">
1252 <div class="content">
1253 <pre><code>4-byte signature:
1254 The signature is: {'M', 'I', 'D', 'X'}</code></pre>
1255 </div></div>
1256 <div class="literalblock">
1257 <div class="content">
1258 <pre><code>1-byte version number:
1259 Git only writes or recognizes version 1.</code></pre>
1260 </div></div>
1261 <div class="literalblock">
1262 <div class="content">
1263 <pre><code>1-byte Object Id Version
1264 We infer the length of object IDs (OIDs) from this value:
1265 1 =&gt; SHA-1
1266 2 =&gt; SHA-256
1267 If the hash type does not match the repository's hash algorithm,
1268 the multi-pack-index file should be ignored with a warning
1269 presented to the user.</code></pre>
1270 </div></div>
1271 <div class="literalblock">
1272 <div class="content">
1273 <pre><code>1-byte number of "chunks"</code></pre>
1274 </div></div>
1275 <div class="literalblock">
1276 <div class="content">
1277 <pre><code>1-byte number of base multi-pack-index files:
1278 This value is currently always zero.</code></pre>
1279 </div></div>
1280 <div class="literalblock">
1281 <div class="content">
1282 <pre><code>4-byte number of pack files</code></pre>
1283 </div></div>
1284 <div class="paragraph"><p>CHUNK LOOKUP:</p></div>
1285 <div class="literalblock">
1286 <div class="content">
1287 <pre><code>(C + 1) * 12 bytes providing the chunk offsets:
1288 First 4 bytes describe chunk id. Value 0 is a terminating label.
1289 Other 8 bytes provide offset in current file for chunk to start.
1290 (Chunks are provided in file-order, so you can infer the length
1291 using the next chunk position if necessary.)</code></pre>
1292 </div></div>
1293 <div class="literalblock">
1294 <div class="content">
<pre><code>The CHUNK LOOKUP matches the table of contents from
the chunk-based file format, see gitformat-chunk(5).</code></pre>
1297 </div></div>
1298 <div class="literalblock">
1299 <div class="content">
1300 <pre><code>The remaining data in the body is described one chunk at a time, and
1301 these chunks may be given in any order. Chunks are required unless
1302 otherwise specified.</code></pre>
1303 </div></div>
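<div class="paragraph"><p>As an illustration of the HEADER and CHUNK LOOKUP layout, the Python sketch below reads both from a byte string. The function name and the demo bytes are hypothetical. The sketch also assumes that the offset stored with the terminating label marks the end of the last chunk, so that every chunk length can be inferred from the next row.</p></div>

```python
import struct


def parse_midx_header(data: bytes) -> dict:
    """Parse the 12-byte MIDX header and the chunk lookup that follows it."""
    signature, version, oid_version, num_chunks, num_bases, num_packs = (
        struct.unpack_from(">4sBBBBI", data, 0))
    if signature != b"MIDX" or version != 1:
        raise ValueError("not a version 1 multi-pack-index")
    # (C + 1) rows of 12 bytes: a 4-byte chunk id and an 8-byte file offset.
    rows = [struct.unpack_from(">4sQ", data, 12 + 12 * row)
            for row in range(num_chunks + 1)]
    if rows[-1][0] != bytes(4):
        raise ValueError("chunk lookup lacks the terminating label")
    # Rows are in file order, so each length is the gap to the next offset.
    chunks = {chunk_id: (start, next_start - start)
              for (chunk_id, start), (_, next_start) in zip(rows, rows[1:])}
    return {"oid_version": oid_version, "num_bases": num_bases,
            "num_packs": num_packs, "chunks": chunks}


# Hypothetical MIDX prefix: two chunks, one pack, SHA-1 object IDs.
lookup_end = 12 + 3 * 12
demo = (struct.pack(">4sBBBBI", b"MIDX", 1, 1, 2, 0, 1)
        + struct.pack(">4sQ", b"OIDF", lookup_end)
        + struct.pack(">4sQ", b"PNAM", lookup_end + 1024)
        + struct.pack(">4sQ", bytes(4), lookup_end + 1024 + 8))
info = parse_midx_header(demo)
print(info["num_packs"], info["chunks"])
```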
1304 <div class="paragraph"><p>CHUNK DATA:</p></div>
1305 <div class="literalblock">
1306 <div class="content">
1307 <pre><code>Packfile Names (ID: {'P', 'N', 'A', 'M'})
1308 Stores the packfile names as concatenated, null-terminated strings.
1309 Packfiles must be listed in lexicographic order for fast lookups by
1310 name. This is the only chunk not guaranteed to be a multiple of four
1311 bytes in length, so should be the last chunk for alignment reasons.</code></pre>
1312 </div></div>
1313 <div class="literalblock">
1314 <div class="content">
1315 <pre><code>OID Fanout (ID: {'O', 'I', 'D', 'F'})
1316 The ith entry, F[i], stores the number of OIDs with first
1317 byte at most i. Thus F[255] stores the total
1318 number of objects.</code></pre>
1319 </div></div>
1320 <div class="literalblock">
1321 <div class="content">
1322 <pre><code>OID Lookup (ID: {'O', 'I', 'D', 'L'})
1323 The OIDs for all objects in the MIDX are stored in lexicographic
1324 order in this chunk.</code></pre>
1325 </div></div>
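<div class="paragraph"><p>The fan-out table narrows a search of the OID Lookup chunk to the OIDs that share a first byte. A minimal sketch in Python, with hypothetical function names and an in-memory list standing in for the chunk data:</p></div>

```python
from bisect import bisect_left
from itertools import accumulate


def build_fanout(sorted_oids: list[bytes]) -> list[int]:
    """F[i] = number of OIDs whose first byte is at most i."""
    counts = [0] * 256
    for oid in sorted_oids:
        counts[oid[0]] += 1
    return list(accumulate(counts))


def find_oid(sorted_oids: list[bytes], fanout: list[int], oid: bytes) -> int:
    """Return the position of oid in the lookup table, or -1 if absent."""
    first = oid[0]
    lo = fanout[first - 1] if first else 0   # OIDs with a smaller first byte
    hi = fanout[first]                       # ... plus those sharing it
    pos = bisect_left(sorted_oids, oid, lo, hi)
    return pos if pos != hi and sorted_oids[pos] == oid else -1


# Four made-up 20-byte OIDs, already in lexicographic order.
oids = [bytes([0x00]) + b"a" * 19, bytes([0x00]) + b"b" * 19,
        bytes([0x7F]) + b"c" * 19, bytes([0xFF]) + b"d" * 19]
fanout = build_fanout(oids)
print(fanout[0], fanout[0x7F], fanout[255], find_oid(oids, fanout, oids[2]))
```

<div class="paragraph"><p>Only the slice between <code>F[b-1]</code> and <code>F[b]</code> is searched, where <code>b</code> is the first byte of the OID being looked up.</p></div>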
1326 <div class="literalblock">
1327 <div class="content">
1328 <pre><code>Object Offsets (ID: {'O', 'O', 'F', 'F'})
1329 Stores two 4-byte values for every object.
1330 1: The pack-int-id for the pack storing this object.
1331 2: The offset within the pack.
1332 If all offsets are less than 2^32, then the large offset chunk
1333 will not exist and offsets are stored as in IDX v1.
1334 If there is at least one offset value larger than 2^32-1, then
1335 the large offset chunk must exist, and offsets larger than
1336 2^31-1 must be stored in it instead. If the large offset chunk
1337 exists and the 31st bit is on, then removing that bit reveals
1338 the row in the large offsets containing the 8-byte offset of
1339 this object.</code></pre>
1340 </div></div>
1341 <div class="literalblock">
1342 <div class="content">
1343 <pre><code>[Optional] Object Large Offsets (ID: {'L', 'O', 'F', 'F'})
1344 8-byte offsets into large packfiles.</code></pre>
1345 </div></div>
1346 <div class="literalblock">
1347 <div class="content">
1348 <pre><code>[Optional] Bitmap pack order (ID: {'R', 'I', 'D', 'X'})
1349 A list of MIDX positions (one per object in the MIDX, num_objects in
1350 total, each a 4-byte unsigned integer in network byte order), sorted
1351 according to their relative bitmap/pseudo-pack positions.</code></pre>
1352 </div></div>
1353 <div class="paragraph"><p>TRAILER:</p></div>
1354 <div class="literalblock">
1355 <div class="content">
1356 <pre><code>Index checksum of the above contents.</code></pre>
1357 </div></div>
1358 </div>
1359 </div>
1360 <div class="sect1">
1361 <h2 id="_multi_pack_index_reverse_indexes">multi-pack-index reverse indexes</h2>
1362 <div class="sectionbody">
1363 <div class="paragraph"><p>Similar to the pack-based reverse index, the multi-pack index can also
1364 be used to generate a reverse index.</p></div>
1365 <div class="paragraph"><p>Instead of mapping between offset, pack-, and index position, this
1366 reverse index maps between an object&#8217;s position within the MIDX, and
1367 that object&#8217;s position within a pseudo-pack that the MIDX describes
1368 (i.e., the ith entry of the multi-pack reverse index holds the MIDX
position of the ith object in pseudo-pack order).</p></div>
<div class="paragraph"><p>To clarify the difference between these orderings, consider a multi-pack
reachability bitmap. Each bit needs to correspond to an object in the
MIDX, and so we need an efficient mapping from bit position to MIDX
position.</p></div>
1375 <div class="paragraph"><p>One solution is to let bits occupy the same position in the oid-sorted
1376 index stored by the MIDX. But because oids are effectively random, their
1377 resulting reachability bitmaps would have no locality, and thus compress
1378 poorly. (This is the reason that single-pack bitmaps use the pack
1379 ordering, and not the .idx ordering, for the same purpose.)</p></div>
1380 <div class="paragraph"><p>So we&#8217;d like to define an ordering for the whole MIDX based around
1381 pack ordering, which has far better locality (and thus compresses more
1382 efficiently). We can think of a pseudo-pack created by the concatenation
1383 of all of the packs in the MIDX. E.g., if we had a MIDX with three packs
1384 (a, b, c), with 10, 15, and 20 objects respectively, we can imagine an
1385 ordering of the objects like:</p></div>
1386 <div class="literalblock">
1387 <div class="content">
1388 <pre><code>|a,0|a,1|...|a,9|b,0|b,1|...|b,14|c,0|c,1|...|c,19|</code></pre>
1389 </div></div>
1390 <div class="paragraph"><p>where the ordering of the packs is defined by the MIDX&#8217;s pack list,
1391 and then the ordering of objects within each pack is the same as the
1392 order in the actual packfile.</p></div>
<div class="paragraph"><p>Given the list of packs and their counts of objects, you can
naïvely reconstruct that pseudo-pack ordering (e.g., the object at
position 27 must be (c,2) because packs "a" and "b" consumed the first 25
slots, positions 0 through 24). But there&#8217;s a catch. Objects may be duplicated between packs, in
which case the MIDX only stores one pointer to the object (and thus we&#8217;d
want only one slot in the bitmap).</p></div>
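<div class="paragraph"><p>The naive reconstruction described above can be sketched in Python as follows. The function name <code>locate</code> is hypothetical, positions are taken to be zero-based, and, as the paragraph warns, the result is only meaningful when no object is duplicated between packs.</p></div>

```python
from bisect import bisect_right
from itertools import accumulate


def locate(position: int, packs: list[tuple[str, int]]) -> tuple[str, int]:
    """Map a zero-based pseudo-pack position to (pack name, position in pack).

    Duplicates are ignored, so this is valid only if no object is stored twice.
    """
    ends = list(accumulate(count for _, count in packs))  # running totals
    if position not in range(ends[-1]):
        raise IndexError("position outside the pseudo-pack")
    which = bisect_right(ends, position)   # first pack ending after position
    start = ends[which - 1] if which else 0
    return packs[which][0], position - start


packs = [("a", 10), ("b", 15), ("c", 20)]
print(locate(0, packs), locate(12, packs), locate(25, packs))
```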
1399 <div class="paragraph"><p>Callers could handle duplicates themselves by reading objects in order
1400 of their bit-position, but that&#8217;s linear in the number of objects, and
1401 much too expensive for ordinary bitmap lookups. Building a reverse index
1402 solves this, since it is the logical inverse of the index, and that
1403 index has already removed duplicates. But, building a reverse index on
1404 the fly can be expensive. Since we already have an on-disk format for
1405 pack-based reverse indexes, let&#8217;s reuse it for the MIDX&#8217;s pseudo-pack,
1406 too.</p></div>
1407 <div class="paragraph"><p>Objects from the MIDX are ordered as follows to string together the
1408 pseudo-pack. Let <code>pack(o)</code> return the pack from which <code>o</code> was selected
1409 by the MIDX, and define an ordering of packs based on their numeric ID
1410 (as stored by the MIDX). Let <code>offset(o)</code> return the object offset of <code>o</code>
1411 within <code>pack(o)</code>. Then, compare <code>o1</code> and <code>o2</code> as follows:</p></div>
1412 <div class="ulist"><ul>
1413 <li>
1415 If one of <code>pack(o1)</code> and <code>pack(o2)</code> is preferred and the other
1416 is not, then the preferred one sorts first.
1417 </p>
1418 <div class="paragraph"><p>(This is a detail that allows the MIDX bitmap to determine which
1419 pack should be used by the pack-reuse mechanism, since it can ask
1420 the MIDX for the pack containing the object at bit position 0).</p></div>
1421 </li>
1422 <li>
1424 If <code>pack(o1) ≠ pack(o2)</code>, then sort the two objects in descending
1425 order based on the pack ID.
1426 </p>
1427 </li>
1428 <li>
1430 Otherwise, <code>pack(o1) = pack(o2)</code>, and the objects are sorted in
1431 pack-order (i.e., <code>o1</code> sorts ahead of <code>o2</code> exactly when <code>offset(o1)
1432 &lt; offset(o2)</code>).
1433 </p>
1434 </li>
1435 </ul></div>
1436 <div class="paragraph"><p>In short, a MIDX&#8217;s pseudo-pack is the de-duplicated concatenation of
1437 objects in packs stored by the MIDX, laid out in pack order, and the
1438 packs arranged in MIDX order (with the preferred pack coming first).</p></div>
1439 <div class="paragraph"><p>The MIDX&#8217;s reverse index is stored in the optional <em>RIDX</em> chunk within
1440 the MIDX itself.</p></div>
1441 </div>
1442 </div>
1443 <div class="sect1">
1444 <h2 id="_cruft_packs">cruft packs</h2>
1445 <div class="sectionbody">
<div class="paragraph"><p>The cruft packs feature offers an alternative to Git&#8217;s traditional mechanism of
1447 removing unreachable objects. This document provides an overview of Git&#8217;s
1448 pruning mechanism, and how a cruft pack can be used instead to accomplish the
1449 same.</p></div>
1450 <div class="sect2">
1451 <h3 id="_background">Background</h3>
1452 <div class="paragraph"><p>To remove unreachable objects from your repository, Git offers <code>git repack -Ad</code>
1453 (see <a href="git-repack.html">git-repack(1)</a>). Quoting from the documentation:</p></div>
1454 <div class="listingblock">
1455 <div class="content">
1456 <pre><code>[...] unreachable objects in a previous pack become loose, unpacked objects,
1457 instead of being left in the old pack. [...] loose unreachable objects will be
1458 pruned according to normal expiry rules with the next 'git gc' invocation.</code></pre>
1459 </div></div>
1460 <div class="paragraph"><p>Unreachable objects aren&#8217;t removed immediately, since doing so could race with
1461 an incoming push which may reference an object which is about to be deleted.
1462 Instead, those unreachable objects are stored as loose objects and stay that way
1463 until they are older than the expiration window, at which point they are removed
1464 by <a href="git-prune.html">git-prune(1)</a>.</p></div>
1465 <div class="paragraph"><p>Git must store these unreachable objects loose in order to keep track of their
1466 per-object mtimes. If these unreachable objects were written into one big pack,
1467 then either freshening that pack (because an object contained within it was
1468 re-written) or creating a new pack of unreachable objects would cause the pack&#8217;s
1469 mtime to get updated, and the objects within it would never leave the expiration
1470 window. Instead, objects are stored loose in order to keep track of the
1471 individual object mtimes and avoid a situation where all cruft objects are
1472 freshened at once.</p></div>
1473 <div class="paragraph"><p>This can lead to undesirable situations when a repository contains many
1474 unreachable objects which have not yet left the grace period. Having large
1475 directories in the shards of <code>.git/objects</code> can lead to decreased performance in
the repository. Given enough unreachable objects, this can even lead to inode
starvation and degrade the performance of the whole system. Because those
objects can never be packed, these repositories often take up a large amount of
disk space: the objects can only be zlib-compressed, not stored in delta
chains.</p></div>
1481 </div>
1482 <div class="sect2">
1483 <h3 id="_cruft_packs_2">Cruft packs</h3>
1484 <div class="paragraph"><p>A cruft pack eliminates the need for storing unreachable objects in a loose
1485 state by including the per-object mtimes in a separate file alongside a single
1486 pack containing all loose objects.</p></div>
<div class="paragraph"><p>A cruft pack is written by <code>git repack --cruft</code> when generating a new pack,
using <a href="git-pack-objects.html">git-pack-objects(1)</a>'s <code>--cruft</code> option. Note that <code>git repack --cruft</code>
1489 is a classic all-into-one repack, meaning that everything in the resulting pack is
1490 reachable, and everything else is unreachable. Once written, the <code>--cruft</code>
1491 option instructs <code>git repack</code> to generate another pack containing only objects
1492 not packed in the previous step (which equates to packing all unreachable
1493 objects together). This progresses as follows:</p></div>
1494 <div class="olist arabic"><ol class="arabic">
1495 <li>
Enumerate every object, marking as a traversal tip any object which (a) is
not contained in a kept-pack, and (b) has an mtime within the grace period.
1500 </p>
1501 </li>
1502 <li>
1504 Perform a reachability traversal based on the tips gathered in the previous
1505 step, adding every object along the way to the pack.
1506 </p>
1507 </li>
1508 <li>
1510 Write the pack out, along with a <code>.mtimes</code> file that records the per-object
1511 timestamps.
1512 </p>
1513 </li>
1514 </ol></div>
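<div class="paragraph"><p>The three steps can be modelled on a toy object graph. The Python sketch below is not how git-pack-objects(1) is implemented: the dictionary layout and the function name are hypothetical, and it assumes that the traversal does not descend into objects stored in kept packs.</p></div>

```python
def cruft_pack_objects(objects: dict[str, dict], now: int,
                       grace: int) -> dict[str, int]:
    """Return {object name: mtime} for a toy model of the three steps.

    Each value of `objects` has the keys "mtime" (seconds), "kept" (stored
    in a kept pack) and "refs" (names of the objects it points to).
    """
    # Step 1: tips are recent objects that are not in a kept pack.
    tips = [name for name, obj in objects.items()
            if not obj["kept"] and obj["mtime"] >= now - grace]
    # Step 2: walk from the tips, collecting everything along the way.
    selected, stack = set(), list(tips)
    while stack:
        name = stack.pop()
        if name in selected or objects[name]["kept"]:
            continue  # assumption: the walk does not enter kept packs
        selected.add(name)
        stack.extend(objects[name]["refs"])
    # Step 3: the pack would be written together with these mtimes.
    return {name: objects[name]["mtime"] for name in sorted(selected)}


objects = {
    "reachable": {"mtime": 1000, "kept": True, "refs": []},
    "recent": {"mtime": 990, "kept": False,
               "refs": ["old-child", "reachable"]},
    "old-child": {"mtime": 100, "kept": False, "refs": []},
    "expired": {"mtime": 100, "kept": False, "refs": []},
}
packed = cruft_pack_objects(objects, now=1000, grace=50)
print(packed)
```

<div class="paragraph"><p>In this model an old object is still packed when a recent tip refers to it, while an old object that no tip reaches is left out.</p></div>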
1515 <div class="paragraph"><p>This mode is invoked internally by <a href="git-repack.html">git-repack(1)</a> when instructed to
1516 write a cruft pack. Crucially, the set of in-core kept packs is exactly the set
1517 of packs which will not be deleted by the repack; in other words, they contain
1518 all of the repository&#8217;s reachable objects.</p></div>
1519 <div class="paragraph"><p>When a repository already has a cruft pack, <code>git repack --cruft</code> typically only
1520 adds objects to it. An exception to this is when <code>git repack</code> is given the
1521 <code>--cruft-expiration</code> option, which allows the generated cruft pack to omit
1522 expired objects instead of waiting for <a href="git-gc.html">git-gc(1)</a> to expire those objects
1523 later on.</p></div>
1524 <div class="paragraph"><p>It is <a href="git-gc.html">git-gc(1)</a> that is typically responsible for removing expired
1525 unreachable objects.</p></div>
1526 </div>
1527 <div class="sect2">
1528 <h3 id="_caution_for_mixed_version_environments">Caution for mixed-version environments</h3>
1529 <div class="paragraph"><p>Repositories that have cruft packs in them will continue to work with any older
1530 version of Git. Note, however, that previous versions of Git which do not
1531 understand the <code>.mtimes</code> file will use the cruft pack&#8217;s mtime as the mtime for
1532 all of the objects in it. In other words, do not expect older (pre-cruft pack)
1533 versions of Git to interpret or even read the contents of the <code>.mtimes</code> file.</p></div>
1534 <div class="paragraph"><p>Note that having mixed versions of Git GC-ing the same repository can lead to
1535 unreachable objects never being completely pruned. This can happen under the
1536 following circumstances:</p></div>
1537 <div class="ulist"><ul>
1538 <li>
1540 An older version of Git running GC explodes the contents of an existing
1541 cruft pack loose, using the cruft pack&#8217;s mtime.
1542 </p>
1543 </li>
1544 <li>
1546 A newer version running GC collects those loose objects into a cruft pack,
where the <code>.mtimes</code> file reflects the loose objects' actual mtimes, but the
1548 cruft pack mtime is "now".
1549 </p>
1550 </li>
1551 </ul></div>
1552 <div class="paragraph"><p>Repeating this process will lead to unreachable objects not getting pruned as a
1553 result of repeatedly resetting the objects' mtimes to the present time.</p></div>
1554 <div class="paragraph"><p>If you are GC-ing repositories in a mixed version environment, consider omitting
1555 the <code>--cruft</code> option when using <a href="git-repack.html">git-repack(1)</a> and <a href="git-gc.html">git-gc(1)</a>, and
1556 setting the <code>gc.cruftPacks</code> configuration to "false" until all writers
1557 understand cruft packs.</p></div>
1558 </div>
1559 <div class="sect2">
1560 <h3 id="_alternatives">Alternatives</h3>
1561 <div class="paragraph"><p>Notable alternatives to this design include:</p></div>
1562 <div class="ulist"><ul>
1563 <li>
1565 The location of the per-object mtime data, and
1566 </p>
1567 </li>
1568 <li>
1570 Storing unreachable objects in multiple cruft packs.
1571 </p>
1572 </li>
1573 </ul></div>
1574 <div class="paragraph"><p>On the location of mtime data, a new auxiliary file tied to the pack was chosen
1575 to avoid complicating the <code>.idx</code> format. If the <code>.idx</code> format were ever to gain
1576 support for optional chunks of data, it may make sense to consolidate the
1577 <code>.mtimes</code> format into the <code>.idx</code> itself.</p></div>
1578 <div class="paragraph"><p>Storing unreachable objects among multiple cruft packs (e.g., creating a new
1579 cruft pack during each repacking operation including only unreachable objects
1580 which aren&#8217;t already stored in an earlier cruft pack) is significantly more
complicated to construct, and so isn&#8217;t pursued here. The obvious drawback to
1582 the current implementation is that the entire cruft pack must be re-written from
1583 scratch.</p></div>
1584 </div>
1585 </div>
1586 </div>
1587 <div class="sect1">
1588 <h2 id="_git">GIT</h2>
1589 <div class="sectionbody">
1590 <div class="paragraph"><p>Part of the <a href="git.html">git(1)</a> suite</p></div>
1591 </div>
1592 </div>
1593 </div>
1594 <div id="footnotes"><hr /></div>
1595 <div id="footer">
1596 <div id="footer-text">
1597 Last updated
1598 2023-04-28 16:26:53 PDT
1599 </div>
1600 </div>
1601 </body>
1602 </html>