perf.mkd

% gitolite performance

include sidebar-toc

TOP TIP: If you have more than 2000 or so repos, you should be using v3.2 or
later; a bit of code went in there that makes a *huge* difference for really
large sites.

# tips for performance worriers

Gitolite is pretty efficient in most cases, and generally nothing needs to be
done.  If you think you have a performance problem, let me know on the mailing
list.  Meanwhile, here are some tips:

  * Look in the gitolite log file after any operation that you think ran
    slowly.  In particular, pushing to the admin repo, or a user creating a
    new wild repo, might be a little slow, and the log file will give you a
    bit more detail on what took the time.

  * If you don't use gitweb or git-daemon, or use them but are perfectly happy
    to control access to them from outside gitolite, you can comment out the
    corresponding lines in the ENABLE list in the rc file.

  * If you can't get rid of those scripts, and they are still taking too long,
    you can make them run in the background.  They'll eventually finish, but
    the user doesn't have to wait.  See src/triggers/bg.  *This should not
    normally be needed; if you feel you need it, please talk to me so I can
    understand why and maybe help*.

  * If you're more concerned about your users' time when they create a new
    wild repo (and not so much about the admin push taking time), you can fix
    a couple of scripts and send me a patch :)

    Here's the scoop:

    Scripts invoked via `POST_CREATE` *do* get information about which repo
    has just been created.  However, the gitweb and daemon scripts are not set
    up to take advantage of this; only the git-config one is.  So use the
    git-config script as an example, and/or read the [triggers][] page, and
    fix the other two programs.

    (This should be easy enough for the daemon update, but the gitweb update
    may be a little more tricky, since it may involve *deleting* lines from
    the "projects.list" file.)
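
To see why the gitweb case is the trickier one, here is a minimal Python
sketch of a per-repo update to a "projects.list"-style file.  It is only an
illustration: the real trigger scripts are not written in Python, the function
name `update_projects_list` is made up, and the one-`name.git`-per-line layout
and the sorting are assumptions.

```python
def update_projects_list(lines, repo, allowed):
    """Return a new list of projects.list lines with only `repo` adjusted.

    lines   -- current entries, one "name.git" per line (assumed layout)
    repo    -- name of the repo that was just created or changed
    allowed -- True if gitweb may show the repo, False otherwise
    """
    entry = repo + ".git"
    # Drop any existing line for this repo; this is the "deleting" case.
    kept = [line for line in lines if line != entry]
    if allowed:
        kept.append(entry)
    # Sorting is a choice made for this sketch, to keep the output stable.
    return sorted(kept)


current = ["bar.git", "foo.git"]
print(update_projects_list(current, "baz", True))
print(update_projects_list(current, "foo", False))
```

The point is that only the one repo named by the trigger is touched, instead
of re-checking every repo on the server.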

# why there's really no need to worry!

In general, gitolite has a constant overhead of about 0.2 seconds on my
laptop.  There really is nothing to optimise, but you can comment out some
triggers as the previous section said.

Here's the big-O stuff:

  * N = number of normal repos, each with its own set of rules.  In
    `repo r1 r2 r3`, N = 3.  Add up all such lines.
  * G = number of groups or repo regexes.  In `repo @g1 @g2 foo/[a-z]*`,
    G = 3.
  * M = number of group members.  In `@g1 = r1 r2 <nl> @g2 = r3 r4 r5`, M = 5.
  * A = average number of rule lines in each "repo" block.  Usually about 5,
    maybe 10 sometimes.  You may have more.

Gitolite overheads compared to a normal ssh push are:

1.  perl startup time.  Fairly constant and fairly small.  I have generally
    found it pretty hard to measure, especially with a hot cache.
2.  rule parse time.  Details below.
3.  rule interpretation time.  Fairly constant, or at least subject to much
    smaller variations than #2.

"rule parse time" is where it makes a difference.  There are 2 files gitolite
parses on each "access": `~/.gitolite/conf/gitolite.conf-compiled.pm` and
`~/repositories/your_repo.git/gl-conf`.  The former contains O(N + M + G*A)
lines.  The latter contains about "A" lines (remember we called it an
average), which is negligible.
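
As a worked example of the N + M + G*A term, here is a rough Python sketch
that counts the four quantities in a toy config built from the examples
above.  The parser is a simplification written for this page, not gitolite's
own: the names `count_terms`, `estimated_lines` and `TOY_CONF` are made up,
and the test for "is this a regex" is an assumption.

```python
import re

# Assumption: a repo token made only of these characters is a plain name;
# anything else is treated as a regex.  This is not gitolite's own check.
PLAIN_NAME = re.compile(r"^[0-9A-Za-z][-0-9A-Za-z._@/+]*$")

TOY_CONF = """
@g1 = r1 r2
@g2 = r3 r4 r5

repo r1 r2 r3
    RW+ = alice
    R   = bob

repo @g1 @g2 foo/[a-z]*
    RW  = carol
    R   = dave
"""


def count_terms(conf_text):
    """Return (N, G, M, A) for a toy config, per the definitions above."""
    n = g = m = 0
    rule_lines = 0
    repo_blocks = 0
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("repo "):
            repo_blocks += 1
            for token in line.split()[1:]:
                if token.startswith("@") or not PLAIN_NAME.match(token):
                    g += 1  # group name or repo regex
                else:
                    n += 1  # normal repo
        elif line.startswith("@") and "=" in line:
            m += len(line.split("=", 1)[1].split())  # group members
        else:
            rule_lines += 1  # anything else inside a repo block
    a = rule_lines / repo_blocks if repo_blocks else 0.0
    return n, g, m, a


def estimated_lines(n, g, m, a):
    """The N + M + G*A term, ignoring constant factors."""
    return n + m + g * a


n, g, m, a = count_terms(TOY_CONF)
print(n, g, m, a, estimated_lines(n, g, m, a))
```

For the toy config this gives N = 3, G = 3, M = 5 and A = 2, so the term
comes to 3 + 5 + 3*2 = 14.  It is only an order-of-growth estimate, not an
exact line count.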

In practice, you can't measure this at a scale that a developer running a "git
push" might even pretend to notice, unless you have more than, say, 5000 repos
or so.  On my testbed of 11,100 repos, where the compiled.pm is almost 0.7 MB,
it takes less than 0.2 seconds to do this.

And on a busy system, where that file will pretty much always be in cache,
it's even less.

# the only thing that will take more time

Literally, the only thing that will take time is something like "ssh git@host
info", because it finds all possible repos and checks access for each of them.
On that same testbed, therefore, this ends up reading all 11,100 "gl-conf"
files.

On my laptop this takes about 14 seconds.  In contrast, a normal git operation
(clone, pull, push, etc) takes so little time that it is hard to measure
without timing software.
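
If you want to put numbers on this for your own site, something like the
following Python sketch may help.  The helper names `time_command` and
`per_repo_ms` are made up for this example; the only figures used are the ones
quoted above (14 seconds, 11,100 repos).

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def per_repo_ms(total_seconds, repo_count):
    """Average cost per repo, in milliseconds."""
    return 1000.0 * total_seconds / repo_count


# Figures from the text: about 14 seconds for 11,100 repos.
print(round(per_repo_ms(14, 11100), 2))

# Harmless stand-in command so the sketch runs anywhere.
print(time_command([sys.executable, "-c", "pass"]) >= 0)
```

That works out to a little over a millisecond per "gl-conf" file.  To time the
real thing, pass `["ssh", "git@host", "info"]` to `time_command`, with your
own host in place of `git@host`.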