1 From: Kyle J. McKay <mackyle@gmail.com>
2 Subject: [PATCH] strbuf.c: allow use of fgetln when no getdelim
4 Since 10c497aa (read_packed_refs: use a strbuf for reading lines, v2.2.2),
5 packed-refs has been read with repeated calls to strbuf_getwholeline
6 instead of repeated calls to fgets.
8 Unfortunately this resulted in a performance penalty since the original
9 implementation of strbuf_getwholeline used repeated calls to fgetc.
11 This was corrected in 0cc30e0e (strbuf_getwholeline: use getdelim if it is
12 available, v2.5.0) and the performance when getdelim is not available was
13 improved in 82912d1d (strbuf_getwholeline: use getc_unlocked, v2.5.0) by
14 replacing fgetc with getc_unlocked, but still, on systems without getdelim,
15 the performance remains less than the previous way packed-refs was read
16 using repeated fgets calls (on systems with getdelim the performance has
17 actually been improved compared to the original fgets method).
19 Although getdelim is POSIX, it only appears in the newer versions of the
20 standard and as a result it is only present on newer systems and systems
21 that use glibc (where it has always been available).
23 However, on older systems lacking support for the newer POSIX getdelim
24 function there may be an alternative fgetln function available that works,
25 but only when the delimiter is \n.
27 Since \n is the common case and the delimiter used when reading packed-refs,
28 make it possible to use fgetln when calling strbuf_getwholeline with a
29 delimiter of \n and the getdelim function is not available, but fgetln is.
31 This is triggered by defining HAVE_FGETLN which is automatically set for
32 BSD and BSD-derived systems. Since fgetln will only be used when getdelim
33 is NOT available there's no point in defining it on systems that may have
34 it available (such as Linux via the -lbsd library) but are always expected
35 to have getdelim available.
37 Here are some timing results produced by using a simple wrapper that
38 calls strbuf_getwholeline with \n as the delimiter repeatedly to read a
39 160 MiB file consisting of over 2 million lines using various versions of
40 strbuf_getwholeline and also a separate version using fgets to simulate the
43 fgets -- fgets + strlen loop (like pre-v2.2.2 packed-refs reading)
44 fgetc -- strbuf_getwholeline fgetc loop
45 getc_unlocked -- strbuf_getwholeline flockfile and getc_unlocked loop
46 getdelim -- strbuf_getwholeline using getdelim
47 fgetln -- strbuf_getwholeline using fgetln when delim is \n
49 Times are given as u+s (user plus system) averaged over 3 runs.
51 Note that the Linux machine is a faster system than the FreeBSD one.
53 FreeBSD (newer version with POSIX getdelim):
67 In both cases the fgetln version is just a tad faster than the previous
68 fgets version. On Linux fgetln is about 56% faster than the getc_unlocked
69 version although Linux is expected to always have getdelim courtesy of
70 glibc. On FreeBSD fgetln is about 34% faster than the getc_unlocked
71 version and that's probably the most common case where fgetln can be found
74 Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
78 config.mak.uname | 5 +++++
79 strbuf.c | 32 ++++++++++++++++++++++++--------
80 3 files changed, 35 insertions(+), 8 deletions(-)
82 diff --git a/Makefile b/Makefile
83 index ddd1bdfc..1b60bee0 100644
86 @@ -384,6 +384,8 @@ all::
88 # to say "export LESS=FRX (and LV=-c) if the environment variable
89 # LESS (and LV) is not set, respectively".
91 +# Define HAVE_FGETLN if your system has the fgetln() function.
93 GIT-VERSION-FILE: FORCE
94 @$(SHELL_PATH) ./GIT-VERSION-GEN
95 @@ -1510,6 +1512,10 @@ ifdef HAVE_GETDELIM
96 BASIC_CFLAGS += -DHAVE_GETDELIM
100 + BASIC_CFLAGS += -DHAVE_FGETLN
103 ifeq ($(TCLTK_PATH),)
106 diff --git a/config.mak.uname b/config.mak.uname
107 index b232908f..d5fd4dc8 100644
108 --- a/config.mak.uname
109 +++ b/config.mak.uname
110 @@ -115,6 +115,7 @@ ifeq ($(uname_S),Darwin)
111 BASIC_CFLAGS += -DPRECOMPOSE_UNICODE
112 BASIC_CFLAGS += -DPROTECT_HFS_DEFAULT=1
113 HAVE_BSD_SYSCTL = YesPlease
114 + HAVE_FGETLN = YesPlease
116 ifeq ($(uname_S),SunOS)
117 NEEDS_SOCKET = YesPlease
118 @@ -210,6 +211,7 @@ ifeq ($(uname_S),FreeBSD)
119 GMTIME_UNRELIABLE_ERRORS = UnfortunatelyYes
120 HAVE_BSD_SYSCTL = YesPlease
121 PAGER_ENV = LESS=FRX LV=-c MORE=FRX
122 + HAVE_FGETLN = YesPlease
124 ifeq ($(uname_S),OpenBSD)
125 NO_STRCASESTR = YesPlease
126 @@ -220,6 +222,7 @@ ifeq ($(uname_S),OpenBSD)
127 BASIC_LDFLAGS += -L/usr/local/lib
128 HAVE_PATHS_H = YesPlease
129 HAVE_BSD_SYSCTL = YesPlease
130 + HAVE_FGETLN = YesPlease
132 ifeq ($(uname_S),MirBSD)
133 NO_STRCASESTR = YesPlease
134 @@ -228,6 +231,7 @@ ifeq ($(uname_S),MirBSD)
135 NEEDS_LIBICONV = YesPlease
136 HAVE_PATHS_H = YesPlease
137 HAVE_BSD_SYSCTL = YesPlease
138 + HAVE_FGETLN = YesPlease
140 ifeq ($(uname_S),NetBSD)
141 ifeq ($(shell expr "$(uname_R)" : '[01]\.'),2)
142 @@ -239,6 +243,7 @@ ifeq ($(uname_S),NetBSD)
143 NO_MKSTEMPS = YesPlease
144 HAVE_PATHS_H = YesPlease
145 HAVE_BSD_SYSCTL = YesPlease
146 + HAVE_FGETLN = YesPlease
148 ifeq ($(uname_S),AIX)
150 diff --git a/strbuf.c b/strbuf.c
151 index b839be49..d47f351f 100644
154 @@ -514,15 +514,31 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
159 - while ((ch = getc_unlocked(fp)) != EOF) {
160 - if (!strbuf_avail(sb))
161 - strbuf_grow(sb, 1);
162 - sb->buf[sb->len++] = ch;
166 + if (term == '\n') {
169 + if ((line = fgetln(fp, &len))) {
170 + if (strbuf_avail(sb) < len)
171 + strbuf_grow(sb, len);
172 + memcpy(sb->buf + sb->len, line, len);
181 + while ((ch = getc_unlocked(fp)) != EOF) {
182 + if (!strbuf_avail(sb))
183 + strbuf_grow(sb, 1);
184 + sb->buf[sb->len++] = ch;
191 if (ch == EOF && sb->len == 0)