/* stb_image - v2.10 - public domain image loader - http://nothings.org/stb_image.h
                                     no warranty implied; use at your own risk

   Do this:
      #define STB_IMAGE_IMPLEMENTATION
   before you include this file in *one* C or C++ file to create the implementation.

   // i.e. it should look like this:
   #define STB_IMAGE_IMPLEMENTATION
   #include "stb_image.h"

   You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
   You can also #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using
   malloc, realloc, and free.
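
   A minimal sketch of the allocator override mentioned above, assuming a
   hypothetical counting allocator. The names my_malloc, my_realloc, my_free
   and g_live_allocs are illustrative and not part of stb_image; only the
   three STBI_ macro names come from this file. All three macros are defined
   together, before the implementation is created:

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical bookkeeping: number of blocks currently allocated.
static int g_live_allocs = 0;

static void *my_malloc(size_t sz)
{
   void *p = malloc(sz);
   if (p) ++g_live_allocs;
   return p;
}

static void *my_realloc(void *p, size_t newsz)
{
   void *q = realloc(p, newsz);
   if (q && !p) ++g_live_allocs;   // realloc(NULL, n) acts like malloc(n)
   return q;
}

static void my_free(void *p)
{
   if (p) {
      assert(g_live_allocs > 0);   // freeing more blocks than were allocated
      --g_live_allocs;
   }
   free(p);
}

// Define all three (or none) before creating the implementation.
#define STBI_MALLOC(sz)        my_malloc(sz)
#define STBI_REALLOC(p,newsz)  my_realloc(p,newsz)
#define STBI_FREE(p)           my_free(p)
```

   With these in place, the usual #define STB_IMAGE_IMPLEMENTATION and
   #include "stb_image.h" would follow. The counter is expected to return to
   its starting value once every loaded image has been released.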

   Primarily of interest to game developers and other people who can
   avoid problematic images and only need the trivial interface.

      JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
      PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
      TGA (not sure what subset, if a subset)
      PSD (composited view only, no extra channels, 8/16 bit-per-channel)
      GIF (*comp always reports as 4-channel)
      HDR (radiance rgbE format)
      PNM (PPM and PGM binary only)

      Animated GIF still needs a proper API, but here's one way to do it:
          http://gist.github.com/urraka/685d9a6340b26b830d49

      - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
      - decode from arbitrary I/O callbacks
      - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)

   Full documentation under "DOCUMENTATION" below.

   Revision 2.00 release notes:

      - Progressive JPEG is now supported.

      - PPM and PGM binary formats are now supported, thanks to Ken Miller.

      - x86 platforms now make use of SSE2 SIMD instructions for
        JPEG decoding, and ARM platforms can use NEON SIMD if requested.
        This work was done by Fabian "ryg" Giesen. SSE2 is used by
        default, but NEON must be enabled explicitly; see docs.

        With other JPEG optimizations included in this version, we see
        a 2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
        on a JPEG on an ARM machine, relative to previous versions of this
        library. The same results will not hold for all JPEGs or for all
        x86/ARM machines. (Note that progressive JPEGs are significantly
        slower to decode than regular JPEGs.) This doesn't mean that this
        is the fastest JPEG decoder in the land; rather, it brings it
        closer to parity with standard libraries. If you want the fastest
        decode, look elsewhere. (See "Philosophy" section of docs below.)

        See final bullet items below for more info on SIMD.

      - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
        the memory allocator. Unlike other STBI libraries, these macros don't
        support a context parameter, so if you need to pass a context in to
        the allocator, you'll have to store it in a global or a thread-local
        variable.

      - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
        STBI_NO_LINEAR.
            STBI_NO_HDR:     suppress implementation of .hdr reader format
            STBI_NO_LINEAR:  suppress high-dynamic-range light-linear float API

      - You can suppress implementation of any of the decoders to reduce
        your code footprint by #defining one or more of the following
        symbols before creating the implementation.

            STBI_NO_JPEG
            STBI_NO_PNG
            STBI_NO_BMP
            STBI_NO_PSD
            STBI_NO_TGA
            STBI_NO_GIF
            STBI_NO_HDR
            STBI_NO_PIC
            STBI_NO_PNM   (.ppm and .pgm)

      - You can request *only* certain decoders and suppress all other ones
        (this will be more forward-compatible, as addition of new decoders
        doesn't require you to disable them explicitly):

            STBI_ONLY_JPEG
            STBI_ONLY_PNG
            STBI_ONLY_BMP
            STBI_ONLY_PSD
            STBI_ONLY_TGA
            STBI_ONLY_GIF
            STBI_ONLY_HDR
            STBI_ONLY_PIC
            STBI_ONLY_PNM   (.ppm and .pgm)

         Note that you can define multiples of these, and you will get all
         of them ("only x" and "only y" is interpreted to mean "only x&y").

      - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
        want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB

      - Compilation of all SIMD code can be suppressed with
            #define STBI_NO_SIMD
        It should not be necessary to disable SIMD unless you have issues
        compiling (e.g. using an x86 compiler which doesn't support SSE
        intrinsics or that doesn't support the method used to detect
        SSE2 support at run-time), and even those can be reported as
        bugs so I can refine the built-in compile-time checking.

      - The old STBI_SIMD system which allowed installing a user-defined
        IDCT etc. has been removed. If you need this, don't upgrade. My
        assumption is that almost nobody was doing this, and those who
        were will find the built-in SIMD more satisfactory anyway.

      - RGB values computed for JPEG images are slightly different from
        previous versions of stb_image. (This is due to using less
        integer precision in SIMD.) The C code has been adjusted so
        that the same RGB values will be computed regardless of whether
        SIMD support is available, so your app should always produce
        consistent results. But these results are slightly different from
        previous versions. (Specifically, about 3% of available YCbCr values
        will compute different RGB results from pre-1.49 versions by +-1;
        most of the deviating values are one smaller in the G channel.)

      - If you must produce consistent results with previous versions of
        stb_image, #define STBI_JPEG_OLD and you will get the same results
        you used to; however, you will not get the SIMD speedups for
        the YCbCr-to-RGB conversion step (although you should still see
        significant JPEG speedup from the other changes).

        Please note that STBI_JPEG_OLD is a temporary feature; it will be
        removed in future versions of the library. It is only intended for
        near-term back-compatibility use.
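
   The "only x" plus "only y" rule described in the notes above can be
   sketched with the preprocessor alone. This is a simplified stand-in that
   covers three formats, not the full mapping used by the implementation;
   the helper functions are additions for illustration only:

```c
#include <assert.h>

// Request two decoders; everything else should end up disabled.
#define STBI_ONLY_PNG
#define STBI_ONLY_JPEG

// Any STBI_ONLY_x turns into STBI_NO_y for each format y that was not requested.
#if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP)
   #ifndef STBI_ONLY_JPEG
   #define STBI_NO_JPEG
   #endif
   #ifndef STBI_ONLY_PNG
   #define STBI_NO_PNG
   #endif
   #ifndef STBI_ONLY_BMP
   #define STBI_NO_BMP
   #endif
#endif

// Illustrative helpers that report the outcome at run time.
static int jpeg_enabled(void)
{
#ifdef STBI_NO_JPEG
   return 0;
#else
   return 1;
#endif
}

static int png_enabled(void)
{
#ifdef STBI_NO_PNG
   return 0;
#else
   return 1;
#endif
}

static int bmp_enabled(void)
{
#ifdef STBI_NO_BMP
   return 0;
#else
   return 1;
#endif
}

// Both requested decoders stay enabled; the unrequested one is suppressed.
static void check_selection(void)
{
   assert(jpeg_enabled());
   assert(png_enabled());
   assert(!bmp_enabled());
}
```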

   Latest revision history:
      2.10  (2016-01-22) avoid warning introduced in 2.09
      2.09  (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
      2.07  (2015-09-13) partial animated GIF support
                         limited 16-bit PSD support
                         minor bugs, code cleanup, and compiler warnings
      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
      2.03  (2015-04-12) additional corruption checking
                         stbi_set_flip_vertically_on_load
                         fix NEON support; fix mingw support
      2.02  (2015-01-19) fix incorrect assert, fix warning
      2.01  (2015-01-17) fix various warnings
      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
      2.00  (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
                         STBI_NO_*, STBI_ONLY_*
      1.48  (2014-12-14) fix incorrectly-named assert()
      1.47  (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
                         fix bug in interlaced PNG with user-specified channel count

   See end of file for full revision history.


 ============================    Contributors    =========================

 Image formats                          Extensions, features
    Sean Barrett (jpeg, png, bmp)          Jetro Lauha (stbi_info)
    Nicolas Schulz (hdr, psd)              Martin "SpartanJ" Golini (stbi_info)
    Jonathan Dummer (tga)                  James "moose2000" Brown (iPhone PNG)
    Jean-Marc Lienher (gif)                Ben "Disch" Wenger (io callbacks)
    Tom Seddon (pic)                       Omar Cornut (1/2/4-bit PNG)
    Thatcher Ulrich (psd)                  Nicolas Guillemot (vertical flip)
    Ken Miller (pgm, ppm)                  Richard Mitton (16-bit PSD)
    urraka@github (animated gif)           Junggon Kim (PNM comments)
                                           Daniel Gibson (16-bit TGA)

 Optimizations & bugfixes
    Marc LeBlanc            David Woo          Guillaume George   Martins Mozeiko
    Christpher Lloyd        Martin Golini      Jerry Jansson      Joseph Thomson
    Dave Moore              Roy Eltham         Hayaki Saito       Phil Jordan
    Won Chun                Luke Graham        Johan Duparc       Nathan Reed
    the Horde3D community   Thomas Ruf         Ronny Chevalier    Nick Verigakis
    Janez Zemva             John Bartholomew   Michal Cichon      svdijk@github
    Jonathan Blow           Ken Hamada         Tero Hanninen      Baldur Karlsson
    Laurent Gomila          Cort Stratton      Sergio Gonzalez    romigrou@github
    Aruelien Pocheville     Thibault Reuille   Cass Everitt
    Ryamond Barbiero        Paul Du Bois       Engin Manap
                                               Blazej Dariusz Roszkowski
                                               Michaelangel007@github

This software is in the public domain. Where that dedication is not
recognized, you are granted a perpetual, irrevocable license to copy,
distribute, and modify this file as you see fit.

*/

#ifndef STBI_INCLUDE_STB_IMAGE_H
#define STBI_INCLUDE_STB_IMAGE_H

// DOCUMENTATION
//
// Limitations:
//    - no 16-bit-per-channel PNG
//    - no 12-bit-per-channel JPEG
//    - no JPEGs with arithmetic coding
//    - GIF always returns *comp=4
//
// Basic usage (see HDR discussion below for HDR usage):
//    int x,y,n;
//    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
//    // ... process data if not NULL ...
//    // ... x = width, y = height, n = # 8-bit components per pixel ...
//    // ... replace '0' with '1'..'4' to force that many components per pixel
//    // ... but 'n' will always be the number that it would have been if you said 0
//    stbi_image_free(data);
//
// Standard parameters:
//    int *x       -- outputs image width in pixels
//    int *y       -- outputs image height in pixels
//    int *comp    -- outputs # of image components in image file
//    int req_comp -- if non-zero, # of image components requested in result
//
// The return value from an image loader is an 'unsigned char *' which points
// to the pixel data, or NULL on an allocation failure or if the image is
// corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
// with each pixel consisting of N interleaved 8-bit components; the first
// pixel pointed to is top-left-most in the image. There is no padding between
// image scanlines or between pixels, regardless of format. The number of
// components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
// If req_comp is non-zero, *comp has the number of components that _would_
// have been output otherwise. E.g. if you set req_comp to 4, you will always
// get RGBA output, but you can check *comp to see if it's trivially opaque
// because e.g. there were only 3 channels in the source image.
//
// An output image with N components has the following components interleaved
// in this order in each pixel:
//
//     N=#comp     components
//       1           grey
//       2           grey, alpha
//       3           red, green, blue
//       4           red, green, blue, alpha
//
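/*
As a sketch of the layout just described (tightly packed scanlines, N
interleaved components, no padding), the helper below computes where a pixel
starts. pixel_at is a hypothetical name used only for this illustration; it
is not part of the stb_image API.

```c
#include <assert.h>
#include <stddef.h>

// Returns a pointer to component 0 of pixel (px, py) in a tightly packed
// image that is w pixels wide with n interleaved 8-bit components per pixel.
static const unsigned char *pixel_at(const unsigned char *data, int w, int n, int px, int py)
{
   assert(data != NULL && w > 0 && n >= 1 && n <= 4);
   assert(px >= 0 && px < w && py >= 0);
   return data + ((size_t) py * (size_t) w + (size_t) px) * (size_t) n;
}
```

For an RGB image (n == 3), pixel_at(data, w, 3, px, py)[1] would be the green
value of that pixel; row 0 is the top scanline.
*/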
// If image loading fails for any reason, the return value will be NULL,
// and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
// can be queried for an extremely brief, end-user unfriendly explanation
// of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
// compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
// more user-friendly ones.
//
// Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
//
// ===========================================================================
//
// Philosophy
//
// stb libraries are designed with the following priorities:
//
//    1. easy to use
//    2. easy to maintain
//    3. good performance
//
// Sometimes I let "good performance" creep up in priority over "easy to maintain",
// and for best performance I may provide less-easy-to-use APIs that give higher
// performance, in addition to the easy to use ones. Nevertheless, it's important
// to keep in mind that from the standpoint of you, a client of this library,
// all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
//
// Some secondary priorities arise directly from the first two, some of which
// make more explicit reasons why performance can't be emphasized.
//
//    - Portable ("ease of use")
//    - Small footprint ("easy to maintain")
//    - No dependencies ("ease of use")
//
// ===========================================================================
//
// I/O callbacks
//
// I/O callbacks allow you to read from arbitrary sources, like packaged
// files or some other source. Data read from callbacks are processed
// through a small internal buffer (currently 128 bytes) to try to reduce
// overhead.
//
// The three functions you must define are "read" (reads some bytes of data),
// "skip" (skips some bytes of data), and "eof" (reports if the stream is at the end).
//
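/*
A minimal sketch of the three callbacks, assuming an in-memory source. The
io_callbacks struct is a local stand-in with the same three members as the
stbi_io_callbacks type declared later in this file; mem_source and the mem_
functions are hypothetical names.

```c
#include <assert.h>
#include <string.h>

// Local stand-in for the three-callback interface (hypothetical name).
typedef struct
{
   int  (*read)(void *user, char *data, int size);
   void (*skip)(void *user, int n);
   int  (*eof) (void *user);
} io_callbacks;

// Hypothetical in-memory source with a read cursor.
typedef struct
{
   const char *bytes;
   int len;
   int pos;
} mem_source;

// Fill 'data' with up to 'size' bytes; return the number actually read.
static int mem_read(void *user, char *data, int size)
{
   mem_source *m = (mem_source *) user;
   int avail = m->len - m->pos;
   assert(size >= 0);
   if (size > avail) size = avail;
   memcpy(data, m->bytes + m->pos, (size_t) size);
   m->pos += size;
   return size;
}

// Skip forward n bytes; a negative n moves the cursor back ('unget').
static void mem_skip(void *user, int n)
{
   mem_source *m = (mem_source *) user;
   m->pos += n;
   if (m->pos < 0) m->pos = 0;
   if (m->pos > m->len) m->pos = m->len;
}

// Nonzero once the cursor has reached the end of the data.
static int mem_eof(void *user)
{
   mem_source *m = (mem_source *) user;
   return m->pos >= m->len;
}

static const io_callbacks mem_callbacks = { mem_read, mem_skip, mem_eof };
```

With the real header, the same three functions would be stored in a
stbi_io_callbacks and passed to stbi_load_from_callbacks() together with a
pointer to the source as 'user'.
*/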
// ===========================================================================
//
// SIMD support
//
// The JPEG decoder will try to automatically use SIMD kernels on x86 when
// supported by the compiler. For ARM NEON support, you must explicitly
// request it by defining STBI_NEON.
//
// (The old do-it-yourself SIMD API is no longer supported in the current
// code.)
//
// On x86, SSE2 will automatically be used when available based on a run-time
// test; if not, the generic C versions are used as a fall-back. On ARM targets,
// the typical path is to have separate builds for NEON and non-NEON devices
// (at least this is true for iOS and Android). Therefore, the NEON support is
// toggled by a build flag: define STBI_NEON to get NEON loops.
//
// The output of the JPEG decoder is slightly different from versions before
// SIMD support was introduced (that is, from versions before 1.49). The
// difference is only +-1 in the 8-bit RGB channels, and only on a small
// fraction of pixels. You can force the pre-1.49 behavior by defining
// STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
// and hence cost some performance.
//
// If for some reason you do not want to use any of the SIMD code, or if
// you have issues compiling it, you can disable it entirely by
// defining STBI_NO_SIMD.
//
// ===========================================================================
//
// HDR image support   (disable by defining STBI_NO_HDR)
//
// stb_image supports loading HDR images. Currently only the Radiance .HDR
// file format is read, but the interface itself is not tied to that format.
// You can still load any file through the existing interface;
// if you attempt to load an HDR file, it will be automatically remapped to
// LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
// both of these constants can be reconfigured through this interface:
//
//     stbi_hdr_to_ldr_gamma(2.2f);
//     stbi_hdr_to_ldr_scale(1.0f);
//
// (note, do not use _inverse_ constants; stb_image will invert them
// appropriately).
//
// Additionally, there is a new, parallel interface for loading files as
// (linear) floats to preserve the full dynamic range:
//
//    float *data = stbi_loadf(filename, &x, &y, &n, 0);
//
// If you load LDR images through this interface, those images will
// be promoted to floating point values, run through the inverse of
// constants corresponding to the above:
//
//     stbi_ldr_to_hdr_scale(1.0f);
//     stbi_ldr_to_hdr_gamma(2.2f);
//
// Finally, given a filename (or an open file or memory block--see header
// file for details) containing image data, you can query for the "most
// appropriate" interface to use (that is, whether the image is HDR or
// not), using:
//
//     stbi_is_hdr(char const *filename);
//
// ===========================================================================
//
// iPhone PNG support:
//
// By default we convert iPhone-formatted PNGs back to RGB, even though
// they are internally encoded differently. You can disable this conversion
// by calling stbi_convert_iphone_png_to_rgb(0), in which case
// you will always just get the native iPhone "format" through (which
// is BGR stored in RGB).
//
// Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
// pixel to remove any premultiplied alpha *only* if the image file explicitly
// says there's premultiplied data (currently only happens in iPhone images,
// and only if iPhone convert-to-rgb processing is on).
//
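/*
For illustration, removing premultiplied alpha means dividing each color
channel by alpha. The sketch below is not the library's code:
unpremultiply_rgba is a hypothetical helper, and both the clamp to 255 and
the early return for zero alpha are choices made here.

```c
#include <assert.h>
#include <stddef.h>

// Undo premultiplied alpha for one 8-bit RGBA pixel, in place.
static void unpremultiply_rgba(unsigned char *p)
{
   unsigned int a;
   int i;
   assert(p != NULL);
   a = p[3];
   if (a == 0) return;             // color is meaningless when fully transparent
   for (i = 0; i < 3; ++i) {
      unsigned int c = p[i] * 255u / a;
      p[i] = (unsigned char) (c > 255u ? 255u : c);   // clamp if color > alpha
   }
}
```
*/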

#ifndef STBI_NO_STDIO
#include <stdio.h>
#endif // STBI_NO_STDIO

#define STBI_VERSION 1

enum
{
   STBI_default = 0, // only used for req_comp

   STBI_grey       = 1,
   STBI_grey_alpha = 2,
   STBI_rgb        = 3,
   STBI_rgb_alpha  = 4
};

typedef unsigned char stbi_uc;

#ifdef STB_IMAGE_STATIC
#define STBIDEF static
#else
#define STBIDEF extern
#endif

//////////////////////////////////////////////////////////////////////////////
//
// PRIMARY API - works on images of any type
//

//
// load image by filename, open file, or memory buffer
//

typedef struct
{
   int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read
   void     (*skip)  (void *user,int n);                 // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
   int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
} stbi_io_callbacks;

STBIDEF stbi_uc *stbi_load               (char              const *filename,           int *x, int *y, int *comp, int req_comp);
STBIDEF stbi_uc *stbi_load_from_memory   (stbi_uc           const *buffer, int len   , int *x, int *y, int *comp, int req_comp);
STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk  , void *user, int *x, int *y, int *comp, int req_comp);

#ifndef STBI_NO_STDIO
STBIDEF stbi_uc *stbi_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
// for stbi_load_from_file, file pointer is left pointing immediately after image
#endif

#ifndef STBI_NO_LINEAR
   STBIDEF float *stbi_loadf                 (char const *filename,           int *x, int *y, int *comp, int req_comp);
   STBIDEF float *stbi_loadf_from_memory     (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
   STBIDEF float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);

   #ifndef STBI_NO_STDIO
   STBIDEF float *stbi_loadf_from_file  (FILE *f,                int *x, int *y, int *comp, int req_comp);
   #endif
#endif

#ifndef STBI_NO_HDR
   STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma);
   STBIDEF void   stbi_hdr_to_ldr_scale(float scale);
#endif // STBI_NO_HDR

#ifndef STBI_NO_LINEAR
   STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma);
   STBIDEF void   stbi_ldr_to_hdr_scale(float scale);
#endif // STBI_NO_LINEAR

// stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
STBIDEF int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
STBIDEF int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
#ifndef STBI_NO_STDIO
STBIDEF int      stbi_is_hdr          (char const *filename);
STBIDEF int      stbi_is_hdr_from_file(FILE *f);
#endif // STBI_NO_STDIO


// get a VERY brief reason for failure
STBIDEF const char *stbi_failure_reason  (void);

// free the loaded image -- this is just free() unless STBI_FREE is overridden
STBIDEF void     stbi_image_free      (void *retval_from_stbi_load);

// get image dimensions & components without fully decoding
STBIDEF int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
STBIDEF int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);

#ifndef STBI_NO_STDIO
STBIDEF int      stbi_info            (char const *filename,     int *x, int *y, int *comp);
STBIDEF int      stbi_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
#endif

// for image formats that explicitly notate that they have premultiplied alpha,
// we just return the colors as stored in the file. set this flag to force
// unpremultiplication. results are undefined if the unpremultiply overflows.
STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);

// indicate whether we should process iphone images back to canonical format,
// or just pass them through "as-is"
STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);

// flip the image vertically, so the first pixel in the output array is the bottom left
STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
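
/*
A sketch of what a vertical flip does to the pixel array, assuming a tightly
packed image. flip_rows is a hypothetical helper, not the library's
implementation; it swaps whole rows through a temporary buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Flip a tightly packed image in place so that the last row becomes the first.
// Returns 1 on success, 0 if the temporary row could not be allocated.
static int flip_rows(unsigned char *pixels, int w, int h, int depth)
{
   size_t stride;
   unsigned char *tmp;
   int row;
   assert(pixels != NULL && w > 0 && h > 0 && depth > 0);
   stride = (size_t) w * (size_t) depth;     // bytes per scanline, no padding
   tmp = (unsigned char *) malloc(stride);
   if (tmp == NULL) return 0;
   for (row = 0; row < (h >> 1); ++row) {
      unsigned char *top = pixels + (size_t) row * stride;
      unsigned char *bot = pixels + (size_t) (h - row - 1) * stride;
      memcpy(tmp, top, stride);
      memcpy(top, bot, stride);
      memcpy(bot, tmp, stride);
   }
   free(tmp);
   return 1;
}
```

When the height is odd, the middle row stays where it is.
*/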

// ZLIB client - used by PNG, available for other purposes

STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);

STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);

//
//
////   end header file   /////////////////////////////////////////////////////
#endif // STBI_INCLUDE_STB_IMAGE_H

#ifdef STB_IMAGE_IMPLEMENTATION

#if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
  || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
  || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
  || defined(STBI_ONLY_ZLIB)
   #ifndef STBI_ONLY_JPEG
   #define STBI_NO_JPEG
   #endif
   #ifndef STBI_ONLY_PNG
   #define STBI_NO_PNG
   #endif
   #ifndef STBI_ONLY_BMP
   #define STBI_NO_BMP
   #endif
   #ifndef STBI_ONLY_PSD
   #define STBI_NO_PSD
   #endif
   #ifndef STBI_ONLY_TGA
   #define STBI_NO_TGA
   #endif
   #ifndef STBI_ONLY_GIF
   #define STBI_NO_GIF
   #endif
   #ifndef STBI_ONLY_HDR
   #define STBI_NO_HDR
   #endif
   #ifndef STBI_ONLY_PIC
   #define STBI_NO_PIC
   #endif
   #ifndef STBI_ONLY_PNM
   #define STBI_NO_PNM
   #endif
#endif

#if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
#define STBI_NO_ZLIB
#endif

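#include <stddef.h> // ptrdiff_t on osx

#if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
#include <math.h>  // ldexp
#endif

#ifndef STBI_NO_STDIO
#include <stdio.h>
#endif

#ifndef STBI_ASSERT
#include <assert.h>
#define STBI_ASSERT(x) assert(x)
#endif

#ifndef _MSC_VER
   #ifdef __cplusplus
   #define stbi_inline inline
   #else
   #define stbi_inline
   #endif
#else
   #define stbi_inline __forceinline
#endif
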
#ifdef _MSC_VER
typedef unsigned short stbi__uint16;
typedef   signed short stbi__int16;
typedef unsigned int   stbi__uint32;
typedef   signed int   stbi__int32;
#else
#include <stdint.h>
typedef uint16_t stbi__uint16;
typedef int16_t  stbi__int16;
typedef uint32_t stbi__uint32;
typedef int32_t  stbi__int32;
#endif

// should produce compiler error if size is wrong
typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];

#ifdef _MSC_VER
#define STBI_NOTUSED(v)  (void)(v)
#else
#define STBI_NOTUSED(v)  (void)sizeof(v)
#endif

#ifdef _MSC_VER
#define STBI_HAS_LROTL
#endif

#ifdef STBI_HAS_LROTL
   #define stbi_lrot(x,y)  _lrotl(x,y)
#else
   #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (32 - (y))))
#endif
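
/*
Note that the portable fallback shifts right by 32 when y is 0, which C
leaves undefined for a 32-bit operand. A masked variant avoids that; rotl32
is a hypothetical name, and this sketch is not what the library uses:

```c
#include <stdint.h>

// Rotate a 32-bit value left by y bits; any y is reduced modulo 32.
static uint32_t rotl32(uint32_t x, unsigned int y)
{
   y &= 31u;
   return (x << y) | (x >> ((32u - y) & 31u));
}
```
*/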

#if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
// ok
#elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
// ok
#else
#error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
#endif

#ifndef STBI_MALLOC
#define STBI_MALLOC(sz)           malloc(sz)
#define STBI_REALLOC(p,newsz)     realloc(p,newsz)
#define STBI_FREE(p)              free(p)
#endif

#ifndef STBI_REALLOC_SIZED
#define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
#endif
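
/*
STBI_REALLOC_SIZED also receives the old size, which helps allocators that
have no native realloc. A minimal sketch, assuming a hypothetical pool
allocator: pool_alloc, pool_free and pool_realloc_sized are illustrative
names, and they wrap malloc and free here only to keep the example runnable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical allocator entry points with no native realloc.
static void *pool_alloc(size_t sz) { return malloc(sz); }
static void  pool_free(void *p)    { free(p); }

// Grow or shrink by allocating a new block and copying the old contents.
static void *pool_realloc_sized(void *p, size_t oldsz, size_t newsz)
{
   void *q;
   assert(newsz > 0);
   q = pool_alloc(newsz);
   if (q == NULL) return NULL;      // the old block stays valid, as with realloc
   if (p != NULL) {
      memcpy(q, p, oldsz < newsz ? oldsz : newsz);
      pool_free(p);
   }
   return q;
}

// STBI_REALLOC may be omitted when STBI_REALLOC_SIZED is provided.
#define STBI_MALLOC(sz)                    pool_alloc(sz)
#define STBI_FREE(p)                       pool_free(p)
#define STBI_REALLOC_SIZED(p,oldsz,newsz)  pool_realloc_sized(p,oldsz,newsz)
```
*/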

// x86/x64 detection
#if defined(__x86_64__) || defined(_M_X64)
#define STBI__X64_TARGET
#elif defined(__i386) || defined(_M_IX86)
#define STBI__X86_TARGET
#endif

#if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
// NOTE: it is not clear whether this is actually needed for the 64-bit path.
// gcc doesn't support sse2 intrinsics unless you compile with -msse2
// (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
// this is just broken and gcc are jerks for not fixing it properly
// http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
#define STBI_NO_SIMD
#endif

#if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
// Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
//
// 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
// Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
// As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
// simultaneously enabling "-mstackrealign".
//
// See https://github.com/nothings/stb/issues/81 for more information.
//
// So default to no SSE2 on 32-bit MinGW. If you've read this far and added
// -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
#define STBI_NO_SIMD
#endif

#if !defined(STBI_NO_SIMD) && defined(STBI__X86_TARGET)
#define STBI_SSE2
#include <emmintrin.h>

#ifdef _MSC_VER

#if _MSC_VER >= 1400  // not VC6
#include <intrin.h> // __cpuid
static int stbi__cpuid3(void)
{
   int info[4];
   __cpuid(info,1);
   return info[3];
}
#else
static int stbi__cpuid3(void)
{
   int res;
   __asm {
      mov  eax,1
      cpuid
      mov  res,edx
   }
   return res;
}
#endif

#define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name

static int stbi__sse2_available()
{
   int info3 = stbi__cpuid3();
   return ((info3 >> 26) & 1) != 0;
}
#else // assume GCC-style if not VC++
#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))

static int stbi__sse2_available()
{
#if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
   // GCC 4.8+ has a nice way to do this
   return __builtin_cpu_supports("sse2");
#else
   // portable way to do this, preferably without using GCC inline ASM?
   // just bail for now.
   return 0;
#endif
}
#endif
#endif

// ARM NEON
#if defined(STBI_NO_SIMD) && defined(STBI_NEON)
#undef STBI_NEON
#endif

#ifdef STBI_NEON
#include <arm_neon.h>
// assume GCC or Clang on ARM targets
#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
#endif

#ifndef STBI_SIMD_ALIGN
#define STBI_SIMD_ALIGN(type, name) type name
#endif

///////////////////////////////////////////////
//
//  stbi__context struct and start_xxx functions

// stbi__context structure is our basic context used by all images, so it
// contains all the IO context, plus some basic image information
typedef struct
{
   stbi__uint32 img_x, img_y;
   int img_n, img_out_n;

   stbi_io_callbacks io;
   void *io_user_data;

   int read_from_callbacks;
   int buflen;
   stbi_uc buffer_start[128];

   stbi_uc *img_buffer, *img_buffer_end;
   stbi_uc *img_buffer_original, *img_buffer_original_end;
} stbi__context;


static void stbi__refill_buffer(stbi__context *s);

// initialize a memory-decode context
static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
{
   s->io.read = NULL;
   s->read_from_callbacks = 0;
   s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
   s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
}

// initialize a callback-based context
static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
{
   s->io = *c;
   s->io_user_data = user;
   s->buflen = sizeof(s->buffer_start);
   s->read_from_callbacks = 1;
   s->img_buffer_original = s->buffer_start;
   stbi__refill_buffer(s);
   s->img_buffer_original_end = s->img_buffer_end;
}

#ifndef STBI_NO_STDIO

static int stbi__stdio_read(void *user, char *data, int size)
{
   return (int) fread(data,1,size,(FILE*) user);
}

static void stbi__stdio_skip(void *user, int n)
{
   fseek((FILE*) user, n, SEEK_CUR);
}

static int stbi__stdio_eof(void *user)
{
   return feof((FILE*) user);
}

static stbi_io_callbacks stbi__stdio_callbacks =
{
   stbi__stdio_read,
   stbi__stdio_skip,
   stbi__stdio_eof,
};

static void stbi__start_file(stbi__context *s, FILE *f)
{
   stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
}

//static void stop_file(stbi__context *s) { }

#endif // !STBI_NO_STDIO

static void stbi__rewind(stbi__context *s)
{
   // conceptually rewind SHOULD rewind to the beginning of the stream,
   // but we just rewind to the beginning of the initial buffer, because
   // we only use it after doing 'test', which only ever looks at at most 92 bytes
   s->img_buffer = s->img_buffer_original;
   s->img_buffer_end = s->img_buffer_original_end;
}

#ifndef STBI_NO_JPEG
static int      stbi__jpeg_test(stbi__context *s);
static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PNG
static int      stbi__png_test(stbi__context *s);
static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_BMP
static int      stbi__bmp_test(stbi__context *s);
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_TGA
static int      stbi__tga_test(stbi__context *s);
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PSD
static int      stbi__psd_test(stbi__context *s);
static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_HDR
static int      stbi__hdr_test(stbi__context *s);
static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PIC
static int      stbi__pic_test(stbi__context *s);
static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_GIF
static int      stbi__gif_test(stbi__context *s);
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PNM
static int      stbi__pnm_test(stbi__context *s);
static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
#endif

// this is not threadsafe
static const char *stbi__g_failure_reason;

STBIDEF const char *stbi_failure_reason(void)
{
   return stbi__g_failure_reason;
}

static int stbi__err(const char *str)
{
   stbi__g_failure_reason = str;
   return 0;
}

static void *stbi__malloc(size_t size)
{
    return STBI_MALLOC(size);
}

// stbi__err - error
// stbi__errpf - error returning pointer to float
// stbi__errpuc - error returning pointer to unsigned char

#ifdef STBI_NO_FAILURE_STRINGS
   #define stbi__err(x,y)  0
#elif defined(STBI_FAILURE_USERMSG)
   #define stbi__err(x,y)  stbi__err(y)
#else
   #define stbi__err(x,y)  stbi__err(x)
#endif

#define stbi__errpf(x,y)   ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
#define stbi__errpuc(x,y)  ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
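
/*
The error helpers above follow a small pattern: record a reason string, then
produce a zero or a NULL of the right type from a single expression. A
self-contained sketch of that pattern, with hypothetical names
(g_failure_reason, set_error, ERR_PTR, load_bytes) standing in for the
library's own:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Shared by all callers, so (like the original) not thread-safe.
static const char *g_failure_reason;

// Record why something failed; always returns 0 so it can be used in
// 'return set_error(...)' from functions that return a success flag.
static int set_error(const char *reason)
{
   g_failure_reason = reason;
   return 0;
}

// Evaluates set_error() for its side effect, then yields a typed NULL.
#define ERR_PTR(type, reason)  ((type *)(size_t) (set_error(reason) ? NULL : NULL))

// Hypothetical loader: rejects empty input, otherwise hands the bytes back.
static const unsigned char *load_bytes(const unsigned char *buf, int len)
{
   if (buf == NULL || len <= 0)
      return ERR_PTR(const unsigned char, "empty input");
   return buf;
}
```

A caller would check for NULL and then read g_failure_reason, in the same way
that stbi_failure_reason() is queried after a failed load.
*/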

STBIDEF void stbi_image_free(void *retval_from_stbi_load)
{
   STBI_FREE(retval_from_stbi_load);
}

#ifndef STBI_NO_LINEAR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
#endif

#ifndef STBI_NO_HDR
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
#endif

static int stbi__vertically_flip_on_load = 0;

STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
{
    stbi__vertically_flip_on_load = flag_true_if_should_flip;
}

static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PNG
   if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_BMP
   if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_GIF
   if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PSD
   if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PIC
   if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PNM
   if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp);
   #endif

   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
      return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
   }
   #endif

   #ifndef STBI_NO_TGA
   // test tga last because it's a crappy test!
   if (stbi__tga_test(s))
      return stbi__tga_load(s,x,y,comp,req_comp);
   #endif

   return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
}
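
/*
The loader above asks each decoder in turn whether it recognizes the stream.
The decoders' own test functions are not shown in this section; as a rough
illustration only, the sketch below checks well-known file signatures in the
same order for four of the formats. sniff_format and the FMT_ names are
hypothetical, and the real tests may inspect more than these leading bytes.

```c
#include <assert.h>
#include <string.h>

typedef enum { FMT_UNKNOWN, FMT_JPEG, FMT_PNG, FMT_BMP, FMT_GIF } image_format;

// Guess a format from the first bytes of a file; first match wins.
static image_format sniff_format(const unsigned char *p, size_t len)
{
   static const unsigned char png_sig[8] = { 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A };
   assert(p != NULL || len == 0);
   if (len >= 2 && p[0] == 0xFF && p[1] == 0xD8) return FMT_JPEG;   // SOI marker
   if (len >= 8 && memcmp(p, png_sig, 8) == 0)   return FMT_PNG;
   if (len >= 2 && p[0] == 'B' && p[1] == 'M')   return FMT_BMP;
   if (len >= 6 && (memcmp(p, "GIF87a", 6) == 0 || memcmp(p, "GIF89a", 6) == 0))
      return FMT_GIF;
   return FMT_UNKNOWN;
}
```

Formats without a distinctive signature are the reason a weak test, such as
the TGA one mentioned in the comment above, is best tried last.
*/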
static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);

   if (stbi__vertically_flip_on_load && result != NULL) {
      int w = *x, h = *y;
      int depth = req_comp ? req_comp : *comp;
      int row,col,z;
      stbi_uc temp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;
            }
         }
      }
   }

   return result;
}

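// The @OPTIMIZE note in stbi__load_flip suggests swapping more than one byte
// at a time. What follows is only a sketch of that idea, not the library's
// code: flip_rows_vertically and its 256-byte scratch buffer are hypothetical
// names and sizes chosen for illustration. It swaps mirrored rows in chunks
// with memcpy instead of byte by byte.

```c
#include <assert.h>
#include <string.h>

// Hypothetical helper: flip an interleaved 8-bit image top-to-bottom in place.
// w and h are in pixels, depth is the number of bytes per pixel.
static void flip_rows_vertically(unsigned char *pixels, int w, int h, int depth)
{
   unsigned char scratch[256];                      // chunk size is an arbitrary choice
   size_t row_bytes = (size_t) w * (size_t) depth;
   int row;
   assert(pixels != 0 && w >= 0 && h >= 0 && depth >= 1);
   for (row = 0; row < (h >> 1); ++row) {
      unsigned char *top = pixels + (size_t) row * row_bytes;
      unsigned char *bot = pixels + (size_t) (h - row - 1) * row_bytes;
      size_t left = row_bytes;
      while (left > 0) {
         // swap the two rows one chunk at a time
         size_t n = left < sizeof(scratch) ? left : sizeof(scratch);
         memcpy(scratch, top, n);
         memcpy(top, bot, n);
         memcpy(bot, scratch, n);
         top += n; bot += n; left -= n;
      }
   }
}
```

// With an odd height the middle row is left untouched, exactly as in the
// per-byte loop, because the outer loop only runs for h>>1 rows.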
#ifndef STBI_NO_HDR
static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
{
   if (stbi__vertically_flip_on_load && result != NULL) {
      int w = *x, h = *y;
      int depth = req_comp ? req_comp : *comp;
      int row,col,z;
      float temp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;
            }
         }
      }
   }
}
#endif

#ifndef STBI_NO_STDIO

static FILE *stbi__fopen(char const *filename, char const *mode)
{
   FILE *f;
#if defined(_MSC_VER) && _MSC_VER >= 1400
   if (0 != fopen_s(&f, filename, mode))
      f=0;
#else
   f = fopen(filename, mode);
#endif
   return f;
}


STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   FILE *f = stbi__fopen(filename, "rb");
   unsigned char *result;
   if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
   result = stbi_load_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}

STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *result;
   stbi__context s;
   stbi__start_file(&s,f);
   result = stbi__load_flip(&s,x,y,comp,req_comp);
   if (result) {
      // need to 'unget' all the characters in the IO buffer
      fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   }
   return result;
}
#endif //!STBI_NO_STDIO

STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__load_flip(&s,x,y,comp,req_comp);
}

STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__load_flip(&s,x,y,comp,req_comp);
}

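// stbi_load_from_callbacks reads through three user-supplied functions: read,
// skip and eof. The sketch below shows what a memory-backed set of such
// callbacks might look like. To stay self-contained it declares its own
// demo_io_callbacks struct that mirrors the three members used in this file;
// demo_io_callbacks, demo_mem_source and the demo_* functions are hypothetical
// names, not part of the library.

```c
#include <assert.h>
#include <string.h>

// Local stand-in with the same three members as the callback struct used above.
typedef struct {
   int  (*read)(void *user, char *data, int size);  // returns bytes actually read
   void (*skip)(void *user, int n);                 // skip n bytes (negative n moves back)
   int  (*eof) (void *user);                        // nonzero at end of data
} demo_io_callbacks;

// Hypothetical user data: a byte range plus a read cursor.
typedef struct {
   const unsigned char *bytes;
   int len, pos;
} demo_mem_source;

static int demo_read(void *user, char *data, int size)
{
   demo_mem_source *m = (demo_mem_source *) user;
   int avail = m->len - m->pos;
   int n = size < avail ? size : avail;
   if (n <= 0) return 0;
   memcpy(data, m->bytes + m->pos, (size_t) n);
   m->pos += n;
   return n;
}

static void demo_skip(void *user, int n)
{
   demo_mem_source *m = (demo_mem_source *) user;
   m->pos += n;
   if (m->pos < 0) m->pos = 0;            // clamp so the cursor stays in range
   if (m->pos > m->len) m->pos = m->len;
}

static int demo_eof(void *user)
{
   demo_mem_source *m = (demo_mem_source *) user;
   return m->pos >= m->len;
}

static const demo_io_callbacks demo_mem_callbacks = { demo_read, demo_skip, demo_eof };
```

// With the real header, the same three functions would be placed in a
// stbi_io_callbacks value and passed, together with a pointer to the source
// struct as `user`, to stbi_load_from_callbacks.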
#ifndef STBI_NO_LINEAR
static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *data;
   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
      if (hdr_data)
         stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
      return hdr_data;
   }
   #endif
   data = stbi__load_flip(s, x, y, comp, req_comp);
   if (data)
      return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
   return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
}

STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}

STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}

#ifndef STBI_NO_STDIO
STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   float *result;
   FILE *f = stbi__fopen(filename, "rb");
   if (!f) return stbi__errpf("can't fopen", "Unable to open file");
   result = stbi_loadf_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}

STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}
#endif // !STBI_NO_STDIO

#endif // !STBI_NO_LINEAR

// these is-hdr-or-not functions are defined independent of whether
// STBI_NO_LINEAR is defined, for API simplicity; if STBI_NO_LINEAR is
// defined, they always report false!

STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(buffer);
   STBI_NOTUSED(len);
   return 0;
   #endif
}

#ifndef STBI_NO_STDIO
STBIDEF int      stbi_is_hdr          (char const *filename)
{
   FILE *f = stbi__fopen(filename, "rb");
   int result=0;
   if (f) {
      result = stbi_is_hdr_from_file(f);
      fclose(f);
   }
   return result;
}

STBIDEF int      stbi_is_hdr_from_file(FILE *f)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(f);
   return 0;
   #endif
}
#endif // !STBI_NO_STDIO

STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(clbk);
   STBI_NOTUSED(user);
   return 0;
   #endif
}

#ifndef STBI_NO_LINEAR
static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;

STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
#endif

static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;

STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }

//////////////////////////////////////////////////////////////////////////////
//
//  Common code used by all image loaders
//

static void stbi__refill_buffer(stbi__context *s)
{
   int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
   if (n == 0) {
      // at end of file, treat same as if from memory, but need to handle case
      // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
      s->read_from_callbacks = 0;
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start+1;
      *s->img_buffer = 0;
   } else {
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start + n;
   }
}

stbi_inline static stbi_uc stbi__get8(stbi__context *s)
{
   if (s->img_buffer < s->img_buffer_end)
      return *s->img_buffer++;
   if (s->read_from_callbacks) {
      stbi__refill_buffer(s);
      return *s->img_buffer++;
   }
   return 0;
}

stbi_inline static int stbi__at_eof(stbi__context *s)
{
   if (s->io.read) {
      if (!(s->io.eof)(s->io_user_data)) return 0;
      // if feof() is true, check if buffer = end
      // special case: we've only got the special 0 character at the end
      if (s->read_from_callbacks == 0) return 1;
   }

   return s->img_buffer >= s->img_buffer_end;
}

static void stbi__skip(stbi__context *s, int n)
{
   if (n < 0) {
      s->img_buffer = s->img_buffer_end;
      return;
   }
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         s->img_buffer = s->img_buffer_end;
         (s->io.skip)(s->io_user_data, n - blen);
         return;
      }
   }
   s->img_buffer += n;
}

static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
{
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         int res, count;

         memcpy(buffer, s->img_buffer, blen);

         count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
         res = (count == (n-blen));
         s->img_buffer = s->img_buffer_end;
         return res;
      }
   }

   if (s->img_buffer+n <= s->img_buffer_end) {
      memcpy(buffer, s->img_buffer, n);
      s->img_buffer += n;
      return 1;
   } else
      return 0;
}

static int stbi__get16be(stbi__context *s)
{
   int z = stbi__get8(s);
   return (z << 8) + stbi__get8(s);
}

static stbi__uint32 stbi__get32be(stbi__context *s)
{
   stbi__uint32 z = stbi__get16be(s);
   return (z << 16) + stbi__get16be(s);
}

#if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
// nothing
#else
static int stbi__get16le(stbi__context *s)
{
   int z = stbi__get8(s);
   return z + (stbi__get8(s) << 8);
}
#endif

#ifndef STBI_NO_BMP
static stbi__uint32 stbi__get32le(stbi__context *s)
{
   stbi__uint32 z = stbi__get16le(s);
   return z + (stbi__get16le(s) << 16);
}
#endif

#define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings
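
// The get16be/get32be/get16le/get32le helpers above assemble multi-byte
// integers one byte at a time, so the result does not depend on the host's
// endianness. A small sketch of the same composition over a plain byte array;
// the demo_read* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

// Big-endian: the first byte is the most significant.
static uint16_t demo_read16be(const unsigned char *p)
{
   assert(p != 0);
   return (uint16_t) ((p[0] << 8) | p[1]);
}

static uint32_t demo_read32be(const unsigned char *p)
{
   // two 16-bit halves, high half first
   return ((uint32_t) demo_read16be(p) << 16) | demo_read16be(p + 2);
}

// Little-endian: the first byte is the least significant.
static uint16_t demo_read16le(const unsigned char *p)
{
   assert(p != 0);
   return (uint16_t) (p[0] | (p[1] << 8));
}

static uint32_t demo_read32le(const unsigned char *p)
{
   // two 16-bit halves, low half first
   return demo_read16le(p) | ((uint32_t) demo_read16le(p + 2) << 16);
}
```

// For the bytes 12 34 56 78 (hex), the big-endian readers give 0x1234 and
// 0x12345678, the little-endian readers 0x3412 and 0x78563412.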

//////////////////////////////////////////////////////////////////////////////
//
//  generic converter from built-in img_n to req_comp
//    individual types do this automatically as much as possible (e.g. jpeg
//    does all cases internally since it needs to colorspace convert anyway,
//    and it never has alpha, so very few cases ). png can automatically
//    interleave an alpha=255 channel, but falls back to this for other cases
//
//  assume data buffer is malloced, so malloc a new one and free that one
//  only failure mode is malloc failing

static stbi_uc stbi__compute_y(int r, int g, int b)
{
   return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
}

static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
{
   int i,j;
   unsigned char *good;

   if (req_comp == img_n) return data;
   STBI_ASSERT(req_comp >= 1 && req_comp <= 4);

   good = (unsigned char *) stbi__malloc(req_comp * x * y);
   if (good == NULL) {
      STBI_FREE(data);
      return stbi__errpuc("outofmem", "Out of memory");
   }

   for (j=0; j < (int) y; ++j) {
      unsigned char *src  = data + j * x * img_n   ;
      unsigned char *dest = good + j * x * req_comp;

      #define COMBO(a,b)  ((a)*8+(b))
      #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
      // convert source image with img_n components to one with req_comp components;
      // avoid switch per pixel, so use switch per scanline and massive macros
      switch (COMBO(img_n, req_comp)) {
         CASE(1,2) dest[0]=src[0], dest[1]=255; break;
         CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
         CASE(2,1) dest[0]=src[0]; break;
         CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
         CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
         CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
         CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
         CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
         default: STBI_ASSERT(0);
      }
      #undef CASE
   }

   STBI_FREE(data);
   return good;
}

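
// To make one of the conversion cases concrete, here is a sketch of the 3 -> 2
// case (RGB to gray plus alpha) written as a plain loop rather than through
// the CASE macro. demo_luma and demo_rgb_to_gray_alpha are hypothetical names;
// the luma weights are the ones used by stbi__compute_y above.

```c
#include <assert.h>

// Integer luma: the weights 77 + 150 + 29 sum to 256, so ">> 8" renormalizes.
static unsigned char demo_luma(int r, int g, int b)
{
   return (unsigned char) ((r*77 + g*150 + b*29) >> 8);
}

// Convert n interleaved RGB pixels into n gray+alpha pixels (alpha forced to 255).
static void demo_rgb_to_gray_alpha(const unsigned char *src, unsigned char *dest, int n)
{
   int i;
   assert(src != 0 && dest != 0 && n >= 0);
   for (i = 0; i < n; ++i, src += 3, dest += 2) {
      dest[0] = demo_luma(src[0], src[1], src[2]);
      dest[1] = 255;
   }
}
```

// Because the shift truncates, pure red (255,0,0) maps to 76 and pure white
// maps to exactly 255.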
#ifndef STBI_NO_LINEAR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
{
   int i,k,n;
   float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
   if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
      }
      if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
   }
   STBI_FREE(data);
   return output;
}
#endif

#ifndef STBI_NO_HDR
#define stbi__float2int(x)   ((int) (x))
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
{
   int i,k,n;
   stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
   if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
      if (k < comp) {
         float z = data[i*comp+k] * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
   }
   STBI_FREE(data);
   return output;
}
#endif

//////////////////////////////////////////////////////////////////////////////
//
//  "baseline" JPEG/JFIF decoder
//
//    simple implementation
//      - doesn't support delayed output of y-dimension
//      - simple interface (only one output format: 8-bit interleaved RGB)
//      - doesn't try to recover corrupt jpegs
//      - doesn't allow partial loading, loading multiple at once
//      - still fast on x86 (copying globals into locals doesn't help x86)
//      - allocates lots of intermediate memory (full size of all components)
//        - non-interleaved case requires this anyway
//        - allows good upsampling (see next)
//    high-quality
//      - upsampled channels are bilinearly interpolated, even across blocks
//      - quality integer IDCT derived from IJG's 'slow'
//    performance
//      - fast huffman; reasonable integer IDCT
//      - some SIMD kernels for common paths on targets with SSE2/NEON
//      - uses a lot of intermediate memory, could cache poorly

#ifndef STBI_NO_JPEG

// huffman decoding acceleration
#define FAST_BITS   9  // larger handles more cases; smaller stomps less cache

typedef struct
{
   stbi_uc  fast[1 << FAST_BITS];
   // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   stbi__uint16 code[256];
   stbi_uc  values[256];
   stbi_uc  size[257];
   unsigned int maxcode[18];
   int    delta[17];   // old 'firstsymbol' - old 'firstcode'
} stbi__huffman;

typedef struct
{
   stbi__context *s;
   stbi__huffman huff_dc[4];
   stbi__huffman huff_ac[4];
   stbi_uc dequant[4][64];
   stbi__int16 fast_ac[4][1 << FAST_BITS];

// sizes for components, interleaved MCUs
   int img_h_max, img_v_max;
   int img_mcu_x, img_mcu_y;
   int img_mcu_w, img_mcu_h;

// definition of jpeg image component
   struct
   {
      int id;
      int h,v;
      int tq;
      int hd,ha;
      int dc_pred;

      int x,y,w2,h2;
      stbi_uc *data;
      void *raw_data, *raw_coeff;
      stbi_uc *linebuf;
      short   *coeff;   // progressive only
      int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
   } img_comp[4];

   stbi__uint32   code_buffer; // jpeg entropy-coded buffer
   int            code_bits;   // number of valid bits
   unsigned char  marker;      // marker seen while filling entropy buffer
   int            nomore;      // flag if we saw a marker so must stop

   int            progressive;
   int            spec_start;
   int            spec_end;
   int            succ_high;
   int            succ_low;
   int            eob_run;

   int scan_n, order[4];
   int restart_interval, todo;

// kernels
   void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
   void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
   stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
} stbi__jpeg;

static int stbi__build_huffman(stbi__huffman *h, int *count)
{
   int i,j,k=0,code;
   // build size list for each symbol (from JPEG spec)
   for (i=0; i < 16; ++i)
      for (j=0; j < count[i]; ++j)
         h->size[k++] = (stbi_uc) (i+1);
   h->size[k] = 0;

   // compute actual symbols (from jpeg spec)
   code = 0;
   k = 0;
   for(j=1; j <= 16; ++j) {
      // compute delta to add to code to compute symbol id
      h->delta[j] = k - code;
      if (h->size[k] == j) {
         while (h->size[k] == j)
            h->code[k++] = (stbi__uint16) (code++);
         if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
      }
      // compute largest code + 1 for this size, preshifted as needed later
      h->maxcode[j] = code << (16-j);
      code <<= 1;
   }
   h->maxcode[j] = 0xffffffff;

   // build non-spec acceleration table; 255 is flag for not-accelerated
   memset(h->fast, 255, 1 << FAST_BITS);
   for (i=0; i < k; ++i) {
      int s = h->size[i];
      if (s <= FAST_BITS) {
         int c = h->code[i] << (FAST_BITS-s);
         int m = 1 << (FAST_BITS-s);
         for (j=0; j < m; ++j) {
            h->fast[c+j] = (stbi_uc) i;
         }
      }
   }
   return 1;
}

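// The first half of stbi__build_huffman assigns canonical codes: symbols are
// listed in order of increasing code length, codes of one length count
// upward, and the running code is doubled when moving to the next length.
// A self-contained sketch of just that step follows; demo_assign_codes is a
// hypothetical name and its error handling is a simplification, not the
// library's exact check.

```c
#include <assert.h>

// count[i] is the number of symbols whose code is i+1 bits long.
// Fills size[] and code[] in symbol order and returns the number of symbols,
// or -1 if the lengths would need more codes than a prefix code allows.
static int demo_assign_codes(const int count[16], unsigned char size[256], unsigned short code[256])
{
   int len, i, k = 0;
   unsigned int next = 0;                 // next unused code at the current length
   for (len = 1; len <= 16; ++len) {
      for (i = 0; i < count[len-1]; ++i) {
         if (k >= 256 || next >= (1u << len)) return -1;
         size[k] = (unsigned char) len;
         code[k] = (unsigned short) next++;
         ++k;
      }
      next <<= 1;                         // append a zero bit for the next length
   }
   return k;
}
```

// For two 2-bit symbols and one 3-bit symbol this yields the codes 00, 01 and
// 100. Three 1-bit symbols are rejected, since only two 1-bit codes exist.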
// build a table that decodes both magnitude and value of small ACs in
// one go.
static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
{
   int i;
   for (i=0; i < (1 << FAST_BITS); ++i) {
      stbi_uc fast = h->fast[i];
      fast_ac[i] = 0;
      if (fast < 255) {
         int rs = h->values[fast];
         int run = (rs >> 4) & 15;
         int magbits = rs & 15;
         int len = h->size[fast];

         if (magbits && len + magbits <= FAST_BITS) {
            // magnitude code followed by receive_extend code
            int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
            int m = 1 << (magbits - 1);
            if (k < m) k += (-1 << magbits) + 1;
            // if the result is small enough, we can fit it in fast_ac table
            if (k >= -128 && k <= 127)
               fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
         }
      }
   }
}

static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
{
   do {
      int b = j->nomore ? 0 : stbi__get8(j->s);
      if (b == 0xff) {
         int c = stbi__get8(j->s);
         if (c != 0) {
            j->marker = (unsigned char) c;
            j->nomore = 1;
            return;
         }
      }
      j->code_buffer |= b << (24 - j->code_bits);
      j->code_bits += 8;
   } while (j->code_bits <= 24);
}

// (1 << n) - 1
static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};

// decode a jpeg huffman value from the bitstream
stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
{
   unsigned int temp;
   int c,k;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   // look at the top FAST_BITS and determine what symbol ID it is,
   // if the code is <= FAST_BITS
   c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   k = h->fast[c];
   if (k < 255) {
      int s = h->size[k];
      if (s > j->code_bits)
         return -1;
      j->code_buffer <<= s;
      j->code_bits -= s;
      return h->values[k];
   }

   // naive test is to shift the code_buffer down so k bits are
   // valid, then test against maxcode. To speed this up, we've
   // preshifted maxcode left so that it has (16-k) 0s at the
   // end; in other words, regardless of the number of bits, it
   // wants to be compared against something shifted to have 16;
   // that way we don't need to shift inside the loop.
   temp = j->code_buffer >> 16;
   for (k=FAST_BITS+1 ; ; ++k)
      if (temp < h->maxcode[k])
         break;
   if (k == 17) {
      // error! code not found
      j->code_bits -= 16;
      return -1;
   }

   if (k > j->code_bits)
      return -1;

   // convert the huffman code to the symbol id
   c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);

   // convert the id to a symbol
   j->code_bits -= k;
   j->code_buffer <<= k;
   return h->values[c];
}

// bias[n] = (-1<<n) + 1
static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};

// combined JPEG 'receive' and JPEG 'extend', since baseline
// always extends everything it receives.
stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
{
   unsigned int k;
   int sgn;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);

   sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
   k = stbi_lrot(j->code_buffer, n);
   STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k + (stbi__jbias[n] & ~sgn);
}

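// The 'extend' half of stbi__extend_receive maps an n-bit field to a signed
// value: a field whose top bit is set is taken as-is, otherwise the bias
// (-1<<n)+1 is added. The sketch below isolates that mapping with an explicit
// branch instead of the sign-mask trick; demo_extend is a hypothetical name
// and the bits are passed in directly rather than read from a bitstream.

```c
#include <assert.h>

// bits holds an n-bit unsigned field, 0 <= n <= 15.
static int demo_extend(unsigned int bits, int n)
{
   assert(n >= 0 && n <= 15);
   assert(bits < (1u << n));
   if (n == 0) return 0;
   if (bits < (1u << (n - 1)))            // top bit clear: negative half of the range
      return (int) bits - (1 << n) + 1;   // same as adding the bias (-1<<n)+1
   return (int) bits;                     // top bit set: value is the field itself
}
```

// For n = 3 the fields 000..011 map to -7..-4 and 100..111 map to 4..7, which
// matches the stbi__jbias entry of -7 for n = 3.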
// get some unsigned bits
stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
{
   unsigned int k;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   k = stbi_lrot(j->code_buffer, n);
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k;
}

stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
{
   unsigned int k;
   if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
   k = j->code_buffer;
   j->code_buffer <<= 1;
   --j->code_bits;
   return k & 0x80000000;
}

// given a value that's at position X in the zigzag stream,
// where does it appear in the 8x8 matrix coded as row-major?
static stbi_uc stbi__jpeg_dezigzag[64+15] =
{
    0,  1,  8, 16,  9,  2,  3, 10,
   17, 24, 32, 25, 18, 11,  4,  5,
   12, 19, 26, 33, 40, 48, 41, 34,
   27, 20, 13,  6,  7, 14, 21, 28,
   35, 42, 49, 56, 57, 50, 43, 36,
   29, 22, 15, 23, 30, 37, 44, 51,
   58, 59, 52, 45, 38, 31, 39, 46,
   53, 60, 61, 54, 47, 55, 62, 63,
   // let corrupt input sample past end
   63, 63, 63, 63, 63, 63, 63, 63,
   63, 63, 63, 63, 63, 63, 63
};

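// The first 64 entries of the table above follow the usual zigzag walk over
// the anti-diagonals of an 8x8 block. As a cross-check, this sketch
// regenerates that order programmatically; demo_build_zigzag is a hypothetical
// helper and does not produce the 15 padding entries.

```c
#include <assert.h>

// Fill order[i] with the row-major index (row*8 + col) of the i-th zigzag position.
static void demo_build_zigzag(unsigned char order[64])
{
   int idx = 0, d, t;
   assert(order != 0);
   for (d = 0; d < 15; ++d) {             // d = row + col selects one anti-diagonal
      int lo = d < 8 ? 0 : d - 7;
      int hi = d < 8 ? d : 7;
      for (t = lo; t <= hi; ++t) {
         // odd diagonals are walked with the row increasing,
         // even diagonals with the row decreasing
         int row = (d & 1) ? t : d - t;
         int col = d - row;
         order[idx++] = (unsigned char) (row * 8 + col);
      }
   }
}
```

// The output starts 0, 1, 8, 16, 9, 2 and ends at 63, in agreement with the
// first eight rows of the table.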
// decode one 64-entry block--
static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
{
   int diff,dc,k;
   int t;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   t = stbi__jpeg_huff_decode(j, hdc);
   if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");

   // 0 all the ac values now so we can do it 32-bits at a time
   memset(data,0,64*sizeof(data[0]));

   diff = t ? stbi__extend_receive(j, t) : 0;
   dc = j->img_comp[b].dc_pred + diff;
   j->img_comp[b].dc_pred = dc;
   data[0] = (short) (dc * dequant[0]);

   // decode AC components, see JPEG spec
   k = 1;
   do {
      unsigned int zig;
      int c,r,s;
      if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
      c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
      r = fac[c];
      if (r) { // fast-AC path
         k += (r >> 4) & 15; // run
         s = r & 15; // combined length
         j->code_buffer <<= s;
         j->code_bits -= s;
         // decode into unzigzag'd location
         zig = stbi__jpeg_dezigzag[k++];
         data[zig] = (short) ((r >> 8) * dequant[zig]);
      } else {
         int rs = stbi__jpeg_huff_decode(j, hac);
         if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
         s = rs & 15;
         r = rs >> 4;
         if (s == 0) {
            if (rs != 0xf0) break; // end block
            k += 16;
         } else {
            k += r;
            // decode into unzigzag'd location
            zig = stbi__jpeg_dezigzag[k++];
            data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
         }
      }
   } while (k < 64);
   return 1;
}

static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
{
   int diff,dc;
   int t;
   if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   if (j->succ_high == 0) {
      // first scan for DC coefficient, must be first
      memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
      t = stbi__jpeg_huff_decode(j, hdc);
      diff = t ? stbi__extend_receive(j, t) : 0;

      dc = j->img_comp[b].dc_pred + diff;
      j->img_comp[b].dc_pred = dc;
      data[0] = (short) (dc << j->succ_low);
   } else {
      // refinement scan for DC coefficient
      if (stbi__jpeg_get_bit(j))
         data[0] += (short) (1 << j->succ_low);
   }
   return 1;
}

// @OPTIMIZE: store non-zigzagged during the decode passes,
// and only de-zigzag when dequantizing
static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
{
   int k;
   if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->succ_high == 0) {
      int shift = j->succ_low;

      if (j->eob_run) {
         --j->eob_run;
         return 1;
      }

      k = j->spec_start;
      do {
         unsigned int zig;
         int c,r,s;
         if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
         c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
         r = fac[c];
         if (r) { // fast-AC path
            k += (r >> 4) & 15; // run
            s = r & 15; // combined length
            j->code_buffer <<= s;
            j->code_bits -= s;
            zig = stbi__jpeg_dezigzag[k++];
            data[zig] = (short) ((r >> 8) << shift);
         } else {
            int rs = stbi__jpeg_huff_decode(j, hac);
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
            s = rs & 15;
            r = rs >> 4;
            if (s == 0) {
               if (r < 15) {
                  j->eob_run = (1 << r);
                  if (r)
                     j->eob_run += stbi__jpeg_get_bits(j, r);
                  --j->eob_run;
                  break;
               }
               k += 16;
            } else {
               k += r;
               zig = stbi__jpeg_dezigzag[k++];
               data[zig] = (short) (stbi__extend_receive(j,s) << shift);
            }
         }
      } while (k <= j->spec_end);
   } else {
      // refinement scan for these AC coefficients

      short bit = (short) (1 << j->succ_low);

      if (j->eob_run) {
         --j->eob_run;
         for (k = j->spec_start; k <= j->spec_end; ++k) {
            short *p = &data[stbi__jpeg_dezigzag[k]];
            if (*p != 0)
               if (stbi__jpeg_get_bit(j))
                  if ((*p & bit)==0) {
                     if (*p > 0)
                        *p += bit;
                     else
                        *p -= bit;
                  }
         }
      } else {
         k = j->spec_start;
         do {
            int r,s;
            int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
            s = rs & 15;
            r = rs >> 4;
            if (s == 0) {
               if (r < 15) {
                  j->eob_run = (1 << r) - 1;
                  if (r)
                     j->eob_run += stbi__jpeg_get_bits(j, r);
                  r = 64; // force end of block
               } else {
                  // r=15 s=0 should write 16 0s, so we just do
                  // a run of 15 0s and then write s (which is 0),
                  // so we don't have to do anything special here
               }
            } else {
               if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
               // sign bit
               if (stbi__jpeg_get_bit(j))
                  s = bit;
               else
                  s = -bit;
            }

            // advance by r
            while (k <= j->spec_end) {
               short *p = &data[stbi__jpeg_dezigzag[k++]];
               if (*p != 0) {
                  if (stbi__jpeg_get_bit(j))
                     if ((*p & bit)==0) {
                        if (*p > 0)
                           *p += bit;
                        else
                           *p -= bit;
                     }
               } else {
                  if (r == 0) {
                     *p = (short) s;
                     break;
                  }
                  --r;
               }
            }
         } while (k <= j->spec_end);
      }
   }
   return 1;
}

// take a -128..127 value and stbi__clamp it and convert to 0..255
stbi_inline static stbi_uc stbi__clamp(int x)
{
   // trick to use a single test to catch both cases
   if ((unsigned int) x > 255) {
      if (x < 0) return 0;
      if (x > 255) return 255;
   }
   return (stbi_uc) x;
}

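// The "single test" in stbi__clamp works because casting a negative int to
// unsigned produces a very large value, so one unsigned comparison against 255
// catches both out-of-range directions and in-range values take the short
// path. A standalone sketch of the same idea; demo_clamp_u8 is a hypothetical
// name, and <assert.h> is included only for the checks in the note below.

```c
#include <assert.h>

// Clamp an int to 0..255 using one unsigned range test for the common case.
static unsigned char demo_clamp_u8(int x)
{
   if ((unsigned int) x > 255)            // true for x < 0 and for x > 255
      return (unsigned char) (x < 0 ? 0 : 255);
   return (unsigned char) x;
}
```

// For example, demo_clamp_u8(-5) is 0, demo_clamp_u8(300) is 255 and
// demo_clamp_u8(100) stays 100.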
#define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
#define stbi__fsh(x)  ((x) << 12)

// derived from jidctint -- DCT_ISLOW
#define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
   int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   p2 = s2;                                    \
   p3 = s6;                                    \
   p1 = (p2+p3) * stbi__f2f(0.5411961f);       \
   t2 = p1 + p3*stbi__f2f(-1.847759065f);      \
   t3 = p1 + p2*stbi__f2f( 0.765366865f);      \
   p2 = s0;                                    \
   p3 = s4;                                    \
   t0 = stbi__fsh(p2+p3);                      \
   t1 = stbi__fsh(p2-p3);                      \
   x0 = t0+t3;                                 \
   x3 = t0-t3;                                 \
   x1 = t1+t2;                                 \
   x2 = t1-t2;                                 \
   t0 = s7;                                    \
   t1 = s5;                                    \
   t2 = s3;                                    \
   t3 = s1;                                    \
   p3 = t0+t2;                                 \
   p4 = t1+t3;                                 \
   p1 = t0+t3;                                 \
   p2 = t1+t2;                                 \
   p5 = (p3+p4)*stbi__f2f( 1.175875602f);      \
   t0 = t0*stbi__f2f( 0.298631336f);           \
   t1 = t1*stbi__f2f( 2.053119869f);           \
   t2 = t2*stbi__f2f( 3.072711026f);           \
   t3 = t3*stbi__f2f( 1.501321110f);           \
   p1 = p5 + p1*stbi__f2f(-0.899976223f);      \
   p2 = p5 + p2*stbi__f2f(-2.562915447f);      \
   p3 = p3*stbi__f2f(-1.961570560f);           \
   p4 = p4*stbi__f2f(-0.390180644f);           \
   t3 += p1+p4;                                \
   t2 += p2+p3;                                \
   t1 += p2+p4;                                \
   t0 += p1+p3;

static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
{
   int i,val[64],*v=val;
   stbi_uc *o;
   short *d = data;

   // columns
   for (i=0; i < 8; ++i,++d, ++v) {
      // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
      if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
           && d[40]==0 && d[48]==0 && d[56]==0) {
         //    no shortcut                 0     seconds
         //    (1|2|3|4|5|6|7)==0          0     seconds
         //    all separate               -0.047 seconds
         //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
         int dcterm = d[0] << 2;
         v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
      } else {
         STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
         // constants scaled things up by 1<<12; let's bring them back
         // down, but keep 2 extra bits of precision
         x0 += 512; x1 += 512; x2 += 512; x3 += 512;
         v[ 0] = (x0+t3) >> 10;
         v[56] = (x0-t3) >> 10;
         v[ 8] = (x1+t2) >> 10;
         v[48] = (x1-t2) >> 10;
         v[16] = (x2+t1) >> 10;
         v[40] = (x2-t1) >> 10;
         v[24] = (x3+t0) >> 10;
         v[32] = (x3-t0) >> 10;
      }
   }

   for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
      // no fast case since the first 1D IDCT spread components out
      STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
      // constants scaled things up by 1<<12, plus we had 1<<2 from first
      // loop, plus horizontal and vertical each scale by sqrt(8) so together
      // we've got an extra 1<<3, so 1<<17 total we need to remove.
      // so we want to round that, which means adding 0.5 * 1<<17,
      // aka 65536. Also, we'll end up with -128 to 127 that we want
      // to encode as 0..255 by adding 128, so we'll add that before the shift
      x0 += 65536 + (128<<17);
      x1 += 65536 + (128<<17);
      x2 += 65536 + (128<<17);
      x3 += 65536 + (128<<17);
      // tried computing the shifts into temps, or'ing the temps to see
      // if any were out of range, but that was slower
      o[0] = stbi__clamp((x0+t3) >> 17);
      o[7] = stbi__clamp((x0-t3) >> 17);
      o[1] = stbi__clamp((x1+t2) >> 17);
      o[6] = stbi__clamp((x1-t2) >> 17);
      o[2] = stbi__clamp((x2+t1) >> 17);
      o[5] = stbi__clamp((x2-t1) >> 17);
      o[3] = stbi__clamp((x3+t0) >> 17);
      o[4] = stbi__clamp((x3-t0) >> 17);
   }
}

2027 // sse2 integer IDCT. not the fastest possible implementation but it
2028 // produces bit-identical results to the generic C version so it's
2029 // fully "transparent".
2030 static void stbi__idct_simd(stbi_uc
*out
, int out_stride
, short data
[64])
2032 // This is constructed to match our regular (generic) integer IDCT exactly.
2033 __m128i row0
, row1
, row2
, row3
, row4
, row5
, row6
, row7
;
2036 // dot product constant: even elems=x, odd elems=y
2037 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2039 // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2040 // out(1) = c1[even]*x + c1[odd]*y
2041 #define dct_rot(out0,out1, x,y,c0,c1) \
2042 __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2043 __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2044 __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2045 __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2046 __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2047 __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2049 // out = in << 12 (in 16-bit, out 32-bit)
2050 #define dct_widen(out, in) \
2051 __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2052 __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2055 #define dct_wadd(out, a, b) \
2056 __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2057 __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2060 #define dct_wsub(out, a, b) \
2061 __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2062 __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2064 // butterfly a/b, add bias, then shift by "s" and pack
2065 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2067 __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2068 __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2069 dct_wadd(sum, abiased, b); \
2070 dct_wsub(dif, abiased, b); \
2071 out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2072 out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2075 // 8-bit interleave step (for transposes)
2076 #define dct_interleave8(a, b) \
2078 a = _mm_unpacklo_epi8(a, b); \
2079 b = _mm_unpackhi_epi8(tmp, b)
2081 // 16-bit interleave step (for transposes)
2082 #define dct_interleave16(a, b) \
2084 a = _mm_unpacklo_epi16(a, b); \
2085 b = _mm_unpackhi_epi16(tmp, b)
2087 #define dct_pass(bias,shift) \
2090 dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2091 __m128i sum04 = _mm_add_epi16(row0, row4); \
2092 __m128i dif04 = _mm_sub_epi16(row0, row4); \
2093 dct_widen(t0e, sum04); \
2094 dct_widen(t1e, dif04); \
2095 dct_wadd(x0, t0e, t3e); \
2096 dct_wsub(x3, t0e, t3e); \
2097 dct_wadd(x1, t1e, t2e); \
2098 dct_wsub(x2, t1e, t2e); \
2100 dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2101 dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2102 __m128i sum17 = _mm_add_epi16(row1, row7); \
2103 __m128i sum35 = _mm_add_epi16(row3, row5); \
2104 dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2105 dct_wadd(x4, y0o, y4o); \
2106 dct_wadd(x5, y1o, y5o); \
2107 dct_wadd(x6, y2o, y5o); \
2108 dct_wadd(x7, y3o, y4o); \
2109 dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2110 dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2111 dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2112 dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2115 __m128i rot0_0
= dct_const(stbi__f2f(0.5411961f
), stbi__f2f(0.5411961f
) + stbi__f2f(-1.847759065f
));
2116 __m128i rot0_1
= dct_const(stbi__f2f(0.5411961f
) + stbi__f2f( 0.765366865f
), stbi__f2f(0.5411961f
));
2117 __m128i rot1_0
= dct_const(stbi__f2f(1.175875602f
) + stbi__f2f(-0.899976223f
), stbi__f2f(1.175875602f
));
2118 __m128i rot1_1
= dct_const(stbi__f2f(1.175875602f
), stbi__f2f(1.175875602f
) + stbi__f2f(-2.562915447f
));
2119 __m128i rot2_0
= dct_const(stbi__f2f(-1.961570560f
) + stbi__f2f( 0.298631336f
), stbi__f2f(-1.961570560f
));
2120 __m128i rot2_1
= dct_const(stbi__f2f(-1.961570560f
), stbi__f2f(-1.961570560f
) + stbi__f2f( 3.072711026f
));
2121 __m128i rot3_0
= dct_const(stbi__f2f(-0.390180644f
) + stbi__f2f( 2.053119869f
), stbi__f2f(-0.390180644f
));
2122 __m128i rot3_1
= dct_const(stbi__f2f(-0.390180644f
), stbi__f2f(-0.390180644f
) + stbi__f2f( 1.501321110f
));
2124 // rounding biases in column/row passes, see stbi__idct_block for explanation.
2125 __m128i bias_0
= _mm_set1_epi32(512);
2126 __m128i bias_1
= _mm_set1_epi32(65536 + (128<<17));
2129 row0
= _mm_load_si128((const __m128i
*) (data
+ 0*8));
2130 row1
= _mm_load_si128((const __m128i
*) (data
+ 1*8));
2131 row2
= _mm_load_si128((const __m128i
*) (data
+ 2*8));
2132 row3
= _mm_load_si128((const __m128i
*) (data
+ 3*8));
2133 row4
= _mm_load_si128((const __m128i
*) (data
+ 4*8));
2134 row5
= _mm_load_si128((const __m128i
*) (data
+ 5*8));
2135 row6
= _mm_load_si128((const __m128i
*) (data
+ 6*8));
2136 row7
= _mm_load_si128((const __m128i
*) (data
+ 7*8));
2139 dct_pass(bias_0
, 10);

      // 16bit 8x8 transpose pass 1
      dct_interleave16(row0, row4);
      dct_interleave16(row1, row5);
      dct_interleave16(row2, row6);
      dct_interleave16(row3, row7);

      dct_interleave16(row0, row2);
      dct_interleave16(row1, row3);
      dct_interleave16(row4, row6);
      dct_interleave16(row5, row7);

      dct_interleave16(row0, row1);
      dct_interleave16(row2, row3);
      dct_interleave16(row4, row5);
      dct_interleave16(row6, row7);

   dct_pass(bias_1, 17);

      __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
      __m128i p1 = _mm_packus_epi16(row2, row3);
      __m128i p2 = _mm_packus_epi16(row4, row5);
      __m128i p3 = _mm_packus_epi16(row6, row7);

      // 8bit 8x8 transpose pass 1
      dct_interleave8(p0, p2); // a0e0a1e1...
      dct_interleave8(p1, p3); // c0g0c1g1...

      dct_interleave8(p0, p1); // a0c0e0g0...
      dct_interleave8(p2, p3); // b0d0f0h0...

      dct_interleave8(p0, p2); // a0b0c0d0...
      dct_interleave8(p1, p3); // a4b4c4d4...

      _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));

#undef dct_interleave8
#undef dct_interleave16

// NEON integer IDCT. should produce bit-identical
// results to the generic C version.
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
   int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;

   int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
   int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
   int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
   int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
   int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
   int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
   int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
   int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
   int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
   int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
   int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
   int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));

#define dct_long_mul(out, inq, coeff) \
   int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)

#define dct_long_mac(out, acc, inq, coeff) \
   int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)

#define dct_widen(out, inq) \
   int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
   int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)

#define dct_wadd(out, a, b) \
   int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vaddq_s32(a##_h, b##_h)

#define dct_wsub(out, a, b) \
   int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vsubq_s32(a##_h, b##_h)

// butterfly a/b, then shift using "shiftop" by "s" and pack
#define dct_bfly32o(out0,out1, a,b,shiftop,s) \
      dct_wadd(sum, a, b); \
      dct_wsub(dif, a, b); \
      out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
      out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \

#define dct_pass(shiftop, shift) \
      int16x8_t sum26 = vaddq_s16(row2, row6); \
      dct_long_mul(p1e, sum26, rot0_0); \
      dct_long_mac(t2e, p1e, row6, rot0_1); \
      dct_long_mac(t3e, p1e, row2, rot0_2); \
      int16x8_t sum04 = vaddq_s16(row0, row4); \
      int16x8_t dif04 = vsubq_s16(row0, row4); \
      dct_widen(t0e, sum04); \
      dct_widen(t1e, dif04); \
      dct_wadd(x0, t0e, t3e); \
      dct_wsub(x3, t0e, t3e); \
      dct_wadd(x1, t1e, t2e); \
      dct_wsub(x2, t1e, t2e); \
      int16x8_t sum15 = vaddq_s16(row1, row5); \
      int16x8_t sum17 = vaddq_s16(row1, row7); \
      int16x8_t sum35 = vaddq_s16(row3, row5); \
      int16x8_t sum37 = vaddq_s16(row3, row7); \
      int16x8_t sumodd = vaddq_s16(sum17, sum35); \
      dct_long_mul(p5o, sumodd, rot1_0); \
      dct_long_mac(p1o, p5o, sum17, rot1_1); \
      dct_long_mac(p2o, p5o, sum35, rot1_2); \
      dct_long_mul(p3o, sum37, rot2_0); \
      dct_long_mul(p4o, sum15, rot2_1); \
      dct_wadd(sump13o, p1o, p3o); \
      dct_wadd(sump24o, p2o, p4o); \
      dct_wadd(sump23o, p2o, p3o); \
      dct_wadd(sump14o, p1o, p4o); \
      dct_long_mac(x4, sump13o, row7, rot3_0); \
      dct_long_mac(x5, sump24o, row5, rot3_1); \
      dct_long_mac(x6, sump23o, row3, rot3_2); \
      dct_long_mac(x7, sump14o, row1, rot3_3); \
      dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
      dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
      dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
      dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \

   row0 = vld1q_s16(data + 0*8);
   row1 = vld1q_s16(data + 1*8);
   row2 = vld1q_s16(data + 2*8);
   row3 = vld1q_s16(data + 3*8);
   row4 = vld1q_s16(data + 4*8);
   row5 = vld1q_s16(data + 5*8);
   row6 = vld1q_s16(data + 6*8);
   row7 = vld1q_s16(data + 7*8);

   row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));

   dct_pass(vrshrn_n_s32, 10);

      // 16bit 8x8 transpose

// these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
// whether compilers actually get this is another story, sadly.
#define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
#define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }

      dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
      dct_trn16(row2, row3);
      dct_trn16(row4, row5);
      dct_trn16(row6, row7);

      dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
      dct_trn32(row1, row3);
      dct_trn32(row4, row6);
      dct_trn32(row5, row7);

      dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
      dct_trn64(row1, row5);
      dct_trn64(row2, row6);
      dct_trn64(row3, row7);

   // vrshrn_n_s32 only supports shifts up to 16, we need
   // 17. so do a non-rounding shift of 16 first then follow
   // up with a rounding shift by 1.
   dct_pass(vshrn_n_s32, 16);

      uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
      uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
      uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
      uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
      uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
      uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
      uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
      uint8x8_t p7 = vqrshrun_n_s16(row7, 1);

      // again, these can translate into one instruction, but often don't.
#define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
#define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }

      // sadly can't use interleaved stores here since we only write
      // 8 bytes to each scan line!

      // 8x8 8-bit transpose pass 1
      dct_trn8_16(p0, p2);
      dct_trn8_16(p1, p3);
      dct_trn8_16(p4, p6);
      dct_trn8_16(p5, p7);

      dct_trn8_32(p0, p4);
      dct_trn8_32(p1, p5);
      dct_trn8_32(p2, p6);
      dct_trn8_32(p3, p7);

      vst1_u8(out, p0); out += out_stride;
      vst1_u8(out, p1); out += out_stride;
      vst1_u8(out, p2); out += out_stride;
      vst1_u8(out, p3); out += out_stride;
      vst1_u8(out, p4); out += out_stride;
      vst1_u8(out, p5); out += out_stride;
      vst1_u8(out, p6); out += out_stride;

#define STBI__MARKER_none  0xff
// if there's a pending marker from the entropy stream, return that
// otherwise, fetch from the stream and get a marker. if there's no
// marker, return 0xff, which is never a valid marker value
static stbi_uc stbi__get_marker(stbi__jpeg *j)
   if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   x = stbi__get8(j->s);
   if (x != 0xff) return STBI__MARKER_none;
   x = stbi__get8(j->s);

// in each scan, we'll have scan_n components, and the order
// of the components is specified by order[]
#define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)

// after a restart interval, reset the entropy decoder and
// the dc prediction
static void stbi__jpeg_reset(stbi__jpeg *j)
   j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
   j->marker = STBI__MARKER_none;
   j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
   // since we don't even allow 1<<30 pixels
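
// A minimal sketch of the restart bookkeeping described above, kept apart
// from the decoder. The sketch_* names are hypothetical and not part of
// stb_image; the sketch assumes RST0..RST7 are 0xd0..0xd7 (the range that
// STBI__RESTART tests) and that a zero restart interval means "never restart".
```c
#include <assert.h>

/* Hypothetical stand-in for the two fields the reset logic touches. */
typedef struct {
   int restart_interval;  /* MCUs between restart markers, 0 if none */
   int todo;              /* MCUs left before a restart marker is expected */
} sketch_restart_state;

/* Same range test as STBI__RESTART. */
static int sketch_is_restart(int marker)
{
   return marker >= 0xd0 && marker <= 0xd7;
}

/* Reload the countdown; with no interval, use a value too large to reach. */
static void sketch_reset(sketch_restart_state *st)
{
   assert(st->restart_interval >= 0);
   st->todo = st->restart_interval ? st->restart_interval : 0x7fffffff;
}

/* Count one decoded MCU; returns 1 when a restart marker is now due. */
static int sketch_count_mcu(sketch_restart_state *st)
{
   return --st->todo <= 0;
}
```
// When sketch_count_mcu reports 1, the decoder above checks the pending
// marker with STBI__RESTART and either resets or stops decoding the scan.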

static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
   stbi__jpeg_reset(z);
   if (!z->progressive) {
      if (z->scan_n == 1) {
         STBI_SIMD_ALIGN(short, data[64]);
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               int ha = z->img_comp[n].ha;
               if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
               // every data block is an MCU, so countdown the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  // if it's NOT a restart, then just bail, so we get corrupt data
                  // rather than no data
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
      } else { // interleaved
         STBI_SIMD_ALIGN(short, data[64]);
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x)*8;
                        int y2 = (j*z->img_comp[n].v + y)*8;
                        int ha = z->img_comp[n].ha;
                        if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
                        z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
      if (z->scan_n == 1) {
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               if (z->spec_start == 0) {
                  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
                  int ha = z->img_comp[n].ha;
                  if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
               // every data block is an MCU, so countdown the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
      } else { // interleaved
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x);
                        int y2 = (j*z->img_comp[n].v + y);
                        short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
                        if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);

static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
   for (i=0; i < 64; ++i)
      data[i] *= dequant[i];

static void stbi__jpeg_finish(stbi__jpeg *z)
   if (z->progressive) {
      // dequantize and idct the data
      for (n=0; n < z->s->img_n; ++n) {
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);

static int stbi__process_marker(stbi__jpeg *z, int m)
      case STBI__MARKER_none: // no marker found
         return stbi__err("expected marker","Corrupt JPEG");

      case 0xDD: // DRI - specify restart interval
         if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
         z->restart_interval = stbi__get16be(z->s);

      case 0xDB: // DQT - define quantization table
         L = stbi__get16be(z->s)-2;
            int q = stbi__get8(z->s);
            if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
            if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
            for (i=0; i < 64; ++i)
               z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);

      case 0xC4: // DHT - define huffman table
         L = stbi__get16be(z->s)-2;
            int sizes[16],i,n=0;
            int q = stbi__get8(z->s);
            if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
            for (i=0; i < 16; ++i) {
               sizes[i] = stbi__get8(z->s);
               if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
               v = z->huff_dc[th].values;
               if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
               v = z->huff_ac[th].values;
            for (i=0; i < n; ++i)
               v[i] = stbi__get8(z->s);
               stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
   // check for comment block or APP blocks
   if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
      stbi__skip(z->s, stbi__get16be(z->s)-2);

static int stbi__process_scan_header(stbi__jpeg *z)
   int Ls = stbi__get16be(z->s);
   z->scan_n = stbi__get8(z->s);
   if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
   if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
   for (i=0; i < z->scan_n; ++i) {
      int id = stbi__get8(z->s), which;
      int q = stbi__get8(z->s);
      for (which = 0; which < z->s->img_n; ++which)
         if (z->img_comp[which].id == id)
      if (which == z->s->img_n) return 0; // no match
      z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
      z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
      z->order[i] = which;
      z->spec_start = stbi__get8(z->s);
      z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
      aa = stbi__get8(z->s);
      z->succ_high = (aa >> 4);
      z->succ_low  = (aa & 15);
      if (z->progressive) {
         if (z->spec_start > 63 || z->spec_end > 63  || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
            return stbi__err("bad SOS", "Corrupt JPEG");
         if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
         if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
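
// A small sketch of two checks the scan-header parser above relies on: the
// expected SOS segment length and the split of a packed byte into two 4-bit
// fields (DC/AC table selectors, or the successive-approximation high/low
// bits). The sketch_* helpers are hypothetical names, not stb_image API.
```c
#include <assert.h>

/* Expected SOS length for scan_n components: 2 length bytes, 1 count byte,
   2 bytes per component, then spectral start, spectral end and one packed
   successive-approximation byte, i.e. 6 + 2*scan_n. */
static int sketch_sos_length(int scan_n)
{
   assert(scan_n >= 1 && scan_n <= 4);
   return 6 + 2 * scan_n;
}

/* Split a byte into its high and low nibbles. */
static void sketch_split_nibbles(int packed, int *high, int *low)
{
   assert(packed >= 0 && packed <= 255);
   *high = packed >> 4;
   *low  = packed & 15;
}
```
// A three-component scan therefore carries a length field of 12, and a
// selector byte of 0x10 picks DC table 1 and AC table 0.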

static int stbi__process_frame_header(stbi__jpeg *z, int scan)
   stbi__context *s = z->s;
   int Lf,p,i,q, h_max=1,v_max=1,c;
   Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
   p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
   if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG");    // JFIF requires
   for (i=0; i < c; ++i) {
      z->img_comp[i].data = NULL;
      z->img_comp[i].linebuf = NULL;
   if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
   for (i=0; i < s->img_n; ++i) {
      z->img_comp[i].id = stbi__get8(s);
      if (z->img_comp[i].id != i+1)   // JFIF requires
         if (z->img_comp[i].id != i)  // some version of jpegtran outputs non-JFIF-compliant files!
            return stbi__err("bad component ID","Corrupt JPEG");
      z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
      z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
      z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
   if (scan != STBI__SCAN_load) return 1;

   if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");

   for (i=0; i < s->img_n; ++i) {
      if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
      if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;

   // compute interleaved mcu info
   z->img_h_max = h_max;
   z->img_v_max = v_max;
   z->img_mcu_w = h_max * 8;
   z->img_mcu_h = v_max * 8;
   z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;

   for (i=0; i < s->img_n; ++i) {
      // number of effective pixels (e.g. for non-interleaved MCU)
      z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
      z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
      // to simplify generation, we'll allocate enough memory to decode
      // the bogus oversized data from using interleaved MCUs and their
      // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
      // discard the extra data until colorspace conversion
      z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
      z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
      z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);

      if (z->img_comp[i].raw_data == NULL) {
         for(--i; i >= 0; --i) {
            STBI_FREE(z->img_comp[i].raw_data);
            z->img_comp[i].raw_data = NULL;
         return stbi__err("outofmem", "Out of memory");
      // align blocks for idct using mmx/sse
      z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
      z->img_comp[i].linebuf = NULL;
      if (z->progressive) {
         z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
         z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
         z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
         z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
         z->img_comp[i].coeff = 0;
         z->img_comp[i].raw_coeff = 0;
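
// The MCU bookkeeping above can be sketched in isolation. This is only an
// illustration of the rounding arithmetic: sketch_mcu_info and the sketch_*
// functions are hypothetical, and the example uses the "16x16 iMCU on an
// image of width 33" case that the comment above mentions.
```c
#include <assert.h>

typedef struct {
   int mcu_w, mcu_h;   /* interleaved MCU size in pixels */
   int mcu_x, mcu_y;   /* number of MCUs across and down */
} sketch_mcu_info;

/* Derive the interleaved MCU grid from the largest sampling factors. */
static sketch_mcu_info sketch_mcu_layout(int img_x, int img_y, int h_max, int v_max)
{
   sketch_mcu_info m;
   assert(img_x > 0 && img_y > 0 && h_max >= 1 && v_max >= 1);
   m.mcu_w = h_max * 8;
   m.mcu_h = v_max * 8;
   m.mcu_x = (img_x + m.mcu_w - 1) / m.mcu_w;   /* round up */
   m.mcu_y = (img_y + m.mcu_h - 1) / m.mcu_h;   /* round up */
   return m;
}

/* Effective width in samples of a component with horizontal factor h. */
static int sketch_comp_width(int img_x, int h, int h_max)
{
   return (img_x * h + h_max - 1) / h_max;
}

/* Padded row stride allocated for that component (w2 in the code above). */
static int sketch_comp_stride(int mcu_x, int h)
{
   return mcu_x * h * 8;
}
```
// For a 33-pixel-wide image with h_max = 2, three MCUs span the row, so a
// full-resolution component gets a 48-byte stride for 33 real samples.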

// use comparisons since in some cases we handle more than one case (e.g. SOF)
#define stbi__DNL(x)         ((x) == 0xdc)
#define stbi__SOI(x)         ((x) == 0xd8)
#define stbi__EOI(x)         ((x) == 0xd9)
#define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
#define stbi__SOS(x)         ((x) == 0xda)

#define stbi__SOF_progressive(x)   ((x) == 0xc2)

static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
   z->marker = STBI__MARKER_none; // initialize cached marker to empty
   m = stbi__get_marker(z);
   if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
   if (scan == STBI__SCAN_type) return 1;
   m = stbi__get_marker(z);
   while (!stbi__SOF(m)) {
      if (!stbi__process_marker(z,m)) return 0;
      m = stbi__get_marker(z);
      while (m == STBI__MARKER_none) {
         // some files have extra padding after their blocks, so ok, we'll scan
         if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
         m = stbi__get_marker(z);
   z->progressive = stbi__SOF_progressive(m);
   if (!stbi__process_frame_header(z, scan)) return 0;

// decode image to YCbCr format
static int stbi__decode_jpeg_image(stbi__jpeg *j)
   for (m = 0; m < 4; m++) {
      j->img_comp[m].raw_data = NULL;
      j->img_comp[m].raw_coeff = NULL;
   j->restart_interval = 0;
   if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
   m = stbi__get_marker(j);
   while (!stbi__EOI(m)) {
         if (!stbi__process_scan_header(j)) return 0;
         if (!stbi__parse_entropy_coded_data(j)) return 0;
         if (j->marker == STBI__MARKER_none) {
            // handle 0s at the end of image data from IP Kamera 9060
            while (!stbi__at_eof(j->s)) {
               int x = stbi__get8(j->s);
                  j->marker = stbi__get8(j->s);
               } else if (x != 0) {
                  return stbi__err("junk before marker", "Corrupt JPEG");
            // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
         if (!stbi__process_marker(j, m)) return 0;
      m = stbi__get_marker(j);
   stbi__jpeg_finish(j);

// static jfif-centered resampling (across block boundaries)

typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,

#define stbi__div4(x) ((stbi_uc) ((x) >> 2))

static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   STBI_NOTUSED(in_far);

static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   // need to generate two samples vertically for every one in input
   for (i=0; i < w; ++i)
      out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
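
// A standalone sketch of the vertical 3:1 filter above: each output sample
// weights the nearer input row by 3 and the farther row by 1, with +2 so the
// shift by 2 rounds to nearest. The sketch_* names are hypothetical. The
// second helper shows the rewrite 3*x + y = 4*x + (y - x), which replaces the
// multiply with a shift and a subtract.
```c
#include <assert.h>

/* Blend two rows with weights 3 (near) and 1 (far), rounding to nearest. */
static void sketch_resample_v2(unsigned char *out, const unsigned char *in_near,
                               const unsigned char *in_far, int w)
{
   int i;
   assert(w >= 0);
   for (i = 0; i < w; ++i)
      out[i] = (unsigned char) ((3 * in_near[i] + in_far[i] + 2) >> 2);
}

/* 3*x + y computed as 4*x + (y - x); valid for any ints that do not overflow. */
static int sketch_three_to_one(int x, int y)
{
   return (x * 4) + (y - x);
}
```
// The weighted sum is at most 4*255 + 2 before the shift, so the result
// always fits back into a byte without clamping.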

static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   // need to generate two samples horizontally for every one in input
   stbi_uc *input = in_near;
      // if only one sample, can't do any interpolation
      out[0] = out[1] = input[0];
   out[1] = stbi__div4(input[0]*3 + input[1] + 2);
   for (i=1; i < w-1; ++i) {
      int n = 3*input[i]+2;
      out[i*2+0] = stbi__div4(n+input[i-1]);
      out[i*2+1] = stbi__div4(n+input[i+1]);
   out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
   out[i*2+1] = input[w-1];
   STBI_NOTUSED(in_far);

#define stbi__div16(x) ((stbi_uc) ((x) >> 4))

static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   // need to generate 2x2 samples for every one in input
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
   t1 = 3*in_near[0] + in_far[0];
   out[0] = stbi__div4(t1+2);
   for (i=1; i < w; ++i) {
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   out[w*2-1] = stbi__div4(t1+2);

#if defined(STBI_SSE2) || defined(STBI_NEON)
static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   // need to generate 2x2 samples for every one in input
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
   t1 = 3*in_near[0] + in_far[0];
   // process groups of 8 pixels for as long as we can.
   // note we can't handle the last pixel in a row in this loop
   // because we need to handle the filter boundary conditions.
   for (; i < ((w-1) & ~7); i += 8) {
#if defined(STBI_SSE2)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      __m128i zero  = _mm_setzero_si128();
      __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
      __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
      __m128i farw  = _mm_unpacklo_epi8(farb, zero);
      __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
      __m128i diff  = _mm_sub_epi16(farw, nearw);
      __m128i nears = _mm_slli_epi16(nearw, 2);
      __m128i curr  = _mm_add_epi16(nears, diff); // current row

      // horizontal filter works the same based on shifted vers of current
      // row. "prev" is current row shifted right by 1 pixel; we need to
      // insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      __m128i prv0 = _mm_slli_si128(curr, 2);
      __m128i nxt0 = _mm_srli_si128(curr, 2);
      __m128i prev = _mm_insert_epi16(prv0, t1, 0);
      __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      __m128i bias  = _mm_set1_epi16(8);
      __m128i curs = _mm_slli_epi16(curr, 2);
      __m128i prvd = _mm_sub_epi16(prev, curr);
      __m128i nxtd = _mm_sub_epi16(next, curr);
      __m128i curb = _mm_add_epi16(curs, bias);
      __m128i even = _mm_add_epi16(prvd, curb);
      __m128i odd  = _mm_add_epi16(nxtd, curb);

      // interleave even and odd pixels, then undo scaling.
      __m128i int0 = _mm_unpacklo_epi16(even, odd);
      __m128i int1 = _mm_unpackhi_epi16(even, odd);
      __m128i de0  = _mm_srli_epi16(int0, 4);
      __m128i de1  = _mm_srli_epi16(int1, 4);

      // pack and write output
      __m128i outv = _mm_packus_epi16(de0, de1);
      _mm_storeu_si128((__m128i *) (out + i*2), outv);
#elif defined(STBI_NEON)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      uint8x8_t farb  = vld1_u8(in_far + i);
      uint8x8_t nearb = vld1_u8(in_near + i);
      int16x8_t diff  = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
      int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
      int16x8_t curr  = vaddq_s16(nears, diff); // current row

      // horizontal filter works the same based on shifted vers of current
      // row. "prev" is current row shifted right by 1 pixel; we need to
      // insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      int16x8_t prv0 = vextq_s16(curr, curr, 7);
      int16x8_t nxt0 = vextq_s16(curr, curr, 1);
      int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
      int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      int16x8_t curs = vshlq_n_s16(curr, 2);
      int16x8_t prvd = vsubq_s16(prev, curr);
      int16x8_t nxtd = vsubq_s16(next, curr);
      int16x8_t even = vaddq_s16(curs, prvd);
      int16x8_t odd  = vaddq_s16(curs, nxtd);

      // undo scaling and round, then store with even/odd phases interleaved
      o.val[0] = vqrshrun_n_s16(even, 4);
      o.val[1] = vqrshrun_n_s16(odd,  4);
      vst2_u8(out + i*2, o);

      // "previous" value for next iter
      t1 = 3*in_near[i+7] + in_far[i+7];

   t1 = 3*in_near[i] + in_far[i];
   out[i*2] = stbi__div16(3*t1 + t0 + 8);

   for (++i; i < w; ++i) {
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   out[w*2-1] = stbi__div4(t1+2);

static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
   // resample with nearest-neighbor
   STBI_NOTUSED(in_far);
   for (i=0; i < w; ++i)
      for (j=0; j < hs; ++j)
         out[i*hs+j] = in_near[i];

#ifdef STBI_JPEG_OLD
// this is the same YCbCr-to-RGB calculation that stb_image has used
// historically before the algorithm changes in 1.49
#define float2fixed(x)  ((int) ((x) * 65536 + 0.5))
static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
   for (i=0; i < count; ++i) {
      int y_fixed = (y[i] << 16) + 32768; // rounding
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed + cr*float2fixed(1.40200f);
      g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
      b = y_fixed                            + cb*float2fixed(1.77200f);
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
3090 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3091 // to make sure the code produces the same results in both SIMD and scalar
3092 #define float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
3093 static void stbi__YCbCr_to_RGB_row(stbi_uc
*out
, const stbi_uc
*y
, const stbi_uc
*pcb
, const stbi_uc
*pcr
, int count
, int step
)
3096 for (i
=0; i
< count
; ++i
) {
3097 int y_fixed
= (y
[i
] << 20) + (1<<19); // rounding
3099 int cr
= pcr
[i
] - 128;
3100 int cb
= pcb
[i
] - 128;
3101 r
= y_fixed
+ cr
* float2fixed(1.40200f
);
3102 g
= y_fixed
+ (cr
*-float2fixed(0.71414f
)) + ((cb
*-float2fixed(0.34414f
)) & 0xffff0000);
3103 b
= y_fixed
+ cb
* float2fixed(1.77200f
);
3107 if ((unsigned) r
> 255) { if (r
< 0) r
= 0; else r
= 255; }
3108 if ((unsigned) g
> 255) { if (g
< 0) g
= 0; else g
= 255; }
3109 if ((unsigned) b
> 255) { if (b
< 0) b
= 0; else b
= 255; }
3110 out
[0] = (stbi_uc
)r
;
3111 out
[1] = (stbi_uc
)g
;
3112 out
[2] = (stbi_uc
)b
;
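
// A simplified single-pixel sketch of the 20-bit fixed-point YCbCr-to-RGB
// conversion. It is not the library's routine: it omits the 0xffff0000 mask
// that keeps the scalar path bit-identical to the SIMD path, and it divides
// instead of right-shifting so that negative intermediates stay well defined.
// SKETCH_FIXED, sketch_clamp and sketch_ycbcr_to_rgb are hypothetical names.

```c
#include <assert.h>

#define SKETCH_FIXED(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)

// Clamp an integer to the 0..255 range of an 8-bit channel.
static int sketch_clamp(int v)
{
   return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// Convert one pixel; returns the result packed as 0xRRGGBB.
static int sketch_ycbcr_to_rgb(int y, int cb, int cr)
{
   int y_fixed, r, g, b;
   assert(y >= 0 && y <= 255 && cb >= 0 && cb <= 255 && cr >= 0 && cr <= 255);
   y_fixed = y * (1 << 20) + (1 << 19);   // luma in 12.20 fixed point, plus rounding
   cr -= 128;                             // center the chroma samples on zero
   cb -= 128;
   r = y_fixed + cr * SKETCH_FIXED(1.40200f);
   g = y_fixed - cr * SKETCH_FIXED(0.71414f) - cb * SKETCH_FIXED(0.34414f);
   b = y_fixed + cb * SKETCH_FIXED(1.77200f);
   r = sketch_clamp(r / (1 << 20));       // drop the 20 fractional bits, then clamp
   g = sketch_clamp(g / (1 << 20));
   b = sketch_clamp(b / (1 << 20));
   return (r << 16) | (g << 8) | b;
}
```

// Neutral chroma (Cb = Cr = 128) leaves the luma value unchanged in all three
// channels, which makes a convenient sanity check for the constants.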
#if defined(STBI_SSE2) || defined(STBI_NEON)
static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
{
   int i = 0;

#ifdef STBI_SSE2
   // step == 3 is pretty ugly on the final interleave, and i'm not convinced
   // it's useful in practice (you wouldn't use it for textures, for example).
   // so just accelerate step == 4 case.
   if (step == 4) {
      // this is a fairly straightforward implementation and not super-optimized.
      __m128i signflip  = _mm_set1_epi8(-0x80);
      __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
      __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
      __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
      __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
      __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
      __m128i xw = _mm_set1_epi16(255); // alpha channel

      for (; i+7 < count; i += 8) {
         // load
         __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
         __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
         __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
         __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
         __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128

         // unpack to short (and left-shift cr, cb by 8)
         __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
         __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
         __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);

         // color transform
         __m128i yws = _mm_srli_epi16(yw, 4);
         __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
         __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
         __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
         __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
         __m128i rws = _mm_add_epi16(cr0, yws);
         __m128i gwt = _mm_add_epi16(cb0, yws);
         __m128i bws = _mm_add_epi16(yws, cb1);
         __m128i gws = _mm_add_epi16(gwt, cr1);

         // descale
         __m128i rw = _mm_srai_epi16(rws, 4);
         __m128i bw = _mm_srai_epi16(bws, 4);
         __m128i gw = _mm_srai_epi16(gws, 4);

         // back to byte, set up for transpose
         __m128i brb = _mm_packus_epi16(rw, bw);
         __m128i gxb = _mm_packus_epi16(gw, xw);

         // transpose to interleave channels
         __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
         __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
         __m128i o0 = _mm_unpacklo_epi16(t0, t1);
         __m128i o1 = _mm_unpackhi_epi16(t0, t1);

         // store
         _mm_storeu_si128((__m128i *) (out + 0), o0);
         _mm_storeu_si128((__m128i *) (out + 16), o1);
         out += 32;
      }
   }
#endif

#ifdef STBI_NEON
   // in this version, step=3 support would be easy to add. but is there demand?
   if (step == 4) {
      // this is a fairly straightforward implementation and not super-optimized.
      uint8x8_t signflip = vdup_n_u8(0x80);
      int16x8_t cr_const0 = vdupq_n_s16(   (short) ( 1.40200f*4096.0f+0.5f));
      int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
      int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
      int16x8_t cb_const1 = vdupq_n_s16(   (short) ( 1.77200f*4096.0f+0.5f));

      for (; i+7 < count; i += 8) {
         // load
         uint8x8_t y_bytes  = vld1_u8(y + i);
         uint8x8_t cr_bytes = vld1_u8(pcr + i);
         uint8x8_t cb_bytes = vld1_u8(pcb + i);
         int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
         int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));

         // expand to s16
         int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
         int16x8_t crw = vshll_n_s8(cr_biased, 7);
         int16x8_t cbw = vshll_n_s8(cb_biased, 7);

         // color transform
         int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
         int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
         int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
         int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
         int16x8_t rws = vaddq_s16(yws, cr0);
         int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
         int16x8_t bws = vaddq_s16(yws, cb1);

         // undo scaling, round, convert to byte
         uint8x8x4_t o;
         o.val[0] = vqrshrun_n_s16(rws, 4);
         o.val[1] = vqrshrun_n_s16(gws, 4);
         o.val[2] = vqrshrun_n_s16(bws, 4);
         o.val[3] = vdup_n_u8(255);

         // store, interleaving r/g/b/a
         vst4_u8(out, o);
         out += 8*4;
      }
   }
#endif

   // scalar tail: handles whatever the SIMD loops above did not consume
   for (; i < count; ++i) {
      int y_fixed = (y[i] << 20) + (1<<19); // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed + cr* float2fixed(1.40200f);
      g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
      b = y_fixed                             +   cb* float2fixed(1.77200f);
      r >>= 20;
      g >>= 20;
      b >>= 20;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#endif
// set up the kernels
static void stbi__setup_jpeg(stbi__jpeg *j)
{
   j->idct_block_kernel = stbi__idct_block;
   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;

#ifdef STBI_SSE2
   if (stbi__sse2_available()) {
      j->idct_block_kernel = stbi__idct_simd;
      #ifndef STBI_JPEG_OLD
      j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
      #endif
      j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   }
#endif

#ifdef STBI_NEON
   j->idct_block_kernel = stbi__idct_simd;
   #ifndef STBI_JPEG_OLD
   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   #endif
   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
#endif
}
// clean up the temporary component buffers
static void stbi__cleanup_jpeg(stbi__jpeg *j)
{
   int i;
   for (i=0; i < j->s->img_n; ++i) {
      if (j->img_comp[i].raw_data) {
         STBI_FREE(j->img_comp[i].raw_data);
         j->img_comp[i].raw_data = NULL;
         j->img_comp[i].data = NULL;
      }
      if (j->img_comp[i].raw_coeff) {
         STBI_FREE(j->img_comp[i].raw_coeff);
         j->img_comp[i].raw_coeff = 0;
         j->img_comp[i].coeff = 0;
      }
      if (j->img_comp[i].linebuf) {
         STBI_FREE(j->img_comp[i].linebuf);
         j->img_comp[i].linebuf = NULL;
      }
   }
}
typedef struct
{
   resample_row_func resample;
   stbi_uc *line0,*line1;
   int hs,vs;   // expansion factor in each axis
   int w_lores; // horizontal pixels pre-expansion
   int ystep;   // how far through vertical expansion we are
   int ypos;    // which pre-expansion row we're on
} stbi__resample;
static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
{
   int n, decode_n;
   z->s->img_n = 0; // make stbi__cleanup_jpeg safe

   // validate req_comp
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");

   // load a jpeg image from whichever source, but leave in YCbCr format
   if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }

   // determine actual number of components to generate
   n = req_comp ? req_comp : z->s->img_n;

   if (z->s->img_n == 3 && n < 3)
      decode_n = 1;
   else
      decode_n = z->s->img_n;

   // resample and color-convert
   {
      int k;
      unsigned int i,j;
      stbi_uc *output;
      stbi_uc *coutput[4];

      stbi__resample res_comp[4];

      for (k=0; k < decode_n; ++k) {
         stbi__resample *r = &res_comp[k];

         // allocate line buffer big enough for upsampling off the edges
         // with upsample factor of 4
         z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
         if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

         r->hs      = z->img_h_max / z->img_comp[k].h;
         r->vs      = z->img_v_max / z->img_comp[k].v;
         r->ystep   = r->vs >> 1;
         r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
         r->ypos    = 0;
         r->line0   = r->line1 = z->img_comp[k].data;

         if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
         else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
         else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
         else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
         else                               r->resample = stbi__resample_row_generic;
      }

      // can't error after this, so this is safe
      output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
      if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

      // now go ahead and resample
      for (j=0; j < z->s->img_y; ++j) {
         stbi_uc *out = output + n * z->s->img_x * j;
         for (k=0; k < decode_n; ++k) {
            stbi__resample *r = &res_comp[k];
            int y_bot = r->ystep >= (r->vs >> 1);
            coutput[k] = r->resample(z->img_comp[k].linebuf,
                                     y_bot ? r->line1 : r->line0,
                                     y_bot ? r->line0 : r->line1,
                                     r->w_lores, r->hs);
            if (++r->ystep >= r->vs) {
               r->ystep = 0;
               r->line0 = r->line1;
               if (++r->ypos < z->img_comp[k].y)
                  r->line1 += z->img_comp[k].w2;
            }
         }
         if (n >= 3) {
            stbi_uc *y = coutput[0];
            if (z->s->img_n == 3) {
               z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
            } else
               for (i=0; i < z->s->img_x; ++i) {
                  out[0] = out[1] = out[2] = y[i];
                  out[3] = 255; // not used if n==3
                  out += n;
               }
         } else {
            stbi_uc *y = coutput[0];
            if (n == 1)
               for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
            else
               for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
         }
      }
      stbi__cleanup_jpeg(z);
      *out_x = z->s->img_x;
      *out_y = z->s->img_y;
      if (comp) *comp  = z->s->img_n; // report original components, not output
      return output;
   }
}
static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi__jpeg j;
   j.s = s;
   stbi__setup_jpeg(&j);
   return load_jpeg_image(&j, x,y,comp,req_comp);
}

static int stbi__jpeg_test(stbi__context *s)
{
   int r;
   stbi__jpeg j;
   j.s = s;
   stbi__setup_jpeg(&j);
   r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
   stbi__rewind(s);
   return r;
}

static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
{
   if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
      stbi__rewind( j->s );
      return 0;
   }
   if (x) *x = j->s->img_x;
   if (y) *y = j->s->img_y;
   if (comp) *comp = j->s->img_n;
   return 1;
}

static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
{
   stbi__jpeg j;
   j.s = s;
   return stbi__jpeg_info_raw(&j, x, y, comp);
}
// public domain zlib decode    v0.2  Sean Barrett 2006-11-18
//    simple implementation
//      - all input must be provided in an upfront buffer
//      - all output is written to a single output buffer (can malloc/realloc)

#ifndef STBI_NO_ZLIB

// fast-way is faster to check than jpeg huffman, but slow way is slower
#define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
#define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)

// zlib-style huffman encoding
// (jpeg packs from left, zlib from right, so can't share code)
typedef struct
{
   stbi__uint16 fast[1 << STBI__ZFAST_BITS];
   stbi__uint16 firstcode[16];
   int maxcode[17];
   stbi__uint16 firstsymbol[16];
   stbi_uc  size[288];
   stbi__uint16 value[288];
} stbi__zhuffman;
stbi_inline static int stbi__bitreverse16(int n)
{
  n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
  n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
  n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
  n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
  return n;
}

stbi_inline static int stbi__bit_reverse(int v, int bits)
{
   STBI_ASSERT(bits <= 16);
   // to bit reverse n bits, reverse 16 and shift
   // e.g. 11 bits, bit reverse and shift away 5
   return stbi__bitreverse16(v) >> (16-bits);
}
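
// A standalone sketch of the swap-based bit reversal described in the comment
// above: adjacent bits, then pairs, nibbles and bytes are exchanged, and the
// 16-bit result is shifted down to the requested width. The sketch_ names are
// hypothetical copies made only so the example compiles on its own.

```c
#include <assert.h>

// Reverse all 16 bits of n by swapping progressively larger groups.
static int sketch_bitreverse16(int n)
{
   n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);   // swap adjacent bits
   n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);   // swap bit pairs
   n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);   // swap nibbles
   n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);   // swap bytes
   return n;
}

// Reverse only the low `bits` bits of v.
static int sketch_bit_reverse(int v, int bits)
{
   assert(bits >= 1 && bits <= 16);
   return sketch_bitreverse16(v) >> (16 - bits);
}
```

// For an 11-bit value, bit 0 ends up in bit 10: reversing 1 over 11 bits
// gives 1024.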
static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
{
   int i,k=0;
   int code, next_code[16], sizes[17];

   // DEFLATE spec for generating codes
   memset(sizes, 0, sizeof(sizes));
   memset(z->fast, 0, sizeof(z->fast));
   for (i=0; i < num; ++i)
      ++sizes[sizelist[i]];
   sizes[0] = 0;
   for (i=1; i < 16; ++i)
      if (sizes[i] > (1 << i))
         return stbi__err("bad sizes", "Corrupt PNG");
   code = 0;
   for (i=1; i < 16; ++i) {
      next_code[i] = code;
      z->firstcode[i] = (stbi__uint16) code;
      z->firstsymbol[i] = (stbi__uint16) k;
      code = (code + sizes[i]);
      if (sizes[i])
         if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
      z->maxcode[i] = code << (16-i); // preshift for inner loop
      code <<= 1;
      k += sizes[i];
   }
   z->maxcode[16] = 0x10000; // sentinel
   for (i=0; i < num; ++i) {
      int s = sizelist[i];
      if (s) {
         int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
         stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
         z->size [c] = (stbi_uc     ) s;
         z->value[c] = (stbi__uint16) i;
         if (s <= STBI__ZFAST_BITS) {
            int j = stbi__bit_reverse(next_code[s],s);
            while (j < (1 << STBI__ZFAST_BITS)) {
               z->fast[j] = fastv;
               j += (1 << s);
            }
         }
         ++next_code[s];
      }
   }
   return 1;
}
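
// A minimal sketch of the canonical code assignment that the DEFLATE spec
// (RFC 1951, section 3.2.2) describes and that the table builder above is
// based on. It only assigns codes; it does not build the fast lookup table.
// sketch_assign_codes and sketch_rfc_example_code are hypothetical names.

```c
#include <assert.h>

// Assign canonical Huffman codes from a list of code lengths (0 = unused).
static void sketch_assign_codes(const unsigned char *lengths, int num, unsigned short *codes)
{
   int bl_count[16] = { 0 };
   int next_code[16] = { 0 };
   int code = 0, bits, i;
   for (i = 0; i < num; ++i) {
      assert(lengths[i] < 16);
      ++bl_count[lengths[i]];
   }
   bl_count[0] = 0;
   // the first code of each length follows the last code of the previous length
   for (bits = 1; bits < 16; ++bits) {
      code = (code + bl_count[bits-1]) << 1;
      next_code[bits] = code;
   }
   // symbols of equal length receive consecutive codes in symbol order
   for (i = 0; i < num; ++i) {
      int len = lengths[i];
      codes[i] = (unsigned short) (len ? next_code[len]++ : 0);
   }
}

// The worked example from RFC 1951: symbols A..H with lengths 3,3,3,3,3,2,4,4.
static int sketch_rfc_example_code(int symbol)
{
   static const unsigned char lengths[8] = { 3,3,3,3,3,2,4,4 };
   unsigned short codes[8];
   assert(symbol >= 0 && symbol < 8);
   sketch_assign_codes(lengths, 8, codes);
   return codes[symbol];
}
```

// In the RFC example, F (the only 2-bit symbol) receives code 00 and the two
// 4-bit symbols receive 1110 and 1111.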
// zlib-from-memory implementation for PNG reading
//    because PNG allows splitting the zlib stream arbitrarily,
//    and it's annoying structurally to have PNG call ZLIB call PNG,
//    we require PNG read all the IDATs and combine them into a single
//    memory buffer

typedef struct
{
   stbi_uc *zbuffer, *zbuffer_end;
   int num_bits;
   stbi__uint32 code_buffer;

   char *zout;
   char *zout_start;
   char *zout_end;
   int   z_expandable;

   stbi__zhuffman z_length, z_distance;
} stbi__zbuf;
stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
{
   if (z->zbuffer >= z->zbuffer_end) return 0;
   return *z->zbuffer++;
}

static void stbi__fill_bits(stbi__zbuf *z)
{
   do {
      STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
      z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
      z->num_bits += 8;
   } while (z->num_bits <= 24);
}

stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
{
   unsigned int k;
   if (z->num_bits < n) stbi__fill_bits(z);
   k = z->code_buffer & ((1 << n) - 1);
   z->code_buffer >>= n;
   z->num_bits -= n;
   return k;
}
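
// A self-contained sketch of an LSB-first bit reader in the style of the
// routines above: bytes are appended above the bits already buffered, and
// values are taken from the low end. Unlike the library code it refills one
// byte at a time and only as needed. sketch_bits, sketch_receive and
// sketch_receive_demo are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
   const unsigned char *p, *end;  // unread input
   uint32_t buf;                  // buffered bits, next bit to deliver in bit 0
   int nbits;                     // number of valid bits in buf
} sketch_bits;

// Return the next n bits (1..16); reading past the end supplies zero bits.
static unsigned sketch_receive(sketch_bits *b, int n)
{
   unsigned v;
   assert(n >= 1 && n <= 16);
   while (b->nbits < n) {
      unsigned byte = (b->p < b->end) ? *b->p++ : 0;
      b->buf |= (uint32_t) byte << b->nbits;
      b->nbits += 8;
   }
   v = b->buf & ((1u << n) - 1);
   b->buf >>= n;
   b->nbits -= n;
   return v;
}

// Read fields of width 1, 2, 5 and 4 from the bytes B5 01; return field `step`.
static unsigned sketch_receive_demo(int step)
{
   static const unsigned char data[2] = { 0xB5, 0x01 };
   static const int widths[4] = { 1, 2, 5, 4 };
   sketch_bits b = { data, data + 2, 0, 0 };
   unsigned v = 0;
   int i;
   assert(step >= 0 && step < 4);
   for (i = 0; i <= step; ++i)
      v = sketch_receive(&b, widths[i]);
   return v;
}
```

// The widths 1 and 2 mirror how a DEFLATE block header is read: one bit for
// the final-block flag followed by two bits for the block type.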
static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
{
   int b,s,k;
   // not resolved by fast table, so compute it the slow way
   // use jpeg approach, which requires MSbits at top
   k = stbi__bit_reverse(a->code_buffer, 16);
   for (s=STBI__ZFAST_BITS+1; ; ++s)
      if (k < z->maxcode[s])
         break;
   if (s == 16) return -1; // invalid code!
   // code size is s, so:
   b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
   STBI_ASSERT(z->size[b] == s);
   a->code_buffer >>= s;
   a->num_bits -= s;
   return z->value[b];
}

stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
{
   int b,s;
   if (a->num_bits < 16) stbi__fill_bits(a);
   b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
   if (b) {
      s = b >> 9;
      a->code_buffer >>= s;
      a->num_bits -= s;
      return b & 511;
   }
   return stbi__zhuffman_decode_slowpath(a, z);
}
static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
{
   char *q;
   int cur, limit, old_limit;
   z->zout = zout;
   if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
   cur   = (int) (z->zout     - z->zout_start);
   limit = old_limit = (int) (z->zout_end - z->zout_start);
   while (cur + n > limit)
      limit *= 2;
   q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
   STBI_NOTUSED(old_limit);
   if (q == NULL) return stbi__err("outofmem", "Out of memory");
   z->zout_start = q;
   z->zout       = q + cur;
   z->zout_end   = q + limit;
   return 1;
}
static int stbi__zlength_base[31] = {
   3,4,5,6,7,8,9,10,11,13,
   15,17,19,23,27,31,35,43,51,59,
   67,83,99,115,131,163,195,227,258,0,0 };

static int stbi__zlength_extra[31]=
{ 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };

static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};

static int stbi__zdist_extra[32] =
{ 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
static int stbi__parse_huffman_block(stbi__zbuf *a)
{
   char *zout = a->zout;
   for(;;) {
      int z = stbi__zhuffman_decode(a, &a->z_length);
      if (z < 256) {
         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
         if (zout >= a->zout_end) {
            if (!stbi__zexpand(a, zout, 1)) return 0;
            zout = a->zout;
         }
         *zout++ = (char) z;
      } else {
         stbi_uc *p;
         int len,dist;
         if (z == 256) {
            a->zout = zout;
            return 1;
         }
         z -= 257;
         len = stbi__zlength_base[z];
         if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
         z = stbi__zhuffman_decode(a, &a->z_distance);
         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
         dist = stbi__zdist_base[z];
         if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
         if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
         if (zout + len > a->zout_end) {
            if (!stbi__zexpand(a, zout, len)) return 0;
            zout = a->zout;
         }
         p = (stbi_uc *) (zout - dist);
         if (dist == 1) { // run of one byte; common in images.
            stbi_uc v = *p;
            if (len) { do *zout++ = v; while (--len); }
         } else {
            if (len) { do *zout++ = *p++; while (--len); }
         }
      }
   }
}
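
// A small sketch of the back-reference copy at the end of the block parser.
// The copy must run byte by byte in increasing address order, because the
// source and destination may overlap (dist < len); that overlap is how a short
// pattern gets repeated. sketch_lz77_copy and sketch_lz77_demo are
// hypothetical names.

```c
#include <assert.h>
#include <string.h>

// Append len bytes to buf at position pos, copied from dist bytes back.
static void sketch_lz77_copy(char *buf, int pos, int dist, int len)
{
   assert(dist >= 1 && dist <= pos && len >= 0);
   while (len-- > 0) {
      buf[pos] = buf[pos - dist];   // may read a byte written earlier in this loop
      ++pos;
   }
}

// Returns 1 if both the pattern repeat and the one-byte run behave as expected.
static int sketch_lz77_demo(void)
{
   char pattern[8] = "ab";
   char run[8] = "x";
   sketch_lz77_copy(pattern, 2, 2, 4);   // "ab" plus 4 bytes from 2 back
   pattern[6] = '\0';
   sketch_lz77_copy(run, 1, 1, 3);       // dist == 1: a run of one byte
   run[4] = '\0';
   return strcmp(pattern, "ababab") == 0 && strcmp(run, "xxxx") == 0;
}
```

// memcpy would not be a safe replacement here, since its behavior is undefined
// for overlapping ranges.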
static int stbi__compute_huffman_codes(stbi__zbuf *a)
{
   static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
   stbi__zhuffman z_codelength;
   stbi_uc lencodes[286+32+137];//padding for maximum single op
   stbi_uc codelength_sizes[19];
   int i,n;

   int hlit  = stbi__zreceive(a,5) + 257;
   int hdist = stbi__zreceive(a,5) + 1;
   int hclen = stbi__zreceive(a,4) + 4;

   memset(codelength_sizes, 0, sizeof(codelength_sizes));
   for (i=0; i < hclen; ++i) {
      int s = stbi__zreceive(a,3);
      codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
   }
   if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;

   n = 0;
   while (n < hlit + hdist) {
      int c = stbi__zhuffman_decode(a, &z_codelength);
      if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
      if (c < 16)
         lencodes[n++] = (stbi_uc) c;
      else if (c == 16) {
         c = stbi__zreceive(a,2)+3;
         memset(lencodes+n, lencodes[n-1], c);
         n += c;
      } else if (c == 17) {
         c = stbi__zreceive(a,3)+3;
         memset(lencodes+n, 0, c);
         n += c;
      } else {
         STBI_ASSERT(c == 18);
         c = stbi__zreceive(a,7)+11;
         memset(lencodes+n, 0, c);
         n += c;
      }
   }
   if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
   if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
   if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
   return 1;
}
static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
{
   stbi_uc header[4];
   int len,nlen,k;
   if (a->num_bits & 7)
      stbi__zreceive(a, a->num_bits & 7); // discard
   // drain the bit-packed data into header
   k = 0;
   while (a->num_bits > 0) {
      header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
      a->code_buffer >>= 8;
      a->num_bits -= 8;
   }
   STBI_ASSERT(a->num_bits == 0);
   // now fill header the normal way
   while (k < 4)
      header[k++] = stbi__zget8(a);
   len  = header[1] * 256 + header[0];
   nlen = header[3] * 256 + header[2];
   if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
   if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
   if (a->zout + len > a->zout_end)
      if (!stbi__zexpand(a, a->zout, len)) return 0;
   memcpy(a->zout, a->zbuffer, len);
   a->zbuffer += len;
   a->zout += len;
   return 1;
}
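
// A minimal sketch of the LEN/NLEN consistency check for a stored
// (uncompressed) DEFLATE block: the header holds a 16-bit little-endian
// length followed by its one's complement. sketch_stored_block_len is a
// hypothetical helper that takes the four header bytes as separate arguments.

```c
#include <assert.h>

// Returns the block length, or -1 if NLEN is not the complement of LEN.
static int sketch_stored_block_len(unsigned b0, unsigned b1, unsigned b2, unsigned b3)
{
   unsigned len, nlen;
   assert(b0 <= 255 && b1 <= 255 && b2 <= 255 && b3 <= 255);
   len  = b1 * 256 + b0;   // little-endian LEN
   nlen = b3 * 256 + b2;   // little-endian NLEN
   if (nlen != (len ^ 0xffffu)) return -1;
   return (int) len;
}
```

// For example, the header bytes 05 00 FA FF describe a 5-byte stored block.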
static int stbi__parse_zlib_header(stbi__zbuf *a)
{
   int cmf   = stbi__zget8(a);
   int cm    = cmf & 15;
   /* int cinfo = cmf >> 4; */
   int flg   = stbi__zget8(a);
   if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
   if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
   // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
   return 1;
}
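
// A standalone sketch of the three zlib header checks performed above, applied
// to the two header bytes directly. sketch_zlib_header_ok is a hypothetical
// name; the checks themselves (multiple of 31, no preset dictionary,
// compression method 8) are the ones in the function above.

```c
#include <assert.h>

// Returns 1 if (cmf, flg) form a zlib header acceptable for PNG data.
static int sketch_zlib_header_ok(unsigned cmf, unsigned flg)
{
   assert(cmf <= 255 && flg <= 255);
   if ((cmf * 256 + flg) % 31 != 0) return 0;  // FCHECK: 16-bit value must divide by 31
   if (flg & 32) return 0;                     // FDICT set: preset dictionary
   if ((cmf & 15) != 8) return 0;              // CM must be 8 (DEFLATE)
   return 1;
}
```

// The common header 78 9C passes all three checks, while 78 BB is rejected
// because its preset-dictionary bit is set.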
// @TODO: should statically initialize these for optimal thread safety
static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
static void stbi__init_zdefaults(void)
{
   int i;   // use <= to match clearly with spec
   for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
   for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
   for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
   for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;

   for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
}
static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
{
   int final, type;
   if (parse_header)
      if (!stbi__parse_zlib_header(a)) return 0;
   a->num_bits = 0;
   a->code_buffer = 0;
   do {
      final = stbi__zreceive(a,1);
      type = stbi__zreceive(a,2);
      if (type == 0) {
         if (!stbi__parse_uncomperssed_block(a)) return 0;
      } else if (type == 3) {
         return 0;
      } else {
         if (type == 1) {
            // use fixed code lengths
            if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
            if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , 288)) return 0;
            if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
         } else {
            if (!stbi__compute_huffman_codes(a)) return 0;
         }
         if (!stbi__parse_huffman_block(a)) return 0;
      }
   } while (!final);
   return 1;
}
static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
{
   a->zout_start = obuf;
   a->zout       = obuf;
   a->zout_end   = obuf + olen;
   a->z_expandable = exp;

   return stbi__parse_zlib(a, parse_header);
}
STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(initial_size);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer + len;
   if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
{
   return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
}

STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(initial_size);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer + len;
   if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
{
   stbi__zbuf a;
   a.zbuffer = (stbi_uc *) ibuffer;
   a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
      return (int) (a.zout - a.zout_start);
   else
      return -1;
}

STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(16384);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer+len;
   if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
{
   stbi__zbuf a;
   a.zbuffer = (stbi_uc *) ibuffer;
   a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
      return (int) (a.zout - a.zout_start);
   else
      return -1;
}
#endif
// public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
//    simple implementation
//      - only 8-bit samples
//      - no CRC checking
//      - allocates lots of intermediate memory
//        - avoids problem of streaming data between subsystems
//        - avoids explicit window management
//      - uses stb_zlib, a PD zlib implementation with fast huffman decoding

typedef struct
{
   stbi__uint32 length;
   stbi__uint32 type;
} stbi__pngchunk;

static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
{
   stbi__pngchunk c;
   c.length = stbi__get32be(s);
   c.type   = stbi__get32be(s);
   return c;
}

static int stbi__check_png_header(stbi__context *s)
{
   static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
   int i;
   for (i=0; i < 8; ++i)
      if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
   return 1;
}
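typedef struct
{
   stbi__context *s;
   stbi_uc *idata, *expanded, *out;
} stbi__png;


enum {
   STBI__F_none=0,
   STBI__F_sub=1,
   STBI__F_up=2,
   STBI__F_avg=3,
   STBI__F_paeth=4,
   // synthetic filters used for first scanline to avoid needing a dummy row of 0s
   STBI__F_avg_first,
   STBI__F_paeth_first
};

static stbi_uc first_row_filter[5] =
{
   STBI__F_none,
   STBI__F_sub,
   STBI__F_none,
   STBI__F_avg_first,
   STBI__F_paeth_first
};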
static int stbi__paeth(int a, int b, int c)
{
   int p = a + b - c;
   int pa = abs(p-a);
   int pb = abs(p-b);
   int pc = abs(p-c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}
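
// A standalone sketch of the Paeth predictor used by PNG filter type 4: it
// forms the estimate p = a + b - c from the left (a), above (b) and upper-left
// (c) neighbors and returns whichever neighbor is closest to p, preferring a,
// then b, on ties. sketch_paeth is a hypothetical copy made so the example
// compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

// a = left, b = above, c = upper-left; all are byte values.
static int sketch_paeth(int a, int b, int c)
{
   int p, pa, pb, pc;
   assert(a >= 0 && a <= 255 && b >= 0 && b <= 255 && c >= 0 && c <= 255);
   p  = a + b - c;     // initial linear estimate
   pa = abs(p - a);    // distance from the estimate to each neighbor
   pb = abs(p - b);
   pc = abs(p - c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}
```

// With a = c = 0 the predictor reduces to b, which is why the first pixel of
// a row can call it as paeth(0, prior, 0).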
static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
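
// A short sketch of how the depth scale table stretches low-bit-depth
// grayscale samples to the full 0..255 range: multiplying by 0xff, 0x55 or
// 0x11 replicates a 1-, 2- or 4-bit sample across the byte. sketch_scale_gray
// is a hypothetical helper; the table values are copied from the table above.

```c
#include <assert.h>

// Scale a grayscale sample of the given bit depth (1, 2, 4 or 8) to 8 bits.
static unsigned char sketch_scale_gray(unsigned sample, int depth)
{
   static const unsigned char scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
   assert(depth == 1 || depth == 2 || depth == 4 || depth == 8);
   assert(sample < (1u << depth));
   return (unsigned char) (sample * scale_table[depth]);
}
```

// The largest sample at every depth maps to 255, and a 2-bit sample of 2 maps
// to 0xAA (170).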
// create the png data from post-deflated data
static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
{
   stbi__context *s = a->s;
   stbi__uint32 i,j,stride = x*out_n;
   stbi__uint32 img_len, img_width_bytes;
   int k;
   int img_n = s->img_n; // copy it into a local for later

   STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
   a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
   if (!a->out) return stbi__err("outofmem", "Out of memory");

   img_width_bytes = (((img_n * x * depth) + 7) >> 3);
   img_len = (img_width_bytes + 1) * y;
   if (s->img_x == x && s->img_y == y) {
      if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
   } else { // interlaced:
      if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
   }

   for (j=0; j < y; ++j) {
      stbi_uc *cur = a->out + stride*j;
      stbi_uc *prior = cur - stride;
      int filter = *raw++;
      int filter_bytes = img_n;
      int width = x;
      if (filter > 4)
         return stbi__err("invalid filter","Corrupt PNG");

      if (depth < 8) {
         STBI_ASSERT(img_width_bytes <= x);
         cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
         filter_bytes = 1;
         width = img_width_bytes;
      }

      // if first row, use special filter that doesn't sample previous row
      if (j == 0) filter = first_row_filter[filter];

      // handle first byte explicitly
      for (k=0; k < filter_bytes; ++k) {
         switch (filter) {
            case STBI__F_none       : cur[k] = raw[k]; break;
            case STBI__F_sub        : cur[k] = raw[k]; break;
            case STBI__F_up         : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            case STBI__F_avg        : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
            case STBI__F_paeth      : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
            case STBI__F_avg_first  : cur[k] = raw[k]; break;
            case STBI__F_paeth_first: cur[k] = raw[k]; break;
         }
      }

      if (depth == 8) {
         if (img_n != out_n)
            cur[img_n] = 255; // first pixel
         raw += img_n;
         cur += out_n;
         prior += out_n;
      } else {
         raw += 1;
         cur += 1;
         prior += 1;
      }

      // this is a little gross, so that we don't switch per-pixel or per-component
      if (depth < 8 || img_n == out_n) {
         int nk = (width - 1)*img_n;
         #define CASE(f) \
             case f:     \
                for (k=0; k < nk; ++k)
         switch (filter) {
            // "none" filter turns into a memcpy here; make that explicit.
            case STBI__F_none:         memcpy(cur, raw, nk); break;
            CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
            CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
            CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
            CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
            CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
         }
         #undef CASE
         raw += nk;
      } else {
         STBI_ASSERT(img_n+1 == out_n);
         #define CASE(f) \
             case f:     \
                for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
                   for (k=0; k < img_n; ++k)
         switch (filter) {
            CASE(STBI__F_none)         cur[k] = raw[k]; break;
            CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
            CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
            CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
            CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
            CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
         }
         #undef CASE
      }
   }

   // we make a separate pass to expand bits to pixels; for performance,
   // this could run two scanlines behind the above code, so it won't
   // interfere with filtering but will still be in the cache.
   if (depth < 8) {
      for (j=0; j < y; ++j) {
         stbi_uc *cur = a->out + stride*j;
         stbi_uc *in  = a->out + stride*j + x*out_n - img_width_bytes;
         // unpack 1/2/4-bit into an 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
         // png guarantees byte alignment; if width is not a multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
         stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range

         // note that the final byte might overshoot and write more data than desired.
         // we can allocate enough data that this never writes out of memory, but it
         // could also overwrite the next scanline. can it overwrite non-empty data
         // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
         // so we need to explicitly clamp the final ones

         if (depth == 4) {
            for (k=x*img_n; k >= 2; k-=2, ++in) {
               *cur++ = scale * ((*in >> 4)       );
               *cur++ = scale * ((*in     ) & 0x0f);
            }
            if (k > 0) *cur++ = scale * ((*in >> 4)       );
         } else if (depth == 2) {
            for (k=x*img_n; k >= 4; k-=4, ++in) {
               *cur++ = scale * ((*in >> 6)       );
               *cur++ = scale * ((*in >> 4) & 0x03);
               *cur++ = scale * ((*in >> 2) & 0x03);
               *cur++ = scale * ((*in     ) & 0x03);
            }
            if (k > 0) *cur++ = scale * ((*in >> 6)       );
            if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
            if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
         } else if (depth == 1) {
            for (k=x*img_n; k >= 8; k-=8, ++in) {
               *cur++ = scale * ((*in >> 7)       );
               *cur++ = scale * ((*in >> 6) & 0x01);
               *cur++ = scale * ((*in >> 5) & 0x01);
               *cur++ = scale * ((*in >> 4) & 0x01);
               *cur++ = scale * ((*in >> 3) & 0x01);
               *cur++ = scale * ((*in >> 2) & 0x01);
               *cur++ = scale * ((*in >> 1) & 0x01);
               *cur++ = scale * ((*in     ) & 0x01);
            }
            if (k > 0) *cur++ = scale * ((*in >> 7)       );
            if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
            if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
            if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
            if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
            if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
            if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
         }
         if (img_n != out_n) {
            int q;
            // insert alpha = 255
            cur = a->out + stride*j;
            if (img_n == 1) {
               for (q=x-1; q >= 0; --q) {
                  cur[q*2+1] = 255;
                  cur[q*2+0] = cur[q];
               }
            } else {
               STBI_ASSERT(img_n == 3);
               for (q=x-1; q >= 0; --q) {
                  cur[q*4+3] = 255;
                  cur[q*4+2] = cur[q*3+2];
                  cur[q*4+1] = cur[q*3+1];
                  cur[q*4+0] = cur[q*3+0];
               }
            }
         }
      }
   }

   return 1;
}

static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
      return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
   final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
   for (p=0; p < 7; ++p) {
      int xorig[] = { 0,4,0,2,0,1,0 };
      int yorig[] = { 0,0,4,0,2,0,1 };
      int xspc[]  = { 8,8,4,4,2,2,1 };
      int yspc[]  = { 8,8,8,4,4,2,2 };
      // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
      x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
      y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
         stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
         if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
         for (j=0; j < y; ++j) {
            for (i=0; i < x; ++i) {
               int out_y = j*yspc[p]+yorig[p];
               int out_x = i*xspc[p]+xorig[p];
               memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
                      a->out + (j*x+i)*out_n, out_n);
         image_data += img_len;
         image_data_len -= img_len;
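
// Illustrative sketch, not part of the library: how the xorig/yorig/xspc/yspc
// tables above determine the size of each of the seven Adam7 passes. The
// function names adam7_pass_size and adam7_total_pixels are hypothetical. The
// expression is rearranged so the unsigned subtraction cannot wrap (every
// origin is at most spacing - 1).

```c
#include <assert.h>
#include <stddef.h>

/* Width and height of Adam7 pass `pass` (0..6) for a width x height image. */
static void adam7_pass_size(unsigned width, unsigned height, int pass,
                            unsigned *pass_w, unsigned *pass_h)
{
   static const unsigned xorig[7] = { 0,4,0,2,0,1,0 };
   static const unsigned yorig[7] = { 0,0,4,0,2,0,1 };
   static const unsigned xspc[7]  = { 8,8,4,4,2,2,1 };
   static const unsigned yspc[7]  = { 8,8,8,4,4,2,2 };

   assert(pass >= 0 && pass < 7);
   assert(pass_w != NULL && pass_h != NULL);
   /* Number of columns/rows at origin, origin+spacing, ... below the limit. */
   *pass_w = (width  + xspc[pass] - 1 - xorig[pass]) / xspc[pass];
   *pass_h = (height + yspc[pass] - 1 - yorig[pass]) / yspc[pass];
}

/* Sum of the pixel counts of all seven passes. */
static unsigned adam7_total_pixels(unsigned width, unsigned height)
{
   unsigned total = 0, w, h;
   int pass;
   for (pass = 0; pass < 7; ++pass) {
      adam7_pass_size(width, height, pass, &w, &h);
      total += w * h;
   }
   return total;
}
```

// Every pixel belongs to exactly one pass, so the seven pass sizes sum to
// width * height; a pass may also be empty for small images.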

static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
   stbi__context *s = z->s;
   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   stbi_uc *p = z->out;
   // compute color-based transparency, assuming we've
   // already got 255 as the alpha value in the output
   STBI_ASSERT(out_n == 2 || out_n == 4);
      for (i=0; i < pixel_count; ++i) {
         p[1] = (p[0] == tc[0] ? 0 : 255);
      for (i=0; i < pixel_count; ++i) {
         if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])

static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
   stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
   stbi_uc *p, *temp_out, *orig = a->out;
   p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
   if (p == NULL) return stbi__err("outofmem", "Out of memory");
   // between here and free(out) below, exiting would leak
   if (pal_img_n == 3) {
      for (i=0; i < pixel_count; ++i) {
         p[1] = palette[n+1];
         p[2] = palette[n+2];
      for (i=0; i < pixel_count; ++i) {
         p[1] = palette[n+1];
         p[2] = palette[n+2];
         p[3] = palette[n+3];

static int stbi__unpremultiply_on_load = 0;
static int stbi__de_iphone_flag = 0;

STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
   stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;

STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
   stbi__de_iphone_flag = flag_true_if_should_convert;

static void stbi__de_iphone(stbi__png *z)
   stbi__context *s = z->s;
   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   stbi_uc *p = z->out;
   if (s->img_out_n == 3) {  // convert bgr to rgb
      for (i=0; i < pixel_count; ++i) {
      STBI_ASSERT(s->img_out_n == 4);
      if (stbi__unpremultiply_on_load) {
         // convert bgr to rgb and unpremultiply
         for (i=0; i < pixel_count; ++i) {
               p[0] = p[2] * 255 / a;
               p[1] = p[1] * 255 / a;
         // convert bgr to rgb
         for (i=0; i < pixel_count; ++i) {

#define STBI__PNG_TYPE(a,b,c,d)  (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))

static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
   stbi_uc palette[1024], pal_img_n=0;
   stbi_uc has_trans=0, tc[3];
   stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
   int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
   stbi__context *s = z->s;
   if (!stbi__check_png_header(s)) return 0;
   if (scan == STBI__SCAN_type) return 1;
      stbi__pngchunk c = stbi__get_chunk_header(s);
         case STBI__PNG_TYPE('C','g','B','I'):
            stbi__skip(s, c.length);
         case STBI__PNG_TYPE('I','H','D','R'): {
            if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
            if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
            s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
            s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
            depth = stbi__get8(s);  if (depth != 1 && depth != 2 && depth != 4 && depth != 8)  return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
            color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
            if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
            comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
            filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
            interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
            if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
               s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
               if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
               if (scan == STBI__SCAN_header) return 1;
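
// Illustrative sketch, not part of the library: the "too large" test above
// divides the 2^30 limit by the dimensions instead of multiplying them, so the
// check itself cannot overflow. The helper name image_size_ok is hypothetical,
// and because integer division truncates, the bound is approximate.

```c
#include <assert.h>

/* Returns 1 when width * height * channels is (approximately) at most 2^30,
   0 otherwise. Zero-sized images are rejected to avoid dividing by zero. */
static int image_size_ok(unsigned width, unsigned height, unsigned channels)
{
   assert(channels >= 1 && channels <= 4);
   if (width == 0 || height == 0) return 0;
   return (1u << 30) / width / channels >= height;
}
```

// For example, a 1024x1024 RGBA image passes, while a 65536x65536 greyscale
// image is rejected without ever forming the 2^32 product.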
               // if paletted, then pal_n is our final components, and
               // img_n is # components to decompress/filter.
               if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
               // if SCAN_header, have to scan to see if we have a tRNS
         case STBI__PNG_TYPE('P','L','T','E'):  {
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
            pal_len = c.length / 3;
            if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
            for (i=0; i < pal_len; ++i) {
               palette[i*4+0] = stbi__get8(s);
               palette[i*4+1] = stbi__get8(s);
               palette[i*4+2] = stbi__get8(s);
               palette[i*4+3] = 255;
         case STBI__PNG_TYPE('t','R','N','S'): {
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
               if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
               if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
               if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
               for (i=0; i < c.length; ++i)
                  palette[i*4+3] = stbi__get8(s);
               if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
               if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
               for (k=0; k < s->img_n; ++k)
                  tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
         case STBI__PNG_TYPE('I','D','A','T'): {
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
            if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
            if ((int)(ioff + c.length) < (int)ioff) return 0;
            if (ioff + c.length > idata_limit) {
               stbi__uint32 idata_limit_old = idata_limit;
               if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
               while (ioff + c.length > idata_limit)
               STBI_NOTUSED(idata_limit_old);
               p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
            if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
         case STBI__PNG_TYPE('I','E','N','D'): {
            stbi__uint32 raw_len, bpl;
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (scan != STBI__SCAN_load) return 1;
            if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
            // initial guess for decoded data size to avoid unnecessary reallocs
            bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
            raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
            z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
            if (z->expanded == NULL) return 0; // zlib should set error
            STBI_FREE(z->idata); z->idata = NULL;
            if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
               s->img_out_n = s->img_n+1;
               s->img_out_n = s->img_n;
            if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
               if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
            if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
               // pal_img_n == 3 or 4
               s->img_n = pal_img_n; // record the actual colors we had
               s->img_out_n = pal_img_n;
               if (req_comp >= 3) s->img_out_n = req_comp;
               if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
            STBI_FREE(z->expanded); z->expanded = NULL;
            // if critical, fail
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if ((c.type & (1 << 29)) == 0) {
               #ifndef STBI_NO_FAILURE_STRINGS
               static char invalid_chunk[] = "XXXX PNG chunk not known";
               invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
               invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
               invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
               invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
               return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
            stbi__skip(s, c.length);
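
// Illustrative sketch, not part of the library: the chunk handling above packs
// the four type bytes into one 32-bit value and tests bit 29 to tell critical
// chunks from ancillary ones. Bit 29 of the packed value is bit 5 of the first
// type byte, i.e. the ASCII case bit, so a lowercase first letter marks an
// ancillary chunk. The function names below are hypothetical; unsigned
// arithmetic is used here in place of the signed shifts of the macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four chunk-type characters into a big-endian 32-bit tag. */
static uint32_t png_chunk_type(char a, char b, char c, char d)
{
   return ((uint32_t)(unsigned char)a << 24) |
          ((uint32_t)(unsigned char)b << 16) |
          ((uint32_t)(unsigned char)c <<  8) |
           (uint32_t)(unsigned char)d;
}

/* Nonzero when the chunk may be skipped safely by a decoder that does not
   know it (first letter lowercase). */
static int png_chunk_is_ancillary(uint32_t type)
{
   return (type & (UINT32_C(1) << 29)) != 0;
}

/* Unpack the tag into a NUL-terminated name, e.g. for an error message. */
static void png_chunk_name(uint32_t type, char out[5])
{
   assert(out != NULL);
   out[0] = (char)((type >> 24) & 0xff);
   out[1] = (char)((type >> 16) & 0xff);
   out[2] = (char)((type >>  8) & 0xff);
   out[3] = (char)( type        & 0xff);
   out[4] = '\0';
}
```

// With these helpers, IHDR tests as critical and tRNS as ancillary, which is
// why an unknown lowercase-first chunk can simply be skipped.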
      // end of PNG chunk, read and skip CRC

static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
   unsigned char *result=NULL;
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
      if (req_comp && req_comp != p->s->img_out_n) {
         result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
         p->s->img_out_n = req_comp;
         if (result == NULL) return result;
      if (n) *n = p->s->img_out_n;
   STBI_FREE(p->out);      p->out      = NULL;
   STBI_FREE(p->expanded); p->expanded = NULL;
   STBI_FREE(p->idata);    p->idata    = NULL;

static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   return stbi__do_png(&p, x,y,comp,req_comp);

static int stbi__png_test(stbi__context *s)
   r = stbi__check_png_header(s);

static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
   if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
      stbi__rewind( p->s );
   if (x) *x = p->s->img_x;
   if (y) *y = p->s->img_y;
   if (comp) *comp = p->s->img_n;

static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
   return stbi__png_info_raw(&p, x, y, comp);

// Microsoft/Windows BMP image

static int stbi__bmp_test_raw(stbi__context *s)
   if (stbi__get8(s) != 'B') return 0;
   if (stbi__get8(s) != 'M') return 0;
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   stbi__get32le(s); // discard data offset
   sz = stbi__get32le(s);
   r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);

static int stbi__bmp_test(stbi__context *s)
   int r = stbi__bmp_test_raw(s);

// returns 0..31 for the highest set bit
static int stbi__high_bit(unsigned int z)
   if (z == 0) return -1;
   if (z >= 0x10000) n += 16, z >>= 16;
   if (z >= 0x00100) n +=  8, z >>=  8;
   if (z >= 0x00010) n +=  4, z >>=  4;
   if (z >= 0x00004) n +=  2, z >>=  2;
   if (z >= 0x00002) n +=  1, z >>=  1;

static int stbi__bitcount(unsigned int a)
   a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   a = (a + (a >> 8)); // max 16 per 8 bits
   a = (a + (a >> 16)); // max 32 per 8 bits

static int stbi__shiftsigned(int v, int shift, int bits)
   if (shift < 0) v <<= -shift;

   int bpp, offset, hsz;
   unsigned int mr,mg,mb,ma, all_a;
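
// Illustrative sketch, not part of the library: stbi__high_bit and
// stbi__bitcount above describe where a BMP channel mask sits and how wide it
// is. The helpers below (all names hypothetical) compute the same two
// quantities with plain loops and use them to pull one channel out of a packed
// pixel. Two assumptions: the mask is contiguous, and the value is widened to
// 8 bits by rescaling with a division, which is a swapped-in technique rather
// than the shift-based approach of stbi__shiftsigned.

```c
#include <assert.h>

/* Index of the highest set bit, or -1 for an empty mask. */
static int mask_high_bit(unsigned int mask)
{
   int n = -1;
   while (mask) { ++n; mask >>= 1; }
   return n;
}

/* Number of set bits in the mask. */
static int mask_bit_count(unsigned int mask)
{
   int n = 0;
   while (mask) { n += (int)(mask & 1u); mask >>= 1; }
   return n;
}

/* Extract the channel selected by a contiguous `mask` from pixel `v` and
   rescale it to 0..255. */
static unsigned char extract_channel(unsigned int v, unsigned int mask)
{
   int high = mask_high_bit(mask);
   int bits = mask_bit_count(mask);
   int low;
   unsigned int value, max_value;

   assert(bits >= 1 && bits <= 16);      /* keeps value * 255 within 32 bits */
   low       = high - bits + 1;          /* index of the lowest mask bit */
   value     = (v & mask) >> low;
   max_value = (1u << bits) - 1u;
   return (unsigned char)(value * 255u / max_value);
}
```

// For a 5-6-5 pixel, the masks 0xF800, 0x07E0 and 0x001F would select red,
// green and blue respectively.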

static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
   if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   info->offset = stbi__get32le(s);
   info->hsz = hsz = stbi__get32le(s);
   if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
      s->img_x = stbi__get16le(s);
      s->img_y = stbi__get16le(s);
      s->img_x = stbi__get32le(s);
      s->img_y = stbi__get32le(s);
   if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
   info->bpp = stbi__get16le(s);
   if (info->bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
      int compress = stbi__get32le(s);
      if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
      stbi__get32le(s); // discard sizeof
      stbi__get32le(s); // discard hres
      stbi__get32le(s); // discard vres
      stbi__get32le(s); // discard colorsused
      stbi__get32le(s); // discard max important
      if (hsz == 40 || hsz == 56) {
         if (info->bpp == 16 || info->bpp == 32) {
            info->mr = info->mg = info->mb = 0;
            if (compress == 0) {
               if (info->bpp == 32) {
                  info->mr = 0xffu << 16;
                  info->mg = 0xffu <<  8;
                  info->mb = 0xffu <<  0;
                  info->ma = 0xffu << 24;
                  info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
                  info->mr = 31u << 10;
                  info->mg = 31u <<  5;
                  info->mb = 31u <<  0;
            } else if (compress == 3) {
               info->mr = stbi__get32le(s);
               info->mg = stbi__get32le(s);
               info->mb = stbi__get32le(s);
               // not documented, but generated by photoshop and handled by mspaint
               if (info->mr == info->mg && info->mg == info->mb) {
                  return stbi__errpuc("bad BMP", "bad BMP");
               return stbi__errpuc("bad BMP", "bad BMP");
         if (hsz != 108 && hsz != 124)
            return stbi__errpuc("bad BMP", "bad BMP");
         info->mr = stbi__get32le(s);
         info->mg = stbi__get32le(s);
         info->mb = stbi__get32le(s);
         info->ma = stbi__get32le(s);
         stbi__get32le(s); // discard color space
         for (i=0; i < 12; ++i)
            stbi__get32le(s); // discard color space parameters
            stbi__get32le(s); // discard rendering intent
            stbi__get32le(s); // discard offset of profile data
            stbi__get32le(s); // discard size of profile data
            stbi__get32le(s); // discard reserved

static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
   stbi_uc pal[256][4];
   int psize=0,i,j,width;
   int flip_vertically, pad, target;
   stbi__bmp_data info;
   if (stbi__bmp_parse_header(s, &info) == NULL)
      return NULL; // error code already set
   flip_vertically = ((int) s->img_y) > 0;
   s->img_y = abs((int) s->img_y);
   if (info.hsz == 12) {
         psize = (info.offset - 14 - 24) / 3;
         psize = (info.offset - 14 - info.hsz) >> 2;
   s->img_n = ma ? 4 : 3;
   if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
      target = s->img_n; // if they want monochrome, we'll post-convert
   out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   if (info.bpp < 16) {
      if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
      for (i=0; i < psize; ++i) {
         pal[i][2] = stbi__get8(s);
         pal[i][1] = stbi__get8(s);
         pal[i][0] = stbi__get8(s);
         if (info.hsz != 12) stbi__get8(s);
      stbi__skip(s, info.offset - 14 - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
      if (info.bpp == 4) width = (s->img_x + 1) >> 1;
      else if (info.bpp == 8) width = s->img_x;
      else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
      for (j=0; j < (int) s->img_y; ++j) {
         for (i=0; i < (int) s->img_x; i += 2) {
            int v=stbi__get8(s),v2=0;
            if (info.bpp == 4) {
            out[z++] = pal[v][0];
            out[z++] = pal[v][1];
            out[z++] = pal[v][2];
            if (target == 4) out[z++] = 255;
            if (i+1 == (int) s->img_x) break;
            v = (info.bpp == 8) ? stbi__get8(s) : v2;
            out[z++] = pal[v][0];
            out[z++] = pal[v][1];
            out[z++] = pal[v][2];
            if (target == 4) out[z++] = 255;
      int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
      stbi__skip(s, info.offset - 14 - info.hsz);
      if (info.bpp == 24) width = 3 * s->img_x;
      else if (info.bpp == 16) width = 2*s->img_x;
      else /* bpp = 32 and pad = 0 */ width=0;
      if (info.bpp == 24) {
      } else if (info.bpp == 32) {
         if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
         if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
         // right shift amt to put high bit in position #7
         rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
         gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
         bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
         ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
      for (j=0; j < (int) s->img_y; ++j) {
            for (i=0; i < (int) s->img_x; ++i) {
               out[z+2] = stbi__get8(s);
               out[z+1] = stbi__get8(s);
               out[z+0] = stbi__get8(s);
               a = (easy == 2 ? stbi__get8(s) : 255);
               if (target == 4) out[z++] = a;
            for (i=0; i < (int) s->img_x; ++i) {
               stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
               a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
               if (target == 4) out[z++] = STBI__BYTECAST(a);
   // if alpha channel is all 0s, replace with all 255s
   if (target == 4 && all_a == 0)
      for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
   if (flip_vertically) {
      for (j=0; j < (int) s->img_y>>1; ++j) {
         stbi_uc *p1 = out +      j     *s->img_x*target;
         stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
         for (i=0; i < (int) s->img_x*target; ++i) {
            t = p1[i], p1[i] = p2[i], p2[i] = t;
   if (req_comp && req_comp != target) {
      out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
      if (out == NULL) return out; // stbi__convert_format frees input on failure
   if (comp) *comp = s->img_n;

// Targa Truevision - TGA
// by Jonathan Dummer

// returns STBI_rgb or whatever, 0 on error
static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
   // only RGB or RGBA (incl. 16bit) or grey allowed
   if(is_rgb16) *is_rgb16 = 0;
   switch(bits_per_pixel) {
      case 8:  return STBI_grey;
      case 16: if(is_grey) return STBI_grey_alpha;
            // else: fall-through
      case 15: if(is_rgb16) *is_rgb16 = 1;
      case 24: // fall-through
      case 32: return bits_per_pixel/8;

static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
    int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
    int sz, tga_colormap_type;
    stbi__get8(s);                   // discard Offset
    tga_colormap_type = stbi__get8(s); // colormap type
    if( tga_colormap_type > 1 ) {
        return 0;      // only RGB or indexed allowed
    tga_image_type = stbi__get8(s); // image type
    if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
        if (tga_image_type != 1 && tga_image_type != 9) {
        stbi__skip(s,4);       // skip index of first colormap entry and number of entries
        sz = stbi__get8(s);    //   check bits per palette color entry
        if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
        stbi__skip(s,4);       // skip image x and y origin
        tga_colormap_bpp = sz;
    } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
        if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
            return 0; // only RGB or grey allowed, +/- RLE
        stbi__skip(s,9); // skip colormap specification and image x/y origin
        tga_colormap_bpp = 0;
    tga_w = stbi__get16le(s);
        return 0;   // test width
    tga_h = stbi__get16le(s);
        return 0;   // test height
    tga_bits_per_pixel = stbi__get8(s); // bits per pixel
    stbi__get8(s); // ignore alpha bits
    if (tga_colormap_bpp != 0) {
        if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
            // when using a colormap, tga_bits_per_pixel is the size of the indexes
            // I don't think anything but 8 or 16bit indexes makes sense
        tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
        tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
    if (comp) *comp = tga_comp;
    return 1;                   // seems to have passed everything

static int stbi__tga_test(stbi__context *s)
   int sz, tga_color_type;
   stbi__get8(s);      //   discard Offset
   tga_color_type = stbi__get8(s);   //   color type
   if ( tga_color_type > 1 ) goto errorEnd;   //   only RGB or indexed allowed
   sz = stbi__get8(s);   //   image type
   if ( tga_color_type == 1 ) { // colormapped (paletted) image
      if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
      stbi__skip(s,4);       // skip index of first colormap entry and number of entries
      sz = stbi__get8(s);    //   check bits per palette color entry
      if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
      stbi__skip(s,4);       // skip image x and y origin
   } else { // "normal" image w/o colormap
      if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
      stbi__skip(s,9); // skip colormap specification and image x/y origin
   if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test width
   if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test height
   sz = stbi__get8(s);   //   bits per pixel
   if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
   if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
   res = 1; // if we got this far, everything's good and we can return 1 instead of 0

// read 16bit value and convert to 24bit RGB
void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
   stbi__uint16 px = stbi__get16le(s);
   stbi__uint16 fiveBitMask = 31;
   // we have 3 channels with 5bits each
   int r = (px >> 10) & fiveBitMask;
   int g = (px >> 5) & fiveBitMask;
   int b = px & fiveBitMask;
   // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
   out[0] = (r * 255)/31;
   out[1] = (g * 255)/31;
   out[2] = (b * 255)/31;
   // some people claim that the most significant bit might be used for alpha
   // (possibly if an alpha-bit is set in the "image descriptor byte")
   // but that only made 16bit test images completely translucent..
   // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
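
// Illustrative sketch, not part of the library: the same 5-5-5 to 8-8-8
// conversion as stbi__tga_read_rgb16 above, but taking an already-read 16-bit
// value so it can be tried without a stbi__context. The function name
// rgb555_to_rgb888 is hypothetical. As in the code above, the top bit is
// ignored rather than treated as alpha.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one little-endian-decoded 16-bit TGA pixel to three 8-bit
   channels in R, G, B order. */
static void rgb555_to_rgb888(uint16_t px, unsigned char out[3])
{
   const unsigned five_bit_mask = 31;
   unsigned r, g, b;

   assert(out != NULL);
   r = (px >> 10) & five_bit_mask;
   g = (px >>  5) & five_bit_mask;
   b =  px        & five_bit_mask;
   /* Rescale 0..31 to 0..255 so that 31 maps exactly to 255. */
   out[0] = (unsigned char)((r * 255u) / 31u);
   out[1] = (unsigned char)((g * 255u) / 31u);
   out[2] = (unsigned char)((b * 255u) / 31u);
}
```

// Multiplying by 255 and dividing by 31 keeps full white at 255, which a plain
// left shift by 3 would not (31 << 3 is 248).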

static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   //   read in the TGA header stuff
   int tga_offset = stbi__get8(s);
   int tga_indexed = stbi__get8(s);
   int tga_image_type = stbi__get8(s);
   int tga_palette_start = stbi__get16le(s);
   int tga_palette_len = stbi__get16le(s);
   int tga_palette_bits = stbi__get8(s);
   int tga_x_origin = stbi__get16le(s);
   int tga_y_origin = stbi__get16le(s);
   int tga_width = stbi__get16le(s);
   int tga_height = stbi__get16le(s);
   int tga_bits_per_pixel = stbi__get8(s);
   int tga_comp, tga_rgb16=0;
   int tga_inverted = stbi__get8(s);
   // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
   unsigned char *tga_data;
   unsigned char *tga_palette = NULL;
   unsigned char raw_data[4];
   int RLE_repeating = 0;
   int read_next_pixel = 1;
   //   do a tiny bit of processing
   if ( tga_image_type >= 8 )
      tga_image_type -= 8;
   tga_inverted = 1 - ((tga_inverted >> 5) & 1);
   //   If I'm paletted, then I'll use the number of bits from the palette
   if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
   else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
   if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
      return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
   if (comp) *comp = tga_comp;
   tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
   if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
   // skip to the data's starting position (offset usually = 0)
   stbi__skip(s, tga_offset );
   if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
      for (i=0; i < tga_height; ++i) {
         int row = tga_inverted ? tga_height -i - 1 : i;
         stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
         stbi__getn(s, tga_row, tga_width * tga_comp);
      //   do I need to load a palette?
         //   any data to skip? (offset usually = 0)
         stbi__skip(s, tga_palette_start );
         tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_comp );
            STBI_FREE(tga_data);
            return stbi__errpuc("outofmem", "Out of memory");
            stbi_uc *pal_entry = tga_palette;
            STBI_ASSERT(tga_comp == STBI_rgb);
            for (i=0; i < tga_palette_len; ++i) {
               stbi__tga_read_rgb16(s, pal_entry);
               pal_entry += tga_comp;
         } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
               STBI_FREE(tga_data);
               STBI_FREE(tga_palette);
               return stbi__errpuc("bad palette", "Corrupt TGA");
      for (i=0; i < tga_width * tga_height; ++i)
         //   if I'm in RLE mode, do I need to get a RLE chunk?
            if ( RLE_count == 0 )
               //   yep, get the next byte as a RLE command
               int RLE_cmd = stbi__get8(s);
               RLE_count = 1 + (RLE_cmd & 127);
               RLE_repeating = RLE_cmd >> 7;
               read_next_pixel = 1;
            } else if ( !RLE_repeating )
               read_next_pixel = 1;
            read_next_pixel = 1;
         //   OK, if I need to read a pixel, do it now
         if ( read_next_pixel )
            //   load however much data we did have
               // read in index, then perform the lookup
               int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
               if ( pal_idx >= tga_palette_len ) {
               pal_idx *= tga_comp;
               for (j = 0; j < tga_comp; ++j) {
                  raw_data[j] = tga_palette[pal_idx+j];
            } else if(tga_rgb16) {
               STBI_ASSERT(tga_comp == STBI_rgb);
               stbi__tga_read_rgb16(s, raw_data);
               //   read in the data raw
               for (j = 0; j < tga_comp; ++j) {
                  raw_data[j] = stbi__get8(s);
            //   clear the reading flag for the next pixel
            read_next_pixel = 0;
         } // end of reading a pixel
         for (j = 0; j < tga_comp; ++j)
           tga_data[i*tga_comp+j] = raw_data[j];
         //   in case we're in RLE mode, keep counting down
      //   do I need to invert the image?
         for (j = 0; j*2 < tga_height; ++j)
            int index1 = j * tga_width * tga_comp;
            int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
            for (i = tga_width * tga_comp; i > 0; --i)
               unsigned char temp = tga_data[index1];
               tga_data[index1] = tga_data[index2];
               tga_data[index2] = temp;
      //   clear my palette, if I had one
      if ( tga_palette != NULL )
         STBI_FREE( tga_palette );
   // swap RGB - if the source data was RGB16, it already is in the right order
   if (tga_comp >= 3 && !tga_rgb16)
      unsigned char* tga_pixel = tga_data;
      for (i=0; i < tga_width * tga_height; ++i)
         unsigned char temp = tga_pixel[0];
         tga_pixel[0] = tga_pixel[2];
         tga_pixel[2] = temp;
         tga_pixel += tga_comp;
   // convert to target component count
   if (req_comp && req_comp != tga_comp)
      tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
   //   the things I do to get rid of an error message, and yet keep
   //   Microsoft's C compilers happy... [8^(
   tga_palette_start = tga_palette_len = tga_palette_bits =
         tga_x_origin = tga_y_origin = 0;

// *************************************************************************************************
// Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB

static int stbi__psd_test(stbi__context *s)
   int r = (stbi__get32be(s) == 0x38425053);

static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   int channelCount, compression;
   int channel, i, count, len;
   if (stbi__get32be(s) != 0x38425053)   // "8BPS"
      return stbi__errpuc("not PSD", "Corrupt PSD image");
   // Check file type version.
   if (stbi__get16be(s) != 1)
      return stbi__errpuc("wrong version", "Unsupported version of PSD image");
   // Skip 6 reserved bytes.
   // Read the number of channels (R, G, B, A, etc).
   channelCount = stbi__get16be(s);
   if (channelCount < 0 || channelCount > 16)
      return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
   // Read the rows and columns of the image.
   h = stbi__get32be(s);
   w = stbi__get32be(s);
   // Make sure the depth is 8 or 16 bits.
   bitdepth = stbi__get16be(s);
   if (bitdepth != 8 && bitdepth != 16)
      return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
   // Make sure the color mode is RGB.
   // Valid options are:
   if (stbi__get16be(s) != 3)
      return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
   // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
   stbi__skip(s,stbi__get32be(s) );
   // Skip the image resources.  (resolution, pen tool paths, etc)
   stbi__skip(s, stbi__get32be(s) );
   // Skip the reserved data.
   stbi__skip(s, stbi__get32be(s) );
   // Find out if the data is compressed.
   //   0: no compression
   //   1: RLE compressed
   compression = stbi__get16be(s);
   if (compression > 1)
      return stbi__errpuc("bad compression", "PSD has an unknown compression format");
   // Create the destination image.
   out = (stbi_uc *) stbi__malloc(4 * w*h);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   // Initialize the data to zero.
   //memset( out, 0, pixelCount * 4 );
   // Finally, the image data.
      // RLE as used by .PSD and .TIFF
      // Loop until you get the number of unpacked bytes you are expecting:
      //     Read the next source byte into n.
      //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
      //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
      //     Else if n is 128, noop.
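
// Illustrative sketch, not part of the library: the RLE rules listed above,
// written as a standalone decoder over a flat byte buffer. The function name
// packbits_decode and its error convention (returning (size_t)-1 on truncated
// or oversized input) are assumptions made for this example; the loader itself
// reads through a stbi__context and writes one channel with a stride of 4.

```c
#include <assert.h>
#include <stddef.h>

/* Decode RLE data from src into dst. Stops when dst is full or src is
   exhausted. Returns the number of bytes written, or (size_t)-1 if a run
   would read past src or write past dst. */
static size_t packbits_decode(const unsigned char *src, size_t src_len,
                              unsigned char *dst, size_t dst_len)
{
   size_t in = 0, out = 0;

   assert(src != NULL && dst != NULL);
   while (out < dst_len && in < src_len) {
      unsigned n = src[in++];
      if (n == 128) {
         continue;                            /* no-op marker */
      } else if (n < 128) {
         size_t run = (size_t)n + 1;          /* copy next n+1 bytes literally */
         if (run > src_len - in || run > dst_len - out) return (size_t)-1;
         while (run--) dst[out++] = src[in++];
      } else {
         size_t run = 257 - (size_t)n;        /* -n+1 with n read as signed 8-bit */
         unsigned char val;
         if (in >= src_len || run > dst_len - out) return (size_t)-1;
         val = src[in++];
         while (run--) dst[out++] = val;      /* replicate the next byte */
      }
   }
   return out;
}
```

// For instance, the six input bytes 02 'a' 'b' 'c' FE 'x' expand to "abcxxx":
// a literal run of three followed by a replicated run of three.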
5266 // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
5267 // which we're going to just skip.
5268 stbi__skip(s
, h
* channelCount
* 2 );
      // Read the RLE data by channel.
      for (channel = 0; channel < 4; channel++) {
         if (channel >= channelCount) {
            // Fill this channel with default data.
            for (i = 0; i < pixelCount; i++, p += 4)
               *p = (channel == 3 ? 255 : 0);
            // Read the RLE data.
            while (count < pixelCount) {
               len = stbi__get8(s);
               } else if (len < 128) {
                  // Copy next len+1 bytes literally.
               } else if (len > 128) {
                  // Next -len+1 bytes in the dest are replicated from next source byte.
                  // (Interpret len as a negative 8-bit int.)
                  val = stbi__get8(s);
      // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
      // where each channel consists of an 8-bit value for each pixel in the image.

      // Read the data by channel.
      for (channel = 0; channel < 4; channel++) {
         if (channel >= channelCount) {
            // Fill this channel with default data.
            stbi_uc val = channel == 3 ? 255 : 0;
            for (i = 0; i < pixelCount; i++, p += 4)
            if (bitdepth == 16) {
               for (i = 0; i < pixelCount; i++, p += 4)
                  *p = (stbi_uc) (stbi__get16be(s) >> 8);
               for (i = 0; i < pixelCount; i++, p += 4)

   if (req_comp && req_comp != 4) {
      out = stbi__convert_format(out, 4, req_comp, w, h);
      if (out == NULL) return out; // stbi__convert_format frees input on failure

   if (comp) *comp = 4;
// *************************************************************************************************
// Softimage PIC loader
// See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
// See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/

static int stbi__pic_is4(stbi__context *s,const char *str)
      if (stbi__get8(s) != (stbi_uc)str[i])

static int stbi__pic_test_core(stbi__context *s)
   if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
   if (!stbi__pic_is4(s,"PICT"))

   stbi_uc size,type,channel;

static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
   for (i=0; i<4; ++i, mask>>=1) {
      if (channel & mask) {
         if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
         dest[i]=stbi__get8(s);

static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
   for (i=0;i<4; ++i, mask>>=1)
static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
   int act_comp=0,num_packets=0,y,chained;
   stbi__pic_packet packets[10];

   // this will (should...) cater for even some bizarre stuff like having data
   // for the same channel in multiple packets.
      stbi__pic_packet *packet;

      if (num_packets==sizeof(packets)/sizeof(packets[0]))
         return stbi__errpuc("bad format","too many packets");

      packet = &packets[num_packets++];

      chained = stbi__get8(s);
      packet->size    = stbi__get8(s);
      packet->type    = stbi__get8(s);
      packet->channel = stbi__get8(s);

      act_comp |= packet->channel;

      if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
      if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");

   *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?

   for(y=0; y<height; ++y) {
      for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
         stbi__pic_packet *packet = &packets[packet_idx];
         stbi_uc *dest = result+y*width*4;

         switch (packet->type) {
               return stbi__errpuc("bad format","packet has bad compression type");

            case 0: {//uncompressed
               for(x=0;x<width;++x, dest+=4)
                  if (!stbi__readval(s,packet->channel,dest))

                     stbi_uc count,value[4];

                     count=stbi__get8(s);
                     if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");

                        count = (stbi_uc) left;

                     if (!stbi__readval(s,packet->channel,value))  return 0;

                     for(i=0; i<count; ++i,dest+=4)
                        stbi__copyval(packet->channel,dest,value);

            case 2: {//Mixed RLE
                  int count = stbi__get8(s), i;
                  if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");

                  if (count >= 128) { // Repeated
                        count = stbi__get16be(s);
                        return stbi__errpuc("bad file","scanline overrun");

                     if (!stbi__readval(s,packet->channel,value))

                     for(i=0;i<count;++i, dest += 4)
                        stbi__copyval(packet->channel,dest,value);

                     if (count>left) return stbi__errpuc("bad file","scanline overrun");

                     for(i=0;i<count;++i, dest+=4)
                        if (!stbi__readval(s,packet->channel,dest))
static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
   for (i=0; i<92; ++i)

   x = stbi__get16be(s);
   y = stbi__get16be(s);
   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
   // guard against x == 0 before dividing, as stbi__pic_info does
   if (x != 0 && (1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");

   stbi__get32be(s); //skip `ratio'
   stbi__get16be(s); //skip `fields'
   stbi__get16be(s); //skip `pad'

   // intermediate buffer is RGBA
   result = (stbi_uc *) stbi__malloc(x*y*4);
   if (!result) return stbi__errpuc("outofmem", "Out of memory");
   memset(result, 0xff, x*y*4);
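/* Illustrative sketch, not part of stb_image: one way to check that a
   width * height * components product is safe to compute in an int before
   it is handed to an allocator, which is the concern behind the
   (1 << 28) / x < y test above. The helper name dims_fit_in_int is an
   assumption made for this example.

```c
#include <assert.h>
#include <limits.h>

// Hypothetical helper, not an stb_image function.
// Returns 1 if w * h * comps can be computed in an int without overflow.
static int dims_fit_in_int(int w, int h, int comps)
{
   assert(comps > 0);
   if (w < 0 || h < 0) return 0;          // negative sizes are never valid
   if (w == 0 || h == 0) return 1;        // empty image: the product is 0
   if (w > INT_MAX / comps) return 0;     // w * comps would overflow
   if (h > INT_MAX / (w * comps)) return 0;
   return 1;
}
```

   Each division runs only after its divisor is known to be positive, so
   the check itself can neither overflow nor divide by zero.
*/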
   if (!stbi__pic_load_core(s,x,y,comp, result)) {

   if (req_comp == 0) req_comp = *comp;
   result=stbi__convert_format(result,4,req_comp,x,y);

static int stbi__pic_test(stbi__context *s)
   int r = stbi__pic_test_core(s);
// *************************************************************************************************
// GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb

   stbi_uc *out, *old_out;             // output buffer (always 4 components)
   int flags, bgindex, ratio, transparent, eflags, delay;
   stbi_uc  pal[256][4];
   stbi_uc lpal[256][4];
   stbi__gif_lzw codes[4096];
   stbi_uc *color_table;
   int start_x, start_y;

static int stbi__gif_test_raw(stbi__context *s)
   if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
   if (sz != '9' && sz != '7') return 0;
   if (stbi__get8(s) != 'a') return 0;

static int stbi__gif_test(stbi__context *s)
   int r = stbi__gif_test_raw(s);

static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
   for (i=0; i < num_entries; ++i) {
      pal[i][2] = stbi__get8(s);
      pal[i][1] = stbi__get8(s);
      pal[i][0] = stbi__get8(s);
      pal[i][3] = transp == i ? 0 : 255;
static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
   if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
      return stbi__err("not GIF", "Corrupt GIF");

   version = stbi__get8(s);
   if (version != '7' && version != '9')    return stbi__err("not GIF", "Corrupt GIF");
   if (stbi__get8(s) != 'a')                return stbi__err("not GIF", "Corrupt GIF");

   stbi__g_failure_reason = "";
   g->w = stbi__get16le(s);
   g->h = stbi__get16le(s);
   g->flags = stbi__get8(s);
   g->bgindex = stbi__get8(s);
   g->ratio = stbi__get8(s);
   g->transparent = -1;

   if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments

   if (is_info) return 1;

   if (g->flags & 0x80)
      stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);

static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
   if (!stbi__gif_header(s, &g, comp, 1)) {
static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
   // recurse to decode the prefixes, since the linked-list is backwards,
   // and working backwards through an interleaved image would be nasty
   if (g->codes[code].prefix >= 0)
      stbi__out_gif_code(g, g->codes[code].prefix);

   if (g->cur_y >= g->max_y) return;

   p = &g->out[g->cur_x + g->cur_y];
   c = &g->color_table[g->codes[code].suffix * 4];

   if (g->cur_x >= g->max_x) {
      g->cur_x = g->start_x;
      g->cur_y += g->step;

      while (g->cur_y >= g->max_y && g->parse > 0) {
         g->step = (1 << g->parse) * g->line_size;
         g->cur_y = g->start_y + (g->step >> 1);
static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
   stbi__int32 len, init_code;
   stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;

   lzw_cs = stbi__get8(s);
   if (lzw_cs > 12) return NULL;
   clear = 1 << lzw_cs;
   codesize = lzw_cs + 1;
   codemask = (1 << codesize) - 1;
   for (init_code = 0; init_code < clear; init_code++) {
      g->codes[init_code].prefix = -1;
      g->codes[init_code].first = (stbi_uc) init_code;
      g->codes[init_code].suffix = (stbi_uc) init_code;

   // support no starting clear code
      if (valid_bits < codesize) {
            len = stbi__get8(s); // start new block
         bits |= (stbi__int32) stbi__get8(s) << valid_bits;
         stbi__int32 code = bits & codemask;
         valid_bits -= codesize;
         // @OPTIMIZE: is there some way we can accelerate the non-clear path?
         if (code == clear) {  // clear code
            codesize = lzw_cs + 1;
            codemask = (1 << codesize) - 1;
         } else if (code == clear + 1) { // end of stream code
            while ((len = stbi__get8(s)) > 0)
         } else if (code <= avail) {
            if (first) return stbi__errpuc("no clear code", "Corrupt GIF");

               p = &g->codes[avail++];
               if (avail > 4096)        return stbi__errpuc("too many codes", "Corrupt GIF");
               p->prefix = (stbi__int16) oldcode;
               p->first = g->codes[oldcode].first;
               p->suffix = (code == avail) ? p->first : g->codes[code].first;
            } else if (code == avail)
               return stbi__errpuc("illegal code in raster", "Corrupt GIF");

            stbi__out_gif_code(g, (stbi__uint16) code);

            if ((avail & codemask) == 0 && avail <= 0x0FFF) {
               codemask = (1 << codesize) - 1;

            return stbi__errpuc("illegal code in raster", "Corrupt GIF");
static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
   stbi_uc *c = g->pal[g->bgindex];
   for (y = y0; y < y1; y += 4 * g->w) {
      for (x = x0; x < x1; x += 4) {
         stbi_uc *p  = &g->out[y + x];
// this function is designed to support animated gifs, although stb_image doesn't support it
static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
   stbi_uc *prev_out = 0;

   if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
      return 0; // stbi__g_failure_reason set by stbi__gif_header

   g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
   if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");

   switch ((g->eflags & 0x1C) >> 2) {
      case 0: // unspecified (also always used on 1st frame)
         stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
      case 1: // do not dispose
         if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
         g->old_out = prev_out;
      case 2: // dispose to background
         if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
         stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
      case 3: // dispose to previous
            for (i = g->start_y; i < g->max_y; i += 4 * g->w)
               memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
      switch (stbi__get8(s)) {
         case 0x2C: /* Image Descriptor */
            int prev_trans = -1;
            stbi__int32 x, y, w, h;

            x = stbi__get16le(s);
            y = stbi__get16le(s);
            w = stbi__get16le(s);
            h = stbi__get16le(s);
            if (((x + w) > (g->w)) || ((y + h) > (g->h)))
               return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");

            g->line_size = g->w * 4;
            g->start_y = y * g->line_size;
            g->max_x   = g->start_x + w * 4;
            g->max_y   = g->start_y + h * g->line_size;
            g->cur_x   = g->start_x;
            g->cur_y   = g->start_y;

            g->lflags = stbi__get8(s);

            if (g->lflags & 0x40) {
               g->step = 8 * g->line_size; // first interlaced spacing
               g->step = g->line_size;
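/* Illustrative sketch, not part of stb_image: the row order implied by GIF
   interlacing. The loader tracks it incrementally through g->step and
   g->parse; the same order can be written out directly from the four
   passes of the GIF89a format (rows 0, 8, 16, ... then 4, 12, ... then
   2, 6, ... then 1, 3, ...). The function name and the order[] output
   array are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper, not an stb_image function.
// order[k] receives the destination row of the k-th row stored in the file;
// order must have room for height entries.
static void gif_interlace_order(int height, int *order)
{
   static const int start[4] = {0, 4, 2, 1};   // first row of each pass
   static const int step[4]  = {8, 8, 4, 2};   // row spacing within each pass
   int pass, y, n = 0;
   assert(height >= 0 && order != NULL);
   for (pass = 0; pass < 4; ++pass)
      for (y = start[pass]; y < height; y += step[pass])
         order[n++] = y;
}
```

   For an 8-row image this yields rows 0, 4, 2, 6, 1, 3, 5, 7.
*/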
            if (g->lflags & 0x80) {
               stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
               g->color_table = (stbi_uc *) g->lpal;
            } else if (g->flags & 0x80) {
               if (g->transparent >= 0 && (g->eflags & 0x01)) {
                  prev_trans = g->pal[g->transparent][3];
                  g->pal[g->transparent][3] = 0;
               g->color_table = (stbi_uc *) g->pal;
               return stbi__errpuc("missing color table", "Corrupt GIF");

            o = stbi__process_gif_raster(s, g);
            if (o == NULL) return NULL;

            if (prev_trans != -1)
               g->pal[g->transparent][3] = (stbi_uc) prev_trans;

         case 0x21: // Extension Introducer (comment, graphic control, etc.)
            if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
               len = stbi__get8(s);
                  g->eflags = stbi__get8(s);
                  g->delay = stbi__get16le(s);
                  g->transparent = stbi__get8(s);
            while ((len = stbi__get8(s)) != 0)

         case 0x3B: // gif stream termination code
            return (stbi_uc *) s; // using '1' causes warning on some compilers

            return stbi__errpuc("unknown code", "Corrupt GIF");

   STBI_NOTUSED(req_comp);
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   memset(&g, 0, sizeof(g));

   u = stbi__gif_load_next(s, &g, comp, req_comp);
   if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
      if (req_comp && req_comp != 4)
         u = stbi__convert_format(u, 4, req_comp, g.w, g.h);

static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
   return stbi__gif_info_raw(s,x,y,comp);
// *************************************************************************************************
// Radiance RGBE HDR loader
// originally by Nicolas Schulz

static int stbi__hdr_test_core(stbi__context *s)
   const char *signature = "#?RADIANCE\n";
   for (i=0; signature[i]; ++i)
      if (stbi__get8(s) != signature[i])

static int stbi__hdr_test(stbi__context* s)
   int r = stbi__hdr_test_core(s);

#define STBI__HDR_BUFLEN  1024
static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
   c = (char) stbi__get8(z);

   while (!stbi__at_eof(z) && c != '\n') {
      if (len == STBI__HDR_BUFLEN-1) {
         // flush to end of line
         while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
      c = (char) stbi__get8(z);
static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
   if ( input[3] != 0 ) {
      f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
         output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
         output[0] = input[0] * f1;
         output[1] = input[1] * f1;
         output[2] = input[2] * f1;
      if (req_comp == 2) output[1] = 1;
      if (req_comp == 4) output[3] = 1;
         case 4: output[3] = 1; /* fallthrough */
         case 3: output[0] = output[1] = output[2] = 0;
         case 2: output[1] = 1; /* fallthrough */
         case 1: output[0] = 0;
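/* Illustrative sketch, not part of stb_image: the RGBE-to-float step used by
   stbi__hdr_convert above, reduced to the three-component case. Each
   mantissa byte is scaled by 2^(e - 136), where e is the shared exponent
   byte; an exponent of 0 encodes black. The name rgbe_to_rgb is an
   assumption made for this example.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

// Hypothetical helper, not an stb_image function.
// Converts one RGBE pixel (3 mantissa bytes plus a shared exponent) to floats.
static void rgbe_to_rgb(const unsigned char rgbe[4], float rgb[3])
{
   assert(rgbe != NULL && rgb != NULL);
   if (rgbe[3] != 0) {
      // 128 is the exponent bias, 8 accounts for the 8-bit mantissa
      float scale = (float) ldexp(1.0, (int) rgbe[3] - (128 + 8));
      rgb[0] = rgbe[0] * scale;
      rgb[1] = rgbe[1] * scale;
      rgb[2] = rgbe[2] * scale;
   } else {
      rgb[0] = rgb[1] = rgb[2] = 0.0f;
   }
}
```

   With an exponent byte of 129 the scale is 2^-7, so a mantissa of 128
   maps to 1.0 and a mantissa of 64 maps to 0.5.
*/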
static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   char buffer[STBI__HDR_BUFLEN];
   unsigned char count, value;
   int i, j, k, c1,c2, z;

   if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
      return stbi__errpf("not HDR", "Corrupt HDR image");

      token = stbi__hdr_gettoken(s,buffer);
      if (token[0] == 0) break;
      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;

   if (!valid)    return stbi__errpf("unsupported format", "Unsupported HDR format");

   // Parse width and height
   // can't use sscanf() if we're not using stdio!
   token = stbi__hdr_gettoken(s,buffer);
   if (strncmp(token, "-Y ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   height = (int) strtol(token, &token, 10);
   while (*token == ' ') ++token;
   if (strncmp(token, "+X ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   width = (int) strtol(token, NULL, 10);

   if (comp) *comp = 3;
   if (req_comp == 0) req_comp = 3;

   hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));

   // image data is stored as some number of sca
   if ( width < 8 || width >= 32768) {
      for (j=0; j < height; ++j) {
         for (i=0; i < width; ++i) {
            stbi__getn(s, rgbe, 4);
            stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
      // Read RLE-encoded data
      for (j = 0; j < height; ++j) {
         len = stbi__get8(s);
         if (c1 != 2 || c2 != 2 || (len & 0x80)) {
            // not run-length encoded, so we have to actually use THIS data as a decoded
            // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
            rgbe[0] = (stbi_uc) c1;
            rgbe[1] = (stbi_uc) c2;
            rgbe[2] = (stbi_uc) len;
            rgbe[3] = (stbi_uc) stbi__get8(s);
            stbi__hdr_convert(hdr_data, rgbe, req_comp);
            STBI_FREE(scanline);
            goto main_decode_loop; // yes, this makes no sense
         len |= stbi__get8(s);
         if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
         if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);

         for (k = 0; k < 4; ++k) {
               count = stbi__get8(s);
                  value = stbi__get8(s);
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = value;
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = stbi__get8(s);
         for (i=0; i < width; ++i)
            stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
      STBI_FREE(scanline);
static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
   char buffer[STBI__HDR_BUFLEN];

   if (stbi__hdr_test(s) == 0) {

      token = stbi__hdr_gettoken(s,buffer);
      if (token[0] == 0) break;
      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;

   token = stbi__hdr_gettoken(s,buffer);
   if (strncmp(token, "-Y ", 3)) {
   *y = (int) strtol(token, &token, 10);
   while (*token == ' ') ++token;
   if (strncmp(token, "+X ", 3)) {
   *x = (int) strtol(token, NULL, 10);
#endif // STBI_NO_HDR
static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
   stbi__bmp_data info;

   p = stbi__bmp_parse_header(s, &info);
   *comp = info.ma ? 4 : 3;

static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
   if (stbi__get32be(s) != 0x38425053) {
   if (stbi__get16be(s) != 1) {
   channelCount = stbi__get16be(s);
   if (channelCount < 0 || channelCount > 16) {
   *y = stbi__get32be(s);
   *x = stbi__get32be(s);
   if (stbi__get16be(s) != 8) {
   if (stbi__get16be(s) != 3) {
static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
   int act_comp=0,num_packets=0,chained;
   stbi__pic_packet packets[10];

   if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {

   *x = stbi__get16be(s);
   *y = stbi__get16be(s);
   if (stbi__at_eof(s)) {
   if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {

      stbi__pic_packet *packet;

      if (num_packets==sizeof(packets)/sizeof(packets[0]))

      packet = &packets[num_packets++];
      chained = stbi__get8(s);
      packet->size    = stbi__get8(s);
      packet->type    = stbi__get8(s);
      packet->channel = stbi__get8(s);
      act_comp |= packet->channel;

      if (stbi__at_eof(s)) {
      if (packet->size != 8) {

   *comp = (act_comp & 0x10 ? 4 : 3);
// *************************************************************************************************
// Portable Gray Map and Portable Pixel Map loader
// PGM: http://netpbm.sourceforge.net/doc/pgm.html
// PPM: http://netpbm.sourceforge.net/doc/ppm.html
// Known limitations:
//    Does not support comments in the header section
//    Does not support ASCII image data (formats P2 and P3)
//    Does not support 16-bit-per-channel

static int      stbi__pnm_test(stbi__context *s)
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {

static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))

   out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   stbi__getn(s, out, s->img_n * s->img_x * s->img_y);

   if (req_comp && req_comp != s->img_n) {
      out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
      if (out == NULL) return out; // stbi__convert_format frees input on failure
static int      stbi__pnm_isspace(char c)
   return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';

static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
      while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
         *c = (char) stbi__get8(s);

      if (stbi__at_eof(s) || *c != '#')

      while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
         *c = (char) stbi__get8(s);

static int      stbi__pnm_isdigit(char c)
   return c >= '0' && c <= '9';

static int      stbi__pnm_getinteger(stbi__context *s, char *c)
   while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
      value = value*10 + (*c - '0');
      *c = (char) stbi__get8(s);
static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {

   *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm

   c = (char) stbi__get8(s);
   stbi__pnm_skip_whitespace(s, &c);

   *x = stbi__pnm_getinteger(s, &c); // read width
   stbi__pnm_skip_whitespace(s, &c);

   *y = stbi__pnm_getinteger(s, &c); // read height
   stbi__pnm_skip_whitespace(s, &c);

   maxv = stbi__pnm_getinteger(s, &c);  // read max value

      return stbi__err("max value > 255", "PPM image not 8-bit");
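/* Illustrative sketch, not part of stb_image: the header walk performed by
   stbi__pnm_info above (magic, width, height, max value, with whitespace
   and '#' comments skipped between fields), rewritten over an in-memory
   buffer so it can be read in one piece. The pnm_cursor type and the
   three function names are assumptions made for this example, and the
   integer reader does not guard against overflow on very long digit runs.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical in-memory reader, not an stb_image type.
typedef struct { const char *p, *end; } pnm_cursor;

// Skips whitespace and any '#' comments that follow it.
static void pnm_skip_ws_and_comments(pnm_cursor *c)
{
   for (;;) {
      // ' ' plus the range '\t'..'\r' covers \t \n \v \f \r
      while (c->p < c->end && (*c->p == ' ' || (*c->p >= '\t' && *c->p <= '\r'))) ++c->p;
      if (c->p >= c->end || *c->p != '#') return;
      while (c->p < c->end && *c->p != '\n' && *c->p != '\r') ++c->p;
   }
}

// Reads a run of decimal digits; returns 0 if there are none.
static int pnm_read_int(pnm_cursor *c)
{
   int value = 0;
   while (c->p < c->end && *c->p >= '0' && *c->p <= '9')
      value = value * 10 + (*c->p++ - '0');
   return value;
}

// Returns 1 and fills w, h, comp for a binary PGM/PPM header with a max
// value of at most 255; returns 0 otherwise.
static int pnm_parse_header(const char *buf, size_t len, int *w, int *h, int *comp)
{
   pnm_cursor c;
   int maxv;
   assert(buf != NULL && w != NULL && h != NULL && comp != NULL);
   if (len < 2 || buf[0] != 'P' || (buf[1] != '5' && buf[1] != '6')) return 0;
   *comp = (buf[1] == '6') ? 3 : 1;   // P5 is 1-component, P6 is 3-component
   c.p = buf + 2;
   c.end = buf + len;
   pnm_skip_ws_and_comments(&c); *w = pnm_read_int(&c);
   pnm_skip_ws_and_comments(&c); *h = pnm_read_int(&c);
   pnm_skip_ws_and_comments(&c); maxv = pnm_read_int(&c);
   return maxv > 0 && maxv <= 255;
}
```

   The ASCII variants (P2 and P3) are rejected at the magic check, in line
   with the known limitations listed for the loader.
*/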
static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_info(s, x, y, comp)) return 1;
   if (stbi__png_info(s, x, y, comp))  return 1;
   if (stbi__gif_info(s, x, y, comp))  return 1;
   if (stbi__bmp_info(s, x, y, comp))  return 1;
   if (stbi__psd_info(s, x, y, comp))  return 1;
   if (stbi__pic_info(s, x, y, comp))  return 1;
   if (stbi__pnm_info(s, x, y, comp))  return 1;
   if (stbi__hdr_info(s, x, y, comp))  return 1;

   // test tga last because it's a crappy test!
   if (stbi__tga_info(s, x, y, comp))
   return stbi__err("unknown image type", "Image not of any known type, or corrupt");
#ifndef STBI_NO_STDIO
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
    FILE *f = stbi__fopen(filename, "rb");
    if (!f) return stbi__err("can't fopen", "Unable to open file");
    result = stbi_info_from_file(f, x, y, comp);

STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
   long pos = ftell(f);
   stbi__start_file(&s, f);
   r = stbi__info_main(&s,x,y,comp);
   fseek(f,pos,SEEK_SET);
#endif // !STBI_NO_STDIO

STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   stbi__start_mem(&s,buffer,len);
   return stbi__info_main(&s,x,y,comp);

STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   return stbi__info_main(&s,x,y,comp);

#endif // STB_IMAGE_IMPLEMENTATION
      2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
      2.09  (2016-01-16) allow comments in PNM files
                         16-bit-per-pixel TGA (not bit-per-component)
                         info() for TGA could break due to .hdr handling
                         info() for BMP now shares code instead of sloppy parse
                         can use STBI_REALLOC_SIZED if allocator doesn't support realloc
      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
      2.07  (2015-09-13) fix compiler warnings
                         partial animated GIF support
                         limited 16-bpc PSD support
                         #ifdef unused functions
                         bug with < 92 byte PIC,PNM,HDR,TGA
      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
      2.03  (2015-04-12) extra corruption checking (mmozeiko)
                         stbi_set_flip_vertically_on_load (nguillemot)
                         fix NEON support; fix mingw support
      2.02  (2015-01-19) fix incorrect assert, fix warning
      2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
      2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
                         progressive JPEG (stb)
                         PGM/PPM support (Ken Miller)
                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
                         GIF bugfix -- seemingly never worked
                         STBI_NO_*, STBI_ONLY_*
      1.48  (2014-12-14) fix incorrectly-named assert()
      1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
                         fix bug in interlaced PNG with user-specified channel count (stb)
              fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
              fix MSVC-ARM internal compiler error by wrapping malloc
              various warning fixes from Ronny Chevalier
              fix MSVC-only compiler problem in code changed in 1.42
              don't define _CRT_SECURE_NO_WARNINGS (affects user code)
              fixes to stbi__cleanup_jpeg path
              added STBI_ASSERT to avoid requiring assert.h
              fix search&replace from 1.36 that messed up comments/error messages
              fix gcc struct-initialization warning
              fix to TGA optimization when req_comp != number of components in TGA;
              fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
              add support for BMP version 5 (more ignored fields)
              suppress MSVC warnings on integer casts truncating values
              fix accidental rename of 'skip' field of I/O
              remove duplicate typedef
              convert to header file single-file library
              if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
              fix broken STBI_SIMD path
              fix bug where stbi_load_from_file no longer left file pointer in correct place
              fix broken non-easy path for 32-bit BMP (possibly never used)
              TGA optimization by Arseny Kapoulkine
              use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
              make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
              support for "info" function for all supported filetypes (SpartanJ)
              a few more leak fixes, bug in PNG handling (SpartanJ)
              added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
              removed deprecated format-specific test/load functions
              removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
              error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
              fix inefficiency in decoding 32-bit BMP (David Woo)
              various warning fixes from Aurelien Pocheville
              fix bug in GIF palette transparency (SpartanJ)
              cast-to-stbi_uc to fix warnings
              fix bug in file buffering for PNG reported by SpartanJ
              refix trans_data warning (Won Chun)
              perf improvements reading from files on platforms with lock-heavy fgetc()
              minor perf improvements for jpeg
              deprecated type-specific functions so we'll get feedback if they're needed
              attempt to fix trans_data warning (Won Chun)
      1.23    fixed bug in iPhone support
              removed image *writing* support
              stbi_info support from Jetro Lauha
              GIF support from Jean-Marc Lienher
              iPhone PNG-extensions from James Brown
              warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez Žemva)
      1.21    fix use of 'stbi_uc' in header (reported by jon blow)
      1.20    added support for Softimage PIC, by Tom Seddon
      1.19    bug in interlaced PNG corruption check (found by ryg)
              fix a threading bug (local mutable static)
      1.17    support interlaced PNG
      1.16    major bugfix - stbi__convert_format converted one too many pixels
      1.15    initialize some fields for thread safety
      1.14    fix threadsafe conversion bug
              header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
      1.12    const qualifiers in the API
      1.11    Support installable IDCT, colorspace conversion routines
      1.10    Fixes for 64-bit (don't use "unsigned long")
              optimized upsampling by Fabian "ryg" Giesen
      1.09    Fix format-conversion for PSD code (bad global variables!)
      1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
      1.07    attempt to fix C++ warning/errors again
      1.06    attempt to fix C++ warning/errors again
      1.05    fix TGA loading to return correct *comp and use good luminance calc
      1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
      1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
      1.02    support for (subset of) HDR files, float interface for preferred access to them
      1.01    fix bug: possible bug in handling right-side up bmps... not sure
              fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
      1.00    interface to zlib that skips zlib header
      0.99    correct handling of alpha in palette
      0.98    TGA loader by lonesock; dynamically add loaders (untested)
      0.97    jpeg errors on too large a file; also catch another malloc failure
      0.96    fix detection of invalid v value - particleman@mollyrocket forum
      0.95    during header scan, seek to markers in case of padding
      0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
      0.93    handle jpegtran output; verbose errors
      0.92    read 4,8,16,24,32-bit BMP files of several formats
      0.91    output 24-bit Windows 3.0 BMP files
      0.90    fix a few more warnings; bump version number to approach 1.0
      0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
      0.60    fix compiling as c++
      0.59    fix warnings: merge Dave Moore's -Wall fixes
      0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
      0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
      0.56    fix bug: zlib uncompressed mode len vs. nlen
      0.55    fix bug: restart_interval not initialized to 0
      0.54    allow NULL for 'int *comp'
      0.53    fix bug in png 3->4; speedup png decoding
      0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
      0.51    obey req_comp requests, 1-component jpegs return as 1-component,
              on 'test' only check type, not whether we support this variant
              first released version