1 /* stb_image - v2.08 - public domain image loader - http://nothings.org/stb_image.h
2 no warranty implied; use at your own risk
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
8 // i.e. it should look like this:
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc, realloc, and free.
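These macros take no context parameter (see the release notes below). A
minimal sketch of one way to route them through a counting allocator whose
state lives in a global. All example_* names are hypothetical and not part
of stb_image:

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical allocator context kept in a global, because the macros
// cannot carry a context argument.
typedef struct { size_t live_blocks; } example_alloc_stats;
static example_alloc_stats example_stats;

static void *example_malloc(size_t sz)
{
    void *p = malloc(sz);
    if (p) example_stats.live_blocks++;
    return p;
}

static void *example_realloc(void *p, size_t sz)
{
    void *q = realloc(p, sz);
    // realloc(NULL, sz) behaves like malloc, so it creates a new block
    if (q && !p) example_stats.live_blocks++;
    return q;
}

static void example_free(void *p)
{
    if (p) {
        assert(example_stats.live_blocks > 0);
        example_stats.live_blocks--;
    }
    free(p);
}

// Define all three or none of them.
#define STBI_MALLOC(sz)     example_malloc(sz)
#define STBI_REALLOC(p,sz)  example_realloc(p,sz)
#define STBI_FREE(p)        example_free(p)
```

The three #defines would go before the #include that creates the
implementation.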
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24 PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
26 TGA (not sure what subset, if a subset)
28 PSD (composited view only, no extra channels, 8/16 bit-per-channel)
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
33 PNM (PPM and PGM binary only)
35 Animated GIF still needs a proper API, but here's one way to do it:
36 http://gist.github.com/urraka/685d9a6340b26b830d49
38 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39 - decode from arbitrary I/O callbacks
40 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
42 Full documentation under "DOCUMENTATION" below.
45 Revision 2.00 release notes:
47 - Progressive JPEG is now supported.
49 - PPM and PGM binary formats are now supported, thanks to Ken Miller.
51 - x86 platforms now make use of SSE2 SIMD instructions for
52 JPEG decoding, and ARM platforms can use NEON SIMD if requested.
53 This work was done by Fabian "ryg" Giesen. SSE2 is used by
54 default, but NEON must be enabled explicitly; see docs.
With other JPEG optimizations included in this version, we see
2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
on a JPEG on an ARM machine, relative to previous versions of this
library. Expect different results for other JPEGs and for other
x86/ARM machines. (Note that progressive JPEGs are significantly
slower to decode than regular JPEGs.) This doesn't mean that this
is the fastest JPEG decoder in the land; rather, it brings it
closer to parity with standard libraries. If you want the fastest
decode, look elsewhere. (See "Philosophy" section of docs below.)
66 See final bullet items below for more info on SIMD.
- Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
the memory allocator. Unlike other stb libraries, these macros don't
support a context parameter, so if you need to pass a context in to
the allocator, you'll have to store it in a global or a thread-local
variable.
- Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
STBI_NO_LINEAR:
STBI_NO_HDR: suppress implementation of .hdr reader format
STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
79 - You can suppress implementation of any of the decoders to reduce
80 your code footprint by #defining one or more of the following
81 symbols before creating the implementation.
91 STBI_NO_PNM (.ppm and .pgm)
93 - You can request *only* certain decoders and suppress all other ones
94 (this will be more forward-compatible, as addition of new decoders
95 doesn't require you to disable them explicitly):
105 STBI_ONLY_PNM (.ppm and .pgm)
107 Note that you can define multiples of these, and you will get all
108 of them ("only x" and "only y" is interpreted to mean "only x&y").
110 - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
111 want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
- Compilation of all SIMD code can be suppressed with
#define STBI_NO_SIMD
It should not be necessary to disable SIMD unless you have issues
compiling (e.g. using an x86 compiler which doesn't support SSE
intrinsics or that doesn't support the method used to detect
SSE2 support at run-time), and even those can be reported as
bugs so I can refine the built-in compile-time checking.
122 - The old STBI_SIMD system which allowed installing a user-defined
123 IDCT etc. has been removed. If you need this, don't upgrade. My
124 assumption is that almost nobody was doing this, and those who
125 were will find the built-in SIMD more satisfactory anyway.
127 - RGB values computed for JPEG images are slightly different from
128 previous versions of stb_image. (This is due to using less
129 integer precision in SIMD.) The C code has been adjusted so
130 that the same RGB values will be computed regardless of whether
131 SIMD support is available, so your app should always produce
132 consistent results. But these results are slightly different from
133 previous versions. (Specifically, about 3% of available YCbCr values
134 will compute different RGB results from pre-1.49 versions by +-1;
135 most of the deviating values are one smaller in the G channel.)
137 - If you must produce consistent results with previous versions of
138 stb_image, #define STBI_JPEG_OLD and you will get the same results
139 you used to; however, you will not get the SIMD speedups for
140 the YCbCr-to-RGB conversion step (although you should still see
141 significant JPEG speedup from the other changes).
143 Please note that STBI_JPEG_OLD is a temporary feature; it will be
144 removed in future versions of the library. It is only intended for
145 near-term back-compatibility use.
148 Latest revision history:
149 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
150 2.07 (2015-09-13) partial animated GIF support
151 limited 16-bit PSD support
152 minor bugs, code cleanup, and compiler warnings
153 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
154 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
155 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
156 2.03 (2015-04-12) additional corruption checking
157 stbi_set_flip_vertically_on_load
158 fix NEON support; fix mingw support
159 2.02 (2015-01-19) fix incorrect assert, fix warning
160 2.01 (2015-01-17) fix various warnings
161 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
162 2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
165 STBI_MALLOC,STBI_REALLOC,STBI_FREE
166 STBI_NO_*, STBI_ONLY_*
168 1.48 (2014-12-14) fix incorrectly-named assert()
169 1.47 (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
171 fix bug in interlaced PNG with user-specified channel count
173 See end of file for full revision history.
============================ Contributors =========================

Image formats                             Bug fixes & warning fixes
   Sean Barrett (jpeg, png, bmp)          Marc LeBlanc
   Nicolas Schulz (hdr, psd)              Christpher Lloyd
   Jonathan Dummer (tga)                  Dave Moore
   Jean-Marc Lienher (gif)                Won Chun
   Tom Seddon (pic)                       the Horde3D community
   Thatcher Ulrich (psd)                  Janez Zemva
   Ken Miller (pgm, ppm)                  Jonathan Blow
   urraka@github (animated gif)           Laurent Gomila
Extensions, features                      Martin Golini
   Jetro Lauha (stbi_info)                Roy Eltham
   Martin "SpartanJ" Golini (stbi_info)   Luke Graham
   James "moose2000" Brown (iPhone PNG)   Thomas Ruf
   Ben "Disch" Wenger (io callbacks)      John Bartholomew
   Omar Cornut (1/2/4-bit PNG)            Ken Hamada
   Nicolas Guillemot (vertical flip)      Cort Stratton
   Richard Mitton (16-bit PSD)            Blazej Dariusz Roszkowski
Optimizations & bugfixes                  Michal Cichon
   Fabian "ryg" Giesen                    Tero Hanninen
   Arseny Kapoulkine                      Sergio Gonzalez
If your name should be here but           Martins Mozeiko
isn't, let Sean know.                     Joseph Thomson
                                          Michaelangel007@github
219 This software is in the public domain. Where that dedication is not
220 recognized, you are granted a perpetual, irrevocable license to copy,
221 distribute, and modify this file as you see fit.
225 #ifndef STBI_INCLUDE_STB_IMAGE_H
226 #define STBI_INCLUDE_STB_IMAGE_H
231 // - no 16-bit-per-channel PNG
232 // - no 12-bit-per-channel JPEG
233 // - no JPEGs with arithmetic coding
235 // - GIF always returns *comp=4
237 // Basic usage (see HDR discussion below for HDR usage):
239 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
240 // // ... process data if not NULL ...
241 // // ... x = width, y = height, n = # 8-bit components per pixel ...
242 // // ... replace '0' with '1'..'4' to force that many components per pixel
243 // // ... but 'n' will always be the number that it would have been if you said 0
244 // stbi_image_free(data)
246 // Standard parameters:
247 // int *x -- outputs image width in pixels
248 // int *y -- outputs image height in pixels
249 // int *comp -- outputs # of image components in image file
250 // int req_comp -- if non-zero, # of image components requested in result
252 // The return value from an image loader is an 'unsigned char *' which points
253 // to the pixel data, or NULL on an allocation failure or if the image is
254 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
255 // with each pixel consisting of N interleaved 8-bit components; the first
256 // pixel pointed to is top-left-most in the image. There is no padding between
257 // image scanlines or between pixels, regardless of format. The number of
258 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
259 // If req_comp is non-zero, *comp has the number of components that _would_
260 // have been output otherwise. E.g. if you set req_comp to 4, you will always
261 // get RGBA output, but you can check *comp to see if it's trivially opaque
262 // because e.g. there were only 3 channels in the source image.
264 // An output image with N components has the following components interleaved
265 // in this order in each pixel:
267 // N=#comp components
270 // 3 red, green, blue
271 // 4 red, green, blue, alpha
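//
// A minimal sketch of addressing one component in the interleaved layout
// described above, assuming n is the effective component count (req_comp
// if non-zero, else *comp). The helper name is hypothetical and not part
// of stb_image.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of component c of pixel (px, py) in a tightly packed image
   that is w pixels wide with n interleaved 8-bit components per pixel.
   Row 0 is the top scanline; there is no padding between rows. */
static size_t example_component_offset(int w, int n, int px, int py, int c)
{
    assert(w > 0 && n >= 1 && n <= 4);
    assert(px >= 0 && px < w && py >= 0 && c >= 0 && c < n);
    return ((size_t)py * (size_t)w + (size_t)px) * (size_t)n + (size_t)c;
}
```

// With n == 4, for example, the alpha value of pixel (px, py) would be
// data[example_component_offset(w, 4, px, py, 3)].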
273 // If image loading fails for any reason, the return value will be NULL,
274 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
275 // can be queried for an extremely brief, end-user unfriendly explanation
276 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
277 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
278 // more user-friendly ones.
280 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
282 // ===========================================================================
286 // stb libraries are designed with the following priorities:
289 // 2. easy to maintain
290 // 3. good performance
292 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
293 // and for best performance I may provide less-easy-to-use APIs that give higher
294 // performance, in addition to the easy to use ones. Nevertheless, it's important
295 // to keep in mind that from the standpoint of you, a client of this library,
296 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
298 // Some secondary priorities arise directly from the first two, some of which
299 // make more explicit reasons why performance can't be emphasized.
301 // - Portable ("ease of use")
302 // - Small footprint ("easy to maintain")
303 // - No dependencies ("ease of use")
305 // ===========================================================================
// I/O callbacks allow you to read from arbitrary sources, like packaged
// files or some other source. Data read from callbacks are processed
// through a small internal buffer (currently 128 bytes) to try to reduce
// overhead.
//
// The three functions you must define are "read" (reads some bytes of data),
// "skip" (skips some bytes of data), and "eof" (reports if the stream is at the end).
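//
// A minimal sketch of the three callbacks for an in-memory stream. The
// struct example_io_callbacks is a local mirror of the callback signatures
// declared in this header (the real type is stbi_io_callbacks), and
// example_mem_stream is a hypothetical user-data type.

```c
#include <assert.h>
#include <string.h>

/* Local mirror of the documented callback signatures. */
typedef struct {
    int  (*read)(void *user, char *data, int size);
    void (*skip)(void *user, int n);
    int  (*eof)(void *user);
} example_io_callbacks;

/* Hypothetical user state: a read cursor over an in-memory blob. */
typedef struct {
    const unsigned char *bytes;
    int len;
    int pos;
} example_mem_stream;

/* Fill 'data' with up to 'size' bytes; return the number actually read. */
static int example_read(void *user, char *data, int size)
{
    example_mem_stream *m = (example_mem_stream *)user;
    int left = m->len - m->pos;
    int n = size < left ? size : left;
    assert(size >= 0);
    memcpy(data, m->bytes + m->pos, (size_t)n);
    m->pos += n;
    return n;
}

/* Skip n bytes forward, or move back -n bytes if n is negative. */
static void example_skip(void *user, int n)
{
    example_mem_stream *m = (example_mem_stream *)user;
    m->pos += n;
    if (m->pos < 0) m->pos = 0;
    if (m->pos > m->len) m->pos = m->len;
}

/* Nonzero once the cursor has reached the end of the data. */
static int example_eof(void *user)
{
    example_mem_stream *m = (example_mem_stream *)user;
    return m->pos >= m->len;
}

static const example_io_callbacks example_callbacks = {
    example_read, example_skip, example_eof
};
```

// With the real stbi_io_callbacks type, the same three functions and a
// pointer to the stream state would be passed to stbi_load_from_callbacks.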
317 // ===========================================================================
// The JPEG decoder will try to automatically use SIMD kernels on x86 when
// supported by the compiler. For ARM NEON support, you must explicitly
// request it (see STBI_NEON below).
//
// (The old do-it-yourself SIMD API is no longer supported in the current
// version.)
328 // On x86, SSE2 will automatically be used when available based on a run-time
329 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
330 // the typical path is to have separate builds for NEON and non-NEON devices
331 // (at least this is true for iOS and Android). Therefore, the NEON support is
332 // toggled by a build flag: define STBI_NEON to get NEON loops.
// The output of the JPEG decoder is slightly different from versions
// released before SIMD support was introduced (that is, versions before
// 1.49). The difference is only +-1 in the 8-bit RGB channels, and only on
// a small fraction of pixels. You can force the pre-1.49 behavior by defining
// STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
// and hence cost some performance.
341 // If for some reason you do not want to use any of SIMD code, or if
342 // you have issues compiling it, you can disable it entirely by
343 // defining STBI_NO_SIMD.
345 // ===========================================================================
347 // HDR image support (disable by defining STBI_NO_HDR)
// stb_image now supports loading HDR images. The support is provided
// generically, but the only HDR format currently read is the Radiance
// .HDR file format. You can still load any file through the existing interface;
// if you attempt to load an HDR file, it will be automatically remapped to
// LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
// both of these constants can be reconfigured through this interface:
//
//     stbi_hdr_to_ldr_gamma(2.2f);
//     stbi_hdr_to_ldr_scale(1.0f);
//
// (note, do not use _inverse_ constants; stb_image will invert them
// appropriately)
362 // Additionally, there is a new, parallel interface for loading files as
363 // (linear) floats to preserve the full dynamic range:
365 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
367 // If you load LDR images through this interface, those images will
368 // be promoted to floating point values, run through the inverse of
369 // constants corresponding to the above:
371 // stbi_ldr_to_hdr_scale(1.0f);
372 // stbi_ldr_to_hdr_gamma(2.2f);
// Finally, given a filename (or an open file or memory block--see header
// file for details) containing image data, you can query for the "most
// appropriate" interface to use (that is, whether the image is HDR or
// not), using:
//
//     stbi_is_hdr(char *filename);
381 // ===========================================================================
383 // iPhone PNG support:
// By default we convert iphone-formatted PNGs back to RGB, even though
// they are internally encoded differently. You can disable this conversion
// by calling stbi_convert_iphone_png_to_rgb(0), in which case
// you will always just get the native iphone "format" through (which
// is BGR stored in RGB).
391 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
392 // pixel to remove any premultiplied alpha *only* if the image file explicitly
393 // says there's premultiplied data (currently only happens in iPhone images,
394 // and only if iPhone convert-to-rgb processing is on).
398 #ifndef STBI_NO_STDIO
400 #endif // STBI_NO_STDIO
402 #define STBI_VERSION 1
   STBI_default = 0, // only used for req_comp

typedef unsigned char stbi_uc;
420 #ifdef STB_IMAGE_STATIC
421 #define STBIDEF static
423 #define STBIDEF extern
426 //////////////////////////////////////////////////////////////////////////////
428 // PRIMARY API - works on images of any type
432 // load image by filename, open file, or memory buffer
   int  (*read)(void *user, char *data, int size); // fill 'data' with 'size' bytes. return number of bytes actually read
   void (*skip)(void *user, int n);                // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
   int  (*eof)(void *user);                        // returns nonzero if we are at end of file/data
STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp);
STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);

#ifndef STBI_NO_STDIO
STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp);
448 // for stbi_load_from_file, file pointer is left pointing immediately after image
451 #ifndef STBI_NO_LINEAR
STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp);
STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);

#ifndef STBI_NO_STDIO
STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp);
STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
STBIDEF void stbi_hdr_to_ldr_scale(float scale);

#ifndef STBI_NO_LINEAR
STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
STBIDEF void stbi_ldr_to_hdr_scale(float scale);
#endif // STBI_NO_LINEAR
471 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
#ifndef STBI_NO_STDIO
STBIDEF int stbi_is_hdr(char const *filename);
STBIDEF int stbi_is_hdr_from_file(FILE *f);
477 #endif // STBI_NO_STDIO
// get a VERY brief reason for failure
STBIDEF const char *stbi_failure_reason(void);

// free the loaded image -- this is just free()
STBIDEF void stbi_image_free(void *retval_from_stbi_load);
// get image dimensions & components without fully decoding
STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);

#ifndef STBI_NO_STDIO
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp);
STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp);
// for image formats that explicitly notate that they have premultiplied alpha,
// we just return the colors as stored in the file. set this flag to force
// unpremultiplication. results are undefined if the unpremultiply overflows.
STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);

// indicate whether we should process iphone images back to canonical format,
// or just pass them through "as-is"
STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);

// flip the image vertically, so the first pixel in the output array is the bottom left
STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
// ZLIB client - used by PNG, available for other purposes

STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);

STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
528 //// end header file /////////////////////////////////////////////////////
529 #endif // STBI_INCLUDE_STB_IMAGE_H
531 #ifdef STB_IMAGE_IMPLEMENTATION
533 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
534 || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
535 || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
536 || defined(STBI_ONLY_ZLIB)
537 #ifndef STBI_ONLY_JPEG
540 #ifndef STBI_ONLY_PNG
543 #ifndef STBI_ONLY_BMP
546 #ifndef STBI_ONLY_PSD
549 #ifndef STBI_ONLY_TGA
552 #ifndef STBI_ONLY_GIF
555 #ifndef STBI_ONLY_HDR
558 #ifndef STBI_ONLY_PIC
561 #ifndef STBI_ONLY_PNM
566 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
572 #include <stddef.h> // ptrdiff_t on osx
576 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
577 #include <math.h> // ldexp
580 #ifndef STBI_NO_STDIO
586 #define STBI_ASSERT(x) assert(x)
592 #define stbi_inline inline
597 #define stbi_inline __forceinline
typedef unsigned short stbi__uint16;
typedef   signed short stbi__int16;
typedef unsigned int   stbi__uint32;
typedef   signed int   stbi__int32;

typedef uint16_t stbi__uint16;
typedef int16_t  stbi__int16;
typedef uint32_t stbi__uint32;
typedef int32_t  stbi__int32;

// should produce compiler error if size is wrong
typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
618 #define STBI_NOTUSED(v) (void)(v)
620 #define STBI_NOTUSED(v) (void)sizeof(v)
624 #define STBI_HAS_LROTL
627 #ifdef STBI_HAS_LROTL
628 #define stbi_lrot(x,y) _lrotl(x,y)
630 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
633 #if defined(STBI_MALLOC) && defined(STBI_FREE) && defined(STBI_REALLOC)
635 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC)
638 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC."
642 #define STBI_MALLOC(sz) malloc(sz)
643 #define STBI_REALLOC(p,sz) realloc(p,sz)
644 #define STBI_FREE(p) free(p)
648 #if defined(__x86_64__) || defined(_M_X64)
649 #define STBI__X64_TARGET
650 #elif defined(__i386) || defined(_M_IX86)
651 #define STBI__X86_TARGET
654 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
// NOTE: it is not clear whether we actually need this for the 64-bit path.
656 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
657 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
658 // this is just broken and gcc are jerks for not fixing it properly
659 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
663 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
664 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
666 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
667 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
668 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
669 // simultaneously enabling "-mstackrealign".
671 // See https://github.com/nothings/stb/issues/81 for more information.
673 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
674 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
678 #if !defined(STBI_NO_SIMD) && defined(STBI__X86_TARGET)
680 #include <emmintrin.h>
684 #if _MSC_VER >= 1400 // not VC6
685 #include <intrin.h> // __cpuid
686 static int stbi__cpuid3(void)
693 static int stbi__cpuid3(void)
705 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
static int stbi__sse2_available()
   int info3 = stbi__cpuid3();
   return ((info3 >> 26) & 1) != 0;
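
// The test above reads bit 26 of the value returned by stbi__cpuid3. On x86,
// bit 26 of EDX from CPUID leaf 1 is the SSE2 feature flag. A minimal
// sketch of the same bit test on a plain integer, so it can be checked
// without executing CPUID. The function name is hypothetical.

```c
#include <assert.h>

/* Return nonzero if the SSE2 feature bit (bit 26) is set in the EDX value
   reported by CPUID leaf 1. */
static int example_edx_has_sse2(unsigned int edx)
{
    return ((edx >> 26) & 1u) != 0;
}
```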
712 #else // assume GCC-style if not VC++
713 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
715 static int stbi__sse2_available()
717 #if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
718 // GCC 4.8+ has a nice way to do this
719 return __builtin_cpu_supports("sse2");
721 // portable way to do this, preferably without using GCC inline ASM?
722 // just bail for now.
730 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
735 #include <arm_neon.h>
736 // assume GCC or Clang on ARM targets
737 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
740 #ifndef STBI_SIMD_ALIGN
741 #define STBI_SIMD_ALIGN(type, name) type name
744 ///////////////////////////////////////////////
746 // stbi__context struct and start_xxx functions
748 // stbi__context structure is our basic context used by all images, so it
749 // contains all the IO context, plus some basic image information
   stbi__uint32 img_x, img_y;
   int img_n, img_out_n;

   stbi_io_callbacks io;

   int read_from_callbacks;
   stbi_uc buffer_start[128];

   stbi_uc *img_buffer, *img_buffer_end;
   stbi_uc *img_buffer_original, *img_buffer_original_end;

static void stbi__refill_buffer(stbi__context *s);

// initialize a memory-decode context
static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
   s->read_from_callbacks = 0;
   s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
   s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;

// initialize a callback-based context
static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
   s->io_user_data = user;
   s->buflen = sizeof(s->buffer_start);
   s->read_from_callbacks = 1;
   s->img_buffer_original = s->buffer_start;
   stbi__refill_buffer(s);
   s->img_buffer_original_end = s->img_buffer_end;
790 #ifndef STBI_NO_STDIO
static int stbi__stdio_read(void *user, char *data, int size)
   return (int) fread(data,1,size,(FILE*) user);

static void stbi__stdio_skip(void *user, int n)
   fseek((FILE*) user, n, SEEK_CUR);

static int stbi__stdio_eof(void *user)
   return feof((FILE*) user);

static stbi_io_callbacks stbi__stdio_callbacks =

static void stbi__start_file(stbi__context *s, FILE *f)
   stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);

//static void stop_file(stbi__context *s) { }
821 #endif // !STBI_NO_STDIO
static void stbi__rewind(stbi__context *s)
   // conceptually rewind SHOULD rewind to the beginning of the stream,
   // but we just rewind to the beginning of the initial buffer, because
   // we only use it after doing 'test', which only ever looks at at most 92 bytes
   s->img_buffer = s->img_buffer_original;
   s->img_buffer_end = s->img_buffer_original_end;
static int      stbi__jpeg_test(stbi__context *s);
static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);

static int      stbi__png_test(stbi__context *s);
static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);

static int      stbi__bmp_test(stbi__context *s);
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);

static int      stbi__tga_test(stbi__context *s);
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);

static int      stbi__psd_test(stbi__context *s);
static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);

static int      stbi__hdr_test(stbi__context *s);
static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);

static int      stbi__pic_test(stbi__context *s);
static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);

static int      stbi__gif_test(stbi__context *s);
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);

static int      stbi__pnm_test(stbi__context *s);
static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
// this is not threadsafe
static const char *stbi__g_failure_reason;

STBIDEF const char *stbi_failure_reason(void)
   return stbi__g_failure_reason;

static int stbi__err(const char *str)
   stbi__g_failure_reason = str;

static void *stbi__malloc(size_t size)
   return STBI_MALLOC(size);
906 // stbi__errpf - error returning pointer to float
907 // stbi__errpuc - error returning pointer to unsigned char
909 #ifdef STBI_NO_FAILURE_STRINGS
910 #define stbi__err(x,y) 0
911 #elif defined(STBI_FAILURE_USERMSG)
912 #define stbi__err(x,y) stbi__err(y)
914 #define stbi__err(x,y) stbi__err(x)
917 #define stbi__errpf(x,y) ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
918 #define stbi__errpuc(x,y) ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
STBIDEF void stbi_image_free(void *retval_from_stbi_load)
   STBI_FREE(retval_from_stbi_load);

#ifndef STBI_NO_LINEAR
static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);

static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);

static int stbi__vertically_flip_on_load = 0;

STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
   stbi__vertically_flip_on_load = flag_true_if_should_flip;
static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
   if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp);
   if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp);
   if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp);
   if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp);
   if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp);
   if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp);

   if (stbi__hdr_test(s)) {
      float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
      return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);

   // test tga last because it's a crappy test!
   if (stbi__tga_test(s))
      return stbi__tga_load(s,x,y,comp,req_comp);

   return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);

   if (stbi__vertically_flip_on_load && result != NULL) {
      int depth = req_comp ? req_comp : *comp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;
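
// The @OPTIMIZE note above suggests swapping more than one component at a
// time. A minimal sketch of one way to do that for 8-bit data, swapping
// whole rows through a row-sized scratch buffer with memcpy. This is an
// illustration, not the library's implementation; the function name is
// hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Flip an image in place, top to bottom, by exchanging row 'row' with row
   'h - row - 1'. Each row is w * depth bytes with no padding. Returns 1 on
   success, or 0 if the scratch row cannot be allocated (the image is left
   untouched in that case). */
static int example_flip_rows(unsigned char *pixels, int w, int h, int depth)
{
    size_t stride;
    unsigned char *tmp;
    int row;

    assert(pixels != NULL && w > 0 && h > 0 && depth > 0);
    stride = (size_t)w * (size_t)depth;
    tmp = (unsigned char *)malloc(stride);
    if (tmp == NULL) return 0;

    for (row = 0; row < (h >> 1); row++) {
        unsigned char *top = pixels + (size_t)row * stride;
        unsigned char *bot = pixels + (size_t)(h - row - 1) * stride;
        memcpy(tmp, top, stride);
        memcpy(top, bot, stride);
        memcpy(bot, tmp, stride);
    }

    free(tmp);
    return 1;
}
```

// For an odd height the middle row stays where it is, which matches the
// (h>>1) loop bound used above.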
static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
   if (stbi__vertically_flip_on_load && result != NULL) {
      int depth = req_comp ? req_comp : *comp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;

#ifndef STBI_NO_STDIO

static FILE *stbi__fopen(char const *filename, char const *mode)
#if defined(_MSC_VER) && _MSC_VER >= 1400
   if (0 != fopen_s(&f, filename, mode))
   f = fopen(filename, mode);

STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
   FILE *f = stbi__fopen(filename, "rb");
   unsigned char *result;
   if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
   result = stbi_load_from_file(f,x,y,comp,req_comp);

STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   unsigned char *result;
   stbi__start_file(&s,f);
   result = stbi__load_flip(&s,x,y,comp,req_comp);
   // need to 'unget' all the characters in the IO buffer
   fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);

#endif //!STBI_NO_STDIO

STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   stbi__start_mem(&s,buffer,len);
   return stbi__load_flip(&s,x,y,comp,req_comp);

STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__load_flip(&s,x,y,comp,req_comp);

#ifndef STBI_NO_LINEAR
static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   unsigned char *data;
   if (stbi__hdr_test(s)) {
      float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
      stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
   data = stbi__load_flip(s, x, y, comp, req_comp);
   return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
   return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");

STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   stbi__start_mem(&s,buffer,len);
   return stbi__loadf_main(&s,x,y,comp,req_comp);

STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__loadf_main(&s,x,y,comp,req_comp);

#ifndef STBI_NO_STDIO
STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
   FILE *f = stbi__fopen(filename, "rb");
   if (!f) return stbi__errpf("can't fopen", "Unable to open file");
   result = stbi_loadf_from_file(f,x,y,comp,req_comp);

STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   stbi__start_file(&s,f);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
#endif // !STBI_NO_STDIO

#endif // !STBI_NO_LINEAR

// these is-hdr-or-not functions are defined independent of whether
// STBI_NO_LINEAR is defined, for API simplicity; if STBI_NO_LINEAR is
// defined, they always report false!

STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
   stbi__start_mem(&s,buffer,len);
   return stbi__hdr_test(&s);
   STBI_NOTUSED(buffer);

#ifndef STBI_NO_STDIO
STBIDEF int stbi_is_hdr (char const *filename)
   FILE *f = stbi__fopen(filename, "rb");
   result = stbi_is_hdr_from_file(f);

STBIDEF int stbi_is_hdr_from_file(FILE *f)
   stbi__start_file(&s,f);
   return stbi__hdr_test(&s);
#endif // !STBI_NO_STDIO

STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__hdr_test(&s);

static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;

#ifndef STBI_NO_LINEAR
STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }

STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }

//////////////////////////////////////////////////////////////////////////////
// Common code used by all image loaders

static void stbi__refill_buffer(stbi__context *s)
   int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
   // at end of file, treat same as if from memory, but need to handle case
   // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
   s->read_from_callbacks = 0;
   s->img_buffer = s->buffer_start;
   s->img_buffer_end = s->buffer_start+1;
   s->img_buffer = s->buffer_start;
   s->img_buffer_end = s->buffer_start + n;

stbi_inline static stbi_uc stbi__get8(stbi__context *s)
   if (s->img_buffer < s->img_buffer_end)
      return *s->img_buffer++;
   if (s->read_from_callbacks) {
      stbi__refill_buffer(s);
      return *s->img_buffer++;

stbi_inline static int stbi__at_eof(stbi__context *s)
   if (!(s->io.eof)(s->io_user_data)) return 0;
   // if feof() is true, check if buffer = end
   // special case: we've only got the special 0 character at the end
   if (s->read_from_callbacks == 0) return 1;
   return s->img_buffer >= s->img_buffer_end;

static void stbi__skip(stbi__context *s, int n)
   s->img_buffer = s->img_buffer_end;
   int blen = (int) (s->img_buffer_end - s->img_buffer);
   s->img_buffer = s->img_buffer_end;
   (s->io.skip)(s->io_user_data, n - blen);

static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
   int blen = (int) (s->img_buffer_end - s->img_buffer);
   memcpy(buffer, s->img_buffer, blen);
   count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
   res = (count == (n-blen));
   s->img_buffer = s->img_buffer_end;
   if (s->img_buffer+n <= s->img_buffer_end) {
      memcpy(buffer, s->img_buffer, n);

static int stbi__get16be(stbi__context *s)
   int z = stbi__get8(s);
   return (z << 8) + stbi__get8(s);

static stbi__uint32 stbi__get32be(stbi__context *s)
   stbi__uint32 z = stbi__get16be(s);
   return (z << 16) + stbi__get16be(s);

#if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
static int stbi__get16le(stbi__context *s)
   int z = stbi__get8(s);
   return z + (stbi__get8(s) << 8);

static stbi__uint32 stbi__get32le(stbi__context *s)
   stbi__uint32 z = stbi__get16le(s);
   return z + (stbi__get16le(s) << 16);

#define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings
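
// The 16- and 32-bit readers above are built purely from byte reads, so they
// give the same result regardless of host endianness. The sketch below shows
// the same composition over a plain in-memory cursor. `example_reader` and the
// `example_get*` functions are hypothetical names for illustration, not
// stb_image API; returning 0 past the end of the data is an assumption of
// this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over an in-memory byte array (illustration only). */
typedef struct {
   const unsigned char *p;    /* next byte to read   */
   const unsigned char *end;  /* one past last byte  */
} example_reader;

static int example_get8(example_reader *r)
{
   assert(r->p <= r->end);
   return (r->p < r->end) ? *r->p++ : 0;   /* assumed: reads past the end yield 0 */
}

/* Big-endian: first byte is the most significant. */
static int example_get16be(example_reader *r)
{
   int z = example_get8(r);
   return (z << 8) + example_get8(r);
}

static unsigned int example_get32be(example_reader *r)
{
   unsigned int z = (unsigned int) example_get16be(r);
   return (z << 16) + (unsigned int) example_get16be(r);
}

/* Little-endian: first byte is the least significant. */
static int example_get16le(example_reader *r)
{
   int z = example_get8(r);
   return z + (example_get8(r) << 8);
}

static unsigned int example_get32le(example_reader *r)
{
   unsigned int z = (unsigned int) example_get16le(r);
   return z + ((unsigned int) example_get16le(r) << 16);
}
```

// Reading the bytes 12 34 56 78 (hex) with example_get32be gives 0x12345678,
// and with example_get32le gives 0x78563412.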

//////////////////////////////////////////////////////////////////////////////
// generic converter from built-in img_n to req_comp
// individual types do this automatically as much as possible (e.g. jpeg
// does all cases internally since it needs to colorspace convert anyway,
// and it never has alpha, so very few cases ). png can automatically
// interleave an alpha=255 channel, but falls back to this for other cases
//
// assume data buffer is malloced, so malloc a new one and free that one
// only failure mode is malloc failing

static stbi_uc stbi__compute_y(int r, int g, int b)
   return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
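
// The luma weights above (77, 150, 29) sum to 256, so the >> 8 divides by the
// total weight and full white maps back to 255. A small standalone sketch of
// the same arithmetic follows; `example_luma` is a hypothetical name used
// here for illustration only.

```c
/* Hypothetical standalone copy of the integer luma formula shown above:
   weights 77 + 150 + 29 = 256, so shifting right by 8 normalizes the sum. */
static unsigned char example_luma(int r, int g, int b)
{
   return (unsigned char) (((r * 77) + (g * 150) + (29 * b)) >> 8);
}
```

// Because the shift truncates, pure red (255,0,0) yields 76, pure green 149,
// and pure blue 28.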

static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
   unsigned char *good;

   if (req_comp == img_n) return data;
   STBI_ASSERT(req_comp >= 1 && req_comp <= 4);

   good = (unsigned char *) stbi__malloc(req_comp * x * y);
   return stbi__errpuc("outofmem", "Out of memory");

   for (j=0; j < (int) y; ++j) {
      unsigned char *src  = data + j * x * img_n   ;
      unsigned char *dest = good + j * x * req_comp;

      #define COMBO(a,b)  ((a)*8+(b))
      #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
      // convert source image with img_n components to one with req_comp components;
      // avoid switch per pixel, so use switch per scanline and massive macros
      switch (COMBO(img_n, req_comp)) {
         CASE(1,2) dest[0]=src[0], dest[1]=255; break;
         CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
         CASE(2,1) dest[0]=src[0]; break;
         CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
         CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
         CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
         CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
         CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
         default: STBI_ASSERT(0);

#ifndef STBI_NO_LINEAR
static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
   float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
   if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
      if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;

#define stbi__float2int(x)   ((int) (x))
static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
   stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
   if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
         float z = data[i*comp+k] * 255 + 0.5f;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);

//////////////////////////////////////////////////////////////////////////////
//
//  "baseline" JPEG/JFIF decoder
//
//    simple implementation
//      - doesn't support delayed output of y-dimension
//      - simple interface (only one output format: 8-bit interleaved RGB)
//      - doesn't try to recover corrupt jpegs
//      - doesn't allow partial loading, loading multiple at once
//      - still fast on x86 (copying globals into locals doesn't help x86)
//      - allocates lots of intermediate memory (full size of all components)
//        - non-interleaved case requires this anyway
//        - allows good upsampling (see next)
//      - upsampled channels are bilinearly interpolated, even across blocks
//      - quality integer IDCT derived from IJG's 'slow'
//      - fast huffman; reasonable integer IDCT
//      - some SIMD kernels for common paths on targets with SSE2/NEON
//      - uses a lot of intermediate memory, could cache poorly

#ifndef STBI_NO_JPEG

// huffman decoding acceleration
#define FAST_BITS   9  // larger handles more cases; smaller stomps less cache

   stbi_uc  fast[1 << FAST_BITS];
   // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   stbi__uint16 code[256];
   stbi_uc  values[256];
   unsigned int maxcode[18];
   int    delta[17];   // old 'firstsymbol' - old 'firstcode'

   stbi__huffman huff_dc[4];
   stbi__huffman huff_ac[4];
   stbi_uc dequant[4][64];
   stbi__int16 fast_ac[4][1 << FAST_BITS];

// sizes for components, interleaved MCUs
   int img_h_max, img_v_max;
   int img_mcu_x, img_mcu_y;
   int img_mcu_w, img_mcu_h;

// definition of jpeg image component
      void *raw_data, *raw_coeff;
      short   *coeff;   // progressive only
      int      coeff_w, coeff_h; // number of 8x8 coefficient blocks

   stbi__uint32   code_buffer; // jpeg entropy-coded buffer
   int            code_bits;   // number of valid bits
   unsigned char  marker;      // marker seen while filling entropy buffer
   int            nomore;      // flag if we saw a marker so must stop

   int scan_n, order[4];
   int restart_interval, todo;

   void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
   void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
   stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);

static int stbi__build_huffman(stbi__huffman *h, int *count)
   // build size list for each symbol (from JPEG spec)
   for (i=0; i < 16; ++i)
      for (j=0; j < count[i]; ++j)
         h->size[k++] = (stbi_uc) (i+1);

   // compute actual symbols (from jpeg spec)
   for(j=1; j <= 16; ++j) {
      // compute delta to add to code to compute symbol id
      h->delta[j] = k - code;
      if (h->size[k] == j) {
         while (h->size[k] == j)
            h->code[k++] = (stbi__uint16) (code++);
         if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
      // compute largest code + 1 for this size, preshifted as needed later
      h->maxcode[j] = code << (16-j);
   h->maxcode[j] = 0xffffffff;

   // build non-spec acceleration table; 255 is flag for not-accelerated
   memset(h->fast, 255, 1 << FAST_BITS);
   for (i=0; i < k; ++i) {
      if (s <= FAST_BITS) {
         int c = h->code[i] << (FAST_BITS-s);
         int m = 1 << (FAST_BITS-s);
         for (j=0; j < m; ++j) {
            h->fast[c+j] = (stbi_uc) i;

// build a table that decodes both magnitude and value of small ACs in
static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
   for (i=0; i < (1 << FAST_BITS); ++i) {
      stbi_uc fast = h->fast[i];
      int rs = h->values[fast];
      int run = (rs >> 4) & 15;
      int magbits = rs & 15;
      int len = h->size[fast];

      if (magbits && len + magbits <= FAST_BITS) {
         // magnitude code followed by receive_extend code
         int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
         int m = 1 << (magbits - 1);
         if (k < m) k += (-1 << magbits) + 1;
         // if the result is small enough, we can fit it in fast_ac table
         if (k >= -128 && k <= 127)
            fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));

static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
      int b = j->nomore ? 0 : stbi__get8(j->s);
      int c = stbi__get8(j->s);
      j->marker = (unsigned char) c;
      j->code_buffer |= b << (24 - j->code_bits);
   } while (j->code_bits <= 24);

static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};

// decode a jpeg huffman value from the bitstream
stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   // look at the top FAST_BITS and determine what symbol ID it is,
   // if the code is <= FAST_BITS
   c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   if (s > j->code_bits)
   j->code_buffer <<= s;
   return h->values[k];

   // naive test is to shift the code_buffer down so k bits are
   // valid, then test against maxcode. To speed this up, we've
   // preshifted maxcode left so that it has (16-k) 0s at the
   // end; in other words, regardless of the number of bits, it
   // wants to be compared against something shifted to have 16;
   // that way we don't need to shift inside the loop.
   temp = j->code_buffer >> 16;
   for (k=FAST_BITS+1 ; ; ++k)
      if (temp < h->maxcode[k])
   // error! code not found
   if (k > j->code_bits)

   // convert the huffman code to the symbol id
   c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);

   // convert the id to a symbol
   j->code_buffer <<= k;
   return h->values[c];

// bias[n] = (-1<<n) + 1
static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};

// combined JPEG 'receive' and JPEG 'extend', since baseline
// always extends everything it receives.
stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);

   sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
   k = stbi_lrot(j->code_buffer, n);
   STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   return k + (stbi__jbias[n] & ~sgn);
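
// The 'extend' step above turns an n-bit magnitude field into a signed value:
// when the leading bit of the field is 0 the value is negative, and the bias
// (-1<<n) + 1 is added. The sketch below restates that rule on an already
// extracted field, without the bit-buffer handling. `example_extend` is a
// hypothetical name for illustration, and the branch form is a plain
// restatement, not the branch-free code used above.

```c
#include <assert.h>

/* Hypothetical helper: sign-extend an n-bit JPEG magnitude field v
   (0 <= v < 2^n). Values below 2^(n-1) are negative. */
static int example_extend(int v, int n)
{
   assert(n >= 0 && n <= 15);
   if (n == 0) return 0;
   if (v < (1 << (n - 1)))
      v += 1 - (1 << n);   /* equals (-1 << n) + 1, written without shifting a negative */
   return v;
}
```

// For n = 3 the fields 0..3 decode to -7..-4 and the fields 4..7 decode to
// 4..7, which matches the bias table entry of -7 for n = 3.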

// get some unsigned bits
stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   k = stbi_lrot(j->code_buffer, n);
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];

stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
   if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
   j->code_buffer <<= 1;
   return k & 0x80000000;

// given a value that's at position X in the zigzag stream,
// where does it appear in the 8x8 matrix coded as row-major?
static stbi_uc stbi__jpeg_dezigzag[64+15] =
    0,  1,  8, 16,  9,  2,  3, 10,
   17, 24, 32, 25, 18, 11,  4,  5,
   12, 19, 26, 33, 40, 48, 41, 34,
   27, 20, 13,  6,  7, 14, 21, 28,
   35, 42, 49, 56, 57, 50, 43, 36,
   29, 22, 15, 23, 30, 37, 44, 51,
   58, 59, 52, 45, 38, 31, 39, 46,
   53, 60, 61, 54, 47, 55, 62, 63,
   // let corrupt input sample past end
   63, 63, 63, 63, 63, 63, 63, 63,
   63, 63, 63, 63, 63, 63, 63
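
// The first 64 entries of the table above follow the JPEG zigzag scan: walk
// the anti-diagonals of the 8x8 block, alternating direction on each one. As
// a cross-check, the sketch below regenerates those 64 entries. The function
// name `example_build_dezigzag` is hypothetical and the generator is only an
// illustration; the decoder itself uses the literal table.

```c
#include <assert.h>

/* Hypothetical generator: fills out[i] with the row-major index (row*8 + col)
   of the coefficient at zigzag position i. */
static void example_build_dezigzag(unsigned char out[64])
{
   int i = 0, d, r;
   for (d = 0; d < 15; ++d) {               /* d = row + col on this anti-diagonal */
      for (r = 0; r < 8; ++r) {
         /* even diagonals are walked bottom-left to top-right (row decreasing),
            odd diagonals top-right to bottom-left (row increasing) */
         int row = (d & 1) ? r : 7 - r;
         int col = d - row;
         if (col >= 0 && col < 8)
            out[i++] = (unsigned char) (row * 8 + col);
      }
   }
   assert(i == 64);
}
```

// The 15 trailing 63s in the real table are padding so that a corrupt stream
// indexing past position 63 still lands inside the 8x8 block.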

// decode one 64-entry block--
static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   t = stbi__jpeg_huff_decode(j, hdc);
   if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");

   // 0 all the ac values now so we can do it 32-bits at a time
   memset(data,0,64*sizeof(data[0]));

   diff = t ? stbi__extend_receive(j, t) : 0;
   dc = j->img_comp[b].dc_pred + diff;
   j->img_comp[b].dc_pred = dc;
   data[0] = (short) (dc * dequant[0]);

   // decode AC components, see JPEG spec
      if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
      c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
      if (r) { // fast-AC path
         k += (r >> 4) & 15; // run
         s = r & 15; // combined length
         j->code_buffer <<= s;
         // decode into unzigzag'd location
         zig = stbi__jpeg_dezigzag[k++];
         data[zig] = (short) ((r >> 8) * dequant[zig]);
         int rs = stbi__jpeg_huff_decode(j, hac);
         if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
            if (rs != 0xf0) break; // end block
            // decode into unzigzag'd location
            zig = stbi__jpeg_dezigzag[k++];
            data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);

static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
   if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   if (j->succ_high == 0) {
      // first scan for DC coefficient, must be first
      memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
      t = stbi__jpeg_huff_decode(j, hdc);
      diff = t ? stbi__extend_receive(j, t) : 0;

      dc = j->img_comp[b].dc_pred + diff;
      j->img_comp[b].dc_pred = dc;
      data[0] = (short) (dc << j->succ_low);
      // refinement scan for DC coefficient
      if (stbi__jpeg_get_bit(j))
         data[0] += (short) (1 << j->succ_low);

// @OPTIMIZE: store non-zigzagged during the decode passes,
// and only de-zigzag when dequantizing
static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
   if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->succ_high == 0) {
      int shift = j->succ_low;
         if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
         c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
         if (r) { // fast-AC path
            k += (r >> 4) & 15; // run
            s = r & 15; // combined length
            j->code_buffer <<= s;
            zig = stbi__jpeg_dezigzag[k++];
            data[zig] = (short) ((r >> 8) << shift);
            int rs = stbi__jpeg_huff_decode(j, hac);
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
                  j->eob_run = (1 << r);
                     j->eob_run += stbi__jpeg_get_bits(j, r);
               zig = stbi__jpeg_dezigzag[k++];
               data[zig] = (short) (stbi__extend_receive(j,s) << shift);
      } while (k <= j->spec_end);
      // refinement scan for these AC coefficients
      short bit = (short) (1 << j->succ_low);
         for (k = j->spec_start; k <= j->spec_end; ++k) {
            short *p = &data[stbi__jpeg_dezigzag[k]];
               if (stbi__jpeg_get_bit(j))
                  if ((*p & bit)==0) {
            int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
                  j->eob_run = (1 << r) - 1;
                     j->eob_run += stbi__jpeg_get_bits(j, r);
                  r = 64; // force end of block
                  // r=15 s=0 should write 16 0s, so we just do
                  // a run of 15 0s and then write s (which is 0),
                  // so we don't have to do anything special here
               if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
               if (stbi__jpeg_get_bit(j))
            while (k <= j->spec_end) {
               short *p = &data[stbi__jpeg_dezigzag[k++]];
                  if (stbi__jpeg_get_bit(j))
                     if ((*p & bit)==0) {
         } while (k <= j->spec_end);

// take a -128..127 value and stbi__clamp it and convert to 0..255
stbi_inline static stbi_uc stbi__clamp(int x)
   // trick to use a single test to catch both cases
   if ((unsigned int) x > 255) {
      if (x < 0) return 0;
      if (x > 255) return 255;
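
// The "single test" trick above relies on the cast to unsigned: a negative
// int becomes a very large unsigned value, so one comparison against 255
// catches both out-of-range directions, and in-range values pay for only one
// branch. A standalone sketch follows; `example_clamp` is a hypothetical
// name, and the final return of the in-range value is an assumption about
// how the function ends, since that part is not shown above.

```c
/* Hypothetical standalone version of the clamp-to-byte trick shown above. */
static unsigned char example_clamp(int x)
{
   /* negative x wraps to a huge unsigned value, so this one test
      is true for both x < 0 and x > 255 */
   if ((unsigned int) x > 255) {
      if (x < 0) return 0;
      if (x > 255) return 255;
   }
   return (unsigned char) x;   /* assumed: in-range values pass through unchanged */
}
```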

#define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
#define stbi__fsh(x)  ((x) << 12)

// derived from jidctint -- DCT_ISLOW
#define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
   int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   p1 = (p2+p3) * stbi__f2f(0.5411961f);       \
   t2 = p1 + p3*stbi__f2f(-1.847759065f);      \
   t3 = p1 + p2*stbi__f2f( 0.765366865f);      \
   t0 = stbi__fsh(p2+p3);                      \
   t1 = stbi__fsh(p2-p3);                      \
   p5 = (p3+p4)*stbi__f2f( 1.175875602f);      \
   t0 = t0*stbi__f2f( 0.298631336f);           \
   t1 = t1*stbi__f2f( 2.053119869f);           \
   t2 = t2*stbi__f2f( 3.072711026f);           \
   t3 = t3*stbi__f2f( 1.501321110f);           \
   p1 = p5 + p1*stbi__f2f(-0.899976223f);      \
   p2 = p5 + p2*stbi__f2f(-2.562915447f);      \
   p3 = p3*stbi__f2f(-1.961570560f);           \
   p4 = p4*stbi__f2f(-0.390180644f);           \

static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
   int i,val[64],*v=val;

   for (i=0; i < 8; ++i,++d, ++v) {
      // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
      if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
           && d[40]==0 && d[48]==0 && d[56]==0) {
         //    no shortcut                 0     seconds
         //    (1|2|3|4|5|6|7)==0          0     seconds
         //    all separate               -0.047 seconds
         //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
         int dcterm = d[0] << 2;
         v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
         STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
         // constants scaled things up by 1<<12; let's bring them back
         // down, but keep 2 extra bits of precision
         x0 += 512; x1 += 512; x2 += 512; x3 += 512;
         v[ 0] = (x0+t3) >> 10;
         v[56] = (x0-t3) >> 10;
         v[ 8] = (x1+t2) >> 10;
         v[48] = (x1-t2) >> 10;
         v[16] = (x2+t1) >> 10;
         v[40] = (x2-t1) >> 10;
         v[24] = (x3+t0) >> 10;
         v[32] = (x3-t0) >> 10;

   for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
      // no fast case since the first 1D IDCT spread components out
      STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
      // constants scaled things up by 1<<12, plus we had 1<<2 from first
      // loop, plus horizontal and vertical each scale by sqrt(8) so together
      // we've got an extra 1<<3, so 1<<17 total we need to remove.
      // so we want to round that, which means adding 0.5 * 1<<17,
      // aka 65536. Also, we'll end up with -128 to 127 that we want
      // to encode as 0..255 by adding 128, so we'll add that before the shift
      x0 += 65536 + (128<<17);
      x1 += 65536 + (128<<17);
      x2 += 65536 + (128<<17);
      x3 += 65536 + (128<<17);
      // tried computing the shifts into temps, or'ing the temps to see
      // if any were out of range, but that was slower
      o[0] = stbi__clamp((x0+t3) >> 17);
      o[7] = stbi__clamp((x0-t3) >> 17);
      o[1] = stbi__clamp((x1+t2) >> 17);
      o[6] = stbi__clamp((x1-t2) >> 17);
      o[2] = stbi__clamp((x2+t1) >> 17);
      o[5] = stbi__clamp((x2-t1) >> 17);
      o[3] = stbi__clamp((x3+t0) >> 17);
      o[4] = stbi__clamp((x3-t0) >> 17);

// sse2 integer IDCT. not the fastest possible implementation but it
// produces bit-identical results to the generic C version so it's
// fully "transparent".
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
   // This is constructed to match our regular (generic) integer IDCT exactly.
   __m128i row0, row1, row2, row3, row4, row5, row6, row7;

   // dot product constant: even elems=x, odd elems=y
   #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))

   // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
   // out(1) = c1[even]*x + c1[odd]*y
   #define dct_rot(out0,out1, x,y,c0,c1) \
      __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
      __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
      __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
      __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
      __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
      __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)

   // out = in << 12  (in 16-bit, out 32-bit)
   #define dct_widen(out, in) \
      __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
      __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)

   #define dct_wadd(out, a, b) \
      __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_add_epi32(a##_h, b##_h)

   #define dct_wsub(out, a, b) \
      __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)

   // butterfly a/b, add bias, then shift by "s" and pack
   #define dct_bfly32o(out0, out1, a,b,bias,s) \
         __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
         __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
         dct_wadd(sum, abiased, b); \
         dct_wsub(dif, abiased, b); \
         out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
         out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \

   // 8-bit interleave step (for transposes)
   #define dct_interleave8(a, b) \
      a = _mm_unpacklo_epi8(a, b); \
      b = _mm_unpackhi_epi8(tmp, b)

   // 16-bit interleave step (for transposes)
   #define dct_interleave16(a, b) \
      a = _mm_unpacklo_epi16(a, b); \
      b = _mm_unpackhi_epi16(tmp, b)

   #define dct_pass(bias,shift) \
         dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
         __m128i sum04 = _mm_add_epi16(row0, row4); \
         __m128i dif04 = _mm_sub_epi16(row0, row4); \
         dct_widen(t0e, sum04); \
         dct_widen(t1e, dif04); \
         dct_wadd(x0, t0e, t3e); \
         dct_wsub(x3, t0e, t3e); \
         dct_wadd(x1, t1e, t2e); \
         dct_wsub(x2, t1e, t2e); \
         dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
         dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
         __m128i sum17 = _mm_add_epi16(row1, row7); \
         __m128i sum35 = _mm_add_epi16(row3, row5); \
         dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
         dct_wadd(x4, y0o, y4o); \
         dct_wadd(x5, y1o, y5o); \
         dct_wadd(x6, y2o, y5o); \
         dct_wadd(x7, y3o, y4o); \
         dct_bfly32o(row0,row7, x0,x7,bias,shift); \
         dct_bfly32o(row1,row6, x1,x6,bias,shift); \
         dct_bfly32o(row2,row5, x2,x5,bias,shift); \
         dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2117 __m128i rot0_0
= dct_const(stbi__f2f(0.5411961f
), stbi__f2f(0.5411961f
) + stbi__f2f(-1.847759065f
));
2118 __m128i rot0_1
= dct_const(stbi__f2f(0.5411961f
) + stbi__f2f( 0.765366865f
), stbi__f2f(0.5411961f
));
2119 __m128i rot1_0
= dct_const(stbi__f2f(1.175875602f
) + stbi__f2f(-0.899976223f
), stbi__f2f(1.175875602f
));
2120 __m128i rot1_1
= dct_const(stbi__f2f(1.175875602f
), stbi__f2f(1.175875602f
) + stbi__f2f(-2.562915447f
));
2121 __m128i rot2_0
= dct_const(stbi__f2f(-1.961570560f
) + stbi__f2f( 0.298631336f
), stbi__f2f(-1.961570560f
));
2122 __m128i rot2_1
= dct_const(stbi__f2f(-1.961570560f
), stbi__f2f(-1.961570560f
) + stbi__f2f( 3.072711026f
));
2123 __m128i rot3_0
= dct_const(stbi__f2f(-0.390180644f
) + stbi__f2f( 2.053119869f
), stbi__f2f(-0.390180644f
));
2124 __m128i rot3_1
= dct_const(stbi__f2f(-0.390180644f
), stbi__f2f(-0.390180644f
) + stbi__f2f( 1.501321110f
));
2126 // rounding biases in column/row passes, see stbi__idct_block for explanation.
2127 __m128i bias_0
= _mm_set1_epi32(512);
2128 __m128i bias_1
= _mm_set1_epi32(65536 + (128<<17));
2131 row0
= _mm_load_si128((const __m128i
*) (data
+ 0*8));
2132 row1
= _mm_load_si128((const __m128i
*) (data
+ 1*8));
2133 row2
= _mm_load_si128((const __m128i
*) (data
+ 2*8));
2134 row3
= _mm_load_si128((const __m128i
*) (data
+ 3*8));
2135 row4
= _mm_load_si128((const __m128i
*) (data
+ 4*8));
2136 row5
= _mm_load_si128((const __m128i
*) (data
+ 5*8));
2137 row6
= _mm_load_si128((const __m128i
*) (data
+ 6*8));
2138 row7
= _mm_load_si128((const __m128i
*) (data
+ 7*8));
2141 dct_pass(bias_0
, 10);
2144 // 16bit 8x8 transpose pass 1
2145 dct_interleave16(row0
, row4
);
2146 dct_interleave16(row1
, row5
);
2147 dct_interleave16(row2
, row6
);
2148 dct_interleave16(row3
, row7
);
2151 dct_interleave16(row0
, row2
);
2152 dct_interleave16(row1
, row3
);
2153 dct_interleave16(row4
, row6
);
2154 dct_interleave16(row5
, row7
);
2157 dct_interleave16(row0
, row1
);
2158 dct_interleave16(row2
, row3
);
2159 dct_interleave16(row4
, row5
);
2160 dct_interleave16(row6
, row7
);
2164 dct_pass(bias_1
, 17);

   __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
   __m128i p1 = _mm_packus_epi16(row2, row3);
   __m128i p2 = _mm_packus_epi16(row4, row5);
   __m128i p3 = _mm_packus_epi16(row6, row7);

   // 8bit 8x8 transpose pass 1
   dct_interleave8(p0, p2); // a0e0a1e1...
   dct_interleave8(p1, p3); // c0g0c1g1...

   dct_interleave8(p0, p1); // a0c0e0g0...
   dct_interleave8(p2, p3); // b0d0f0h0...

   dct_interleave8(p0, p2); // a0b0c0d0...
   dct_interleave8(p1, p3); // a4b4c4d4...

   _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
   _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
   _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
   _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
   _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
   _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
   _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
   _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));

#undef dct_interleave8
#undef dct_interleave16
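
// Illustration only, not part of the library: the two bias constants above
// are rounding terms for the shifts passed to dct_pass. 512 is half of 1<<10,
// 65536 is half of 1<<17, and (128<<17) becomes +128 after the shift, which
// recentres the result from -128..127 to 0..255. The helper names below are
// hypothetical; this is a scalar sketch of the same arithmetic.

```c
#include <assert.h>

// Hypothetical helper: right shift with round-to-nearest, done by adding
// half of the divisor (1 << (shift-1)) before shifting.
static int example_round_shift(int x, int shift)
{
   assert(shift > 0 && shift < 31);
   return (x + (1 << (shift - 1))) >> shift;
}

// Hypothetical helper: row-pass descale. Rounds, shifts by 17 and adds the
// +128 level shift in a single step, like bias_1 followed by a shift of 17.
static int example_row_descale(int x)
{
   return (x + 65536 + (128 << 17)) >> 17;
}
```

// With these definitions, a fixed-point value of exactly 1<<17 descales to
// 129, and 0 descales to 128 (mid-grey).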

// NEON integer IDCT. should produce bit-identical
// results to the generic C version.
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
{
   int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;

   int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
   int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
   int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
   int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
   int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
   int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
   int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
   int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
   int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
   int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
   int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
   int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));

#define dct_long_mul(out, inq, coeff) \
   int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)

#define dct_long_mac(out, acc, inq, coeff) \
   int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)

#define dct_widen(out, inq) \
   int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
   int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)

#define dct_wadd(out, a, b) \
   int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vaddq_s32(a##_h, b##_h)

#define dct_wsub(out, a, b) \
   int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vsubq_s32(a##_h, b##_h)

// butterfly a/b, then shift using "shiftop" by "s" and pack
#define dct_bfly32o(out0,out1, a,b,shiftop,s) \
   { \
      dct_wadd(sum, a, b); \
      dct_wsub(dif, a, b); \
      out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
      out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
   }

#define dct_pass(shiftop, shift) \
   { \
      int16x8_t sum26 = vaddq_s16(row2, row6); \
      dct_long_mul(p1e, sum26, rot0_0); \
      dct_long_mac(t2e, p1e, row6, rot0_1); \
      dct_long_mac(t3e, p1e, row2, rot0_2); \
      int16x8_t sum04 = vaddq_s16(row0, row4); \
      int16x8_t dif04 = vsubq_s16(row0, row4); \
      dct_widen(t0e, sum04); \
      dct_widen(t1e, dif04); \
      dct_wadd(x0, t0e, t3e); \
      dct_wsub(x3, t0e, t3e); \
      dct_wadd(x1, t1e, t2e); \
      dct_wsub(x2, t1e, t2e); \
      int16x8_t sum15 = vaddq_s16(row1, row5); \
      int16x8_t sum17 = vaddq_s16(row1, row7); \
      int16x8_t sum35 = vaddq_s16(row3, row5); \
      int16x8_t sum37 = vaddq_s16(row3, row7); \
      int16x8_t sumodd = vaddq_s16(sum17, sum35); \
      dct_long_mul(p5o, sumodd, rot1_0); \
      dct_long_mac(p1o, p5o, sum17, rot1_1); \
      dct_long_mac(p2o, p5o, sum35, rot1_2); \
      dct_long_mul(p3o, sum37, rot2_0); \
      dct_long_mul(p4o, sum15, rot2_1); \
      dct_wadd(sump13o, p1o, p3o); \
      dct_wadd(sump24o, p2o, p4o); \
      dct_wadd(sump23o, p2o, p3o); \
      dct_wadd(sump14o, p1o, p4o); \
      dct_long_mac(x4, sump13o, row7, rot3_0); \
      dct_long_mac(x5, sump24o, row5, rot3_1); \
      dct_long_mac(x6, sump23o, row3, rot3_2); \
      dct_long_mac(x7, sump14o, row1, rot3_3); \
      dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
      dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
      dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
      dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
   }

   row0 = vld1q_s16(data + 0*8);
   row1 = vld1q_s16(data + 1*8);
   row2 = vld1q_s16(data + 2*8);
   row3 = vld1q_s16(data + 3*8);
   row4 = vld1q_s16(data + 4*8);
   row5 = vld1q_s16(data + 5*8);
   row6 = vld1q_s16(data + 6*8);
   row7 = vld1q_s16(data + 7*8);

   row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));

   dct_pass(vrshrn_n_s32, 10);

   // 16bit 8x8 transpose
   {
// these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
// whether compilers actually get this is another story, sadly.
#define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
#define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }

      dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
      dct_trn16(row2, row3);
      dct_trn16(row4, row5);
      dct_trn16(row6, row7);

      dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
      dct_trn32(row1, row3);
      dct_trn32(row4, row6);
      dct_trn32(row5, row7);

      dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
      dct_trn64(row1, row5);
      dct_trn64(row2, row6);
      dct_trn64(row3, row7);
   }

   // vrshrn_n_s32 only supports shifts up to 16, we need
   // 17. so do a non-rounding shift of 16 first then follow
   // up with a rounding shift by 1.
   dct_pass(vshrn_n_s32, 16);

   {
      uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
      uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
      uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
      uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
      uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
      uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
      uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
      uint8x8_t p7 = vqrshrun_n_s16(row7, 1);

      // again, these can translate into one instruction, but often don't.
#define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
#define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }

      // sadly can't use interleaved stores here since we only write
      // 8 bytes to each scan line!

      // 8x8 8-bit transpose pass 1
      dct_trn8_8(p0, p1);
      dct_trn8_8(p2, p3);
      dct_trn8_8(p4, p5);
      dct_trn8_8(p6, p7);

      dct_trn8_16(p0, p2);
      dct_trn8_16(p1, p3);
      dct_trn8_16(p4, p6);
      dct_trn8_16(p5, p7);

      dct_trn8_32(p0, p4);
      dct_trn8_32(p1, p5);
      dct_trn8_32(p2, p6);
      dct_trn8_32(p3, p7);

      vst1_u8(out, p0); out += out_stride;
      vst1_u8(out, p1); out += out_stride;
      vst1_u8(out, p2); out += out_stride;
      vst1_u8(out, p3); out += out_stride;
      vst1_u8(out, p4); out += out_stride;
      vst1_u8(out, p5); out += out_stride;
      vst1_u8(out, p6); out += out_stride;
      vst1_u8(out, p7);
   }
}

#define STBI__MARKER_none  0xff
// if there's a pending marker from the entropy stream, return that
// otherwise, fetch from the stream and get a marker. if there's no
// marker, return 0xff, which is never a valid marker value
static stbi_uc stbi__get_marker(stbi__jpeg *j)
{
   stbi_uc x;
   if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   x = stbi__get8(j->s);
   if (x != 0xff) return STBI__MARKER_none;
   while (x == 0xff)
      x = stbi__get8(j->s);
   return x;
}
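
// Illustration only, not part of the library: a minimal sketch of the marker
// scan described above, run on a plain byte buffer instead of stbi__context.
// example_cursor, example_get8 and example_get_marker are hypothetical names,
// and returning 0 at end of data is an assumption about how the stream
// reader behaves.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MARKER_NONE 0xff

// Hypothetical byte cursor standing in for the decoder's input stream.
typedef struct {
   const unsigned char *p, *end;
} example_cursor;

// Read one byte; yields 0 once the data is exhausted (assumed behaviour).
static unsigned char example_get8(example_cursor *c)
{
   return (c->p < c->end) ? *c->p++ : 0;
}

// Expect 0xff, skip any run of 0xff fill bytes, return the marker code.
// Returns EXAMPLE_MARKER_NONE if the next byte does not start a marker.
static unsigned char example_get_marker(example_cursor *c)
{
   unsigned char x;
   assert(c != NULL);
   x = example_get8(c);
   if (x != 0xff) return EXAMPLE_MARKER_NONE;
   while (x == 0xff)
      x = example_get8(c);
   return x;
}
```

// For the bytes ff ff d8 12, the first call returns 0xd8 (the fill byte is
// skipped) and the second call returns EXAMPLE_MARKER_NONE.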

// in each scan, we'll have scan_n components, and the order
// of the components is specified by order[]
#define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)

// after a restart interval, stbi__jpeg_reset the entropy decoder and
// the dc prediction
static void stbi__jpeg_reset(stbi__jpeg *j)
{
   j->code_bits = 0;
   j->code_buffer = 0;
   j->nomore = 0;
   j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
   j->marker = STBI__MARKER_none;
   j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   j->eob_run = 0;
   // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
   // since we don't even allow 1<<30 pixels
}

static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
{
   stbi__jpeg_reset(z);
   if (!z->progressive) {
      if (z->scan_n == 1) {
         int i,j;
         STBI_SIMD_ALIGN(short, data[64]);
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               int ha = z->img_comp[n].ha;
               if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
               // every data block is an MCU, so countdown the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  // if it's NOT a restart, then just bail, so we get corrupt data
                  // rather than no data
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      } else { // interleaved
         int i,j,k,x,y;
         STBI_SIMD_ALIGN(short, data[64]);
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x)*8;
                        int y2 = (j*z->img_comp[n].v + y)*8;
                        int ha = z->img_comp[n].ha;
                        if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
                        z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
                     }
                  }
               }
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      }
   } else {
      if (z->scan_n == 1) {
         int i,j;
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               if (z->spec_start == 0) {
                  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
                     return 0;
               } else {
                  int ha = z->img_comp[n].ha;
                  if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
                     return 0;
               }
               // every data block is an MCU, so countdown the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      } else { // interleaved
         int i,j,k,x,y;
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x);
                        int y2 = (j*z->img_comp[n].v + y);
                        short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
                        if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
                           return 0;
                     }
                  }
               }
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      }
   }
}

static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
{
   int i;
   for (i=0; i < 64; ++i)
      data[i] *= dequant[i];
}

static void stbi__jpeg_finish(stbi__jpeg *z)
{
   if (z->progressive) {
      // dequantize and idct the data
      int i,j,n;
      for (n=0; n < z->s->img_n; ++n) {
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
            }
         }
      }
   }
}

static int stbi__process_marker(stbi__jpeg *z, int m)
{
   int L;
   switch (m) {
      case STBI__MARKER_none: // no marker found
         return stbi__err("expected marker","Corrupt JPEG");

      case 0xDD: // DRI - specify restart interval
         if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
         z->restart_interval = stbi__get16be(z->s);
         return 1;

      case 0xDB: // DQT - define quantization table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            int q = stbi__get8(z->s);
            int p = q >> 4;
            int t = q & 15,i;
            if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
            if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
            for (i=0; i < 64; ++i)
               z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
            L -= 65;
         }
         return L==0;

      case 0xC4: // DHT - define huffman table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            stbi_uc *v;
            int sizes[16],i,n=0;
            int q = stbi__get8(z->s);
            int tc = q >> 4;
            int th = q & 15;
            if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
            for (i=0; i < 16; ++i) {
               sizes[i] = stbi__get8(z->s);
               n += sizes[i];
            }
            L -= 17;
            if (tc == 0) {
               if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
               v = z->huff_dc[th].values;
            } else {
               if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
               v = z->huff_ac[th].values;
            }
            for (i=0; i < n; ++i)
               v[i] = stbi__get8(z->s);
            if (tc != 0)
               stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
            L -= n;
         }
         return L==0;
   }
   // check for comment block or APP blocks
   if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
      stbi__skip(z->s, stbi__get16be(z->s)-2);
      return 1;
   }
   return 0;
}

static int stbi__process_scan_header(stbi__jpeg *z)
{
   int i;
   int Ls = stbi__get16be(z->s);
   z->scan_n = stbi__get8(z->s);
   if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
   if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
   for (i=0; i < z->scan_n; ++i) {
      int id = stbi__get8(z->s), which;
      int q = stbi__get8(z->s);
      for (which = 0; which < z->s->img_n; ++which)
         if (z->img_comp[which].id == id)
            break;
      if (which == z->s->img_n) return 0; // no match
      z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
      z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
      z->order[i] = which;
   }

   {
      int aa;
      z->spec_start = stbi__get8(z->s);
      z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
      aa = stbi__get8(z->s);
      z->succ_high = (aa >> 4);
      z->succ_low  = (aa & 15);
      if (z->progressive) {
         if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
            return stbi__err("bad SOS", "Corrupt JPEG");
      } else {
         if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
         if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
         z->spec_end = 63;
      }
   }

   return 1;
}
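
// Illustration only, not part of the library: in the scan header each
// component carries one byte whose high nibble selects the DC Huffman table
// and whose low nibble selects the AC table; either value above 3 is
// rejected. The struct and function names below are hypothetical.

```c
#include <assert.h>

// Hypothetical holder for the two table selectors of one scan component.
typedef struct {
   int hd;   // DC Huffman table index (high nibble)
   int ha;   // AC Huffman table index (low nibble)
} example_table_sel;

// Split the selector byte; returns 1 if both indices are in 0..3, else 0.
static int example_unpack_selectors(int q, example_table_sel *sel)
{
   assert(q >= 0 && q <= 255);
   sel->hd = q >> 4;
   sel->ha = q & 15;
   return sel->hd <= 3 && sel->ha <= 3;
}
```

// A selector byte of 0x12 yields DC table 1 and AC table 2; 0x40 is rejected
// because its DC index is 4.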

static int stbi__process_frame_header(stbi__jpeg *z, int scan)
{
   stbi__context *s = z->s;
   int Lf,p,i,q, h_max=1,v_max=1,c;
   Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
   p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
   c = stbi__get8(s);
   if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG");    // JFIF requires
   s->img_n = c;
   for (i=0; i < c; ++i) {
      z->img_comp[i].data = NULL;
      z->img_comp[i].linebuf = NULL;
   }

   if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");

   for (i=0; i < s->img_n; ++i) {
      z->img_comp[i].id = stbi__get8(s);
      if (z->img_comp[i].id != i+1)   // JFIF requires
         if (z->img_comp[i].id != i)  // some version of jpegtran outputs non-JFIF-compliant files!
            return stbi__err("bad component ID","Corrupt JPEG");
      q = stbi__get8(s);
      z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
      z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
      z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
   }

   if (scan != STBI__SCAN_load) return 1;

   if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");

   for (i=0; i < s->img_n; ++i) {
      if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
      if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
   }

   // compute interleaved mcu info
   z->img_h_max = h_max;
   z->img_v_max = v_max;
   z->img_mcu_w = h_max * 8;
   z->img_mcu_h = v_max * 8;
   z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;

   for (i=0; i < s->img_n; ++i) {
      // number of effective pixels (e.g. for non-interleaved MCU)
      z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
      z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
      // to simplify generation, we'll allocate enough memory to decode
      // the bogus oversized data from using interleaved MCUs and their
      // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
      // discard the extra data until colorspace conversion
      z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
      z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
      z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);

      if (z->img_comp[i].raw_data == NULL) {
         for(--i; i >= 0; --i) {
            STBI_FREE(z->img_comp[i].raw_data);
            z->img_comp[i].raw_data = NULL;
         }
         return stbi__err("outofmem", "Out of memory");
      }
      // align blocks for idct using mmx/sse
      z->img_comp[i].data = (stbi_uc *) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
      z->img_comp[i].linebuf = NULL;
      if (z->progressive) {
         z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
         z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
         z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
         z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
      } else {
         z->img_comp[i].coeff = 0;
         z->img_comp[i].raw_coeff = 0;
      }
   }

   return 1;
}
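
// Illustration only, not part of the library: a small sketch of the
// interleaved-MCU geometry computed above, using the "16x16 iMCU on an image
// of width 33" case from the comment. example_mcu_info and both functions
// are hypothetical names; the formulas mirror the img_mcu_* and w2
// assignments.

```c
#include <assert.h>

// Hypothetical container for the MCU grid of one image.
typedef struct {
   int mcu_w, mcu_h;   // MCU size in pixels
   int mcu_x, mcu_y;   // number of MCUs across and down (rounded up)
} example_mcu_info;

static example_mcu_info example_mcu_layout(int img_x, int img_y, int h_max, int v_max)
{
   example_mcu_info m;
   assert(img_x > 0 && img_y > 0 && h_max > 0 && v_max > 0);
   m.mcu_w = h_max * 8;
   m.mcu_h = v_max * 8;
   m.mcu_x = (img_x + m.mcu_w - 1) / m.mcu_w;
   m.mcu_y = (img_y + m.mcu_h - 1) / m.mcu_h;
   return m;
}

// Padded row width, in samples, of a component with horizontal factor h.
static int example_padded_width(example_mcu_info m, int h)
{
   return m.mcu_x * h * 8;
}
```

// For a 33x33 image with h_max = v_max = 2 there are 3x3 MCUs, so a
// full-resolution component is decoded into rows of 48 samples and a
// half-resolution one into rows of 24, even though only 33 and 17 samples
// per row are meaningful.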

// use comparisons since in some cases we handle more than one case (e.g. SOF)
#define stbi__DNL(x)         ((x) == 0xdc)
#define stbi__SOI(x)         ((x) == 0xd8)
#define stbi__EOI(x)         ((x) == 0xd9)
#define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
#define stbi__SOS(x)         ((x) == 0xda)

#define stbi__SOF_progressive(x)   ((x) == 0xc2)

static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
{
   int m;
   z->marker = STBI__MARKER_none; // initialize cached marker to empty
   m = stbi__get_marker(z);
   if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
   if (scan == STBI__SCAN_type) return 1;
   m = stbi__get_marker(z);
   while (!stbi__SOF(m)) {
      if (!stbi__process_marker(z,m)) return 0;
      m = stbi__get_marker(z);
      while (m == STBI__MARKER_none) {
         // some files have extra padding after their blocks, so ok, we'll scan
         if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
         m = stbi__get_marker(z);
      }
   }
   z->progressive = stbi__SOF_progressive(m);
   if (!stbi__process_frame_header(z, scan)) return 0;
   return 1;
}

// decode image to YCbCr format
static int stbi__decode_jpeg_image(stbi__jpeg *j)
{
   int m;
   for (m = 0; m < 4; m++) {
      j->img_comp[m].raw_data = NULL;
      j->img_comp[m].raw_coeff = NULL;
   }
   j->restart_interval = 0;
   if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
   m = stbi__get_marker(j);
   while (!stbi__EOI(m)) {
      if (stbi__SOS(m)) {
         if (!stbi__process_scan_header(j)) return 0;
         if (!stbi__parse_entropy_coded_data(j)) return 0;
         if (j->marker == STBI__MARKER_none) {
            // handle 0s at the end of image data from IP Kamera 9060
            while (!stbi__at_eof(j->s)) {
               int x = stbi__get8(j->s);
               if (x == 255) {
                  j->marker = stbi__get8(j->s);
                  break;
               } else if (x != 0) {
                  return stbi__err("junk before marker", "Corrupt JPEG");
               }
            }
            // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
         }
      } else {
         if (!stbi__process_marker(j, m)) return 0;
      }
      m = stbi__get_marker(j);
   }
   if (j->progressive)
      stbi__jpeg_finish(j);
   return 1;
}

// static jfif-centered resampling (across block boundaries)

typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
                                    int w, int hs);

#define stbi__div4(x) ((stbi_uc) ((x) >> 2))

static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   STBI_NOTUSED(out);
   STBI_NOTUSED(in_far);
   STBI_NOTUSED(w);
   STBI_NOTUSED(hs);
   return in_near;
}

static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples vertically for every one in input
   int i;
   STBI_NOTUSED(hs);
   for (i=0; i < w; ++i)
      out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
   return out;
}

static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples horizontally for every one in input
   int i;
   stbi_uc *input = in_near;

   if (w == 1) {
      // if only one sample, can't do any interpolation
      out[0] = out[1] = input[0];
      return out;
   }

   out[0] = input[0];
   out[1] = stbi__div4(input[0]*3 + input[1] + 2);
   for (i=1; i < w-1; ++i) {
      int n = 3*input[i]+2;
      out[i*2+0] = stbi__div4(n+input[i-1]);
      out[i*2+1] = stbi__div4(n+input[i+1]);
   }
   out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
   out[i*2+1] = input[w-1];

   STBI_NOTUSED(in_far);
   STBI_NOTUSED(hs);

   return out;
}

#define stbi__div16(x) ((stbi_uc) ((x) >> 4))

static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i,t0,t1;
   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   out[0] = stbi__div4(t1+2);
   for (i=1; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}
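
// Illustration only, not part of the library: a standalone sketch of the
// scalar 2x2 upsampling filter above, written against plain unsigned char
// buffers so it can be tried in isolation. example_upsample_hv2 is a
// hypothetical name; the arithmetic follows the 3:1 vertical blend and 3:1
// horizontal blend used by the hv_2 resampler.

```c
#include <assert.h>

// Hypothetical standalone copy: writes 2*w output samples for w input
// samples, blending each input row pair 3:1 and each neighbour pair 3:1.
static void example_upsample_hv2(unsigned char *out, const unsigned char *in_near,
                                 const unsigned char *in_far, int w)
{
   int i, t0, t1;
   assert(w >= 1);
   if (w == 1) {
      out[0] = out[1] = (unsigned char) ((3*in_near[0] + in_far[0] + 2) >> 2);
      return;
   }
   t1 = 3*in_near[0] + in_far[0];                   // vertical blend, scaled by 4
   out[0] = (unsigned char) ((t1 + 2) >> 2);        // left edge: no neighbour
   for (i = 1; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i] + in_far[i];
      out[i*2-1] = (unsigned char) ((3*t0 + t1 + 8) >> 4);  // nearer to sample i-1
      out[i*2  ] = (unsigned char) ((3*t1 + t0 + 8) >> 4);  // nearer to sample i
   }
   out[w*2-1] = (unsigned char) ((t1 + 2) >> 2);    // right edge: no neighbour
}
```

// With both input rows equal to {100, 200}, the four output samples are
// 100, 125, 175, 200: the edges keep their value and the two interior
// samples sit one quarter and three quarters of the way between them.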

#if defined(STBI_SSE2) || defined(STBI_NEON)
static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i=0,t0,t1;

   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   // process groups of 8 pixels for as long as we can.
   // note we can't handle the last pixel in a row in this loop
   // because we need to handle the filter boundary conditions.
   for (; i < ((w-1) & ~7); i += 8) {
#if defined(STBI_SSE2)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      __m128i zero  = _mm_setzero_si128();
      __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
      __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
      __m128i farw  = _mm_unpacklo_epi8(farb, zero);
      __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
      __m128i diff  = _mm_sub_epi16(farw, nearw);
      __m128i nears = _mm_slli_epi16(nearw, 2);
      __m128i curr  = _mm_add_epi16(nears, diff); // current row

      // horizontal filter works the same based on shifted vers of current
      // row. "prev" is current row shifted right by 1 pixel; we need to
      // insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      __m128i prv0 = _mm_slli_si128(curr, 2);
      __m128i nxt0 = _mm_srli_si128(curr, 2);
      __m128i prev = _mm_insert_epi16(prv0, t1, 0);
      __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      __m128i bias = _mm_set1_epi16(8);
      __m128i curs = _mm_slli_epi16(curr, 2);
      __m128i prvd = _mm_sub_epi16(prev, curr);
      __m128i nxtd = _mm_sub_epi16(next, curr);
      __m128i curb = _mm_add_epi16(curs, bias);
      __m128i even = _mm_add_epi16(prvd, curb);
      __m128i odd  = _mm_add_epi16(nxtd, curb);

      // interleave even and odd pixels, then undo scaling.
      __m128i int0 = _mm_unpacklo_epi16(even, odd);
      __m128i int1 = _mm_unpackhi_epi16(even, odd);
      __m128i de0  = _mm_srli_epi16(int0, 4);
      __m128i de1  = _mm_srli_epi16(int1, 4);

      // pack and write output
      __m128i outv = _mm_packus_epi16(de0, de1);
      _mm_storeu_si128((__m128i *) (out + i*2), outv);
#elif defined(STBI_NEON)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      uint8x8_t farb  = vld1_u8(in_far + i);
      uint8x8_t nearb = vld1_u8(in_near + i);
      int16x8_t diff  = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
      int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
      int16x8_t curr  = vaddq_s16(nears, diff); // current row

      // horizontal filter works the same based on shifted vers of current
      // row. "prev" is current row shifted right by 1 pixel; we need to
      // insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      int16x8_t prv0 = vextq_s16(curr, curr, 7);
      int16x8_t nxt0 = vextq_s16(curr, curr, 1);
      int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
      int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      int16x8_t curs = vshlq_n_s16(curr, 2);
      int16x8_t prvd = vsubq_s16(prev, curr);
      int16x8_t nxtd = vsubq_s16(next, curr);
      int16x8_t even = vaddq_s16(curs, prvd);
      int16x8_t odd  = vaddq_s16(curs, nxtd);

      // undo scaling and round, then store with even/odd phases interleaved
      uint8x8x2_t o;
      o.val[0] = vqrshrun_n_s16(even, 4);
      o.val[1] = vqrshrun_n_s16(odd,  4);
      vst2_u8(out + i*2, o);
#endif

      // "previous" value for next iter
      t1 = 3*in_near[i+7] + in_far[i+7];
   }

   t0 = t1;
   t1 = 3*in_near[i] + in_far[i];
   out[i*2] = stbi__div16(3*t1 + t0 + 8);

   for (++i; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}
#endif

static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // resample with nearest-neighbor
   int i,j;
   STBI_NOTUSED(in_far);
   for (i=0; i < w; ++i)
      for (j=0; j < hs; ++j)
         out[i*hs+j] = in_near[i];
   return out;
}

#ifdef STBI_JPEG_OLD
// this is the same YCbCr-to-RGB calculation that stb_image has used
// historically before the algorithm changes in 1.49
#define float2fixed(x)  ((int) ((x) * 65536 + 0.5))
static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
{
   int i;
   for (i=0; i < count; ++i) {
      int y_fixed = (y[i] << 16) + 32768; // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed + cr*float2fixed(1.40200f);
      g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
      b = y_fixed                            + cb*float2fixed(1.77200f);
      r >>= 16;
      g >>= 16;
      b >>= 16;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#else
// this is a reduced-precision calculation of YCbCr-to-RGB introduced
// to make sure the code produces the same results in both SIMD and scalar
#define float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
{
   int i;
   for (i=0; i < count; ++i) {
      int y_fixed = (y[i] << 20) + (1<<19); // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed +  cr* float2fixed(1.40200f);
      g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
      b = y_fixed                               +   cb* float2fixed(1.77200f);
      r >>= 20;
      g >>= 20;
      b >>= 20;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#endif
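
// Illustration only, not part of the library: a single-pixel sketch of the
// reduced-precision YCbCr-to-RGB conversion above. EXAMPLE_F2F and
// example_ycbcr_to_rgb are hypothetical names; the constants and the 20-bit
// fixed-point scale are taken from the row function. Like the original, it
// assumes a two's-complement int with arithmetic right shift.

```c
#include <assert.h>

// 12-bit coefficient, pre-shifted by 8 so products line up with y << 20.
#define EXAMPLE_F2F(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)

static unsigned char example_clamp255(int v)
{
   return (unsigned char) (v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Convert one YCbCr sample triple to RGB, written to rgb[0..2].
static void example_ycbcr_to_rgb(unsigned char y, unsigned char cb_in,
                                 unsigned char cr_in, unsigned char rgb[3])
{
   int y_fixed = (y << 20) + (1 << 19);   // luma in 20-bit fixed point, plus 0.5
   int cr = cr_in - 128;                  // chroma is stored with a +128 offset
   int cb = cb_in - 128;
   int r, g, b;
   assert(rgb != 0);
   r = y_fixed + cr * EXAMPLE_F2F(1.40200f);
   g = y_fixed + (cr * -EXAMPLE_F2F(0.71414f)) + ((cb * -EXAMPLE_F2F(0.34414f)) & 0xffff0000);
   b = y_fixed + cb * EXAMPLE_F2F(1.77200f);
   rgb[0] = example_clamp255(r >> 20);
   rgb[1] = example_clamp255(g >> 20);
   rgb[2] = example_clamp255(b >> 20);
}
```

// Neutral chroma (Cb = Cr = 128) leaves the luma value unchanged in all
// three channels, so (128, 128, 128) maps to mid-grey. Pushing Cr to 255
// with Y = 128 saturates red at 255 while blue stays at 128.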
#if defined(STBI_SSE2) || defined(STBI_NEON)
static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
{
   int i = 0;

#ifdef STBI_SSE2
   // step == 3 is pretty ugly on the final interleave, and i'm not convinced
   // it's useful in practice (you wouldn't use it for textures, for example).
   // so just accelerate step == 4 case.
   if (step == 4) {
      // this is a fairly straightforward implementation and not super-optimized.
      __m128i signflip = _mm_set1_epi8(-0x80);
      __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
      __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
      __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
      __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
      __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
      __m128i xw = _mm_set1_epi16(255); // alpha channel

      for (; i+7 < count; i += 8) {
         // load
         __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
         __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
         __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
         __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
         __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128

         // unpack to short (and left-shift cr, cb by 8)
         __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
         __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
         __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);

         // color transform
         __m128i yws = _mm_srli_epi16(yw, 4);
         __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
         __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
         __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
         __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
         __m128i rws = _mm_add_epi16(cr0, yws);
         __m128i gwt = _mm_add_epi16(cb0, yws);
         __m128i bws = _mm_add_epi16(yws, cb1);
         __m128i gws = _mm_add_epi16(gwt, cr1);

         // descale
         __m128i rw = _mm_srai_epi16(rws, 4);
         __m128i bw = _mm_srai_epi16(bws, 4);
         __m128i gw = _mm_srai_epi16(gws, 4);

         // back to byte, set up for transpose
         __m128i brb = _mm_packus_epi16(rw, bw);
         __m128i gxb = _mm_packus_epi16(gw, xw);

         // transpose to interleave channels
         __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
         __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
         __m128i o0 = _mm_unpacklo_epi16(t0, t1);
         __m128i o1 = _mm_unpackhi_epi16(t0, t1);

         // store
         _mm_storeu_si128((__m128i *) (out + 0), o0);
         _mm_storeu_si128((__m128i *) (out + 16), o1);
         out += 32;
      }
   }
#endif

#ifdef STBI_NEON
   // in this version, step=3 support would be easy to add. but is there demand?
   if (step == 4) {
      // this is a fairly straightforward implementation and not super-optimized.
      uint8x8_t signflip = vdup_n_u8(0x80);
      int16x8_t cr_const0 = vdupq_n_s16(   (short) ( 1.40200f*4096.0f+0.5f));
      int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
      int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
      int16x8_t cb_const1 = vdupq_n_s16(   (short) ( 1.77200f*4096.0f+0.5f));

      for (; i+7 < count; i += 8) {
         // load
         uint8x8_t y_bytes  = vld1_u8(y + i);
         uint8x8_t cr_bytes = vld1_u8(pcr + i);
         uint8x8_t cb_bytes = vld1_u8(pcb + i);
         int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
         int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));

         // expand to s16
         int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
         int16x8_t crw = vshll_n_s8(cr_biased, 7);
         int16x8_t cbw = vshll_n_s8(cb_biased, 7);

         // color transform
         int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
         int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
         int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
         int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
         int16x8_t rws = vaddq_s16(yws, cr0);
         int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
         int16x8_t bws = vaddq_s16(yws, cb1);

         // undo scaling, round, convert to byte
         uint8x8x4_t o;
         o.val[0] = vqrshrun_n_s16(rws, 4);
         o.val[1] = vqrshrun_n_s16(gws, 4);
         o.val[2] = vqrshrun_n_s16(bws, 4);
         o.val[3] = vdup_n_u8(255);

         // store, interleaving r/g/b/a
         vst4_u8(out, o);
         out += 8*4;
      }
   }
#endif

   // scalar tail: handles the remaining pixels and the step != 4 case
   for (; i < count; ++i) {
      int y_fixed = (y[i] << 20) + (1<<19); // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed + cr* float2fixed(1.40200f);
      g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
      b = y_fixed                             +   cb* float2fixed(1.77200f);
      r >>= 20;
      g >>= 20;
      b >>= 20;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#endif

// set up the kernels
static void stbi__setup_jpeg(stbi__jpeg *j)
{
   j->idct_block_kernel = stbi__idct_block;
   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;

#ifdef STBI_SSE2
   if (stbi__sse2_available()) {
      j->idct_block_kernel = stbi__idct_simd;
      #ifndef STBI_JPEG_OLD
      j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
      #endif
      j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   }
#endif

#ifdef STBI_NEON
   j->idct_block_kernel = stbi__idct_simd;
   #ifndef STBI_JPEG_OLD
   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   #endif
   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
#endif
}

// clean up the temporary component buffers
static void stbi__cleanup_jpeg(stbi__jpeg *j)
{
   int i;
   for (i=0; i < j->s->img_n; ++i) {
      if (j->img_comp[i].raw_data) {
         STBI_FREE(j->img_comp[i].raw_data);
         j->img_comp[i].raw_data = NULL;
         j->img_comp[i].data = NULL;
      }
      if (j->img_comp[i].raw_coeff) {
         STBI_FREE(j->img_comp[i].raw_coeff);
         j->img_comp[i].raw_coeff = 0;
         j->img_comp[i].coeff = 0;
      }
      if (j->img_comp[i].linebuf) {
         STBI_FREE(j->img_comp[i].linebuf);
         j->img_comp[i].linebuf = NULL;
      }
   }
}

typedef struct
{
   resample_row_func resample;
   stbi_uc *line0,*line1;
   int hs,vs;   // expansion factor in each axis
   int w_lores; // horizontal pixels pre-expansion
   int ystep;   // how far through vertical expansion we are
   int ypos;    // which pre-expansion row we're on
} stbi__resample;

static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
{
   int n, decode_n;
   z->s->img_n = 0; // make stbi__cleanup_jpeg safe

   // validate req_comp
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");

   // load a jpeg image from whichever source, but leave in YCbCr format
   if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }

   // determine actual number of components to generate
   n = req_comp ? req_comp : z->s->img_n;

   if (z->s->img_n == 3 && n < 3)
      decode_n = 1;
   else
      decode_n = z->s->img_n;

   // resample and color-convert
   {
      int k;
      unsigned int i,j;
      stbi_uc *output;
      stbi_uc *coutput[4];

      stbi__resample res_comp[4];

      for (k=0; k < decode_n; ++k) {
         stbi__resample *r = &res_comp[k];

         // allocate line buffer big enough for upsampling off the edges
         // with upsample factor of 4
         z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
         if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

         r->hs      = z->img_h_max / z->img_comp[k].h;
         r->vs      = z->img_v_max / z->img_comp[k].v;
         r->ystep   = r->vs >> 1;
         r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
         r->ypos    = 0;
         r->line0   = r->line1 = z->img_comp[k].data;

         if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
         else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
         else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
         else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
         else                               r->resample = stbi__resample_row_generic;
      }

      // can't error after this, so this is safe
      output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
      if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

      // now go ahead and resample
      for (j=0; j < z->s->img_y; ++j) {
         stbi_uc *out = output + n * z->s->img_x * j;
         for (k=0; k < decode_n; ++k) {
            stbi__resample *r = &res_comp[k];
            int y_bot = r->ystep >= (r->vs >> 1);
            coutput[k] = r->resample(z->img_comp[k].linebuf,
                                     y_bot ? r->line1 : r->line0,
                                     y_bot ? r->line0 : r->line1,
                                     r->w_lores, r->hs);
            if (++r->ystep >= r->vs) {
               r->ystep = 0;
               r->line0 = r->line1;
               if (++r->ypos < z->img_comp[k].y)
                  r->line1 += z->img_comp[k].w2;
            }
         }
         if (n >= 3) {
            stbi_uc *y = coutput[0];
            if (z->s->img_n == 3) {
               z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
            } else
               for (i=0; i < z->s->img_x; ++i) {
                  out[0] = out[1] = out[2] = y[i];
                  out[3] = 255; // not used if n==3
                  out += n;
               }
         } else {
            stbi_uc *y = coutput[0];
            if (n == 1)
               for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
            else
               for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
         }
      }
      stbi__cleanup_jpeg(z);
      *out_x = z->s->img_x;
      *out_y = z->s->img_y;
      if (comp) *comp  = z->s->img_n; // report original components, not output
      return output;
   }
}
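
// Illustration only, not part of stb_image: a minimal sketch of how the
// loader above decides how many component planes to resample. When a
// 3-component image is requested as 1 or 2 channels, only the luma plane is
// needed. example_planes_to_decode is a hypothetical name.

```c
#include <assert.h>

/* img_n: components stored in the file; req_comp: 0 means "as stored" */
static int example_planes_to_decode(int img_n, int req_comp)
{
   int n;
   assert(req_comp >= 0 && req_comp <= 4);
   n = req_comp ? req_comp : img_n;       /* channels in the output */
   return (img_n == 3 && n < 3) ? 1 : img_n;
}
```

// For example, a color JPEG requested as grayscale (req_comp == 1) decodes
// one plane, while the same file requested as RGBA decodes all three.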

static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi__jpeg j;
   j.s = s;
   stbi__setup_jpeg(&j);
   return load_jpeg_image(&j, x,y,comp,req_comp);
}

static int stbi__jpeg_test(stbi__context *s)
{
   int r;
   stbi__jpeg j;
   j.s = s;
   stbi__setup_jpeg(&j);
   r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
   stbi__rewind(s);
   return r;
}

static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
{
   if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
      stbi__rewind( j->s );
      return 0;
   }
   if (x) *x = j->s->img_x;
   if (y) *y = j->s->img_y;
   if (comp) *comp = j->s->img_n;
   return 1;
}

static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
{
   stbi__jpeg j;
   j.s = s;
   return stbi__jpeg_info_raw(&j, x, y, comp);
}
#endif

// public domain zlib decode    v0.2  Sean Barrett 2006-11-18
//    simple implementation
//      - all input must be provided in an upfront buffer
//      - all output is written to a single output buffer (can malloc/realloc)
//    performance
//      - fast huffman

#ifndef STBI_NO_ZLIB

// fast-way is faster to check than jpeg huffman, but slow way is slower
#define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
#define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)

// zlib-style huffman encoding
// (jpeg packs from left, zlib from right, so can't share code)
typedef struct
{
   stbi__uint16 fast[1 << STBI__ZFAST_BITS];
   stbi__uint16 firstcode[16];
   int maxcode[17];
   stbi__uint16 firstsymbol[16];
   stbi_uc  size[288];
   stbi__uint16 value[288];
} stbi__zhuffman;

stbi_inline static int stbi__bitreverse16(int n)
{
  n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
  n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
  n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
  n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
  return n;
}

stbi_inline static int stbi__bit_reverse(int v, int bits)
{
   STBI_ASSERT(bits <= 16);
   // to bit reverse n bits, reverse 16 and shift
   // e.g. 11 bits, bit reverse and shift away 5
   return stbi__bitreverse16(v) >> (16-bits);
}
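
// Illustration only, not part of stb_image: a standalone sketch of the
// "reverse 16 bits, then shift away the unused ones" trick described above.
// example_reverse16 and example_reverse_bits are hypothetical names that
// mirror the two functions, using unsigned arithmetic.

```c
#include <assert.h>

/* swap adjacent bits, then pairs, then nibbles, then bytes */
static unsigned example_reverse16(unsigned n)
{
   n = ((n & 0xAAAAu) >> 1) | ((n & 0x5555u) << 1);
   n = ((n & 0xCCCCu) >> 2) | ((n & 0x3333u) << 2);
   n = ((n & 0xF0F0u) >> 4) | ((n & 0x0F0Fu) << 4);
   n = ((n & 0xFF00u) >> 8) | ((n & 0x00FFu) << 8);
   return n;
}

/* reverse only the low `bits` bits of v */
static unsigned example_reverse_bits(unsigned v, int bits)
{
   assert(bits >= 1 && bits <= 16);
   return example_reverse16(v) >> (16 - bits);
}
```

// Reversing the 11-bit value 1 yields 1024 (bit 0 moves to bit 10), and
// reversing the 3-bit value 6 (binary 110) yields 3 (binary 011).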

static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
{
   int i,k=0;
   int code, next_code[16], sizes[17];

   // DEFLATE spec for generating codes
   memset(sizes, 0, sizeof(sizes));
   memset(z->fast, 0, sizeof(z->fast));
   for (i=0; i < num; ++i)
      ++sizes[sizelist[i]];
   sizes[0] = 0;
   for (i=1; i < 16; ++i)
      if (sizes[i] > (1 << i))
         return stbi__err("bad sizes", "Corrupt PNG");
   code = 0;
   for (i=1; i < 16; ++i) {
      next_code[i] = code;
      z->firstcode[i] = (stbi__uint16) code;
      z->firstsymbol[i] = (stbi__uint16) k;
      code = (code + sizes[i]);
      if (sizes[i])
         if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
      z->maxcode[i] = code << (16-i); // preshift for inner loop
      code <<= 1;
      k += sizes[i];
   }
   z->maxcode[16] = 0x10000; // sentinel
   for (i=0; i < num; ++i) {
      int s = sizelist[i];
      if (s) {
         int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
         stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
         z->size [c] = (stbi_uc     ) s;
         z->value[c] = (stbi__uint16) i;
         if (s <= STBI__ZFAST_BITS) {
            // fill every fast-table slot whose low s bits match the reversed code
            int j = stbi__bit_reverse(next_code[s],s);
            while (j < (1 << STBI__ZFAST_BITS)) {
               z->fast[j] = fastv;
               j += (1 << s);
            }
         }
         ++next_code[s];
      }
   }
   return 1;
}
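
// Illustration only, not part of stb_image: a minimal sketch of canonical
// Huffman code assignment from code lengths, following the procedure in the
// DEFLATE specification (RFC 1951, section 3.2.2). It leaves out the fast
// table and the error checks done above. example_assign_codes is a
// hypothetical name.

```c
#include <assert.h>

/* lengths[i] is the bit length of symbol i (0 = unused, at most 15);
   codes[i] receives the canonical code, MSB-first */
static void example_assign_codes(const unsigned char *lengths, int num, unsigned short *codes)
{
   int count[16] = {0};
   int next_code[16] = {0};
   int code = 0, i;

   for (i = 0; i < num; ++i) {
      assert(lengths[i] < 16);
      ++count[lengths[i]];
   }
   count[0] = 0; /* unused symbols do not consume codes */

   /* smallest code for each length */
   for (i = 1; i < 16; ++i) {
      code = (code + count[i-1]) << 1;
      next_code[i] = code;
   }

   /* hand out consecutive codes within each length, in symbol order */
   for (i = 0; i < num; ++i)
      codes[i] = lengths[i] ? (unsigned short) next_code[lengths[i]]++ : 0;
}
```

// With the RFC's example lengths (3,3,3,3,3,2,4,4) the symbols receive the
// codes 010, 011, 100, 101, 110, 00, 1110, 1111.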

// zlib-from-memory implementation for PNG reading
//    because PNG allows splitting the zlib stream arbitrarily,
//    and it's annoying structurally to have PNG call ZLIB call PNG,
//    we require PNG read all the IDATs and combine them into a single
//    memory buffer

typedef struct
{
   stbi_uc *zbuffer, *zbuffer_end;
   int num_bits;
   stbi__uint32 code_buffer;

   char *zout;
   char *zout_start;
   char *zout_end;
   int   z_expandable;

   stbi__zhuffman z_length, z_distance;
} stbi__zbuf;

stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
{
   if (z->zbuffer >= z->zbuffer_end) return 0;
   return *z->zbuffer++;
}

static void stbi__fill_bits(stbi__zbuf *z)
{
   do {
      STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
      z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
      z->num_bits += 8;
   } while (z->num_bits <= 24);
}

stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
{
   unsigned int k;
   if (z->num_bits < n) stbi__fill_bits(z);
   k = z->code_buffer & ((1 << n) - 1);
   z->code_buffer >>= n;
   z->num_bits -= n;
   return k;
}
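
// Illustration only, not part of stb_image: a stateless sketch of the
// LSB-first bit order that the bit buffer above implements. Bits are taken
// from each byte starting at the least significant bit, and reads past the
// end of the input yield zeros, as stbi__zget8 does. The function name
// example_bits_lsb_first is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* return n bits (n <= 16) starting at absolute bit offset bitpos */
static unsigned example_bits_lsb_first(const unsigned char *data, size_t len, size_t bitpos, int n)
{
   unsigned value = 0;
   int i;
   assert(n >= 0 && n <= 16);
   for (i = 0; i < n; ++i) {
      size_t pos = bitpos + (size_t) i;
      unsigned bit = (pos / 8 < len) ? ((unsigned) data[pos / 8] >> (pos % 8)) & 1u : 0u;
      value |= bit << i; /* first bit read becomes the lowest bit */
   }
   return value;
}
```

// For the bytes 0xB5 0x01, the first 3 bits are 5, the next 5 bits are 22,
// and the following 2 bits are 1.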

static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
{
   int b,s,k;
   // not resolved by fast table, so compute it the slow way
   // use jpeg approach, which requires MSbits at top
   k = stbi__bit_reverse(a->code_buffer, 16);
   for (s=STBI__ZFAST_BITS+1; ; ++s)
      if (k < z->maxcode[s])
         break;
   if (s == 16) return -1; // invalid code!
   // code size is s, so:
   b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
   STBI_ASSERT(z->size[b] == s);
   a->code_buffer >>= s;
   a->num_bits -= s;
   return z->value[b];
}

stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
{
   int b,s;
   if (a->num_bits < 16) stbi__fill_bits(a);
   b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
   if (b) {
      // fast-table entry packs (size << 9) | symbol
      s = b >> 9;
      a->code_buffer >>= s;
      a->num_bits -= s;
      return b & 511;
   }
   return stbi__zhuffman_decode_slowpath(a, z);
}

static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
{
   char *q;
   int cur, limit;
   z->zout = zout;
   if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
   cur   = (int) (z->zout     - z->zout_start);
   limit = (int) (z->zout_end - z->zout_start);
   while (cur + n > limit)
      limit *= 2;
   q = (char *) STBI_REALLOC(z->zout_start, limit);
   if (q == NULL) return stbi__err("outofmem", "Out of memory");
   z->zout_start = q;
   z->zout       = q + cur;
   z->zout_end   = q + limit;
   return 1;
}
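
// Illustration only, not part of stb_image: a sketch of the doubling policy
// used when the output buffer above has to grow. The capacity is doubled
// until `cur + n` bytes fit. example_grown_limit is a hypothetical name, and
// the sketch assumes a positive starting limit and no integer overflow.

```c
#include <assert.h>

/* cur: bytes already written, n: bytes about to be written,
   limit: current capacity; returns the capacity after growing */
static int example_grown_limit(int cur, int n, int limit)
{
   assert(limit > 0 && cur >= 0 && n >= 0);
   while (cur + n > limit)
      limit *= 2;
   return limit;
}
```

// Doubling keeps the number of reallocations logarithmic in the final output
// size, at the cost of up to 2x over-allocation.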

static int stbi__zlength_base[31] = {
   3,4,5,6,7,8,9,10,11,13,
   15,17,19,23,27,31,35,43,51,59,
   67,83,99,115,131,163,195,227,258,0,0 };

static int stbi__zlength_extra[31]=
{ 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };

static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};

static int stbi__zdist_extra[32] =
{ 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
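
// Illustration only, not part of stb_image: a sketch of how the base and
// extra-bit tables above turn a DEFLATE length symbol (257..285) into a match
// length. The tables are copied locally without the padding entries, and
// example_match_length is a hypothetical name. The caller is assumed to have
// already read the extra bits from the stream.

```c
#include <assert.h>

static const int example_length_base[29] = {
   3,4,5,6,7,8,9,10,11,13,15,17,19,23,27,31,35,43,51,59,
   67,83,99,115,131,163,195,227,258 };
static const int example_length_extra[29] = {
   0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0 };

/* returns the match length, or -1 if symbol is not a length code */
static int example_match_length(int symbol, unsigned extra_value)
{
   int idx = symbol - 257;
   if (idx < 0 || idx >= 29) return -1;
   assert(extra_value < (1u << example_length_extra[idx]));
   return example_length_base[idx] + (int) extra_value;
}
```

// Symbol 265 carries one extra bit, so it covers lengths 11 and 12; symbol
// 285 has no extra bits and always means the maximum length, 258.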

static int stbi__parse_huffman_block(stbi__zbuf *a)
{
   char *zout = a->zout;
   for(;;) {
      int z = stbi__zhuffman_decode(a, &a->z_length);
      if (z < 256) {
         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
         if (zout >= a->zout_end) {
            if (!stbi__zexpand(a, zout, 1)) return 0;
            zout = a->zout;
         }
         *zout++ = (char) z;
      } else {
         stbi_uc *p;
         int len,dist;
         if (z == 256) {
            // end-of-block symbol
            a->zout = zout;
            return 1;
         }
         z -= 257;
         len = stbi__zlength_base[z];
         if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
         z = stbi__zhuffman_decode(a, &a->z_distance);
         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
         dist = stbi__zdist_base[z];
         if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
         if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
         if (zout + len > a->zout_end) {
            if (!stbi__zexpand(a, zout, len)) return 0;
            zout = a->zout;
         }
         p = (stbi_uc *) (zout - dist);
         if (dist == 1) { // run of one byte; common in images.
            stbi_uc v = *p;
            if (len) { do *zout++ = v; while (--len); }
         } else {
            if (len) { do *zout++ = *p++; while (--len); }
         }
      }
   }
}

static int stbi__compute_huffman_codes(stbi__zbuf *a)
{
   static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
   stbi__zhuffman z_codelength;
   stbi_uc lencodes[286+32+137];//padding for maximum single op
   stbi_uc codelength_sizes[19];
   int i,n;

   int hlit  = stbi__zreceive(a,5) + 257;
   int hdist = stbi__zreceive(a,5) + 1;
   int hclen = stbi__zreceive(a,4) + 4;

   memset(codelength_sizes, 0, sizeof(codelength_sizes));
   for (i=0; i < hclen; ++i) {
      int s = stbi__zreceive(a,3);
      codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
   }
   if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;

   n = 0;
   while (n < hlit + hdist) {
      int c = stbi__zhuffman_decode(a, &z_codelength);
      if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
      if (c < 16)
         lencodes[n++] = (stbi_uc) c;
      else if (c == 16) {
         c = stbi__zreceive(a,2)+3;
         memset(lencodes+n, lencodes[n-1], c);
         n += c;
      } else if (c == 17) {
         c = stbi__zreceive(a,3)+3;
         memset(lencodes+n, 0, c);
         n += c;
      } else {
         STBI_ASSERT(c == 18);
         c = stbi__zreceive(a,7)+11;
         memset(lencodes+n, 0, c);
         n += c;
      }
   }
   if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
   if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
   if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
   return 1;
}

static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
{
   stbi_uc header[4];
   int len,nlen,k;
   if (a->num_bits & 7)
      stbi__zreceive(a, a->num_bits & 7); // discard
   // drain the bit-packed data into header
   k = 0;
   while (a->num_bits > 0) {
      header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
      a->code_buffer >>= 8;
      a->num_bits -= 8;
   }
   STBI_ASSERT(a->num_bits == 0);
   // now fill header the normal way
   while (k < 4)
      header[k++] = stbi__zget8(a);
   len  = header[1] * 256 + header[0];
   nlen = header[3] * 256 + header[2];
   if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
   if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
   if (a->zout + len > a->zout_end)
      if (!stbi__zexpand(a, a->zout, len)) return 0;
   memcpy(a->zout, a->zbuffer, len);
   a->zbuffer += len;
   a->zout += len;
   return 1;
}
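
// Illustration only, not part of stb_image: a sketch of the LEN/NLEN check
// for a stored (uncompressed) DEFLATE block. The 4-byte header holds the
// block length and its one's complement, both little-endian.
// example_stored_block_len is a hypothetical name.

```c
#include <assert.h>

/* returns the payload length, or -1 if NLEN is not the complement of LEN */
static int example_stored_block_len(const unsigned char header[4])
{
   int len  = header[1] * 256 + header[0];
   int nlen = header[3] * 256 + header[2];
   assert(len >= 0 && len <= 0xffff);
   if (nlen != (len ^ 0xffff)) return -1;
   return len;
}
```

// A header of 05 00 FA FF describes a 5-byte payload; any header whose last
// two bytes are not the bitwise complement of the first two is rejected.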

static int stbi__parse_zlib_header(stbi__zbuf *a)
{
   int cmf   = stbi__zget8(a);
   int cm    = cmf & 15;
   /* int cinfo = cmf >> 4; */
   int flg   = stbi__zget8(a);
   if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
   if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
   // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
   return 1;
}
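
// Illustration only, not part of stb_image: a sketch of the three zlib
// header checks performed above, applied to the two header bytes directly.
// example_zlib_header_ok is a hypothetical name.

```c
#include <assert.h>

/* returns 1 if (cmf, flg) is a header this decoder would accept */
static int example_zlib_header_ok(unsigned char cmf, unsigned char flg)
{
   if ((cmf * 256 + flg) % 31 != 0) return 0; /* FCHECK: must be a multiple of 31 */
   if (flg & 32) return 0;                    /* FDICT: preset dictionary not allowed */
   if ((cmf & 15) != 8) return 0;             /* CM: compression method must be DEFLATE */
   return 1;
}
```

// The common headers 78 9C and 78 DA pass; 78 00 fails the checksum and
// 78 20 is rejected because it sets the preset-dictionary flag.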

// @TODO: should statically initialize these for optimal thread safety
static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
static void stbi__init_zdefaults(void)
{
   int i;   // use <= to match clearly with spec
   for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
   for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
   for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
   for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;

   for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
}
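
// Illustration only, not part of stb_image: the fixed literal/length code
// lengths filled in above, written as a lookup function. The ranges follow
// the DEFLATE specification for fixed Huffman blocks (block type 1).
// example_fixed_litlen_bits is a hypothetical name.

```c
#include <assert.h>

/* bit length of the fixed Huffman code for a literal/length symbol */
static int example_fixed_litlen_bits(int symbol)
{
   assert(symbol >= 0 && symbol <= 287);
   if (symbol <= 143) return 8;
   if (symbol <= 255) return 9;
   if (symbol <= 279) return 7;
   return 8;
}
```

// All 32 distance symbols use 5-bit codes in a fixed block, so no lookup is
// needed for them.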

static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
{
   int final, type;
   if (parse_header)
      if (!stbi__parse_zlib_header(a)) return 0;
   a->num_bits = 0;
   a->code_buffer = 0;
   do {
      final = stbi__zreceive(a,1);
      type = stbi__zreceive(a,2);
      if (type == 0) {
         if (!stbi__parse_uncomperssed_block(a)) return 0;
      } else if (type == 3) {
         return 0;
      } else {
         if (type == 1) {
            // use fixed code lengths
            if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
            if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , 288)) return 0;
            if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
         } else {
            if (!stbi__compute_huffman_codes(a)) return 0;
         }
         if (!stbi__parse_huffman_block(a)) return 0;
      }
   } while (!final);
   return 1;
}

static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
{
   a->zout_start = obuf;
   a->zout       = obuf;
   a->zout_end   = obuf + olen;
   a->z_expandable = exp;

   return stbi__parse_zlib(a, parse_header);
}

STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(initial_size);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer + len;
   if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
{
   return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
}

STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(initial_size);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer + len;
   if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
{
   stbi__zbuf a;
   a.zbuffer = (stbi_uc *) ibuffer;
   a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
      return (int) (a.zout - a.zout_start);
   else
      return -1;
}

STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(16384);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer+len;
   if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
{
   stbi__zbuf a;
   a.zbuffer = (stbi_uc *) ibuffer;
   a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
      return (int) (a.zout - a.zout_start);
   else
      return -1;
}
#endif

// public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
//    simple implementation
//      - only 8-bit samples
//      - no CRC checking
//      - allocates lots of intermediate memory
//        - avoids problem of streaming data between subsystems
//        - avoids explicit window management
//    performance
//      - uses stb_zlib, a PD zlib implementation with fast huffman decoding

#ifndef STBI_NO_PNG
typedef struct
{
   stbi__uint32 length;
   stbi__uint32 type;
} stbi__pngchunk;

static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
{
   stbi__pngchunk c;
   c.length = stbi__get32be(s);
   c.type   = stbi__get32be(s);
   return c;
}

static int stbi__check_png_header(stbi__context *s)
{
   static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
   int i;
   for (i=0; i < 8; ++i)
      if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
   return 1;
}

typedef struct
{
   stbi__context *s;
   stbi_uc *idata, *expanded, *out;
} stbi__png;


enum {
   STBI__F_none=0,
   STBI__F_sub=1,
   STBI__F_up=2,
   STBI__F_avg=3,
   STBI__F_paeth=4,
   // synthetic filters used for first scanline to avoid needing a dummy row of 0s
   STBI__F_avg_first,
   STBI__F_paeth_first
};

static stbi_uc first_row_filter[5] =
{
   STBI__F_none,
   STBI__F_sub,
   STBI__F_none,
   STBI__F_avg_first,
   STBI__F_paeth_first
};

static int stbi__paeth(int a, int b, int c)
{
   int p = a + b - c;
   int pa = abs(p-a);
   int pb = abs(p-b);
   int pc = abs(p-c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}
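
// Illustration only, not part of stb_image: a sketch of reversing the PNG
// Paeth filter on one scanline, using the same predictor as above. Here a is
// the byte to the left, b the byte above, and c the byte above-left; pixels
// outside the image count as 0. example_paeth and example_unfilter_paeth are
// hypothetical names, and bpp is the number of bytes per pixel.

```c
#include <assert.h>
#include <stdlib.h>

static int example_paeth(int a, int b, int c)
{
   int p = a + b - c;
   int pa = abs(p - a), pb = abs(p - b), pc = abs(p - c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}

/* cur holds filtered bytes on entry and reconstructed bytes on return;
   prior is the already reconstructed previous scanline */
static void example_unfilter_paeth(unsigned char *cur, const unsigned char *prior, int n, int bpp)
{
   int i;
   assert(bpp >= 1);
   for (i = 0; i < n; ++i) {
      int a = (i >= bpp) ? cur[i - bpp] : 0;
      int b = prior[i];
      int c = (i >= bpp) ? prior[i - bpp] : 0;
      cur[i] = (unsigned char) (cur[i] + example_paeth(a, b, c)); /* wraps mod 256 */
   }
}
```

// With prior = {10,20,30}, filtered bytes {5,5,5} and bpp == 1, the row
// reconstructs to {15,25,35}.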

static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
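
// Illustration only, not part of stb_image: a sketch of what the scale table
// above is for. Multiplying a 1-, 2- or 4-bit grayscale sample by
// 255 / (2^depth - 1) stretches it to the full 0..255 range.
// example_scale_gray is a hypothetical name.

```c
#include <assert.h>

/* value must fit in `depth` bits; depth is 1, 2, 4 or 8 */
static unsigned char example_scale_gray(unsigned value, int depth)
{
   static const unsigned char scale[9] = { 0, 0xff, 0x55, 0, 0x11, 0, 0, 0, 0x01 };
   assert(depth == 1 || depth == 2 || depth == 4 || depth == 8);
   assert(value < (1u << depth));
   return (unsigned char) (value * scale[depth]);
}
```

// The largest sample at every depth maps to 255, so a 2-bit image uses the
// levels 0, 85, 170 and 255.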

// create the png data from post-deflated data
static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
{
   stbi__context *s = a->s;
   stbi__uint32 i,j,stride = x*out_n;
   stbi__uint32 img_len, img_width_bytes;
   int k;
   int img_n = s->img_n; // copy it into a local for later

   STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
   a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
   if (!a->out) return stbi__err("outofmem", "Out of memory");

   img_width_bytes = (((img_n * x * depth) + 7) >> 3);
   img_len = (img_width_bytes + 1) * y;
   if (s->img_x == x && s->img_y == y) {
      if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
   } else { // interlaced:
      if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
   }

   for (j=0; j < y; ++j) {
      stbi_uc *cur = a->out + stride*j;
      stbi_uc *prior = cur - stride;
      int filter = *raw++;
      int filter_bytes = img_n;
      int width = x;
      if (filter > 4)
         return stbi__err("invalid filter","Corrupt PNG");

      if (depth < 8) {
         STBI_ASSERT(img_width_bytes <= x);
         cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
         filter_bytes = 1;
         width = img_width_bytes;
      }

      // if first row, use special filter that doesn't sample previous row
      if (j == 0) filter = first_row_filter[filter];

      // handle first byte explicitly
      for (k=0; k < filter_bytes; ++k) {
         switch (filter) {
            case STBI__F_none       : cur[k] = raw[k]; break;
            case STBI__F_sub        : cur[k] = raw[k]; break;
            case STBI__F_up         : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            case STBI__F_avg        : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
            case STBI__F_paeth      : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
            case STBI__F_avg_first  : cur[k] = raw[k]; break;
            case STBI__F_paeth_first: cur[k] = raw[k]; break;
         }
      }

      if (depth == 8) {
         if (img_n != out_n)
            cur[img_n] = 255; // first pixel
         raw += img_n;
         cur += out_n;
         prior += out_n;
      } else {
         raw += 1;
         cur += 1;
         prior += 1;
      }

      // this is a little gross, so that we don't switch per-pixel or per-component
      if (depth < 8 || img_n == out_n) {
         int nk = (width - 1)*img_n;
         #define CASE(f) \
             case f:     \
                for (k=0; k < nk; ++k)
         switch (filter) {
            // "none" filter turns into a memcpy here; make that explicit.
            case STBI__F_none:         memcpy(cur, raw, nk); break;
            CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
            CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
            CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
            CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
            CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
         }
         #undef CASE
         raw += nk;
      } else {
         STBI_ASSERT(img_n+1 == out_n);
         #define CASE(f) \
             case f:     \
                for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
                   for (k=0; k < img_n; ++k)
         switch (filter) {
            CASE(STBI__F_none)         cur[k] = raw[k]; break;
            CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
            CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
            CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
            CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
            CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
         }
         #undef CASE
      }
   }

   // we make a separate pass to expand bits to pixels; for performance,
   // this could run two scanlines behind the above code, so it won't
   // interfere with filtering but will still be in the cache.
   if (depth < 8) {
      for (j=0; j < y; ++j) {
         stbi_uc *cur = a->out + stride*j;
         stbi_uc *in  = a->out + stride*j + x*out_n - img_width_bytes;
         // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
         // png guarantees byte alignment; if width is not a multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
         stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range

         // note that the final byte might overshoot and write more data than desired.
         // we can allocate enough data that this never writes out of memory, but it
         // could also overwrite the next scanline. can it overwrite non-empty data
         // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
         // so we need to explicitly clamp the final ones

         if (depth == 4) {
            for (k=x*img_n; k >= 2; k-=2, ++in) {
               *cur++ = scale * ((*in >> 4)       );
               *cur++ = scale * ((*in     ) & 0x0f);
            }
            if (k > 0) *cur++ = scale * ((*in >> 4)       );
         } else if (depth == 2) {
            for (k=x*img_n; k >= 4; k-=4, ++in) {
               *cur++ = scale * ((*in >> 6)       );
               *cur++ = scale * ((*in >> 4) & 0x03);
               *cur++ = scale * ((*in >> 2) & 0x03);
               *cur++ = scale * ((*in     ) & 0x03);
            }
            if (k > 0) *cur++ = scale * ((*in >> 6)       );
            if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
            if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
         } else if (depth == 1) {
            for (k=x*img_n; k >= 8; k-=8, ++in) {
               *cur++ = scale * ((*in >> 7)       );
               *cur++ = scale * ((*in >> 6) & 0x01);
               *cur++ = scale * ((*in >> 5) & 0x01);
               *cur++ = scale * ((*in >> 4) & 0x01);
               *cur++ = scale * ((*in >> 3) & 0x01);
               *cur++ = scale * ((*in >> 2) & 0x01);
               *cur++ = scale * ((*in >> 1) & 0x01);
               *cur++ = scale * ((*in     ) & 0x01);
            }
            if (k > 0) *cur++ = scale * ((*in >> 7)       );
            if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
            if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
            if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
            if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
            if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
            if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
         }
         if (img_n != out_n) {
            int q;
            // insert alpha = 255
            cur = a->out + stride*j;
            if (img_n == 1) {
               for (q=x-1; q >= 0; --q) {
                  cur[q*2+1] = 255;
                  cur[q*2+0] = cur[q];
               }
            } else {
               STBI_ASSERT(img_n == 3);
               for (q=x-1; q >= 0; --q) {
                  cur[q*4+3] = 255;
                  cur[q*4+2] = cur[q*3+2];
                  cur[q*4+1] = cur[q*3+1];
                  cur[q*4+0] = cur[q*3+0];
               }
            }
         }
      }
   }

   return 1;
}
static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
   return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
   final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
   for (p=0; p < 7; ++p) {
      int xorig[] = { 0,4,0,2,0,1,0 };
      int yorig[] = { 0,0,4,0,2,0,1 };
      int xspc[]  = { 8,8,4,4,2,2,1 };
      int yspc[]  = { 8,8,8,4,4,2,2 };
      // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
      x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
      y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
      stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
      if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
      for (j=0; j < y; ++j) {
         for (i=0; i < x; ++i) {
            int out_y = j*yspc[p]+yorig[p];
            int out_x = i*xspc[p]+xorig[p];
            memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
                   a->out + (j*x+i)*out_n, out_n);
      image_data += img_len;
      image_data_len -= img_len;
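
// Illustrative sketch (not part of stb_image): the per-pass width and height
// computed in the loop above follow the Adam7 origin/spacing tables. The
// helper names adam7_pass_size and adam7_total_pixels are hypothetical, and
// the code uses plain int rather than the library's stbi__uint32.

```c
/* Size of Adam7 pass p (0..6) for an image of img_w x img_h pixels (both >= 1). */
static void adam7_pass_size(int img_w, int img_h, int p, int *pass_w, int *pass_h)
{
   static const int xorig[7] = { 0,4,0,2,0,1,0 };
   static const int yorig[7] = { 0,0,4,0,2,0,1 };
   static const int xspc[7]  = { 8,8,4,4,2,2,1 };
   static const int yspc[7]  = { 8,8,8,4,4,2,2 };
   /* ceil((img - orig) / spc); yields 0 when the image has no pixels in this pass */
   *pass_w = (img_w - xorig[p] + xspc[p] - 1) / xspc[p];
   *pass_h = (img_h - yorig[p] + yspc[p] - 1) / yspc[p];
}

/* Sum of pixels over all seven passes; the passes partition the image,
   so this should equal img_w * img_h. */
static long adam7_total_pixels(int img_w, int img_h)
{
   long total = 0;
   int p, w, h;
   for (p = 0; p < 7; ++p) {
      adam7_pass_size(img_w, img_h, p, &w, &h);
      total += (long) w * h;
   }
   return total;
}
```

// A pass can be empty (zero width or height) for small images, which is why
// a decoder has to tolerate passes that contribute no data.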
static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
   stbi__context *s = z->s;
   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   stbi_uc *p = z->out;
   // compute color-based transparency, assuming we've
   // already got 255 as the alpha value in the output
   STBI_ASSERT(out_n == 2 || out_n == 4);
   for (i=0; i < pixel_count; ++i) {
      p[1] = (p[0] == tc[0] ? 0 : 255);
   for (i=0; i < pixel_count; ++i) {
      if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
   stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
   stbi_uc *p, *temp_out, *orig = a->out;
   p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
   if (p == NULL) return stbi__err("outofmem", "Out of memory");
   // between here and free(out) below, exiting would leak
   if (pal_img_n == 3) {
      for (i=0; i < pixel_count; ++i) {
         p[1] = palette[n+1];
         p[2] = palette[n+2];
      for (i=0; i < pixel_count; ++i) {
         p[1] = palette[n+1];
         p[2] = palette[n+2];
         p[3] = palette[n+3];
static int stbi__unpremultiply_on_load = 0;
static int stbi__de_iphone_flag = 0;

STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
   stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;

STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
   stbi__de_iphone_flag = flag_true_if_should_convert;

static void stbi__de_iphone(stbi__png *z)
   stbi__context *s = z->s;
   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   stbi_uc *p = z->out;
   if (s->img_out_n == 3) {  // convert bgr to rgb
      for (i=0; i < pixel_count; ++i) {
   STBI_ASSERT(s->img_out_n == 4);
   if (stbi__unpremultiply_on_load) {
      // convert bgr to rgb and unpremultiply
      for (i=0; i < pixel_count; ++i) {
         p[0] = p[2] * 255 / a;
         p[1] = p[1] * 255 / a;
      // convert bgr to rgb
      for (i=0; i < pixel_count; ++i) {
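
// Illustrative sketch (not the library's code): converting premultiplied
// BGRA pixels to straight-alpha RGBA in place, as the "convert bgr to rgb and
// unpremultiply" branch above describes. The function name is hypothetical,
// and the alpha == 0 handling (swap only, no division) is an assumption made
// here to avoid dividing by zero.

```c
/* p points at pixel_count pixels stored as B, G, R, A bytes. */
static void bgra_to_rgba_unpremultiply(unsigned char *p, unsigned int pixel_count)
{
   unsigned int i;
   for (i = 0; i < pixel_count; ++i, p += 4) {
      unsigned char a = p[3];
      unsigned char b = p[0];                      /* save B before it is overwritten */
      if (a) {
         p[0] = (unsigned char) (p[2] * 255 / a);  /* R */
         p[1] = (unsigned char) (p[1] * 255 / a);  /* G */
         p[2] = (unsigned char) (b    * 255 / a);  /* B */
      } else {
         p[0] = p[2];                              /* alpha 0: swap only */
         p[2] = b;
      }
   }
}
```

// For well-formed premultiplied input every color channel is <= alpha, so the
// quotient stays within 0..255; malformed input is simply truncated by the cast.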
#define STBI__PNG_TYPE(a,b,c,d)  (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
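
// Illustrative sketch (hypothetical names, not stb_image's API): the macro
// above packs the four ASCII bytes of a chunk type into one 32-bit value, most
// significant byte first. PNG marks a chunk as ancillary by making the first
// letter lowercase, i.e. setting bit 5 of the first byte, which is bit 29 of
// the packed value.

```c
/* Pack a four-character chunk type, first character in the top byte. */
static unsigned int png_chunk_type(char a, char b, char c, char d)
{
   return ((unsigned int)(unsigned char) a << 24) |
          ((unsigned int)(unsigned char) b << 16) |
          ((unsigned int)(unsigned char) c <<  8) |
           (unsigned int)(unsigned char) d;
}

/* Nonzero when the chunk is critical (first letter uppercase, bit 29 clear). */
static int png_chunk_is_critical(unsigned int type)
{
   return (type & (1u << 29)) == 0;
}
```

// A decoder may skip unknown ancillary chunks, but it has to reject unknown
// critical ones.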
static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
   stbi_uc palette[1024], pal_img_n=0;
   stbi_uc has_trans=0, tc[3];
   stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
   int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
   stbi__context *s = z->s;
   if (!stbi__check_png_header(s)) return 0;
   if (scan == STBI__SCAN_type) return 1;
   stbi__pngchunk c = stbi__get_chunk_header(s);
   case STBI__PNG_TYPE('C','g','B','I'):
      stbi__skip(s, c.length);
   case STBI__PNG_TYPE('I','H','D','R'): {
      if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
      if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
      s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
      s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
      depth = stbi__get8(s);  if (depth != 1 && depth != 2 && depth != 4 && depth != 8)  return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
      color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
      if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
      comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
      filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
      interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
      if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
      s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
      if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
      if (scan == STBI__SCAN_header) return 1;
      // if paletted, then pal_n is our final components, and
      // img_n is # components to decompress/filter.
      if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
      // if SCAN_header, have to scan to see if we have a tRNS
   case STBI__PNG_TYPE('P','L','T','E'):  {
      if (first) return stbi__err("first not IHDR", "Corrupt PNG");
      if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
      pal_len = c.length / 3;
      if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
      for (i=0; i < pal_len; ++i) {
         palette[i*4+0] = stbi__get8(s);
         palette[i*4+1] = stbi__get8(s);
         palette[i*4+2] = stbi__get8(s);
         palette[i*4+3] = 255;
   case STBI__PNG_TYPE('t','R','N','S'): {
      if (first) return stbi__err("first not IHDR", "Corrupt PNG");
      if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
      if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
      if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
      if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
      for (i=0; i < c.length; ++i)
         palette[i*4+3] = stbi__get8(s);
      if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
      if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
      for (k=0; k < s->img_n; ++k)
         tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
   case STBI__PNG_TYPE('I','D','A','T'): {
      if (first) return stbi__err("first not IHDR", "Corrupt PNG");
      if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
      if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
      if ((int)(ioff + c.length) < (int)ioff) return 0;
      if (ioff + c.length > idata_limit) {
         if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
         while (ioff + c.length > idata_limit)
         p = (stbi_uc *) STBI_REALLOC(z->idata, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
      if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
   case STBI__PNG_TYPE('I','E','N','D'): {
      stbi__uint32 raw_len, bpl;
      if (first) return stbi__err("first not IHDR", "Corrupt PNG");
      if (scan != STBI__SCAN_load) return 1;
      if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
      // initial guess for decoded data size to avoid unnecessary reallocs
      bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
      raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
      z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
      if (z->expanded == NULL) return 0; // zlib should set error
      STBI_FREE(z->idata); z->idata = NULL;
      if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
         s->img_out_n = s->img_n+1;
         s->img_out_n = s->img_n;
      if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
         if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
      if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
         // pal_img_n == 3 or 4
         s->img_n = pal_img_n; // record the actual colors we had
         s->img_out_n = pal_img_n;
         if (req_comp >= 3) s->img_out_n = req_comp;
         if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
      STBI_FREE(z->expanded); z->expanded = NULL;
      // if critical, fail
      if (first) return stbi__err("first not IHDR", "Corrupt PNG");
      if ((c.type & (1 << 29)) == 0) {
         #ifndef STBI_NO_FAILURE_STRINGS
         static char invalid_chunk[] = "XXXX PNG chunk not known";
         invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
         invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
         invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
         invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
         return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
      stbi__skip(s, c.length);
   // end of PNG chunk, read and skip CRC
static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
   unsigned char *result=NULL;
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
      if (req_comp && req_comp != p->s->img_out_n) {
         result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
         p->s->img_out_n = req_comp;
         if (result == NULL) return result;
      if (n) *n = p->s->img_out_n;
   STBI_FREE(p->out);      p->out      = NULL;
   STBI_FREE(p->expanded); p->expanded = NULL;
   STBI_FREE(p->idata);    p->idata    = NULL;

static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   return stbi__do_png(&p, x,y,comp,req_comp);

static int stbi__png_test(stbi__context *s)
   r = stbi__check_png_header(s);

static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
   if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
      stbi__rewind( p->s );
   if (x) *x = p->s->img_x;
   if (y) *y = p->s->img_y;
   if (comp) *comp = p->s->img_n;

static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
   return stbi__png_info_raw(&p, x, y, comp);
// Microsoft/Windows BMP image

static int stbi__bmp_test_raw(stbi__context *s)
   if (stbi__get8(s) != 'B') return 0;
   if (stbi__get8(s) != 'M') return 0;
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   stbi__get32le(s); // discard data offset
   sz = stbi__get32le(s);
   r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);

static int stbi__bmp_test(stbi__context *s)
   int r = stbi__bmp_test_raw(s);

// returns 0..31 for the highest set bit
static int stbi__high_bit(unsigned int z)
   if (z == 0) return -1;
   if (z >= 0x10000) n += 16, z >>= 16;
   if (z >= 0x00100) n +=  8, z >>=  8;
   if (z >= 0x00010) n +=  4, z >>=  4;
   if (z >= 0x00004) n +=  2, z >>=  2;
   if (z >= 0x00002) n +=  1, z >>=  1;

static int stbi__bitcount(unsigned int a)
   a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   a = (a + (a >> 8)); // max 16 per 8 bits
   a = (a + (a >> 16)); // max 32 per 8 bits
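
// Illustrative sketch (hypothetical names, not stb_image's API): how a
// highest-set-bit helper and a population count can turn a BMP channel mask
// into an 8-bit channel value. high_bit and bit_count mirror the two routines
// above; extract_channel is an assumption about how the shift and count are
// combined (shift the top mask bit to bit 7, then replicate the bits to fill
// the low end), not a copy of the library's code.

```c
static int high_bit(unsigned int z)
{
   int n = 0;
   if (z == 0) return -1;
   if (z >= 0x10000) { n += 16; z >>= 16; }
   if (z >= 0x00100) { n +=  8; z >>=  8; }
   if (z >= 0x00010) { n +=  4; z >>=  4; }
   if (z >= 0x00004) { n +=  2; z >>=  2; }
   if (z >= 0x00002) { n +=  1; }
   return n;
}

static int bit_count(unsigned int a)
{
   a = (a & 0x55555555) + ((a >>  1) & 0x55555555); /* pairs */
   a = (a & 0x33333333) + ((a >>  2) & 0x33333333); /* nibbles */
   a = (a + (a >> 4)) & 0x0f0f0f0f;                 /* bytes */
   a = (a + (a >> 8));
   a = (a + (a >> 16));
   return (int) (a & 0xff);
}

/* Extract the channel selected by mask from pixel and widen it to 0..255. */
static int extract_channel(unsigned int pixel, unsigned int mask)
{
   unsigned int v, result;
   int shift, bits, z;
   if (mask == 0) return 255;               /* no such channel: treat as opaque/full */
   shift = high_bit(mask) - 7;              /* move the mask's top bit to bit 7 */
   bits  = bit_count(mask);
   v = pixel & mask;
   if (shift < 0) v <<= -shift; else v >>= shift;
   result = v;
   for (z = bits; z < 8; z += bits)         /* replicate high bits into the low bits */
      result += v >> z;
   return (int) result;
}
```

// With a 5-6-5 layout, a fully saturated 5-bit red (mask 0xF800) widens to 255
// rather than 248, which is the point of the bit replication step.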
static int stbi__shiftsigned(int v, int shift, int bits)
   if (shift < 0) v <<= -shift;
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   unsigned int mr=0,mg=0,mb=0,ma=0, all_a=255;
   stbi_uc pal[256][4];
   int psize=0,i,j,compress=0,width;
   int bpp, flip_vertically, pad, target, offset, hsz;
   if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   offset = stbi__get32le(s);
   hsz = stbi__get32le(s);
   if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
      s->img_x = stbi__get16le(s);
      s->img_y = stbi__get16le(s);
      s->img_x = stbi__get32le(s);
      s->img_y = stbi__get32le(s);
   if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
   bpp = stbi__get16le(s);
   if (bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
   flip_vertically = ((int) s->img_y) > 0;
   s->img_y = abs((int) s->img_y);
         psize = (offset - 14 - 24) / 3;
      compress = stbi__get32le(s);
      if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
      stbi__get32le(s); // discard sizeof
      stbi__get32le(s); // discard hres
      stbi__get32le(s); // discard vres
      stbi__get32le(s); // discard colorsused
      stbi__get32le(s); // discard max important
      if (hsz == 40 || hsz == 56) {
         if (bpp == 16 || bpp == 32) {
            if (compress == 0) {
                  all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
            } else if (compress == 3) {
               mr = stbi__get32le(s);
               mg = stbi__get32le(s);
               mb = stbi__get32le(s);
               // not documented, but generated by photoshop and handled by mspaint
               if (mr == mg && mg == mb) {
                  return stbi__errpuc("bad BMP", "bad BMP");
               return stbi__errpuc("bad BMP", "bad BMP");
         STBI_ASSERT(hsz == 108 || hsz == 124);
         mr = stbi__get32le(s);
         mg = stbi__get32le(s);
         mb = stbi__get32le(s);
         ma = stbi__get32le(s);
         stbi__get32le(s); // discard color space
         for (i=0; i < 12; ++i)
            stbi__get32le(s); // discard color space parameters
            stbi__get32le(s); // discard rendering intent
            stbi__get32le(s); // discard offset of profile data
            stbi__get32le(s); // discard size of profile data
            stbi__get32le(s); // discard reserved
         psize = (offset - 14 - hsz) >> 2;
   s->img_n = ma ? 4 : 3;
   if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
      target = s->img_n; // if they want monochrome, we'll post-convert
   out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
      if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
      for (i=0; i < psize; ++i) {
         pal[i][2] = stbi__get8(s);
         pal[i][1] = stbi__get8(s);
         pal[i][0] = stbi__get8(s);
         if (hsz != 12) stbi__get8(s);
      stbi__skip(s, offset - 14 - hsz - psize * (hsz == 12 ? 3 : 4));
      if (bpp == 4) width = (s->img_x + 1) >> 1;
      else if (bpp == 8) width = s->img_x;
      else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
      for (j=0; j < (int) s->img_y; ++j) {
         for (i=0; i < (int) s->img_x; i += 2) {
            int v=stbi__get8(s),v2=0;
            out[z++] = pal[v][0];
            out[z++] = pal[v][1];
            out[z++] = pal[v][2];
            if (target == 4) out[z++] = 255;
            if (i+1 == (int) s->img_x) break;
            v = (bpp == 8) ? stbi__get8(s) : v2;
            out[z++] = pal[v][0];
            out[z++] = pal[v][1];
            out[z++] = pal[v][2];
            if (target == 4) out[z++] = 255;
      int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
      stbi__skip(s, offset - 14 - hsz);
      if (bpp == 24) width = 3 * s->img_x;
      else if (bpp == 16) width = 2*s->img_x;
      else /* bpp = 32 and pad = 0 */ width=0;
      } else if (bpp == 32) {
         if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
         if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
         // right shift amt to put high bit in position #7
         rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
         gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
         bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
         ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
      for (j=0; j < (int) s->img_y; ++j) {
            for (i=0; i < (int) s->img_x; ++i) {
               out[z+2] = stbi__get8(s);
               out[z+1] = stbi__get8(s);
               out[z+0] = stbi__get8(s);
               a = (easy == 2 ? stbi__get8(s) : 255);
               if (target == 4) out[z++] = a;
            for (i=0; i < (int) s->img_x; ++i) {
               stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
               out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
               a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
               if (target == 4) out[z++] = STBI__BYTECAST(a);
   // if alpha channel is all 0s, replace with all 255s
   if (target == 4 && all_a == 0)
      for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
   if (flip_vertically) {
      for (j=0; j < (int) s->img_y>>1; ++j) {
         stbi_uc *p1 = out +      j     *s->img_x*target;
         stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
         for (i=0; i < (int) s->img_x*target; ++i) {
            t = p1[i], p1[i] = p2[i], p2[i] = t;
   if (req_comp && req_comp != target) {
      out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
      if (out == NULL) return out; // stbi__convert_format frees input on failure
   if (comp) *comp = s->img_n;
// Targa Truevision - TGA
// by Jonathan Dummer

static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
   int tga_w, tga_h, tga_comp;
   stbi__get8(s);                   // discard Offset
   sz = stbi__get8(s);              // color type
      return 0;      // only RGB or indexed allowed
   sz = stbi__get8(s);              // image type
   // only RGB or grey allowed, +/- RLE
   if ((sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11)) return 0;
   tga_w = stbi__get16le(s);
      return 0;   // test width
   tga_h = stbi__get16le(s);
      return 0;   // test height
   sz = stbi__get8(s);               // bits per pixel
   // only RGB or RGBA or grey allowed
   if ((sz != 8) && (sz != 16) && (sz != 24) && (sz != 32)) {
   if (comp) *comp = tga_comp / 8;
   return 1;                   // seems to have passed everything
static int stbi__tga_test(stbi__context *s)
   stbi__get8(s);      //   discard Offset
   sz = stbi__get8(s);   //   color type
   if ( sz > 1 ) return 0;   //   only RGB or indexed allowed
   sz = stbi__get8(s);   //   image type
   if ( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0;   //   only RGB or grey allowed, +/- RLE
   stbi__get16be(s);      //   discard palette start
   stbi__get16be(s);      //   discard palette length
   stbi__get8(s);         //   discard bits per palette color entry
   stbi__get16be(s);      //   discard x origin
   stbi__get16be(s);      //   discard y origin
   if ( stbi__get16be(s) < 1 ) return 0;      //   test width
   if ( stbi__get16be(s) < 1 ) return 0;      //   test height
   sz = stbi__get8(s);   //   bits per pixel
   if ( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) )
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   //   read in the TGA header stuff
   int tga_offset = stbi__get8(s);
   int tga_indexed = stbi__get8(s);
   int tga_image_type = stbi__get8(s);
   int tga_palette_start = stbi__get16le(s);
   int tga_palette_len = stbi__get16le(s);
   int tga_palette_bits = stbi__get8(s);
   int tga_x_origin = stbi__get16le(s);
   int tga_y_origin = stbi__get16le(s);
   int tga_width = stbi__get16le(s);
   int tga_height = stbi__get16le(s);
   int tga_bits_per_pixel = stbi__get8(s);
   int tga_comp = tga_bits_per_pixel / 8;
   int tga_inverted = stbi__get8(s);
   unsigned char *tga_data;
   unsigned char *tga_palette = NULL;
   unsigned char raw_data[4];
   int RLE_repeating = 0;
   int read_next_pixel = 1;
   //   do a tiny bit of processing
   if ( tga_image_type >= 8 )
      tga_image_type -= 8;
   /* int tga_alpha_bits = tga_inverted & 15; */
   tga_inverted = 1 - ((tga_inverted >> 5) & 1);
   if ( //(tga_indexed) ||
      (tga_width < 1) || (tga_height < 1) ||
      (tga_image_type < 1) || (tga_image_type > 3) ||
      ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
      (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
      return NULL; // we don't report this as a bad TGA because we don't even know if it's TGA
   //   If I'm paletted, then I'll use the number of bits from the palette
      tga_comp = tga_palette_bits / 8;
   if (comp) *comp = tga_comp;
   tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
   if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
   // skip to the data's starting position (offset usually = 0)
   stbi__skip(s, tga_offset );
   if ( !tga_indexed && !tga_is_RLE) {
      for (i=0; i < tga_height; ++i) {
         int row = tga_inverted ? tga_height -i - 1 : i;
         stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
         stbi__getn(s, tga_row, tga_width * tga_comp);
      //   do I need to load a palette?
         //   any data to skip? (offset usually = 0)
         stbi__skip(s, tga_palette_start );
         tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_palette_bits / 8 );
            STBI_FREE(tga_data);
            return stbi__errpuc("outofmem", "Out of memory");
         if (!stbi__getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 )) {
            STBI_FREE(tga_data);
            STBI_FREE(tga_palette);
            return stbi__errpuc("bad palette", "Corrupt TGA");
      for (i=0; i < tga_width * tga_height; ++i)
         //   if I'm in RLE mode, do I need to get a RLE chunk?
            if ( RLE_count == 0 )
               //   yep, get the next byte as a RLE command
               int RLE_cmd = stbi__get8(s);
               RLE_count = 1 + (RLE_cmd & 127);
               RLE_repeating = RLE_cmd >> 7;
               read_next_pixel = 1;
            } else if ( !RLE_repeating )
               read_next_pixel = 1;
            read_next_pixel = 1;
         //   OK, if I need to read a pixel, do it now
         if ( read_next_pixel )
            //   load however much data we did have
               //   read in 1 byte, then perform the lookup
               int pal_idx = stbi__get8(s);
               if ( pal_idx >= tga_palette_len )
               pal_idx *= tga_bits_per_pixel / 8;
               for (j = 0; j*8 < tga_bits_per_pixel; ++j)
                  raw_data[j] = tga_palette[pal_idx+j];
               //   read in the data raw
               for (j = 0; j*8 < tga_bits_per_pixel; ++j)
                  raw_data[j] = stbi__get8(s);
            //   clear the reading flag for the next pixel
            read_next_pixel = 0;
         } // end of reading a pixel
         for (j = 0; j < tga_comp; ++j)
            tga_data[i*tga_comp+j] = raw_data[j];
         //   in case we're in RLE mode, keep counting down
      //   do I need to invert the image?
         for (j = 0; j*2 < tga_height; ++j)
            int index1 = j * tga_width * tga_comp;
            int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
            for (i = tga_width * tga_comp; i > 0; --i)
               unsigned char temp = tga_data[index1];
               tga_data[index1] = tga_data[index2];
               tga_data[index2] = temp;
      //   clear my palette, if I had one
      if ( tga_palette != NULL )
         STBI_FREE( tga_palette );
      unsigned char* tga_pixel = tga_data;
      for (i=0; i < tga_width * tga_height; ++i)
         unsigned char temp = tga_pixel[0];
         tga_pixel[0] = tga_pixel[2];
         tga_pixel[2] = temp;
         tga_pixel += tga_comp;
   // convert to target component count
   if (req_comp && req_comp != tga_comp)
      tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
   //   the things I do to get rid of an error message, and yet keep
   //   Microsoft's C compilers happy... [8^(
   tga_palette_start = tga_palette_len = tga_palette_bits =
         tga_x_origin = tga_y_origin = 0;
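
// Illustrative sketch (hypothetical helper, not stb_image's code): the TGA RLE
// packet layout used by the loader above. Each packet starts with a command
// byte: the low 7 bits plus one give the pixel count, and the top bit selects
// a run packet (one pixel repeated) or a raw packet (count literal pixels).
// The bounds checks and the (size_t)-1 error return are assumptions added for
// the sketch.

```c
#include <stddef.h>

/* Decode pixel_count pixels of bpp bytes each from src into dst.
   Returns the number of pixels written, or (size_t)-1 on malformed input. */
static size_t tga_rle_decode(const unsigned char *src, size_t src_len,
                             unsigned char *dst, size_t pixel_count, size_t bpp)
{
   size_t in = 0, px = 0, j;
   while (px < pixel_count) {
      unsigned int cmd;
      size_t count;
      if (in >= src_len) return (size_t) -1;
      cmd = src[in++];
      count = 1 + (cmd & 127);
      if (count > pixel_count - px) return (size_t) -1;   /* packet overruns the image */
      if (cmd >> 7) {
         /* run packet: one pixel follows, repeated count times */
         if (bpp > src_len - in) return (size_t) -1;
         while (count--) {
            for (j = 0; j < bpp; ++j) dst[px*bpp + j] = src[in + j];
            ++px;
         }
         in += bpp;
      } else {
         /* raw packet: count literal pixels follow */
         if (count * bpp > src_len - in) return (size_t) -1;
         for (j = 0; j < count * bpp; ++j) dst[px*bpp + j] = src[in++];
         px += count;
      }
   }
   return px;
}
```

// Unlike this sketch, a streaming loader reads one pixel at a time and keeps
// the remaining count and the repeat flag as state between pixels.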
// *************************************************************************************************
// Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB

static int stbi__psd_test(stbi__context *s)
   int r = (stbi__get32be(s) == 0x38425053);

static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   int channelCount, compression;
   int channel, i, count, len;
   if (stbi__get32be(s) != 0x38425053)   // "8BPS"
      return stbi__errpuc("not PSD", "Corrupt PSD image");
   // Check file type version.
   if (stbi__get16be(s) != 1)
      return stbi__errpuc("wrong version", "Unsupported version of PSD image");
   // Skip 6 reserved bytes.
   // Read the number of channels (R, G, B, A, etc).
   channelCount = stbi__get16be(s);
   if (channelCount < 0 || channelCount > 16)
      return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
   // Read the rows and columns of the image.
   h = stbi__get32be(s);
   w = stbi__get32be(s);
   // Make sure the depth is 8 or 16 bits.
   bitdepth = stbi__get16be(s);
   if (bitdepth != 8 && bitdepth != 16)
      return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
   // Make sure the color mode is RGB.
   // Valid options are:
   if (stbi__get16be(s) != 3)
      return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
   // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
   stbi__skip(s,stbi__get32be(s) );
   // Skip the image resources.  (resolution, pen tool paths, etc)
   stbi__skip(s, stbi__get32be(s) );
   // Skip the reserved data.
   stbi__skip(s, stbi__get32be(s) );
   // Find out if the data is compressed.
   //   0: no compression
   //   1: RLE compressed
   compression = stbi__get16be(s);
   if (compression > 1)
      return stbi__errpuc("bad compression", "PSD has an unknown compression format");
   // Create the destination image.
   out = (stbi_uc *) stbi__malloc(4 * w*h);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   // Initialize the data to zero.
   //memset( out, 0, pixelCount * 4 );
   // Finally, the image data.
      // RLE as used by .PSD and .TIFF
      // Loop until you get the number of unpacked bytes you are expecting:
      //     Read the next source byte into n.
      //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
      //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
      //     Else if n is 128, noop.
      // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
      // which we're going to just skip.
      stbi__skip(s, h * channelCount * 2 );
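
// Illustrative sketch (hypothetical helper, not stb_image's code): the RLE
// scheme described in the comment above, decoded from one memory buffer into
// another. The control byte is read as unsigned here, so the "negative"
// range -127..-1 appears as 129..255 and the repeat count -n+1 becomes
// 257 - n. The bounds checks and the (size_t)-1 error return are assumptions
// added for the sketch.

```c
#include <stddef.h>

/* Decode until dst_len bytes are produced or src is exhausted.
   Returns the number of bytes written, or (size_t)-1 on malformed input. */
static size_t packbits_decode(const unsigned char *src, size_t src_len,
                              unsigned char *dst, size_t dst_len)
{
   size_t in = 0, out = 0;
   while (out < dst_len && in < src_len) {
      unsigned int n = src[in++];
      if (n == 128) {
         continue;                               /* no-op */
      } else if (n < 128) {
         size_t run = n + 1;                     /* copy the next n+1 bytes literally */
         if (run > src_len - in || run > dst_len - out) return (size_t) -1;
         while (run--) dst[out++] = src[in++];
      } else {
         size_t run = 257 - n;                   /* repeat the next byte -n+1 times */
         unsigned char val;
         if (in >= src_len || run > dst_len - out) return (size_t) -1;
         val = src[in++];
         while (run--) dst[out++] = val;
      }
   }
   return out;
}
```

// The PSD loader applies this per channel and writes with a stride of 4 so the
// planar channel data ends up interleaved as RGBA.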
      // Read the RLE data by channel.
      for (channel = 0; channel < 4; channel++) {
         if (channel >= channelCount) {
            // Fill this channel with default data.
            for (i = 0; i < pixelCount; i++, p += 4)
               *p = (channel == 3 ? 255 : 0);
            // Read the RLE data.
            while (count < pixelCount) {
               len = stbi__get8(s);
               } else if (len < 128) {
                  // Copy next len+1 bytes literally.
               } else if (len > 128) {
                  // Next -len+1 bytes in the dest are replicated from next source byte.
                  // (Interpret len as a negative 8-bit int.)
                  val = stbi__get8(s);
      // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
      // where each channel consists of an 8-bit value for each pixel in the image.
      // Read the data by channel.
      for (channel = 0; channel < 4; channel++) {
         if (channel >= channelCount) {
            // Fill this channel with default data.
            stbi_uc val = channel == 3 ? 255 : 0;
            for (i = 0; i < pixelCount; i++, p += 4)
            if (bitdepth == 16) {
               for (i = 0; i < pixelCount; i++, p += 4)
                  *p = (stbi_uc) (stbi__get16be(s) >> 8);
               for (i = 0; i < pixelCount; i++, p += 4)
   if (req_comp && req_comp != 4) {
      out = stbi__convert_format(out, 4, req_comp, w, h);
      if (out == NULL) return out; // stbi__convert_format frees input on failure
   if (comp) *comp = 4;
// *************************************************************************************************
// Softimage PIC loader
// See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
// See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/

static int stbi__pic_is4(stbi__context *s,const char *str)
      if (stbi__get8(s) != (stbi_uc)str[i])

static int stbi__pic_test_core(stbi__context *s)
   if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
   if (!stbi__pic_is4(s,"PICT"))

   stbi_uc size,type,channel;

static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
   for (i=0; i<4; ++i, mask>>=1) {
      if (channel & mask) {
         if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
         dest[i]=stbi__get8(s);

static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
   for (i=0;i<4; ++i, mask>>=1)
static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
   int act_comp=0,num_packets=0,y,chained;
   stbi__pic_packet packets[10];
   // this will (should...) cater for even some bizarre stuff like having data
   //  for the same channel in multiple packets.
      stbi__pic_packet *packet;
      if (num_packets==sizeof(packets)/sizeof(packets[0]))
         return stbi__errpuc("bad format","too many packets");
      packet = &packets[num_packets++];
      chained = stbi__get8(s);
      packet->size    = stbi__get8(s);
      packet->type    = stbi__get8(s);
      packet->channel = stbi__get8(s);
      act_comp |= packet->channel;
      if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
      if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
   *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
   for(y=0; y<height; ++y) {
      for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
         stbi__pic_packet *packet = &packets[packet_idx];
         stbi_uc *dest = result+y*width*4;
         switch (packet->type) {
               return stbi__errpuc("bad format","packet has bad compression type");
            case 0: {//uncompressed
               for(x=0;x<width;++x, dest+=4)
                  if (!stbi__readval(s,packet->channel,dest))
                     stbi_uc count,value[4];
                     count=stbi__get8(s);
                     if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");
                        count = (stbi_uc) left;
                     if (!stbi__readval(s,packet->channel,value))  return 0;
                     for(i=0; i<count; ++i,dest+=4)
                        stbi__copyval(packet->channel,dest,value);
            case 2: {//Mixed RLE
                  int count = stbi__get8(s), i;
                  if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");
                  if (count >= 128) { // Repeated
                        count = stbi__get16be(s);
                        return stbi__errpuc("bad file","scanline overrun");
                     if (!stbi__readval(s,packet->channel,value))
                     for(i=0;i<count;++i, dest += 4)
                        stbi__copyval(packet->channel,dest,value);
                     if (count>left) return stbi__errpuc("bad file","scanline overrun");
                     for(i=0;i<count;++i, dest+=4)
                        if (!stbi__readval(s,packet->channel,dest))
5423 static stbi_uc
*stbi__pic_load(stbi__context
*s
,int *px
,int *py
,int *comp
,int req_comp
)
5428 for (i
=0; i
<92; ++i
)
5431 x
= stbi__get16be(s
);
5432 y
= stbi__get16be(s
);
5433 if (stbi__at_eof(s
)) return stbi__errpuc("bad file","file too short (pic header)");
5434 if ((1 << 28) / x
< y
) return stbi__errpuc("too large", "Image too large to decode");
   stbi__get32be(s); //skip `ratio'
   stbi__get16be(s); //skip `fields'
   stbi__get16be(s); //skip `pad'
   // intermediate buffer is RGBA
   result = (stbi_uc *) stbi__malloc(x*y*4);
   memset(result, 0xff, x*y*4);
   if (!stbi__pic_load_core(s,x,y,comp, result)) {
   if (req_comp == 0) req_comp = *comp;
   result=stbi__convert_format(result,4,req_comp,x,y);

static int stbi__pic_test(stbi__context *s)
   int r = stbi__pic_test_core(s);
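
/* A minimal sketch of the size check that stbi__pic_load performs on the PIC
   header, written as a standalone helper. The name example_pic_dims_ok is
   hypothetical and not part of stb_image; accepting a zero dimension is an
   assumption made here only to keep the division safe. */

```c
#include <assert.h>

/* Hypothetical helper: returns 1 when width*height stays within a 2^28-pixel
   budget (so a 4-byte-per-pixel buffer size fits in an int), 0 otherwise. */
static int example_pic_dims_ok(int width, int height)
{
   assert(width >= 0 && height >= 0);
   if (width == 0 || height == 0) return 1;   /* nothing to allocate, nothing to overflow */
   return (1 << 28) / width >= height;        /* division form avoids overflowing width*height */
}
```

/* Dividing the budget by the width, instead of multiplying width by height,
   keeps the comparison inside the range of int for any 16-bit header values. */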

// *************************************************************************************************
// GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
   stbi_uc *out, *old_out; // output buffer (always 4 components)
   int flags, bgindex, ratio, transparent, eflags, delay;
   stbi_uc pal[256][4];
   stbi_uc lpal[256][4];
   stbi__gif_lzw codes[4096];
   stbi_uc *color_table;
   int start_x, start_y;

static int stbi__gif_test_raw(stbi__context *s)
   if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
   if (sz != '9' && sz != '7') return 0;
   if (stbi__get8(s) != 'a') return 0;

static int stbi__gif_test(stbi__context *s)
   int r = stbi__gif_test_raw(s);

static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
   for (i=0; i < num_entries; ++i) {
      pal[i][2] = stbi__get8(s);
      pal[i][1] = stbi__get8(s);
      pal[i][0] = stbi__get8(s);
      pal[i][3] = transp == i ? 0 : 255;

static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
   if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
      return stbi__err("not GIF", "Corrupt GIF");
   version = stbi__get8(s);
   if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
   if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
   stbi__g_failure_reason = "";
   g->w = stbi__get16le(s);
   g->h = stbi__get16le(s);
   g->flags = stbi__get8(s);
   g->bgindex = stbi__get8(s);
   g->ratio = stbi__get8(s);
   g->transparent = -1;
   if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
   if (is_info) return 1;
   if (g->flags & 0x80)
      stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);

static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
   if (!stbi__gif_header(s, &g, comp, 1)) {
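
/* A rough sketch of the header fields that stbi__gif_header reads, applied to a
   plain memory buffer instead of an stbi__context. Both helper names below are
   hypothetical and not part of stb_image; only the signature check, the two
   little-endian 16-bit sizes and the `2 << (flags & 7)` palette size follow the
   loader code. */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reads the logical screen size from the first 10 bytes
   of a GIF. Returns 1 on success, 0 if the buffer is short or not a GIF. */
static int example_gif_screen_size(const unsigned char *buf, size_t len, int *w, int *h)
{
   assert(buf != NULL && w != NULL && h != NULL);
   if (len < 10) return 0;
   if (memcmp(buf, "GIF87a", 6) != 0 && memcmp(buf, "GIF89a", 6) != 0) return 0;
   *w = buf[6] | (buf[7] << 8);   /* 16-bit little-endian width  */
   *h = buf[8] | (buf[9] << 8);   /* 16-bit little-endian height */
   return 1;
}

/* Hypothetical helper: number of entries in a color table described by a
   flags byte; bit 0x80 says whether a table is present at all. */
static int example_gif_palette_entries(int flags)
{
   return (flags & 0x80) ? (2 << (flags & 7)) : 0;
}
```

/* The low three bits of the flags byte encode log2(entries) - 1, which is why
   the loader shifts 2 rather than 1. */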

static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
   // recurse to decode the prefixes, since the linked-list is backwards,
   // and working backwards through an interleaved image would be nasty
   if (g->codes[code].prefix >= 0)
      stbi__out_gif_code(g, g->codes[code].prefix);
   if (g->cur_y >= g->max_y) return;
   p = &g->out[g->cur_x + g->cur_y];
   c = &g->color_table[g->codes[code].suffix * 4];
   if (g->cur_x >= g->max_x) {
      g->cur_x = g->start_x;
      g->cur_y += g->step;
      while (g->cur_y >= g->max_y && g->parse > 0) {
         g->step = (1 << g->parse) * g->line_size;
         g->cur_y = g->start_y + (g->step >> 1);
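
/* A small sketch of the interlace stepping used by stbi__out_gif_code, expressed
   in rows instead of byte offsets. The function name is hypothetical, and the
   starting values (step of 8 rows, parse of 3) are assumptions taken from the
   "first interlaced spacing" setup in the image-descriptor code. */

```c
#include <assert.h>

/* Hypothetical helper: order[n] receives the destination row of the n-th
   decoded row of an interlaced GIF frame that is `height` rows tall. */
static void example_gif_interlace_order(int height, int *order)
{
   int step = 8, parse = 3, row = 0, n = 0;
   assert(height >= 0 && order != 0);
   while (n < height) {
      /* when a pass runs off the bottom, halve the spacing and restart */
      while (row >= height && parse > 0) {
         step = 1 << parse;
         row = step >> 1;
         --parse;
      }
      if (row >= height) break;   /* defensive: all passes exhausted */
      order[n++] = row;
      row += step;
   }
}
```

/* For a 10-row frame this yields rows 0, 8, then 4, then 2, 6, then 1, 3, 5, 7, 9:
   the four passes together visit every row exactly once. */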

static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
   stbi__int32 len, init_code;
   stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
   lzw_cs = stbi__get8(s);
   if (lzw_cs > 12) return NULL;
   clear = 1 << lzw_cs;
   codesize = lzw_cs + 1;
   codemask = (1 << codesize) - 1;
   for (init_code = 0; init_code < clear; init_code++) {
      g->codes[init_code].prefix = -1;
      g->codes[init_code].first = (stbi_uc) init_code;
      g->codes[init_code].suffix = (stbi_uc) init_code;
   // support no starting clear code
      if (valid_bits < codesize) {
            len = stbi__get8(s); // start new block
         bits |= (stbi__int32) stbi__get8(s) << valid_bits;
         stbi__int32 code = bits & codemask;
         valid_bits -= codesize;
         // @OPTIMIZE: is there some way we can accelerate the non-clear path?
         if (code == clear) { // clear code
            codesize = lzw_cs + 1;
            codemask = (1 << codesize) - 1;
         } else if (code == clear + 1) { // end of stream code
            while ((len = stbi__get8(s)) > 0)
         } else if (code <= avail) {
            if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
               p = &g->codes[avail++];
               if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
               p->prefix = (stbi__int16) oldcode;
               p->first = g->codes[oldcode].first;
               p->suffix = (code == avail) ? p->first : g->codes[code].first;
            } else if (code == avail)
               return stbi__errpuc("illegal code in raster", "Corrupt GIF");
            stbi__out_gif_code(g, (stbi__uint16) code);
            if ((avail & codemask) == 0 && avail <= 0x0FFF) {
               codemask = (1 << codesize) - 1;
            return stbi__errpuc("illegal code in raster", "Corrupt GIF");

static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
   stbi_uc *c = g->pal[g->bgindex];
   for (y = y0; y < y1; y += 4 * g->w) {
      for (x = x0; x < x1; x += 4) {
         stbi_uc *p = &g->out[y + x];

// this function is designed to support animated gifs, although stb_image doesn't support it
static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
   stbi_uc *prev_out = 0;
   if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
      return 0; // stbi__g_failure_reason set by stbi__gif_header
   g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
   if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
   switch ((g->eflags & 0x1C) >> 2) {
      case 0: // unspecified (also always used on 1st frame)
         stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
      case 1: // do not dispose
         if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
         g->old_out = prev_out;
      case 2: // dispose to background
         if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
         stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
      case 3: // dispose to previous
            for (i = g->start_y; i < g->max_y; i += 4 * g->w)
               memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
      switch (stbi__get8(s)) {
         case 0x2C: /* Image Descriptor */
            int prev_trans = -1;
            stbi__int32 x, y, w, h;
            x = stbi__get16le(s);
            y = stbi__get16le(s);
            w = stbi__get16le(s);
            h = stbi__get16le(s);
            if (((x + w) > (g->w)) || ((y + h) > (g->h)))
               return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
            g->line_size = g->w * 4;
            g->start_y = y * g->line_size;
            g->max_x = g->start_x + w * 4;
            g->max_y = g->start_y + h * g->line_size;
            g->cur_x = g->start_x;
            g->cur_y = g->start_y;
            g->lflags = stbi__get8(s);
            if (g->lflags & 0x40) {
               g->step = 8 * g->line_size; // first interlaced spacing
               g->step = g->line_size;
            if (g->lflags & 0x80) {
               stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
               g->color_table = (stbi_uc *) g->lpal;
            } else if (g->flags & 0x80) {
               if (g->transparent >= 0 && (g->eflags & 0x01)) {
                  prev_trans = g->pal[g->transparent][3];
                  g->pal[g->transparent][3] = 0;
               g->color_table = (stbi_uc *) g->pal;
               return stbi__errpuc("missing color table", "Corrupt GIF");
            o = stbi__process_gif_raster(s, g);
            if (o == NULL) return NULL;
            if (prev_trans != -1)
               g->pal[g->transparent][3] = (stbi_uc) prev_trans;
         case 0x21: // Comment Extension.
            if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
               len = stbi__get8(s);
                  g->eflags = stbi__get8(s);
                  g->delay = stbi__get16le(s);
                  g->transparent = stbi__get8(s);
            while ((len = stbi__get8(s)) != 0)
         case 0x3B: // gif stream termination code
            return (stbi_uc *) s; // using '1' causes warning on some compilers
            return stbi__errpuc("unknown code", "Corrupt GIF");
   STBI_NOTUSED(req_comp);

static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   memset(&g, 0, sizeof(g));
   u = stbi__gif_load_next(s, &g, comp, req_comp);
   if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
      if (req_comp && req_comp != 4)
         u = stbi__convert_format(u, 4, req_comp, g.w, g.h);

static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
   return stbi__gif_info_raw(s,x,y,comp);

// *************************************************************************************************
// Radiance RGBE HDR loader
// originally by Nicolas Schulz
static int stbi__hdr_test_core(stbi__context *s)
   const char *signature = "#?RADIANCE\n";
   for (i=0; signature[i]; ++i)
      if (stbi__get8(s) != signature[i])

static int stbi__hdr_test(stbi__context* s)
   int r = stbi__hdr_test_core(s);

#define STBI__HDR_BUFLEN 1024
static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
   c = (char) stbi__get8(z);
   while (!stbi__at_eof(z) && c != '\n') {
      if (len == STBI__HDR_BUFLEN-1) {
         // flush to end of line
         while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
      c = (char) stbi__get8(z);

static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
   if ( input[3] != 0 ) {
      f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
         output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
         output[0] = input[0] * f1;
         output[1] = input[1] * f1;
         output[2] = input[2] * f1;
      if (req_comp == 2) output[1] = 1;
      if (req_comp == 4) output[3] = 1;
         case 4: output[3] = 1; /* fallthrough */
         case 3: output[0] = output[1] = output[2] = 0;
         case 2: output[1] = 1; /* fallthrough */
         case 1: output[0] = 0;
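
/* A minimal sketch of the RGBE-to-float step in stbi__hdr_convert, reduced to
   the three-component case. The function name is hypothetical and not part of
   stb_image; the scale factor 2^(e - 136) is the one the loader computes with
   ldexp, and a zero exponent byte is taken to mean black. */

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: expands one RGBE pixel (three 8-bit mantissas sharing
   one 8-bit exponent) into three linear float components. */
static void example_rgbe_to_rgb(const unsigned char rgbe[4], float rgb[3])
{
   assert(rgbe != 0 && rgb != 0);
   if (rgbe[3] != 0) {
      /* 128 is the exponent bias, 8 accounts for the 8-bit mantissa */
      float scale = (float) ldexp(1.0, (int) rgbe[3] - (128 + 8));
      rgb[0] = rgbe[0] * scale;
      rgb[1] = rgbe[1] * scale;
      rgb[2] = rgbe[2] * scale;
   } else {
      rgb[0] = rgb[1] = rgb[2] = 0.0f;
   }
}
```

/* With an exponent byte of 129 the scale is 2^-7, so mantissas 128, 64 and 32
   decode to 1.0, 0.5 and 0.25. */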

static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   char buffer[STBI__HDR_BUFLEN];
   unsigned char count, value;
   int i, j, k, c1,c2, z;
   if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
      return stbi__errpf("not HDR", "Corrupt HDR image");
      token = stbi__hdr_gettoken(s,buffer);
      if (token[0] == 0) break;
      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
   // Parse width and height
   // can't use sscanf() if we're not using stdio!
   token = stbi__hdr_gettoken(s,buffer);
   if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   height = (int) strtol(token, &token, 10);
   while (*token == ' ') ++token;
   if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
   width = (int) strtol(token, NULL, 10);
   if (comp) *comp = 3;
   if (req_comp == 0) req_comp = 3;
   hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
   // image data is stored as some number of sca
   if ( width < 8 || width >= 32768) {
      for (j=0; j < height; ++j) {
         for (i=0; i < width; ++i) {
            stbi__getn(s, rgbe, 4);
            stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
      // Read RLE-encoded data
      for (j = 0; j < height; ++j) {
         len = stbi__get8(s);
         if (c1 != 2 || c2 != 2 || (len & 0x80)) {
            // not run-length encoded, so we have to actually use THIS data as a decoded
            // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
            rgbe[0] = (stbi_uc) c1;
            rgbe[1] = (stbi_uc) c2;
            rgbe[2] = (stbi_uc) len;
            rgbe[3] = (stbi_uc) stbi__get8(s);
            stbi__hdr_convert(hdr_data, rgbe, req_comp);
            STBI_FREE(scanline);
            goto main_decode_loop; // yes, this makes no sense
         len |= stbi__get8(s);
         if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
         if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);
         for (k = 0; k < 4; ++k) {
               count = stbi__get8(s);
                  value = stbi__get8(s);
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = value;
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = stbi__get8(s);
         for (i=0; i < width; ++i)
            stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
      STBI_FREE(scanline);

static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
   char buffer[STBI__HDR_BUFLEN];
   if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0) {
      token = stbi__hdr_gettoken(s,buffer);
      if (token[0] == 0) break;
      if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   token = stbi__hdr_gettoken(s,buffer);
   if (strncmp(token, "-Y ", 3)) {
   *y = (int) strtol(token, &token, 10);
   while (*token == ' ') ++token;
   if (strncmp(token, "+X ", 3)) {
   *x = (int) strtol(token, NULL, 10);
#endif // STBI_NO_HDR

static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
   if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') {
   hsz = stbi__get32le(s);
   if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) {
      *x = stbi__get16le(s);
      *y = stbi__get16le(s);
      *x = stbi__get32le(s);
      *y = stbi__get32le(s);
   if (stbi__get16le(s) != 1) {
   *comp = stbi__get16le(s) / 8;

static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
   if (stbi__get32be(s) != 0x38425053) {
   if (stbi__get16be(s) != 1) {
   channelCount = stbi__get16be(s);
   if (channelCount < 0 || channelCount > 16) {
   *y = stbi__get32be(s);
   *x = stbi__get32be(s);
   if (stbi__get16be(s) != 8) {
   if (stbi__get16be(s) != 3) {

static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
   int act_comp=0,num_packets=0,chained;
   stbi__pic_packet packets[10];
   if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
   *x = stbi__get16be(s);
   *y = stbi__get16be(s);
   if (stbi__at_eof(s)) {
   if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
      stbi__pic_packet *packet;
      if (num_packets==sizeof(packets)/sizeof(packets[0]))
      packet = &packets[num_packets++];
      chained = stbi__get8(s);
      packet->size = stbi__get8(s);
      packet->type = stbi__get8(s);
      packet->channel = stbi__get8(s);
      act_comp |= packet->channel;
      if (stbi__at_eof(s)) {
      if (packet->size != 8) {
   *comp = (act_comp & 0x10 ? 4 : 3);

// *************************************************************************************************
// Portable Gray Map and Portable Pixel Map loader
// PGM: http://netpbm.sourceforge.net/doc/pgm.html
// PPM: http://netpbm.sourceforge.net/doc/ppm.html
// Known limitations:
//    Does not support comments in the header section
//    Does not support ASCII image data (formats P2 and P3)
//    Does not support 16-bit-per-channel

static int stbi__pnm_test(stbi__context *s)
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {

static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
   if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
   out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
   if (req_comp && req_comp != s->img_n) {
      out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
      if (out == NULL) return out; // stbi__convert_format frees input on failure

static int stbi__pnm_isspace(char c)
   return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';

static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
   while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
      *c = (char) stbi__get8(s);

static int stbi__pnm_isdigit(char c)
   return c >= '0' && c <= '9';

static int stbi__pnm_getinteger(stbi__context *s, char *c)
   while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
      value = value*10 + (*c - '0');
      *c = (char) stbi__get8(s);

static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {
   *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
   c = (char) stbi__get8(s);
   stbi__pnm_skip_whitespace(s, &c);
   *x = stbi__pnm_getinteger(s, &c); // read width
   stbi__pnm_skip_whitespace(s, &c);
   *y = stbi__pnm_getinteger(s, &c); // read height
   stbi__pnm_skip_whitespace(s, &c);
   maxv = stbi__pnm_getinteger(s, &c); // read max value
      return stbi__err("max value > 255", "PPM image not 8-bit");
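
/* A minimal sketch of the header walk that stbi__pnm_info performs (magic,
   width, height, max value), applied to a memory buffer. All names below are
   hypothetical and not part of stb_image. Unlike the loader, this version also
   rejects a missing number and caps the digit count; both are assumptions added
   for the example. Header comments are not handled, matching the limitation
   listed above. */

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int width, height, maxv, comp; } example_pnm_header;

static int example_pnm_isspace(char c)
{
   return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
}

/* Hypothetical helper: returns 1 if buf starts with a binary PGM/PPM header
   whose max value fits in 8 bits, 0 otherwise. */
static int example_parse_pnm_header(const char *buf, size_t len, example_pnm_header *h)
{
   int *fields[3];
   size_t i = 2;
   int k;
   assert(buf != NULL && h != NULL);
   if (len < 3 || buf[0] != 'P' || (buf[1] != '5' && buf[1] != '6')) return 0;
   h->comp = (buf[1] == '6') ? 3 : 1;   /* P6 is RGB, P5 is grayscale */
   fields[0] = &h->width; fields[1] = &h->height; fields[2] = &h->maxv;
   for (k = 0; k < 3; ++k) {
      int value = 0, digits = 0;
      while (i < len && example_pnm_isspace(buf[i])) ++i;
      while (i < len && buf[i] >= '0' && buf[i] <= '9') {
         if (value > 100000) return 0;   /* assumed cap, keeps value*10 inside int */
         value = value*10 + (buf[i] - '0');
         ++i; ++digits;
      }
      if (digits == 0) return 0;          /* a field was missing */
      *fields[k] = value;
   }
   return h->maxv <= 255;
}
```

/* Calling it on "P6\n640 480\n255\n" fills in 640 x 480 with 3 components,
   while an ASCII "P3" file is rejected at the magic check. */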

static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_info(s, x, y, comp)) return 1;
   if (stbi__png_info(s, x, y, comp)) return 1;
   if (stbi__gif_info(s, x, y, comp)) return 1;
   if (stbi__bmp_info(s, x, y, comp)) return 1;
   if (stbi__psd_info(s, x, y, comp)) return 1;
   if (stbi__pic_info(s, x, y, comp)) return 1;
   if (stbi__pnm_info(s, x, y, comp)) return 1;
   if (stbi__hdr_info(s, x, y, comp)) return 1;
   // test tga last because it's a crappy test!
   if (stbi__tga_info(s, x, y, comp))
   return stbi__err("unknown image type", "Image not of any known type, or corrupt");

#ifndef STBI_NO_STDIO
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
   FILE *f = stbi__fopen(filename, "rb");
   if (!f) return stbi__err("can't fopen", "Unable to open file");
   result = stbi_info_from_file(f, x, y, comp);

STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
   long pos = ftell(f);
   stbi__start_file(&s, f);
   r = stbi__info_main(&s,x,y,comp);
   fseek(f,pos,SEEK_SET);
#endif // !STBI_NO_STDIO

STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   stbi__start_mem(&s,buffer,len);
   return stbi__info_main(&s,x,y,comp);

STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   return stbi__info_main(&s,x,y,comp);

#endif // STB_IMAGE_IMPLEMENTATION

      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
      2.07  (2015-09-13) fix compiler warnings
                         partial animated GIF support
                         limited 16-bit PSD support
                         #ifdef unused functions
                         bug with < 92 byte PIC,PNM,HDR,TGA
      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
      2.03  (2015-04-12) extra corruption checking (mmozeiko)
                         stbi_set_flip_vertically_on_load (nguillemot)
                         fix NEON support; fix mingw support
      2.02  (2015-01-19) fix incorrect assert, fix warning
      2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
      2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
                         progressive JPEG (stb)
                         PGM/PPM support (Ken Miller)
                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
                         GIF bugfix -- seemingly never worked
                         STBI_NO_*, STBI_ONLY_*
      1.48  (2014-12-14) fix incorrectly-named assert()
      1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
                         fix bug in interlaced PNG with user-specified channel count (stb)
              fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
              fix MSVC-ARM internal compiler error by wrapping malloc
              various warning fixes from Ronny Chevalier
              fix MSVC-only compiler problem in code changed in 1.42
              don't define _CRT_SECURE_NO_WARNINGS (affects user code)
              fixes to stbi__cleanup_jpeg path
              added STBI_ASSERT to avoid requiring assert.h
              fix search&replace from 1.36 that messed up comments/error messages
              fix gcc struct-initialization warning
              fix to TGA optimization when req_comp != number of components in TGA;
              fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
              add support for BMP version 5 (more ignored fields)
              suppress MSVC warnings on integer casts truncating values
              fix accidental rename of 'skip' field of I/O
              remove duplicate typedef
              convert to header file single-file library
              if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
              fix broken STBI_SIMD path
              fix bug where stbi_load_from_file no longer left file pointer in correct place
              fix broken non-easy path for 32-bit BMP (possibly never used)
              TGA optimization by Arseny Kapoulkine
              use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
              make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
              support for "info" function for all supported filetypes (SpartanJ)
              a few more leak fixes, bug in PNG handling (SpartanJ)
              added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
              removed deprecated format-specific test/load functions
              removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
              error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
              fix inefficiency in decoding 32-bit BMP (David Woo)
              various warning fixes from Aurelien Pocheville
              fix bug in GIF palette transparency (SpartanJ)
              cast-to-stbi_uc to fix warnings
              fix bug in file buffering for PNG reported by SpartanJ
              refix trans_data warning (Won Chun)
              perf improvements reading from files on platforms with lock-heavy fgetc()
              minor perf improvements for jpeg
              deprecated type-specific functions so we'll get feedback if they're needed
              attempt to fix trans_data warning (Won Chun)
      1.23    fixed bug in iPhone support
              removed image *writing* support
              stbi_info support from Jetro Lauha
              GIF support from Jean-Marc Lienher
              iPhone PNG-extensions from James Brown
              warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez (U+017D)emva)
      1.21    fix use of 'stbi_uc' in header (reported by jon blow)
      1.20    added support for Softimage PIC, by Tom Seddon
      1.19    bug in interlaced PNG corruption check (found by ryg)
              fix a threading bug (local mutable static)
      1.17    support interlaced PNG
      1.16    major bugfix - stbi__convert_format converted one too many pixels
      1.15    initialize some fields for thread safety
      1.14    fix threadsafe conversion bug
              header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
      1.12    const qualifiers in the API
      1.11    Support installable IDCT, colorspace conversion routines
      1.10    Fixes for 64-bit (don't use "unsigned long")
              optimized upsampling by Fabian "ryg" Giesen
      1.09    Fix format-conversion for PSD code (bad global variables!)
      1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
      1.07    attempt to fix C++ warning/errors again
      1.06    attempt to fix C++ warning/errors again
      1.05    fix TGA loading to return correct *comp and use good luminance calc
      1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
      1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
      1.02    support for (subset of) HDR files, float interface for preferred access to them
      1.01    fix bug: possible bug in handling right-side up bmps... not sure
              fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
      1.00    interface to zlib that skips zlib header
      0.99    correct handling of alpha in palette
      0.98    TGA loader by lonesock; dynamically add loaders (untested)
      0.97    jpeg errors on too large a file; also catch another malloc failure
      0.96    fix detection of invalid v value - particleman@mollyrocket forum
      0.95    during header scan, seek to markers in case of padding
      0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
      0.93    handle jpegtran output; verbose errors
      0.92    read 4,8,16,24,32-bit BMP files of several formats
      0.91    output 24-bit Windows 3.0 BMP files
      0.90    fix a few more warnings; bump version number to approach 1.0
      0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
      0.60    fix compiling as c++
      0.59    fix warnings: merge Dave Moore's -Wall fixes
      0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
      0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
      0.56    fix bug: zlib uncompressed mode len vs. nlen
      0.55    fix bug: restart_interval not initialized to 0
      0.54    allow NULL for 'int *comp'
      0.53    fix bug in png 3->4; speedup png decoding
      0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
      0.51    obey req_comp requests, 1-component jpegs return as 1-component,
              on 'test' only check type, not whether we support this variant
              first released version