Cleanup
[carla.git] / source / modules / dgl / src / nanovg / stb_image.h
blobe06f7a1d73b1e35df431a74413d81296ad3dba59
1 /* stb_image - v2.10 - public domain image loader - http://nothings.org/stb_image.h
2 no warranty implied; use at your own risk
4 Do this:
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
8 // i.e. it should look like this:
9 #include ...
10 #include ...
11 #include ...
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16 And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
19 QUICK NOTES:
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24 PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
26 TGA (not sure what subset, if a subset)
27 BMP non-1bpp, non-RLE
28 PSD (composited view only, no extra channels, 8/16 bit-per-channel)
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
32 PIC (Softimage PIC)
33 PNM (PPM and PGM binary only)
35 Animated GIF still needs a proper API, but here's one way to do it:
36 http://gist.github.com/urraka/685d9a6340b26b830d49
38 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39 - decode from arbitrary I/O callbacks
40 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
42 Full documentation under "DOCUMENTATION" below.
45 Revision 2.00 release notes:
47 - Progressive JPEG is now supported.
49 - PPM and PGM binary formats are now supported, thanks to Ken Miller.
51 - x86 platforms now make use of SSE2 SIMD instructions for
52 JPEG decoding, and ARM platforms can use NEON SIMD if requested.
53 This work was done by Fabian "ryg" Giesen. SSE2 is used by
54 default, but NEON must be enabled explicitly; see docs.
56 With other JPEG optimizations included in this version, we see
57 2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
58 on a JPEG on an ARM machine, relative to previous versions of this
59 library. The same results will not obtain for all JPGs and for all
60 x86/ARM machines. (Note that progressive JPEGs are significantly
61 slower to decode than regular JPEGs.) This doesn't mean that this
62 is the fastest JPEG decoder in the land; rather, it brings it
63 closer to parity with standard libraries. If you want the fastest
64 decode, look elsewhere. (See "Philosophy" section of docs below.)
66 See final bullet items below for more info on SIMD.
68 - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
69 the memory allocator. Unlike other STBI libraries, these macros don't
70 support a context parameter, so if you need to pass a context in to
71 the allocator, you'll have to store it in a global or a thread-local
72 variable.
74 - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
75 STBI_NO_LINEAR.
76 STBI_NO_HDR: suppress implementation of .hdr reader format
77 STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
79 - You can suppress implementation of any of the decoders to reduce
80 your code footprint by #defining one or more of the following
81 symbols before creating the implementation.
83 STBI_NO_JPEG
84 STBI_NO_PNG
85 STBI_NO_BMP
86 STBI_NO_PSD
87 STBI_NO_TGA
88 STBI_NO_GIF
89 STBI_NO_HDR
90 STBI_NO_PIC
91 STBI_NO_PNM (.ppm and .pgm)
93 - You can request *only* certain decoders and suppress all other ones
94 (this will be more forward-compatible, as addition of new decoders
95 doesn't require you to disable them explicitly):
97 STBI_ONLY_JPEG
98 STBI_ONLY_PNG
99 STBI_ONLY_BMP
100 STBI_ONLY_PSD
101 STBI_ONLY_TGA
102 STBI_ONLY_GIF
103 STBI_ONLY_HDR
104 STBI_ONLY_PIC
105 STBI_ONLY_PNM (.ppm and .pgm)
107 Note that you can define multiples of these, and you will get all
108 of them ("only x" and "only y" is interpreted to mean "only x&y").
110 - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
111 want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
113 - Compilation of all SIMD code can be suppressed with
114 #define STBI_NO_SIMD
115 It should not be necessary to disable SIMD unless you have issues
116 compiling (e.g. using an x86 compiler which doesn't support SSE
117 intrinsics or that doesn't support the method used to detect
118 SSE2 support at run-time), and even those can be reported as
119 bugs so I can refine the built-in compile-time checking to be
120 smarter.
122 - The old STBI_SIMD system which allowed installing a user-defined
123 IDCT etc. has been removed. If you need this, don't upgrade. My
124 assumption is that almost nobody was doing this, and those who
125 were will find the built-in SIMD more satisfactory anyway.
127 - RGB values computed for JPEG images are slightly different from
128 previous versions of stb_image. (This is due to using less
129 integer precision in SIMD.) The C code has been adjusted so
130 that the same RGB values will be computed regardless of whether
131 SIMD support is available, so your app should always produce
132 consistent results. But these results are slightly different from
133 previous versions. (Specifically, about 3% of available YCbCr values
134 will compute different RGB results from pre-1.49 versions by +-1;
135 most of the deviating values are one smaller in the G channel.)
137 - If you must produce consistent results with previous versions of
138 stb_image, #define STBI_JPEG_OLD and you will get the same results
139 you used to; however, you will not get the SIMD speedups for
140 the YCbCr-to-RGB conversion step (although you should still see
141 significant JPEG speedup from the other changes).
143 Please note that STBI_JPEG_OLD is a temporary feature; it will be
144 removed in future versions of the library. It is only intended for
145 near-term back-compatibility use.
148 Latest revision history:
149 2.10 (2016-01-22) avoid warning introduced in 2.09
150 2.09 (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
151 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
152 2.07 (2015-09-13) partial animated GIF support
153 limited 16-bit PSD support
154 minor bugs, code cleanup, and compiler warnings
155 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
156 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
157 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
158 2.03 (2015-04-12) additional corruption checking
159 stbi_set_flip_vertically_on_load
160 fix NEON support; fix mingw support
161 2.02 (2015-01-19) fix incorrect assert, fix warning
162 2.01 (2015-01-17) fix various warnings
163 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
164 2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
165 progressive JPEG
166 PGM/PPM support
167 STBI_MALLOC,STBI_REALLOC,STBI_FREE
168 STBI_NO_*, STBI_ONLY_*
169 GIF bugfix
170 1.48 (2014-12-14) fix incorrectly-named assert()
171 1.47 (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
172 optimize PNG
173 fix bug in interlaced PNG with user-specified channel count
175 See end of file for full revision history.
178 ============================ Contributors =========================
180 Image formats Extensions, features
181 Sean Barrett (jpeg, png, bmp) Jetro Lauha (stbi_info)
182 Nicolas Schulz (hdr, psd) Martin "SpartanJ" Golini (stbi_info)
183 Jonathan Dummer (tga) James "moose2000" Brown (iPhone PNG)
184 Jean-Marc Lienher (gif) Ben "Disch" Wenger (io callbacks)
185 Tom Seddon (pic) Omar Cornut (1/2/4-bit PNG)
186 Thatcher Ulrich (psd) Nicolas Guillemot (vertical flip)
187 Ken Miller (pgm, ppm) Richard Mitton (16-bit PSD)
188 urraka@github (animated gif) Junggon Kim (PNM comments)
189 Daniel Gibson (16-bit TGA)
191 Optimizations & bugfixes
192 Fabian "ryg" Giesen
193 Arseny Kapoulkine
195 Bug & warning fixes
196 Marc LeBlanc David Woo Guillaume George Martins Mozeiko
197 Christpher Lloyd Martin Golini Jerry Jansson Joseph Thomson
198 Dave Moore Roy Eltham Hayaki Saito Phil Jordan
199 Won Chun Luke Graham Johan Duparc Nathan Reed
200 the Horde3D community Thomas Ruf Ronny Chevalier Nick Verigakis
201 Janez Zemva John Bartholomew Michal Cichon svdijk@github
202 Jonathan Blow Ken Hamada Tero Hanninen Baldur Karlsson
203 Laurent Gomila Cort Stratton Sergio Gonzalez romigrou@github
204 Aruelien Pocheville Thibault Reuille Cass Everitt
205 Ryamond Barbiero Paul Du Bois Engin Manap
206 Blazej Dariusz Roszkowski
207 Michaelangel007@github
210 LICENSE
212 This software is in the public domain. Where that dedication is not
213 recognized, you are granted a perpetual, irrevocable license to copy,
214 distribute, and modify this file as you see fit.
218 #ifndef STBI_INCLUDE_STB_IMAGE_H
219 #define STBI_INCLUDE_STB_IMAGE_H
221 // DOCUMENTATION
223 // Limitations:
224 // - no 16-bit-per-channel PNG
225 // - no 12-bit-per-channel JPEG
226 // - no JPEGs with arithmetic coding
227 // - no 1-bit BMP
228 // - GIF always returns *comp=4
230 // Basic usage (see HDR discussion below for HDR usage):
231 // int x,y,n;
232 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
233 // // ... process data if not NULL ...
234 // // ... x = width, y = height, n = # 8-bit components per pixel ...
235 // // ... replace '0' with '1'..'4' to force that many components per pixel
236 // // ... but 'n' will always be the number that it would have been if you said 0
237 // stbi_image_free(data)
239 // Standard parameters:
240 // int *x -- outputs image width in pixels
241 // int *y -- outputs image height in pixels
242 // int *comp -- outputs # of image components in image file
243 // int req_comp -- if non-zero, # of image components requested in result
245 // The return value from an image loader is an 'unsigned char *' which points
246 // to the pixel data, or NULL on an allocation failure or if the image is
247 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
248 // with each pixel consisting of N interleaved 8-bit components; the first
249 // pixel pointed to is top-left-most in the image. There is no padding between
250 // image scanlines or between pixels, regardless of format. The number of
251 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
252 // If req_comp is non-zero, *comp has the number of components that _would_
253 // have been output otherwise. E.g. if you set req_comp to 4, you will always
254 // get RGBA output, but you can check *comp to see if it's trivially opaque
255 // because e.g. there were only 3 channels in the source image.
257 // An output image with N components has the following components interleaved
258 // in this order in each pixel:
260 // N=#comp components
261 // 1 grey
262 // 2 grey, alpha
263 // 3 red, green, blue
264 // 4 red, green, blue, alpha
266 // If image loading fails for any reason, the return value will be NULL,
267 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
268 // can be queried for an extremely brief, end-user unfriendly explanation
269 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
270 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
271 // more user-friendly ones.
273 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
275 // ===========================================================================
277 // Philosophy
279 // stb libraries are designed with the following priorities:
281 // 1. easy to use
282 // 2. easy to maintain
283 // 3. good performance
285 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
286 // and for best performance I may provide less-easy-to-use APIs that give higher
287 // performance, in addition to the easy to use ones. Nevertheless, it's important
288 // to keep in mind that from the standpoint of you, a client of this library,
289 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
291 // Some secondary priorities arise directly from the first two, some of which
292 // make more explicit reasons why performance can't be emphasized.
294 // - Portable ("ease of use")
295 // - Small footprint ("easy to maintain")
296 // - No dependencies ("ease of use")
298 // ===========================================================================
300 // I/O callbacks
302 // I/O callbacks allow you to read from arbitrary sources, like packaged
303 // files or some other source. Data read from callbacks are processed
304 // through a small internal buffer (currently 128 bytes) to try to reduce
305 // overhead.
307 // The three functions you must define are "read" (reads some bytes of data),
308 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
310 // ===========================================================================
312 // SIMD support
314 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
315 // supported by the compiler. For ARM Neon support, you must explicitly
316 // request it.
318 // (The old do-it-yourself SIMD API is no longer supported in the current
319 // code.)
321 // On x86, SSE2 will automatically be used when available based on a run-time
322 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
323 // the typical path is to have separate builds for NEON and non-NEON devices
324 // (at least this is true for iOS and Android). Therefore, the NEON support is
325 // toggled by a build flag: define STBI_NEON to get NEON loops.
327 // The output of the JPEG decoder is slightly different from versions where
328 // SIMD support was introduced (that is, for versions before 1.49). The
329 // difference is only +-1 in the 8-bit RGB channels, and only on a small
330 // fraction of pixels. You can force the pre-1.49 behavior by defining
331 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
332 // and hence cost some performance.
334 // If for some reason you do not want to use any of SIMD code, or if
335 // you have issues compiling it, you can disable it entirely by
336 // defining STBI_NO_SIMD.
338 // ===========================================================================
340 // HDR image support (disable by defining STBI_NO_HDR)
342 // stb_image now supports loading HDR images in general, and currently
343 // the Radiance .HDR file format, although the support is provided
344 // generically. You can still load any file through the existing interface;
345 // if you attempt to load an HDR file, it will be automatically remapped to
346 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
347 // both of these constants can be reconfigured through this interface:
349 // stbi_hdr_to_ldr_gamma(2.2f);
350 // stbi_hdr_to_ldr_scale(1.0f);
352 // (note, do not use _inverse_ constants; stbi_image will invert them
353 // appropriately).
355 // Additionally, there is a new, parallel interface for loading files as
356 // (linear) floats to preserve the full dynamic range:
358 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
360 // If you load LDR images through this interface, those images will
361 // be promoted to floating point values, run through the inverse of
362 // constants corresponding to the above:
364 // stbi_ldr_to_hdr_scale(1.0f);
365 // stbi_ldr_to_hdr_gamma(2.2f);
367 // Finally, given a filename (or an open file or memory block--see header
368 // file for details) containing image data, you can query for the "most
369 // appropriate" interface to use (that is, whether the image is HDR or
370 // not), using:
372 // stbi_is_hdr(char *filename);
374 // ===========================================================================
376 // iPhone PNG support:
378 // By default we convert iphone-formatted PNGs back to RGB, even though
379 // they are internally encoded differently. You can disable this conversion
380 // by by calling stbi_convert_iphone_png_to_rgb(0), in which case
381 // you will always just get the native iphone "format" through (which
382 // is BGR stored in RGB).
384 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
385 // pixel to remove any premultiplied alpha *only* if the image file explicitly
386 // says there's premultiplied data (currently only happens in iPhone images,
387 // and only if iPhone convert-to-rgb processing is on).
391 #ifndef STBI_NO_STDIO
392 #include <stdio.h>
393 #endif // STBI_NO_STDIO
395 #define STBI_VERSION 1
397 enum
399 STBI_default = 0, // only used for req_comp
401 STBI_grey = 1,
402 STBI_grey_alpha = 2,
403 STBI_rgb = 3,
404 STBI_rgb_alpha = 4
407 typedef unsigned char stbi_uc;
409 #ifdef __cplusplus
410 extern "C" {
411 #endif
413 #ifdef STB_IMAGE_STATIC
414 #define STBIDEF static
415 #else
416 #define STBIDEF extern
417 #endif
419 //////////////////////////////////////////////////////////////////////////////
421 // PRIMARY API - works on images of any type
425 // load image by filename, open file, or memory buffer
428 typedef struct
430 int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
431 void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
432 int (*eof) (void *user); // returns nonzero if we are at end of file/data
433 } stbi_io_callbacks;
435 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *comp, int req_comp);
436 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *comp, int req_comp);
437 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *comp, int req_comp);
439 #ifndef STBI_NO_STDIO
440 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
441 // for stbi_load_from_file, file pointer is left pointing immediately after image
442 #endif
444 #ifndef STBI_NO_LINEAR
445 STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *comp, int req_comp);
446 STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
447 STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
449 #ifndef STBI_NO_STDIO
450 STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
451 #endif
452 #endif
454 #ifndef STBI_NO_HDR
455 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
456 STBIDEF void stbi_hdr_to_ldr_scale(float scale);
457 #endif // STBI_NO_HDR
459 #ifndef STBI_NO_LINEAR
460 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
461 STBIDEF void stbi_ldr_to_hdr_scale(float scale);
462 #endif // STBI_NO_LINEAR
464 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
465 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
466 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
467 #ifndef STBI_NO_STDIO
468 STBIDEF int stbi_is_hdr (char const *filename);
469 STBIDEF int stbi_is_hdr_from_file(FILE *f);
470 #endif // STBI_NO_STDIO
473 // get a VERY brief reason for failure
474 // NOT THREADSAFE
475 STBIDEF const char *stbi_failure_reason (void);
477 // free the loaded image -- this is just free()
478 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
480 // get image dimensions & components without fully decoding
481 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
482 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
484 #ifndef STBI_NO_STDIO
485 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
486 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
488 #endif
492 // for image formats that explicitly notate that they have premultiplied alpha,
493 // we just return the colors as stored in the file. set this flag to force
494 // unpremultiplication. results are undefined if the unpremultiply overflow.
495 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
497 // indicate whether we should process iphone images back to canonical format,
498 // or just pass them through "as-is"
499 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
501 // flip the image vertically, so the first pixel in the output array is the bottom left
502 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
504 // ZLIB client - used by PNG, available for other purposes
506 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
507 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
508 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
509 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
511 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
512 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
515 #ifdef __cplusplus
517 #endif
521 //// end header file /////////////////////////////////////////////////////
522 #endif // STBI_INCLUDE_STB_IMAGE_H
524 #ifdef STB_IMAGE_IMPLEMENTATION
526 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
527 || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
528 || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
529 || defined(STBI_ONLY_ZLIB)
530 #ifndef STBI_ONLY_JPEG
531 #define STBI_NO_JPEG
532 #endif
533 #ifndef STBI_ONLY_PNG
534 #define STBI_NO_PNG
535 #endif
536 #ifndef STBI_ONLY_BMP
537 #define STBI_NO_BMP
538 #endif
539 #ifndef STBI_ONLY_PSD
540 #define STBI_NO_PSD
541 #endif
542 #ifndef STBI_ONLY_TGA
543 #define STBI_NO_TGA
544 #endif
545 #ifndef STBI_ONLY_GIF
546 #define STBI_NO_GIF
547 #endif
548 #ifndef STBI_ONLY_HDR
549 #define STBI_NO_HDR
550 #endif
551 #ifndef STBI_ONLY_PIC
552 #define STBI_NO_PIC
553 #endif
554 #ifndef STBI_ONLY_PNM
555 #define STBI_NO_PNM
556 #endif
557 #endif
559 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
560 #define STBI_NO_ZLIB
561 #endif
564 #include <stdarg.h>
565 #include <stddef.h> // ptrdiff_t on osx
566 #include <stdlib.h>
567 #include <string.h>
569 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
570 #include <math.h> // ldexp
571 #endif
573 #ifndef STBI_NO_STDIO
574 #include <stdio.h>
575 #endif
577 #ifndef STBI_ASSERT
578 #include <assert.h>
579 #define STBI_ASSERT(x) assert(x)
580 #endif
583 #ifndef _MSC_VER
584 #ifdef __cplusplus
585 #define stbi_inline inline
586 #else
587 #define stbi_inline
588 #endif
589 #else
590 #define stbi_inline __forceinline
591 #endif
594 #ifdef _MSC_VER
595 typedef unsigned short stbi__uint16;
596 typedef signed short stbi__int16;
597 typedef unsigned int stbi__uint32;
598 typedef signed int stbi__int32;
599 #else
600 #include <stdint.h>
601 typedef uint16_t stbi__uint16;
602 typedef int16_t stbi__int16;
603 typedef uint32_t stbi__uint32;
604 typedef int32_t stbi__int32;
605 #endif
607 // should produce compiler error if size is wrong
608 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
610 #ifdef _MSC_VER
611 #define STBI_NOTUSED(v) (void)(v)
612 #else
613 #define STBI_NOTUSED(v) (void)sizeof(v)
614 #endif
616 #ifdef _MSC_VER
617 #define STBI_HAS_LROTL
618 #endif
620 #ifdef STBI_HAS_LROTL
621 #define stbi_lrot(x,y) _lrotl(x,y)
622 #else
623 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
624 #endif
626 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
627 // ok
628 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
629 // ok
630 #else
631 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
632 #endif
634 #ifndef STBI_MALLOC
635 #define STBI_MALLOC(sz) malloc(sz)
636 #define STBI_REALLOC(p,newsz) realloc(p,newsz)
637 #define STBI_FREE(p) free(p)
638 #endif
640 #ifndef STBI_REALLOC_SIZED
641 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
642 #endif
644 // x86/x64 detection
645 #if defined(__x86_64__) || defined(_M_X64)
646 #define STBI__X64_TARGET
647 #elif defined(__i386) || defined(_M_IX86)
648 #define STBI__X86_TARGET
649 #endif
651 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
652 // NOTE: not clear do we actually need this for the 64-bit path?
653 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
654 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
655 // this is just broken and gcc are jerks for not fixing it properly
656 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
657 #define STBI_NO_SIMD
658 #endif
660 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
661 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
663 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
664 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
665 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
666 // simultaneously enabling "-mstackrealign".
668 // See https://github.com/nothings/stb/issues/81 for more information.
670 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
671 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
672 #define STBI_NO_SIMD
673 #endif
675 #if !defined(STBI_NO_SIMD) && defined(STBI__X86_TARGET)
676 #define STBI_SSE2
677 #include <emmintrin.h>
679 #ifdef _MSC_VER
681 #if _MSC_VER >= 1400 // not VC6
682 #include <intrin.h> // __cpuid
683 static int stbi__cpuid3(void)
685 int info[4];
686 __cpuid(info,1);
687 return info[3];
689 #else
690 static int stbi__cpuid3(void)
692 int res;
693 __asm {
694 mov eax,1
695 cpuid
696 mov res,edx
698 return res;
700 #endif
702 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
704 static int stbi__sse2_available()
706 int info3 = stbi__cpuid3();
707 return ((info3 >> 26) & 1) != 0;
709 #else // assume GCC-style if not VC++
710 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
712 static int stbi__sse2_available()
714 #if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
715 // GCC 4.8+ has a nice way to do this
716 return __builtin_cpu_supports("sse2");
717 #else
718 // portable way to do this, preferably without using GCC inline ASM?
719 // just bail for now.
720 return 0;
721 #endif
723 #endif
724 #endif
726 // ARM NEON
727 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
728 #undef STBI_NEON
729 #endif
731 #ifdef STBI_NEON
732 #include <arm_neon.h>
733 // assume GCC or Clang on ARM targets
734 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
735 #endif
737 #ifndef STBI_SIMD_ALIGN
738 #define STBI_SIMD_ALIGN(type, name) type name
739 #endif
741 ///////////////////////////////////////////////
743 // stbi__context struct and start_xxx functions
745 // stbi__context structure is our basic context used by all images, so it
746 // contains all the IO context, plus some basic image information
747 typedef struct
749 stbi__uint32 img_x, img_y;
750 int img_n, img_out_n;
752 stbi_io_callbacks io;
753 void *io_user_data;
755 int read_from_callbacks;
756 int buflen;
757 stbi_uc buffer_start[128];
759 stbi_uc *img_buffer, *img_buffer_end;
760 stbi_uc *img_buffer_original, *img_buffer_original_end;
761 } stbi__context;
764 static void stbi__refill_buffer(stbi__context *s);
766 // initialize a memory-decode context
767 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
769 s->io.read = NULL;
770 s->read_from_callbacks = 0;
771 s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
772 s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
775 // initialize a callback-based context
776 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
778 s->io = *c;
779 s->io_user_data = user;
780 s->buflen = sizeof(s->buffer_start);
781 s->read_from_callbacks = 1;
782 s->img_buffer_original = s->buffer_start;
783 stbi__refill_buffer(s);
784 s->img_buffer_original_end = s->img_buffer_end;
787 #ifndef STBI_NO_STDIO
789 static int stbi__stdio_read(void *user, char *data, int size)
791 return (int) fread(data,1,size,(FILE*) user);
794 static void stbi__stdio_skip(void *user, int n)
796 fseek((FILE*) user, n, SEEK_CUR);
799 static int stbi__stdio_eof(void *user)
801 return feof((FILE*) user);
804 static stbi_io_callbacks stbi__stdio_callbacks =
806 stbi__stdio_read,
807 stbi__stdio_skip,
808 stbi__stdio_eof,
811 static void stbi__start_file(stbi__context *s, FILE *f)
813 stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
816 //static void stop_file(stbi__context *s) { }
818 #endif // !STBI_NO_STDIO
820 static void stbi__rewind(stbi__context *s)
822 // conceptually rewind SHOULD rewind to the beginning of the stream,
823 // but we just rewind to the beginning of the initial buffer, because
824 // we only use it after doing 'test', which only ever looks at at most 92 bytes
825 s->img_buffer = s->img_buffer_original;
826 s->img_buffer_end = s->img_buffer_original_end;
829 #ifndef STBI_NO_JPEG
830 static int stbi__jpeg_test(stbi__context *s);
831 static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
832 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
833 #endif
835 #ifndef STBI_NO_PNG
836 static int stbi__png_test(stbi__context *s);
837 static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
838 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
839 #endif
841 #ifndef STBI_NO_BMP
842 static int stbi__bmp_test(stbi__context *s);
843 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
844 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
845 #endif
847 #ifndef STBI_NO_TGA
848 static int stbi__tga_test(stbi__context *s);
849 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
850 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
851 #endif
853 #ifndef STBI_NO_PSD
854 static int stbi__psd_test(stbi__context *s);
855 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
856 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
857 #endif
859 #ifndef STBI_NO_HDR
860 static int stbi__hdr_test(stbi__context *s);
861 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
862 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
863 #endif
865 #ifndef STBI_NO_PIC
866 static int stbi__pic_test(stbi__context *s);
867 static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
868 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
869 #endif
871 #ifndef STBI_NO_GIF
872 static int stbi__gif_test(stbi__context *s);
873 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
874 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
875 #endif
877 #ifndef STBI_NO_PNM
878 static int stbi__pnm_test(stbi__context *s);
879 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
880 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
881 #endif
883 // this is not threadsafe
884 static const char *stbi__g_failure_reason;
886 STBIDEF const char *stbi_failure_reason(void)
888 return stbi__g_failure_reason;
891 static int stbi__err(const char *str)
893 stbi__g_failure_reason = str;
894 return 0;
897 static void *stbi__malloc(size_t size)
899 return STBI_MALLOC(size);
902 // stbi__err - error
903 // stbi__errpf - error returning pointer to float
904 // stbi__errpuc - error returning pointer to unsigned char
906 #ifdef STBI_NO_FAILURE_STRINGS
907 #define stbi__err(x,y) 0
908 #elif defined(STBI_FAILURE_USERMSG)
909 #define stbi__err(x,y) stbi__err(y)
910 #else
911 #define stbi__err(x,y) stbi__err(x)
912 #endif
914 #define stbi__errpf(x,y) ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
915 #define stbi__errpuc(x,y) ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
917 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
919 STBI_FREE(retval_from_stbi_load);
922 #ifndef STBI_NO_LINEAR
923 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
924 #endif
926 #ifndef STBI_NO_HDR
927 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);
928 #endif
930 static int stbi__vertically_flip_on_load = 0;
932 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
934 stbi__vertically_flip_on_load = flag_true_if_should_flip;
937 static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
939 #ifndef STBI_NO_JPEG
940 if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
941 #endif
942 #ifndef STBI_NO_PNG
943 if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp);
944 #endif
945 #ifndef STBI_NO_BMP
946 if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp);
947 #endif
948 #ifndef STBI_NO_GIF
949 if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp);
950 #endif
951 #ifndef STBI_NO_PSD
952 if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp);
953 #endif
954 #ifndef STBI_NO_PIC
955 if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp);
956 #endif
957 #ifndef STBI_NO_PNM
958 if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp);
959 #endif
961 #ifndef STBI_NO_HDR
962 if (stbi__hdr_test(s)) {
963 float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
964 return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
966 #endif
968 #ifndef STBI_NO_TGA
969 // test tga last because it's a crappy test!
970 if (stbi__tga_test(s))
971 return stbi__tga_load(s,x,y,comp,req_comp);
972 #endif
974 return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
977 static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
979 unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);
981 if (stbi__vertically_flip_on_load && result != NULL) {
982 int w = *x, h = *y;
983 int depth = req_comp ? req_comp : *comp;
984 int row,col,z;
985 stbi_uc temp;
987 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
988 for (row = 0; row < (h>>1); row++) {
989 for (col = 0; col < w; col++) {
990 for (z = 0; z < depth; z++) {
991 temp = result[(row * w + col) * depth + z];
992 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
993 result[((h - row - 1) * w + col) * depth + z] = temp;
999 return result;
1002 #ifndef STBI_NO_HDR
1003 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
1005 if (stbi__vertically_flip_on_load && result != NULL) {
1006 int w = *x, h = *y;
1007 int depth = req_comp ? req_comp : *comp;
1008 int row,col,z;
1009 float temp;
1011 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
1012 for (row = 0; row < (h>>1); row++) {
1013 for (col = 0; col < w; col++) {
1014 for (z = 0; z < depth; z++) {
1015 temp = result[(row * w + col) * depth + z];
1016 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1017 result[((h - row - 1) * w + col) * depth + z] = temp;
1023 #endif
1025 #ifndef STBI_NO_STDIO
1027 static FILE *stbi__fopen(char const *filename, char const *mode)
1029 FILE *f;
1030 #if defined(_MSC_VER) && _MSC_VER >= 1400
1031 if (0 != fopen_s(&f, filename, mode))
1032 f=0;
1033 #else
1034 f = fopen(filename, mode);
1035 #endif
1036 return f;
1040 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1042 FILE *f = stbi__fopen(filename, "rb");
1043 unsigned char *result;
1044 if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1045 result = stbi_load_from_file(f,x,y,comp,req_comp);
1046 fclose(f);
1047 return result;
1050 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1052 unsigned char *result;
1053 stbi__context s;
1054 stbi__start_file(&s,f);
1055 result = stbi__load_flip(&s,x,y,comp,req_comp);
1056 if (result) {
1057 // need to 'unget' all the characters in the IO buffer
1058 fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1060 return result;
1062 #endif //!STBI_NO_STDIO
1064 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1066 stbi__context s;
1067 stbi__start_mem(&s,buffer,len);
1068 return stbi__load_flip(&s,x,y,comp,req_comp);
1071 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1073 stbi__context s;
1074 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1075 return stbi__load_flip(&s,x,y,comp,req_comp);
1078 #ifndef STBI_NO_LINEAR
1079 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1081 unsigned char *data;
1082 #ifndef STBI_NO_HDR
1083 if (stbi__hdr_test(s)) {
1084 float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
1085 if (hdr_data)
1086 stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1087 return hdr_data;
1089 #endif
1090 data = stbi__load_flip(s, x, y, comp, req_comp);
1091 if (data)
1092 return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1093 return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1096 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1098 stbi__context s;
1099 stbi__start_mem(&s,buffer,len);
1100 return stbi__loadf_main(&s,x,y,comp,req_comp);
1103 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1105 stbi__context s;
1106 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1107 return stbi__loadf_main(&s,x,y,comp,req_comp);
1110 #ifndef STBI_NO_STDIO
1111 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1113 float *result;
1114 FILE *f = stbi__fopen(filename, "rb");
1115 if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1116 result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1117 fclose(f);
1118 return result;
1121 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1123 stbi__context s;
1124 stbi__start_file(&s,f);
1125 return stbi__loadf_main(&s,x,y,comp,req_comp);
1127 #endif // !STBI_NO_STDIO
1129 #endif // !STBI_NO_LINEAR
1131 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
1132 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
1133 // reports false!
1135 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
1137 #ifndef STBI_NO_HDR
1138 stbi__context s;
1139 stbi__start_mem(&s,buffer,len);
1140 return stbi__hdr_test(&s);
1141 #else
1142 STBI_NOTUSED(buffer);
1143 STBI_NOTUSED(len);
1144 return 0;
1145 #endif
1148 #ifndef STBI_NO_STDIO
1149 STBIDEF int stbi_is_hdr (char const *filename)
1151 FILE *f = stbi__fopen(filename, "rb");
1152 int result=0;
1153 if (f) {
1154 result = stbi_is_hdr_from_file(f);
1155 fclose(f);
1157 return result;
1160 STBIDEF int stbi_is_hdr_from_file(FILE *f)
1162 #ifndef STBI_NO_HDR
1163 stbi__context s;
1164 stbi__start_file(&s,f);
1165 return stbi__hdr_test(&s);
1166 #else
1167 STBI_NOTUSED(f);
1168 return 0;
1169 #endif
1171 #endif // !STBI_NO_STDIO
1173 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1175 #ifndef STBI_NO_HDR
1176 stbi__context s;
1177 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1178 return stbi__hdr_test(&s);
1179 #else
1180 STBI_NOTUSED(clbk);
1181 STBI_NOTUSED(user);
1182 return 0;
1183 #endif
1186 #ifndef STBI_NO_LINEAR
1187 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1189 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
1190 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1191 #endif
1193 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1195 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
1196 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
1199 //////////////////////////////////////////////////////////////////////////////
1201 // Common code used by all image loaders
1204 enum
1206 STBI__SCAN_load=0,
1207 STBI__SCAN_type,
1208 STBI__SCAN_header
1211 static void stbi__refill_buffer(stbi__context *s)
1213 int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1214 if (n == 0) {
1215 // at end of file, treat same as if from memory, but need to handle case
1216 // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1217 s->read_from_callbacks = 0;
1218 s->img_buffer = s->buffer_start;
1219 s->img_buffer_end = s->buffer_start+1;
1220 *s->img_buffer = 0;
1221 } else {
1222 s->img_buffer = s->buffer_start;
1223 s->img_buffer_end = s->buffer_start + n;
1227 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1229 if (s->img_buffer < s->img_buffer_end)
1230 return *s->img_buffer++;
1231 if (s->read_from_callbacks) {
1232 stbi__refill_buffer(s);
1233 return *s->img_buffer++;
1235 return 0;
1238 stbi_inline static int stbi__at_eof(stbi__context *s)
1240 if (s->io.read) {
1241 if (!(s->io.eof)(s->io_user_data)) return 0;
1242 // if feof() is true, check if buffer = end
1243 // special case: we've only got the special 0 character at the end
1244 if (s->read_from_callbacks == 0) return 1;
1247 return s->img_buffer >= s->img_buffer_end;
1250 static void stbi__skip(stbi__context *s, int n)
1252 if (n < 0) {
1253 s->img_buffer = s->img_buffer_end;
1254 return;
1256 if (s->io.read) {
1257 int blen = (int) (s->img_buffer_end - s->img_buffer);
1258 if (blen < n) {
1259 s->img_buffer = s->img_buffer_end;
1260 (s->io.skip)(s->io_user_data, n - blen);
1261 return;
1264 s->img_buffer += n;
1267 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1269 if (s->io.read) {
1270 int blen = (int) (s->img_buffer_end - s->img_buffer);
1271 if (blen < n) {
1272 int res, count;
1274 memcpy(buffer, s->img_buffer, blen);
1276 count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1277 res = (count == (n-blen));
1278 s->img_buffer = s->img_buffer_end;
1279 return res;
1283 if (s->img_buffer+n <= s->img_buffer_end) {
1284 memcpy(buffer, s->img_buffer, n);
1285 s->img_buffer += n;
1286 return 1;
1287 } else
1288 return 0;
1291 static int stbi__get16be(stbi__context *s)
1293 int z = stbi__get8(s);
1294 return (z << 8) + stbi__get8(s);
1297 static stbi__uint32 stbi__get32be(stbi__context *s)
1299 stbi__uint32 z = stbi__get16be(s);
1300 return (z << 16) + stbi__get16be(s);
1303 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
1304 // nothing
1305 #else
1306 static int stbi__get16le(stbi__context *s)
1308 int z = stbi__get8(s);
1309 return z + (stbi__get8(s) << 8);
1311 #endif
1313 #ifndef STBI_NO_BMP
1314 static stbi__uint32 stbi__get32le(stbi__context *s)
1316 stbi__uint32 z = stbi__get16le(s);
1317 return z + (stbi__get16le(s) << 16);
1319 #endif
1321 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1324 //////////////////////////////////////////////////////////////////////////////
1326 // generic converter from built-in img_n to req_comp
1327 // individual types do this automatically as much as possible (e.g. jpeg
1328 // does all cases internally since it needs to colorspace convert anyway,
1329 // and it never has alpha, so very few cases ). png can automatically
1330 // interleave an alpha=255 channel, but falls back to this for other cases
1332 // assume data buffer is malloced, so malloc a new one and free that one
1333 // only failure mode is malloc failing
1335 static stbi_uc stbi__compute_y(int r, int g, int b)
1337 return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
1340 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1342 int i,j;
1343 unsigned char *good;
1345 if (req_comp == img_n) return data;
1346 STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1348 good = (unsigned char *) stbi__malloc(req_comp * x * y);
1349 if (good == NULL) {
1350 STBI_FREE(data);
1351 return stbi__errpuc("outofmem", "Out of memory");
1354 for (j=0; j < (int) y; ++j) {
1355 unsigned char *src = data + j * x * img_n ;
1356 unsigned char *dest = good + j * x * req_comp;
1358 #define COMBO(a,b) ((a)*8+(b))
1359 #define CASE(a,b) case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1360 // convert source image with img_n components to one with req_comp components;
1361 // avoid switch per pixel, so use switch per scanline and massive macros
1362 switch (COMBO(img_n, req_comp)) {
1363 CASE(1,2) dest[0]=src[0], dest[1]=255; break;
1364 CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1365 CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
1366 CASE(2,1) dest[0]=src[0]; break;
1367 CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1368 CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
1369 CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
1370 CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1371 CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
1372 CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1373 CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
1374 CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
1375 default: STBI_ASSERT(0);
1377 #undef CASE
1380 STBI_FREE(data);
1381 return good;
1384 #ifndef STBI_NO_LINEAR
1385 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1387 int i,k,n;
1388 float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
1389 if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1390 // compute number of non-alpha components
1391 if (comp & 1) n = comp; else n = comp-1;
1392 for (i=0; i < x*y; ++i) {
1393 for (k=0; k < n; ++k) {
1394 output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1396 if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
1398 STBI_FREE(data);
1399 return output;
1401 #endif
1403 #ifndef STBI_NO_HDR
1404 #define stbi__float2int(x) ((int) (x))
1405 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
1407 int i,k,n;
1408 stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
1409 if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1410 // compute number of non-alpha components
1411 if (comp & 1) n = comp; else n = comp-1;
1412 for (i=0; i < x*y; ++i) {
1413 for (k=0; k < n; ++k) {
1414 float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1415 if (z < 0) z = 0;
1416 if (z > 255) z = 255;
1417 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1419 if (k < comp) {
1420 float z = data[i*comp+k] * 255 + 0.5f;
1421 if (z < 0) z = 0;
1422 if (z > 255) z = 255;
1423 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1426 STBI_FREE(data);
1427 return output;
1429 #endif
1431 //////////////////////////////////////////////////////////////////////////////
1433 // "baseline" JPEG/JFIF decoder
1435 // simple implementation
1436 // - doesn't support delayed output of y-dimension
1437 // - simple interface (only one output format: 8-bit interleaved RGB)
1438 // - doesn't try to recover corrupt jpegs
1439 // - doesn't allow partial loading, loading multiple at once
1440 // - still fast on x86 (copying globals into locals doesn't help x86)
1441 // - allocates lots of intermediate memory (full size of all components)
1442 // - non-interleaved case requires this anyway
1443 // - allows good upsampling (see next)
1444 // high-quality
1445 // - upsampled channels are bilinearly interpolated, even across blocks
1446 // - quality integer IDCT derived from IJG's 'slow'
1447 // performance
1448 // - fast huffman; reasonable integer IDCT
1449 // - some SIMD kernels for common paths on targets with SSE2/NEON
1450 // - uses a lot of intermediate memory, could cache poorly
1452 #ifndef STBI_NO_JPEG
1454 // huffman decoding acceleration
1455 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1457 typedef struct
1459 stbi_uc fast[1 << FAST_BITS];
1460 // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1461 stbi__uint16 code[256];
1462 stbi_uc values[256];
1463 stbi_uc size[257];
1464 unsigned int maxcode[18];
1465 int delta[17]; // old 'firstsymbol' - old 'firstcode'
1466 } stbi__huffman;
1468 typedef struct
1470 stbi__context *s;
1471 stbi__huffman huff_dc[4];
1472 stbi__huffman huff_ac[4];
1473 stbi_uc dequant[4][64];
1474 stbi__int16 fast_ac[4][1 << FAST_BITS];
1476 // sizes for components, interleaved MCUs
1477 int img_h_max, img_v_max;
1478 int img_mcu_x, img_mcu_y;
1479 int img_mcu_w, img_mcu_h;
1481 // definition of jpeg image component
1482 struct
1484 int id;
1485 int h,v;
1486 int tq;
1487 int hd,ha;
1488 int dc_pred;
1490 int x,y,w2,h2;
1491 stbi_uc *data;
1492 void *raw_data, *raw_coeff;
1493 stbi_uc *linebuf;
1494 short *coeff; // progressive only
1495 int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1496 } img_comp[4];
1498 stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1499 int code_bits; // number of valid bits
1500 unsigned char marker; // marker seen while filling entropy buffer
1501 int nomore; // flag if we saw a marker so must stop
1503 int progressive;
1504 int spec_start;
1505 int spec_end;
1506 int succ_high;
1507 int succ_low;
1508 int eob_run;
1510 int scan_n, order[4];
1511 int restart_interval, todo;
1513 // kernels
1514 void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1515 void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1516 stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
1517 } stbi__jpeg;
1519 static int stbi__build_huffman(stbi__huffman *h, int *count)
1521 int i,j,k=0,code;
1522 // build size list for each symbol (from JPEG spec)
1523 for (i=0; i < 16; ++i)
1524 for (j=0; j < count[i]; ++j)
1525 h->size[k++] = (stbi_uc) (i+1);
1526 h->size[k] = 0;
1528 // compute actual symbols (from jpeg spec)
1529 code = 0;
1530 k = 0;
1531 for(j=1; j <= 16; ++j) {
1532 // compute delta to add to code to compute symbol id
1533 h->delta[j] = k - code;
1534 if (h->size[k] == j) {
1535 while (h->size[k] == j)
1536 h->code[k++] = (stbi__uint16) (code++);
1537 if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
1539 // compute largest code + 1 for this size, preshifted as needed later
1540 h->maxcode[j] = code << (16-j);
1541 code <<= 1;
1543 h->maxcode[j] = 0xffffffff;
1545 // build non-spec acceleration table; 255 is flag for not-accelerated
1546 memset(h->fast, 255, 1 << FAST_BITS);
1547 for (i=0; i < k; ++i) {
1548 int s = h->size[i];
1549 if (s <= FAST_BITS) {
1550 int c = h->code[i] << (FAST_BITS-s);
1551 int m = 1 << (FAST_BITS-s);
1552 for (j=0; j < m; ++j) {
1553 h->fast[c+j] = (stbi_uc) i;
1557 return 1;
1560 // build a table that decodes both magnitude and value of small ACs in
1561 // one go.
1562 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
1564 int i;
1565 for (i=0; i < (1 << FAST_BITS); ++i) {
1566 stbi_uc fast = h->fast[i];
1567 fast_ac[i] = 0;
1568 if (fast < 255) {
1569 int rs = h->values[fast];
1570 int run = (rs >> 4) & 15;
1571 int magbits = rs & 15;
1572 int len = h->size[fast];
1574 if (magbits && len + magbits <= FAST_BITS) {
1575 // magnitude code followed by receive_extend code
1576 int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
1577 int m = 1 << (magbits - 1);
1578 if (k < m) k += (-1 << magbits) + 1;
1579 // if the result is small enough, we can fit it in fast_ac table
1580 if (k >= -128 && k <= 127)
1581 fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
1587 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
1589 do {
1590 int b = j->nomore ? 0 : stbi__get8(j->s);
1591 if (b == 0xff) {
1592 int c = stbi__get8(j->s);
1593 if (c != 0) {
1594 j->marker = (unsigned char) c;
1595 j->nomore = 1;
1596 return;
1599 j->code_buffer |= b << (24 - j->code_bits);
1600 j->code_bits += 8;
1601 } while (j->code_bits <= 24);
1604 // (1 << n) - 1
1605 static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
1607 // decode a jpeg huffman value from the bitstream
1608 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
1610 unsigned int temp;
1611 int c,k;
1613 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1615 // look at the top FAST_BITS and determine what symbol ID it is,
1616 // if the code is <= FAST_BITS
1617 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1618 k = h->fast[c];
1619 if (k < 255) {
1620 int s = h->size[k];
1621 if (s > j->code_bits)
1622 return -1;
1623 j->code_buffer <<= s;
1624 j->code_bits -= s;
1625 return h->values[k];
1628 // naive test is to shift the code_buffer down so k bits are
1629 // valid, then test against maxcode. To speed this up, we've
1630 // preshifted maxcode left so that it has (16-k) 0s at the
1631 // end; in other words, regardless of the number of bits, it
1632 // wants to be compared against something shifted to have 16;
1633 // that way we don't need to shift inside the loop.
1634 temp = j->code_buffer >> 16;
1635 for (k=FAST_BITS+1 ; ; ++k)
1636 if (temp < h->maxcode[k])
1637 break;
1638 if (k == 17) {
1639 // error! code not found
1640 j->code_bits -= 16;
1641 return -1;
1644 if (k > j->code_bits)
1645 return -1;
1647 // convert the huffman code to the symbol id
1648 c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
1649 STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
1651 // convert the id to a symbol
1652 j->code_bits -= k;
1653 j->code_buffer <<= k;
1654 return h->values[c];
1657 // bias[n] = (-1<<n) + 1
1658 static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
1660 // combined JPEG 'receive' and JPEG 'extend', since baseline
1661 // always extends everything it receives.
1662 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
1664 unsigned int k;
1665 int sgn;
1666 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1668 sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
1669 k = stbi_lrot(j->code_buffer, n);
1670 STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
1671 j->code_buffer = k & ~stbi__bmask[n];
1672 k &= stbi__bmask[n];
1673 j->code_bits -= n;
1674 return k + (stbi__jbias[n] & ~sgn);
1677 // get some unsigned bits
1678 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
1680 unsigned int k;
1681 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1682 k = stbi_lrot(j->code_buffer, n);
1683 j->code_buffer = k & ~stbi__bmask[n];
1684 k &= stbi__bmask[n];
1685 j->code_bits -= n;
1686 return k;
1689 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
1691 unsigned int k;
1692 if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
1693 k = j->code_buffer;
1694 j->code_buffer <<= 1;
1695 --j->code_bits;
1696 return k & 0x80000000;
1699 // given a value that's at position X in the zigzag stream,
1700 // where does it appear in the 8x8 matrix coded as row-major?
1701 static stbi_uc stbi__jpeg_dezigzag[64+15] =
1703 0, 1, 8, 16, 9, 2, 3, 10,
1704 17, 24, 32, 25, 18, 11, 4, 5,
1705 12, 19, 26, 33, 40, 48, 41, 34,
1706 27, 20, 13, 6, 7, 14, 21, 28,
1707 35, 42, 49, 56, 57, 50, 43, 36,
1708 29, 22, 15, 23, 30, 37, 44, 51,
1709 58, 59, 52, 45, 38, 31, 39, 46,
1710 53, 60, 61, 54, 47, 55, 62, 63,
1711 // let corrupt input sample past end
1712 63, 63, 63, 63, 63, 63, 63, 63,
1713 63, 63, 63, 63, 63, 63, 63
1716 // decode one 64-entry block--
1717 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
1719 int diff,dc,k;
1720 int t;
1722 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1723 t = stbi__jpeg_huff_decode(j, hdc);
1724 if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1726 // 0 all the ac values now so we can do it 32-bits at a time
1727 memset(data,0,64*sizeof(data[0]));
1729 diff = t ? stbi__extend_receive(j, t) : 0;
1730 dc = j->img_comp[b].dc_pred + diff;
1731 j->img_comp[b].dc_pred = dc;
1732 data[0] = (short) (dc * dequant[0]);
1734 // decode AC components, see JPEG spec
1735 k = 1;
1736 do {
1737 unsigned int zig;
1738 int c,r,s;
1739 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1740 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1741 r = fac[c];
1742 if (r) { // fast-AC path
1743 k += (r >> 4) & 15; // run
1744 s = r & 15; // combined length
1745 j->code_buffer <<= s;
1746 j->code_bits -= s;
1747 // decode into unzigzag'd location
1748 zig = stbi__jpeg_dezigzag[k++];
1749 data[zig] = (short) ((r >> 8) * dequant[zig]);
1750 } else {
1751 int rs = stbi__jpeg_huff_decode(j, hac);
1752 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1753 s = rs & 15;
1754 r = rs >> 4;
1755 if (s == 0) {
1756 if (rs != 0xf0) break; // end block
1757 k += 16;
1758 } else {
1759 k += r;
1760 // decode into unzigzag'd location
1761 zig = stbi__jpeg_dezigzag[k++];
1762 data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
1765 } while (k < 64);
1766 return 1;
1769 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
1771 int diff,dc;
1772 int t;
1773 if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1775 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1777 if (j->succ_high == 0) {
1778 // first scan for DC coefficient, must be first
1779 memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
1780 t = stbi__jpeg_huff_decode(j, hdc);
1781 diff = t ? stbi__extend_receive(j, t) : 0;
1783 dc = j->img_comp[b].dc_pred + diff;
1784 j->img_comp[b].dc_pred = dc;
1785 data[0] = (short) (dc << j->succ_low);
1786 } else {
1787 // refinement scan for DC coefficient
1788 if (stbi__jpeg_get_bit(j))
1789 data[0] += (short) (1 << j->succ_low);
1791 return 1;
1794 // @OPTIMIZE: store non-zigzagged during the decode passes,
1795 // and only de-zigzag when dequantizing
1796 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
1798 int k;
1799 if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1801 if (j->succ_high == 0) {
1802 int shift = j->succ_low;
1804 if (j->eob_run) {
1805 --j->eob_run;
1806 return 1;
1809 k = j->spec_start;
1810 do {
1811 unsigned int zig;
1812 int c,r,s;
1813 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1814 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1815 r = fac[c];
1816 if (r) { // fast-AC path
1817 k += (r >> 4) & 15; // run
1818 s = r & 15; // combined length
1819 j->code_buffer <<= s;
1820 j->code_bits -= s;
1821 zig = stbi__jpeg_dezigzag[k++];
1822 data[zig] = (short) ((r >> 8) << shift);
1823 } else {
1824 int rs = stbi__jpeg_huff_decode(j, hac);
1825 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1826 s = rs & 15;
1827 r = rs >> 4;
1828 if (s == 0) {
1829 if (r < 15) {
1830 j->eob_run = (1 << r);
1831 if (r)
1832 j->eob_run += stbi__jpeg_get_bits(j, r);
1833 --j->eob_run;
1834 break;
1836 k += 16;
1837 } else {
1838 k += r;
1839 zig = stbi__jpeg_dezigzag[k++];
1840 data[zig] = (short) (stbi__extend_receive(j,s) << shift);
1843 } while (k <= j->spec_end);
1844 } else {
1845 // refinement scan for these AC coefficients
1847 short bit = (short) (1 << j->succ_low);
1849 if (j->eob_run) {
1850 --j->eob_run;
1851 for (k = j->spec_start; k <= j->spec_end; ++k) {
1852 short *p = &data[stbi__jpeg_dezigzag[k]];
1853 if (*p != 0)
1854 if (stbi__jpeg_get_bit(j))
1855 if ((*p & bit)==0) {
1856 if (*p > 0)
1857 *p += bit;
1858 else
1859 *p -= bit;
1862 } else {
1863 k = j->spec_start;
1864 do {
1865 int r,s;
1866 int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
1867 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1868 s = rs & 15;
1869 r = rs >> 4;
1870 if (s == 0) {
1871 if (r < 15) {
1872 j->eob_run = (1 << r) - 1;
1873 if (r)
1874 j->eob_run += stbi__jpeg_get_bits(j, r);
1875 r = 64; // force end of block
1876 } else {
1877 // r=15 s=0 should write 16 0s, so we just do
1878 // a run of 15 0s and then write s (which is 0),
1879 // so we don't have to do anything special here
1881 } else {
1882 if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
1883 // sign bit
1884 if (stbi__jpeg_get_bit(j))
1885 s = bit;
1886 else
1887 s = -bit;
1890 // advance by r
1891 while (k <= j->spec_end) {
1892 short *p = &data[stbi__jpeg_dezigzag[k++]];
1893 if (*p != 0) {
1894 if (stbi__jpeg_get_bit(j))
1895 if ((*p & bit)==0) {
1896 if (*p > 0)
1897 *p += bit;
1898 else
1899 *p -= bit;
1901 } else {
1902 if (r == 0) {
1903 *p = (short) s;
1904 break;
1906 --r;
1909 } while (k <= j->spec_end);
1912 return 1;
1915 // take a -128..127 value and stbi__clamp it and convert to 0..255
1916 stbi_inline static stbi_uc stbi__clamp(int x)
1918 // trick to use a single test to catch both cases
1919 if ((unsigned int) x > 255) {
1920 if (x < 0) return 0;
1921 if (x > 255) return 255;
1923 return (stbi_uc) x;
1926 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
1927 #define stbi__fsh(x) ((x) << 12)
1929 // derived from jidctint -- DCT_ISLOW
1930 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
1931 int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
1932 p2 = s2; \
1933 p3 = s6; \
1934 p1 = (p2+p3) * stbi__f2f(0.5411961f); \
1935 t2 = p1 + p3*stbi__f2f(-1.847759065f); \
1936 t3 = p1 + p2*stbi__f2f( 0.765366865f); \
1937 p2 = s0; \
1938 p3 = s4; \
1939 t0 = stbi__fsh(p2+p3); \
1940 t1 = stbi__fsh(p2-p3); \
1941 x0 = t0+t3; \
1942 x3 = t0-t3; \
1943 x1 = t1+t2; \
1944 x2 = t1-t2; \
1945 t0 = s7; \
1946 t1 = s5; \
1947 t2 = s3; \
1948 t3 = s1; \
1949 p3 = t0+t2; \
1950 p4 = t1+t3; \
1951 p1 = t0+t3; \
1952 p2 = t1+t2; \
1953 p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
1954 t0 = t0*stbi__f2f( 0.298631336f); \
1955 t1 = t1*stbi__f2f( 2.053119869f); \
1956 t2 = t2*stbi__f2f( 3.072711026f); \
1957 t3 = t3*stbi__f2f( 1.501321110f); \
1958 p1 = p5 + p1*stbi__f2f(-0.899976223f); \
1959 p2 = p5 + p2*stbi__f2f(-2.562915447f); \
1960 p3 = p3*stbi__f2f(-1.961570560f); \
1961 p4 = p4*stbi__f2f(-0.390180644f); \
1962 t3 += p1+p4; \
1963 t2 += p2+p3; \
1964 t1 += p2+p4; \
1965 t0 += p1+p3;
1967 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
1969 int i,val[64],*v=val;
1970 stbi_uc *o;
1971 short *d = data;
1973 // columns
1974 for (i=0; i < 8; ++i,++d, ++v) {
1975 // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
1976 if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
1977 && d[40]==0 && d[48]==0 && d[56]==0) {
1978 // no shortcut 0 seconds
1979 // (1|2|3|4|5|6|7)==0 0 seconds
1980 // all separate -0.047 seconds
1981 // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
1982 int dcterm = d[0] << 2;
1983 v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
1984 } else {
1985 STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
1986 // constants scaled things up by 1<<12; let's bring them back
1987 // down, but keep 2 extra bits of precision
1988 x0 += 512; x1 += 512; x2 += 512; x3 += 512;
1989 v[ 0] = (x0+t3) >> 10;
1990 v[56] = (x0-t3) >> 10;
1991 v[ 8] = (x1+t2) >> 10;
1992 v[48] = (x1-t2) >> 10;
1993 v[16] = (x2+t1) >> 10;
1994 v[40] = (x2-t1) >> 10;
1995 v[24] = (x3+t0) >> 10;
1996 v[32] = (x3-t0) >> 10;
2000 for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
2001 // no fast case since the first 1D IDCT spread components out
2002 STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
2003 // constants scaled things up by 1<<12, plus we had 1<<2 from first
2004 // loop, plus horizontal and vertical each scale by sqrt(8) so together
2005 // we've got an extra 1<<3, so 1<<17 total we need to remove.
2006 // so we want to round that, which means adding 0.5 * 1<<17,
2007 // aka 65536. Also, we'll end up with -128 to 127 that we want
2008 // to encode as 0..255 by adding 128, so we'll add that before the shift
2009 x0 += 65536 + (128<<17);
2010 x1 += 65536 + (128<<17);
2011 x2 += 65536 + (128<<17);
2012 x3 += 65536 + (128<<17);
2013 // tried computing the shifts into temps, or'ing the temps to see
2014 // if any were out of range, but that was slower
2015 o[0] = stbi__clamp((x0+t3) >> 17);
2016 o[7] = stbi__clamp((x0-t3) >> 17);
2017 o[1] = stbi__clamp((x1+t2) >> 17);
2018 o[6] = stbi__clamp((x1-t2) >> 17);
2019 o[2] = stbi__clamp((x2+t1) >> 17);
2020 o[5] = stbi__clamp((x2-t1) >> 17);
2021 o[3] = stbi__clamp((x3+t0) >> 17);
2022 o[4] = stbi__clamp((x3-t0) >> 17);
2026 #ifdef STBI_SSE2
2027 // sse2 integer IDCT. not the fastest possible implementation but it
2028 // produces bit-identical results to the generic C version so it's
2029 // fully "transparent".
2030 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2032 // This is constructed to match our regular (generic) integer IDCT exactly.
2033 __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2034 __m128i tmp;
2036 // dot product constant: even elems=x, odd elems=y
2037 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2039 // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2040 // out(1) = c1[even]*x + c1[odd]*y
2041 #define dct_rot(out0,out1, x,y,c0,c1) \
2042 __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2043 __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2044 __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2045 __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2046 __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2047 __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2049 // out = in << 12 (in 16-bit, out 32-bit)
2050 #define dct_widen(out, in) \
2051 __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2052 __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2054 // wide add
2055 #define dct_wadd(out, a, b) \
2056 __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2057 __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2059 // wide sub
2060 #define dct_wsub(out, a, b) \
2061 __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2062 __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2064 // butterfly a/b, add bias, then shift by "s" and pack
2065 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2067 __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2068 __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2069 dct_wadd(sum, abiased, b); \
2070 dct_wsub(dif, abiased, b); \
2071 out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2072 out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2075 // 8-bit interleave step (for transposes)
2076 #define dct_interleave8(a, b) \
2077 tmp = a; \
2078 a = _mm_unpacklo_epi8(a, b); \
2079 b = _mm_unpackhi_epi8(tmp, b)
2081 // 16-bit interleave step (for transposes)
2082 #define dct_interleave16(a, b) \
2083 tmp = a; \
2084 a = _mm_unpacklo_epi16(a, b); \
2085 b = _mm_unpackhi_epi16(tmp, b)
2087 #define dct_pass(bias,shift) \
2089 /* even part */ \
2090 dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2091 __m128i sum04 = _mm_add_epi16(row0, row4); \
2092 __m128i dif04 = _mm_sub_epi16(row0, row4); \
2093 dct_widen(t0e, sum04); \
2094 dct_widen(t1e, dif04); \
2095 dct_wadd(x0, t0e, t3e); \
2096 dct_wsub(x3, t0e, t3e); \
2097 dct_wadd(x1, t1e, t2e); \
2098 dct_wsub(x2, t1e, t2e); \
2099 /* odd part */ \
2100 dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2101 dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2102 __m128i sum17 = _mm_add_epi16(row1, row7); \
2103 __m128i sum35 = _mm_add_epi16(row3, row5); \
2104 dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2105 dct_wadd(x4, y0o, y4o); \
2106 dct_wadd(x5, y1o, y5o); \
2107 dct_wadd(x6, y2o, y5o); \
2108 dct_wadd(x7, y3o, y4o); \
2109 dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2110 dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2111 dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2112 dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2115 __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2116 __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2117 __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2118 __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2119 __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2120 __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2121 __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2122 __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2124 // rounding biases in column/row passes, see stbi__idct_block for explanation.
2125 __m128i bias_0 = _mm_set1_epi32(512);
2126 __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2128 // load
2129 row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2130 row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2131 row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2132 row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2133 row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2134 row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2135 row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2136 row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2138 // column pass
2139 dct_pass(bias_0, 10);
2142 // 16bit 8x8 transpose pass 1
2143 dct_interleave16(row0, row4);
2144 dct_interleave16(row1, row5);
2145 dct_interleave16(row2, row6);
2146 dct_interleave16(row3, row7);
2148 // transpose pass 2
2149 dct_interleave16(row0, row2);
2150 dct_interleave16(row1, row3);
2151 dct_interleave16(row4, row6);
2152 dct_interleave16(row5, row7);
2154 // transpose pass 3
2155 dct_interleave16(row0, row1);
2156 dct_interleave16(row2, row3);
2157 dct_interleave16(row4, row5);
2158 dct_interleave16(row6, row7);
2161 // row pass
2162 dct_pass(bias_1, 17);
2165 // pack
2166 __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2167 __m128i p1 = _mm_packus_epi16(row2, row3);
2168 __m128i p2 = _mm_packus_epi16(row4, row5);
2169 __m128i p3 = _mm_packus_epi16(row6, row7);
2171 // 8bit 8x8 transpose pass 1
2172 dct_interleave8(p0, p2); // a0e0a1e1...
2173 dct_interleave8(p1, p3); // c0g0c1g1...
2175 // transpose pass 2
2176 dct_interleave8(p0, p1); // a0c0e0g0...
2177 dct_interleave8(p2, p3); // b0d0f0h0...
2179 // transpose pass 3
2180 dct_interleave8(p0, p2); // a0b0c0d0...
2181 dct_interleave8(p1, p3); // a4b4c4d4...
2183 // store
2184 _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2185 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2186 _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2187 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2188 _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2189 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2190 _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2191 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2194 #undef dct_const
2195 #undef dct_rot
2196 #undef dct_widen
2197 #undef dct_wadd
2198 #undef dct_wsub
2199 #undef dct_bfly32o
2200 #undef dct_interleave8
2201 #undef dct_interleave16
2202 #undef dct_pass
2205 #endif // STBI_SSE2
2207 #ifdef STBI_NEON
2209 // NEON integer IDCT. should produce bit-identical
2210 // results to the generic C version.
2211 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2213 int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2215 int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2216 int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2217 int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2218 int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2219 int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2220 int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2221 int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2222 int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2223 int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2224 int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2225 int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2226 int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2228 #define dct_long_mul(out, inq, coeff) \
2229 int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2230 int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2232 #define dct_long_mac(out, acc, inq, coeff) \
2233 int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2234 int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2236 #define dct_widen(out, inq) \
2237 int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2238 int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2240 // wide add
2241 #define dct_wadd(out, a, b) \
2242 int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2243 int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2245 // wide sub
2246 #define dct_wsub(out, a, b) \
2247 int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2248 int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2250 // butterfly a/b, then shift using "shiftop" by "s" and pack
2251 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2253 dct_wadd(sum, a, b); \
2254 dct_wsub(dif, a, b); \
2255 out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2256 out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2259 #define dct_pass(shiftop, shift) \
2261 /* even part */ \
2262 int16x8_t sum26 = vaddq_s16(row2, row6); \
2263 dct_long_mul(p1e, sum26, rot0_0); \
2264 dct_long_mac(t2e, p1e, row6, rot0_1); \
2265 dct_long_mac(t3e, p1e, row2, rot0_2); \
2266 int16x8_t sum04 = vaddq_s16(row0, row4); \
2267 int16x8_t dif04 = vsubq_s16(row0, row4); \
2268 dct_widen(t0e, sum04); \
2269 dct_widen(t1e, dif04); \
2270 dct_wadd(x0, t0e, t3e); \
2271 dct_wsub(x3, t0e, t3e); \
2272 dct_wadd(x1, t1e, t2e); \
2273 dct_wsub(x2, t1e, t2e); \
2274 /* odd part */ \
2275 int16x8_t sum15 = vaddq_s16(row1, row5); \
2276 int16x8_t sum17 = vaddq_s16(row1, row7); \
2277 int16x8_t sum35 = vaddq_s16(row3, row5); \
2278 int16x8_t sum37 = vaddq_s16(row3, row7); \
2279 int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2280 dct_long_mul(p5o, sumodd, rot1_0); \
2281 dct_long_mac(p1o, p5o, sum17, rot1_1); \
2282 dct_long_mac(p2o, p5o, sum35, rot1_2); \
2283 dct_long_mul(p3o, sum37, rot2_0); \
2284 dct_long_mul(p4o, sum15, rot2_1); \
2285 dct_wadd(sump13o, p1o, p3o); \
2286 dct_wadd(sump24o, p2o, p4o); \
2287 dct_wadd(sump23o, p2o, p3o); \
2288 dct_wadd(sump14o, p1o, p4o); \
2289 dct_long_mac(x4, sump13o, row7, rot3_0); \
2290 dct_long_mac(x5, sump24o, row5, rot3_1); \
2291 dct_long_mac(x6, sump23o, row3, rot3_2); \
2292 dct_long_mac(x7, sump14o, row1, rot3_3); \
2293 dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2294 dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2295 dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2296 dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2299 // load
2300 row0 = vld1q_s16(data + 0*8);
2301 row1 = vld1q_s16(data + 1*8);
2302 row2 = vld1q_s16(data + 2*8);
2303 row3 = vld1q_s16(data + 3*8);
2304 row4 = vld1q_s16(data + 4*8);
2305 row5 = vld1q_s16(data + 5*8);
2306 row6 = vld1q_s16(data + 6*8);
2307 row7 = vld1q_s16(data + 7*8);
2309 // add DC bias
2310 row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2312 // column pass
2313 dct_pass(vrshrn_n_s32, 10);
2315 // 16bit 8x8 transpose
2317 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2318 // whether compilers actually get this is another story, sadly.
2319 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2320 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2321 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2323 // pass 1
2324 dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2325 dct_trn16(row2, row3);
2326 dct_trn16(row4, row5);
2327 dct_trn16(row6, row7);
2329 // pass 2
2330 dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2331 dct_trn32(row1, row3);
2332 dct_trn32(row4, row6);
2333 dct_trn32(row5, row7);
2335 // pass 3
2336 dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2337 dct_trn64(row1, row5);
2338 dct_trn64(row2, row6);
2339 dct_trn64(row3, row7);
2341 #undef dct_trn16
2342 #undef dct_trn32
2343 #undef dct_trn64
2346 // row pass
2347 // vrshrn_n_s32 only supports shifts up to 16, we need
2348 // 17. so do a non-rounding shift of 16 first then follow
2349 // up with a rounding shift by 1.
2350 dct_pass(vshrn_n_s32, 16);
2353 // pack and round
2354 uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2355 uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2356 uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2357 uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2358 uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2359 uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2360 uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2361 uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2363 // again, these can translate into one instruction, but often don't.
2364 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2365 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2366 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2368 // sadly can't use interleaved stores here since we only write
2369 // 8 bytes to each scan line!
2371 // 8x8 8-bit transpose pass 1
2372 dct_trn8_8(p0, p1);
2373 dct_trn8_8(p2, p3);
2374 dct_trn8_8(p4, p5);
2375 dct_trn8_8(p6, p7);
2377 // pass 2
2378 dct_trn8_16(p0, p2);
2379 dct_trn8_16(p1, p3);
2380 dct_trn8_16(p4, p6);
2381 dct_trn8_16(p5, p7);
2383 // pass 3
2384 dct_trn8_32(p0, p4);
2385 dct_trn8_32(p1, p5);
2386 dct_trn8_32(p2, p6);
2387 dct_trn8_32(p3, p7);
2389 // store
2390 vst1_u8(out, p0); out += out_stride;
2391 vst1_u8(out, p1); out += out_stride;
2392 vst1_u8(out, p2); out += out_stride;
2393 vst1_u8(out, p3); out += out_stride;
2394 vst1_u8(out, p4); out += out_stride;
2395 vst1_u8(out, p5); out += out_stride;
2396 vst1_u8(out, p6); out += out_stride;
2397 vst1_u8(out, p7);
2399 #undef dct_trn8_8
2400 #undef dct_trn8_16
2401 #undef dct_trn8_32
2404 #undef dct_long_mul
2405 #undef dct_long_mac
2406 #undef dct_widen
2407 #undef dct_wadd
2408 #undef dct_wsub
2409 #undef dct_bfly32o
2410 #undef dct_pass
2413 #endif // STBI_NEON
2415 #define STBI__MARKER_none 0xff
2416 // if there's a pending marker from the entropy stream, return that
2417 // otherwise, fetch from the stream and get a marker. if there's no
2418 // marker, return 0xff, which is never a valid marker value
2419 static stbi_uc stbi__get_marker(stbi__jpeg *j)
2421 stbi_uc x;
2422 if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2423 x = stbi__get8(j->s);
2424 if (x != 0xff) return STBI__MARKER_none;
2425 while (x == 0xff)
2426 x = stbi__get8(j->s);
2427 return x;
2430 // in each scan, we'll have scan_n components, and the order
2431 // of the components is specified by order[]
2432 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2434 // after a restart interval, stbi__jpeg_reset the entropy decoder and
2435 // the dc prediction
2436 static void stbi__jpeg_reset(stbi__jpeg *j)
2438 j->code_bits = 0;
2439 j->code_buffer = 0;
2440 j->nomore = 0;
2441 j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
2442 j->marker = STBI__MARKER_none;
2443 j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
2444 j->eob_run = 0;
2445 // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
2446 // since we don't even allow 1<<30 pixels
2449 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2451 stbi__jpeg_reset(z);
2452 if (!z->progressive) {
2453 if (z->scan_n == 1) {
2454 int i,j;
2455 STBI_SIMD_ALIGN(short, data[64]);
2456 int n = z->order[0];
2457 // non-interleaved data, we just need to process one block at a time,
2458 // in trivial scanline order
2459 // number of blocks to do just depends on how many actual "pixels" this
2460 // component has, independent of interleaved MCU blocking and such
2461 int w = (z->img_comp[n].x+7) >> 3;
2462 int h = (z->img_comp[n].y+7) >> 3;
2463 for (j=0; j < h; ++j) {
2464 for (i=0; i < w; ++i) {
2465 int ha = z->img_comp[n].ha;
2466 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2467 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2468 // every data block is an MCU, so countdown the restart interval
2469 if (--z->todo <= 0) {
2470 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2471 // if it's NOT a restart, then just bail, so we get corrupt data
2472 // rather than no data
2473 if (!STBI__RESTART(z->marker)) return 1;
2474 stbi__jpeg_reset(z);
2478 return 1;
2479 } else { // interleaved
2480 int i,j,k,x,y;
2481 STBI_SIMD_ALIGN(short, data[64]);
2482 for (j=0; j < z->img_mcu_y; ++j) {
2483 for (i=0; i < z->img_mcu_x; ++i) {
2484 // scan an interleaved mcu... process scan_n components in order
2485 for (k=0; k < z->scan_n; ++k) {
2486 int n = z->order[k];
2487 // scan out an mcu's worth of this component; that's just determined
2488 // by the basic H and V specified for the component
2489 for (y=0; y < z->img_comp[n].v; ++y) {
2490 for (x=0; x < z->img_comp[n].h; ++x) {
2491 int x2 = (i*z->img_comp[n].h + x)*8;
2492 int y2 = (j*z->img_comp[n].v + y)*8;
2493 int ha = z->img_comp[n].ha;
2494 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2495 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2499 // after all interleaved components, that's an interleaved MCU,
2500 // so now count down the restart interval
2501 if (--z->todo <= 0) {
2502 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2503 if (!STBI__RESTART(z->marker)) return 1;
2504 stbi__jpeg_reset(z);
2508 return 1;
2510 } else {
2511 if (z->scan_n == 1) {
2512 int i,j;
2513 int n = z->order[0];
2514 // non-interleaved data, we just need to process one block at a time,
2515 // in trivial scanline order
2516 // number of blocks to do just depends on how many actual "pixels" this
2517 // component has, independent of interleaved MCU blocking and such
2518 int w = (z->img_comp[n].x+7) >> 3;
2519 int h = (z->img_comp[n].y+7) >> 3;
2520 for (j=0; j < h; ++j) {
2521 for (i=0; i < w; ++i) {
2522 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2523 if (z->spec_start == 0) {
2524 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2525 return 0;
2526 } else {
2527 int ha = z->img_comp[n].ha;
2528 if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2529 return 0;
2531 // every data block is an MCU, so countdown the restart interval
2532 if (--z->todo <= 0) {
2533 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2534 if (!STBI__RESTART(z->marker)) return 1;
2535 stbi__jpeg_reset(z);
2539 return 1;
2540 } else { // interleaved
2541 int i,j,k,x,y;
2542 for (j=0; j < z->img_mcu_y; ++j) {
2543 for (i=0; i < z->img_mcu_x; ++i) {
2544 // scan an interleaved mcu... process scan_n components in order
2545 for (k=0; k < z->scan_n; ++k) {
2546 int n = z->order[k];
2547 // scan out an mcu's worth of this component; that's just determined
2548 // by the basic H and V specified for the component
2549 for (y=0; y < z->img_comp[n].v; ++y) {
2550 for (x=0; x < z->img_comp[n].h; ++x) {
2551 int x2 = (i*z->img_comp[n].h + x);
2552 int y2 = (j*z->img_comp[n].v + y);
2553 short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2554 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2555 return 0;
2559 // after all interleaved components, that's an interleaved MCU,
2560 // so now count down the restart interval
2561 if (--z->todo <= 0) {
2562 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2563 if (!STBI__RESTART(z->marker)) return 1;
2564 stbi__jpeg_reset(z);
2568 return 1;
2573 static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
2575 int i;
2576 for (i=0; i < 64; ++i)
2577 data[i] *= dequant[i];
2580 static void stbi__jpeg_finish(stbi__jpeg *z)
2582 if (z->progressive) {
2583 // dequantize and idct the data
2584 int i,j,n;
2585 for (n=0; n < z->s->img_n; ++n) {
2586 int w = (z->img_comp[n].x+7) >> 3;
2587 int h = (z->img_comp[n].y+7) >> 3;
2588 for (j=0; j < h; ++j) {
2589 for (i=0; i < w; ++i) {
2590 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2591 stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
2592 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2599 static int stbi__process_marker(stbi__jpeg *z, int m)
2601 int L;
2602 switch (m) {
2603 case STBI__MARKER_none: // no marker found
2604 return stbi__err("expected marker","Corrupt JPEG");
2606 case 0xDD: // DRI - specify restart interval
2607 if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
2608 z->restart_interval = stbi__get16be(z->s);
2609 return 1;
2611 case 0xDB: // DQT - define quantization table
2612 L = stbi__get16be(z->s)-2;
2613 while (L > 0) {
2614 int q = stbi__get8(z->s);
2615 int p = q >> 4;
2616 int t = q & 15,i;
2617 if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
2618 if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
2619 for (i=0; i < 64; ++i)
2620 z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
2621 L -= 65;
2623 return L==0;
2625 case 0xC4: // DHT - define huffman table
2626 L = stbi__get16be(z->s)-2;
2627 while (L > 0) {
2628 stbi_uc *v;
2629 int sizes[16],i,n=0;
2630 int q = stbi__get8(z->s);
2631 int tc = q >> 4;
2632 int th = q & 15;
2633 if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
2634 for (i=0; i < 16; ++i) {
2635 sizes[i] = stbi__get8(z->s);
2636 n += sizes[i];
2638 L -= 17;
2639 if (tc == 0) {
2640 if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
2641 v = z->huff_dc[th].values;
2642 } else {
2643 if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
2644 v = z->huff_ac[th].values;
2646 for (i=0; i < n; ++i)
2647 v[i] = stbi__get8(z->s);
2648 if (tc != 0)
2649 stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
2650 L -= n;
2652 return L==0;
2654 // check for comment block or APP blocks
2655 if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
2656 stbi__skip(z->s, stbi__get16be(z->s)-2);
2657 return 1;
2659 return 0;
2662 // after we see SOS
2663 static int stbi__process_scan_header(stbi__jpeg *z)
2665 int i;
2666 int Ls = stbi__get16be(z->s);
2667 z->scan_n = stbi__get8(z->s);
2668 if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
2669 if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
2670 for (i=0; i < z->scan_n; ++i) {
2671 int id = stbi__get8(z->s), which;
2672 int q = stbi__get8(z->s);
2673 for (which = 0; which < z->s->img_n; ++which)
2674 if (z->img_comp[which].id == id)
2675 break;
2676 if (which == z->s->img_n) return 0; // no match
2677 z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
2678 z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
2679 z->order[i] = which;
2683 int aa;
2684 z->spec_start = stbi__get8(z->s);
2685 z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
2686 aa = stbi__get8(z->s);
2687 z->succ_high = (aa >> 4);
2688 z->succ_low = (aa & 15);
2689 if (z->progressive) {
2690 if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
2691 return stbi__err("bad SOS", "Corrupt JPEG");
2692 } else {
2693 if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
2694 if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
2695 z->spec_end = 63;
2699 return 1;
2702 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
2704 stbi__context *s = z->s;
2705 int Lf,p,i,q, h_max=1,v_max=1,c;
2706 Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
2707 p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
2708 s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
2709 s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
2710 c = stbi__get8(s);
2711 if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG"); // JFIF requires
2712 s->img_n = c;
2713 for (i=0; i < c; ++i) {
2714 z->img_comp[i].data = NULL;
2715 z->img_comp[i].linebuf = NULL;
2718 if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
2720 for (i=0; i < s->img_n; ++i) {
2721 z->img_comp[i].id = stbi__get8(s);
2722 if (z->img_comp[i].id != i+1) // JFIF requires
2723 if (z->img_comp[i].id != i) // some version of jpegtran outputs non-JFIF-compliant files!
2724 return stbi__err("bad component ID","Corrupt JPEG");
2725 q = stbi__get8(s);
2726 z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
2727 z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
2728 z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
2731 if (scan != STBI__SCAN_load) return 1;
2733 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
2735 for (i=0; i < s->img_n; ++i) {
2736 if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
2737 if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
2740 // compute interleaved mcu info
2741 z->img_h_max = h_max;
2742 z->img_v_max = v_max;
2743 z->img_mcu_w = h_max * 8;
2744 z->img_mcu_h = v_max * 8;
2745 z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
2746 z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
2748 for (i=0; i < s->img_n; ++i) {
2749 // number of effective pixels (e.g. for non-interleaved MCU)
2750 z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
2751 z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
2752 // to simplify generation, we'll allocate enough memory to decode
2753 // the bogus oversized data from using interleaved MCUs and their
2754 // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
2755 // discard the extra data until colorspace conversion
2756 z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
2757 z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
2758 z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
2760 if (z->img_comp[i].raw_data == NULL) {
2761 for(--i; i >= 0; --i) {
2762 STBI_FREE(z->img_comp[i].raw_data);
2763 z->img_comp[i].raw_data = NULL;
2765 return stbi__err("outofmem", "Out of memory");
2767 // align blocks for idct using mmx/sse
2768 z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
2769 z->img_comp[i].linebuf = NULL;
2770 if (z->progressive) {
2771 z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
2772 z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
2773 z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
2774 z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
2775 } else {
2776 z->img_comp[i].coeff = 0;
2777 z->img_comp[i].raw_coeff = 0;
2781 return 1;
2784 // use comparisons since in some cases we handle more than one case (e.g. SOF)
2785 #define stbi__DNL(x) ((x) == 0xdc)
2786 #define stbi__SOI(x) ((x) == 0xd8)
2787 #define stbi__EOI(x) ((x) == 0xd9)
2788 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
2789 #define stbi__SOS(x) ((x) == 0xda)
2791 #define stbi__SOF_progressive(x) ((x) == 0xc2)
2793 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
2795 int m;
2796 z->marker = STBI__MARKER_none; // initialize cached marker to empty
2797 m = stbi__get_marker(z);
2798 if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
2799 if (scan == STBI__SCAN_type) return 1;
2800 m = stbi__get_marker(z);
2801 while (!stbi__SOF(m)) {
2802 if (!stbi__process_marker(z,m)) return 0;
2803 m = stbi__get_marker(z);
2804 while (m == STBI__MARKER_none) {
2805 // some files have extra padding after their blocks, so ok, we'll scan
2806 if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
2807 m = stbi__get_marker(z);
2810 z->progressive = stbi__SOF_progressive(m);
2811 if (!stbi__process_frame_header(z, scan)) return 0;
2812 return 1;
2815 // decode image to YCbCr format
2816 static int stbi__decode_jpeg_image(stbi__jpeg *j)
2818 int m;
2819 for (m = 0; m < 4; m++) {
2820 j->img_comp[m].raw_data = NULL;
2821 j->img_comp[m].raw_coeff = NULL;
2823 j->restart_interval = 0;
2824 if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
2825 m = stbi__get_marker(j);
2826 while (!stbi__EOI(m)) {
2827 if (stbi__SOS(m)) {
2828 if (!stbi__process_scan_header(j)) return 0;
2829 if (!stbi__parse_entropy_coded_data(j)) return 0;
2830 if (j->marker == STBI__MARKER_none ) {
2831 // handle 0s at the end of image data from IP Kamera 9060
2832 while (!stbi__at_eof(j->s)) {
2833 int x = stbi__get8(j->s);
2834 if (x == 255) {
2835 j->marker = stbi__get8(j->s);
2836 break;
2837 } else if (x != 0) {
2838 return stbi__err("junk before marker", "Corrupt JPEG");
2841 // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
2843 } else {
2844 if (!stbi__process_marker(j, m)) return 0;
2846 m = stbi__get_marker(j);
2848 if (j->progressive)
2849 stbi__jpeg_finish(j);
2850 return 1;
2853 // static jfif-centered resampling (across block boundaries)
2855 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
2856 int w, int hs);
2858 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
2860 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2862 STBI_NOTUSED(out);
2863 STBI_NOTUSED(in_far);
2864 STBI_NOTUSED(w);
2865 STBI_NOTUSED(hs);
2866 return in_near;
2869 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2871 // need to generate two samples vertically for every one in input
2872 int i;
2873 STBI_NOTUSED(hs);
2874 for (i=0; i < w; ++i)
2875 out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
2876 return out;
2879 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2881 // need to generate two samples horizontally for every one in input
2882 int i;
2883 stbi_uc *input = in_near;
2885 if (w == 1) {
2886 // if only one sample, can't do any interpolation
2887 out[0] = out[1] = input[0];
2888 return out;
2891 out[0] = input[0];
2892 out[1] = stbi__div4(input[0]*3 + input[1] + 2);
2893 for (i=1; i < w-1; ++i) {
2894 int n = 3*input[i]+2;
2895 out[i*2+0] = stbi__div4(n+input[i-1]);
2896 out[i*2+1] = stbi__div4(n+input[i+1]);
2898 out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
2899 out[i*2+1] = input[w-1];
2901 STBI_NOTUSED(in_far);
2902 STBI_NOTUSED(hs);
2904 return out;
2907 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
2909 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2911 // need to generate 2x2 samples for every one in input
2912 int i,t0,t1;
2913 if (w == 1) {
2914 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2915 return out;
2918 t1 = 3*in_near[0] + in_far[0];
2919 out[0] = stbi__div4(t1+2);
2920 for (i=1; i < w; ++i) {
2921 t0 = t1;
2922 t1 = 3*in_near[i]+in_far[i];
2923 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
2924 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
2926 out[w*2-1] = stbi__div4(t1+2);
2928 STBI_NOTUSED(hs);
2930 return out;
2933 #if defined(STBI_SSE2) || defined(STBI_NEON)
2934 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2936 // need to generate 2x2 samples for every one in input
2937 int i=0,t0,t1;
2939 if (w == 1) {
2940 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2941 return out;
2944 t1 = 3*in_near[0] + in_far[0];
2945 // process groups of 8 pixels for as long as we can.
2946 // note we can't handle the last pixel in a row in this loop
2947 // because we need to handle the filter boundary conditions.
2948 for (; i < ((w-1) & ~7); i += 8) {
2949 #if defined(STBI_SSE2)
2950 // load and perform the vertical filtering pass
2951 // this uses 3*x + y = 4*x + (y - x)
2952 __m128i zero = _mm_setzero_si128();
2953 __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
2954 __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
2955 __m128i farw = _mm_unpacklo_epi8(farb, zero);
2956 __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
2957 __m128i diff = _mm_sub_epi16(farw, nearw);
2958 __m128i nears = _mm_slli_epi16(nearw, 2);
2959 __m128i curr = _mm_add_epi16(nears, diff); // current row
2961 // horizontal filter works the same based on shifted vers of current
2962 // row. "prev" is current row shifted right by 1 pixel; we need to
2963 // insert the previous pixel value (from t1).
2964 // "next" is current row shifted left by 1 pixel, with first pixel
2965 // of next block of 8 pixels added in.
2966 __m128i prv0 = _mm_slli_si128(curr, 2);
2967 __m128i nxt0 = _mm_srli_si128(curr, 2);
2968 __m128i prev = _mm_insert_epi16(prv0, t1, 0);
2969 __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
2971 // horizontal filter, polyphase implementation since it's convenient:
2972 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
2973 // odd pixels = 3*cur + next = cur*4 + (next - cur)
2974 // note the shared term.
2975 __m128i bias = _mm_set1_epi16(8);
2976 __m128i curs = _mm_slli_epi16(curr, 2);
2977 __m128i prvd = _mm_sub_epi16(prev, curr);
2978 __m128i nxtd = _mm_sub_epi16(next, curr);
2979 __m128i curb = _mm_add_epi16(curs, bias);
2980 __m128i even = _mm_add_epi16(prvd, curb);
2981 __m128i odd = _mm_add_epi16(nxtd, curb);
2983 // interleave even and odd pixels, then undo scaling.
2984 __m128i int0 = _mm_unpacklo_epi16(even, odd);
2985 __m128i int1 = _mm_unpackhi_epi16(even, odd);
2986 __m128i de0 = _mm_srli_epi16(int0, 4);
2987 __m128i de1 = _mm_srli_epi16(int1, 4);
2989 // pack and write output
2990 __m128i outv = _mm_packus_epi16(de0, de1);
2991 _mm_storeu_si128((__m128i *) (out + i*2), outv);
2992 #elif defined(STBI_NEON)
2993 // load and perform the vertical filtering pass
2994 // this uses 3*x + y = 4*x + (y - x)
2995 uint8x8_t farb = vld1_u8(in_far + i);
2996 uint8x8_t nearb = vld1_u8(in_near + i);
2997 int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
2998 int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
2999 int16x8_t curr = vaddq_s16(nears, diff); // current row
3001 // horizontal filter works the same based on shifted vers of current
3002 // row. "prev" is current row shifted right by 1 pixel; we need to
3003 // insert the previous pixel value (from t1).
3004 // "next" is current row shifted left by 1 pixel, with first pixel
3005 // of next block of 8 pixels added in.
3006 int16x8_t prv0 = vextq_s16(curr, curr, 7);
3007 int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3008 int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3009 int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
3011 // horizontal filter, polyphase implementation since it's convenient:
3012 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3013 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3014 // note the shared term.
3015 int16x8_t curs = vshlq_n_s16(curr, 2);
3016 int16x8_t prvd = vsubq_s16(prev, curr);
3017 int16x8_t nxtd = vsubq_s16(next, curr);
3018 int16x8_t even = vaddq_s16(curs, prvd);
3019 int16x8_t odd = vaddq_s16(curs, nxtd);
3021 // undo scaling and round, then store with even/odd phases interleaved
3022 uint8x8x2_t o;
3023 o.val[0] = vqrshrun_n_s16(even, 4);
3024 o.val[1] = vqrshrun_n_s16(odd, 4);
3025 vst2_u8(out + i*2, o);
3026 #endif
3028 // "previous" value for next iter
3029 t1 = 3*in_near[i+7] + in_far[i+7];
3032 t0 = t1;
3033 t1 = 3*in_near[i] + in_far[i];
3034 out[i*2] = stbi__div16(3*t1 + t0 + 8);
3036 for (++i; i < w; ++i) {
3037 t0 = t1;
3038 t1 = 3*in_near[i]+in_far[i];
3039 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3040 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3042 out[w*2-1] = stbi__div4(t1+2);
3044 STBI_NOTUSED(hs);
3046 return out;
3048 #endif
3050 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3052 // resample with nearest-neighbor
3053 int i,j;
3054 STBI_NOTUSED(in_far);
3055 for (i=0; i < w; ++i)
3056 for (j=0; j < hs; ++j)
3057 out[i*hs+j] = in_near[i];
3058 return out;
3061 #ifdef STBI_JPEG_OLD
3062 // this is the same YCbCr-to-RGB calculation that stb_image has used
3063 // historically before the algorithm changes in 1.49
3064 #define float2fixed(x) ((int) ((x) * 65536 + 0.5))
3065 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3067 int i;
3068 for (i=0; i < count; ++i) {
3069 int y_fixed = (y[i] << 16) + 32768; // rounding
3070 int r,g,b;
3071 int cr = pcr[i] - 128;
3072 int cb = pcb[i] - 128;
3073 r = y_fixed + cr*float2fixed(1.40200f);
3074 g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3075 b = y_fixed + cb*float2fixed(1.77200f);
3076 r >>= 16;
3077 g >>= 16;
3078 b >>= 16;
3079 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3080 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3081 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3082 out[0] = (stbi_uc)r;
3083 out[1] = (stbi_uc)g;
3084 out[2] = (stbi_uc)b;
3085 out[3] = 255;
3086 out += step;
3089 #else
3090 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3091 // to make sure the code produces the same results in both SIMD and scalar
3092 #define float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
3093 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3095 int i;
3096 for (i=0; i < count; ++i) {
3097 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3098 int r,g,b;
3099 int cr = pcr[i] - 128;
3100 int cb = pcb[i] - 128;
3101 r = y_fixed + cr* float2fixed(1.40200f);
3102 g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3103 b = y_fixed + cb* float2fixed(1.77200f);
3104 r >>= 20;
3105 g >>= 20;
3106 b >>= 20;
3107 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3108 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3109 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3110 out[0] = (stbi_uc)r;
3111 out[1] = (stbi_uc)g;
3112 out[2] = (stbi_uc)b;
3113 out[3] = 255;
3114 out += step;
3117 #endif
3119 #if defined(STBI_SSE2) || defined(STBI_NEON)
3120 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3122 int i = 0;
3124 #ifdef STBI_SSE2
3125 // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3126 // it's useful in practice (you wouldn't use it for textures, for example).
3127 // so just accelerate step == 4 case.
3128 if (step == 4) {
3129 // this is a fairly straightforward implementation and not super-optimized.
3130 __m128i signflip = _mm_set1_epi8(-0x80);
3131 __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3132 __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3133 __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3134 __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3135 __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3136 __m128i xw = _mm_set1_epi16(255); // alpha channel
3138 for (; i+7 < count; i += 8) {
3139 // load
3140 __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3141 __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3142 __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3143 __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3144 __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3146 // unpack to short (and left-shift cr, cb by 8)
3147 __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3148 __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3149 __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3151 // color transform
3152 __m128i yws = _mm_srli_epi16(yw, 4);
3153 __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3154 __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3155 __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3156 __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3157 __m128i rws = _mm_add_epi16(cr0, yws);
3158 __m128i gwt = _mm_add_epi16(cb0, yws);
3159 __m128i bws = _mm_add_epi16(yws, cb1);
3160 __m128i gws = _mm_add_epi16(gwt, cr1);
3162 // descale
3163 __m128i rw = _mm_srai_epi16(rws, 4);
3164 __m128i bw = _mm_srai_epi16(bws, 4);
3165 __m128i gw = _mm_srai_epi16(gws, 4);
3167 // back to byte, set up for transpose
3168 __m128i brb = _mm_packus_epi16(rw, bw);
3169 __m128i gxb = _mm_packus_epi16(gw, xw);
3171 // transpose to interleave channels
3172 __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3173 __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3174 __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3175 __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3177 // store
3178 _mm_storeu_si128((__m128i *) (out + 0), o0);
3179 _mm_storeu_si128((__m128i *) (out + 16), o1);
3180 out += 32;
3183 #endif
3185 #ifdef STBI_NEON
3186 // in this version, step=3 support would be easy to add. but is there demand?
3187 if (step == 4) {
3188 // this is a fairly straightforward implementation and not super-optimized.
3189 uint8x8_t signflip = vdup_n_u8(0x80);
3190 int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3191 int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3192 int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3193 int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3195 for (; i+7 < count; i += 8) {
3196 // load
3197 uint8x8_t y_bytes = vld1_u8(y + i);
3198 uint8x8_t cr_bytes = vld1_u8(pcr + i);
3199 uint8x8_t cb_bytes = vld1_u8(pcb + i);
3200 int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3201 int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3203 // expand to s16
3204 int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3205 int16x8_t crw = vshll_n_s8(cr_biased, 7);
3206 int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3208 // color transform
3209 int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3210 int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3211 int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3212 int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3213 int16x8_t rws = vaddq_s16(yws, cr0);
3214 int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3215 int16x8_t bws = vaddq_s16(yws, cb1);
3217 // undo scaling, round, convert to byte
3218 uint8x8x4_t o;
3219 o.val[0] = vqrshrun_n_s16(rws, 4);
3220 o.val[1] = vqrshrun_n_s16(gws, 4);
3221 o.val[2] = vqrshrun_n_s16(bws, 4);
3222 o.val[3] = vdup_n_u8(255);
3224 // store, interleaving r/g/b/a
3225 vst4_u8(out, o);
3226 out += 8*4;
3229 #endif
3231 for (; i < count; ++i) {
3232 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3233 int r,g,b;
3234 int cr = pcr[i] - 128;
3235 int cb = pcb[i] - 128;
3236 r = y_fixed + cr* float2fixed(1.40200f);
3237 g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3238 b = y_fixed + cb* float2fixed(1.77200f);
3239 r >>= 20;
3240 g >>= 20;
3241 b >>= 20;
3242 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3243 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3244 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3245 out[0] = (stbi_uc)r;
3246 out[1] = (stbi_uc)g;
3247 out[2] = (stbi_uc)b;
3248 out[3] = 255;
3249 out += step;
3252 #endif
3254 // set up the kernels
3255 static void stbi__setup_jpeg(stbi__jpeg *j)
3257 j->idct_block_kernel = stbi__idct_block;
3258 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3259 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3261 #ifdef STBI_SSE2
3262 if (stbi__sse2_available()) {
3263 j->idct_block_kernel = stbi__idct_simd;
3264 #ifndef STBI_JPEG_OLD
3265 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3266 #endif
3267 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3269 #endif
3271 #ifdef STBI_NEON
3272 j->idct_block_kernel = stbi__idct_simd;
3273 #ifndef STBI_JPEG_OLD
3274 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3275 #endif
3276 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3277 #endif
3280 // clean up the temporary component buffers
3281 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3283 int i;
3284 for (i=0; i < j->s->img_n; ++i) {
3285 if (j->img_comp[i].raw_data) {
3286 STBI_FREE(j->img_comp[i].raw_data);
3287 j->img_comp[i].raw_data = NULL;
3288 j->img_comp[i].data = NULL;
3290 if (j->img_comp[i].raw_coeff) {
3291 STBI_FREE(j->img_comp[i].raw_coeff);
3292 j->img_comp[i].raw_coeff = 0;
3293 j->img_comp[i].coeff = 0;
3295 if (j->img_comp[i].linebuf) {
3296 STBI_FREE(j->img_comp[i].linebuf);
3297 j->img_comp[i].linebuf = NULL;
3302 typedef struct
3304 resample_row_func resample;
3305 stbi_uc *line0,*line1;
3306 int hs,vs; // expansion factor in each axis
3307 int w_lores; // horizontal pixels pre-expansion
3308 int ystep; // how far through vertical expansion we are
3309 int ypos; // which pre-expansion row we're on
3310 } stbi__resample;
3312 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3314 int n, decode_n;
3315 z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3317 // validate req_comp
3318 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3320 // load a jpeg image from whichever source, but leave in YCbCr format
3321 if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3323 // determine actual number of components to generate
3324 n = req_comp ? req_comp : z->s->img_n;
3326 if (z->s->img_n == 3 && n < 3)
3327 decode_n = 1;
3328 else
3329 decode_n = z->s->img_n;
3331 // resample and color-convert
3333 int k;
3334 unsigned int i,j;
3335 stbi_uc *output;
3336 stbi_uc *coutput[4];
3338 stbi__resample res_comp[4];
3340 for (k=0; k < decode_n; ++k) {
3341 stbi__resample *r = &res_comp[k];
3343 // allocate line buffer big enough for upsampling off the edges
3344 // with upsample factor of 4
3345 z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3346 if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3348 r->hs = z->img_h_max / z->img_comp[k].h;
3349 r->vs = z->img_v_max / z->img_comp[k].v;
3350 r->ystep = r->vs >> 1;
3351 r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3352 r->ypos = 0;
3353 r->line0 = r->line1 = z->img_comp[k].data;
3355 if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3356 else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3357 else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3358 else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3359 else r->resample = stbi__resample_row_generic;
3362 // can't error after this so, this is safe
3363 output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
3364 if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3366 // now go ahead and resample
3367 for (j=0; j < z->s->img_y; ++j) {
3368 stbi_uc *out = output + n * z->s->img_x * j;
3369 for (k=0; k < decode_n; ++k) {
3370 stbi__resample *r = &res_comp[k];
3371 int y_bot = r->ystep >= (r->vs >> 1);
3372 coutput[k] = r->resample(z->img_comp[k].linebuf,
3373 y_bot ? r->line1 : r->line0,
3374 y_bot ? r->line0 : r->line1,
3375 r->w_lores, r->hs);
3376 if (++r->ystep >= r->vs) {
3377 r->ystep = 0;
3378 r->line0 = r->line1;
3379 if (++r->ypos < z->img_comp[k].y)
3380 r->line1 += z->img_comp[k].w2;
3383 if (n >= 3) {
3384 stbi_uc *y = coutput[0];
3385 if (z->s->img_n == 3) {
3386 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3387 } else
3388 for (i=0; i < z->s->img_x; ++i) {
3389 out[0] = out[1] = out[2] = y[i];
3390 out[3] = 255; // not used if n==3
3391 out += n;
3393 } else {
3394 stbi_uc *y = coutput[0];
3395 if (n == 1)
3396 for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3397 else
3398 for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
3401 stbi__cleanup_jpeg(z);
3402 *out_x = z->s->img_x;
3403 *out_y = z->s->img_y;
3404 if (comp) *comp = z->s->img_n; // report original components, not output
3405 return output;
3409 static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
3411 stbi__jpeg j;
3412 j.s = s;
3413 stbi__setup_jpeg(&j);
3414 return load_jpeg_image(&j, x,y,comp,req_comp);
3417 static int stbi__jpeg_test(stbi__context *s)
3419 int r;
3420 stbi__jpeg j;
3421 j.s = s;
3422 stbi__setup_jpeg(&j);
3423 r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
3424 stbi__rewind(s);
3425 return r;
3428 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3430 if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3431 stbi__rewind( j->s );
3432 return 0;
3434 if (x) *x = j->s->img_x;
3435 if (y) *y = j->s->img_y;
3436 if (comp) *comp = j->s->img_n;
3437 return 1;
3440 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
3442 stbi__jpeg j;
3443 j.s = s;
3444 return stbi__jpeg_info_raw(&j, x, y, comp);
3446 #endif
3448 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3449 // simple implementation
3450 // - all input must be provided in an upfront buffer
3451 // - all output is written to a single output buffer (can malloc/realloc)
3452 // performance
3453 // - fast huffman
3455 #ifndef STBI_NO_ZLIB
3457 // fast-way is faster to check than jpeg huffman, but slow way is slower
3458 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3459 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3461 // zlib-style huffman encoding
3462 // (jpegs packs from left, zlib from right, so can't share code)
3463 typedef struct
3465 stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3466 stbi__uint16 firstcode[16];
3467 int maxcode[17];
3468 stbi__uint16 firstsymbol[16];
3469 stbi_uc size[288];
3470 stbi__uint16 value[288];
3471 } stbi__zhuffman;
3473 stbi_inline static int stbi__bitreverse16(int n)
3475 n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3476 n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3477 n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3478 n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3479 return n;
3482 stbi_inline static int stbi__bit_reverse(int v, int bits)
3484 STBI_ASSERT(bits <= 16);
3485 // to bit reverse n bits, reverse 16 and shift
3486 // e.g. 11 bits, bit reverse and shift away 5
3487 return stbi__bitreverse16(v) >> (16-bits);
3490 static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3492 int i,k=0;
3493 int code, next_code[16], sizes[17];
3495 // DEFLATE spec for generating codes
3496 memset(sizes, 0, sizeof(sizes));
3497 memset(z->fast, 0, sizeof(z->fast));
3498 for (i=0; i < num; ++i)
3499 ++sizes[sizelist[i]];
3500 sizes[0] = 0;
3501 for (i=1; i < 16; ++i)
3502 if (sizes[i] > (1 << i))
3503 return stbi__err("bad sizes", "Corrupt PNG");
3504 code = 0;
3505 for (i=1; i < 16; ++i) {
3506 next_code[i] = code;
3507 z->firstcode[i] = (stbi__uint16) code;
3508 z->firstsymbol[i] = (stbi__uint16) k;
3509 code = (code + sizes[i]);
3510 if (sizes[i])
3511 if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3512 z->maxcode[i] = code << (16-i); // preshift for inner loop
3513 code <<= 1;
3514 k += sizes[i];
3516 z->maxcode[16] = 0x10000; // sentinel
3517 for (i=0; i < num; ++i) {
3518 int s = sizelist[i];
3519 if (s) {
3520 int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3521 stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3522 z->size [c] = (stbi_uc ) s;
3523 z->value[c] = (stbi__uint16) i;
3524 if (s <= STBI__ZFAST_BITS) {
3525 int j = stbi__bit_reverse(next_code[s],s);
3526 while (j < (1 << STBI__ZFAST_BITS)) {
3527 z->fast[j] = fastv;
3528 j += (1 << s);
3531 ++next_code[s];
3534 return 1;
3537 // zlib-from-memory implementation for PNG reading
3538 // because PNG allows splitting the zlib stream arbitrarily,
3539 // and it's annoying structurally to have PNG call ZLIB call PNG,
3540 // we require PNG read all the IDATs and combine them into a single
3541 // memory buffer
3543 typedef struct
3545 stbi_uc *zbuffer, *zbuffer_end;
3546 int num_bits;
3547 stbi__uint32 code_buffer;
3549 char *zout;
3550 char *zout_start;
3551 char *zout_end;
3552 int z_expandable;
3554 stbi__zhuffman z_length, z_distance;
3555 } stbi__zbuf;
3557 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3559 if (z->zbuffer >= z->zbuffer_end) return 0;
3560 return *z->zbuffer++;
3563 static void stbi__fill_bits(stbi__zbuf *z)
3565 do {
3566 STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3567 z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
3568 z->num_bits += 8;
3569 } while (z->num_bits <= 24);
3572 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3574 unsigned int k;
3575 if (z->num_bits < n) stbi__fill_bits(z);
3576 k = z->code_buffer & ((1 << n) - 1);
3577 z->code_buffer >>= n;
3578 z->num_bits -= n;
3579 return k;
3582 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3584 int b,s,k;
3585 // not resolved by fast table, so compute it the slow way
3586 // use jpeg approach, which requires MSbits at top
3587 k = stbi__bit_reverse(a->code_buffer, 16);
3588 for (s=STBI__ZFAST_BITS+1; ; ++s)
3589 if (k < z->maxcode[s])
3590 break;
3591 if (s == 16) return -1; // invalid code!
3592 // code size is s, so:
3593 b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
3594 STBI_ASSERT(z->size[b] == s);
3595 a->code_buffer >>= s;
3596 a->num_bits -= s;
3597 return z->value[b];
3600 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3602 int b,s;
3603 if (a->num_bits < 16) stbi__fill_bits(a);
3604 b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3605 if (b) {
3606 s = b >> 9;
3607 a->code_buffer >>= s;
3608 a->num_bits -= s;
3609 return b & 511;
3611 return stbi__zhuffman_decode_slowpath(a, z);
3614 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes
3616 char *q;
3617 int cur, limit, old_limit;
3618 z->zout = zout;
3619 if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3620 cur = (int) (z->zout - z->zout_start);
3621 limit = old_limit = (int) (z->zout_end - z->zout_start);
3622 while (cur + n > limit)
3623 limit *= 2;
3624 q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
3625 STBI_NOTUSED(old_limit);
3626 if (q == NULL) return stbi__err("outofmem", "Out of memory");
3627 z->zout_start = q;
3628 z->zout = q + cur;
3629 z->zout_end = q + limit;
3630 return 1;
3633 static int stbi__zlength_base[31] = {
3634 3,4,5,6,7,8,9,10,11,13,
3635 15,17,19,23,27,31,35,43,51,59,
3636 67,83,99,115,131,163,195,227,258,0,0 };
3638 static int stbi__zlength_extra[31]=
3639 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3641 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3642 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3644 static int stbi__zdist_extra[32] =
3645 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3647 static int stbi__parse_huffman_block(stbi__zbuf *a)
3649 char *zout = a->zout;
3650 for(;;) {
3651 int z = stbi__zhuffman_decode(a, &a->z_length);
3652 if (z < 256) {
3653 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3654 if (zout >= a->zout_end) {
3655 if (!stbi__zexpand(a, zout, 1)) return 0;
3656 zout = a->zout;
3658 *zout++ = (char) z;
3659 } else {
3660 stbi_uc *p;
3661 int len,dist;
3662 if (z == 256) {
3663 a->zout = zout;
3664 return 1;
3666 z -= 257;
3667 len = stbi__zlength_base[z];
3668 if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3669 z = stbi__zhuffman_decode(a, &a->z_distance);
3670 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3671 dist = stbi__zdist_base[z];
3672 if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3673 if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3674 if (zout + len > a->zout_end) {
3675 if (!stbi__zexpand(a, zout, len)) return 0;
3676 zout = a->zout;
3678 p = (stbi_uc *) (zout - dist);
3679 if (dist == 1) { // run of one byte; common in images.
3680 stbi_uc v = *p;
3681 if (len) { do *zout++ = v; while (--len); }
3682 } else {
3683 if (len) { do *zout++ = *p++; while (--len); }
3689 static int stbi__compute_huffman_codes(stbi__zbuf *a)
3691 static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
3692 stbi__zhuffman z_codelength;
3693 stbi_uc lencodes[286+32+137];//padding for maximum single op
3694 stbi_uc codelength_sizes[19];
3695 int i,n;
3697 int hlit = stbi__zreceive(a,5) + 257;
3698 int hdist = stbi__zreceive(a,5) + 1;
3699 int hclen = stbi__zreceive(a,4) + 4;
3701 memset(codelength_sizes, 0, sizeof(codelength_sizes));
3702 for (i=0; i < hclen; ++i) {
3703 int s = stbi__zreceive(a,3);
3704 codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
3706 if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
3708 n = 0;
3709 while (n < hlit + hdist) {
3710 int c = stbi__zhuffman_decode(a, &z_codelength);
3711 if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
3712 if (c < 16)
3713 lencodes[n++] = (stbi_uc) c;
3714 else if (c == 16) {
3715 c = stbi__zreceive(a,2)+3;
3716 memset(lencodes+n, lencodes[n-1], c);
3717 n += c;
3718 } else if (c == 17) {
3719 c = stbi__zreceive(a,3)+3;
3720 memset(lencodes+n, 0, c);
3721 n += c;
3722 } else {
3723 STBI_ASSERT(c == 18);
3724 c = stbi__zreceive(a,7)+11;
3725 memset(lencodes+n, 0, c);
3726 n += c;
3729 if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
3730 if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
3731 if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
3732 return 1;
3735 static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
3737 stbi_uc header[4];
3738 int len,nlen,k;
3739 if (a->num_bits & 7)
3740 stbi__zreceive(a, a->num_bits & 7); // discard
3741 // drain the bit-packed data into header
3742 k = 0;
3743 while (a->num_bits > 0) {
3744 header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3745 a->code_buffer >>= 8;
3746 a->num_bits -= 8;
3748 STBI_ASSERT(a->num_bits == 0);
3749 // now fill header the normal way
3750 while (k < 4)
3751 header[k++] = stbi__zget8(a);
3752 len = header[1] * 256 + header[0];
3753 nlen = header[3] * 256 + header[2];
3754 if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3755 if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3756 if (a->zout + len > a->zout_end)
3757 if (!stbi__zexpand(a, a->zout, len)) return 0;
3758 memcpy(a->zout, a->zbuffer, len);
3759 a->zbuffer += len;
3760 a->zout += len;
3761 return 1;
3764 static int stbi__parse_zlib_header(stbi__zbuf *a)
3766 int cmf = stbi__zget8(a);
3767 int cm = cmf & 15;
3768 /* int cinfo = cmf >> 4; */
3769 int flg = stbi__zget8(a);
3770 if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3771 if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3772 if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3773 // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3774 return 1;
3777 // @TODO: should statically initialize these for optimal thread safety
3778 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
3779 static void stbi__init_zdefaults(void)
3781 int i; // use <= to match clearly with spec
3782 for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
3783 for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
3784 for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
3785 for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
3787 for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
3790 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3792 int final, type;
3793 if (parse_header)
3794 if (!stbi__parse_zlib_header(a)) return 0;
3795 a->num_bits = 0;
3796 a->code_buffer = 0;
3797 do {
3798 final = stbi__zreceive(a,1);
3799 type = stbi__zreceive(a,2);
3800 if (type == 0) {
3801 if (!stbi__parse_uncomperssed_block(a)) return 0;
3802 } else if (type == 3) {
3803 return 0;
3804 } else {
3805 if (type == 1) {
3806 // use fixed code lengths
3807 if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3808 if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
3809 if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
3810 } else {
3811 if (!stbi__compute_huffman_codes(a)) return 0;
3813 if (!stbi__parse_huffman_block(a)) return 0;
3815 } while (!final);
3816 return 1;
3819 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3821 a->zout_start = obuf;
3822 a->zout = obuf;
3823 a->zout_end = obuf + olen;
3824 a->z_expandable = exp;
3826 return stbi__parse_zlib(a, parse_header);
3829 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3831 stbi__zbuf a;
3832 char *p = (char *) stbi__malloc(initial_size);
3833 if (p == NULL) return NULL;
3834 a.zbuffer = (stbi_uc *) buffer;
3835 a.zbuffer_end = (stbi_uc *) buffer + len;
3836 if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3837 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3838 return a.zout_start;
3839 } else {
3840 STBI_FREE(a.zout_start);
3841 return NULL;
3845 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3847 return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3850 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3852 stbi__zbuf a;
3853 char *p = (char *) stbi__malloc(initial_size);
3854 if (p == NULL) return NULL;
3855 a.zbuffer = (stbi_uc *) buffer;
3856 a.zbuffer_end = (stbi_uc *) buffer + len;
3857 if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3858 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3859 return a.zout_start;
3860 } else {
3861 STBI_FREE(a.zout_start);
3862 return NULL;
3866 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3868 stbi__zbuf a;
3869 a.zbuffer = (stbi_uc *) ibuffer;
3870 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3871 if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3872 return (int) (a.zout - a.zout_start);
3873 else
3874 return -1;
3877 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3879 stbi__zbuf a;
3880 char *p = (char *) stbi__malloc(16384);
3881 if (p == NULL) return NULL;
3882 a.zbuffer = (stbi_uc *) buffer;
3883 a.zbuffer_end = (stbi_uc *) buffer+len;
3884 if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3885 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3886 return a.zout_start;
3887 } else {
3888 STBI_FREE(a.zout_start);
3889 return NULL;
3893 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3895 stbi__zbuf a;
3896 a.zbuffer = (stbi_uc *) ibuffer;
3897 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3898 if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3899 return (int) (a.zout - a.zout_start);
3900 else
3901 return -1;
3903 #endif
3905 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
3906 // simple implementation
3907 // - only 8-bit samples
3908 // - no CRC checking
3909 // - allocates lots of intermediate memory
3910 // - avoids problem of streaming data between subsystems
3911 // - avoids explicit window management
3912 // performance
3913 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3915 #ifndef STBI_NO_PNG
3916 typedef struct
3918 stbi__uint32 length;
3919 stbi__uint32 type;
3920 } stbi__pngchunk;
3922 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
3924 stbi__pngchunk c;
3925 c.length = stbi__get32be(s);
3926 c.type = stbi__get32be(s);
3927 return c;
3930 static int stbi__check_png_header(stbi__context *s)
3932 static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
3933 int i;
3934 for (i=0; i < 8; ++i)
3935 if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
3936 return 1;
3939 typedef struct
3941 stbi__context *s;
3942 stbi_uc *idata, *expanded, *out;
3943 } stbi__png;
3946 enum {
3947 STBI__F_none=0,
3948 STBI__F_sub=1,
3949 STBI__F_up=2,
3950 STBI__F_avg=3,
3951 STBI__F_paeth=4,
3952 // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3953 STBI__F_avg_first,
3954 STBI__F_paeth_first
3957 static stbi_uc first_row_filter[5] =
3959 STBI__F_none,
3960 STBI__F_sub,
3961 STBI__F_none,
3962 STBI__F_avg_first,
3963 STBI__F_paeth_first
3966 static int stbi__paeth(int a, int b, int c)
3968 int p = a + b - c;
3969 int pa = abs(p-a);
3970 int pb = abs(p-b);
3971 int pc = abs(p-c);
3972 if (pa <= pb && pa <= pc) return a;
3973 if (pb <= pc) return b;
3974 return c;
3977 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
3979 // create the png data from post-deflated data
3980 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
3982 stbi__context *s = a->s;
3983 stbi__uint32 i,j,stride = x*out_n;
3984 stbi__uint32 img_len, img_width_bytes;
3985 int k;
3986 int img_n = s->img_n; // copy it into a local for later
3988 STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
3989 a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
3990 if (!a->out) return stbi__err("outofmem", "Out of memory");
3992 img_width_bytes = (((img_n * x * depth) + 7) >> 3);
3993 img_len = (img_width_bytes + 1) * y;
3994 if (s->img_x == x && s->img_y == y) {
3995 if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
3996 } else { // interlaced:
3997 if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4000 for (j=0; j < y; ++j) {
4001 stbi_uc *cur = a->out + stride*j;
4002 stbi_uc *prior = cur - stride;
4003 int filter = *raw++;
4004 int filter_bytes = img_n;
4005 int width = x;
4006 if (filter > 4)
4007 return stbi__err("invalid filter","Corrupt PNG");
4009 if (depth < 8) {
4010 STBI_ASSERT(img_width_bytes <= x);
4011 cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4012 filter_bytes = 1;
4013 width = img_width_bytes;
4016 // if first row, use special filter that doesn't sample previous row
4017 if (j == 0) filter = first_row_filter[filter];
4019 // handle first byte explicitly
4020 for (k=0; k < filter_bytes; ++k) {
4021 switch (filter) {
4022 case STBI__F_none : cur[k] = raw[k]; break;
4023 case STBI__F_sub : cur[k] = raw[k]; break;
4024 case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4025 case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4026 case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4027 case STBI__F_avg_first : cur[k] = raw[k]; break;
4028 case STBI__F_paeth_first: cur[k] = raw[k]; break;
4032 if (depth == 8) {
4033 if (img_n != out_n)
4034 cur[img_n] = 255; // first pixel
4035 raw += img_n;
4036 cur += out_n;
4037 prior += out_n;
4038 } else {
4039 raw += 1;
4040 cur += 1;
4041 prior += 1;
4044 // this is a little gross, so that we don't switch per-pixel or per-component
4045 if (depth < 8 || img_n == out_n) {
4046 int nk = (width - 1)*img_n;
4047 #define CASE(f) \
4048 case f: \
4049 for (k=0; k < nk; ++k)
4050 switch (filter) {
4051 // "none" filter turns into a memcpy here; make that explicit.
4052 case STBI__F_none: memcpy(cur, raw, nk); break;
4053 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
4054 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4055 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
4056 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
4057 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
4058 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
4060 #undef CASE
4061 raw += nk;
4062 } else {
4063 STBI_ASSERT(img_n+1 == out_n);
4064 #define CASE(f) \
4065 case f: \
4066 for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
4067 for (k=0; k < img_n; ++k)
4068 switch (filter) {
4069 CASE(STBI__F_none) cur[k] = raw[k]; break;
4070 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
4071 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4072 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
4073 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
4074 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
4075 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
4077 #undef CASE
4081 // we make a separate pass to expand bits to pixels; for performance,
4082 // this could run two scanlines behind the above code, so it won't
4083 // intefere with filtering but will still be in the cache.
4084 if (depth < 8) {
4085 for (j=0; j < y; ++j) {
4086 stbi_uc *cur = a->out + stride*j;
4087 stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
4088 // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
4089 // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4090 stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4092 // note that the final byte might overshoot and write more data than desired.
4093 // we can allocate enough data that this never writes out of memory, but it
4094 // could also overwrite the next scanline. can it overwrite non-empty data
4095 // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4096 // so we need to explicitly clamp the final ones
4098 if (depth == 4) {
4099 for (k=x*img_n; k >= 2; k-=2, ++in) {
4100 *cur++ = scale * ((*in >> 4) );
4101 *cur++ = scale * ((*in ) & 0x0f);
4103 if (k > 0) *cur++ = scale * ((*in >> 4) );
4104 } else if (depth == 2) {
4105 for (k=x*img_n; k >= 4; k-=4, ++in) {
4106 *cur++ = scale * ((*in >> 6) );
4107 *cur++ = scale * ((*in >> 4) & 0x03);
4108 *cur++ = scale * ((*in >> 2) & 0x03);
4109 *cur++ = scale * ((*in ) & 0x03);
4111 if (k > 0) *cur++ = scale * ((*in >> 6) );
4112 if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4113 if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4114 } else if (depth == 1) {
4115 for (k=x*img_n; k >= 8; k-=8, ++in) {
4116 *cur++ = scale * ((*in >> 7) );
4117 *cur++ = scale * ((*in >> 6) & 0x01);
4118 *cur++ = scale * ((*in >> 5) & 0x01);
4119 *cur++ = scale * ((*in >> 4) & 0x01);
4120 *cur++ = scale * ((*in >> 3) & 0x01);
4121 *cur++ = scale * ((*in >> 2) & 0x01);
4122 *cur++ = scale * ((*in >> 1) & 0x01);
4123 *cur++ = scale * ((*in ) & 0x01);
4125 if (k > 0) *cur++ = scale * ((*in >> 7) );
4126 if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4127 if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4128 if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4129 if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4130 if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4131 if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4133 if (img_n != out_n) {
4134 int q;
4135 // insert alpha = 255
4136 cur = a->out + stride*j;
4137 if (img_n == 1) {
4138 for (q=x-1; q >= 0; --q) {
4139 cur[q*2+1] = 255;
4140 cur[q*2+0] = cur[q];
4142 } else {
4143 STBI_ASSERT(img_n == 3);
4144 for (q=x-1; q >= 0; --q) {
4145 cur[q*4+3] = 255;
4146 cur[q*4+2] = cur[q*3+2];
4147 cur[q*4+1] = cur[q*3+1];
4148 cur[q*4+0] = cur[q*3+0];
4155 return 1;
4158 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4160 stbi_uc *final;
4161 int p;
4162 if (!interlaced)
4163 return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4165 // de-interlacing
4166 final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
4167 for (p=0; p < 7; ++p) {
4168 int xorig[] = { 0,4,0,2,0,1,0 };
4169 int yorig[] = { 0,0,4,0,2,0,1 };
4170 int xspc[] = { 8,8,4,4,2,2,1 };
4171 int yspc[] = { 8,8,8,4,4,2,2 };
4172 int i,j,x,y;
4173 // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4174 x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
4175 y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
4176 if (x && y) {
4177 stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4178 if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4179 STBI_FREE(final);
4180 return 0;
4182 for (j=0; j < y; ++j) {
4183 for (i=0; i < x; ++i) {
4184 int out_y = j*yspc[p]+yorig[p];
4185 int out_x = i*xspc[p]+xorig[p];
4186 memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
4187 a->out + (j*x+i)*out_n, out_n);
4190 STBI_FREE(a->out);
4191 image_data += img_len;
4192 image_data_len -= img_len;
4195 a->out = final;
4197 return 1;
4200 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4202 stbi__context *s = z->s;
4203 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4204 stbi_uc *p = z->out;
4206 // compute color-based transparency, assuming we've
4207 // already got 255 as the alpha value in the output
4208 STBI_ASSERT(out_n == 2 || out_n == 4);
4210 if (out_n == 2) {
4211 for (i=0; i < pixel_count; ++i) {
4212 p[1] = (p[0] == tc[0] ? 0 : 255);
4213 p += 2;
4215 } else {
4216 for (i=0; i < pixel_count; ++i) {
4217 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4218 p[3] = 0;
4219 p += 4;
4222 return 1;
4225 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4227 stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4228 stbi_uc *p, *temp_out, *orig = a->out;
4230 p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4231 if (p == NULL) return stbi__err("outofmem", "Out of memory");
4233 // between here and free(out) below, exitting would leak
4234 temp_out = p;
4236 if (pal_img_n == 3) {
4237 for (i=0; i < pixel_count; ++i) {
4238 int n = orig[i]*4;
4239 p[0] = palette[n ];
4240 p[1] = palette[n+1];
4241 p[2] = palette[n+2];
4242 p += 3;
4244 } else {
4245 for (i=0; i < pixel_count; ++i) {
4246 int n = orig[i]*4;
4247 p[0] = palette[n ];
4248 p[1] = palette[n+1];
4249 p[2] = palette[n+2];
4250 p[3] = palette[n+3];
4251 p += 4;
4254 STBI_FREE(a->out);
4255 a->out = temp_out;
4257 STBI_NOTUSED(len);
4259 return 1;
4262 static int stbi__unpremultiply_on_load = 0;
4263 static int stbi__de_iphone_flag = 0;
4265 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4267 stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4270 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4272 stbi__de_iphone_flag = flag_true_if_should_convert;
4275 static void stbi__de_iphone(stbi__png *z)
4277 stbi__context *s = z->s;
4278 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4279 stbi_uc *p = z->out;
4281 if (s->img_out_n == 3) { // convert bgr to rgb
4282 for (i=0; i < pixel_count; ++i) {
4283 stbi_uc t = p[0];
4284 p[0] = p[2];
4285 p[2] = t;
4286 p += 3;
4288 } else {
4289 STBI_ASSERT(s->img_out_n == 4);
4290 if (stbi__unpremultiply_on_load) {
4291 // convert bgr to rgb and unpremultiply
4292 for (i=0; i < pixel_count; ++i) {
4293 stbi_uc a = p[3];
4294 stbi_uc t = p[0];
4295 if (a) {
4296 p[0] = p[2] * 255 / a;
4297 p[1] = p[1] * 255 / a;
4298 p[2] = t * 255 / a;
4299 } else {
4300 p[0] = p[2];
4301 p[2] = t;
4303 p += 4;
4305 } else {
4306 // convert bgr to rgb
4307 for (i=0; i < pixel_count; ++i) {
4308 stbi_uc t = p[0];
4309 p[0] = p[2];
4310 p[2] = t;
4311 p += 4;
4317 #define STBI__PNG_TYPE(a,b,c,d) (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
4319 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4321 stbi_uc palette[1024], pal_img_n=0;
4322 stbi_uc has_trans=0, tc[3];
4323 stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4324 int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
4325 stbi__context *s = z->s;
4327 z->expanded = NULL;
4328 z->idata = NULL;
4329 z->out = NULL;
4331 if (!stbi__check_png_header(s)) return 0;
4333 if (scan == STBI__SCAN_type) return 1;
4335 for (;;) {
4336 stbi__pngchunk c = stbi__get_chunk_header(s);
4337 switch (c.type) {
4338 case STBI__PNG_TYPE('C','g','B','I'):
4339 is_iphone = 1;
4340 stbi__skip(s, c.length);
4341 break;
4342 case STBI__PNG_TYPE('I','H','D','R'): {
4343 int comp,filter;
4344 if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4345 first = 0;
4346 if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4347 s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4348 s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4349 depth = stbi__get8(s); if (depth != 1 && depth != 2 && depth != 4 && depth != 8) return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
4350 color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4351 if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4352 comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4353 filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4354 interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4355 if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4356 if (!pal_img_n) {
4357 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4358 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4359 if (scan == STBI__SCAN_header) return 1;
4360 } else {
4361 // if paletted, then pal_n is our final components, and
4362 // img_n is # components to decompress/filter.
4363 s->img_n = 1;
4364 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4365 // if SCAN_header, have to scan to see if we have a tRNS
4367 break;
4370 case STBI__PNG_TYPE('P','L','T','E'): {
4371 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4372 if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4373 pal_len = c.length / 3;
4374 if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4375 for (i=0; i < pal_len; ++i) {
4376 palette[i*4+0] = stbi__get8(s);
4377 palette[i*4+1] = stbi__get8(s);
4378 palette[i*4+2] = stbi__get8(s);
4379 palette[i*4+3] = 255;
4381 break;
4384 case STBI__PNG_TYPE('t','R','N','S'): {
4385 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4386 if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4387 if (pal_img_n) {
4388 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4389 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4390 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4391 pal_img_n = 4;
4392 for (i=0; i < c.length; ++i)
4393 palette[i*4+3] = stbi__get8(s);
4394 } else {
4395 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4396 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4397 has_trans = 1;
4398 for (k=0; k < s->img_n; ++k)
4399 tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
4401 break;
4404 case STBI__PNG_TYPE('I','D','A','T'): {
4405 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4406 if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4407 if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4408 if ((int)(ioff + c.length) < (int)ioff) return 0;
4409 if (ioff + c.length > idata_limit) {
4410 stbi__uint32 idata_limit_old = idata_limit;
4411 stbi_uc *p;
4412 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4413 while (ioff + c.length > idata_limit)
4414 idata_limit *= 2;
4415 STBI_NOTUSED(idata_limit_old);
4416 p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4417 z->idata = p;
4419 if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4420 ioff += c.length;
4421 break;
4424 case STBI__PNG_TYPE('I','E','N','D'): {
4425 stbi__uint32 raw_len, bpl;
4426 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4427 if (scan != STBI__SCAN_load) return 1;
4428 if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4429 // initial guess for decoded data size to avoid unnecessary reallocs
4430 bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
4431 raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4432 z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4433 if (z->expanded == NULL) return 0; // zlib should set error
4434 STBI_FREE(z->idata); z->idata = NULL;
4435 if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4436 s->img_out_n = s->img_n+1;
4437 else
4438 s->img_out_n = s->img_n;
4439 if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
4440 if (has_trans)
4441 if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4442 if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4443 stbi__de_iphone(z);
4444 if (pal_img_n) {
4445 // pal_img_n == 3 or 4
4446 s->img_n = pal_img_n; // record the actual colors we had
4447 s->img_out_n = pal_img_n;
4448 if (req_comp >= 3) s->img_out_n = req_comp;
4449 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4450 return 0;
4452 STBI_FREE(z->expanded); z->expanded = NULL;
4453 return 1;
4456 default:
4457 // if critical, fail
4458 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4459 if ((c.type & (1 << 29)) == 0) {
4460 #ifndef STBI_NO_FAILURE_STRINGS
4461 // not threadsafe
4462 static char invalid_chunk[] = "XXXX PNG chunk not known";
4463 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4464 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4465 invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4466 invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4467 #endif
4468 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4470 stbi__skip(s, c.length);
4471 break;
4473 // end of PNG chunk, read and skip CRC
4474 stbi__get32be(s);
4478 static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
4480 unsigned char *result=NULL;
4481 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4482 if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4483 result = p->out;
4484 p->out = NULL;
4485 if (req_comp && req_comp != p->s->img_out_n) {
4486 result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4487 p->s->img_out_n = req_comp;
4488 if (result == NULL) return result;
4490 *x = p->s->img_x;
4491 *y = p->s->img_y;
4492 if (n) *n = p->s->img_out_n;
4494 STBI_FREE(p->out); p->out = NULL;
4495 STBI_FREE(p->expanded); p->expanded = NULL;
4496 STBI_FREE(p->idata); p->idata = NULL;
4498 return result;
4501 static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4503 stbi__png p;
4504 p.s = s;
4505 return stbi__do_png(&p, x,y,comp,req_comp);
4508 static int stbi__png_test(stbi__context *s)
4510 int r;
4511 r = stbi__check_png_header(s);
4512 stbi__rewind(s);
4513 return r;
4516 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
4518 if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
4519 stbi__rewind( p->s );
4520 return 0;
4522 if (x) *x = p->s->img_x;
4523 if (y) *y = p->s->img_y;
4524 if (comp) *comp = p->s->img_n;
4525 return 1;
4528 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
4530 stbi__png p;
4531 p.s = s;
4532 return stbi__png_info_raw(&p, x, y, comp);
4534 #endif
4536 // Microsoft/Windows BMP image
4538 #ifndef STBI_NO_BMP
4539 static int stbi__bmp_test_raw(stbi__context *s)
4541 int r;
4542 int sz;
4543 if (stbi__get8(s) != 'B') return 0;
4544 if (stbi__get8(s) != 'M') return 0;
4545 stbi__get32le(s); // discard filesize
4546 stbi__get16le(s); // discard reserved
4547 stbi__get16le(s); // discard reserved
4548 stbi__get32le(s); // discard data offset
4549 sz = stbi__get32le(s);
4550 r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
4551 return r;
4554 static int stbi__bmp_test(stbi__context *s)
4556 int r = stbi__bmp_test_raw(s);
4557 stbi__rewind(s);
4558 return r;
4562 // returns 0..31 for the highest set bit
4563 static int stbi__high_bit(unsigned int z)
4565 int n=0;
4566 if (z == 0) return -1;
4567 if (z >= 0x10000) n += 16, z >>= 16;
4568 if (z >= 0x00100) n += 8, z >>= 8;
4569 if (z >= 0x00010) n += 4, z >>= 4;
4570 if (z >= 0x00004) n += 2, z >>= 2;
4571 if (z >= 0x00002) n += 1, z >>= 1;
4572 return n;
4575 static int stbi__bitcount(unsigned int a)
4577 a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
4578 a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
4579 a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
4580 a = (a + (a >> 8)); // max 16 per 8 bits
4581 a = (a + (a >> 16)); // max 32 per 8 bits
4582 return a & 0xff;
4585 static int stbi__shiftsigned(int v, int shift, int bits)
4587 int result;
4588 int z=0;
4590 if (shift < 0) v <<= -shift;
4591 else v >>= shift;
4592 result = v;
4594 z = bits;
4595 while (z < 8) {
4596 result += v >> z;
4597 z += bits;
4599 return result;
4602 typedef struct
4604 int bpp, offset, hsz;
4605 unsigned int mr,mg,mb,ma, all_a;
4606 } stbi__bmp_data;
4608 static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
4610 int hsz;
4611 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4612 stbi__get32le(s); // discard filesize
4613 stbi__get16le(s); // discard reserved
4614 stbi__get16le(s); // discard reserved
4615 info->offset = stbi__get32le(s);
4616 info->hsz = hsz = stbi__get32le(s);
4618 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4619 if (hsz == 12) {
4620 s->img_x = stbi__get16le(s);
4621 s->img_y = stbi__get16le(s);
4622 } else {
4623 s->img_x = stbi__get32le(s);
4624 s->img_y = stbi__get32le(s);
4626 if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4627 info->bpp = stbi__get16le(s);
4628 if (info->bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4629 if (hsz != 12) {
4630 int compress = stbi__get32le(s);
4631 if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4632 stbi__get32le(s); // discard sizeof
4633 stbi__get32le(s); // discard hres
4634 stbi__get32le(s); // discard vres
4635 stbi__get32le(s); // discard colorsused
4636 stbi__get32le(s); // discard max important
4637 if (hsz == 40 || hsz == 56) {
4638 if (hsz == 56) {
4639 stbi__get32le(s);
4640 stbi__get32le(s);
4641 stbi__get32le(s);
4642 stbi__get32le(s);
4644 if (info->bpp == 16 || info->bpp == 32) {
4645 info->mr = info->mg = info->mb = 0;
4646 if (compress == 0) {
4647 if (info->bpp == 32) {
4648 info->mr = 0xffu << 16;
4649 info->mg = 0xffu << 8;
4650 info->mb = 0xffu << 0;
4651 info->ma = 0xffu << 24;
4652 info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
4653 } else {
4654 info->mr = 31u << 10;
4655 info->mg = 31u << 5;
4656 info->mb = 31u << 0;
4658 } else if (compress == 3) {
4659 info->mr = stbi__get32le(s);
4660 info->mg = stbi__get32le(s);
4661 info->mb = stbi__get32le(s);
4662 // not documented, but generated by photoshop and handled by mspaint
4663 if (info->mr == info->mg && info->mg == info->mb) {
4664 // ?!?!?
4665 return stbi__errpuc("bad BMP", "bad BMP");
4667 } else
4668 return stbi__errpuc("bad BMP", "bad BMP");
4670 } else {
4671 int i;
4672 if (hsz != 108 && hsz != 124)
4673 return stbi__errpuc("bad BMP", "bad BMP");
4674 info->mr = stbi__get32le(s);
4675 info->mg = stbi__get32le(s);
4676 info->mb = stbi__get32le(s);
4677 info->ma = stbi__get32le(s);
4678 stbi__get32le(s); // discard color space
4679 for (i=0; i < 12; ++i)
4680 stbi__get32le(s); // discard color space parameters
4681 if (hsz == 124) {
4682 stbi__get32le(s); // discard rendering intent
4683 stbi__get32le(s); // discard offset of profile data
4684 stbi__get32le(s); // discard size of profile data
4685 stbi__get32le(s); // discard reserved
4689 return (void *) 1;
4693 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4695 stbi_uc *out;
4696 unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
4697 stbi_uc pal[256][4];
4698 int psize=0,i,j,width;
4699 int flip_vertically, pad, target;
4700 stbi__bmp_data info;
4702 info.all_a = 255;
4703 if (stbi__bmp_parse_header(s, &info) == NULL)
4704 return NULL; // error code already set
4706 flip_vertically = ((int) s->img_y) > 0;
4707 s->img_y = abs((int) s->img_y);
4709 mr = info.mr;
4710 mg = info.mg;
4711 mb = info.mb;
4712 ma = info.ma;
4713 all_a = info.all_a;
4715 if (info.hsz == 12) {
4716 if (info.bpp < 24)
4717 psize = (info.offset - 14 - 24) / 3;
4718 } else {
4719 if (info.bpp < 16)
4720 psize = (info.offset - 14 - info.hsz) >> 2;
4723 s->img_n = ma ? 4 : 3;
4724 if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4725 target = req_comp;
4726 else
4727 target = s->img_n; // if they want monochrome, we'll post-convert
4729 out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4730 if (!out) return stbi__errpuc("outofmem", "Out of memory");
4731 if (info.bpp < 16) {
4732 int z=0;
4733 if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4734 for (i=0; i < psize; ++i) {
4735 pal[i][2] = stbi__get8(s);
4736 pal[i][1] = stbi__get8(s);
4737 pal[i][0] = stbi__get8(s);
4738 if (info.hsz != 12) stbi__get8(s);
4739 pal[i][3] = 255;
4741 stbi__skip(s, info.offset - 14 - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
4742 if (info.bpp == 4) width = (s->img_x + 1) >> 1;
4743 else if (info.bpp == 8) width = s->img_x;
4744 else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4745 pad = (-width)&3;
4746 for (j=0; j < (int) s->img_y; ++j) {
4747 for (i=0; i < (int) s->img_x; i += 2) {
4748 int v=stbi__get8(s),v2=0;
4749 if (info.bpp == 4) {
4750 v2 = v & 15;
4751 v >>= 4;
4753 out[z++] = pal[v][0];
4754 out[z++] = pal[v][1];
4755 out[z++] = pal[v][2];
4756 if (target == 4) out[z++] = 255;
4757 if (i+1 == (int) s->img_x) break;
4758 v = (info.bpp == 8) ? stbi__get8(s) : v2;
4759 out[z++] = pal[v][0];
4760 out[z++] = pal[v][1];
4761 out[z++] = pal[v][2];
4762 if (target == 4) out[z++] = 255;
4764 stbi__skip(s, pad);
4766 } else {
4767 int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4768 int z = 0;
4769 int easy=0;
4770 stbi__skip(s, info.offset - 14 - info.hsz);
4771 if (info.bpp == 24) width = 3 * s->img_x;
4772 else if (info.bpp == 16) width = 2*s->img_x;
4773 else /* bpp = 32 and pad = 0 */ width=0;
4774 pad = (-width) & 3;
4775 if (info.bpp == 24) {
4776 easy = 1;
4777 } else if (info.bpp == 32) {
4778 if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4779 easy = 2;
4781 if (!easy) {
4782 if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4783 // right shift amt to put high bit in position #7
4784 rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4785 gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4786 bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4787 ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4789 for (j=0; j < (int) s->img_y; ++j) {
4790 if (easy) {
4791 for (i=0; i < (int) s->img_x; ++i) {
4792 unsigned char a;
4793 out[z+2] = stbi__get8(s);
4794 out[z+1] = stbi__get8(s);
4795 out[z+0] = stbi__get8(s);
4796 z += 3;
4797 a = (easy == 2 ? stbi__get8(s) : 255);
4798 all_a |= a;
4799 if (target == 4) out[z++] = a;
4801 } else {
4802 int bpp = info.bpp;
4803 for (i=0; i < (int) s->img_x; ++i) {
4804 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4805 int a;
4806 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4807 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4808 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4809 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4810 all_a |= a;
4811 if (target == 4) out[z++] = STBI__BYTECAST(a);
4814 stbi__skip(s, pad);
4818 // if alpha channel is all 0s, replace with all 255s
4819 if (target == 4 && all_a == 0)
4820 for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
4821 out[i] = 255;
4823 if (flip_vertically) {
4824 stbi_uc t;
4825 for (j=0; j < (int) s->img_y>>1; ++j) {
4826 stbi_uc *p1 = out + j *s->img_x*target;
4827 stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4828 for (i=0; i < (int) s->img_x*target; ++i) {
4829 t = p1[i], p1[i] = p2[i], p2[i] = t;
4834 if (req_comp && req_comp != target) {
4835 out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4836 if (out == NULL) return out; // stbi__convert_format frees input on failure
4839 *x = s->img_x;
4840 *y = s->img_y;
4841 if (comp) *comp = s->img_n;
4842 return out;
4844 #endif
4846 // Targa Truevision - TGA
4847 // by Jonathan Dummer
4848 #ifndef STBI_NO_TGA
4849 // returns STBI_rgb or whatever, 0 on error
4850 static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
4852 // only RGB or RGBA (incl. 16bit) or grey allowed
4853 if(is_rgb16) *is_rgb16 = 0;
4854 switch(bits_per_pixel) {
4855 case 8: return STBI_grey;
4856 case 16: if(is_grey) return STBI_grey_alpha;
4857 // else: fall-through
4858 case 15: if(is_rgb16) *is_rgb16 = 1;
4859 return STBI_rgb;
4860 case 24: // fall-through
4861 case 32: return bits_per_pixel/8;
4862 default: return 0;
4866 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4868 int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
4869 int sz, tga_colormap_type;
4870 stbi__get8(s); // discard Offset
4871 tga_colormap_type = stbi__get8(s); // colormap type
4872 if( tga_colormap_type > 1 ) {
4873 stbi__rewind(s);
4874 return 0; // only RGB or indexed allowed
4876 tga_image_type = stbi__get8(s); // image type
4877 if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
4878 if (tga_image_type != 1 && tga_image_type != 9) {
4879 stbi__rewind(s);
4880 return 0;
4882 stbi__skip(s,4); // skip index of first colormap entry and number of entries
4883 sz = stbi__get8(s); // check bits per palette color entry
4884 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
4885 stbi__rewind(s);
4886 return 0;
4888 stbi__skip(s,4); // skip image x and y origin
4889 tga_colormap_bpp = sz;
4890 } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
4891 if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
4892 stbi__rewind(s);
4893 return 0; // only RGB or grey allowed, +/- RLE
4895 stbi__skip(s,9); // skip colormap specification and image x/y origin
4896 tga_colormap_bpp = 0;
4898 tga_w = stbi__get16le(s);
4899 if( tga_w < 1 ) {
4900 stbi__rewind(s);
4901 return 0; // test width
4903 tga_h = stbi__get16le(s);
4904 if( tga_h < 1 ) {
4905 stbi__rewind(s);
4906 return 0; // test height
4908 tga_bits_per_pixel = stbi__get8(s); // bits per pixel
4909 stbi__get8(s); // ignore alpha bits
4910 if (tga_colormap_bpp != 0) {
4911 if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
4912 // when using a colormap, tga_bits_per_pixel is the size of the indexes
4913 // I don't think anything but 8 or 16bit indexes makes sense
4914 stbi__rewind(s);
4915 return 0;
4917 tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
4918 } else {
4919 tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
4921 if(!tga_comp) {
4922 stbi__rewind(s);
4923 return 0;
4925 if (x) *x = tga_w;
4926 if (y) *y = tga_h;
4927 if (comp) *comp = tga_comp;
4928 return 1; // seems to have passed everything
4931 static int stbi__tga_test(stbi__context *s)
4933 int res = 0;
4934 int sz, tga_color_type;
4935 stbi__get8(s); // discard Offset
4936 tga_color_type = stbi__get8(s); // color type
4937 if ( tga_color_type > 1 ) goto errorEnd; // only RGB or indexed allowed
4938 sz = stbi__get8(s); // image type
4939 if ( tga_color_type == 1 ) { // colormapped (paletted) image
4940 if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
4941 stbi__skip(s,4); // skip index of first colormap entry and number of entries
4942 sz = stbi__get8(s); // check bits per palette color entry
4943 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
4944 stbi__skip(s,4); // skip image x and y origin
4945 } else { // "normal" image w/o colormap
4946 if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
4947 stbi__skip(s,9); // skip colormap specification and image x/y origin
4949 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test width
4950 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test height
4951 sz = stbi__get8(s); // bits per pixel
4952 if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
4953 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
4955 res = 1; // if we got this far, everything's good and we can return 1 instead of 0
4957 errorEnd:
4958 stbi__rewind(s);
4959 return res;
4962 // read 16bit value and convert to 24bit RGB
4963 void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
4965 stbi__uint16 px = stbi__get16le(s);
4966 stbi__uint16 fiveBitMask = 31;
4967 // we have 3 channels with 5bits each
4968 int r = (px >> 10) & fiveBitMask;
4969 int g = (px >> 5) & fiveBitMask;
4970 int b = px & fiveBitMask;
4971 // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
4972 out[0] = (r * 255)/31;
4973 out[1] = (g * 255)/31;
4974 out[2] = (b * 255)/31;
4976 // some people claim that the most significant bit might be used for alpha
4977 // (possibly if an alpha-bit is set in the "image descriptor byte")
4978 // but that only made 16bit test images completely translucent..
4979 // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
4982 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4984 // read in the TGA header stuff
4985 int tga_offset = stbi__get8(s);
4986 int tga_indexed = stbi__get8(s);
4987 int tga_image_type = stbi__get8(s);
4988 int tga_is_RLE = 0;
4989 int tga_palette_start = stbi__get16le(s);
4990 int tga_palette_len = stbi__get16le(s);
4991 int tga_palette_bits = stbi__get8(s);
4992 int tga_x_origin = stbi__get16le(s);
4993 int tga_y_origin = stbi__get16le(s);
4994 int tga_width = stbi__get16le(s);
4995 int tga_height = stbi__get16le(s);
4996 int tga_bits_per_pixel = stbi__get8(s);
4997 int tga_comp, tga_rgb16=0;
4998 int tga_inverted = stbi__get8(s);
4999 // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5000 // image data
5001 unsigned char *tga_data;
5002 unsigned char *tga_palette = NULL;
5003 int i, j;
5004 unsigned char raw_data[4];
5005 int RLE_count = 0;
5006 int RLE_repeating = 0;
5007 int read_next_pixel = 1;
5009 // do a tiny bit of precessing
5010 if ( tga_image_type >= 8 )
5012 tga_image_type -= 8;
5013 tga_is_RLE = 1;
5015 tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5017 // If I'm paletted, then I'll use the number of bits from the palette
5018 if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5019 else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5021 if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5022 return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5024 // tga info
5025 *x = tga_width;
5026 *y = tga_height;
5027 if (comp) *comp = tga_comp;
5029 tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
5030 if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5032 // skip to the data's starting position (offset usually = 0)
5033 stbi__skip(s, tga_offset );
5035 if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
5036 for (i=0; i < tga_height; ++i) {
5037 int row = tga_inverted ? tga_height -i - 1 : i;
5038 stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
5039 stbi__getn(s, tga_row, tga_width * tga_comp);
5041 } else {
5042 // do I need to load a palette?
5043 if ( tga_indexed)
5045 // any data to skip? (offset usually = 0)
5046 stbi__skip(s, tga_palette_start );
5047 // load the palette
5048 tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_comp );
5049 if (!tga_palette) {
5050 STBI_FREE(tga_data);
5051 return stbi__errpuc("outofmem", "Out of memory");
5053 if (tga_rgb16) {
5054 stbi_uc *pal_entry = tga_palette;
5055 STBI_ASSERT(tga_comp == STBI_rgb);
5056 for (i=0; i < tga_palette_len; ++i) {
5057 stbi__tga_read_rgb16(s, pal_entry);
5058 pal_entry += tga_comp;
5060 } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5061 STBI_FREE(tga_data);
5062 STBI_FREE(tga_palette);
5063 return stbi__errpuc("bad palette", "Corrupt TGA");
5066 // load the data
5067 for (i=0; i < tga_width * tga_height; ++i)
5069 // if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
5070 if ( tga_is_RLE )
5072 if ( RLE_count == 0 )
5074 // yep, get the next byte as a RLE command
5075 int RLE_cmd = stbi__get8(s);
5076 RLE_count = 1 + (RLE_cmd & 127);
5077 RLE_repeating = RLE_cmd >> 7;
5078 read_next_pixel = 1;
5079 } else if ( !RLE_repeating )
5081 read_next_pixel = 1;
5083 } else
5085 read_next_pixel = 1;
5087 // OK, if I need to read a pixel, do it now
5088 if ( read_next_pixel )
5090 // load however much data we did have
5091 if ( tga_indexed )
5093 // read in index, then perform the lookup
5094 int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5095 if ( pal_idx >= tga_palette_len ) {
5096 // invalid index
5097 pal_idx = 0;
5099 pal_idx *= tga_comp;
5100 for (j = 0; j < tga_comp; ++j) {
5101 raw_data[j] = tga_palette[pal_idx+j];
5103 } else if(tga_rgb16) {
5104 STBI_ASSERT(tga_comp == STBI_rgb);
5105 stbi__tga_read_rgb16(s, raw_data);
5106 } else {
5107 // read in the data raw
5108 for (j = 0; j < tga_comp; ++j) {
5109 raw_data[j] = stbi__get8(s);
5112 // clear the reading flag for the next pixel
5113 read_next_pixel = 0;
5114 } // end of reading a pixel
5116 // copy data
5117 for (j = 0; j < tga_comp; ++j)
5118 tga_data[i*tga_comp+j] = raw_data[j];
5120 // in case we're in RLE mode, keep counting down
5121 --RLE_count;
5123 // do I need to invert the image?
5124 if ( tga_inverted )
5126 for (j = 0; j*2 < tga_height; ++j)
5128 int index1 = j * tga_width * tga_comp;
5129 int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5130 for (i = tga_width * tga_comp; i > 0; --i)
5132 unsigned char temp = tga_data[index1];
5133 tga_data[index1] = tga_data[index2];
5134 tga_data[index2] = temp;
5135 ++index1;
5136 ++index2;
5140 // clear my palette, if I had one
5141 if ( tga_palette != NULL )
5143 STBI_FREE( tga_palette );
5147 // swap RGB - if the source data was RGB16, it already is in the right order
5148 if (tga_comp >= 3 && !tga_rgb16)
5150 unsigned char* tga_pixel = tga_data;
5151 for (i=0; i < tga_width * tga_height; ++i)
5153 unsigned char temp = tga_pixel[0];
5154 tga_pixel[0] = tga_pixel[2];
5155 tga_pixel[2] = temp;
5156 tga_pixel += tga_comp;
5160 // convert to target component count
5161 if (req_comp && req_comp != tga_comp)
5162 tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5164 // the things I do to get rid of an error message, and yet keep
5165 // Microsoft's C compilers happy... [8^(
5166 tga_palette_start = tga_palette_len = tga_palette_bits =
5167 tga_x_origin = tga_y_origin = 0;
5168 // OK, done
5169 return tga_data;
5171 #endif
5173 // *************************************************************************************************
5174 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5176 #ifndef STBI_NO_PSD
5177 static int stbi__psd_test(stbi__context *s)
5179 int r = (stbi__get32be(s) == 0x38425053);
5180 stbi__rewind(s);
5181 return r;
5184 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5186 int pixelCount;
5187 int channelCount, compression;
5188 int channel, i, count, len;
5189 int bitdepth;
5190 int w,h;
5191 stbi_uc *out;
5193 // Check identifier
5194 if (stbi__get32be(s) != 0x38425053) // "8BPS"
5195 return stbi__errpuc("not PSD", "Corrupt PSD image");
5197 // Check file type version.
5198 if (stbi__get16be(s) != 1)
5199 return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5201 // Skip 6 reserved bytes.
5202 stbi__skip(s, 6 );
5204 // Read the number of channels (R, G, B, A, etc).
5205 channelCount = stbi__get16be(s);
5206 if (channelCount < 0 || channelCount > 16)
5207 return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5209 // Read the rows and columns of the image.
5210 h = stbi__get32be(s);
5211 w = stbi__get32be(s);
5213 // Make sure the depth is 8 bits.
5214 bitdepth = stbi__get16be(s);
5215 if (bitdepth != 8 && bitdepth != 16)
5216 return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5218 // Make sure the color mode is RGB.
5219 // Valid options are:
5220 // 0: Bitmap
5221 // 1: Grayscale
5222 // 2: Indexed color
5223 // 3: RGB color
5224 // 4: CMYK color
5225 // 7: Multichannel
5226 // 8: Duotone
5227 // 9: Lab color
5228 if (stbi__get16be(s) != 3)
5229 return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5231 // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5232 stbi__skip(s,stbi__get32be(s) );
5234 // Skip the image resources. (resolution, pen tool paths, etc)
5235 stbi__skip(s, stbi__get32be(s) );
5237 // Skip the reserved data.
5238 stbi__skip(s, stbi__get32be(s) );
5240 // Find out if the data is compressed.
5241 // Known values:
5242 // 0: no compression
5243 // 1: RLE compressed
5244 compression = stbi__get16be(s);
5245 if (compression > 1)
5246 return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5248 // Create the destination image.
5249 out = (stbi_uc *) stbi__malloc(4 * w*h);
5250 if (!out) return stbi__errpuc("outofmem", "Out of memory");
5251 pixelCount = w*h;
5253 // Initialize the data to zero.
5254 //memset( out, 0, pixelCount * 4 );
5256 // Finally, the image data.
5257 if (compression) {
5258 // RLE as used by .PSD and .TIFF
5259 // Loop until you get the number of unpacked bytes you are expecting:
5260 // Read the next source byte into n.
5261 // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5262 // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5263 // Else if n is 128, noop.
5264 // Endloop
5266 // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
5267 // which we're going to just skip.
5268 stbi__skip(s, h * channelCount * 2 );
5270 // Read the RLE data by channel.
5271 for (channel = 0; channel < 4; channel++) {
5272 stbi_uc *p;
5274 p = out+channel;
5275 if (channel >= channelCount) {
5276 // Fill this channel with default data.
5277 for (i = 0; i < pixelCount; i++, p += 4)
5278 *p = (channel == 3 ? 255 : 0);
5279 } else {
5280 // Read the RLE data.
5281 count = 0;
5282 while (count < pixelCount) {
5283 len = stbi__get8(s);
5284 if (len == 128) {
5285 // No-op.
5286 } else if (len < 128) {
5287 // Copy next len+1 bytes literally.
5288 len++;
5289 count += len;
5290 while (len) {
5291 *p = stbi__get8(s);
5292 p += 4;
5293 len--;
5295 } else if (len > 128) {
5296 stbi_uc val;
5297 // Next -len+1 bytes in the dest are replicated from next source byte.
5298 // (Interpret len as a negative 8-bit int.)
5299 len ^= 0x0FF;
5300 len += 2;
5301 val = stbi__get8(s);
5302 count += len;
5303 while (len) {
5304 *p = val;
5305 p += 4;
5306 len--;
5313 } else {
5314 // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5315 // where each channel consists of an 8-bit value for each pixel in the image.
5317 // Read the data by channel.
5318 for (channel = 0; channel < 4; channel++) {
5319 stbi_uc *p;
5321 p = out + channel;
5322 if (channel >= channelCount) {
5323 // Fill this channel with default data.
5324 stbi_uc val = channel == 3 ? 255 : 0;
5325 for (i = 0; i < pixelCount; i++, p += 4)
5326 *p = val;
5327 } else {
5328 // Read the data.
5329 if (bitdepth == 16) {
5330 for (i = 0; i < pixelCount; i++, p += 4)
5331 *p = (stbi_uc) (stbi__get16be(s) >> 8);
5332 } else {
5333 for (i = 0; i < pixelCount; i++, p += 4)
5334 *p = stbi__get8(s);
5340 if (req_comp && req_comp != 4) {
5341 out = stbi__convert_format(out, 4, req_comp, w, h);
5342 if (out == NULL) return out; // stbi__convert_format frees input on failure
5345 if (comp) *comp = 4;
5346 *y = h;
5347 *x = w;
5349 return out;
5351 #endif
5353 // *************************************************************************************************
5354 // Softimage PIC loader
5355 // by Tom Seddon
5357 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5358 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5360 #ifndef STBI_NO_PIC
5361 static int stbi__pic_is4(stbi__context *s,const char *str)
5363 int i;
5364 for (i=0; i<4; ++i)
5365 if (stbi__get8(s) != (stbi_uc)str[i])
5366 return 0;
5368 return 1;
5371 static int stbi__pic_test_core(stbi__context *s)
5373 int i;
5375 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5376 return 0;
5378 for(i=0;i<84;++i)
5379 stbi__get8(s);
5381 if (!stbi__pic_is4(s,"PICT"))
5382 return 0;
5384 return 1;
5387 typedef struct
5389 stbi_uc size,type,channel;
5390 } stbi__pic_packet;
5392 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5394 int mask=0x80, i;
5396 for (i=0; i<4; ++i, mask>>=1) {
5397 if (channel & mask) {
5398 if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5399 dest[i]=stbi__get8(s);
5403 return dest;
5406 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5408 int mask=0x80,i;
5410 for (i=0;i<4; ++i, mask>>=1)
5411 if (channel&mask)
5412 dest[i]=src[i];
5415 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5417 int act_comp=0,num_packets=0,y,chained;
5418 stbi__pic_packet packets[10];
5420 // this will (should...) cater for even some bizarre stuff like having data
5421 // for the same channel in multiple packets.
5422 do {
5423 stbi__pic_packet *packet;
5425 if (num_packets==sizeof(packets)/sizeof(packets[0]))
5426 return stbi__errpuc("bad format","too many packets");
5428 packet = &packets[num_packets++];
5430 chained = stbi__get8(s);
5431 packet->size = stbi__get8(s);
5432 packet->type = stbi__get8(s);
5433 packet->channel = stbi__get8(s);
5435 act_comp |= packet->channel;
5437 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
5438 if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
5439 } while (chained);
5441 *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5443 for(y=0; y<height; ++y) {
5444 int packet_idx;
5446 for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5447 stbi__pic_packet *packet = &packets[packet_idx];
5448 stbi_uc *dest = result+y*width*4;
5450 switch (packet->type) {
5451 default:
5452 return stbi__errpuc("bad format","packet has bad compression type");
5454 case 0: {//uncompressed
5455 int x;
5457 for(x=0;x<width;++x, dest+=4)
5458 if (!stbi__readval(s,packet->channel,dest))
5459 return 0;
5460 break;
5463 case 1://Pure RLE
5465 int left=width, i;
5467 while (left>0) {
5468 stbi_uc count,value[4];
5470 count=stbi__get8(s);
5471 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
5473 if (count > left)
5474 count = (stbi_uc) left;
5476 if (!stbi__readval(s,packet->channel,value)) return 0;
5478 for(i=0; i<count; ++i,dest+=4)
5479 stbi__copyval(packet->channel,dest,value);
5480 left -= count;
5483 break;
5485 case 2: {//Mixed RLE
5486 int left=width;
5487 while (left>0) {
5488 int count = stbi__get8(s), i;
5489 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
5491 if (count >= 128) { // Repeated
5492 stbi_uc value[4];
5494 if (count==128)
5495 count = stbi__get16be(s);
5496 else
5497 count -= 127;
5498 if (count > left)
5499 return stbi__errpuc("bad file","scanline overrun");
5501 if (!stbi__readval(s,packet->channel,value))
5502 return 0;
5504 for(i=0;i<count;++i, dest += 4)
5505 stbi__copyval(packet->channel,dest,value);
5506 } else { // Raw
5507 ++count;
5508 if (count>left) return stbi__errpuc("bad file","scanline overrun");
5510 for(i=0;i<count;++i, dest+=4)
5511 if (!stbi__readval(s,packet->channel,dest))
5512 return 0;
5514 left-=count;
5516 break;
5522 return result;
5525 static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5527 stbi_uc *result;
5528 int i, x,y;
5530 for (i=0; i<92; ++i)
5531 stbi__get8(s);
5533 x = stbi__get16be(s);
5534 y = stbi__get16be(s);
5535 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
5536 if ((1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5538 stbi__get32be(s); //skip `ratio'
5539 stbi__get16be(s); //skip `fields'
5540 stbi__get16be(s); //skip `pad'
5542 // intermediate buffer is RGBA
5543 result = (stbi_uc *) stbi__malloc(x*y*4);
5544 memset(result, 0xff, x*y*4);
5546 if (!stbi__pic_load_core(s,x,y,comp, result)) {
5547 STBI_FREE(result);
5548 result=0;
5550 *px = x;
5551 *py = y;
5552 if (req_comp == 0) req_comp = *comp;
5553 result=stbi__convert_format(result,4,req_comp,x,y);
5555 return result;
5558 static int stbi__pic_test(stbi__context *s)
5560 int r = stbi__pic_test_core(s);
5561 stbi__rewind(s);
5562 return r;
5564 #endif
5566 // *************************************************************************************************
5567 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5569 #ifndef STBI_NO_GIF
5570 typedef struct
5572 stbi__int16 prefix;
5573 stbi_uc first;
5574 stbi_uc suffix;
5575 } stbi__gif_lzw;
5577 typedef struct
5579 int w,h;
5580 stbi_uc *out, *old_out; // output buffer (always 4 components)
5581 int flags, bgindex, ratio, transparent, eflags, delay;
5582 stbi_uc pal[256][4];
5583 stbi_uc lpal[256][4];
5584 stbi__gif_lzw codes[4096];
5585 stbi_uc *color_table;
5586 int parse, step;
5587 int lflags;
5588 int start_x, start_y;
5589 int max_x, max_y;
5590 int cur_x, cur_y;
5591 int line_size;
5592 } stbi__gif;
5594 static int stbi__gif_test_raw(stbi__context *s)
5596 int sz;
5597 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5598 sz = stbi__get8(s);
5599 if (sz != '9' && sz != '7') return 0;
5600 if (stbi__get8(s) != 'a') return 0;
5601 return 1;
5604 static int stbi__gif_test(stbi__context *s)
5606 int r = stbi__gif_test_raw(s);
5607 stbi__rewind(s);
5608 return r;
5611 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5613 int i;
5614 for (i=0; i < num_entries; ++i) {
5615 pal[i][2] = stbi__get8(s);
5616 pal[i][1] = stbi__get8(s);
5617 pal[i][0] = stbi__get8(s);
5618 pal[i][3] = transp == i ? 0 : 255;
5622 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5624 stbi_uc version;
5625 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5626 return stbi__err("not GIF", "Corrupt GIF");
5628 version = stbi__get8(s);
5629 if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
5630 if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
5632 stbi__g_failure_reason = "";
5633 g->w = stbi__get16le(s);
5634 g->h = stbi__get16le(s);
5635 g->flags = stbi__get8(s);
5636 g->bgindex = stbi__get8(s);
5637 g->ratio = stbi__get8(s);
5638 g->transparent = -1;
5640 if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
5642 if (is_info) return 1;
5644 if (g->flags & 0x80)
5645 stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5647 return 1;
5650 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5652 stbi__gif g;
5653 if (!stbi__gif_header(s, &g, comp, 1)) {
5654 stbi__rewind( s );
5655 return 0;
5657 if (x) *x = g.w;
5658 if (y) *y = g.h;
5659 return 1;
5662 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5664 stbi_uc *p, *c;
5666 // recurse to decode the prefixes, since the linked-list is backwards,
5667 // and working backwards through an interleaved image would be nasty
5668 if (g->codes[code].prefix >= 0)
5669 stbi__out_gif_code(g, g->codes[code].prefix);
5671 if (g->cur_y >= g->max_y) return;
5673 p = &g->out[g->cur_x + g->cur_y];
5674 c = &g->color_table[g->codes[code].suffix * 4];
5676 if (c[3] >= 128) {
5677 p[0] = c[2];
5678 p[1] = c[1];
5679 p[2] = c[0];
5680 p[3] = c[3];
5682 g->cur_x += 4;
5684 if (g->cur_x >= g->max_x) {
5685 g->cur_x = g->start_x;
5686 g->cur_y += g->step;
5688 while (g->cur_y >= g->max_y && g->parse > 0) {
5689 g->step = (1 << g->parse) * g->line_size;
5690 g->cur_y = g->start_y + (g->step >> 1);
5691 --g->parse;
5696 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5698 stbi_uc lzw_cs;
5699 stbi__int32 len, init_code;
5700 stbi__uint32 first;
5701 stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5702 stbi__gif_lzw *p;
5704 lzw_cs = stbi__get8(s);
5705 if (lzw_cs > 12) return NULL;
5706 clear = 1 << lzw_cs;
5707 first = 1;
5708 codesize = lzw_cs + 1;
5709 codemask = (1 << codesize) - 1;
5710 bits = 0;
5711 valid_bits = 0;
5712 for (init_code = 0; init_code < clear; init_code++) {
5713 g->codes[init_code].prefix = -1;
5714 g->codes[init_code].first = (stbi_uc) init_code;
5715 g->codes[init_code].suffix = (stbi_uc) init_code;
5718 // support no starting clear code
5719 avail = clear+2;
5720 oldcode = -1;
5722 len = 0;
5723 for(;;) {
5724 if (valid_bits < codesize) {
5725 if (len == 0) {
5726 len = stbi__get8(s); // start new block
5727 if (len == 0)
5728 return g->out;
5730 --len;
5731 bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5732 valid_bits += 8;
5733 } else {
5734 stbi__int32 code = bits & codemask;
5735 bits >>= codesize;
5736 valid_bits -= codesize;
5737 // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5738 if (code == clear) { // clear code
5739 codesize = lzw_cs + 1;
5740 codemask = (1 << codesize) - 1;
5741 avail = clear + 2;
5742 oldcode = -1;
5743 first = 0;
5744 } else if (code == clear + 1) { // end of stream code
5745 stbi__skip(s, len);
5746 while ((len = stbi__get8(s)) > 0)
5747 stbi__skip(s,len);
5748 return g->out;
5749 } else if (code <= avail) {
5750 if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5752 if (oldcode >= 0) {
5753 p = &g->codes[avail++];
5754 if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
5755 p->prefix = (stbi__int16) oldcode;
5756 p->first = g->codes[oldcode].first;
5757 p->suffix = (code == avail) ? p->first : g->codes[code].first;
5758 } else if (code == avail)
5759 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5761 stbi__out_gif_code(g, (stbi__uint16) code);
5763 if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5764 codesize++;
5765 codemask = (1 << codesize) - 1;
5768 oldcode = code;
5769 } else {
5770 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5776 static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
5778 int x, y;
5779 stbi_uc *c = g->pal[g->bgindex];
5780 for (y = y0; y < y1; y += 4 * g->w) {
5781 for (x = x0; x < x1; x += 4) {
5782 stbi_uc *p = &g->out[y + x];
5783 p[0] = c[2];
5784 p[1] = c[1];
5785 p[2] = c[0];
5786 p[3] = 0;
5791 // this function is designed to support animated gifs, although stb_image doesn't support it
5792 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5794 int i;
5795 stbi_uc *prev_out = 0;
5797 if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
5798 return 0; // stbi__g_failure_reason set by stbi__gif_header
5800 prev_out = g->out;
5801 g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5802 if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5804 switch ((g->eflags & 0x1C) >> 2) {
5805 case 0: // unspecified (also always used on 1st frame)
5806 stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
5807 break;
5808 case 1: // do not dispose
5809 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5810 g->old_out = prev_out;
5811 break;
5812 case 2: // dispose to background
5813 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5814 stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
5815 break;
5816 case 3: // dispose to previous
5817 if (g->old_out) {
5818 for (i = g->start_y; i < g->max_y; i += 4 * g->w)
5819 memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
5821 break;
5824 for (;;) {
5825 switch (stbi__get8(s)) {
5826 case 0x2C: /* Image Descriptor */
5828 int prev_trans = -1;
5829 stbi__int32 x, y, w, h;
5830 stbi_uc *o;
5832 x = stbi__get16le(s);
5833 y = stbi__get16le(s);
5834 w = stbi__get16le(s);
5835 h = stbi__get16le(s);
5836 if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5837 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5839 g->line_size = g->w * 4;
5840 g->start_x = x * 4;
5841 g->start_y = y * g->line_size;
5842 g->max_x = g->start_x + w * 4;
5843 g->max_y = g->start_y + h * g->line_size;
5844 g->cur_x = g->start_x;
5845 g->cur_y = g->start_y;
5847 g->lflags = stbi__get8(s);
5849 if (g->lflags & 0x40) {
5850 g->step = 8 * g->line_size; // first interlaced spacing
5851 g->parse = 3;
5852 } else {
5853 g->step = g->line_size;
5854 g->parse = 0;
5857 if (g->lflags & 0x80) {
5858 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
5859 g->color_table = (stbi_uc *) g->lpal;
5860 } else if (g->flags & 0x80) {
5861 if (g->transparent >= 0 && (g->eflags & 0x01)) {
5862 prev_trans = g->pal[g->transparent][3];
5863 g->pal[g->transparent][3] = 0;
5865 g->color_table = (stbi_uc *) g->pal;
5866 } else
5867 return stbi__errpuc("missing color table", "Corrupt GIF");
5869 o = stbi__process_gif_raster(s, g);
5870 if (o == NULL) return NULL;
5872 if (prev_trans != -1)
5873 g->pal[g->transparent][3] = (stbi_uc) prev_trans;
5875 return o;
5878 case 0x21: // Comment Extension.
5880 int len;
5881 if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
5882 len = stbi__get8(s);
5883 if (len == 4) {
5884 g->eflags = stbi__get8(s);
5885 g->delay = stbi__get16le(s);
5886 g->transparent = stbi__get8(s);
5887 } else {
5888 stbi__skip(s, len);
5889 break;
5892 while ((len = stbi__get8(s)) != 0)
5893 stbi__skip(s, len);
5894 break;
5897 case 0x3B: // gif stream termination code
5898 return (stbi_uc *) s; // using '1' causes warning on some compilers
5900 default:
5901 return stbi__errpuc("unknown code", "Corrupt GIF");
5905 STBI_NOTUSED(req_comp);
5908 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5910 stbi_uc *u = 0;
5911 stbi__gif g;
5912 memset(&g, 0, sizeof(g));
5914 u = stbi__gif_load_next(s, &g, comp, req_comp);
5915 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
5916 if (u) {
5917 *x = g.w;
5918 *y = g.h;
5919 if (req_comp && req_comp != 4)
5920 u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
5922 else if (g.out)
5923 STBI_FREE(g.out);
5925 return u;
5928 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
5930 return stbi__gif_info_raw(s,x,y,comp);
5932 #endif
5934 // *************************************************************************************************
5935 // Radiance RGBE HDR loader
5936 // originally by Nicolas Schulz
5937 #ifndef STBI_NO_HDR
5938 static int stbi__hdr_test_core(stbi__context *s)
5940 const char *signature = "#?RADIANCE\n";
5941 int i;
5942 for (i=0; signature[i]; ++i)
5943 if (stbi__get8(s) != signature[i])
5944 return 0;
5945 return 1;
5948 static int stbi__hdr_test(stbi__context* s)
5950 int r = stbi__hdr_test_core(s);
5951 stbi__rewind(s);
5952 return r;
5955 #define STBI__HDR_BUFLEN 1024
5956 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
5958 int len=0;
5959 char c = '\0';
5961 c = (char) stbi__get8(z);
5963 while (!stbi__at_eof(z) && c != '\n') {
5964 buffer[len++] = c;
5965 if (len == STBI__HDR_BUFLEN-1) {
5966 // flush to end of line
5967 while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
5969 break;
5971 c = (char) stbi__get8(z);
5974 buffer[len] = 0;
5975 return buffer;
5978 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
5980 if ( input[3] != 0 ) {
5981 float f1;
5982 // Exponent
5983 f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
5984 if (req_comp <= 2)
5985 output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
5986 else {
5987 output[0] = input[0] * f1;
5988 output[1] = input[1] * f1;
5989 output[2] = input[2] * f1;
5991 if (req_comp == 2) output[1] = 1;
5992 if (req_comp == 4) output[3] = 1;
5993 } else {
5994 switch (req_comp) {
5995 case 4: output[3] = 1; /* fallthrough */
5996 case 3: output[0] = output[1] = output[2] = 0;
5997 break;
5998 case 2: output[1] = 1; /* fallthrough */
5999 case 1: output[0] = 0;
6000 break;
6005 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6007 char buffer[STBI__HDR_BUFLEN];
6008 char *token;
6009 int valid = 0;
6010 int width, height;
6011 stbi_uc *scanline;
6012 float *hdr_data;
6013 int len;
6014 unsigned char count, value;
6015 int i, j, k, c1,c2, z;
6018 // Check identifier
6019 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
6020 return stbi__errpf("not HDR", "Corrupt HDR image");
6022 // Parse header
6023 for(;;) {
6024 token = stbi__hdr_gettoken(s,buffer);
6025 if (token[0] == 0) break;
6026 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6029 if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
6031 // Parse width and height
6032 // can't use sscanf() if we're not using stdio!
6033 token = stbi__hdr_gettoken(s,buffer);
6034 if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6035 token += 3;
6036 height = (int) strtol(token, &token, 10);
6037 while (*token == ' ') ++token;
6038 if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6039 token += 3;
6040 width = (int) strtol(token, NULL, 10);
6042 *x = width;
6043 *y = height;
6045 if (comp) *comp = 3;
6046 if (req_comp == 0) req_comp = 3;
6048 // Read data
6049 hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
6051 // Load image data
6052 // image data is stored as some number of sca
6053 if ( width < 8 || width >= 32768) {
6054 // Read flat data
6055 for (j=0; j < height; ++j) {
6056 for (i=0; i < width; ++i) {
6057 stbi_uc rgbe[4];
6058 main_decode_loop:
6059 stbi__getn(s, rgbe, 4);
6060 stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
6063 } else {
6064 // Read RLE-encoded data
6065 scanline = NULL;
6067 for (j = 0; j < height; ++j) {
6068 c1 = stbi__get8(s);
6069 c2 = stbi__get8(s);
6070 len = stbi__get8(s);
6071 if (c1 != 2 || c2 != 2 || (len & 0x80)) {
6072 // not run-length encoded, so we have to actually use THIS data as a decoded
6073 // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
6074 stbi_uc rgbe[4];
6075 rgbe[0] = (stbi_uc) c1;
6076 rgbe[1] = (stbi_uc) c2;
6077 rgbe[2] = (stbi_uc) len;
6078 rgbe[3] = (stbi_uc) stbi__get8(s);
6079 stbi__hdr_convert(hdr_data, rgbe, req_comp);
6080 i = 1;
6081 j = 0;
6082 STBI_FREE(scanline);
6083 goto main_decode_loop; // yes, this makes no sense
6085 len <<= 8;
6086 len |= stbi__get8(s);
6087 if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
6088 if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);
6090 for (k = 0; k < 4; ++k) {
6091 i = 0;
6092 while (i < width) {
6093 count = stbi__get8(s);
6094 if (count > 128) {
6095 // Run
6096 value = stbi__get8(s);
6097 count -= 128;
6098 for (z = 0; z < count; ++z)
6099 scanline[i++ * 4 + k] = value;
6100 } else {
6101 // Dump
6102 for (z = 0; z < count; ++z)
6103 scanline[i++ * 4 + k] = stbi__get8(s);
6107 for (i=0; i < width; ++i)
6108 stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
6110 STBI_FREE(scanline);
6113 return hdr_data;
6116 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
6118 char buffer[STBI__HDR_BUFLEN];
6119 char *token;
6120 int valid = 0;
6122 if (stbi__hdr_test(s) == 0) {
6123 stbi__rewind( s );
6124 return 0;
6127 for(;;) {
6128 token = stbi__hdr_gettoken(s,buffer);
6129 if (token[0] == 0) break;
6130 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6133 if (!valid) {
6134 stbi__rewind( s );
6135 return 0;
6137 token = stbi__hdr_gettoken(s,buffer);
6138 if (strncmp(token, "-Y ", 3)) {
6139 stbi__rewind( s );
6140 return 0;
6142 token += 3;
6143 *y = (int) strtol(token, &token, 10);
6144 while (*token == ' ') ++token;
6145 if (strncmp(token, "+X ", 3)) {
6146 stbi__rewind( s );
6147 return 0;
6149 token += 3;
6150 *x = (int) strtol(token, NULL, 10);
6151 *comp = 3;
6152 return 1;
6154 #endif // STBI_NO_HDR
6156 #ifndef STBI_NO_BMP
6157 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
6159 void *p;
6160 stbi__bmp_data info;
6162 info.all_a = 255;
6163 p = stbi__bmp_parse_header(s, &info);
6164 stbi__rewind( s );
6165 if (p == NULL)
6166 return 0;
6167 *x = s->img_x;
6168 *y = s->img_y;
6169 *comp = info.ma ? 4 : 3;
6170 return 1;
6172 #endif
6174 #ifndef STBI_NO_PSD
6175 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6177 int channelCount;
6178 if (stbi__get32be(s) != 0x38425053) {
6179 stbi__rewind( s );
6180 return 0;
6182 if (stbi__get16be(s) != 1) {
6183 stbi__rewind( s );
6184 return 0;
6186 stbi__skip(s, 6);
6187 channelCount = stbi__get16be(s);
6188 if (channelCount < 0 || channelCount > 16) {
6189 stbi__rewind( s );
6190 return 0;
6192 *y = stbi__get32be(s);
6193 *x = stbi__get32be(s);
6194 if (stbi__get16be(s) != 8) {
6195 stbi__rewind( s );
6196 return 0;
6198 if (stbi__get16be(s) != 3) {
6199 stbi__rewind( s );
6200 return 0;
6202 *comp = 4;
6203 return 1;
6205 #endif
6207 #ifndef STBI_NO_PIC
6208 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6210 int act_comp=0,num_packets=0,chained;
6211 stbi__pic_packet packets[10];
6213 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
6214 stbi__rewind(s);
6215 return 0;
6218 stbi__skip(s, 88);
6220 *x = stbi__get16be(s);
6221 *y = stbi__get16be(s);
6222 if (stbi__at_eof(s)) {
6223 stbi__rewind( s);
6224 return 0;
6226 if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6227 stbi__rewind( s );
6228 return 0;
6231 stbi__skip(s, 8);
6233 do {
6234 stbi__pic_packet *packet;
6236 if (num_packets==sizeof(packets)/sizeof(packets[0]))
6237 return 0;
6239 packet = &packets[num_packets++];
6240 chained = stbi__get8(s);
6241 packet->size = stbi__get8(s);
6242 packet->type = stbi__get8(s);
6243 packet->channel = stbi__get8(s);
6244 act_comp |= packet->channel;
6246 if (stbi__at_eof(s)) {
6247 stbi__rewind( s );
6248 return 0;
6250 if (packet->size != 8) {
6251 stbi__rewind( s );
6252 return 0;
6254 } while (chained);
6256 *comp = (act_comp & 0x10 ? 4 : 3);
6258 return 1;
6260 #endif
6262 // *************************************************************************************************
6263 // Portable Gray Map and Portable Pixel Map loader
6264 // by Ken Miller
6266 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
6267 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
6269 // Known limitations:
6270 // Does not support comments in the header section
6271 // Does not support ASCII image data (formats P2 and P3)
6272 // Does not support 16-bit-per-channel
6274 #ifndef STBI_NO_PNM
6276 static int stbi__pnm_test(stbi__context *s)
6278 char p, t;
6279 p = (char) stbi__get8(s);
6280 t = (char) stbi__get8(s);
6281 if (p != 'P' || (t != '5' && t != '6')) {
6282 stbi__rewind( s );
6283 return 0;
6285 return 1;
6288 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6290 stbi_uc *out;
6291 if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
6292 return 0;
6293 *x = s->img_x;
6294 *y = s->img_y;
6295 *comp = s->img_n;
6297 out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
6298 if (!out) return stbi__errpuc("outofmem", "Out of memory");
6299 stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
6301 if (req_comp && req_comp != s->img_n) {
6302 out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
6303 if (out == NULL) return out; // stbi__convert_format frees input on failure
6305 return out;
6308 static int stbi__pnm_isspace(char c)
6310 return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
6313 static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
6315 for (;;) {
6316 while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
6317 *c = (char) stbi__get8(s);
6319 if (stbi__at_eof(s) || *c != '#')
6320 break;
6322 while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
6323 *c = (char) stbi__get8(s);
6327 static int stbi__pnm_isdigit(char c)
6329 return c >= '0' && c <= '9';
6332 static int stbi__pnm_getinteger(stbi__context *s, char *c)
6334 int value = 0;
6336 while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
6337 value = value*10 + (*c - '0');
6338 *c = (char) stbi__get8(s);
6341 return value;
6344 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
6346 int maxv;
6347 char c, p, t;
6349 stbi__rewind( s );
6351 // Get identifier
6352 p = (char) stbi__get8(s);
6353 t = (char) stbi__get8(s);
6354 if (p != 'P' || (t != '5' && t != '6')) {
6355 stbi__rewind( s );
6356 return 0;
6359 *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
6361 c = (char) stbi__get8(s);
6362 stbi__pnm_skip_whitespace(s, &c);
6364 *x = stbi__pnm_getinteger(s, &c); // read width
6365 stbi__pnm_skip_whitespace(s, &c);
6367 *y = stbi__pnm_getinteger(s, &c); // read height
6368 stbi__pnm_skip_whitespace(s, &c);
6370 maxv = stbi__pnm_getinteger(s, &c); // read max value
6372 if (maxv > 255)
6373 return stbi__err("max value > 255", "PPM image not 8-bit");
6374 else
6375 return 1;
6377 #endif
6379 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
6381 #ifndef STBI_NO_JPEG
6382 if (stbi__jpeg_info(s, x, y, comp)) return 1;
6383 #endif
6385 #ifndef STBI_NO_PNG
6386 if (stbi__png_info(s, x, y, comp)) return 1;
6387 #endif
6389 #ifndef STBI_NO_GIF
6390 if (stbi__gif_info(s, x, y, comp)) return 1;
6391 #endif
6393 #ifndef STBI_NO_BMP
6394 if (stbi__bmp_info(s, x, y, comp)) return 1;
6395 #endif
6397 #ifndef STBI_NO_PSD
6398 if (stbi__psd_info(s, x, y, comp)) return 1;
6399 #endif
6401 #ifndef STBI_NO_PIC
6402 if (stbi__pic_info(s, x, y, comp)) return 1;
6403 #endif
6405 #ifndef STBI_NO_PNM
6406 if (stbi__pnm_info(s, x, y, comp)) return 1;
6407 #endif
6409 #ifndef STBI_NO_HDR
6410 if (stbi__hdr_info(s, x, y, comp)) return 1;
6411 #endif
6413 // test tga last because it's a crappy test!
6414 #ifndef STBI_NO_TGA
6415 if (stbi__tga_info(s, x, y, comp))
6416 return 1;
6417 #endif
6418 return stbi__err("unknown image type", "Image not of any known type, or corrupt");
6421 #ifndef STBI_NO_STDIO
6422 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
6424 FILE *f = stbi__fopen(filename, "rb");
6425 int result;
6426 if (!f) return stbi__err("can't fopen", "Unable to open file");
6427 result = stbi_info_from_file(f, x, y, comp);
6428 fclose(f);
6429 return result;
6432 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
6434 int r;
6435 stbi__context s;
6436 long pos = ftell(f);
6437 stbi__start_file(&s, f);
6438 r = stbi__info_main(&s,x,y,comp);
6439 fseek(f,pos,SEEK_SET);
6440 return r;
6442 #endif // !STBI_NO_STDIO
6444 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
6446 stbi__context s;
6447 stbi__start_mem(&s,buffer,len);
6448 return stbi__info_main(&s,x,y,comp);
6451 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
6453 stbi__context s;
6454 stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
6455 return stbi__info_main(&s,x,y,comp);
6458 #endif // STB_IMAGE_IMPLEMENTATION
6461 revision history:
6462 2.10 (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
6463 2.09 (2016-01-16) allow comments in PNM files
6464 16-bit-per-pixel TGA (not bit-per-component)
6465 info() for TGA could break due to .hdr handling
6466 info() for BMP to shares code instead of sloppy parse
6467 can use STBI_REALLOC_SIZED if allocator doesn't support realloc
6468 code cleanup
6469 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
6470 2.07 (2015-09-13) fix compiler warnings
6471 partial animated GIF support
6472 limited 16-bpc PSD support
6473 #ifdef unused functions
6474 bug with < 92 byte PIC,PNM,HDR,TGA
6475 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
6476 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
6477 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
6478 2.03 (2015-04-12) extra corruption checking (mmozeiko)
6479 stbi_set_flip_vertically_on_load (nguillemot)
6480 fix NEON support; fix mingw support
6481 2.02 (2015-01-19) fix incorrect assert, fix warning
6482 2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
6483 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
6484 2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
6485 progressive JPEG (stb)
6486 PGM/PPM support (Ken Miller)
6487 STBI_MALLOC,STBI_REALLOC,STBI_FREE
6488 GIF bugfix -- seemingly never worked
6489 STBI_NO_*, STBI_ONLY_*
6490 1.48 (2014-12-14) fix incorrectly-named assert()
6491 1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
6492 optimize PNG (ryg)
6493 fix bug in interlaced PNG with user-specified channel count (stb)
6494 1.46 (2014-08-26)
6495 fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
6496 1.45 (2014-08-16)
6497 fix MSVC-ARM internal compiler error by wrapping malloc
6498 1.44 (2014-08-07)
6499 various warning fixes from Ronny Chevalier
6500 1.43 (2014-07-15)
6501 fix MSVC-only compiler problem in code changed in 1.42
6502 1.42 (2014-07-09)
6503 don't define _CRT_SECURE_NO_WARNINGS (affects user code)
6504 fixes to stbi__cleanup_jpeg path
6505 added STBI_ASSERT to avoid requiring assert.h
6506 1.41 (2014-06-25)
6507 fix search&replace from 1.36 that messed up comments/error messages
6508 1.40 (2014-06-22)
6509 fix gcc struct-initialization warning
6510 1.39 (2014-06-15)
6511 fix to TGA optimization when req_comp != number of components in TGA;
6512 fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
6513 add support for BMP version 5 (more ignored fields)
6514 1.38 (2014-06-06)
6515 suppress MSVC warnings on integer casts truncating values
6516 fix accidental rename of 'skip' field of I/O
6517 1.37 (2014-06-04)
6518 remove duplicate typedef
6519 1.36 (2014-06-03)
6520 convert to header file single-file library
6521 if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
6522 1.35 (2014-05-27)
6523 various warnings
6524 fix broken STBI_SIMD path
6525 fix bug where stbi_load_from_file no longer left file pointer in correct place
6526 fix broken non-easy path for 32-bit BMP (possibly never used)
6527 TGA optimization by Arseny Kapoulkine
6528 1.34 (unknown)
6529 use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
6530 1.33 (2011-07-14)
6531 make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
6532 1.32 (2011-07-13)
6533 support for "info" function for all supported filetypes (SpartanJ)
6534 1.31 (2011-06-20)
6535 a few more leak fixes, bug in PNG handling (SpartanJ)
6536 1.30 (2011-06-11)
6537 added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
6538 removed deprecated format-specific test/load functions
6539 removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
6540 error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
6541 fix inefficiency in decoding 32-bit BMP (David Woo)
6542 1.29 (2010-08-16)
6543 various warning fixes from Aurelien Pocheville
6544 1.28 (2010-08-01)
6545 fix bug in GIF palette transparency (SpartanJ)
6546 1.27 (2010-08-01)
6547 cast-to-stbi_uc to fix warnings
6548 1.26 (2010-07-24)
6549 fix bug in file buffering for PNG reported by SpartanJ
6550 1.25 (2010-07-17)
6551 refix trans_data warning (Won Chun)
6552 1.24 (2010-07-12)
6553 perf improvements reading from files on platforms with lock-heavy fgetc()
6554 minor perf improvements for jpeg
6555 deprecated type-specific functions so we'll get feedback if they're needed
6556 attempt to fix trans_data warning (Won Chun)
6557 1.23 fixed bug in iPhone support
6558 1.22 (2010-07-10)
6559 removed image *writing* support
6560 stbi_info support from Jetro Lauha
6561 GIF support from Jean-Marc Lienher
6562 iPhone PNG-extensions from James Brown
6563 warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
6564 1.21 fix use of 'stbi_uc' in header (reported by jon blow)
6565 1.20 added support for Softimage PIC, by Tom Seddon
6566 1.19 bug in interlaced PNG corruption check (found by ryg)
6567 1.18 (2008-08-02)
6568 fix a threading bug (local mutable static)
6569 1.17 support interlaced PNG
6570 1.16 major bugfix - stbi__convert_format converted one too many pixels
6571 1.15 initialize some fields for thread safety
6572 1.14 fix threadsafe conversion bug
6573 header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
6574 1.13 threadsafe
6575 1.12 const qualifiers in the API
6576 1.11 Support installable IDCT, colorspace conversion routines
6577 1.10 Fixes for 64-bit (don't use "unsigned long")
6578 optimized upsampling by Fabian "ryg" Giesen
6579 1.09 Fix format-conversion for PSD code (bad global variables!)
6580 1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz
6581 1.07 attempt to fix C++ warning/errors again
6582 1.06 attempt to fix C++ warning/errors again
6583 1.05 fix TGA loading to return correct *comp and use good luminance calc
6584 1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free
6585 1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR
6586 1.02 support for (subset of) HDR files, float interface for preferred access to them
6587 1.01 fix bug: possible bug in handling right-side up bmps... not sure
6588 fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
6589 1.00 interface to zlib that skips zlib header
6590 0.99 correct handling of alpha in palette
6591 0.98 TGA loader by lonesock; dynamically add loaders (untested)
6592 0.97 jpeg errors on too large a file; also catch another malloc failure
6593 0.96 fix detection of invalid v value - particleman@mollyrocket forum
6594 0.95 during header scan, seek to markers in case of padding
6595 0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same
6596 0.93 handle jpegtran output; verbose errors
6597 0.92 read 4,8,16,24,32-bit BMP files of several formats
6598 0.91 output 24-bit Windows 3.0 BMP files
6599 0.90 fix a few more warnings; bump version number to approach 1.0
6600 0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd
6601 0.60 fix compiling as c++
6602 0.59 fix warnings: merge Dave Moore's -Wall fixes
6603 0.58 fix bug: zlib uncompressed mode len/nlen was wrong endian
6604 0.57 fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
6605 0.56 fix bug: zlib uncompressed mode len vs. nlen
6606 0.55 fix bug: restart_interval not initialized to 0
6607 0.54 allow NULL for 'int *comp'
6608 0.53 fix bug in png 3->4; speedup png decoding
6609 0.52 png handles req_comp=3,4 directly; minor cleanup; jpeg comments
6610 0.51 obey req_comp requests, 1-component jpegs return as 1-component,
6611 on 'test' only check type, not whether we support this variant
6612 0.50 (2006-11-19)
6613 first released version