Update readme
[small3dlib.git] / small3dlib.h
blob4b7736524182ff875e4de93faa1fa63edd537f67
1 #ifndef SMALL3DLIB_H
2 #define SMALL3DLIB_H
4 /*
5 Simple realtime 3D software rasterization renderer. It is fast, focused on
6 resource-limited computers, located in a single C header file, with no
7 dependencies, using only 32bit integer arithmetics.
9 author: Miloslav Ciz
10 license: CC0 1.0 (public domain)
11 found at https://creativecommons.org/publicdomain/zero/1.0/
12 + additional waiver of all IP
13 version: 0.905d
15 Before including the library, define S3L_PIXEL_FUNCTION to the name of the
16 function you'll be using to draw single pixels (this function will be called
17 by the library to render the frames). Also either init S3L_resolutionX and
18 S3L_resolutionY or define S3L_RESOLUTION_X and S3L_RESOLUTION_Y.
20 You'll also need to decide what rendering strategy and other settings you
21 want to use, depending on your specific usecase. You may want to use a
22 z-buffer (full or reduced, S3L_Z_BUFFER), sorted-drawing (S3L_SORT), or even
23 none of these. See the description of the options in this file.
25 The rendering itself is done with S3L_drawScene, usually preceded by
26 S3L_newFrame (for clearing zBuffer etc.).
28 The library is meant to be used in not so huge programs that use single
29 translation unit and so includes both declarations and implementation at once.
30 If you for some reason use multiple translation units (which include the
31 library), you'll have to handle this yourself (e.g. create a wrapper, manually
32 split the library into .c and .h etc.).
34 --------------------
36 This work's goal is to never be encumbered by any exclusive intellectual
37 property rights. The work is therefore provided under CC0 1.0 + additional
38 WAIVER OF ALL INTELLECTUAL PROPERTY RIGHTS that waives the rest of
39 intellectual property rights not already waived by CC0 1.0. The WAIVER OF ALL
40 INTELLECTUAL PROPERTY RGHTS is as follows:
42 Each contributor to this work agrees that they waive any exclusive rights,
43 including but not limited to copyright, patents, trademark, trade dress,
44 industrial design, plant varieties and trade secrets, to any and all ideas,
45 concepts, processes, discoveries, improvements and inventions conceived,
46 discovered, made, designed, researched or developed by the contributor either
47 solely or jointly with others, which relate to this work or result from this
48 work. Should any waiver of such right be judged legally invalid or
49 ineffective under applicable law, the contributor hereby grants to each
50 affected person a royalty-free, non transferable, non sublicensable, non
51 exclusive, irrevocable and unconditional license to this right.
53 --------------------
55 CONVENTIONS:
57 This library should never draw pixels outside the specified screen
58 boundaries, so you don't have to check this (that would cost CPU time)!
60 You can safely assume that triangles are rasterized one by one and from top
61 down, left to right (so you can utilize e.g. various caches), and if sorting
62 is disabled the order of rasterization will be that specified in the scene
63 structure and model arrays (of course, some triangles and models may be
64 skipped due to culling etc.).
66 Angles are in S3L_Units, a full angle (2 pi) is S3L_FRACTIONS_PER_UNITs.
68 We use row vectors.
70 In 3D space, a left-handed coord. system is used. One spatial unit is split
71 into S3L_FRACTIONS_PER_UNITs fractions (fixed point arithmetic).
73 y ^
74 | _
75 | /| z
76 | /
77 | /
78 [0,0,0]-------> x
80 Untransformed camera is placed at [0,0,0], looking forward along +z axis. The
81 projection plane is centered at [0,0,0], stretrinch from
82 -S3L_FRACTIONS_PER_UNIT to S3L_FRACTIONS_PER_UNIT horizontally (x),
83 vertical size (y) depends on the aspect ratio (S3L_RESOLUTION_X and
84 S3L_RESOLUTION_Y). Camera FOV is defined by focal length in S3L_Units.
86 Rotations use Euler angles and are generally in the extrinsic Euler angles in
87 ZXY order (by Z, then by X, then by Y). Positive rotation about an axis
88 rotates CW (clock-wise) when looking in the direction of the axis.
90 Coordinates of pixels on the screen start at the top left, from [0,0].
92 There is NO subpixel accuracy (screen coordinates are only integer).
94 Triangle rasterization rules are these (mostly same as OpenGL, D3D etc.):
96 - Let's define:
97 - left side:
98 - not exactly horizontal, and on the left side of triangle
99 - exactly horizontal and above the topmost
100 (in other words: its normal points at least a little to the left or
101 completely up)
102 - right side: not left side
103 - Pixel centers are at integer coordinates and triangle for drawing are
104 specified with integer coordinates of pixel centers.
105 - A pixel is rasterized:
106 - if its center is inside the triangle OR
107 - if its center is exactly on the triangle side which is left and at the
108 same time is not on the side that's right (case of a triangle that's on
109 a single line) OR
110 - if its center is exactly on the triangle corner of sides neither of which
111 is right.
113 These rules imply among others:
115 - Adjacent triangles don't have any overlapping pixels, nor gaps between.
116 - Triangles of points that lie on a single line are NOT rasterized.
117 - A single "long" triangle CAN be rasterized as isolated islands of pixels.
118 - Transforming (e.g. mirroring, rotating by 90 degrees etc.) a result of
119 rasterizing triangle A is NOT generally equal to applying the same
120 transformation to triangle A first and then rasterizing it. Even the number
121 of rasterized pixels is usually different.
122 - If specifying a triangle with integer coordinates (which we are), then:
123 - The bottom-most corner (or side) of a triangle is never rasterized
124 (because it is connected to a right side).
125 - The top-most corner can only be rasterized on completely horizontal side
126 (otherwise it is connected to a right side).
127 - Vertically middle corner is rasterized if and only if it is on the left
128 of the triangle and at the same time is also not the bottom-most corner.
131 #include <stdint.h>
133 #ifdef S3L_RESOLUTION_X
134 #ifdef S3L_RESOLUTION_Y
135 #define S3L_MAX_PIXELS (S3L_RESOLUTION_X * S3L_RESOLUTION_Y)
136 #endif
137 #endif
139 #ifndef S3L_RESOLUTION_X
140 #ifndef S3L_MAX_PIXELS
141 #error Dynamic resolution set (S3L_RESOLUTION_X not defined), but\
142 S3L_MAX_PIXELS not defined!
143 #endif
145 uint16_t S3L_resolutionX = 512; /**< If a static resolution is not set with
146 S3L_RESOLUTION_X, this variable can be
147 used to change X resolution at runtime,
148 in which case S3L_MAX_PIXELS has to be
149 defined (to allocate zBuffer etc.)! */
150 #define S3L_RESOLUTION_X S3L_resolutionX
151 #endif
153 #ifndef S3L_RESOLUTION_Y
154 #ifndef S3L_MAX_PIXELS
155 #error Dynamic resolution set (S3L_RESOLUTION_Y not defined), but\
156 S3L_MAX_PIXELS not defined!
157 #endif
159 uint16_t S3L_resolutionY = 512; /**< Same as S3L_resolutionX, but for Y
160 resolution. */
161 #define S3L_RESOLUTION_Y S3L_resolutionY
162 #endif
164 #ifndef S3L_USE_WIDER_TYPES
165 /** If true, the library will use wider data types which will largely supress
166 many rendering bugs and imprecisions happening due to overflows, but this will
167 also consumer more RAM and may potentially be slower on computers with smaller
168 native integer. */
169 #define S3L_USE_WIDER_TYPES 0
170 #endif
172 #ifndef S3L_SIN_METHOD
173 /** Says which method should be used for computing sin/cos functions, possible
174 values: 0 (lookup table, takes more program memory), 1 (Bhaskara's
175 approximation, slower). This may cause the trigonometric functions give
176 slightly different results. */
177 #define S3L_SIN_METHOD 0
178 #endif
180 /** Units of measurement in 3D space. There is S3L_FRACTIONS_PER_UNIT in one
181 spatial unit. By dividing the unit into fractions we effectively achieve a
182 fixed point arithmetic. The number of fractions is a constant that serves as
183 1.0 in floating point arithmetic (normalization etc.). */
185 typedef
186 #if S3L_USE_WIDER_TYPES
187 int64_t
188 #else
189 int32_t
190 #endif
191 S3L_Unit;
193 /** How many fractions a spatial unit is split into, i.e. this is the fixed
194 point scaling. This is NOT SUPPOSED TO BE REDEFINED, so rather don't do it
195 (otherwise things may overflow etc.). */
196 #define S3L_FRACTIONS_PER_UNIT 512
197 #define S3L_F S3L_FRACTIONS_PER_UNIT
199 typedef
200 #if S3L_USE_WIDER_TYPES
201 int32_t
202 #else
203 int16_t
204 #endif
205 S3L_ScreenCoord;
207 typedef
208 #if S3L_USE_WIDER_TYPES
209 uint32_t
210 #else
211 uint16_t
212 #endif
213 S3L_Index;
215 #ifndef S3L_NEAR_CROSS_STRATEGY
216 /** Specifies how the library will handle triangles that partially cross the
217 near plane. These are problematic and require special handling. Possible
218 values:
220 0: Strictly cull any triangle crossing the near plane. This will make such
221 triangles disappear. This is good for performance or models viewed only
222 from at least small distance.
223 1: Forcefully push the vertices crossing near plane in front of it. This is
224 a cheap technique that can be good enough for displaying simple
225 environments on slow devices, but texturing and geometric artifacts/warps
226 will appear.
227 2: Geometrically correct the triangles crossing the near plane. This may
228 result in some triangles being subdivided into two and is a little more
229 expensive, but the results will be geometrically correct, even though
230 barycentric correction is not performed so texturing artifacts will
231 appear. Can be ideal with S3L_FLAT.
232 3: Perform both geometrical and barycentric correction of triangle crossing
233 the near plane. This is significantly more expensive but results in
234 correct rendering. */
235 #define S3L_NEAR_CROSS_STRATEGY 0
236 #endif
238 #ifndef S3L_FLAT
239 /** If on, disables computation of per-pixel values such as barycentric
240 coordinates and depth -- these will still be available but will be the same
241 for the whole triangle. This can be used to create flat-shaded renders and
242 will be a lot faster. With this option on you will probably want to use
243 sorting instead of z-buffer. */
244 #define S3L_FLAT 0
245 #endif
247 #if S3L_FLAT
248 #define S3L_COMPUTE_DEPTH 0
249 #define S3L_PERSPECTIVE_CORRECTION 0
250 // don't disable z-buffer, it makes sense to use it with no sorting
251 #endif
253 #ifndef S3L_PERSPECTIVE_CORRECTION
254 /** Specifies what type of perspective correction (PC) to use. Remember this
255 is an expensive operation! Possible values:
257 0: No perspective correction. Fastest, inaccurate from most angles.
258 1: Per-pixel perspective correction, accurate but very expensive.
259 2: Approximation (computing only at every S3L_PC_APPROX_LENGTHth pixel).
260 Quake-style approximation is used, which only computes the PC after
261 S3L_PC_APPROX_LENGTH pixels. This is reasonably accurate and fast. */
262 #define S3L_PERSPECTIVE_CORRECTION 0
263 #endif
265 #ifndef S3L_PC_APPROX_LENGTH
266 /** For S3L_PERSPECTIVE_CORRECTION == 2, this specifies after how many pixels
267 PC is recomputed. Should be a power of two to keep up the performance.
268 Smaller is nicer but slower. */
269 #define S3L_PC_APPROX_LENGTH 32
270 #endif
272 #if S3L_PERSPECTIVE_CORRECTION
273 #define S3L_COMPUTE_DEPTH 1 // PC inevitably computes depth, so enable it
274 #endif
276 #ifndef S3L_COMPUTE_DEPTH
277 /** Whether to compute depth for each pixel (fragment). Some other options
278 may turn this on automatically. If you don't need depth information, turning
279 this off can save performance. Depth will still be accessible in
280 S3L_PixelInfo, but will be constant -- equal to center point depth -- over
281 the whole triangle. */
282 #define S3L_COMPUTE_DEPTH 1
283 #endif
285 #ifndef S3L_Z_BUFFER
286 /** What type of z-buffer (depth buffer) to use for visibility determination.
287 Possible values:
289 0: Don't use z-buffer. This saves a lot of memory, but visibility checking
290 won't be pixel-accurate and has to mostly be done by other means (typically
291 sorting).
292 1: Use full z-buffer (of S3L_Units) for visibiltiy determination. This is the
293 most accurate option (and also a fast one), but requires a big amount of
294 memory.
295 2: Use reduced-size z-buffer (of bytes). This is fast and somewhat accurate,
296 but inaccuracies can occur and a considerable amount of memory is
297 needed. */
298 #define S3L_Z_BUFFER 0
299 #endif
301 #ifndef S3L_REDUCED_Z_BUFFER_GRANULARITY
302 /** For S3L_Z_BUFFER == 2 this sets the reduced z-buffer granularity. */
303 #define S3L_REDUCED_Z_BUFFER_GRANULARITY 5
304 #endif
306 #ifndef S3L_STENCIL_BUFFER
307 /** Whether to use stencil buffer for drawing -- with this a pixel that would
308 be resterized over an already rasterized pixel (within a frame) will be
309 discarded. This is mostly for front-to-back sorted drawing. */
310 #define S3L_STENCIL_BUFFER 0
311 #endif
313 #ifndef S3L_SORT
314 /** Defines how to sort triangles before drawing a frame. This can be used to
315 solve visibility in case z-buffer is not used, to prevent overwriting already
316 rasterized pixels, implement transparency etc. Note that for simplicity and
317 performance a relatively simple sorting is used which doesn't work completely
318 correctly, so mistakes can occur (even the best sorting wouldn't be able to
319 solve e.g. intersecting triangles). Note that sorting requires a bit of extra
320 memory -- an array of the triangles to sort -- the size of this array limits
321 the maximum number of triangles that can be drawn in a single frame
322 (S3L_MAX_TRIANGES_DRAWN). Possible values:
324 0: Don't sort triangles. This is fastest and doesn't use extra memory.
325 1: Sort triangles from back to front. This can in most cases solve visibility
326 without requiring almost any extra memory compared to z-buffer.
327 2: Sort triangles from front to back. This can be faster than back to front
328 because we prevent computing pixels that will be overwritten by nearer
329 ones, but we need a 1b stencil buffer for this (enable S3L_STENCIL_BUFFER),
330 so a bit more memory is needed. */
331 #define S3L_SORT 0
332 #endif
334 #ifndef S3L_MAX_TRIANGES_DRAWN
335 /** Maximum number of triangles that can be drawn in sorted modes. This
336 affects the size of the cache used for triangle sorting. */
337 #define S3L_MAX_TRIANGES_DRAWN 128
338 #endif
340 #ifndef S3L_NEAR
341 /** Distance of the near clipping plane. Points in front or EXATLY ON this
342 plane are considered outside the frustum. This must be >= 0. */
343 #define S3L_NEAR (S3L_F / 4)
344 #endif
346 #if S3L_NEAR <= 0
347 #define S3L_NEAR 1 // Can't be <= 0.
348 #endif
350 #ifndef S3L_NORMAL_COMPUTE_MAXIMUM_AVERAGE
351 /** Affects the S3L_computeModelNormals function. See its description for
352 details. */
353 #define S3L_NORMAL_COMPUTE_MAXIMUM_AVERAGE 6
354 #endif
356 #ifndef S3L_FAST_LERP_QUALITY
357 /** Quality (scaling) of SOME (stepped) linear interpolations. 0 will most
358 likely be a tiny bit faster, but artifacts can occur for bigger tris, while
359 higher values can fix this -- in theory all higher values will have the same
360 speed (it is a shift value), but it mustn't be too high to prevent
361 overflow. */
362 #define S3L_FAST_LERP_QUALITY 11
363 #endif
365 /** Vector that consists of four scalars and can represent homogenous
366 coordinates, but is generally also used as Vec3 and Vec2 for various
367 purposes. */
368 typedef struct
370 S3L_Unit x;
371 S3L_Unit y;
372 S3L_Unit z;
373 S3L_Unit w;
374 } S3L_Vec4;
376 #define S3L_logVec4(v)\
377 printf("Vec4: %d %d %d %d\n",((v).x),((v).y),((v).z),((v).w))
379 static inline void S3L_vec4Init(S3L_Vec4 *v);
380 static inline void S3L_vec4Set(S3L_Vec4 *v, S3L_Unit x, S3L_Unit y,
381 S3L_Unit z, S3L_Unit w);
382 static inline void S3L_vec3Add(S3L_Vec4 *result, S3L_Vec4 added);
383 static inline void S3L_vec3Sub(S3L_Vec4 *result, S3L_Vec4 substracted);
384 S3L_Unit S3L_vec3Length(S3L_Vec4 v);
386 /** Normalizes Vec3. Note that this function tries to normalize correctly
387 rather than quickly! If you need to normalize quickly, do it yourself in a
388 way that best fits your case. */
389 void S3L_vec3Normalize(S3L_Vec4 *v);
391 /** Like S3L_vec3Normalize, but doesn't perform any checks on the input vector,
392 which is faster, but can be very innacurate or overflowing. You are supposed
393 to provide a "nice" vector (not too big or small). */
394 static inline void S3L_vec3NormalizeFast(S3L_Vec4 *v);
396 S3L_Unit S3L_vec2Length(S3L_Vec4 v);
397 void S3L_vec3Cross(S3L_Vec4 a, S3L_Vec4 b, S3L_Vec4 *result);
398 static inline S3L_Unit S3L_vec3Dot(S3L_Vec4 a, S3L_Vec4 b);
400 /** Computes a reflection direction (typically used e.g. for specular component
401 in Phong illumination). The input vectors must be normalized. The result will
402 be normalized as well. */
403 void S3L_reflect(S3L_Vec4 toLight, S3L_Vec4 normal, S3L_Vec4 *result);
405 /** Determines the winding of a triangle, returns 1 (CW, clockwise), -1 (CCW,
406 counterclockwise) or 0 (points lie on a single line). */
407 static inline int8_t S3L_triangleWinding(
408 S3L_ScreenCoord x0,
409 S3L_ScreenCoord y0,
410 S3L_ScreenCoord x1,
411 S3L_ScreenCoord y1,
412 S3L_ScreenCoord x2,
413 S3L_ScreenCoord y2);
415 typedef struct
417 S3L_Vec4 translation;
418 S3L_Vec4 rotation; /**< Euler angles. Rortation is applied in this order:
419 1. z = by z (roll) CW looking along z+
420 2. x = by x (pitch) CW looking along x+
421 3. y = by y (yaw) CW looking along y+ */
422 S3L_Vec4 scale;
423 } S3L_Transform3D;
425 #define S3L_logTransform3D(t)\
426 printf("Transform3D: T = [%d %d %d], R = [%d %d %d], S = [%d %d %d]\n",\
427 (t).translation.x,(t).translation.y,(t).translation.z,\
428 (t).rotation.x,(t).rotation.y,(t).rotation.z,\
429 (t).scale.x,(t).scale.y,(t).scale.z)
431 static inline void S3L_transform3DInit(S3L_Transform3D *t);
433 void S3L_lookAt(S3L_Vec4 pointTo, S3L_Transform3D *t);
435 void S3L_transform3DSet(
436 S3L_Unit tx,
437 S3L_Unit ty,
438 S3L_Unit tz,
439 S3L_Unit rx,
440 S3L_Unit ry,
441 S3L_Unit rz,
442 S3L_Unit sx,
443 S3L_Unit sy,
444 S3L_Unit sz,
445 S3L_Transform3D *t);
447 /** Converts rotation transformation to three direction vectors of given length
448 (any one can be NULL, in which case it won't be computed). */
449 void S3L_rotationToDirections(
450 S3L_Vec4 rotation,
451 S3L_Unit length,
452 S3L_Vec4 *forw,
453 S3L_Vec4 *right,
454 S3L_Vec4 *up);
456 /** 4x4 matrix, used mostly for 3D transforms. The indexing is this:
457 matrix[column][row]. */
458 typedef S3L_Unit S3L_Mat4[4][4];
460 #define S3L_logMat4(m)\
461 printf("Mat4:\n %d %d %d %d\n %d %d %d %d\n %d %d %d %d\n %d %d %d %d\n"\
462 ,(m)[0][0],(m)[1][0],(m)[2][0],(m)[3][0],\
463 (m)[0][1],(m)[1][1],(m)[2][1],(m)[3][1],\
464 (m)[0][2],(m)[1][2],(m)[2][2],(m)[3][2],\
465 (m)[0][3],(m)[1][3],(m)[2][3],(m)[3][3])
467 /** Initializes a 4x4 matrix to identity. */
468 static inline void S3L_mat4Init(S3L_Mat4 m);
470 void S3L_mat4Copy(S3L_Mat4 src, S3L_Mat4 dst);
472 void S3L_mat4Transpose(S3L_Mat4 m);
474 void S3L_makeTranslationMat(
475 S3L_Unit offsetX,
476 S3L_Unit offsetY,
477 S3L_Unit offsetZ,
478 S3L_Mat4 m);
480 /** Makes a scaling matrix. DON'T FORGET: scale of 1.0 is set with
481 S3L_FRACTIONS_PER_UNIT! */
482 void S3L_makeScaleMatrix(
483 S3L_Unit scaleX,
484 S3L_Unit scaleY,
485 S3L_Unit scaleZ,
486 S3L_Mat4 m);
488 /** Makes a matrix for rotation in the ZXY order. */
489 void S3L_makeRotationMatrixZXY(
490 S3L_Unit byX,
491 S3L_Unit byY,
492 S3L_Unit byZ,
493 S3L_Mat4 m);
495 void S3L_makeWorldMatrix(S3L_Transform3D worldTransform, S3L_Mat4 m);
496 void S3L_makeCameraMatrix(S3L_Transform3D cameraTransform, S3L_Mat4 m);
498 /** Multiplies a vector by a matrix with normalization by
499 S3L_FRACTIONS_PER_UNIT. Result is stored in the input vector. */
500 void S3L_vec4Xmat4(S3L_Vec4 *v, S3L_Mat4 m);
502 /** Same as S3L_vec4Xmat4 but faster, because this version doesn't compute the
503 W component of the result, which is usually not needed. */
504 void S3L_vec3Xmat4(S3L_Vec4 *v, S3L_Mat4 m);
506 /** Multiplies two matrices with normalization by S3L_FRACTIONS_PER_UNIT.
507 Result is stored in the first matrix. The result represents a transformation
508 that has the same effect as applying the transformation represented by m1 and
509 then m2 (in that order). */
510 void S3L_mat4Xmat4(S3L_Mat4 m1, S3L_Mat4 m2);
512 typedef struct
514 S3L_Unit focalLength; /**< Defines the field of view (FOV). 0 sets an
515 orthographics projection (scale is controlled
516 with camera's scale in its transform). */
517 S3L_Transform3D transform;
518 } S3L_Camera;
520 void S3L_cameraInit(S3L_Camera *camera);
522 typedef struct
524 uint8_t backfaceCulling; /**< What backface culling to use. Possible
525 values:
526 - 0 none
527 - 1 clock-wise
528 - 2 counter clock-wise */
529 int8_t visible; /**< Can be used to easily hide the model. */
530 } S3L_DrawConfig;
532 void S3L_drawConfigInit(S3L_DrawConfig *config);
534 typedef struct
536 const S3L_Unit *vertices;
537 S3L_Index vertexCount;
538 const S3L_Index *triangles;
539 S3L_Index triangleCount;
540 S3L_Transform3D transform;
541 S3L_Mat4 *customTransformMatrix; /**< This can be used to override the
542 transform (if != 0) with a custom
543 transform matrix, which is more
544 general. */
545 S3L_DrawConfig config;
546 } S3L_Model3D; ///< Represents a 3D model.
548 void S3L_model3DInit(
549 const S3L_Unit *vertices,
550 S3L_Index vertexCount,
551 const S3L_Index *triangles,
552 S3L_Index triangleCount,
553 S3L_Model3D *model);
555 typedef struct
557 S3L_Model3D *models;
558 S3L_Index modelCount;
559 S3L_Camera camera;
560 } S3L_Scene; ///< Represent the 3D scene to be rendered.
562 void S3L_sceneInit(
563 S3L_Model3D *models,
564 S3L_Index modelCount,
565 S3L_Scene *scene);
567 typedef struct
569 S3L_ScreenCoord x; ///< Screen X coordinate.
570 S3L_ScreenCoord y; ///< Screen Y coordinate.
572 S3L_Unit barycentric[3]; /**< Barycentric coords correspond to the three
573 vertices. These serve to locate the pixel on a
574 triangle and interpolate values between its
575 three points. Each one goes from 0 to
576 S3L_FRACTIONS_PER_UNIT (including), but due to
577 rounding error may fall outside this range (you
578 can use S3L_correctBarycentricCoords to fix this
579 for the price of some performance). The sum of
580 the three coordinates will always be exactly
581 S3L_FRACTIONS_PER_UNIT. */
582 S3L_Index modelIndex; ///< Model index within the scene.
583 S3L_Index triangleIndex; ///< Triangle index within the model.
584 uint32_t triangleID; /**< Unique ID of the triangle withing the whole
585 scene. This can be used e.g. by a cache to
586 quickly find out if a triangle has changed. */
587 S3L_Unit depth; ///< Depth (only if depth is turned on).
588 S3L_Unit previousZ; /**< Z-buffer value (not necessarily world depth in
589 S3L_Units!) that was in the z-buffer on the
590 pixels position before this pixel was
591 rasterized. This can be used to set the value
592 back, e.g. for transparency. */
593 S3L_ScreenCoord triangleSize[2]; /**< Rasterized triangle width and height,
594 can be used e.g. for MIP mapping. */
595 } S3L_PixelInfo; /**< Used to pass the info about a rasterized pixel
596 (fragment) to the user-defined drawing func. */
598 static inline void S3L_pixelInfoInit(S3L_PixelInfo *p);
600 /** Corrects barycentric coordinates so that they exactly meet the defined
601 conditions (each fall into <0,S3L_FRACTIONS_PER_UNIT>, sum =
602 S3L_FRACTIONS_PER_UNIT). Note that doing this per-pixel can slow the program
603 down significantly. */
604 static inline void S3L_correctBarycentricCoords(S3L_Unit barycentric[3]);
606 // general helper functions
607 static inline S3L_Unit S3L_abs(S3L_Unit value);
608 static inline S3L_Unit S3L_min(S3L_Unit v1, S3L_Unit v2);
609 static inline S3L_Unit S3L_max(S3L_Unit v1, S3L_Unit v2);
610 static inline S3L_Unit S3L_clamp(S3L_Unit v, S3L_Unit v1, S3L_Unit v2);
611 static inline S3L_Unit S3L_wrap(S3L_Unit value, S3L_Unit mod);
612 static inline S3L_Unit S3L_nonZero(S3L_Unit value);
613 static inline S3L_Unit S3L_zeroClamp(S3L_Unit value);
615 S3L_Unit S3L_sin(S3L_Unit x);
616 S3L_Unit S3L_asin(S3L_Unit x);
617 static inline S3L_Unit S3L_cos(S3L_Unit x);
619 S3L_Unit S3L_vec3Length(S3L_Vec4 v);
620 S3L_Unit S3L_sqrt(S3L_Unit value);
622 /** Projects a single point from 3D space to the screen space (pixels), which
623 can be useful e.g. for drawing sprites. The w component of input and result
624 holds the point size. If this size is 0 in the result, the sprite is outside
625 the view. */
626 void S3L_project3DPointToScreen(
627 S3L_Vec4 point,
628 S3L_Camera camera,
629 S3L_Vec4 *result);
631 /** Computes a normalized normal of given triangle. */
632 void S3L_triangleNormal(S3L_Vec4 t0, S3L_Vec4 t1, S3L_Vec4 t2,
633 S3L_Vec4 *n);
635 /** Helper function for retrieving per-vertex indexed values from an array,
636 e.g. texturing (UV) coordinates. The 'indices' array contains three indices
637 for each triangle, each index pointing into 'values' array, which contains
638 the values, each one consisting of 'numComponents' components (e.g. 2 for
639 UV coordinates). The three values are retrieved into 'v0', 'v1' and 'v2'
640 vectors (into x, y, z and w, depending on 'numComponents'). This function is
641 meant to be used per-triangle (typically from a cache), NOT per-pixel, as it
642 is not as fast as possible! */
643 void S3L_getIndexedTriangleValues(
644 S3L_Index triangleIndex,
645 const S3L_Index *indices,
646 const S3L_Unit *values,
647 uint8_t numComponents,
648 S3L_Vec4 *v0,
649 S3L_Vec4 *v1,
650 S3L_Vec4 *v2);
652 /** Computes a normalized normal for every vertex of given model (this is
653 relatively slow and SHOUDN'T be done each frame). The dst array must have a
654 sufficient size preallocated! The size is: number of model vertices * 3 *
655 sizeof(S3L_Unit). Note that for advanced allowing sharp edges it is not
656 sufficient to have per-vertex normals, but must be per-triangle. This
657 function doesn't support this.
659 The function computes a normal for each vertex by averaging normals of
660 the triangles containing the vertex. The maximum number of these triangle
661 normals that will be averaged is set with
662 S3L_NORMAL_COMPUTE_MAXIMUM_AVERAGE. */
663 void S3L_computeModelNormals(S3L_Model3D model, S3L_Unit *dst,
664 int8_t transformNormals);
666 /** Interpolated between two values, v1 and v2, in the same ratio as t is to
667 tMax. Does NOT prevent zero division. */
668 static inline S3L_Unit S3L_interpolate(
669 S3L_Unit v1,
670 S3L_Unit v2,
671 S3L_Unit t,
672 S3L_Unit tMax);
674 /** Same as S3L_interpolate but with v1 == 0. Should be faster. */
675 static inline S3L_Unit S3L_interpolateFrom0(
676 S3L_Unit v2,
677 S3L_Unit t,
678 S3L_Unit tMax);
680 /** Like S3L_interpolate, but uses a parameter that goes from 0 to
681 S3L_FRACTIONS_PER_UNIT - 1, which can be faster. */
682 static inline S3L_Unit S3L_interpolateByUnit(
683 S3L_Unit v1,
684 S3L_Unit v2,
685 S3L_Unit t);
687 /** Same as S3L_interpolateByUnit but with v1 == 0. Should be faster. */
688 static inline S3L_Unit S3L_interpolateByUnitFrom0(
689 S3L_Unit v2,
690 S3L_Unit t);
692 static inline S3L_Unit S3L_distanceManhattan(S3L_Vec4 a, S3L_Vec4 b);
694 /** Returns a value interpolated between the three triangle vertices based on
695 barycentric coordinates. */
696 static inline S3L_Unit S3L_interpolateBarycentric(
697 S3L_Unit value0,
698 S3L_Unit value1,
699 S3L_Unit value2,
700 S3L_Unit barycentric[3]);
702 static inline void S3L_mapProjectionPlaneToScreen(
703 S3L_Vec4 point,
704 S3L_ScreenCoord *screenX,
705 S3L_ScreenCoord *screenY);
707 /** Draws a triangle according to given config. The vertices are specified in
708 Screen Space space (pixels). If perspective correction is enabled, each
709 vertex has to have a depth (Z position in camera space) specified in the Z
710 component. */
711 void S3L_drawTriangle(
712 S3L_Vec4 point0,
713 S3L_Vec4 point1,
714 S3L_Vec4 point2,
715 S3L_Index modelIndex,
716 S3L_Index triangleIndex);
718 /** This should be called before rendering each frame. The function clears
719 buffers and does potentially other things needed for the frame. */
720 void S3L_newFrame(void);
722 void S3L_zBufferClear(void);
723 void S3L_stencilBufferClear(void);
725 /** Writes a value (not necessarily depth! depends on the format of z-buffer)
726 to z-buffer (if enabled). Does NOT check boundaries! */
727 void S3L_zBufferWrite(S3L_ScreenCoord x, S3L_ScreenCoord y, S3L_Unit value);
729 /** Reads a value (not necessarily depth! depends on the format of z-buffer)
730 from z-buffer (if enabled). Does NOT check boundaries! */
731 S3L_Unit S3L_zBufferRead(S3L_ScreenCoord x, S3L_ScreenCoord y);
733 static inline void S3L_rotate2DPoint(S3L_Unit *x, S3L_Unit *y, S3L_Unit angle);
735 /** Predefined vertices of a cube to simply insert in an array. These come with
736 S3L_CUBE_TRIANGLES and S3L_CUBE_TEXCOORDS. */
737 #define S3L_CUBE_VERTICES(m)\
738 /* 0 front, bottom, right */\
739 m/2, -m/2, -m/2,\
740 /* 1 front, bottom, left */\
741 -m/2, -m/2, -m/2,\
742 /* 2 front, top, right */\
743 m/2, m/2, -m/2,\
744 /* 3 front, top, left */\
745 -m/2, m/2, -m/2,\
746 /* 4 back, bottom, right */\
747 m/2, -m/2, m/2,\
748 /* 5 back, bottom, left */\
749 -m/2, -m/2, m/2,\
750 /* 6 back, top, right */\
751 m/2, m/2, m/2,\
752 /* 7 back, top, left */\
753 -m/2, m/2, m/2
755 #define S3L_CUBE_VERTEX_COUNT 8
757 /** Predefined triangle indices of a cube, to be used with S3L_CUBE_VERTICES
758 and S3L_CUBE_TEXCOORDS. */
759 #define S3L_CUBE_TRIANGLES\
760 3, 0, 2, /* front */\
761 1, 0, 3,\
762 0, 4, 2, /* right */\
763 2, 4, 6,\
764 4, 5, 6, /* back */\
765 7, 6, 5,\
766 3, 7, 1, /* left */\
767 1, 7, 5,\
768 6, 3, 2, /* top */\
769 7, 3, 6,\
770 1, 4, 0, /* bottom */\
771 5, 4, 1
773 #define S3L_CUBE_TRIANGLE_COUNT 12
775 /** Predefined texture coordinates of a cube, corresponding to triangles (NOT
776 vertices), to be used with S3L_CUBE_VERTICES and S3L_CUBE_TRIANGLES. */
777 #define S3L_CUBE_TEXCOORDS(m)\
778 0,0, m,m, m,0,\
779 0,m, m,m, 0,0,\
780 m,m, m,0, 0,m,\
781 0,m, m,0, 0,0,\
782 m,0, 0,0, m,m,\
783 0,m, m,m, 0,0,\
784 0,0, 0,m, m,0,\
785 m,0, 0,m, m,m,\
786 0,0, m,m, m,0,\
787 0,m, m,m, 0,0,\
788 m,0, 0,m, m,m,\
789 0,0, 0,m, m,0
791 //=============================================================================
792 // privates
794 #define S3L_UNUSED(what) (void)(what) ///< helper macro for unused vars
796 #define S3L_HALF_RESOLUTION_X (S3L_RESOLUTION_X >> 1)
797 #define S3L_HALF_RESOLUTION_Y (S3L_RESOLUTION_Y >> 1)
799 #define S3L_PROJECTION_PLANE_HEIGHT\
800 ((S3L_RESOLUTION_Y * S3L_F * 2) / S3L_RESOLUTION_X)
802 #if S3L_Z_BUFFER == 1
803 #define S3L_MAX_DEPTH 2147483647
804 S3L_Unit S3L_zBuffer[S3L_MAX_PIXELS];
805 #define S3L_zBufferFormat(depth) (depth)
806 #elif S3L_Z_BUFFER == 2
807 #define S3L_MAX_DEPTH 255
808 uint8_t S3L_zBuffer[S3L_MAX_PIXELS];
809 #define S3L_zBufferFormat(depth)\
810 S3L_min(255,(depth) >> S3L_REDUCED_Z_BUFFER_GRANULARITY)
811 #endif
813 #if S3L_Z_BUFFER
814 static inline int8_t S3L_zTest(
815 S3L_ScreenCoord x,
816 S3L_ScreenCoord y,
817 S3L_Unit depth)
819 uint32_t index = y * S3L_RESOLUTION_X + x;
821 depth = S3L_zBufferFormat(depth);
823 #if S3L_Z_BUFFER == 2
824 #define cmp <= /* For reduced z-buffer we need equality test, because
825 otherwise pixels at the maximum depth (255) would never be
826 drawn over the background (which also has the depth of
827 255). */
828 #else
829 #define cmp < /* For normal z-buffer we leave out equality test to not waste
830 time by drawing over already drawn pixls. */
831 #endif
833 if (depth cmp S3L_zBuffer[index])
835 S3L_zBuffer[index] = depth;
836 return 1;
839 #undef cmp
841 return 0;
843 #endif
845 S3L_Unit S3L_zBufferRead(S3L_ScreenCoord x, S3L_ScreenCoord y)
847 #if S3L_Z_BUFFER
848 return S3L_zBuffer[y * S3L_RESOLUTION_X + x];
849 #else
850 S3L_UNUSED(x);
851 S3L_UNUSED(y);
853 return 0;
854 #endif
857 void S3L_zBufferWrite(S3L_ScreenCoord x, S3L_ScreenCoord y, S3L_Unit value)
859 #if S3L_Z_BUFFER
860 S3L_zBuffer[y * S3L_RESOLUTION_X + x] = value;
861 #else
862 S3L_UNUSED(x);
863 S3L_UNUSED(y);
864 S3L_UNUSED(value);
865 #endif
868 #if S3L_STENCIL_BUFFER
869 #define S3L_STENCIL_BUFFER_SIZE\
870 ((S3L_RESOLUTION_X * S3L_RESOLUTION_Y - 1) / 8 + 1)
872 uint8_t S3L_stencilBuffer[S3L_STENCIL_BUFFER_SIZE];
874 static inline int8_t S3L_stencilTest(
875 S3L_ScreenCoord x,
876 S3L_ScreenCoord y)
878 uint32_t index = y * S3L_RESOLUTION_X + x;
879 uint32_t bit = (index & 0x00000007);
880 index = index >> 3;
882 uint8_t val = S3L_stencilBuffer[index];
884 if ((val >> bit) & 0x1)
885 return 0;
887 S3L_stencilBuffer[index] = val | (0x1 << bit);
889 return 1;
891 #endif
893 #define S3L_COMPUTE_LERP_DEPTH\
894 (S3L_COMPUTE_DEPTH && (S3L_PERSPECTIVE_CORRECTION == 0))
896 #define S3L_SIN_TABLE_LENGTH 128
898 #if S3L_SIN_METHOD == 0
899 static const S3L_Unit S3L_sinTable[S3L_SIN_TABLE_LENGTH] =
901 /* 511 was chosen here as a highest number that doesn't overflow during
902 compilation for S3L_F == 1024 */
904 (0*S3L_F)/511, (6*S3L_F)/511,
905 (12*S3L_F)/511, (18*S3L_F)/511,
906 (25*S3L_F)/511, (31*S3L_F)/511,
907 (37*S3L_F)/511, (43*S3L_F)/511,
908 (50*S3L_F)/511, (56*S3L_F)/511,
909 (62*S3L_F)/511, (68*S3L_F)/511,
910 (74*S3L_F)/511, (81*S3L_F)/511,
911 (87*S3L_F)/511, (93*S3L_F)/511,
912 (99*S3L_F)/511, (105*S3L_F)/511,
913 (111*S3L_F)/511, (118*S3L_F)/511,
914 (124*S3L_F)/511, (130*S3L_F)/511,
915 (136*S3L_F)/511, (142*S3L_F)/511,
916 (148*S3L_F)/511, (154*S3L_F)/511,
917 (160*S3L_F)/511, (166*S3L_F)/511,
918 (172*S3L_F)/511, (178*S3L_F)/511,
919 (183*S3L_F)/511, (189*S3L_F)/511,
920 (195*S3L_F)/511, (201*S3L_F)/511,
921 (207*S3L_F)/511, (212*S3L_F)/511,
922 (218*S3L_F)/511, (224*S3L_F)/511,
923 (229*S3L_F)/511, (235*S3L_F)/511,
924 (240*S3L_F)/511, (246*S3L_F)/511,
925 (251*S3L_F)/511, (257*S3L_F)/511,
926 (262*S3L_F)/511, (268*S3L_F)/511,
927 (273*S3L_F)/511, (278*S3L_F)/511,
928 (283*S3L_F)/511, (289*S3L_F)/511,
929 (294*S3L_F)/511, (299*S3L_F)/511,
930 (304*S3L_F)/511, (309*S3L_F)/511,
931 (314*S3L_F)/511, (319*S3L_F)/511,
932 (324*S3L_F)/511, (328*S3L_F)/511,
933 (333*S3L_F)/511, (338*S3L_F)/511,
934 (343*S3L_F)/511, (347*S3L_F)/511,
935 (352*S3L_F)/511, (356*S3L_F)/511,
936 (361*S3L_F)/511, (365*S3L_F)/511,
937 (370*S3L_F)/511, (374*S3L_F)/511,
938 (378*S3L_F)/511, (382*S3L_F)/511,
939 (386*S3L_F)/511, (391*S3L_F)/511,
940 (395*S3L_F)/511, (398*S3L_F)/511,
941 (402*S3L_F)/511, (406*S3L_F)/511,
942 (410*S3L_F)/511, (414*S3L_F)/511,
943 (417*S3L_F)/511, (421*S3L_F)/511,
944 (424*S3L_F)/511, (428*S3L_F)/511,
945 (431*S3L_F)/511, (435*S3L_F)/511,
946 (438*S3L_F)/511, (441*S3L_F)/511,
947 (444*S3L_F)/511, (447*S3L_F)/511,
948 (450*S3L_F)/511, (453*S3L_F)/511,
949 (456*S3L_F)/511, (459*S3L_F)/511,
950 (461*S3L_F)/511, (464*S3L_F)/511,
951 (467*S3L_F)/511, (469*S3L_F)/511,
952 (472*S3L_F)/511, (474*S3L_F)/511,
953 (476*S3L_F)/511, (478*S3L_F)/511,
954 (481*S3L_F)/511, (483*S3L_F)/511,
955 (485*S3L_F)/511, (487*S3L_F)/511,
956 (488*S3L_F)/511, (490*S3L_F)/511,
957 (492*S3L_F)/511, (494*S3L_F)/511,
958 (495*S3L_F)/511, (497*S3L_F)/511,
959 (498*S3L_F)/511, (499*S3L_F)/511,
960 (501*S3L_F)/511, (502*S3L_F)/511,
961 (503*S3L_F)/511, (504*S3L_F)/511,
962 (505*S3L_F)/511, (506*S3L_F)/511,
963 (507*S3L_F)/511, (507*S3L_F)/511,
964 (508*S3L_F)/511, (509*S3L_F)/511,
965 (509*S3L_F)/511, (510*S3L_F)/511,
966 (510*S3L_F)/511, (510*S3L_F)/511,
967 (510*S3L_F)/511, (510*S3L_F)/511
969 #endif
971 #define S3L_SIN_TABLE_UNIT_STEP\
972 (S3L_F / (S3L_SIN_TABLE_LENGTH * 4))
974 void S3L_vec4Init(S3L_Vec4 *v)
976 v->x = 0; v->y = 0; v->z = 0; v->w = S3L_F;
979 void S3L_vec4Set(S3L_Vec4 *v, S3L_Unit x, S3L_Unit y, S3L_Unit z, S3L_Unit w)
981 v->x = x;
982 v->y = y;
983 v->z = z;
984 v->w = w;
987 void S3L_vec3Add(S3L_Vec4 *result, S3L_Vec4 added)
989 result->x += added.x;
990 result->y += added.y;
991 result->z += added.z;
994 void S3L_vec3Sub(S3L_Vec4 *result, S3L_Vec4 substracted)
996 result->x -= substracted.x;
997 result->y -= substracted.y;
998 result->z -= substracted.z;
1001 void S3L_mat4Init(S3L_Mat4 m)
1003 #define M(x,y) m[x][y]
1004 #define S S3L_F
1006 M(0,0) = S; M(1,0) = 0; M(2,0) = 0; M(3,0) = 0;
1007 M(0,1) = 0; M(1,1) = S; M(2,1) = 0; M(3,1) = 0;
1008 M(0,2) = 0; M(1,2) = 0; M(2,2) = S; M(3,2) = 0;
1009 M(0,3) = 0; M(1,3) = 0; M(2,3) = 0; M(3,3) = S;
1011 #undef M
1012 #undef S
1015 void S3L_mat4Copy(S3L_Mat4 src, S3L_Mat4 dst)
1017 for (uint8_t j = 0; j < 4; ++j)
1018 for (uint8_t i = 0; i < 4; ++i)
1019 dst[i][j] = src[i][j];
1022 S3L_Unit S3L_vec3Dot(S3L_Vec4 a, S3L_Vec4 b)
1024 return (a.x * b.x + a.y * b.y + a.z * b.z) / S3L_F;
1027 void S3L_reflect(S3L_Vec4 toLight, S3L_Vec4 normal, S3L_Vec4 *result)
1029 S3L_Unit d = 2 * S3L_vec3Dot(toLight,normal);
1031 result->x = (normal.x * d) / S3L_F - toLight.x;
1032 result->y = (normal.y * d) / S3L_F - toLight.y;
1033 result->z = (normal.z * d) / S3L_F - toLight.z;
1036 void S3L_vec3Cross(S3L_Vec4 a, S3L_Vec4 b, S3L_Vec4 *result)
1038 result->x = a.y * b.z - a.z * b.y;
1039 result->y = a.z * b.x - a.x * b.z;
1040 result->z = a.x * b.y - a.y * b.x;
1043 void S3L_triangleNormal(S3L_Vec4 t0, S3L_Vec4 t1, S3L_Vec4 t2, S3L_Vec4 *n)
1045 #define ANTI_OVERFLOW 32
1047 t1.x = (t1.x - t0.x) / ANTI_OVERFLOW;
1048 t1.y = (t1.y - t0.y) / ANTI_OVERFLOW;
1049 t1.z = (t1.z - t0.z) / ANTI_OVERFLOW;
1051 t2.x = (t2.x - t0.x) / ANTI_OVERFLOW;
1052 t2.y = (t2.y - t0.y) / ANTI_OVERFLOW;
1053 t2.z = (t2.z - t0.z) / ANTI_OVERFLOW;
1055 #undef ANTI_OVERFLOW
1057 S3L_vec3Cross(t1,t2,n);
1059 S3L_vec3Normalize(n);
1062 void S3L_getIndexedTriangleValues(
1063 S3L_Index triangleIndex,
1064 const S3L_Index *indices,
1065 const S3L_Unit *values,
1066 uint8_t numComponents,
1067 S3L_Vec4 *v0,
1068 S3L_Vec4 *v1,
1069 S3L_Vec4 *v2)
1071 uint32_t i0, i1;
1072 S3L_Unit *value;
1074 i0 = triangleIndex * 3;
1075 i1 = indices[i0] * numComponents;
1076 value = (S3L_Unit *) v0;
1078 if (numComponents > 4)
1079 numComponents = 4;
1081 for (uint8_t j = 0; j < numComponents; ++j)
1083 *value = values[i1];
1084 i1++;
1085 value++;
1088 i0++;
1089 i1 = indices[i0] * numComponents;
1090 value = (S3L_Unit *) v1;
1092 for (uint8_t j = 0; j < numComponents; ++j)
1094 *value = values[i1];
1095 i1++;
1096 value++;
1099 i0++;
1100 i1 = indices[i0] * numComponents;
1101 value = (S3L_Unit *) v2;
1103 for (uint8_t j = 0; j < numComponents; ++j)
1105 *value = values[i1];
1106 i1++;
1107 value++;
1111 void S3L_computeModelNormals(S3L_Model3D model, S3L_Unit *dst,
1112 int8_t transformNormals)
1114 S3L_Index vPos = 0;
1116 S3L_Vec4 n;
1118 n.w = 0;
1120 S3L_Vec4 ns[S3L_NORMAL_COMPUTE_MAXIMUM_AVERAGE];
1121 S3L_Index normalCount;
1123 for (uint32_t i = 0; i < model.vertexCount; ++i)
1125 normalCount = 0;
1127 for (uint32_t j = 0; j < model.triangleCount * 3; j += 3)
1129 if (
1130 (model.triangles[j] == i) ||
1131 (model.triangles[j + 1] == i) ||
1132 (model.triangles[j + 2] == i))
1134 S3L_Vec4 t0, t1, t2;
1135 uint32_t vIndex;
1137 #define getVertex(n)\
1138 vIndex = model.triangles[j + n] * 3;\
1139 t##n.x = model.vertices[vIndex];\
1140 vIndex++;\
1141 t##n.y = model.vertices[vIndex];\
1142 vIndex++;\
1143 t##n.z = model.vertices[vIndex];
1145 getVertex(0)
1146 getVertex(1)
1147 getVertex(2)
1149 #undef getVertex
1151 S3L_triangleNormal(t0,t1,t2,&(ns[normalCount]));
1153 normalCount++;
1155 if (normalCount >= S3L_NORMAL_COMPUTE_MAXIMUM_AVERAGE)
1156 break;
1160 n.x = S3L_F;
1161 n.y = 0;
1162 n.z = 0;
1164 if (normalCount != 0)
1166 // compute average
1168 n.x = 0;
1170 for (uint8_t i = 0; i < normalCount; ++i)
1172 n.x += ns[i].x;
1173 n.y += ns[i].y;
1174 n.z += ns[i].z;
1177 n.x /= normalCount;
1178 n.y /= normalCount;
1179 n.z /= normalCount;
1181 S3L_vec3Normalize(&n);
1184 dst[vPos] = n.x;
1185 vPos++;
1187 dst[vPos] = n.y;
1188 vPos++;
1190 dst[vPos] = n.z;
1191 vPos++;
1194 S3L_Mat4 m;
1196 S3L_makeWorldMatrix(model.transform,m);
1198 if (transformNormals)
1199 for (S3L_Index i = 0; i < model.vertexCount * 3; i += 3)
1201 n.x = dst[i];
1202 n.y = dst[i + 1];
1203 n.z = dst[i + 2];
1205 S3L_vec4Xmat4(&n,m);
1207 dst[i] = n.x;
1208 dst[i + 1] = n.y;
1209 dst[i + 2] = n.z;
1213 void S3L_vec4Xmat4(S3L_Vec4 *v, S3L_Mat4 m)
1215 S3L_Vec4 vBackup;
1217 vBackup.x = v->x;
1218 vBackup.y = v->y;
1219 vBackup.z = v->z;
1220 vBackup.w = v->w;
1222 #define dotCol(col)\
1223 ((vBackup.x * m[col][0]) +\
1224 (vBackup.y * m[col][1]) +\
1225 (vBackup.z * m[col][2]) +\
1226 (vBackup.w * m[col][3])) / S3L_F
1228 v->x = dotCol(0);
1229 v->y = dotCol(1);
1230 v->z = dotCol(2);
1231 v->w = dotCol(3);
1234 void S3L_vec3Xmat4(S3L_Vec4 *v, S3L_Mat4 m)
1236 S3L_Vec4 vBackup;
1238 #undef dotCol
1239 #define dotCol(col)\
1240 (vBackup.x * m[col][0]) / S3L_F +\
1241 (vBackup.y * m[col][1]) / S3L_F +\
1242 (vBackup.z * m[col][2]) / S3L_F +\
1243 m[col][3]
1245 vBackup.x = v->x;
1246 vBackup.y = v->y;
1247 vBackup.z = v->z;
1248 vBackup.w = v->w;
1250 v->x = dotCol(0);
1251 v->y = dotCol(1);
1252 v->z = dotCol(2);
1253 v->w = S3L_F;
1256 #undef dotCol
1258 S3L_Unit S3L_abs(S3L_Unit value)
1260 return value * (((value >= 0) << 1) - 1);
1263 S3L_Unit S3L_min(S3L_Unit v1, S3L_Unit v2)
1265 return v1 >= v2 ? v2 : v1;
1268 S3L_Unit S3L_max(S3L_Unit v1, S3L_Unit v2)
1270 return v1 >= v2 ? v1 : v2;
1273 S3L_Unit S3L_clamp(S3L_Unit v, S3L_Unit v1, S3L_Unit v2)
1275 return v >= v1 ? (v <= v2 ? v : v2) : v1;
1278 S3L_Unit S3L_zeroClamp(S3L_Unit value)
1280 return (value * (value >= 0));
1283 S3L_Unit S3L_wrap(S3L_Unit value, S3L_Unit mod)
1285 return value >= 0 ? (value % mod) : (mod + (value % mod) - 1);
1288 S3L_Unit S3L_nonZero(S3L_Unit value)
1290 return (value + (value == 0));
1293 S3L_Unit S3L_interpolate(S3L_Unit v1, S3L_Unit v2, S3L_Unit t, S3L_Unit tMax)
1295 return v1 + ((v2 - v1) * t) / tMax;
1298 S3L_Unit S3L_interpolateByUnit(S3L_Unit v1, S3L_Unit v2, S3L_Unit t)
1300 return v1 + ((v2 - v1) * t) / S3L_F;
1303 S3L_Unit S3L_interpolateByUnitFrom0(S3L_Unit v2, S3L_Unit t)
1305 return (v2 * t) / S3L_F;
1308 S3L_Unit S3L_interpolateFrom0(S3L_Unit v2, S3L_Unit t, S3L_Unit tMax)
1310 return (v2 * t) / tMax;
1313 S3L_Unit S3L_distanceManhattan(S3L_Vec4 a, S3L_Vec4 b)
1315 return
1316 S3L_abs(a.x - b.x) +
1317 S3L_abs(a.y - b.y) +
1318 S3L_abs(a.z - b.z);
1321 void S3L_mat4Xmat4(S3L_Mat4 m1, S3L_Mat4 m2)
1323 S3L_Mat4 mat1;
1325 for (uint16_t row = 0; row < 4; ++row)
1326 for (uint16_t col = 0; col < 4; ++col)
1327 mat1[col][row] = m1[col][row];
1329 for (uint16_t row = 0; row < 4; ++row)
1330 for (uint16_t col = 0; col < 4; ++col)
1332 m1[col][row] = 0;
1334 for (uint16_t i = 0; i < 4; ++i)
1335 m1[col][row] +=
1336 (mat1[i][row] * m2[col][i]) / S3L_F;
1340 S3L_Unit S3L_sin(S3L_Unit x)
1342 #if S3L_SIN_METHOD == 0
1343 x = S3L_wrap(x / S3L_SIN_TABLE_UNIT_STEP,S3L_SIN_TABLE_LENGTH * 4);
1344 int8_t positive = 1;
1346 if (x < S3L_SIN_TABLE_LENGTH)
1349 else if (x < S3L_SIN_TABLE_LENGTH * 2)
1351 x = S3L_SIN_TABLE_LENGTH * 2 - x - 1;
1353 else if (x < S3L_SIN_TABLE_LENGTH * 3)
1355 x = x - S3L_SIN_TABLE_LENGTH * 2;
1356 positive = 0;
1358 else
1360 x = S3L_SIN_TABLE_LENGTH - (x - S3L_SIN_TABLE_LENGTH * 3) - 1;
1361 positive = 0;
1364 return positive ? S3L_sinTable[x] : -1 * S3L_sinTable[x];
1365 #else
1366 int8_t sign = 1;
1368 if (x < 0) // odd function
1370 x *= -1;
1371 sign = -1;
1374 x %= S3L_F;
1376 if (x > S3L_F / 2)
1378 x -= S3L_F / 2;
1379 sign *= -1;
1382 S3L_Unit tmp = S3L_F - 2 * x;
1384 #define _PI2 ((S3L_Unit) (9.8696044 * S3L_F))
1385 return sign * // Bhaskara's approximation
1386 (((32 * x * _PI2) / S3L_F) * tmp) /
1387 ((_PI2 * (5 * S3L_F - (8 * x * tmp) /
1388 S3L_F)) / S3L_F);
1389 #undef _PI2
1390 #endif
1393 S3L_Unit S3L_asin(S3L_Unit x)
1395 #if S3L_SIN_METHOD == 0
1396 x = S3L_clamp(x,-S3L_F,S3L_F);
1398 int8_t sign = 1;
1400 if (x < 0)
1402 sign = -1;
1403 x *= -1;
1406 int16_t low = 0, high = S3L_SIN_TABLE_LENGTH -1, middle;
1408 while (low <= high) // binary search
1410 middle = (low + high) / 2;
1412 S3L_Unit v = S3L_sinTable[middle];
1414 if (v > x)
1415 high = middle - 1;
1416 else if (v < x)
1417 low = middle + 1;
1418 else
1419 break;
1422 middle *= S3L_SIN_TABLE_UNIT_STEP;
1424 return sign * middle;
1425 #else
1426 S3L_Unit low = -1 * S3L_F / 4,
1427 high = S3L_F / 4,
1428 middle;
1430 while (low <= high) // binary search
1432 middle = (low + high) / 2;
1434 S3L_Unit v = S3L_sin(middle);
1436 if (v > x)
1437 high = middle - 1;
1438 else if (v < x)
1439 low = middle + 1;
1440 else
1441 break;
1444 return middle;
1445 #endif
1448 S3L_Unit S3L_cos(S3L_Unit x)
1450 return S3L_sin(x + S3L_F / 4);
1453 void S3L_correctBarycentricCoords(S3L_Unit barycentric[3])
1455 barycentric[0] = S3L_clamp(barycentric[0],0,S3L_F);
1456 barycentric[1] = S3L_clamp(barycentric[1],0,S3L_F);
1458 S3L_Unit d = S3L_F - barycentric[0] - barycentric[1];
1460 if (d < 0)
1462 barycentric[0] += d;
1463 barycentric[2] = 0;
1465 else
1466 barycentric[2] = d;
1469 void S3L_makeTranslationMat(
1470 S3L_Unit offsetX,
1471 S3L_Unit offsetY,
1472 S3L_Unit offsetZ,
1473 S3L_Mat4 m)
1475 #define M(x,y) m[x][y]
1476 #define S S3L_F
1478 M(0,0) = S; M(1,0) = 0; M(2,0) = 0; M(3,0) = 0;
1479 M(0,1) = 0; M(1,1) = S; M(2,1) = 0; M(3,1) = 0;
1480 M(0,2) = 0; M(1,2) = 0; M(2,2) = S; M(3,2) = 0;
1481 M(0,3) = offsetX; M(1,3) = offsetY; M(2,3) = offsetZ; M(3,3) = S;
1483 #undef M
1484 #undef S
1487 void S3L_makeScaleMatrix(
1488 S3L_Unit scaleX,
1489 S3L_Unit scaleY,
1490 S3L_Unit scaleZ,
1491 S3L_Mat4 m)
1493 #define M(x,y) m[x][y]
1495 M(0,0) = scaleX; M(1,0) = 0; M(2,0) = 0; M(3,0) = 0;
1496 M(0,1) = 0; M(1,1) = scaleY; M(2,1) = 0; M(3,1) = 0;
1497 M(0,2) = 0; M(1,2) = 0; M(2,2) = scaleZ; M(3,2) = 0;
1498 M(0,3) = 0; M(1,3) = 0; M(2,3) = 0; M(3,3) = S3L_F;
1500 #undef M
1503 void S3L_makeRotationMatrixZXY(
1504 S3L_Unit byX,
1505 S3L_Unit byY,
1506 S3L_Unit byZ,
1507 S3L_Mat4 m)
1509 byX *= -1;
1510 byY *= -1;
1511 byZ *= -1;
1513 S3L_Unit sx = S3L_sin(byX);
1514 S3L_Unit sy = S3L_sin(byY);
1515 S3L_Unit sz = S3L_sin(byZ);
1517 S3L_Unit cx = S3L_cos(byX);
1518 S3L_Unit cy = S3L_cos(byY);
1519 S3L_Unit cz = S3L_cos(byZ);
1521 #define M(x,y) m[x][y]
1522 #define S S3L_F
1524 M(0,0) = (cy * cz) / S + (sy * sx * sz) / (S * S);
1525 M(1,0) = (cx * sz) / S;
1526 M(2,0) = (cy * sx * sz) / (S * S) - (cz * sy) / S;
1527 M(3,0) = 0;
1529 M(0,1) = (cz * sy * sx) / (S * S) - (cy * sz) / S;
1530 M(1,1) = (cx * cz) / S;
1531 M(2,1) = (cy * cz * sx) / (S * S) + (sy * sz) / S;
1532 M(3,1) = 0;
1534 M(0,2) = (cx * sy) / S;
1535 M(1,2) = -1 * sx;
1536 M(2,2) = (cy * cx) / S;
1537 M(3,2) = 0;
1539 M(0,3) = 0;
1540 M(1,3) = 0;
1541 M(2,3) = 0;
1542 M(3,3) = S3L_F;
1544 #undef M
1545 #undef S
1548 S3L_Unit S3L_sqrt(S3L_Unit value)
1550 int8_t sign = 1;
1552 if (value < 0)
1554 sign = -1;
1555 value *= -1;
1558 uint32_t result = 0;
1559 uint32_t a = value;
1560 uint32_t b = 1u << 30;
1562 while (b > a)
1563 b >>= 2;
1565 while (b != 0)
1567 if (a >= result + b)
1569 a -= result + b;
1570 result = result + 2 * b;
1573 b >>= 2;
1574 result >>= 1;
1577 return result * sign;
1580 S3L_Unit S3L_vec3Length(S3L_Vec4 v)
1582 return S3L_sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
1585 S3L_Unit S3L_vec2Length(S3L_Vec4 v)
1587 return S3L_sqrt(v.x * v.x + v.y * v.y);
1590 void S3L_vec3Normalize(S3L_Vec4 *v)
1592 #define SCALE 16
1593 #define BOTTOM_LIMIT 16
1594 #define UPPER_LIMIT 900
1596 /* Here we try to decide if the vector is too small and would cause
1597 inaccurate result due to very its inaccurate length. If so, we scale
1598 it up. We can't scale up everything as big vectors overflow in length
1599 calculations. */
1601 if (
1602 S3L_abs(v->x) <= BOTTOM_LIMIT &&
1603 S3L_abs(v->y) <= BOTTOM_LIMIT &&
1604 S3L_abs(v->z) <= BOTTOM_LIMIT)
1606 v->x *= SCALE;
1607 v->y *= SCALE;
1608 v->z *= SCALE;
1610 else if (
1611 S3L_abs(v->x) > UPPER_LIMIT ||
1612 S3L_abs(v->y) > UPPER_LIMIT ||
1613 S3L_abs(v->z) > UPPER_LIMIT)
1615 v->x /= SCALE;
1616 v->y /= SCALE;
1617 v->z /= SCALE;
1620 #undef SCALE
1621 #undef BOTTOM_LIMIT
1622 #undef UPPER_LIMIT
1624 S3L_Unit l = S3L_vec3Length(*v);
1626 if (l == 0)
1627 return;
1629 v->x = (v->x * S3L_F) / l;
1630 v->y = (v->y * S3L_F) / l;
1631 v->z = (v->z * S3L_F) / l;
1634 void S3L_vec3NormalizeFast(S3L_Vec4 *v)
1636 S3L_Unit l = S3L_vec3Length(*v);
1638 if (l == 0)
1639 return;
1641 v->x = (v->x * S3L_F) / l;
1642 v->y = (v->y * S3L_F) / l;
1643 v->z = (v->z * S3L_F) / l;
1646 void S3L_transform3DInit(S3L_Transform3D *t)
1648 S3L_vec4Init(&(t->translation));
1649 S3L_vec4Init(&(t->rotation));
1650 t->scale.x = S3L_F;
1651 t->scale.y = S3L_F;
1652 t->scale.z = S3L_F;
1653 t->scale.w = 0;
1656 /** Performs perspecive division (z-divide). Does NOT check for division by
1657 zero. */
1658 static inline void S3L_perspectiveDivide(S3L_Vec4 *vector,
1659 S3L_Unit focalLength)
1661 if (focalLength == 0)
1662 return;
1664 vector->x = (vector->x * focalLength) / vector->z;
1665 vector->y = (vector->y * focalLength) / vector->z;
1668 void S3L_project3DPointToScreen(
1669 S3L_Vec4 point,
1670 S3L_Camera camera,
1671 S3L_Vec4 *result)
1673 // TODO: hotfix to prevent a mapping bug probably to overlfows
1674 S3L_Vec4 toPoint = point, camForw;
1676 S3L_vec3Sub(&toPoint,camera.transform.translation);
1678 S3L_vec3Normalize(&toPoint);
1680 S3L_rotationToDirections(camera.transform.rotation,S3L_FRACTIONS_PER_UNIT,
1681 &camForw,0,0);
1683 if (S3L_vec3Dot(toPoint,camForw) < S3L_FRACTIONS_PER_UNIT / 6)
1685 result->z = -1;
1686 result->w = 0;
1687 return;
1689 // end of hotfix
1690 S3L_Mat4 m;
1691 S3L_makeCameraMatrix(camera.transform,m);
1693 S3L_Unit s = point.w;
1695 point.w = S3L_F;
1697 S3L_vec3Xmat4(&point,m);
1699 point.z = S3L_nonZero(point.z);
1701 S3L_perspectiveDivide(&point,camera.focalLength);
1703 S3L_ScreenCoord x, y;
1705 S3L_mapProjectionPlaneToScreen(point,&x,&y);
1707 result->x = x;
1708 result->y = y;
1709 result->z = point.z;
1711 result->w =
1712 (point.z <= 0) ? 0 :
1714 camera.focalLength > 0 ?(
1715 (s * camera.focalLength * S3L_RESOLUTION_X) /
1716 (point.z * S3L_F)) :
1717 ((camera.transform.scale.x * S3L_RESOLUTION_X) / S3L_F)
1721 void S3L_lookAt(S3L_Vec4 pointTo, S3L_Transform3D *t)
1723 S3L_Vec4 v;
1725 v.x = pointTo.x - t->translation.x;
1726 v.y = pointTo.z - t->translation.z;
1728 S3L_Unit dx = v.x;
1729 S3L_Unit l = S3L_vec2Length(v);
1731 dx = (v.x * S3L_F) / S3L_nonZero(l); // normalize
1733 t->rotation.y = -1 * S3L_asin(dx);
1735 if (v.y < 0)
1736 t->rotation.y = S3L_F / 2 - t->rotation.y;
1738 v.x = pointTo.y - t->translation.y;
1739 v.y = l;
1741 l = S3L_vec2Length(v);
1743 dx = (v.x * S3L_F) / S3L_nonZero(l);
1745 t->rotation.x = S3L_asin(dx);
1748 void S3L_transform3DSet(
1749 S3L_Unit tx,
1750 S3L_Unit ty,
1751 S3L_Unit tz,
1752 S3L_Unit rx,
1753 S3L_Unit ry,
1754 S3L_Unit rz,
1755 S3L_Unit sx,
1756 S3L_Unit sy,
1757 S3L_Unit sz,
1758 S3L_Transform3D *t)
1760 t->translation.x = tx;
1761 t->translation.y = ty;
1762 t->translation.z = tz;
1764 t->rotation.x = rx;
1765 t->rotation.y = ry;
1766 t->rotation.z = rz;
1768 t->scale.x = sx;
1769 t->scale.y = sy;
1770 t->scale.z = sz;
1773 void S3L_cameraInit(S3L_Camera *camera)
1775 camera->focalLength = S3L_F;
1776 S3L_transform3DInit(&(camera->transform));
1779 void S3L_rotationToDirections(
1780 S3L_Vec4 rotation,
1781 S3L_Unit length,
1782 S3L_Vec4 *forw,
1783 S3L_Vec4 *right,
1784 S3L_Vec4 *up)
1786 S3L_Mat4 m;
1788 S3L_makeRotationMatrixZXY(rotation.x,rotation.y,rotation.z,m);
1790 if (forw != 0)
1792 forw->x = 0;
1793 forw->y = 0;
1794 forw->z = length;
1795 S3L_vec3Xmat4(forw,m);
1798 if (right != 0)
1800 right->x = length;
1801 right->y = 0;
1802 right->z = 0;
1803 S3L_vec3Xmat4(right,m);
1806 if (up != 0)
1808 up->x = 0;
1809 up->y = length;
1810 up->z = 0;
1811 S3L_vec3Xmat4(up,m);
1815 void S3L_pixelInfoInit(S3L_PixelInfo *p)
1817 p->x = 0;
1818 p->y = 0;
1819 p->barycentric[0] = S3L_F;
1820 p->barycentric[1] = 0;
1821 p->barycentric[2] = 0;
1822 p->modelIndex = 0;
1823 p->triangleIndex = 0;
1824 p->triangleID = 0;
1825 p->depth = 0;
1826 p->previousZ = 0;
1829 void S3L_model3DInit(
1830 const S3L_Unit *vertices,
1831 S3L_Index vertexCount,
1832 const S3L_Index *triangles,
1833 S3L_Index triangleCount,
1834 S3L_Model3D *model)
1836 model->vertices = vertices;
1837 model->vertexCount = vertexCount;
1838 model->triangles = triangles;
1839 model->triangleCount = triangleCount;
1840 model->customTransformMatrix = 0;
1842 S3L_transform3DInit(&(model->transform));
1843 S3L_drawConfigInit(&(model->config));
1846 void S3L_sceneInit(
1847 S3L_Model3D *models,
1848 S3L_Index modelCount,
1849 S3L_Scene *scene)
1851 scene->models = models;
1852 scene->modelCount = modelCount;
1853 S3L_cameraInit(&(scene->camera));
1856 void S3L_drawConfigInit(S3L_DrawConfig *config)
1858 config->backfaceCulling = 2;
1859 config->visible = 1;
1862 #ifndef S3L_PIXEL_FUNCTION
1863 #error Pixel rendering function (S3L_PIXEL_FUNCTION) not specified!
1864 #endif
1866 static inline void S3L_PIXEL_FUNCTION(S3L_PixelInfo *pixel); // forward decl
1868 /** Serves to accelerate linear interpolation for performance-critical
1869 code. Functions such as S3L_interpolate require division to compute each
1870 interpolated value, while S3L_FastLerpState only requires a division for
1871 the initiation and a shift for retrieving each interpolated value.
1873 S3L_FastLerpState stores a value and a step, both scaled (shifted by
1874 S3L_FAST_LERP_QUALITY) to increase precision. The step is being added to the
1875 value, which achieves the interpolation. This will only be useful for
1876 interpolations in which we need to get the interpolated value in every step.
1878 BEWARE! Shifting a negative value is undefined, so handling shifting of
1879 negative values has to be done cleverly. */
1880 typedef struct
1882 S3L_Unit valueScaled;
1883 S3L_Unit stepScaled;
1884 } S3L_FastLerpState;
1886 #define S3L_getFastLerpValue(state)\
1887 (state.valueScaled >> S3L_FAST_LERP_QUALITY)
1889 #define S3L_stepFastLerp(state)\
1890 state.valueScaled += state.stepScaled
1892 static inline S3L_Unit S3L_interpolateBarycentric(
1893 S3L_Unit value0,
1894 S3L_Unit value1,
1895 S3L_Unit value2,
1896 S3L_Unit barycentric[3])
1898 return
1900 (value0 * barycentric[0]) +
1901 (value1 * barycentric[1]) +
1902 (value2 * barycentric[2])
1903 ) / S3L_F;
1906 void S3L_mapProjectionPlaneToScreen(
1907 S3L_Vec4 point,
1908 S3L_ScreenCoord *screenX,
1909 S3L_ScreenCoord *screenY)
1911 *screenX =
1912 S3L_HALF_RESOLUTION_X +
1913 (point.x * S3L_HALF_RESOLUTION_X) / S3L_F;
1915 *screenY =
1916 S3L_HALF_RESOLUTION_Y -
1917 (point.y * S3L_HALF_RESOLUTION_X) / S3L_F;
1920 void S3L_zBufferClear(void)
1922 #if S3L_Z_BUFFER
1923 for (uint32_t i = 0; i < S3L_RESOLUTION_X * S3L_RESOLUTION_Y; ++i)
1924 S3L_zBuffer[i] = S3L_MAX_DEPTH;
1925 #endif
1928 void S3L_stencilBufferClear(void)
1930 #if S3L_STENCIL_BUFFER
1931 for (uint32_t i = 0; i < S3L_STENCIL_BUFFER_SIZE; ++i)
1932 S3L_stencilBuffer[i] = 0;
1933 #endif
1936 void S3L_newFrame(void)
1938 S3L_zBufferClear();
1939 S3L_stencilBufferClear();
1942 /* the following serves to communicate info about if the triangle has been split
1943 and how the barycentrics should be remapped. */
1944 uint8_t _S3L_projectedTriangleState = 0; // 0 = normal, 1 = cut, 2 = split
1946 #if S3L_NEAR_CROSS_STRATEGY == 3
1947 S3L_Vec4 _S3L_triangleRemapBarycentrics[6];
1948 #endif
1950 void S3L_drawTriangle(
1951 S3L_Vec4 point0,
1952 S3L_Vec4 point1,
1953 S3L_Vec4 point2,
1954 S3L_Index modelIndex,
1955 S3L_Index triangleIndex)
1957 S3L_PixelInfo p;
1958 S3L_pixelInfoInit(&p);
1959 p.modelIndex = modelIndex;
1960 p.triangleIndex = triangleIndex;
1961 p.triangleID = (modelIndex << 16) | triangleIndex;
1963 S3L_Vec4 *tPointSS, *lPointSS, *rPointSS; /* points in Screen Space (in
1964 S3L_Units, normalized by
1965 S3L_F) */
1967 S3L_Unit *barycentric0; // bar. coord that gets higher from L to R
1968 S3L_Unit *barycentric1; // bar. coord that gets higher from R to L
1969 S3L_Unit *barycentric2; // bar. coord that gets higher from bottom up
1971 // sort the vertices:
1973 #define assignPoints(t,a,b)\
1975 tPointSS = &point##t;\
1976 barycentric2 = &(p.barycentric[t]);\
1977 if (S3L_triangleWinding(point##t.x,point##t.y,point##a.x,point##a.y,\
1978 point##b.x,point##b.y) >= 0)\
1980 lPointSS = &point##a; rPointSS = &point##b;\
1981 barycentric0 = &(p.barycentric[b]);\
1982 barycentric1 = &(p.barycentric[a]);\
1984 else\
1986 lPointSS = &point##b; rPointSS = &point##a;\
1987 barycentric0 = &(p.barycentric[a]);\
1988 barycentric1 = &(p.barycentric[b]);\
1992 if (point0.y <= point1.y)
1994 if (point0.y <= point2.y)
1995 assignPoints(0,1,2)
1996 else
1997 assignPoints(2,0,1)
1999 else
2001 if (point1.y <= point2.y)
2002 assignPoints(1,0,2)
2003 else
2004 assignPoints(2,0,1)
2007 #undef assignPoints
2009 #if S3L_FLAT
2010 *barycentric0 = S3L_F / 3;
2011 *barycentric1 = S3L_F / 3;
2012 *barycentric2 = S3L_F - 2 * (S3L_F / 3);
2013 #endif
2015 p.triangleSize[0] = rPointSS->x - lPointSS->x;
2016 p.triangleSize[1] =
2017 (rPointSS->y > lPointSS->y ? rPointSS->y : lPointSS->y) - tPointSS->y;
2019 // now draw the triangle line by line:
2021 S3L_ScreenCoord splitY; // Y of the vertically middle point of the triangle
2022 S3L_ScreenCoord endY; // bottom Y of the whole triangle
2023 int splitOnLeft; /* whether splitY is the y coord. of left or right
2024 point */
2026 if (rPointSS->y <= lPointSS->y)
2028 splitY = rPointSS->y;
2029 splitOnLeft = 0;
2030 endY = lPointSS->y;
2032 else
2034 splitY = lPointSS->y;
2035 splitOnLeft = 1;
2036 endY = rPointSS->y;
2039 S3L_ScreenCoord currentY = tPointSS->y;
2041 /* We'll be using an algorithm similar to Bresenham line algorithm. The
2042 specifics of this algorithm are among others:
2044 - drawing possibly NON-CONTINUOUS line
2045 - NOT tracing the line exactly, but rather rasterizing one the right
2046 side of it, according to the pixel CENTERS, INCLUDING the pixel
2047 centers
2049 The principle is this:
2051 - Move vertically by pixels and accumulate the error (abs(dx/dy)).
2052 - If the error is greater than one (crossed the next pixel center), keep
2053 moving horizontally and substracting 1 from the error until it is less
2054 than 1 again.
2055 - To make this INTEGER ONLY, scale the case so that distance between
2056 pixels is equal to dy (instead of 1). This way the error becomes
2057 dx/dy * dy == dx, and we're comparing the error to (and potentially
2058 substracting) 1 * dy == dy. */
2060 int16_t
2061 /* triangle side:
2062 left right */
2063 lX, rX, // current x position on the screen
2064 lDx, rDx, // dx (end point - start point)
2065 lDy, rDy, // dy (end point - start point)
2066 lInc, rInc, // direction in which to increment (1 or -1)
2067 lErr, rErr, // current error (Bresenham)
2068 lErrCmp, rErrCmp, // helper for deciding comparison (> vs >=)
2069 lErrAdd, rErrAdd, // error value to add in each Bresenham cycle
2070 lErrSub, rErrSub; // error value to substract when moving in x direction
2072 S3L_FastLerpState lSideFLS, rSideFLS;
2074 #if S3L_COMPUTE_LERP_DEPTH
2075 S3L_FastLerpState lDepthFLS, rDepthFLS;
2077 #define initDepthFLS(s,p1,p2)\
2078 s##DepthFLS.valueScaled = p1##PointSS->z << S3L_FAST_LERP_QUALITY;\
2079 s##DepthFLS.stepScaled = ((p2##PointSS->z << S3L_FAST_LERP_QUALITY) -\
2080 s##DepthFLS.valueScaled) / (s##Dy != 0 ? s##Dy : 1);
2081 #else
2082 #define initDepthFLS(s,p1,p2) ;
2083 #endif
2085 /* init side for the algorithm, params:
2086 s - which side (l or r)
2087 p1 - point from (t, l or r)
2088 p2 - point to (t, l or r)
2089 down - whether the side coordinate goes top-down or vice versa */
2090 #define initSide(s,p1,p2,down)\
2091 s##X = p1##PointSS->x;\
2092 s##Dx = p2##PointSS->x - p1##PointSS->x;\
2093 s##Dy = p2##PointSS->y - p1##PointSS->y;\
2094 initDepthFLS(s,p1,p2)\
2095 s##SideFLS.stepScaled = (S3L_F << S3L_FAST_LERP_QUALITY)\
2096 / (s##Dy != 0 ? s##Dy : 1);\
2097 s##SideFLS.valueScaled = 0;\
2098 if (!down)\
2100 s##SideFLS.valueScaled =\
2101 S3L_F << S3L_FAST_LERP_QUALITY;\
2102 s##SideFLS.stepScaled *= -1;\
2104 s##Inc = s##Dx >= 0 ? 1 : -1;\
2105 if (s##Dx < 0)\
2106 {s##Err = 0; s##ErrCmp = 0;}\
2107 else\
2108 {s##Err = s##Dy; s##ErrCmp = 1;}\
2109 s##ErrAdd = S3L_abs(s##Dx);\
2110 s##ErrSub = s##Dy != 0 ? s##Dy : 1; /* don't allow 0, could lead to an
2111 infinite substracting loop */
2113 #define stepSide(s)\
2114 while (s##Err - s##Dy >= s##ErrCmp)\
2116 s##X += s##Inc;\
2117 s##Err -= s##ErrSub;\
2119 s##Err += s##ErrAdd;
2121 initSide(r,t,r,1)
2122 initSide(l,t,l,1)
2124 #if S3L_PERSPECTIVE_CORRECTION
2125 /* PC is done by linearly interpolating reciprocals from which the corrected
2126 velues can be computed. See
2127 http://www.lysator.liu.se/~mikaelk/doc/perspectivetexture/ */
2129 #if S3L_PERSPECTIVE_CORRECTION == 1
2130 #define Z_RECIP_NUMERATOR\
2131 (S3L_F * S3L_F * S3L_F)
2132 #elif S3L_PERSPECTIVE_CORRECTION == 2
2133 #define Z_RECIP_NUMERATOR\
2134 (S3L_F * S3L_F)
2135 #endif
2136 /* ^ This numerator is a number by which we divide values for the
2137 reciprocals. For PC == 2 it has to be lower because linear interpolation
2138 scaling would make it overflow -- this results in lower depth precision
2139 in bigger distance for PC == 2. */
2141 S3L_Unit
2142 tPointRecipZ, lPointRecipZ, rPointRecipZ, /* Reciprocals of the depth of
2143 each triangle point. */
2144 lRecip0, lRecip1, rRecip0, rRecip1; /* Helper variables for swapping
2145 the above after split. */
2147 tPointRecipZ = Z_RECIP_NUMERATOR / S3L_nonZero(tPointSS->z);
2148 lPointRecipZ = Z_RECIP_NUMERATOR / S3L_nonZero(lPointSS->z);
2149 rPointRecipZ = Z_RECIP_NUMERATOR / S3L_nonZero(rPointSS->z);
2151 lRecip0 = tPointRecipZ;
2152 lRecip1 = lPointRecipZ;
2153 rRecip0 = tPointRecipZ;
2154 rRecip1 = rPointRecipZ;
2156 #define manageSplitPerspective(b0,b1)\
2157 b1##Recip0 = b0##PointRecipZ;\
2158 b1##Recip1 = b1##PointRecipZ;\
2159 b0##Recip0 = b0##PointRecipZ;\
2160 b0##Recip1 = tPointRecipZ;
2161 #else
2162 #define manageSplitPerspective(b0,b1) ;
2163 #endif
2165 // clip to the screen in y dimension:
2167 endY = S3L_min(endY,S3L_RESOLUTION_Y);
2169 /* Clipping above the screen (y < 0) can't be easily done here, will be
2170 handled inside the loop. */
2172 while (currentY < endY) /* draw the triangle from top to bottom -- the
2173 bottom-most row is left out because, following
2174 from the rasterization rules (see start of the
2175 file), it is to never be rasterized. */
2177 if (currentY == splitY) // reached a vertical split of the triangle?
2179 #define manageSplit(b0,b1,s0,s1)\
2180 S3L_Unit *tmp = barycentric##b0;\
2181 barycentric##b0 = barycentric##b1;\
2182 barycentric##b1 = tmp;\
2183 s0##SideFLS.valueScaled = (S3L_F\
2184 << S3L_FAST_LERP_QUALITY) - s0##SideFLS.valueScaled;\
2185 s0##SideFLS.stepScaled *= -1;\
2186 manageSplitPerspective(s0,s1)
2188 if (splitOnLeft)
2190 initSide(l,l,r,0);
2191 manageSplit(0,2,r,l)
2193 else
2195 initSide(r,r,l,0);
2196 manageSplit(1,2,l,r)
2200 stepSide(r)
2201 stepSide(l)
2203 if (currentY >= 0) /* clipping of pixels whose y < 0 (can't be easily done
2204 outside the loop because of the Bresenham-like
2205 algorithm steps) */
2207 p.y = currentY;
2209 // draw the horizontal line
2211 #if !S3L_FLAT
2212 S3L_Unit rowLength = S3L_nonZero(rX - lX - 1); // prevent zero div
2214 #if S3L_PERSPECTIVE_CORRECTION
2215 S3L_Unit lOverZ, lRecipZ, rOverZ, rRecipZ, lT, rT;
2217 lT = S3L_getFastLerpValue(lSideFLS);
2218 rT = S3L_getFastLerpValue(rSideFLS);
2220 lOverZ = S3L_interpolateByUnitFrom0(lRecip1,lT);
2221 lRecipZ = S3L_interpolateByUnit(lRecip0,lRecip1,lT);
2223 rOverZ = S3L_interpolateByUnitFrom0(rRecip1,rT);
2224 rRecipZ = S3L_interpolateByUnit(rRecip0,rRecip1,rT);
2225 #else
2226 S3L_FastLerpState b0FLS, b1FLS;
2228 #if S3L_COMPUTE_LERP_DEPTH
2229 S3L_FastLerpState depthFLS;
2231 depthFLS.valueScaled = lDepthFLS.valueScaled;
2232 depthFLS.stepScaled =
2233 (rDepthFLS.valueScaled - lDepthFLS.valueScaled) / rowLength;
2234 #endif
2236 b0FLS.valueScaled = 0;
2237 b1FLS.valueScaled = lSideFLS.valueScaled;
2239 b0FLS.stepScaled = rSideFLS.valueScaled / rowLength;
2240 b1FLS.stepScaled = -1 * lSideFLS.valueScaled / rowLength;
2241 #endif
2242 #endif
2244 // clip to the screen in x dimension:
2246 S3L_ScreenCoord rXClipped = S3L_min(rX,S3L_RESOLUTION_X),
2247 lXClipped = lX;
2249 if (lXClipped < 0)
2251 lXClipped = 0;
2253 #if !S3L_PERSPECTIVE_CORRECTION && !S3L_FLAT
2254 b0FLS.valueScaled -= lX * b0FLS.stepScaled;
2255 b1FLS.valueScaled -= lX * b1FLS.stepScaled;
2257 #if S3L_COMPUTE_LERP_DEPTH
2258 depthFLS.valueScaled -= lX * depthFLS.stepScaled;
2259 #endif
2260 #endif
2263 #if S3L_PERSPECTIVE_CORRECTION
2264 S3L_ScreenCoord i = lXClipped - lX; /* helper var to save one
2265 substraction in the inner
2266 loop */
2267 #endif
2269 #if S3L_PERSPECTIVE_CORRECTION == 2
2270 S3L_FastLerpState
2271 depthPC, // interpolates depth between row segments
2272 b0PC, // interpolates barycentric0 between row segments
2273 b1PC; // interpolates barycentric1 between row segments
2275 /* ^ These interpolate values between row segments (lines of pixels
2276 of S3L_PC_APPROX_LENGTH length). After each row segment perspective
2277 correction is recomputed. */
2279 depthPC.valueScaled =
2280 (Z_RECIP_NUMERATOR /
2281 S3L_nonZero(S3L_interpolate(lRecipZ,rRecipZ,i,rowLength)))
2282 << S3L_FAST_LERP_QUALITY;
2284 b0PC.valueScaled =
2286 S3L_interpolateFrom0(rOverZ,i,rowLength)
2287 * depthPC.valueScaled
2288 ) / (Z_RECIP_NUMERATOR / S3L_F);
2290 b1PC.valueScaled =
2292 (lOverZ - S3L_interpolateFrom0(lOverZ,i,rowLength))
2293 * depthPC.valueScaled
2294 ) / (Z_RECIP_NUMERATOR / S3L_F);
2296 int8_t rowCount = S3L_PC_APPROX_LENGTH;
2297 #endif
2299 #if S3L_Z_BUFFER
2300 uint32_t zBufferIndex = p.y * S3L_RESOLUTION_X + lXClipped;
2301 #endif
2303 // draw the row -- inner loop:
2304 for (S3L_ScreenCoord x = lXClipped; x < rXClipped; ++x)
2306 int8_t testsPassed = 1;
2308 #if S3L_STENCIL_BUFFER
2309 if (!S3L_stencilTest(x,p.y))
2310 testsPassed = 0;
2311 #endif
2312 p.x = x;
2314 #if S3L_COMPUTE_DEPTH
2315 #if S3L_PERSPECTIVE_CORRECTION == 1
2316 p.depth = Z_RECIP_NUMERATOR /
2317 S3L_nonZero(S3L_interpolate(lRecipZ,rRecipZ,i,rowLength));
2318 #elif S3L_PERSPECTIVE_CORRECTION == 2
2319 if (rowCount >= S3L_PC_APPROX_LENGTH)
2321 // init the linear interpolation to the next PC correct value
2323 rowCount = 0;
2325 S3L_Unit nextI = i + S3L_PC_APPROX_LENGTH;
2327 if (nextI < rowLength)
2329 S3L_Unit nextDepthScaled =
2331 Z_RECIP_NUMERATOR /
2332 S3L_nonZero(S3L_interpolate(lRecipZ,rRecipZ,nextI,rowLength))
2333 ) << S3L_FAST_LERP_QUALITY;
2335 depthPC.stepScaled =
2336 (nextDepthScaled - depthPC.valueScaled) / S3L_PC_APPROX_LENGTH;
2338 S3L_Unit nextValue =
2340 S3L_interpolateFrom0(rOverZ,nextI,rowLength)
2341 * nextDepthScaled
2342 ) / (Z_RECIP_NUMERATOR / S3L_F);
2344 b0PC.stepScaled =
2345 (nextValue - b0PC.valueScaled) / S3L_PC_APPROX_LENGTH;
2347 nextValue =
2349 (lOverZ - S3L_interpolateFrom0(lOverZ,nextI,rowLength))
2350 * nextDepthScaled
2351 ) / (Z_RECIP_NUMERATOR / S3L_F);
2353 b1PC.stepScaled =
2354 (nextValue - b1PC.valueScaled) / S3L_PC_APPROX_LENGTH;
2356 else
2358 /* A special case where we'd be interpolating outside the triangle.
2359 It seems like a valid approach at first, but it creates a bug
2360 in a case when the rasaterized triangle is near screen 0 and can
2361 actually never reach the extrapolated screen position. So we
2362 have to clamp to the actual end of the triangle here. */
2364 S3L_Unit maxI = S3L_nonZero(rowLength - i);
2366 S3L_Unit nextDepthScaled =
2368 Z_RECIP_NUMERATOR /
2369 S3L_nonZero(rRecipZ)
2370 ) << S3L_FAST_LERP_QUALITY;
2372 depthPC.stepScaled =
2373 (nextDepthScaled - depthPC.valueScaled) / maxI;
2375 S3L_Unit nextValue =
2377 rOverZ
2378 * nextDepthScaled
2379 ) / (Z_RECIP_NUMERATOR / S3L_F);
2381 b0PC.stepScaled =
2382 (nextValue - b0PC.valueScaled) / maxI;
2384 b1PC.stepScaled =
2385 -1 * b1PC.valueScaled / maxI;
2389 p.depth = S3L_getFastLerpValue(depthPC);
2390 #else
2391 p.depth = S3L_getFastLerpValue(depthFLS);
2392 S3L_stepFastLerp(depthFLS);
2393 #endif
2394 #else // !S3L_COMPUTE_DEPTH
2395 p.depth = (tPointSS->z + lPointSS->z + rPointSS->z) / 3;
2396 #endif
2398 #if S3L_Z_BUFFER
2399 p.previousZ = S3L_zBuffer[zBufferIndex];
2401 zBufferIndex++;
2403 if (!S3L_zTest(p.x,p.y,p.depth))
2404 testsPassed = 0;
2405 #endif
2407 if (testsPassed)
2409 #if !S3L_FLAT
2410 #if S3L_PERSPECTIVE_CORRECTION == 0
2411 *barycentric0 = S3L_getFastLerpValue(b0FLS);
2412 *barycentric1 = S3L_getFastLerpValue(b1FLS);
2413 #elif S3L_PERSPECTIVE_CORRECTION == 1
2414 *barycentric0 =
2416 S3L_interpolateFrom0(rOverZ,i,rowLength)
2417 * p.depth
2418 ) / (Z_RECIP_NUMERATOR / S3L_F);
2420 *barycentric1 =
2422 (lOverZ - S3L_interpolateFrom0(lOverZ,i,rowLength))
2423 * p.depth
2424 ) / (Z_RECIP_NUMERATOR / S3L_F);
2425 #elif S3L_PERSPECTIVE_CORRECTION == 2
2426 *barycentric0 = S3L_getFastLerpValue(b0PC);
2427 *barycentric1 = S3L_getFastLerpValue(b1PC);
2428 #endif
2430 *barycentric2 =
2431 S3L_F - *barycentric0 - *barycentric1;
2432 #endif
2434 #if S3L_NEAR_CROSS_STRATEGY == 3
2435 if (_S3L_projectedTriangleState != 0)
2437 S3L_Unit newBarycentric[3];
2439 newBarycentric[0] = S3L_interpolateBarycentric(
2440 _S3L_triangleRemapBarycentrics[0].x,
2441 _S3L_triangleRemapBarycentrics[1].x,
2442 _S3L_triangleRemapBarycentrics[2].x,
2443 p.barycentric);
2445 newBarycentric[1] = S3L_interpolateBarycentric(
2446 _S3L_triangleRemapBarycentrics[0].y,
2447 _S3L_triangleRemapBarycentrics[1].y,
2448 _S3L_triangleRemapBarycentrics[2].y,
2449 p.barycentric);
2451 newBarycentric[2] = S3L_interpolateBarycentric(
2452 _S3L_triangleRemapBarycentrics[0].z,
2453 _S3L_triangleRemapBarycentrics[1].z,
2454 _S3L_triangleRemapBarycentrics[2].z,
2455 p.barycentric);
2457 p.barycentric[0] = newBarycentric[0];
2458 p.barycentric[1] = newBarycentric[1];
2459 p.barycentric[2] = newBarycentric[2];
2461 #endif
2462 S3L_PIXEL_FUNCTION(&p);
2463 } // tests passed
2465 #if !S3L_FLAT
2466 #if S3L_PERSPECTIVE_CORRECTION
2467 i++;
2468 #if S3L_PERSPECTIVE_CORRECTION == 2
2469 rowCount++;
2471 S3L_stepFastLerp(depthPC);
2472 S3L_stepFastLerp(b0PC);
2473 S3L_stepFastLerp(b1PC);
2474 #endif
2475 #else
2476 S3L_stepFastLerp(b0FLS);
2477 S3L_stepFastLerp(b1FLS);
2478 #endif
2479 #endif
2480 } // inner loop
2481 } // y clipping
2483 #if !S3L_FLAT
2484 S3L_stepFastLerp(lSideFLS);
2485 S3L_stepFastLerp(rSideFLS);
2487 #if S3L_COMPUTE_LERP_DEPTH
2488 S3L_stepFastLerp(lDepthFLS);
2489 S3L_stepFastLerp(rDepthFLS);
2490 #endif
2491 #endif
2493 ++currentY;
2494 } // row drawing
2496 #undef manageSplit
2497 #undef initPC
2498 #undef initSide
2499 #undef stepSide
2500 #undef Z_RECIP_NUMERATOR
2503 void S3L_rotate2DPoint(S3L_Unit *x, S3L_Unit *y, S3L_Unit angle)
2505 if (angle < S3L_SIN_TABLE_UNIT_STEP)
2506 return; // no visible rotation
2508 S3L_Unit angleSin = S3L_sin(angle);
2509 S3L_Unit angleCos = S3L_cos(angle);
2511 S3L_Unit xBackup = *x;
2513 *x =
2514 (angleCos * (*x)) / S3L_F -
2515 (angleSin * (*y)) / S3L_F;
2517 *y =
2518 (angleSin * xBackup) / S3L_F +
2519 (angleCos * (*y)) / S3L_F;
2522 void S3L_makeWorldMatrix(S3L_Transform3D worldTransform, S3L_Mat4 m)
2524 S3L_makeScaleMatrix(
2525 worldTransform.scale.x,
2526 worldTransform.scale.y,
2527 worldTransform.scale.z,
2530 S3L_Mat4 t;
2532 S3L_makeRotationMatrixZXY(
2533 worldTransform.rotation.x,
2534 worldTransform.rotation.y,
2535 worldTransform.rotation.z,
2538 S3L_mat4Xmat4(m,t);
2540 S3L_makeTranslationMat(
2541 worldTransform.translation.x,
2542 worldTransform.translation.y,
2543 worldTransform.translation.z,
2546 S3L_mat4Xmat4(m,t);
2549 void S3L_mat4Transpose(S3L_Mat4 m)
2551 S3L_Unit tmp;
2553 for (uint8_t y = 0; y < 3; ++y)
2554 for (uint8_t x = 1 + y; x < 4; ++x)
2556 tmp = m[x][y];
2557 m[x][y] = m[y][x];
2558 m[y][x] = tmp;
2562 void S3L_makeCameraMatrix(S3L_Transform3D cameraTransform, S3L_Mat4 m)
2564 S3L_makeTranslationMat(
2565 -1 * cameraTransform.translation.x,
2566 -1 * cameraTransform.translation.y,
2567 -1 * cameraTransform.translation.z,
2570 S3L_Mat4 r;
2572 S3L_makeRotationMatrixZXY(
2573 cameraTransform.rotation.x,
2574 cameraTransform.rotation.y,
2575 cameraTransform.rotation.z,
2578 S3L_mat4Transpose(r); // transposing creates an inverse transform
2580 S3L_Mat4 s;
2582 S3L_makeScaleMatrix(
2583 cameraTransform.scale.x,
2584 cameraTransform.scale.y,
2585 cameraTransform.scale.z,s);
2587 S3L_mat4Xmat4(m,r);
2588 S3L_mat4Xmat4(m,s);
2591 int8_t S3L_triangleWinding(
2592 S3L_ScreenCoord x0,
2593 S3L_ScreenCoord y0,
2594 S3L_ScreenCoord x1,
2595 S3L_ScreenCoord y1,
2596 S3L_ScreenCoord x2,
2597 S3L_ScreenCoord y2)
2599 int32_t winding =
2600 (y1 - y0) * (x2 - x1) - (x1 - x0) * (y2 - y1);
2601 // ^ cross product for points with z == 0
2603 return winding > 0 ? 1 : (winding < 0 ? -1 : 0);
2606 /** Checks if given triangle (in Screen Space) is at least partially visible,
2607 i.e. returns false if the triangle is either completely outside the frustum
2608 (left, right, top, bottom, near) or is invisible due to backface culling. */
2609 static inline int8_t S3L_triangleIsVisible(
2610 S3L_Vec4 p0,
2611 S3L_Vec4 p1,
2612 S3L_Vec4 p2,
2613 uint8_t backfaceCulling)
2615 #define clipTest(c,cmp,v)\
2616 (p0.c cmp (v) && p1.c cmp (v) && p2.c cmp (v))
2618 if ( // outside frustum?
2619 #if S3L_NEAR_CROSS_STRATEGY == 0
2620 p0.z <= S3L_NEAR || p1.z <= S3L_NEAR || p2.z <= S3L_NEAR ||
2621 // ^ partially in front of NEAR?
2622 #else
2623 clipTest(z,<=,S3L_NEAR) || // completely in front of NEAR?
2624 #endif
2625 clipTest(x,<,0) ||
2626 clipTest(x,>=,S3L_RESOLUTION_X) ||
2627 clipTest(y,<,0) ||
2628 clipTest(y,>,S3L_RESOLUTION_Y)
2630 return 0;
2632 #undef clipTest
2634 if (backfaceCulling != 0)
2636 int8_t winding =
2637 S3L_triangleWinding(p0.x,p0.y,p1.x,p1.y,p2.x,p2.y);
2639 if ((backfaceCulling == 1 && winding > 0) ||
2640 (backfaceCulling == 2 && winding < 0))
2641 return 0;
2644 return 1;
2647 #if S3L_SORT != 0
2648 typedef struct
2650 uint8_t modelIndex;
2651 S3L_Index triangleIndex;
2652 uint16_t sortValue;
2653 } _S3L_TriangleToSort;
2655 _S3L_TriangleToSort S3L_sortArray[S3L_MAX_TRIANGES_DRAWN];
2656 uint16_t S3L_sortArrayLength;
2657 #endif
2659 void _S3L_projectVertex(const S3L_Model3D *model, S3L_Index triangleIndex,
2660 uint8_t vertex, S3L_Mat4 projectionMatrix, S3L_Vec4 *result)
2662 uint32_t vertexIndex = model->triangles[triangleIndex * 3 + vertex] * 3;
2664 result->x = model->vertices[vertexIndex];
2665 result->y = model->vertices[vertexIndex + 1];
2666 result->z = model->vertices[vertexIndex + 2];
2667 result->w = S3L_F; // needed for translation
2669 S3L_vec3Xmat4(result,projectionMatrix);
2671 result->w = result->z;
2672 /* We'll keep the non-clamped z in w for sorting. */
2675 void _S3L_mapProjectedVertexToScreen(S3L_Vec4 *vertex, S3L_Unit focalLength)
2677 vertex->z = vertex->z >= S3L_NEAR ? vertex->z : S3L_NEAR;
2678 /* ^ This firstly prevents zero division in the follwoing z-divide and
2679 secondly "pushes" vertices that are in front of near a little bit forward,
2680 which makes them behave a bit better. If all three vertices end up exactly
2681 on NEAR, the triangle will be culled. */
2683 S3L_perspectiveDivide(vertex,focalLength);
2685 S3L_ScreenCoord sX, sY;
2687 S3L_mapProjectionPlaneToScreen(*vertex,&sX,&sY);
2689 vertex->x = sX;
2690 vertex->y = sY;
2693 /** Projects a triangle to the screen. If enabled, a triangle can be potentially
2694 subdivided into two if it crosses the near plane, in which case two projected
2695 triangles are returned (the info about splitting or cutting the triangle is
2696 passed in global variables, see above). */
2697 void _S3L_projectTriangle(
2698 const S3L_Model3D *model,
2699 S3L_Index triangleIndex,
2700 S3L_Mat4 matrix,
2701 uint32_t focalLength,
2702 S3L_Vec4 transformed[6])
2704 _S3L_projectVertex(model,triangleIndex,0,matrix,&(transformed[0]));
2705 _S3L_projectVertex(model,triangleIndex,1,matrix,&(transformed[1]));
2706 _S3L_projectVertex(model,triangleIndex,2,matrix,&(transformed[2]));
2707 _S3L_projectedTriangleState = 0;
2709 #if S3L_NEAR_CROSS_STRATEGY == 2 || S3L_NEAR_CROSS_STRATEGY == 3
2710 uint8_t infront = 0;
2711 uint8_t behind = 0;
2712 uint8_t infrontI[3];
2713 uint8_t behindI[3];
2715 for (uint8_t i = 0; i < 3; ++i)
2716 if (transformed[i].z < S3L_NEAR)
2718 infrontI[infront] = i;
2719 infront++;
2721 else
2723 behindI[behind] = i;
2724 behind++;
2727 #if S3L_NEAR_CROSS_STRATEGY == 3
2728 for (int i = 0; i < 3; ++i)
2729 S3L_vec4Init(&(_S3L_triangleRemapBarycentrics[i]));
2731 _S3L_triangleRemapBarycentrics[0].x = S3L_F;
2732 _S3L_triangleRemapBarycentrics[1].y = S3L_F;
2733 _S3L_triangleRemapBarycentrics[2].z = S3L_F;
2734 #endif
2736 #define interpolateVertex \
2737 S3L_Unit ratio =\
2738 ((transformed[be].z - S3L_NEAR) * S3L_F) /\
2739 (transformed[be].z - transformed[in].z);\
2740 transformed[in].x = transformed[be].x - \
2741 ((transformed[be].x - transformed[in].x) * ratio) /\
2742 S3L_F;\
2743 transformed[in].y = transformed[be].y -\
2744 ((transformed[be].y - transformed[in].y) * ratio) /\
2745 S3L_F;\
2746 transformed[in].z = S3L_NEAR;\
2747 if (beI != 0) {\
2748 beI->x = (beI->x * ratio) / S3L_F;\
2749 beI->y = (beI->y * ratio) / S3L_F;\
2750 beI->z = (beI->z * ratio) / S3L_F;\
2751 ratio = S3L_F - ratio;\
2752 beI->x += (beB->x * ratio) / S3L_F;\
2753 beI->y += (beB->y * ratio) / S3L_F;\
2754 beI->z += (beB->z * ratio) / S3L_F; }
2756 if (infront == 2)
2758 // shift the two vertices forward along the edge
2759 for (uint8_t i = 0; i < 2; ++i)
2761 uint8_t be = behindI[0], in = infrontI[i];
2763 #if S3L_NEAR_CROSS_STRATEGY == 3
2764 S3L_Vec4 *beI = &(_S3L_triangleRemapBarycentrics[in]),
2765 *beB = &(_S3L_triangleRemapBarycentrics[be]);
2766 #else
2767 S3L_Vec4 *beI = 0, *beB = 0;
2768 #endif
2770 interpolateVertex
2772 _S3L_projectedTriangleState = 1;
2775 else if (infront == 1)
2777 // create another triangle and do the shifts
2778 transformed[3] = transformed[behindI[1]];
2779 transformed[4] = transformed[infrontI[0]];
2780 transformed[5] = transformed[infrontI[0]];
2782 #if S3L_NEAR_CROSS_STRATEGY == 3
2783 _S3L_triangleRemapBarycentrics[3] =
2784 _S3L_triangleRemapBarycentrics[behindI[1]];
2785 _S3L_triangleRemapBarycentrics[4] =
2786 _S3L_triangleRemapBarycentrics[infrontI[0]];
2787 _S3L_triangleRemapBarycentrics[5] =
2788 _S3L_triangleRemapBarycentrics[infrontI[0]];
2789 #endif
2791 for (uint8_t i = 0; i < 2; ++i)
2793 uint8_t be = behindI[i], in = i + 4;
2795 #if S3L_NEAR_CROSS_STRATEGY == 3
2796 S3L_Vec4 *beI = &(_S3L_triangleRemapBarycentrics[in]),
2797 *beB = &(_S3L_triangleRemapBarycentrics[be]);
2798 #else
2799 S3L_Vec4 *beI = 0, *beB = 0;
2800 #endif
2802 interpolateVertex
2805 #if S3L_NEAR_CROSS_STRATEGY == 3
2806 _S3L_triangleRemapBarycentrics[infrontI[0]] =
2807 _S3L_triangleRemapBarycentrics[4];
2808 #endif
2810 transformed[infrontI[0]] = transformed[4];
2812 _S3L_mapProjectedVertexToScreen(&transformed[3],focalLength);
2813 _S3L_mapProjectedVertexToScreen(&transformed[4],focalLength);
2814 _S3L_mapProjectedVertexToScreen(&transformed[5],focalLength);
2816 _S3L_projectedTriangleState = 2;
2819 #undef interpolateVertex
2820 #endif // S3L_NEAR_CROSS_STRATEGY == 2
2822 _S3L_mapProjectedVertexToScreen(&transformed[0],focalLength);
2823 _S3L_mapProjectedVertexToScreen(&transformed[1],focalLength);
2824 _S3L_mapProjectedVertexToScreen(&transformed[2],focalLength);
2827 void S3L_drawScene(S3L_Scene scene)
2829 S3L_Mat4 matFinal, matCamera;
2830 S3L_Vec4 transformed[6]; // transformed triangle coords, for 2 triangles
2832 const S3L_Model3D *model;
2833 S3L_Index modelIndex, triangleIndex;
2835 S3L_makeCameraMatrix(scene.camera.transform,matCamera);
2837 #if S3L_SORT != 0
2838 uint16_t previousModel = 0;
2839 S3L_sortArrayLength = 0;
2840 #endif
2842 for (modelIndex = 0; modelIndex < scene.modelCount; ++modelIndex)
2844 if (!scene.models[modelIndex].config.visible)
2845 continue;
2847 #if S3L_SORT != 0
2848 if (S3L_sortArrayLength >= S3L_MAX_TRIANGES_DRAWN)
2849 break;
2851 previousModel = modelIndex;
2852 #endif
2854 if (scene.models[modelIndex].customTransformMatrix == 0)
2855 S3L_makeWorldMatrix(scene.models[modelIndex].transform,matFinal);
2856 else
2858 S3L_Mat4 *m = scene.models[modelIndex].customTransformMatrix;
2860 for (int8_t j = 0; j < 4; ++j)
2861 for (int8_t i = 0; i < 4; ++i)
2862 matFinal[i][j] = (*m)[i][j];
2865 S3L_mat4Xmat4(matFinal,matCamera);
2867 S3L_Index triangleCount = scene.models[modelIndex].triangleCount;
2869 triangleIndex = 0;
2871 model = &(scene.models[modelIndex]);
2873 while (triangleIndex < triangleCount)
2875 /* Some kind of cache could be used in theory to not project perviously
2876 already projected vertices, but after some testing this was abandoned,
2877 no gain was seen. */
2879 _S3L_projectTriangle(model,triangleIndex,matFinal,
2880 scene.camera.focalLength,transformed);
2882 if (S3L_triangleIsVisible(transformed[0],transformed[1],transformed[2],
2883 model->config.backfaceCulling))
2885 #if S3L_SORT == 0
2886 // without sorting draw right away
2887 S3L_drawTriangle(transformed[0],transformed[1],transformed[2],modelIndex,
2888 triangleIndex);
2890 if (_S3L_projectedTriangleState == 2) // draw potential subtriangle
2892 #if S3L_NEAR_CROSS_STRATEGY == 3
2893 _S3L_triangleRemapBarycentrics[0] = _S3L_triangleRemapBarycentrics[3];
2894 _S3L_triangleRemapBarycentrics[1] = _S3L_triangleRemapBarycentrics[4];
2895 _S3L_triangleRemapBarycentrics[2] = _S3L_triangleRemapBarycentrics[5];
2896 #endif
2898 S3L_drawTriangle(transformed[3],transformed[4],transformed[5],
2899 modelIndex, triangleIndex);
2901 #else
2903 if (S3L_sortArrayLength >= S3L_MAX_TRIANGES_DRAWN)
2904 break;
2906 // with sorting add to a sort list
2907 S3L_sortArray[S3L_sortArrayLength].modelIndex = modelIndex;
2908 S3L_sortArray[S3L_sortArrayLength].triangleIndex = triangleIndex;
2909 S3L_sortArray[S3L_sortArrayLength].sortValue = S3L_zeroClamp(
2910 transformed[0].w + transformed[1].w + transformed[2].w) >> 2;
2911 /* ^
2912 The w component here stores non-clamped z.
2914 As a simple approximation we sort by the triangle center point,
2915 which is a mean coordinate -- we don't actually have to divide by 3
2916 (or anything), that is unnecessary for sorting! We shift by 2 just
2917 as a fast operation to prevent overflow of the sum over uint_16t. */
2919 S3L_sortArrayLength++;
2920 #endif
2923 triangleIndex++;
2927 #if S3L_SORT != 0
2929 #if S3L_SORT == 1
2930 #define cmp <
2931 #else
2932 #define cmp >
2933 #endif
2935 /* Sort the triangles. We use insertion sort, because it has many advantages,
2936 especially for smaller arrays (better than bubble sort, in-place, stable,
2937 simple, ...). */
2939 for (int16_t i = 1; i < S3L_sortArrayLength; ++i)
2941 _S3L_TriangleToSort tmp = S3L_sortArray[i];
2943 int16_t j = i - 1;
2945 while (j >= 0 && S3L_sortArray[j].sortValue cmp tmp.sortValue)
2947 S3L_sortArray[j + 1] = S3L_sortArray[j];
2948 j--;
2951 S3L_sortArray[j + 1] = tmp;
2954 #undef cmp
2956 for (S3L_Index i = 0; i < S3L_sortArrayLength; ++i) // draw sorted triangles
2958 modelIndex = S3L_sortArray[i].modelIndex;
2959 triangleIndex = S3L_sortArray[i].triangleIndex;
2961 model = &(scene.models[modelIndex]);
2963 if (modelIndex != previousModel)
2965 // only recompute the matrix when the model has changed
2966 S3L_makeWorldMatrix(model->transform,matFinal);
2967 S3L_mat4Xmat4(matFinal,matCamera);
2968 previousModel = modelIndex;
2971 /* Here we project the points again, which is redundant and slow as they've
2972 already been projected above, but saving the projected points would
2973 require a lot of memory, which for small resolutions could be even
2974 worse than z-bufer. So this seems to be the best way memory-wise. */
2976 _S3L_projectTriangle(model,triangleIndex,matFinal,scene.camera.focalLength,
2977 transformed);
2979 S3L_drawTriangle(transformed[0],transformed[1],transformed[2],modelIndex,
2980 triangleIndex);
2982 if (_S3L_projectedTriangleState == 2)
2984 #if S3L_NEAR_CROSS_STRATEGY == 3
2985 _S3L_triangleRemapBarycentrics[0] = _S3L_triangleRemapBarycentrics[3];
2986 _S3L_triangleRemapBarycentrics[1] = _S3L_triangleRemapBarycentrics[4];
2987 _S3L_triangleRemapBarycentrics[2] = _S3L_triangleRemapBarycentrics[5];
2988 #endif
2990 S3L_drawTriangle(transformed[3],transformed[4],transformed[5],
2991 modelIndex, triangleIndex);
2994 #endif
2997 #endif // guard