[Software](software.md) (SW) rendering refers to [rendering](rendering.md) [computer graphics](graphics.md) without the help of a [graphics card](gpu.md) (GPU), or in other words computing images only with the [CPU](cpu.md). Most commonly the term means rendering [3D graphics](3d_rendering.md) but it may as well refer to other sorts of graphics such as drawing [fonts](font.md) or [video](video.md). Before the invention of GPUs all rendering was done in software of course -- games such as [Quake](quake.md) or Thief were designed with SW rendering and only added optional GPU acceleration later. SW rendering for traditional 3D graphics is also called software [rasterization](rasterization.md), for rasterization is the basis of current real-time 3D graphics.

SW rendering has advantages and disadvantages, though from our point of view its advantages prevail (at least given only capitalist GPUs exist nowadays). On the downside it is **much slower** than GPU graphics -- GPUs are designed to perform graphics-specific operations very quickly and, more importantly, they can process many pixels (and other elements) in [parallel](parallelism.md), while a CPU has to compute pixels sequentially one by one and that in addition to all other computations it is otherwise performing. This causes a much lower [FPS](fps.md) in SW rendering. For this reason SW rendering is also normally of **lower quality** (lower resolution, [nearest neighbour](nn.md) texture filtering, ...) to allow workable FPS. Nevertheless thanks to the ginormous speeds of today's CPUs simple fullscreen SW rendering can be pretty fast on PCs and achieve even above 60 FPS; on slower CPUs (typically [embedded](embedded.md)) SW rendering is usable normally at around 30 FPS if resolutions are kept small.

On the other hand SW rendering is more [portable](portability.md) (as it can be written purely in a portable language such as [C](c.md)), less [bloated](bloat.md) and **eliminates the [dependency](dependency.md) on GPU** so it will be supported almost anywhere as every computer has a CPU, while not all computers (such as [embedded](embedded.md) devices) have a GPU (or, if they have it, it may not be sufficient, supported or have a required [driver](driver.md)). SW rendering may also be implemented in a simpler way and it may be easier to deal with as there is e.g. no need to write [shaders](shader.md) in a special language, manage transfer of data between CPU and GPU or deal with parallel programming. SW rendering is the [KISS](kiss.md) approach.

SW rendering may also utilize a much wider variety of rendering techniques than only 3D [rasterization](rasterization.md) traditionally used with [GPUs](gpu.md) and their [APIs](api.md), thanks to not being limited by hard-wired pipelines, i.e. it is more flexible. This may include [splatting](splatting.md), [raytracing](raytracing.md) or [BSP rendering](bsp.md) (and many other ["pseudo 3D"](pseudo3d.md) techniques).

A lot of software and rendering frameworks offer both options: accelerated rendering using GPU and SW rendering as a [fallback](fallback.md) (in case the first option is not possible). Sometimes there exists a rendering [API](api.md) that has both an accelerated and a software implementation (e.g. [TinyGL](tinygl.md) for [OpenGL](opengl.md)).

For simpler and even somewhat more complex graphics **purely software rendering is mostly the best choice**. [LRS](lrs.md) suggests you prefer this kind of rendering for its simplicity and [portability](portability.md), at least as one possible option. On devices with lower resolution not many pixels need to be computed so SW rendering can actually be pretty fast despite low specs, and on "big" computers there is nowadays usually an extremely fast CPU available that can handle comfortable FPS at higher resolutions. There is an LRS software renderer you can use: [small3dlib](s3l.md).

SW renderers are also written for the purpose of verifying rendering hardware, i.e. as a [reference implementation](reference_implementation.md).

Note that SW rendering doesn't mean our program never touches the [GPU](gpu.md) at all; in fact most personal computers nowadays **require** some kind of GPU to even display anything. SW rendering only means that computation of the image to be displayed doesn't use any hardware specialized for this purpose.

Some SW renderers make use of specialized CPU instructions such as [MMX](mmx.md) which can make SW rendering faster thanks to handling multiple data in a single step. This is kind of a middle ground: it is not using a GPU per se but only a mild form of hardware acceleration. The speed won't reach that of a GPU but will outperform a "pure" SW renderer. However the disadvantage of a hardware dependency is still present: the CPU has to support the MMX instruction set. Good renderers only use these instructions optionally and fall back to a general implementation in case MMX is not supported.

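
The optional use of such instructions with a general fallback might look roughly like the following sketch. Note the assumptions: it uses SSE2 intrinsics (a later x86 extension) in place of MMX, it selects the path at compile time rather than detecting the CPU at run time, and the names `brighten` and `HAVE_SIMD` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: SSE2 stands in for MMX here; the path is chosen at compile
   time, a real renderer may prefer run time detection. */
#if defined(__SSE2__)
  #include <emmintrin.h>
  #define HAVE_SIMD 1
#else
  #define HAVE_SIMD 0
#endif

/* Adds amount to every 8 bit pixel value, saturating at 255. */
void brighten(uint8_t *pixels, size_t count, uint8_t amount)
{
  size_t i = 0;

#if HAVE_SIMD
  __m128i add = _mm_set1_epi8((char) amount);

  for (; i + 16 <= count; i += 16) /* 16 pixels in a single step */
  {
    __m128i p = _mm_loadu_si128((const __m128i *) (pixels + i));
    _mm_storeu_si128((__m128i *) (pixels + i), _mm_adds_epu8(p, add));
  }
#endif

  /* general implementation: handles everything if SIMD is not available,
     otherwise only the remaining pixels */
  for (; i < count; ++i)
  {
    unsigned int sum = (unsigned int) pixels[i] + amount;
    pixels[i] = (uint8_t) (sum > 255 ? 255 : sum);
  }
}
```

Both paths give the same result, so the program keeps working on a CPU without the extension, only slower.
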
## Programming A Software Rasterizer

{ In case [small3dlib](small3dlib.md) is somehow not enough for you :) ~drummyfish }

The difficulty of this task depends on the features you want -- a super simple [flat shaded](flat_shading.md) (no textures, no smooth [shading](shading.md)) renderer is relatively easy to make, especially if you don't need a movable camera, can afford to use [floating point](float.md) etc. See the details of [3D rendering](3d_rendering.md), especially how the GPU pipelines work, and try to imitate them in software. The core of these renderers is the **[triangle](triangle.md) [rasterization](rasterization.md)** algorithm which, if you want, can be very simple -- even a naive one will give workable results -- or pretty complex and advanced, using various optimizations and things such as the [top-left rule](top_left_rule.md) to guarantee no holes and overlaps of triangles. Remember this function will likely be the performance [bottleneck](bottleneck.md) of your renderer so you want to put effort into [optimizing](optimization.md) it to achieve good [FPS](fps.md). Once you have triangle rasterization, you can draw 3D models which consist of vertices (points in 3D space) and triangles between these vertices (it's very simple to load simple 3D models e.g. from the [obj](obj.md) format) -- you simply project (using [perspective](perspective.md)) the 3D position of each vertex to screen coordinates and draw triangles between these pixels with the rasterization algorithm. Here you also need to solve [visibility](visibility.md), i.e. possible overlap of triangles on the screen and correctly drawing those nearer the view in front of those that are further away -- a very simple solution is a [z buffer](z_buffer.md), but to save memory you can also e.g. [sort](sorting.md) the triangles by distance and draw them back-to-front ([painter's algorithm](painters_algorithm.md)). You may add a [scene](scene.md) data structure that can hold multiple models to be rendered.

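
The steps above (perspective projection, naive triangle rasterization, z buffer) can be sketched as follows. This is only an illustration under assumptions, not how any particular renderer does it: it uses floating point, a fixed 64x48 screen, a camera sitting at the origin and looking along +z, no near plane clipping and no top-left rule. All names (`project`, `drawTriangle`, `colorBuffer`, ...) are made up for the example.

```c
#include <stdint.h>

#define SCREEN_W 64 /* assumed resolution, fixed at compile time */
#define SCREEN_H 48

typedef struct { float x, y, z; } Vec3;            /* point in camera space */
typedef struct { int x, y; float z; } ScreenPoint; /* pixel position + depth */

static uint8_t colorBuffer[SCREEN_W * SCREEN_H];
static float depthBuffer[SCREEN_W * SCREEN_H];

void clearBuffers(void)
{
  for (int i = 0; i < SCREEN_W * SCREEN_H; ++i)
  {
    colorBuffer[i] = 0;
    depthBuffer[i] = 1.0e30f; /* stands for "infinitely far" */
  }
}

/* Perspective projection, focalLength is in pixels. The caller has to make
   sure that p.z > 0, there is no near plane handling here. */
ScreenPoint project(Vec3 p, float focalLength)
{
  ScreenPoint r;
  r.x = (int) (SCREEN_W / 2 + focalLength * p.x / p.z);
  r.y = (int) (SCREEN_H / 2 - focalLength * p.y / p.z);
  r.z = p.z;
  return r;
}

/* Twice the signed area of triangle (a, b, (px,py)), the sign says on which
   side of the edge a->b the point lies. */
static int edge(ScreenPoint a, ScreenPoint b, int px, int py)
{
  return (b.x - a.x) * (py - a.y) - (b.y - a.y) * (px - a.x);
}

static int minInt(int a, int b) { return a < b ? a : b; }
static int maxInt(int a, int b) { return a > b ? a : b; }

/* Naive rasterization: tests every pixel of the bounding box. */
void drawTriangle(ScreenPoint a, ScreenPoint b, ScreenPoint c, uint8_t color)
{
  int area = edge(a, b, c.x, c.y);

  if (area == 0)
    return; /* degenerate triangle, nothing to draw */

  int sign = area > 0 ? 1 : -1; /* makes the test independent of winding */

  int x0 = maxInt(minInt(a.x, minInt(b.x, c.x)), 0);
  int x1 = minInt(maxInt(a.x, maxInt(b.x, c.x)), SCREEN_W - 1);
  int y0 = maxInt(minInt(a.y, minInt(b.y, c.y)), 0);
  int y1 = minInt(maxInt(a.y, maxInt(b.y, c.y)), SCREEN_H - 1);

  for (int y = y0; y <= y1; ++y)
    for (int x = x0; x <= x1; ++x)
    {
      int w0 = sign * edge(b, c, x, y); /* weight of vertex a */
      int w1 = sign * edge(c, a, x, y); /* weight of vertex b */
      int w2 = sign * edge(a, b, x, y); /* weight of vertex c */

      if (w0 < 0 || w1 < 0 || w2 < 0)
        continue; /* pixel is outside the triangle */

      /* depth interpolated linearly in screen space, a simplification
         (a more precise renderer would interpolate 1/z) */
      float z = (w0 * a.z + w1 * b.z + w2 * c.z) / (float) (sign * area);
      int i = y * SCREEN_W + x;

      if (z < depthBuffer[i]) /* z buffer test: keep the nearest pixel */
      {
        depthBuffer[i] = z;
        colorBuffer[i] = color;
      }
    }
}
```

A frame would then consist of calling `clearBuffers`, projecting each vertex of each model with `project` and passing the resulting triples to `drawTriangle`. Testing the whole bounding box is wasteful, which is exactly where the optimization effort mentioned above would go.
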
If you additionally want to have a movable camera and models that can be transformed (moved, rotated, scaled, ...), you will need to look into some [linear algebra](linear_algebra.md) and [transform matrices](transform_matrix.md) that allow you to efficiently compute positions of vertices of a transformed model against a transformed camera -- you do this the same way as basically all other 3D engines (look up e.g. some [OpenGL](opengl.md) tutorials, see model/view/projection [matrices](matrix.md) etc.). If you also want texturing, matters again get a bit more complicated: you need to compute [barycentric](barycentric.md) coordinates (special coordinates within a triangle) as you're rasterizing the triangle, and possibly apply [perspective correction](perspective_correction.md) (otherwise you'll be seeing distortions). You then map the barycentrics of each rasterized pixel to [UV](uv.md) (texturing) coordinates which you use to retrieve specific pixels from a texture. On top of all this you may start adding all the advanced features of typical engines such as [acceleration structures](acceleration_structure.md) that for example discard models that are completely out of view, [LOD](lod.md), instancing, [MIP maps](mip_map.md) and so on.

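
The texturing part may be sketched like this: barycentric coordinates of a pixel are computed, turned into UV coordinates (either affine or perspective-correct) and used for nearest neighbour sampling. It is a minimal floating point sketch under assumptions, the `Vertex` layout and all function names are hypothetical and an optimized renderer would compute the barycentrics incrementally during rasterization rather than from scratch for each pixel.

```c
#include <stdint.h>

typedef struct
{
  float x, y; /* screen position */
  float z;    /* camera space depth, has to be > 0 */
  float u, v; /* texture coordinates in range 0 to 1 */
} Vertex;

/* Computes barycentric coordinates of point (px,py) in triangle abc.
   Returns 0 for a degenerate (zero area) triangle, otherwise 1. */
int barycentric(const Vertex *a, const Vertex *b, const Vertex *c,
  float px, float py, float bary[3])
{
  float den = (b->y - c->y) * (a->x - c->x) + (c->x - b->x) * (a->y - c->y);

  if (den == 0.0f)
    return 0;

  bary[0] = ((b->y - c->y) * (px - c->x) + (c->x - b->x) * (py - c->y)) / den;
  bary[1] = ((c->y - a->y) * (px - c->x) + (a->x - c->x) * (py - c->y)) / den;
  bary[2] = 1.0f - bary[0] - bary[1];
  return 1;
}

/* Maps barycentric coordinates to UV coordinates. With perspectiveCorrect
   set to 0 the mapping is affine (cheaper, but distorted). */
void interpolateUV(const Vertex *a, const Vertex *b, const Vertex *c,
  const float bary[3], int perspectiveCorrect, float *u, float *v)
{
  float wa = bary[0], wb = bary[1], wc = bary[2];

  if (perspectiveCorrect)
  {
    /* divide each weight by its vertex depth, then renormalize the weights
       so that they again sum to 1 */
    wa /= a->z;
    wb /= b->z;
    wc /= c->z;

    float sum = wa + wb + wc;

    wa /= sum;
    wb /= sum;
    wc /= sum;
  }

  *u = wa * a->u + wb * b->u + wc * c->u;
  *v = wa * a->v + wb * b->v + wc * c->v;
}

/* Nearest neighbour sampling, coordinates are clamped to the texture. */
uint8_t sampleNearest(const uint8_t *texture, int width, int height,
  float u, float v)
{
  int x = (int) (u * width), y = (int) (v * height);

  x = x < 0 ? 0 : (x >= width ? width - 1 : x);
  y = y < 0 ? 0 : (y >= height ? height - 1 : y);

  return texture[y * width + x];
}
```

The difference between the two modes shows whenever the vertices of a triangle have different depths: halfway along an edge going from depth 1 to depth 3 the affine mapping gives the texture midpoint, while the perspective-correct one gives a point only a quarter of the way along the texture.
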
Possible tricks, cheats and [optimizations](optimization.md) you may utilize include:

- Using painter's algorithm (sorting triangles and drawing back to front) instead of z-buffer if you need to save a lot of RAM. But remember sorting doesn't [work](work.md) perfectly, glitches will inevitably appear, and you will probably incur an overdraw penalty.
- Regarding the previous point: you don't have to perform a whole triangle sort each frame if you need to save time, it may be good enough to keep the triangles approximately sorted by performing only a few iterations of some sorting algorithm per frame.
- You may lower the quality of far-away objects in many ways, e.g. with [LOD](lod.md), only using affine texturing for them (as opposed to perspective-correct one) or even just using a constant color (average color of the texture), maybe even just drawing 2D sprites instead of 3D models etc. This may help a lot.
- Try to reduce [overdraw](overdraw.md) (overwriting already rendered pixels with new closer ones) which wastes computation time. This can be achieved by good [culling](culling.md) of obscured objects or by using z-buffer along with front to back drawing.
- Generally use cheap [approximations](approximation.md) such as [Gouraud](gouraud.md) (per-vertex) [shading](shading.md) instead of [Phong](phong.md) (per-pixel), nearest neighbour texture sampling, only approximate perspective correction (every N pixels), simplified handling of near-plane culling (e.g. just pushing the vertices in front of camera instead of actually culling a triangle) etc.
- Use general [optimization](optimization.md) techniques: e.g. [precomputation](precomputation.md), using power of two resolution for textures, fixed screen resolution that's known at compile time or inlining of your shader function will probably help performance.

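
As an illustration of the incremental sorting trick from the list above, here is a minimal sketch that runs only a limited number of bubble sort passes per frame over a draw list that persists between frames. Bubble sort is just one possible choice of algorithm, and `DrawEntry` and `sortStep` are hypothetical names.

```c
typedef struct
{
  int triangleIndex; /* which triangle to draw */
  int depth;         /* its distance from camera, updated every frame */
} DrawEntry;

/* Performs at most maxPasses bubble sort passes, ordering entries from the
   farthest to the nearest (for back to front drawing). Returns 1 if the
   array is known to be fully sorted, otherwise 0. */
int sortStep(DrawEntry *entries, int count, int maxPasses)
{
  for (int pass = 0; pass < maxPasses; ++pass)
  {
    int swapped = 0;

    for (int i = 0; i + 1 < count; ++i)
      if (entries[i].depth < entries[i + 1].depth)
      {
        DrawEntry tmp = entries[i];
        entries[i] = entries[i + 1];
        entries[i + 1] = tmp;
        swapped = 1;
      }

    if (!swapped)
      return 1; /* nothing moved, the order is correct */
  }

  return 0; /* out of passes, the order may still be imperfect */
}
```

Because the order from the previous frame is kept and depths usually change only a little between frames, one or two passes per frame tend to keep the list close to sorted, at the cost of occasional short-lived glitches.
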
These are some notable software renderers:

- **Bootleg3D/RAL**: Very tiny, flat-shaded, super suckless (< 1000 LOC) renderers ([RAL link](https://codeberg.org/Ilya3point999k/RAL)).
- **[Build engine](build_engine.md)**: So called ["pseudo 3D"](pseudo_3d.md) or primitive 3D, this was a very popular [proprietary](proprietary.md) portal-rendering engine for older games like [Duke Nukem 3D](duke3d.md) or [Blood](blood.md).
- **[BRender](brender.md)**: Old commercial renderer used in games such as Carmageddon, Croc or [Harry Potter](harry_potter.md) 1. Later made [FOSS](foss.md).
- **[Chasm: The Rift](chasm_the_rift.md) engine**: Mysterious proprietary 1997 renderer made specifically for one game, notable especially for being a hybrid of "2.5D" and "true 3D" that managed to look very good.
- **[Dark Engine](dark_engine.md)**: Old proprietary game engine which includes a SW renderer, used mainly in the game Thief. The author writes about it at https://nothings.org/gamedev/thief_rendering.html.
- **[Descent](descent.md) engine**: The 1995 proprietary game Descent featured one of the first real time "[true 3D](true3d.md)" engines based on [portal rendering](portal_rendering.md), it still stands as a marvel of that time's technology.
- **[id Tech](id_tech.md)**: Multiple engines by [id Software](id.md) (later made [FOSS](foss.md)) used for games like [Doom](doom.md), [Quake](quake.md) and its successors included a software renderer. Quake's SW renderer was partially described in *Michael Abrash's Graphics Programming Black Book*, Doom's renderer is described e.g. in the book *Game Engine Black Book DOOM*.
- **[Irrlicht](irrlicht.md)**: [FOSS](foss.md) game engine including a software renderer as one of its [backends](backend.md).
- **[Jedi](jedi_engine.md)**: Old proprietary "pseudo3D" engine.
- **[Mesa](mesa.md)**: [FOSS](foss.md) implementation of [OpenGL](opengl.md) that includes a software rasterizer.
- **[raycastlib](raycastlib.md)**: [LRS](lrs.md), free [C](c.md) 2D raycasting ("2.5D") engine most notably used in [Anarch](anarch.md).
- **[small3dlib](small3dlib.md)**: [LRS](lrs.md), free pure [C](c.md) "true 3D" rasterizer, very simple but flexible and coming with all the high level features (textures, perspective correction etc.).
- **[SSRE](ssre.md)**: The guy who wrote [LIL](lil.md) also made this renderer named Shitty Software Rendering Engine, accessible [here](http://runtimeterror.com/tech/ssre/).
- **[System Shock](system_shock.md) engine**: Old proprietary game engine.
- **[TinyGL](tinygl.md)**: Implements a subset of [OpenGL](opengl.md).
- **[Tomb Raider](tomb_raider.md)**: Famous 90s game with custom software 3D renderer.
- **[Ultima Underworld](ultima_underworld.md)**: Proprietary game featuring a very early (1992) texture mapped software 3D renderer.
- **old [Unreal Engine](unreal_engine.md)**: One of the most mainstream popular proprietary engines nowadays, it featured software rendering fallbacks in early versions.
- In general many old [games](game.md) in the 90s implemented their own software renderers. Also games on non-3D consoles such as [Game Boy Advance](gba.md) sometimes attempted simple software 3D rendering. These are the places where you can look for interesting renderers of this kind.


## See Also

- [3D rendering](3d_rendering.md)
- ["pseudo/primitive 3D, 2.5D"](pseudo3d.md)