Documentation/contributing/coding_style.md

   1 # Coding Style
   2
   3 This document describes the preferred C coding style for the
   4 coreboot project. It is in many ways exactly the same as the Linux
   5 kernel coding style. In fact, most of this document has been copied from
   6 the [Linux kernel coding style](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/process/4.Coding.rst)
   7
   8 The guidelines in this file should be seen as a strong suggestion, and
   9 should overrule personal preference. But they may be ignored in
  10 individual instances when there are good practical reasons to do so, and
  11 reviewers are in agreement.
  12
  13 Any style questions that are not mentioned in here should be decided
  14 between the author and reviewers on a case-by-case basis. When modifying
  15 existing files, authors should try to match the prevalent style in that
  16 file -- otherwise, they should try to match similar existing files in
  17 coreboot.
  18
  19 Bulk style changes to existing code ("cleanup patches") should avoid
  20 changing existing style choices unless they actually violate this style
  21 guide, or there is broad consensus that the new version is an
  22 improvement. By default the style choices of the original author should
  23 be honored. (Note that `checkpatch.pl` is not part of this style guide,
  24 and neither is `clang-format`. These tools can be useful to find
  25 potential issues or simplify formatting in new submissions, but they
  26 were not designed to directly match this guide and may have false
  27 positives. They should not be bulk-applied to change existing code.)
  28
  29 ## Indentation
  30
  31 Tabs are 8 characters, and thus indentations are also 8 characters.
  32 There are heretic movements that try to make indentations 4 (or even 2!)
  33 characters deep, and that is akin to trying to define the value of PI to
  34 be 3.
  35
  36 Rationale: The whole idea behind indentation is to clearly define where
  37 a block of control starts and ends. Especially when you've been looking
  38 at your screen for 20 straight hours, you'll find it a lot easier to
  39 see how the indentation works if you have large indentations.
  40
  41 Now, some people will claim that having 8-character indentations makes
  42 the code move too far to the right, and makes it hard to read on a
  43 80-character terminal screen. The answer to that is that if you need
  44 more than 3 levels of indentation, you're screwed anyway, and should
  45 fix your program.
  46
  47 In short, 8-char indents make things easier to read, and have the added
  48 benefit of warning you when you're nesting your functions too deep.
  49 Heed that warning.
  50
  51 The preferred way to ease multiple indentation levels in a switch
  52 statement is to align the "switch" and its subordinate "case" labels
  53 in the same column instead of "double-indenting" the "case" labels.
  54 E.g.:
  55
  56 ```c
  57 switch (suffix) {
  58 case 'G':
  59 case 'g':
  60         mem <<= 30;
  61         break;
  62 case 'M':
  63 case 'm':
  64         mem <<= 20;
  65         break;
  66 case 'K':
  67 case 'k':
  68         mem <<= 10;
  69         __fallthrough;
  70 default:
  71         break;
  72 }
  73 ```
  74
  75 Don't put multiple statements on a single line unless you have
  76 something to hide:
  77
  78 ```c
  79 if (condition) do_this;
  80   do_something_everytime;
  81 ```
  82
  83 Don't put multiple assignments on a single line either. Kernel coding
  84 style is super simple. Avoid tricky expressions.
  85
  86 Outside of comments, documentation and except in Kconfig, spaces are
  87 never used for indentation, and the above example is deliberately
  88 broken.
  89
  90 Get a decent editor and don't leave whitespace at the end of lines.
  91
  92 ## Breaking long lines and strings
  93
  94 Coding style is all about readability and maintainability using commonly
  95 available tools.
  96
  97 The limit on the length of lines is 96 columns and this is a strongly
  98 preferred limit.
  99
 100 Statements longer than 96 columns will be broken into sensible chunks,
 101 unless exceeding 96 columns significantly increases readability and does
 102 not hide information. Descendants are always substantially shorter than
 103 the parent and are placed substantially to the right. The same applies
 104 to function headers with a long argument list. However, never break
 105 user-visible strings such as printk messages, because that breaks the
 106 ability to grep for them.
 107
 108 ## Placing Braces and Spaces
 109
 110 The other issue that always comes up in C styling is the placement of
 111 braces. Unlike the indent size, there are few technical reasons to
 112 choose one placement strategy over the other, but the preferred way, as
 113 shown to us by the prophets Kernighan and Ritchie, is to put the opening
 114 brace last on the line, and put the closing brace first, thusly:
 115
 116 ```c
 117 if (x is true) {
 118         we do y
 119 }
 120 ```
 121
 122 This applies to all non-function statement blocks (if, switch, for,
 123 while, do). E.g.:
 124
 125 ```c
 126 switch (action) {
 127 case KOBJ_ADD:
 128         return "add";
 129 case KOBJ_REMOVE:
 130         return "remove";
 131 case KOBJ_CHANGE:
 132         return "change";
 133 default:
 134         return NULL;
 135 }
 136 ```
 137
 138 However, there is one special case, namely functions: they have the
 139 opening brace at the beginning of the next line, thus:
 140
 141 ```c
 142 int function(int x)
 143 {
 144         body of function
 145 }
 146 ```
 147
 148 Heretic people all over the world have claimed that this inconsistency
 149 is ... well ... inconsistent, but all right-thinking people know that
 150 (a) K&R are _right_ and (b) K&R are right. Besides, functions are
 151 special anyway (you can't nest them in C).
 152
 153 Note that the closing brace is empty on a line of its own, _except_ in
 154 the cases where it is followed by a continuation of the same statement,
 155 ie a "while" in a do-statement or an "else" in an if-statement, like
 156 this:
 157
 158 ```c
 159 do {
 160         body of do-loop
 161 } while (condition);
 162 ```
 163
 164 and
 165
 166 ```c
 167 if (x == y) {
 168         ..
 169 } else if (x > y) {
 170         ...
 171 } else {
 172         ....
 173 }
 174 ```
 175
 176 Rationale: K&R.
 177
 178 Also, note that this brace-placement also minimizes the number of empty
 179 (or almost empty) lines, without any loss of readability. Thus, as the
 180 supply of new-lines on your screen is not a renewable resource (think
 181 25-line terminal screens here), you have more empty lines to put
 182 comments on.
 183
 184 Do not unnecessarily use braces where a single statement will do.
 185
 186 ```c
 187 if (condition)
 188         action();
 189 ```
 190
 191 and
 192
 193 ```c
 194 if (condition)
 195         do_this();
 196 else
 197         do_that();
 198 ```
 199
 200 This does not apply if only one branch of a conditional statement is a
 201 single statement; in the latter case use braces in both branches:
 202
 203 ```c
 204 if (condition) {
 205         do_this();
 206         do_that();
 207 } else {
 208         otherwise();
 209 }
 210 ```
 211
 212 ### Spaces
 213
 214 Linux kernel style for use of spaces depends (mostly) on
 215 function-versus-keyword usage. Use a space after (most) keywords. The
 216 notable exceptions are sizeof, typeof, alignof, and __attribute__,
 217 which look somewhat like functions (and are usually used with
 218 parentheses in Linux, although they are not required in the language, as
 219 in: "sizeof info" after "struct fileinfo info;" is declared).
 220
 221 So use a space after these keywords:
 222
 223 ```
 224 if, switch, case, for, do, while
 225 ```
 226
 227 but not with sizeof, typeof, alignof, or __attribute__. E.g.,
 228
 229 ```c
 230 s = sizeof(struct file);
 231 ```
 232
 233 Do not add spaces around (inside) parenthesized expressions. This
 234 example is
 235
 236 -   bad*:
 237
 238 ```c
 239 s = sizeof( struct file );
 240 ```
 241
 242 When declaring pointer data or a function that returns a pointer type,
 243 the preferred use of '*' is adjacent to the data name or function
 244 name and not adjacent to the type name. Examples:
 245
 246 ```c
 247 char *linux_banner;
 248 unsigned long long memparse(char *ptr, char **retptr);
 249 char *match_strdup(substring_t *s);
 250 ```
 251
 252 Use one space around (on each side of) most binary and ternary
 253 operators, such as any of these:
 254
 255 ```
 256 =  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :
 257 ```
 258
 259 but no space after unary operators:
 260
 261 ```
 262 &  *  +  -  ~  !  sizeof  typeof  alignof  __attribute__  defined
 263 ```
 264
 265 no space before the postfix increment & decrement unary operators:
 266
 267 ```
 268 ++  --
 269 ```
 270
 271 no space after the prefix increment & decrement unary operators:
 272
 273 ```
 274 ++  --
 275 ```
 276
 277 and no space around the '.' and "->" structure member operators.
 278
 279 Do not leave trailing whitespace at the ends of lines. Some editors with
 280 "smart" indentation will insert whitespace at the beginning of new
 281 lines as appropriate, so you can start typing the next line of code
 282 right away. However, some such editors do not remove the whitespace if
 283 you end up not putting a line of code there, such as if you leave a
 284 blank line. As a result, you end up with lines containing trailing
 285 whitespace.
 286
 287 Git will warn you about patches that introduce trailing whitespace, and
 288 can optionally strip the trailing whitespace for you; however, if
 289 applying a series of patches, this may make later patches in the series
 290 fail by changing their context lines.
 291
 292 ### Naming
 293
 294 C is a Spartan language, and so should your naming be. Unlike Modula-2
 295 and Pascal programmers, C programmers do not use cute names like
 296 ThisVariableIsATemporaryCounter. A C programmer would call that variable
 297 "tmp", which is much easier to write, and not the least more difficult
 298 to understand.
 299
 300 HOWEVER, while mixed-case names are frowned upon, descriptive names for
 301 global variables are a must. To call a global function "foo" is a
 302 shooting offense.
 303
 304 GLOBAL variables (to be used only if you _really_ need them) need to
 305 have descriptive names, as do global functions. If you have a function
 306 that counts the number of active users, you should call that
 307 "count_active_users()" or similar, you should _not_ call it
 308 "cntusr()".
 309
 310 Encoding the type of a function into the name (so-called Hungarian
 311 notation) is brain damaged - the compiler knows the types anyway and can
 312 check those, and it only confuses the programmer. No wonder MicroSoft
 313 makes buggy programs.
 314
 315 LOCAL variable names should be short, and to the point. If you have some
 316 random integer loop counter, it should probably be called "i". Calling
 317 it "loop_counter" is non-productive, if there is no chance of it
 318 being mis-understood. Similarly, "tmp" can be just about any type of
 319 variable that is used to hold a temporary value.
 320
 321 If you are afraid to mix up your local variable names, you have another
 322 problem, which is called the function-growth-hormone-imbalance syndrome.
 323 See chapter 6 (Functions).
 324
 325 ## Typedefs
 326
 327 Please don't use things like "vps_t".
 328
 329 It's a _mistake_ to use typedef for structures and pointers. When you
 330 see a
 331
 332 ```c
 333 vps_t a;
 334 ```
 335
 336 in the source, what does it mean?
 337
 338 In contrast, if it says
 339
 340 ```c
 341 struct virtual_container *a;
 342 ```
 343
 344 you can actually tell what "a" is.
 345
 346 Lots of people think that typedefs "help readability". Not so. They
 347 are useful only for:
 348
 349 (a) totally opaque objects (where the typedef is actively used to
 350 _hide_ what the object is).
 351
 352 Example: "pte_t" etc. opaque objects that you can only access using
 353 the proper accessor functions.
 354
 355 NOTE! Opaqueness and "accessor functions" are not good in themselves.
 356 The reason we have them for things like pte_t etc. is that there really
 357 is absolutely _zero_ portably accessible information there.
 358
 359 (b) Clear integer types, where the abstraction _helps_ avoid confusion
 360 whether it is "int" or "long".
 361
 362 u8/u16/u32 are perfectly fine typedefs, although they fit into category
 363 (d) better than here.
 364
 365 NOTE! Again - there needs to be a _reason_ for this. If something is
 366 "unsigned long", then there's no reason to do
 367
 368 ```c
 369 typedef unsigned long myflags_t;
 370 ```
 371
 372 but if there is a clear reason for why it under certain circumstances
 373 might be an "unsigned int" and under other configurations might be
 374 "unsigned long", then by all means go ahead and use a typedef.
 375
 376 (c) when you use sparse to literally create a _new_ type for
 377 type-checking.
 378
 379 (d) New types which are identical to standard C99 types, in certain
 380 exceptional circumstances.
 381
 382 Although it would only take a short amount of time for the eyes and
 383 brain to become accustomed to the standard types like 'uint32_t',
 384 some people object to their use anyway.
 385
 386 Therefore, the Linux-specific 'u8/u16/u32/u64' types and their signed
 387 equivalents which are identical to standard types are permitted --
 388 although they are not mandatory in new code of your own.
 389
 390 When editing existing code which already uses one or the other set of
 391 types, you should conform to the existing choices in that code.
 392
 393 (e) Types safe for use in userspace.
 394
 395 In certain structures which are visible to userspace, we cannot require
 396 C99 types and cannot use the 'u32' form above. Thus, we use __u32
 397 and similar types in all structures which are shared with userspace.
 398
 399 Maybe there are other cases too, but the rule should basically be to
 400 NEVER EVER use a typedef unless you can clearly match one of those
 401 rules.
 402
 403 In general, a pointer, or a struct that has elements that can reasonably
 404 be directly accessed should _never_ be a typedef.
 405
 406 ## Functions
 407
 408 Functions should be short and sweet, and do just one thing. They should
 409 fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
 410 as we all know), and do one thing and do that well.
 411
 412 The maximum length of a function is inversely proportional to the
 413 complexity and indentation level of that function. So, if you have a
 414 conceptually simple function that is just one long (but simple)
 415 case-statement, where you have to do lots of small things for a lot of
 416 different cases, it's OK to have a longer function.
 417
 418 However, if you have a complex function, and you suspect that a
 419 less-than-gifted first-year high-school student might not even
 420 understand what the function is all about, you should adhere to the
 421 maximum limits all the more closely. Use helper functions with
 422 descriptive names (you can ask the compiler to in-line them if you think
 423 it's performance-critical, and it will probably do a better job of it
 424 than you would have done).
 425
 426 Another measure of the function is the number of local variables. They
 427 shouldn't exceed 5-10, or you're doing something wrong. Re-think the
 428 function, and split it into smaller pieces. A human brain can generally
 429 easily keep track of about 7 different things, anything more and it gets
 430 confused. You know you're brilliant, but maybe you'd like to
 431 understand what you did 2 weeks from now.
 432
 433 In source files, separate functions with one blank line. If the function
 434 is exported, the EXPORT* macro for it should follow immediately after
 435 the closing function brace line. E.g.:
 436
 437 ```c
 438 int system_is_up(void)
 439 {
 440         return system_state == SYSTEM_RUNNING;
 441 }
 442 EXPORT_SYMBOL(system_is_up);
 443 ```
 444
 445 In function prototypes, include parameter names with their data types.
 446 Although this is not required by the C language, it is preferred in
 447 Linux because it is a simple way to add valuable information for the
 448 reader.
 449
 450 ## Centralized exiting of functions
 451
 452 Albeit deprecated by some people, the equivalent of the goto statement
 453 is used frequently by compilers in form of the unconditional jump
 454 instruction.
 455
 456 The goto statement comes in handy when a function exits from multiple
 457 locations and some common work such as cleanup has to be done. If there
 458 is no cleanup needed then just return directly.
 459
 460 The rationale is:
 461
 462 -   unconditional statements are easier to understand and follow
 463 -   nesting is reduced
 464 -   errors by not updating individual exit points when making
 465     modifications are prevented
 466 -   saves the compiler work to optimize redundant code away ;)
 467
 468 ```c
 469 int fun(int a)
 470 {
 471         int result = 0;
 472         char *buffer = kmalloc(SIZE);
 473
 474         if (buffer == NULL)
 475                 return -ENOMEM;
 476
 477         if (condition1) {
 478                 while (loop1) {
 479                         ...
 480                 }
 481                 result = 1;
 482                 goto out;
 483         }
 484         ...
 485         out:
 486         kfree(buffer);
 487         return result;
 488 }
 489 ```
 490
 491 ## Commenting
 492
 493 Comments are good, but there is also a danger of over-commenting. NEVER
 494 try to explain HOW your code works in a comment: it's much better to
 495 write the code so that the _working_ is obvious, and it's a waste of
 496 time to explain badly written code.
 497
 498 Generally, you want your comments to tell WHAT your code does, not HOW.
 499 Also, try to avoid putting comments inside a function body: if the
 500 function is so complex that you need to separately comment parts of it,
 501 you should probably go back to chapter 6 for a while. You can make small
 502 comments to note or warn about something particularly clever (or ugly),
 503 but try to avoid excess. Instead, put the comments at the head of the
 504 function, telling people what it does, and possibly WHY it does it.
 505
 506 When commenting the kernel API functions, please use the kernel-doc
 507 format. See the files Documentation/kernel-doc-nano-HOWTO.txt and
 508 scripts/kernel-doc for details.
 509
 510 coreboot style for comments is the C89 "/* ... */" style. You may
 511 use C99-style "// ..." comments.
 512
 513 The preferred style for *short* (multi-line) comments is:
 514
 515 ```c
 516 /* This is the preferred style for short multi-line
 517    comments in the Linux kernel source code.
 518    Please use it consistently. */
 519 ```
 520
 521 The preferred style for *long* (multi-line) comments is:
 522
 523 ```c
 524 /*
 525  * This is the preferred style for multi-line
 526  * comments in the Linux kernel source code.
 527  * Please use it consistently.
 528  *
 529  * Description:  A column of asterisks on the left side,
 530  * with beginning and ending almost-blank lines.
 531  */
 532 ```
 533
 534 It's also important to comment data, whether they are basic types or
 535 derived types. To this end, use just one data declaration per line (no
 536 commas for multiple data declarations). This leaves you room for a small
 537 comment on each item, explaining its use.
 538
 539 ## You've made a mess of it
 540 That's OK, we all do. You've probably been told by your long-time Unix user
 541 helper that "GNU emacs" automatically formats the C sources for you, and
 542 you've noticed that yes, it does do that, but the defaults it uses are less
 543 than desirable (in fact, they are worse than random typing - an infinite
 544 number of monkeys typing into GNU emacs would never make a good program).
 545
 546 So, you can either get rid of GNU emacs, or change it to use saner values.
 547 To do the latter, you can stick the following in your .emacs file:
 548
 549 ```lisp
 550 (defun c-lineup-arglist-tabs-only (ignored)
 551   "Line up argument lists by tabs, not spaces"
 552   (let* ((anchor (c-langelem-pos c-syntactic-element))
 553          (column (c-langelem-2nd-pos c-syntactic-element))
 554          (offset (- (1+ column) anchor))
 555          (steps (floor offset c-basic-offset)))
 556     (* (max steps 1)
 557        c-basic-offset)))
 558
 559 (add-hook 'c-mode-common-hook
 560           (lambda ()
 561             ;; Add kernel style
 562             (c-add-style
 563              "linux-tabs-only"
 564              '("linux" (c-offsets-alist
 565                         (arglist-cont-nonempty
 566                          c-lineup-gcc-asm-reg
 567                          c-lineup-arglist-tabs-only))))))
 568
 569 (add-hook 'c-mode-hook
 570           (lambda ()
 571             (let ((filename (buffer-file-name)))
 572               ;; Enable kernel mode for the appropriate files
 573               (when (and filename
 574                          (string-match (expand-file-name "~/src/linux-trees")
 575                                         filename))
 576                 (setq indent-tabs-mode t)
 577                 (c-set-style "linux-tabs-only")))))
 578 ```
 579
 580 This will make emacs go better with the kernel coding style for C files
 581 below ~/src/linux-trees.
 582
 583 But even if you fail in getting emacs to do sane formatting, not
 584 everything is lost: use "indent".
 585
 586 Now, again, GNU indent has the same brain-dead settings that GNU emacs
 587 has, which is why you need to give it a few command line options.
 588 However, that's not too bad, because even the makers of GNU indent
 589 recognize the authority of K&R (the GNU people aren't evil, they are
 590 just severely misguided in this matter), so you just give indent the
 591 options "-kr -i8" (stands for "K&R, 8 character indents"), or use
 592 "scripts/Lindent", which indents in the latest style.
 593
 594 "indent" has a lot of options, and especially when it comes to comment
 595 re-formatting you may want to take a look at the man page. But remember:
 596 "indent" is not a fix for bad programming.
 597
 598 ## Kconfig configuration files
 599
 600 For all of the Kconfig* configuration files throughout the source tree,
 601 the indentation is somewhat different. Lines under a "config"
 602 definition are indented with one tab, while help text is indented an
 603 additional two spaces. Example:
 604
 605 ```kconfig
 606 config AUDIT
 607         bool "Auditing support"
 608         depends on NET
 609         help
 610           Enable auditing infrastructure that can be used with another
 611           kernel subsystem, such as SELinux (which requires this for
 612           logging of avc messages output).  Does not do system-call
 613           auditing without CONFIG_AUDITSYSCALL.
 614 ```
 615
 616 Seriously dangerous features (such as write support for certain
 617 filesystems) should advertise this prominently in their prompt string:
 618
 619 ```kconfig
 620 config ADFS_FS_RW
 621         bool "ADFS write support (DANGEROUS)"
 622         depends on ADFS_FS
 623         ...
 624 ```
 625
 626 For full documentation on the configuration files, see the file
 627 Documentation/kbuild/kconfig-language.txt.
 628
 629 Data structures
 630 ---------------
 631
 632 Data structures that have visibility outside the single-threaded
 633 environment they are created and destroyed in should always have
 634 reference counts. In the kernel, garbage collection doesn't exist (and
 635 outside the kernel garbage collection is slow and inefficient), which
 636 means that you absolutely _have_ to reference count all your uses.
 637
 638 Reference counting means that you can avoid locking, and allows multiple
 639 users to have access to the data structure in parallel - and not having
 640 to worry about the structure suddenly going away from under them just
 641 because they slept or did something else for a while.
 642
 643 Note that locking is _not_ a replacement for reference counting.
 644 Locking is used to keep data structures coherent, while reference
 645 counting is a memory management technique. Usually both are needed, and
 646 they are not to be confused with each other.
 647
 648 Many data structures can indeed have two levels of reference counting,
 649 when there are users of different "classes". The subclass count counts
 650 the number of subclass users, and decrements the global count just once
 651 when the subclass count goes to zero.
 652
 653 Examples of this kind of "multi-level-reference-counting" can be found
 654 in memory management ("struct mm_struct": mm_users and mm_count),
 655 and in filesystem code ("struct super_block": s_count and
 656 s_active).
 657
 658 Remember: if another thread can find your data structure, and you don't
 659 have a reference count on it, you almost certainly have a bug.
 660
 661 Macros, Enums and RTL
 662 ---------------------
 663
 664 Names of macros defining constants and labels in enums are capitalized.
 665
 666 ```c
 667 #define CONSTANT 0x12345
 668 ```
 669
 670 Enums are preferred when defining several related constants.
 671
 672 CAPITALIZED macro names are appreciated but macros resembling functions
 673 may be named in lower case.
 674
 675 Generally, inline functions are preferable to macros resembling
 676 functions.
 677
 678 Macros with multiple statements should be enclosed in a do - while
 679 block:
 680
 681 ```c
 682 #define macrofun(a, b, c)   \
 683     do {                    \
 684         if (a == 5)         \
 685             do_this(b, c);  \
 686     } while (0)
 687 ```
 688
 689 Things to avoid when using macros:
 690
 691 1) macros that affect control flow:
 692
 693 ```c
 694 #define FOO(x)              \
 695     do {                        \
 696             if (blah(x) < 0)        \
 697                     return -EBUGGERED;  \
 698     } while(0)
 699 ```
 700
 701 is a *very* bad idea. It looks like a function call but exits the
 702 "calling" function; don't break the internal parsers of those who
 703 will read the code.
 704
 705 2) macros that depend on having a local variable with a magic name:
 706
 707 ```c
 708 #define FOO(val) bar(index, val)
 709 ```
 710
 711 might look like a good thing, but it's confusing as hell when one reads
 712 the code and it's prone to breakage from seemingly innocent changes.
 713
 714 3) macros with arguments that are used as l-values: FOO(x) = y; will
 715 bite you if somebody e.g. turns FOO into an inline function.
 716
 717 4) forgetting about precedence: macros defining constants using
 718 expressions must enclose the expression in parentheses. Beware of
 719 similar issues with macros using parameters.
 720
 721 ```c
 722 #define CONSTANT 0x4000
 723 #define CONSTEXP (CONSTANT | 3)
 724 ```
 725
 726 The cpp manual deals with macros exhaustively. The gcc internals manual
 727 also covers RTL which is used frequently with assembly language in the
 728 kernel.
 729
 730 Printing kernel messages
 731 ------------------------
 732
 733 Kernel developers like to be seen as literate. Do mind the spelling of
 734 kernel messages to make a good impression. Do not use crippled words
 735 like "dont"; use "do not" or "don't" instead. Make the messages
 736 concise, clear, and unambiguous.
 737
 738 Kernel messages do not have to be terminated with a period.
 739
 740 Printing numbers in parentheses (%d) adds no value and should be
 741 avoided.
 742
 743 There are a number of driver model diagnostic macros in
 744 <linux/device.h> which you should use to make sure messages are
 745 matched to the right device and driver, and are tagged with the right
 746 level: dev_err(), dev_warn(), dev_info(), and so forth. For messages
 747 that aren't associated with a particular device, <linux/printk.h>
 748 defines pr_debug() and pr_info().
 749
 750 Coming up with good debugging messages can be quite a challenge; and
 751 once you have them, they can be a huge help for remote troubleshooting.
 752 Such messages should be compiled out when the DEBUG symbol is not
 753 defined (that is, by default they are not included). When you use
 754 dev_dbg() or pr_debug(), that's automatic. Many subsystems have
 755 Kconfig options to turn on -DDEBUG. A related convention uses
 756 VERBOSE_DEBUG to add dev_vdbg() messages to the ones already enabled
 757 by DEBUG.
 758
 759 Allocating memory
 760 -----------------
 761
 762 coreboot provides a single general purpose memory allocator: malloc()
 763
 764 The preferred form for passing a size of a struct is the following:
 765
 766 ```c
 767 p = malloc(sizeof(*p));
 768 ```
 769
 770 The alternative form where struct name is spelled out hurts readability
 771 and introduces an opportunity for a bug when the pointer variable type
 772 is changed but the corresponding sizeof that is passed to a memory
 773 allocator is not.
 774
 775 Casting the return value which is a void pointer is redundant. The
 776 conversion from void pointer to any other pointer type is guaranteed by
 777 the C programming language.
 778
 779 You should contain your memory usage to stack variables whenever
 780 possible. Only use malloc() as a last resort. In ramstage, you may also
 781 be able to get away with using static variables. Never use malloc()
 782 outside of ramstage.
 783
 784 Since coreboot only runs for a very short time, there is no memory
 785 deallocator, although a corresponding free() is offered. It is a no-op.
 786 Use of free() is not required though it is accepted. It is useful when
 787 sharing code with other codebases that make use of free().
 788
 789 The inline disease
 790 ------------------
 791
 792 There appears to be a common misperception that gcc has a magic "make
 793 me faster" speedup option called "inline". While the use of inlines
 794 can be appropriate (for example as a means of replacing macros, see
 795 Chapter 12), it very often is not. Abundant use of the inline keyword
 796 leads to a much bigger kernel, which in turn slows the system as a whole
 797 down, due to a bigger icache footprint for the CPU and simply because
 798 there is less memory available for the pagecache. Just think about it; a
 799 pagecache miss causes a disk seek, which easily takes 5 milliseconds.
 800 There are a LOT of cpu cycles that can go into these 5 milliseconds.
 801
 802 A reasonable rule of thumb is to not put inline at functions that have
 803 more than 3 lines of code in them. An exception to this rule are the
 804 cases where a parameter is known to be a compile time constant, and as a
 805 result of this constantness you *know* the compiler will be able to
 806 optimize most of your function away at compile time. For a good example
 807 of this later case, see the kmalloc() inline function.
 808
 809 Often people argue that adding inline to functions that are static and
 810 used only once is always a win since there is no space tradeoff. While
 811 this is technically correct, gcc is capable of inlining these
 812 automatically without help, and the maintenance issue of removing the
 813 inline when a second user appears outweighs the potential value of the
 814 hint that tells gcc to do something it would have done anyway.
 815
 816 Function return values and names
 817 --------------------------------
 818
 819 Functions can return values of many different kinds, and one of the most
 820 common is a value indicating whether the function succeeded or failed.
 821 Such a value can be represented as an error-code integer (`CB_ERR_xxx`
 822 (negative number) = failure, `CB_SUCCESS` (0) = success) or a "succeeded"
 823 boolean (0 = failure, non-zero = success).
 824
 825 Mixing up these two sorts of representations is a fertile source of
 826 difficult-to-find bugs. If the C language included a strong distinction
 827 between integers and booleans then the compiler would find these
 828 mistakes for us... but it doesn't. To help prevent such bugs, always
 829 follow this convention:
 830
 831 If the name of a function is an action or an imperative command,
 832 the function should return an error-code integer.  If the name
 833 is a predicate, the function should return a "succeeded" boolean.
 834
 835 For example, "add work" is a command, and the `add_work()` function
 836 returns 0 for success or `CB_ERR` for failure. In the same way, "PCI
 837 device present" is a predicate, and the `pci_dev_present()` function
 838 returns 1 if it succeeds in finding a matching device or 0 if it
 839 doesn't.
 840
 841 Functions whose return value is the actual result of a computation,
 842 rather than an indication of whether the computation succeeded, are not
 843 subject to this rule. Generally they indicate failure by returning some
 844 out-of-range result. Typical examples would be functions that return
 845 pointers; they use NULL to report failure.
 846
 847 Error handling, assertions and die()
 848 -----------------------------
 849
 850 As firmware, coreboot has no means to let the user interactively fix things when
 851 something goes wrong. We either succeed to boot or the device becomes a brick
 852 that must be recovered through complicated external means (e.g. a flash
 853 programmer). Therefore, coreboot code should strive to continue booting
 854 wherever possible.
 855
 856 In most cases, errors should be handled by logging a message of at least
 857 `BIOS_ERR` level, returning out of the function stack for the failed feature,
 858 and then continuing execution. For example, if a function reading the EDID of an
 859 eDP display panel encounters an I2C error, it should print a "cannot read EDID"
 860 message and return an error code. The calling display initialization function
 861 knows that without the EDID there is no way to initialize the display correctly,
 862 so it will also immediately return with an error code without running its
 863 remaining code that would initialize the SoC's display controller. Exeuction
 864 returns further up the function stack to the mainboard initialization code
 865 which continues booting despite the failed display initialization, since
 866 display functionality is non-essential to the system. (Code is encouraged but
 867 not required to use `enum cb_err` error codes to return these errors.)
 868
 869 coreboot also has the `die()` function that completely halts execution. `die()`
 870 should only be used as a last resort, since it results in the worst user
 871 experience (bricked system). It is generally preferrable to continue executing
 872 even after a problem was encountered that might be fatal (e.g. SPI clock
 873 couldn't be configured correctly), because a slight chance of successfully
 874 booting is still better than not booting at all. The only cases where `die()`
 875 should be used are:
 876
 877 1. There is no (simple) way to continue executing. For example, when loading the
 878    next stage from SPI flash fails, we don't have any more code to execute. When
 879    memory initialization fails, we have no space to load the ramstage into.
 880
 881 2. Continuing execution would pose a security risk. All security features in
 882    coreboot are optional, but when they are configured in the user must be able
 883    to rely on them. For example, if CBFS verification is enabled and the file
 884    hash when loading the romstage doesn't match what it should be, it is better
 885    to stop execution than to jump to potentially malicious code.
 886
 887 In addition to normal error logging with `printk()`, coreboot also offers the
 888 `assert()` macro. `assert()` should be used judiciously to confirm that
 889 conditions are true which the programmer _knows_ to be true, in order to catch
 890 programming errors and incorrect assumptions. It is therefore different from a
 891 normal `if ()`-check that is used to actually test for things which may turn
 892 out to be true or false based on external conditions. For example, anything
 893 that involves communicating with hardware, such as whether an attempt to read
 894 from SPI flash succeeded, should _not_ use `assert()` and should instead just
 895 be checked with a normal `if ()` and subsequent manual error handling. Hardware
 896 can always fail for various reasons and the programmer can never 100% assume in
 897 advance that it will work as expected. On the other hand, if a function takes a
 898 pointer parameter `ctx` and the contract for that function (as documented in a
 899 comment above its declaration) specifies that this parameter should point to a
 900 valid context structure, then adding an `assert(ctx)` line to that function may
 901 be a good idea. The programmer knows that this function should never be called
 902 with a NULL pointer (because that's how it is specified), and if it was actually
 903 called with a NULL pointer that would indicate a programming error on account of
 904 the caller.
 905
 906 `assert()` can be configured to either just print an error message and continue
 907 execution (default), or call `die()` (when `CONFIG_FATAL_ASSERTS` is set).
 908 Developers are encouraged to always test their code with this option enabled to
 909 make assertion errors (and therefore bugs) more easy to notice. Since assertions
 910 thus do not always stop execution, they should never be relied upon to be the
 911 sole guard against conditions that really _need_ to stop execution (e.g.
 912 security guarantees should never be enforced only by `assert()`).
 913
 914 Headers and includes
 915 ---------------
 916
 917 Headers should always be included at the top of the file. Includes should
 918 always use the `#include <file.h>` notation, except for rare cases where a file
 919 in the same directory that is not part of a normal include path gets included
 920 (e.g. local headers in mainboard directories), which should use `#include
 921 "file.h"`. Local "file.h" includes should always come separately after all
 922 <file.h> includes.  Headers that can be included from both assembly files and
 923 .c files should keep all C code wrapped in `#ifndef __ASSEMBLER__` blocks,
 924 including includes to other headers that don't follow that provision. Where a
 925 specific include order is required for technical reasons, it should be clearly
 926 documented with comments.
 927
 928 Files should generally include every header they need a definition from
 929 directly (and not include any unnecessary extra headers). Excepted from
 930 this are certain headers that intentionally chain-include other headers
 931 which logically belong to them and are just factored out into a separate
 932 location for implementation or organizatory reasons. This could be
 933 because part of the definitions is generic and part SoC-specific (e.g.
 934 `<gpio.h>` chain-including `<soc/gpio.h>`), architecture-specific (e.g.
 935 `<device/mmio.h>` chain-including `<arch/mmio.h>`), separated out into
 936 commonlib[/bsd] for sharing/license reasons (e.g. `<cbfs.h>`
 937 chain-including `<commonlib/bsd/cbfs_serialized.h>`) or just split out
 938 to make organizing subunits of a larger header easier. This can also
 939 happen when certain definitions need to be in a specific header for
 940 legacy POSIX reasons but we would like to logically group them together
 941 (e.g. `uintptr_t` is in `<stdint.h>` and `size_t` in `<stddef.h>`, but
 942 it's nicer to be able to just include `<types.h>` and get all the common
 943 type and helper function stuff we need everywhere).
 944
 945 The headers `<kconfig.h>`, `<rules.h>` and `<commonlib/bsd/compiler.h>`
 946 are always automatically included in all compilation units by the build
 947 system and should not be included manually.
 948
 949 Don't re-invent common macros
 950 -----------------------------
 951
 952 The header file `src/commonlib/bsd/include/commonlib/bsd/helpers.h`
 953 contains a number of macros that you should use, rather than explicitly
 954 coding some variant of them yourself. For example, if you need to
 955 calculate the length of an array, take advantage of the macro
 956
 957 ```c
 958 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
 959 ```
 960
 961 Editor modelines and other cruft
 962 --------------------------------
 963
 964 Some editors can interpret configuration information embedded in source
 965 files, indicated with special markers. For example, emacs interprets
 966 lines marked like this:
 967
 968 ```
 969 -*- mode: c -*-
 970 ```
 971
 972 Or like this:
 973
 974 ```
 975 /*
 976 Local Variables:
 977 compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
 978 End:
 979 */
 980 ```
 981
 982 Vim interprets markers that look like this:
 983
 984 ```
 985 /* vim:set sw=8 noet */
 986 ```
 987
 988 Do not include any of these in source files. People have their own
 989 personal editor configurations, and your source files should not
 990 override them. This includes markers for indentation and mode
 991 configuration. People may use their own custom mode, or may have some
 992 other magic method for making indentation work correctly.
 993
 994 Inline assembly
 995 ---------------
 996
 997 In architecture-specific code, you may need to use inline assembly to
 998 interface with CPU or platform functionality. Don't hesitate to do so
 999 when necessary. However, don't use inline assembly gratuitously when C
1000 can do the job. You can and should poke hardware from C when possible.
1001
1002 Consider writing simple helper functions that wrap common bits of inline
1003 assembly, rather than repeatedly writing them with slight variations.
1004 Remember that inline assembly can use C parameters.
1005
1006 Large, non-trivial assembly functions should go in .S files, with
1007 corresponding C prototypes defined in C header files. The C prototypes
1008 for assembly functions should use "asmlinkage".
1009
1010 You may need to mark your asm statement as volatile, to prevent GCC from
1011 removing it if GCC doesn't notice any side effects. You don't always
1012 need to do so, though, and doing so unnecessarily can limit
1013 optimization.
1014
1015 When writing a single inline assembly statement containing multiple
1016 instructions, put each instruction on a separate line in a separate
1017 quoted string, and end each string except the last with nt to
1018 properly indent the next instruction in the assembly output:
1019
1020 ```c
1021 asm ("magic %reg1, #42nt"
1022         "more_magic %reg2, %reg3"
1023         : /* outputs */ : /* inputs */ : /* clobbers */);
1024 ```
1025
1026 GCC extensions
1027 --------------
1028
1029 GCC is the only officially-supported compiler for coreboot, and a
1030 variety of its C language extensions are heavily used throughout the
1031 code base. There have been occasional attempts to add clang as a second
1032 compiler option, which is generally compatible to the same language
1033 extensions that have been long-established by GCC.
1034
1035 Some GCC extensions (e.g. inline assembly) are basically required for
1036 proper firmware development. Others enable more safe or flexible
1037 coding patterns than can be expressed with standard C (e.g. statement
1038 expressions and `typeof()` to avoid double evaluation in macros like
1039 `MAX()`). Yet others just add some simple convenience and reduce
1040 boilerplate (e.g. `void *` arithmetic).
1041
1042 Since some GCC extensions are necessary either way, there is no gain
1043 from avoiding other GCC extensions elsewhere. The use of all official
1044 GCC extensions is expressly allowed within coreboot. In cases where an
1045 extension can be replaced by a 100% equivalent C standard feature with
1046 no extra boilerplate or loss of readability, the C standard feature
1047 should be preferred (this usually only happens when GCC retains an
1048 older pre-standardization extension for backwards compatibility, e.g.
1049 the old pre-C99 syntax for designated initializers). But if there is
1050 any advantage offered by the GCC extension (e.g. using GCC zero-length
1051 arrays instead of C99 variable-length arrays because they don't inhibit
1052 `sizeof()`), there is no reason to deprive ourselves of that, and "this
1053 is not C standard compliant" should not be a reason to argue against
1054 its use in reviews.
1055
1056 This rule only applies to explicit GCC extensions listed in the
1057 "Extensions to the C Language Family" section of the GCC manual. Code
1058 should never rely on incidental GCC translation behavior that is not
1059 explicitly documented as a feature and could change at any moment.
1060
1061 References
1062 ----------
1063
1064 The C Programming Language, Second Edition by Brian W. Kernighan and
1065 Dennis M. Ritchie. Prentice Hall, Inc., 1988. ISBN 0-13-110362-8
1066 (paperback), 0-13-110370-9 (hardback). URL:
1067 <https://duckduckgo.com/?q=isbn+0-13-110362-8> or
1068 <https://www.google.com/search?q=isbn+0-13-110362-8>
1069
1070
1071 The Practice of Programming by Brian W. Kernighan and Rob Pike.
1072 Addison-Wesley, Inc., 1999. ISBN 0-201-61586-X. URL:
1073 <https://duckduckgo.com/?q=ISBN+0-201-61586-X> or
1074 <https://www.google.com/search?q=ISBN+0-201-61586-X>
1075
1076 GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
1077 gcc internals and indent, all available from
1078 <http://www.gnu.org/manual/>
1079
1080 WG14 is the international standardization working group for the
1081 programming language C, URL: <http://www.open-std.org/JTC1/SC22/WG14/>
1082
1083 Kernel CodingStyle, by greg@kroah.com at OLS 2002:
1084 <http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/>