4 SPDX-FileCopyrightText: Copyright The SCons Foundation (https://scons.org)
5 SPDX-License-Identifier: MIT
6 SPDX-FileType: DOCUMENTATION
8 This file is processed by the bin/SConsDoc.py module.
12 <!ENTITY % scons SYSTEM "../scons.mod">
15 <!ENTITY % builders-mod SYSTEM "../generated/builders.mod">
17 <!ENTITY % functions-mod SYSTEM "../generated/functions.mod">
19 <!ENTITY % tools-mod SYSTEM "../generated/tools.mod">
21 <!ENTITY % variables-mod SYSTEM "../generated/variables.mod">
25 <chapter id="chap-scanners"
26 xmlns="http://www.scons.org/dbxsd/v1.0"
27 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
28 xsi:schemaLocation="http://www.scons.org/dbxsd/v1.0 http://www.scons.org/dbxsd/v1.0/scons.xsd">
29 <title>Extending &SCons;: Writing Your Own Scanners</title>
33 =head1 Using and writing dependency scanners
35 QuickScan allows simple target-independent scanners to be set up for
36 source files. Only one QuickScan scanner may be associated with any given
37 source file and environment, although the same scanner may (and should)
38 be used for multiple files of a given type.
40 A QuickScan scanner is only ever invoked once for a given source file,
41 and it is only invoked if the file is used by some target in the tree
42 (i.e., there is a dependency on the source file).
44 QuickScan is invoked as follows:
46 QuickScan CONSENV CODEREF, FILENAME [, PATH]
The subroutine referenced by CODEREF is expected to return a list of
filenames included directly by FILENAME. These filenames will, in turn, be
50 scanned. The optional PATH argument supplies a lookup path for finding
51 FILENAME and/or files returned by the user-supplied subroutine. The PATH
52 may be a reference to an array of lookup-directory names, or a string of
53 names separated by the system's separator character (':' on UNIX systems,
56 The subroutine is called once for each line in the file, with $_ set to the
57 current line. If the subroutine needs to look at additional lines, or, for
58 that matter, the entire file, then it may read them itself, from the
59 filehandle SCAN. It may also terminate the loop, if it knows that no further
60 include information is available, by closing the filehandle.
Whether or not a lookup path is provided, QuickScan first tries to look up
63 the file relative to the current directory (for the top-level file
64 supplied directly to QuickScan), or from the directory containing the
65 file which referenced the file. This is not very general, but seems good
66 enough, especially if you have the luxury of writing your own utilities
67 and can control the use of the search path in a standard way.
69 Here's a real example, taken from a F<Construct> file here:
72 my($env, @tables) = @_;
73 foreach $t (@tables) {
74 $env->QuickScan(sub { /\b\S*?\.smf\b/g }, "$t.smf",
75 $env->{SMF_INCLUDE_PATH});
76 $env->Command(["$t.smdb.cc","$t.smdb.h","$t.snmp.cc",
77 "$t.ami.cc", "$t.http.cc"], "$t.smf",
78 q(smfgen %( %SMF_INCLUDE_OPT %) %<));
82 The subroutine above finds all names of the form <name>.smf in the
83 file. It will return the names even if they're found within comments,
84 but that's OK (the mechanism is forgiving of extra files; they're just
85 ignored on the assumption that the missing file will be noticed when
86 the program, in this example, smfgen, is actually invoked).
88 [NOTE that the form C<$env-E<gt>QuickScan ...> and C<$env-E<gt>Command
89 ...> should not be necessary, but, for some reason, is required
90 for this particular invocation. This appears to be a bug in Perl or
91 a misunderstanding on my part; this invocation style does not always
92 appear to be necessary.]
94 Here is another way to build the same scanner. This one uses an
95 explicit code reference, and also (unnecessarily, in this case) reads
96 the whole file itself:
101 push(@includes, /\b\S*?\.smf\b/g);
106 Note that the order of the loop is reversed, with the loop test at the
107 end. This is because the first line is already read for you. This scanner
108 can be attached to a source file by:
110 QuickScan $env \&myscan, "$_.smf";
112 This final example, which scans a different type of input file, takes
113 over the file scanning rather than being called for each input line:
116 sub { my(@includes) = ();
119 if /^(#include|import)\s+(\")(.+)(\")/ && $3
124 "$env->{CPPPATH};$BUILD/ActiveContext/ACSCLientInterfaces"
131 &SCons; has built-in &Scanners; that know how to look in
132 C/C++, Fortran, D, IDL, LaTeX, Python and SWIG source files
133 for information about
134 other files that targets built from those files depend on.
For example, if you have a file format which uses <literal>#include</literal>
to specify files which should be included into the source file
when it is processed, you can use an existing scanner already
included in &SCons;.
141 You can use the same mechanisms that &SCons; uses to create
142 its built-in Scanners to write Scanners of your own for file types
143 that &SCons; does not know how to scan "out of the box."
147 <section id="simple-scanner">
148 <title>A Simple Scanner Example</title>
152 Suppose, for example, that we want to create a simple &Scanner;
153 for <filename>.k</filename> files.
A <filename>.k</filename> file contains some text that
will be processed,
and can include other files on lines that begin
with <literal>include</literal>
followed by a file name:
168 Scanning a file will be handled by a Python function
169 that you must supply.
170 Here is a function that will use the Python
171 <systemitem>re</systemitem> module
172 to scan for the <literal>include</literal> lines in our example:
import re

# Match lines of the form "include <name>" anywhere in the file.
include_re = re.compile(r'^include\s+(\S+)$', re.M)

def kfile_scan(node, env, path, arg=None):
    contents = node.get_text_contents()
    return env.File(include_re.findall(contents))
It is important to note that the scanner function
has to return a list of File nodes;
plain strings for the file names will not do.
191 As in the examples we are showing here,
192 you can use the &f-link-File;
193 function of your current &consenv; in order to create nodes
194 on the fly from a sequence of file names with relative paths.
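To see what the regular expression above extracts, the following standalone
sketch applies it to a made-up <filename>.k</filename> file body.
It runs without &SCons;; the sample text and the
<function>find_includes</function> helper are illustrative assumptions,
not part of the example above. In a real scanner function the resulting
strings would still have to be turned into nodes, for instance with
<function>env.File</function>.

```python
import re

include_re = re.compile(r'^include\s+(\S+)$', re.M)

# Hypothetical contents of a .k file, made up for illustration.
sample = """some initial text
include other_file
include sub/dir/another_file
  include indented_is_ignored
include two words
more text
"""


def find_includes(contents):
    """Return the file names given on 'include' lines."""
    return include_re.findall(contents)


includes = find_includes(sample)
print(includes)
```

Only lines that consist of <literal>include</literal> followed by a single
name are picked up; the indented line and the line with two words after
<literal>include</literal> are skipped.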
The scanner function must
accept the four specified arguments
and return a list of implicit dependencies.
Presumably, these would be dependencies found
from examining the contents of the file,
although the function can perform any
manipulation at all to generate the list of
dependencies.
214 <term><parameter>node</parameter></term>
219 An &SCons; node object representing the file being scanned.
The path name of the file can be
obtained by converting the node to a string
using the <function>str</function> function,
or the node's <methodname>get_text_contents</methodname>
method can be used to fetch the contents.
231 <term><parameter>env</parameter></term>
236 The &consenv; in effect for this scan.
237 The scanner function may choose to use &consvars;
238 from this environment to affect its behavior.
245 <term><parameter>path</parameter></term>
A list of directories that form the search path for included files
for this Scanner.
This is how &SCons; handles the &cv-link-CPPPATH; and &cv-link-LIBPATH;
variables.
260 <term><parameter>arg</parameter></term>
An optional argument that can be passed
to this scanner function when it is called from
a scanner instance. The argument is only supplied
if it was given when the scanner instance was created
(see the manpage section "Scanner Objects").
This can be useful, for example, to distinguish which
scanner type called the function, if the function might be bound
to several scanner objects.
Since the argument is only supplied in the function
call if it was defined for that scanner, the function
needs to be prepared to be called in different
ways if multiple scanners are expected to use it;
giving the parameter a default value as
shown above is a good way to do this.
If each function is used by exactly one scanner,
just make sure the function signature matches the way that scanner was created.
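The calling convention for <parameter>arg</parameter> can be tried out
without running a build. The sketch below uses two small stand-in classes,
<classname>FakeNode</classname> and <classname>FakeEnv</classname>
(both hypothetical, invented here), in place of real &SCons; objects,
and calls one scanner function both with and without the fourth argument.
The <literal>import</literal> keyword and the argument value
<literal>'import'</literal> are likewise assumptions made for illustration.

```python
import re

include_re = re.compile(r'^include\s+(\S+)$', re.M)
import_re = re.compile(r'^import\s+(\S+)$', re.M)


class FakeNode:
    """Hypothetical stand-in for an SCons node (not the real class)."""

    def __init__(self, name, text):
        self.name = name
        self.text = text

    def __str__(self):
        return self.name

    def get_text_contents(self):
        return self.text


class FakeEnv:
    """Hypothetical stand-in for a construction environment."""

    def File(self, names):
        # A real environment returns File nodes; strings are kept here
        # only so that the sketch runs without SCons.
        return list(names)


def shared_scan(node, env, path, arg=None):
    """Scanner function that works with or without the extra argument."""
    # The default of None covers scanners created without an argument.
    pattern = import_re if arg == 'import' else include_re
    return env.File(pattern.findall(node.get_text_contents()))


node = FakeNode('foo.k', 'include a.k\nimport b.k\n')
env = FakeEnv()

# A scanner created without an argument passes three parameters.
without_arg = shared_scan(node, env, ())
# A scanner created with an argument passes it as the fourth parameter.
with_arg = shared_scan(node, env, (), 'import')
print(without_arg, with_arg)
```

Because <parameter>arg</parameter> has a default value, the same function
serves both kinds of scanner without raising a
<classname>TypeError</classname> for a missing parameter.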
290 A scanner object is created using the &f-link-Scanner; function,
291 which typically takes an <parameter>skeys</parameter> argument
292 to associate a file suffix with this Scanner.
293 The scanner object must then be associated with the
294 &cv-link-SCANNERS; &consvar; in the current &consenv;,
295 typically by using the &f-link-Append; method:
300 kscan = Scanner(function=kfile_scan, skeys=['.k'])
301 env.Append(SCANNERS=kscan)
306 Let's put this all together.
307 Our new file type, with the <filename>.k</filename> suffix,
308 will be processed by a command named <command>kprocess</command>,
which lives in the non-standard location
310 <filename>/usr/local/bin</filename>,
311 so we add that path to the execution environment so &SCons;
312 can find it. Here's what it looks like:
316 <scons_example name="scanners_scan">
317 <file name="SConstruct" printme="1">
import re

include_re = re.compile(r'^include\s+(\S+)$', re.M)

def kfile_scan(node, env, path):
    contents = node.get_text_contents()
    includes = include_re.findall(contents)
    return env.File(includes)

kscan = Scanner(function=kfile_scan, skeys=['.k'])
env = Environment()
env.AppendENVPath('PATH', '__ROOT__/usr/local/bin')
env.Append(SCANNERS=kscan)

env.Command('foo', 'foo.k', 'kprocess < $SOURCES > $TARGET')
339 <!-- # leave dep file out to show scanner works via dep not found
340 file name="other_file">
344 <directory name="__ROOT__/usr"></directory>
345 <directory name="__ROOT__/usr/local"></directory>
346 <directory name="__ROOT__/usr/local/bin"></directory>
<file name="__ROOT__/usr/local/bin/kprocess" chmod="755">
354 Assume a <filename>foo.k</filename> file like this:
358 <scons_example_file example="scanners_scan" name="foo.k">
359 </scons_example_file>
363 Now if we run &scons; we can see that the scanner works -
364 it identified the dependency
365 <filename>other_file</filename> via the detected
366 <literal>include</literal> line,
367 although we get an error message because we
368 forgot to create that file!
372 <scons_output example="scanners_scan" suffix="1">
373 <scons_output_command>scons -Q</scons_output_command>
378 <section id="scanner-search-paths">
379 <title>Adding a search path to a Scanner: &FindPathDirs;</title>
383 If the build tool in question will use a path variable to search
384 for included files or other dependencies, then the &Scanner; will
385 need to take that path variable into account as well -
386 the same way &cv-link-CPPPATH; is used for files processed
387 by the C Preprocessor (used for C, C++, Fortran and others).
388 Path variables may be lists of nodes or semicolon-separated strings
389 (&SCons; uses a semicolon here irrespective of
390 the pathlist separator used by the native operating system),
391 and may contain &consvars; to be expanded.
392 A Scanner can take a <parameter>path_function</parameter>
393 to process such a path variable;
394 the function produces a tuple of paths that is passed to the
395 scanner function as its <parameter>path</parameter> parameter.
402 &SCons; provides the premade &f-link-FindPathDirs;
403 function which returns a callable to expand a given path variable
404 (given as an &SCons; &consvar; name)
405 to a tuple of paths at the time the Scanner is called.
Deferring evaluation until that point allows, for instance,
the path to contain &cv-link-TARGET; references which differ for
each file that is scanned.
414 Using &FindPathDirs; is easy. Continuing the above example,
415 using <envar>$KPATH</envar> as the &consvar; to hold the paths
416 (analogous to &cv-link-CPPPATH;), we just modify the call to
417 the &f-link-Scanner; factory function to include a
418 <parameter>path_function</parameter> keyword argument:
422 <scons_example name="scanners_findpathdirs">
423 <file name="SConstruct" printme="1">
kscan = Scanner(function=kfile_scan, skeys=['.k'],
                path_function=FindPathDirs('KPATH'))
434 &FindPathDirs; is called when the Scanner is created,
435 and the callable object it returns is stored
436 as an attribute in the scanner.
When the scanner is invoked, it calls that object,
which processes the <envar>$KPATH</envar> from the
current &consenv;, performing any necessary expansions and,
where applicable, adding related repository and variant directories.
The result is a (possibly empty) tuple of paths
that is passed on to the scanner function.
443 The scanner function is then responsible for using that list
444 of paths to locate the include files identified by the scan.
445 The next section will show an example of that.
As a side note, the returned callable stores the path in an efficient way so
lookups are fast even when variable substitutions may be needed.
This is important since many files get scanned in a typical build.
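The deferred evaluation described above can be illustrated with a closure
in plain Python. This is only a sketch of the idea, not how
&FindPathDirs; is implemented: <function>make_path_finder</function>
is a hypothetical name, a dictionary stands in for the &consenv;,
and no variable expansion or repository and variant directory handling
is attempted.

```python
def make_path_finder(variable):
    """Return a callable that looks up `variable` when it is called."""

    def find_paths(env):
        # `env` is a plain dict standing in for a construction environment.
        value = env.get(variable)
        if not value:
            return ()  # nothing configured: empty search path
        if isinstance(value, str):
            value = [value]
        return tuple(value)

    return find_paths


# The finder is created once, before the variable has a value ...
finder = make_path_finder('KPATH')
env = {}
before = finder(env)

# ... and the lookup happens only when the finder is called.
env['KPATH'] = ['inc', 'more/inc']
after = finder(env)
print(before, after)
```

The same finder object yields different results as the environment changes,
which is the property that makes it safe to create it when the Scanner
is defined.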
460 <section id="scanner-with-builder">
461 <title>Using scanners with Builders</title>
464 One approach for introducing a &Scanner; into the build is in
465 conjunction with a &Builder;. There are two relevant optional
466 parameters we can use when creating a Builder:
467 <parameter>source_scanner</parameter> and
468 <parameter>target_scanner</parameter>.
469 <parameter>source_scanner</parameter> is used for scanning
470 source files, and <parameter>target_scanner</parameter>
471 is used for scanning the target once it is generated.
474 <scons_example name="scanners_builders">
475 <file name="SConstruct" printme="1">
import os
import re
include_re = re.compile(r"^include\s+(\S+)$", re.M)

def kfile_scan(node, env, path, arg=None):
    includes = include_re.findall(node.get_text_contents())
    print(f"DEBUG: scan of {str(node)!r} found {includes}")
    deps = []
    for inc in includes:
        for dir in path:
            file = str(dir) + os.sep + inc
            if os.path.exists(file):
                deps.append(file)
                break
    print(f"DEBUG: scanned dependencies found: {deps}")
    return env.File(deps)
496 path_function=FindPathDirs("KPATH"),
499 def build_function(target, source, env):
500 # Code to build "target" from "source"
504 action=build_function,
506 source_scanner=kscan,
510 env = Environment(BUILDERS={"KFile": bld}, KPATH="inc")
513 <file name="file.input">
518 <file name="inc/other_file">
524 Running this example would only show that the stub
525 <function>build_function</function> is getting called,
526 so some debug prints were added to the scanner function,
527 just to show the scanner is being invoked.
530 <scons_output example="scanners_builders" suffix="1">
531 <scons_output_command>scons -Q</scons_output_command>
535 The path-search implementation in
536 <function>kfile_scan</function> works,
537 but is quite simple-minded - a production scanner
538 will probably do something more sophisticated.
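The directory search performed inside <function>kfile_scan</function>
can be exercised on its own. The sketch below is an illustration and not
part of the example: it builds a temporary directory tree and looks up
include names in a tuple of search directories.
The <function>find_in_path</function> helper and the file names are
assumptions. It uses <function>os.path.join</function> rather than string
concatenation and stops at the first directory that contains the file.

```python
import os
import tempfile


def find_in_path(names, search_dirs):
    """Return the first existing match for each name, in order."""
    found = []
    for name in names:
        for directory in search_dirs:
            candidate = os.path.join(str(directory), name)
            if os.path.exists(candidate):
                found.append(candidate)
                break  # first hit wins; later directories are not checked
    return found


with tempfile.TemporaryDirectory() as top:
    inc = os.path.join(top, 'inc')
    os.mkdir(inc)
    # Create one include file; 'missing_file' is deliberately absent.
    with open(os.path.join(inc, 'other_file'), 'w') as f:
        f.write('text\n')

    deps = find_in_path(['other_file', 'missing_file'], (top, inc))
    relative = [os.path.relpath(d, top) for d in deps]

print(relative)
```

Names that cannot be found in any search directory are silently dropped here;
a production scanner might instead return them so that the missing
dependency is reported.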
544 An emitter function can modify the list of sources or targets
545 passed to the action function when the Builder is triggered.
A scanner function will not affect the list of sources or targets
seen by the Builder during the build action. The scanner function
will, however, affect whether the Builder rebuilds (for example,
if any of the files found by the Scanner have changed).