.. _clang-nvlink-wrapper:

This tool works as a wrapper around the NVIDIA ``nvlink`` linker. The purpose
of this wrapper is to provide an interface similar to the ``ld.lld`` linker
while still relying on NVIDIA's proprietary linker to produce the final output.

``nvlink`` has a number of known quirks that make it difficult to use in a
unified offloading setting. For example, it does not accept files with a ``.o``
extension; inputs must be named ``.cubin``. Static archives do not work, so
passing a ``.a`` file results in a linker error. ``nvlink`` also does not
support link-time optimization (LTO) and ignores many standard linker
arguments. This tool works around these issues.

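
To make the ``.cubin`` naming quirk concrete, the following is a minimal sketch
of what a manual workaround could look like without the wrapper. The helper
name ``stage_cubins`` and the staging directory are hypothetical and are not
part of this tool. The sketch only renames copies of the inputs; it assumes
each ``.o`` file already contains a device binary, and it does not address
static archives or LTO.

.. code-block:: shell

  # Hypothetical helper (not part of clang-nvlink-wrapper):
  #   stage_cubins <dest-dir> <file.o>...
  # Copies each object file into <dest-dir> under a .cubin name and prints
  # the path of every copy, one per line.
  stage_cubins() {
      dest=$1
      shift
      mkdir -p "$dest"
      for obj in "$@"; do
          # Strip the directory and the .o suffix, then re-add .cubin.
          name=$(basename "$obj" .o)
          cp "$obj" "$dest/$name.cubin"
          printf '%s\n' "$dest/$name.cubin"
      done
  }

For example, ``stage_cubins ./staged kernel.o helper.o`` prints
``./staged/kernel.cubin`` and ``./staged/helper.cubin``, which could then be
passed to ``nvlink`` in place of the original names.
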
This tool accepts the following options. Any argument that is not intended only
for the linker wrapper is forwarded to ``nvlink``.

.. code-block:: console

  OVERVIEW: A utility that wraps around the NVIDIA 'nvlink' linker.
            This enables static linking and LTO handling for NVPTX targets.

  USAGE: clang-nvlink-wrapper [options] <options to be passed to nvlink>

  OPTIONS:
    --arch <value>        Specify the 'sm_' name of the target architecture.
    --cuda-path=<dir>     Set the system CUDA path
    --dry-run             Print generated commands without running.
    --feature <value>     Specify the '+ptx' feature to use for LTO.
    -g                    Specify that this was a debug compile.
    -help-hidden          Display all available options
    -help                 Display available options (--help-hidden for more)
    -L <dir>              Add <dir> to the library search path
    -l <libname>          Search for library <libname>
    -mllvm <arg>          Arguments passed to LLVM, including Clang invocations,
                          for which the '-mllvm' prefix is preserved. Use '-mllvm
                          --help' for a list of options.
    -o <path>             Path to file to write output
    --plugin-opt=jobs=<value>
                          Number of LTO codegen partitions
    --plugin-opt=lto-partitions=<value>
                          Number of LTO codegen partitions
    --plugin-opt=O<O0, O1, O2, or O3>
                          Optimization level for LTO
    --plugin-opt=thinlto<value>
                          Enable the thin-lto backend
    --plugin-opt=<value>  Arguments passed to LLVM, including Clang invocations,
                          for which the '-mllvm' prefix is preserved. Use '-mllvm
                          --help' for a list of options.
    --save-temps          Save intermediate results
    --version             Display the version number and exit
    -v                    Print verbose information

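
As a hedged illustration of how several of these options combine, the
invocation below asks the wrapper to print the commands it would generate
instead of running them. The input ``main.o``, the library name
``device_utils``, the search directory ``./lib``, the output name ``main.out``,
and the ``sm_70`` architecture are placeholders chosen for this sketch. The
printed commands depend on the installed toolchain and are not reproduced here.

.. code-block:: console

  clang-nvlink-wrapper --arch sm_70 --dry-run -v -L ./lib -l device_utils main.o -o main.out

Dropping ``--dry-run`` would perform the link, and adding ``--save-temps``
would keep the intermediate results for inspection.
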
This tool is intended to be invoked when targeting the NVPTX toolchain directly
as a cross-compilation target. This makes it possible to create standalone GPU
executables with normal linking semantics, similar to standard compilation.

.. code-block:: console

  clang --target=nvptx64-nvidia-cuda -march=native -flto=full input.c

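
One way to check how the driver reaches the wrapper is Clang's ``-###`` flag,
which prints the commands the driver would run without executing them. The
sketch below assumes an explicit ``sm_70`` architecture instead of
``-march=native`` so that no GPU needs to be present, and uses a hypothetical
output name ``input.out``. Whether the link step appears as a
``clang-nvlink-wrapper`` invocation depends on the Clang version in use.

.. code-block:: console

  clang --target=nvptx64-nvidia-cuda -march=sm_70 -flto=full input.c -o input.out -###
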