Found thread data sharing on the GPU. Expect degraded performance due to data globalization. [OMP112]
=====================================================================================================

This missed remark indicates that a globalized value was found on the target
device that was replaced neither with stack memory by :ref:`OMP110 <omp110>`
nor with shared memory by :ref:`OMP111 <omp111>`. Globalization that has not
been removed must be handled by the runtime and will significantly impact
performance.

The OpenMP standard requires that threads be able to share their data with
each other. However, this is not true by default when offloading to a target
device such as a GPU. Threads on a GPU cannot share their data unless it is
first placed in global or shared memory. To produce standards-compliant code,
the Clang compiler globalizes any variable that could potentially be shared
between threads. In the majority of cases, globalized variables can either be
returned to a thread-local stack or pushed to shared memory. However, in a few
cases globalization is necessary and will cause a performance penalty.

This example shows legitimate data sharing on the device. It is a convoluted
example, but it is completely compliant with the OpenMP standard. Without
globalization, it would produce different results on different target
devices.

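
.. code-block:: c++

  #include <omp.h>
  #include <cstdio>

  #pragma omp declare target
  int *p;
  #pragma omp end declare target

  void foo() {
    int x = omp_get_thread_num();
    if (omp_get_thread_num() == 1)
      p = &x;

    #pragma omp barrier

    printf("Thread %d: %d\n", omp_get_thread_num(), *p);
  }

  int main() {
    #pragma omp target parallel
    foo();
  }
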

.. code-block:: console

  $ clang++ -fopenmp -fopenmp-targets=nvptx64 -O1 -Rpass-missed=openmp-opt omp112.cpp
  omp112.cpp:9:7: remark: Found thread data sharing on the GPU. Expect degraded performance
  due to data globalization. [OMP112] [-Rpass-missed=openmp-opt]
  int x = omp_get_thread_num();

A less convoluted example of globalization that cannot be removed occurs when
calling functions that are not visible from the current translation unit.


.. code-block:: c++

  extern void use(int *x);

  void foo() {
    int x;
    use(&x);
  }

  int main() {
    #pragma omp target parallel
    foo();
  }


.. code-block:: console

  $ clang++ -fopenmp -fopenmp-targets=nvptx64 -O1 -Rpass-missed=openmp-opt omp112.cpp
  omp112.cpp:4:7: remark: Found thread data sharing on the GPU. Expect degraded performance
  due to data globalization. [OMP112] [-Rpass-missed=openmp-opt]

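
For contrast with the external-function case, the sketch below defines the
callee in the same translation unit. This is an illustration under assumptions,
not a guaranteed fix: the body of ``use``, the ``run`` wrapper, and the
``result`` variable are hypothetical and do not come from the original example.
When the optimizer can see that the pointer is never stored anywhere another
thread could read it, it may be able to return the variable to the thread-local
stack, as described for :ref:`OMP110 <omp110>`. Whether the remark actually
disappears depends on the compiler version and optimization level.

```cpp
// Hypothetical callee, defined in this translation unit so that the
// optimizer can inspect its body. It only modifies the pointee and does
// not store the pointer anywhere.
static void use(int *x) { *x += 1; }

int foo() {
  int x = 41;
  use(&x);  // The address of x is taken, but it does not escape use().
  return x;
}

// Hypothetical host-side wrapper. Every device thread computes the same
// value; the max reduction combines the per-thread copies into one result.
// Without OpenMP enabled the pragma is ignored and foo() runs on the host.
int run() {
  int result = 0;
  #pragma omp target parallel map(tofrom: result) reduction(max: result)
  result = foo();
  return result;
}
```

Compiling this with the same ``-Rpass-missed=openmp-opt`` flag used above is
one way to check whether the remark is still emitted for ``x``.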

OpenMP target offloading missed remark.