Documentation/x86/intel_rdt_ui.txt

User Interface for Resource Allocation in Intel Resource Director Technology

Copyright (C) 2016 Intel Corporation

Fenghua Yu <fenghua.yu@intel.com>
Tony Luck <tony.luck@intel.com>
Vikas Shivappa <vikas.shivappa@intel.com>

This feature is enabled by the CONFIG_INTEL_RDT Kconfig and the
X86 /proc/cpuinfo flag bits "rdt", "cqm", "cat_l3" and "cdp_l3".

To use the feature, mount the file system:

 # mount -t resctrl resctrl [-o cdp] /sys/fs/resctrl

Mount options are:

"cdp": Enable code/data prioritization in L3 cache allocations.

RDT features are orthogonal. A particular system may support only
monitoring, only control, or both monitoring and control.

The mount succeeds if either allocation or monitoring is present, but
only those files and directories supported by the system will be created.
For more details on the behavior of the interface during monitoring
and allocation, see the "Resource alloc and monitor groups" section.

Info directory
--------------

The 'info' directory contains information about the enabled
resources. Each resource has its own subdirectory. The subdirectory
names reflect the resource names.

Each cache resource (L3/L2) subdirectory contains the following files
related to allocation:

"num_closids":          The number of CLOSIDs which are valid for this
                        resource. The kernel uses the smallest number of
                        CLOSIDs of all enabled resources as limit.

"cbm_mask":             The bitmask which is valid for this resource.
                        This mask is equivalent to 100%.

"min_cbm_bits":         The minimum number of consecutive bits which
                        must be set when writing a mask.

"shareable_bits":       Bitmask of the resource that can be shared with
                        other executing entities (e.g. I/O). The user
                        can use this when setting up exclusive cache
                        partitions. Note that some platforms support
                        devices that have their own settings for cache
                        use which can override these bits.

The memory bandwidth (MB) subdirectory contains the following files
with respect to allocation:

"min_bandwidth":        The minimum memory bandwidth percentage which
                        the user can request.

"bandwidth_gran":       The granularity in which the memory bandwidth
                        percentage is allocated. The allocated
                        b/w percentage is rounded off to the next
                        control step available on the hardware. The
                        available bandwidth control steps are:
                        min_bandwidth + N * bandwidth_gran.

"delay_linear":         Indicates if the delay scale is linear or
                        non-linear. This field is purely informational.

If RDT monitoring is available there will be an "L3_MON" directory
with the following files:

"num_rmids":            The number of RMIDs available. This is the
                        upper bound for how many "CTRL_MON" + "MON"
                        groups can be created.

"mon_features":         Lists the monitoring events if
                        monitoring is enabled for the resource.

"max_threshold_occupancy":
                        Read/write file that provides the largest value
                        (in bytes) at which a previously used
                        LLC_occupancy counter can be considered for
                        re-use.

Finally, in the top level of the "info" directory there is a file
named "last_cmd_status". This is reset with every "command" issued
via the file system (making new directories or writing to any of the
control files). If the command was successful, it will read as "ok".
If the command failed, it will provide more information than can be
conveyed in the error returns from file operations. E.g.

        # echo L3:0=f7 > schemata
        bash: echo: write error: Invalid argument
        # cat info/last_cmd_status
        mask f7 has non-consecutive 1-bits
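
A small wrapper can make this status easier to use from scripts. The
following is only a sketch: the function name resctrl_write and the
RESCTRL_ROOT variable are illustrative and not part of the resctrl
interface, and the file system is assumed to be mounted at
/sys/fs/resctrl unless RESCTRL_ROOT says otherwise.

```shell
# RESCTRL_ROOT is an assumed helper variable, not a kernel interface.
RESCTRL_ROOT="${RESCTRL_ROOT:-/sys/fs/resctrl}"

# resctrl_write FILE VALUE
# Write VALUE to FILE (a path relative to RESCTRL_ROOT). If the write
# fails, print the contents of info/last_cmd_status and return non-zero.
resctrl_write() {
        if ! printf '%s\n' "$2" > "$RESCTRL_ROOT/$1"; then
                printf 'write to %s failed: ' "$1" >&2
                cat "$RESCTRL_ROOT/info/last_cmd_status" >&2
                return 1
        fi
}
```

With this in place, "resctrl_write schemata L3:0=f7" would be expected to
fail on the system above and report the non-consecutive mask message.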

Resource alloc and monitor groups
---------------------------------

Resource groups are represented as directories in the resctrl file
system.  The default group is the root directory which, immediately
after mounting, owns all the tasks and cpus in the system and can make
full use of all resources.

On a system with RDT control features additional directories can be
created in the root directory that specify different amounts of each
resource (see "schemata" below). The root and these additional top level
directories are referred to as "CTRL_MON" groups below.

On a system with RDT monitoring the root directory and other top level
directories contain a directory named "mon_groups" in which additional
directories can be created to monitor subsets of tasks in the CTRL_MON
group that is their ancestor. These are called "MON" groups in the rest
of this document.

Removing a directory will move all tasks and cpus owned by the group it
represents to the parent. Removing one of the created CTRL_MON groups
will automatically remove all MON groups below it.

All groups contain the following files:

"tasks":
        Reading this file shows the list of all tasks that belong to
        this group. Writing a task id to the file will add a task to the
        group. If the group is a CTRL_MON group the task is removed from
        whichever previous CTRL_MON group owned the task and also from
        any MON group that owned the task. If the group is a MON group,
        then the task must already belong to the CTRL_MON parent of this
        group. The task is removed from any previous MON group.


"cpus":
        Reading this file shows a bitmask of the logical CPUs owned by
        this group. Writing a mask to this file will add and remove
        CPUs to/from this group. As with the tasks file a hierarchy is
        maintained where MON groups may only include CPUs owned by the
        parent CTRL_MON group.


"cpus_list":
        Just like "cpus", only using ranges of CPUs instead of bitmasks.
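
The two files describe the same set of CPUs in different notations. As
an illustration of how a range maps to a bitmask, here is a minimal
sketch; the function name cpu_range_to_mask is made up for this example,
and it only handles a single range of low-numbered CPUs.

```shell
# cpu_range_to_mask FIRST LAST
# Print the hex bitmask that has bits FIRST..LAST set. Shell arithmetic
# is 64-bit signed, so this sketch is only valid for CPU numbers below 62.
cpu_range_to_mask() {
        printf '%x\n' $(( ((1 << ($2 - $1 + 1)) - 1) << $1 ))
}
```

For CPUs 4-7 this prints "f0", so writing "f0" to "cpus" should select the
same CPUs as writing "4-7" to "cpus_list".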


When control is enabled all CTRL_MON groups will also contain:

"schemata":
        A list of all the resources available to this group.
        Each resource has its own line and format - see below for details.

When monitoring is enabled all MON groups will also contain:

"mon_data":
        This contains a set of files organized by L3 domain and by
        RDT event. E.g. on a system with two L3 domains there will
        be subdirectories "mon_L3_00" and "mon_L3_01".  Each of these
        directories has one file per event (e.g. "llc_occupancy",
        "mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
        files provide a read out of the current value of the event for
        all tasks in the group. In CTRL_MON groups these files provide
        the sum for all tasks in the CTRL_MON group and all tasks in
        MON groups. See the examples section for more details on usage.

Resource allocation rules
-------------------------
When a task is running the following rules define which resources are
available to it:

1) If the task is a member of a non-default group, then the schemata
   for that group is used.

2) Else if the task belongs to the default group, but is running on a
   CPU that is assigned to some specific group, then the schemata for the
   CPU's group is used.

3) Otherwise the schemata for the default group is used.
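
The precedence of these three rules can be written down as a small
decision function. This is purely illustrative: the kernel applies the
rules internally, and the function name effective_group and the use of
"." to denote the default (root) group are conventions invented for this
sketch.

```shell
# effective_group TASK_GROUP CPU_GROUP
# TASK_GROUP: group whose "tasks" file lists the task ("." = default group)
# CPU_GROUP:  group that owns the CPU the task is running on ("." = default)
# Prints the group whose schemata applies to the task.
effective_group() {
        if [ "$1" != "." ]; then
                echo "$1"       # rule 1: task is in a non-default group
        elif [ "$2" != "." ]; then
                echo "$2"       # rule 2: default-group task on an assigned CPU
        else
                echo "."        # rule 3: fall back to the default group
        fi
}
```

So a task that was explicitly added to "p0" keeps the schemata of "p0" even
while it runs on a CPU that belongs to "p1".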

Resource monitoring rules
-------------------------
1) If a task is a member of a MON group, or non-default CTRL_MON group
   then RDT events for the task will be reported in that group.

2) If a task is a member of the default CTRL_MON group, but is running
   on a CPU that is assigned to some specific group, then the RDT events
   for the task will be reported in that group.

3) Otherwise RDT events for the task will be reported in the root level
   "mon_data" group.


Notes on cache occupancy monitoring and control
-----------------------------------------------
When moving a task from one group to another you should remember that
this only affects *new* cache allocations by the task. E.g. you may have
a task in a monitor group showing 3 MB of cache occupancy. If you move
the task to a new group and immediately check the occupancy of the old
and new groups you will likely see that the old group is still showing
3 MB and the new group zero. When the task accesses locations still in
cache from before the move, the h/w does not update any counters. On a
busy system you will likely see the occupancy in the old group go down
as cache lines are evicted and re-used while the occupancy in the new
group rises as the task accesses memory and loads into the cache are
counted based on membership in the new group.

The same applies to cache allocation control. Moving a task to a group
with a smaller cache partition will not evict any cache lines. The
process may continue to use them from the old partition.

Hardware uses a CLOSID (Class of Service ID) and an RMID (Resource
Monitoring ID) to identify a control group and a monitoring group
respectively. Each resource group is mapped to these IDs based on the
kind of group. The number of CLOSIDs and RMIDs is limited by the
hardware, hence the creation of a "CTRL_MON" directory may fail if we
run out of either CLOSIDs or RMIDs, and the creation of a "MON" group
may fail if we run out of RMIDs.

max_threshold_occupancy - generic concepts
------------------------------------------

Note that an RMID once freed may not be immediately available for use as
the RMID is still tagged to the cache lines of its previous user.
Hence such RMIDs are placed on a limbo list and rechecked once the cache
occupancy has gone down. If at some point the system has many limbo
RMIDs that are not yet ready to be used, the user may see -EBUSY
during mkdir.

max_threshold_occupancy is a user configurable value to determine the
occupancy at which an RMID can be freed.

Schemata files - general concepts
---------------------------------
Each line in the file describes one resource. The line starts with
the name of the resource, followed by specific values to be applied
in each of the instances of that resource on the system.

Cache IDs
---------
On current generation systems there is one L3 cache per socket and L2
caches are generally just shared by the hyperthreads on a core, but this
isn't an architectural requirement. We could have multiple separate L3
caches on a socket, or multiple cores could share an L2 cache. So instead
of using "socket" or "core" to define the set of logical cpus sharing
a resource we use a "Cache ID". At a given cache level this will be a
unique number across the whole system (but it isn't guaranteed to be a
contiguous sequence, there may be gaps).  To find the ID for each logical
CPU look in /sys/devices/system/cpu/cpu*/cache/index*/id

Cache Bit Masks (CBM)
---------------------
For cache resources we describe the portion of the cache that is available
for allocation using a bitmask. The maximum value of the mask is defined
by each cpu model (and may be different for different cache levels). It
is found using CPUID, but is also provided in the "info" directory of
the resctrl file system in "info/{resource}/cbm_mask". X86 hardware
requires that these masks have all the '1' bits in a contiguous block. So
0x3, 0x6 and 0xC are legal 4-bit masks with two bits set, but 0x5, 0x9
and 0xA are not.  On a system with a 20-bit mask each bit represents 5%
of the capacity of the cache. You could partition the cache into four
equal parts with masks: 0x1f, 0x3e0, 0x7c00, 0xf8000.
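
Both the contiguity rule and the equal partitioning can be checked with
a few lines of shell arithmetic. The helper names cbm_is_contiguous and
cbm_partition below are invented for this sketch; masks are passed as
hex digits without the "0x" prefix, as they are written to schemata.

```shell
# cbm_is_contiguous MASK
# Succeed if MASK is non-zero and all of its '1' bits form one block.
cbm_is_contiguous() {
        m=$((0x$1))
        [ "$m" -ne 0 ] || return 1
        # Drop trailing zero bits; what remains must look like 2^k - 1.
        while [ $((m & 1)) -eq 0 ]; do m=$((m >> 1)); done
        [ $(( m & (m + 1) )) -eq 0 ]
}

# cbm_partition BITS PARTS
# Print PARTS equal, non-overlapping masks for a BITS-wide CBM, lowest
# first. Assumes BITS is a multiple of PARTS.
cbm_partition() {
        width=$(($1 / $2))
        i=0
        while [ "$i" -lt "$2" ]; do
                printf '%x\n' $(( ((1 << width) - 1) << (i * width) ))
                i=$((i + 1))
        done
}
```

"cbm_partition 20 4" prints 1f, 3e0, 7c00 and f8000, the four masks quoted
above, while "cbm_is_contiguous 5" fails because bit 1 is a gap.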

Memory bandwidth (b/w) percentage
---------------------------------
For the memory b/w resource, the user controls the resource by
indicating the percentage of total memory b/w.

The minimum bandwidth percentage value for each cpu model is predefined
and can be looked up through "info/MB/min_bandwidth". The bandwidth
granularity that is allocated is also dependent on the cpu model and can
be looked up at "info/MB/bandwidth_gran". The available bandwidth
control steps are: min_bw + N * bw_gran. Intermediate values are rounded
to the next control step available on the hardware.
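
As a worked illustration of the step formula, the sketch below computes
the control step for a requested percentage. It assumes that "next"
means rounding upwards and that requests below min_bw or above 100 are
rejected; the function name mba_round_up is made up for this example.

```shell
# mba_round_up MIN_BW BW_GRAN REQUESTED
# Print the smallest step MIN_BW + N * BW_GRAN that is >= REQUESTED.
# Assumption: rounding is upwards; out-of-range requests are an error.
mba_round_up() {
        min=$1 gran=$2 req=$3
        if [ "$req" -lt "$min" ] || [ "$req" -gt 100 ]; then
                return 1
        fi
        steps=$(( (req - min + gran - 1) / gran ))
        echo $(( min + steps * gran ))
}
```

With min_bandwidth and bandwidth_gran both 10, a request of 25 lands on
the 30% step and a request of 20 stays at 20%.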

The bandwidth throttling is a core specific mechanism on some Intel
SKUs. Using a high bandwidth and a low bandwidth setting on two threads
sharing a core will result in both threads being throttled to use the
low bandwidth.

L3 schemata file details (code and data prioritization disabled)
----------------------------------------------------------------
With CDP disabled the L3 schemata format is:

        L3:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

L3 schemata file details (CDP enabled via mount option to resctrl)
------------------------------------------------------------------
When CDP is enabled L3 control is split into two separate resources
so you can specify independent masks for code and data like this:

        L3DATA:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
        L3CODE:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

L2 schemata file details
------------------------
L2 cache does not support code and data prioritization, so the
schemata format is always:

        L2:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

Memory b/w Allocation details
-----------------------------

Memory b/w domain is L3 cache.

        MB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...

Reading/writing the schemata file
---------------------------------
Reading the schemata file will show the state of all resources
on all domains. When writing you only need to specify those values
which you wish to change.  E.g.

# cat schemata
L3DATA:0=fffff;1=fffff;2=fffff;3=fffff
L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
# echo "L3DATA:2=3c0;" > schemata
# cat schemata
L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
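
Scripts often need a single value back out of this format. The function
below is a minimal sketch of such a lookup using awk; schemata_get is a
name chosen for this example and it simply reads schemata text on
standard input.

```shell
# schemata_get RESOURCE DOMAIN_ID
# Read schemata text on stdin and print the value assigned to DOMAIN_ID
# on the RESOURCE line. Prints nothing if there is no such entry.
schemata_get() {
        awk -v res="$1" -v id="$2" -F: '
                {
                        name = $1
                        gsub(/[ \t]/, "", name)   # ignore padding around the name
                        if (name != res) next
                        n = split($2, dom, ";")
                        for (i = 1; i <= n; i++) {
                                split(dom[i], kv, "=")
                                if (kv[1] == id) print kv[2]
                        }
                }'
}
```

After the write shown above, "schemata_get L3DATA 2 < schemata" would be
expected to print 3c0.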

Examples for RDT allocation usage:

Example 1
---------
On a two socket machine (one L3 cache per socket) with just four bits
for cache bit masks, minimum b/w of 10% with a memory bandwidth
granularity of 10%.

# mount -t resctrl resctrl /sys/fs/resctrl
# cd /sys/fs/resctrl
# mkdir p0 p1
# echo -e "L3:0=3;1=c\nMB:0=50;1=50" > /sys/fs/resctrl/p0/schemata
# echo -e "L3:0=3;1=3\nMB:0=50;1=50" > /sys/fs/resctrl/p1/schemata

The default resource group is unmodified, so we have access to all parts
of all caches (its schemata file reads "L3:0=f;1=f").

Tasks that are under the control of group "p0" may only allocate from the
"lower" 50% on cache ID 0, and the "upper" 50% of cache ID 1.
Tasks in group "p1" use the "lower" 50% of cache on both sockets.

Similarly, tasks that are under the control of group "p0" may use a
maximum memory b/w of 50% on socket 0 and 50% on socket 1.
Tasks in group "p1" may also use 50% memory b/w on both sockets.
Note that unlike cache masks, memory b/w cannot specify whether these
allocations can overlap or not. The allocations specify the maximum
b/w that the group may be able to use and the system admin can configure
the b/w accordingly.

Example 2
---------
Again two sockets, but this time with a more realistic 20-bit mask.

Two real time tasks pid=1234 running on processor 0 and pid=5678 running on
processor 1 on socket 0 on a 2-socket and dual core machine. To avoid noisy
neighbors, each of the two real-time tasks exclusively occupies one quarter
of L3 cache on socket 0.

# mount -t resctrl resctrl /sys/fs/resctrl
# cd /sys/fs/resctrl

First we reset the schemata for the default group so that the "upper"
50% of the L3 cache on socket 0 and 50% of memory b/w cannot be used by
ordinary tasks:

# echo -e "L3:0=3ff;1=fffff\nMB:0=50;1=100" > schemata

Next we make a resource group for our first real time task and give
it access to the "top" 25% of the cache on socket 0.

# mkdir p0
# echo "L3:0=f8000;1=fffff" > p0/schemata

Finally we move our first real time task into this resource group. We
also use taskset(1) to ensure the task always runs on a dedicated CPU
on socket 0. Most uses of resource groups will also constrain which
processors tasks run on.

# echo 1234 > p0/tasks
# taskset -cp 1 1234

Ditto for the second real time task (with the remaining 25% of cache):

# mkdir p1
# echo "L3:0=7c00;1=fffff" > p1/schemata
# echo 5678 > p1/tasks
# taskset -cp 2 5678

For the same 2 socket system with memory b/w resource and CAT L3 the
schemata would look like this (assuming min_bandwidth is 10 and
bandwidth_gran is 10):

For our first real time task this would request 20% memory b/w on socket
0.

# echo -e "L3:0=f8000;1=fffff\nMB:0=20;1=100" > p0/schemata

For our second real time task this would request another 20% memory b/w
on socket 0.

# echo -e "L3:0=7c00;1=fffff\nMB:0=20;1=100" > p1/schemata

Example 3
---------

A single socket system which has real-time tasks running on cores 4-7 and
non real-time workload assigned to cores 0-3. The real-time tasks share
text and data, so a per task association is not required and due to
interaction with the kernel it's desired that the kernel on these cores
shares L3 with the tasks.

# mount -t resctrl resctrl /sys/fs/resctrl
# cd /sys/fs/resctrl

First we reset the schemata for the default group so that the "upper"
50% of the L3 cache on socket 0, and 50% of memory bandwidth on socket 0
cannot be used by ordinary tasks:

# echo -e "L3:0=3ff\nMB:0=50" > schemata

Next we make a resource group for our real time cores and give it access
to the "top" 50% of the cache on socket 0 and 50% of memory bandwidth on
socket 0.

# mkdir p0
# echo -e "L3:0=ffc00\nMB:0=50" > p0/schemata

Finally we move cores 4-7 over to the new group and make sure that the
kernel and the tasks running there get 50% of the cache. They should
also get 50% of memory bandwidth assuming that the cores 4-7 are SMT
siblings and only the real time threads are scheduled on the cores 4-7.

# echo F0 > p0/cpus

Locking between applications
----------------------------

Certain operations on the resctrl filesystem, composed of read/writes
to/from multiple files, must be atomic.

As an example, the allocation of an exclusive reservation of L3 cache
involves:

  1. Read the cbmmasks from each directory
  2. Find a contiguous set of bits in the global CBM bitmask that is clear
     in all of the directory cbmmasks
  3. Create a new directory
  4. Set the bits found in step 2 to the new directory "schemata" file

If two applications attempt to allocate space concurrently then they can
end up allocating the same bits so the reservations are shared instead of
exclusive.
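
Step 2 is the part that involves actual computation. The sketch below
shows one way it could be done, picking the lowest free run of bits; the
function name cbm_find_free is hypothetical, and collecting the masks
from the existing directories (step 1) is left to the caller.

```shell
# cbm_find_free FULL_MASK NBITS USED_MASK...
# All masks are hex without "0x". Print the lowest run of NBITS
# contiguous bits inside FULL_MASK that is clear in every USED_MASK.
# Return non-zero if no such run exists.
cbm_find_free() {
        full=$((0x$1)) nbits=$2
        shift 2
        used=0
        for u in "$@"; do used=$((used | 0x$u)); done
        want=$(( (1 << nbits) - 1 ))
        shiftn=0
        while [ $(( want << shiftn )) -le "$full" ]; do
                cand=$(( want << shiftn ))
                if [ $(( cand & full )) -eq "$cand" ] &&
                   [ $(( cand & used )) -eq 0 ]; then
                        printf '%x\n' "$cand"
                        return 0
                fi
                shiftn=$((shiftn + 1))
        done
        return 1
}
```

For instance, "cbm_find_free fffff 5 3ff f8000" prints 7c00. The result is
only meaningful if no other application changes the masks between the
read and the write, which is why the whole sequence needs to be locked.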

To coordinate atomic operations on the resctrlfs and to avoid the problem
above, the following locking procedure is recommended:

Locking is based on flock, which is available in libc and also as a shell
script command.

Write lock:

 A) Take flock(LOCK_EX) on /sys/fs/resctrl
 B) Read/write the directory structure.
 C) Release the lock with flock(LOCK_UN)

Read lock:

 A) Take flock(LOCK_SH) on /sys/fs/resctrl
 B) If success read the directory structure.
 C) Release the lock with flock(LOCK_UN)

Example with bash:

# Atomically read directory structure
$ flock -s /sys/fs/resctrl/ find /sys/fs/resctrl

# Read directory contents and create new subdirectory

$ cat create-dir.sh
find /sys/fs/resctrl/ > output.txt
mask=$(function-of output.txt)  # function-of is a placeholder, not a real command
mkdir /sys/fs/resctrl/newres/
echo "$mask" > /sys/fs/resctrl/newres/schemata

$ flock /sys/fs/resctrl/ ./create-dir.sh

Example with C:

/*
 * Example code to take advisory locks
 * before accessing resctrl filesystem
 */
#include <sys/file.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

void resctrl_take_shared_lock(int fd)
{
        int ret;

        /* take shared lock on resctrl filesystem */
        ret = flock(fd, LOCK_SH);
        if (ret) {
                perror("flock");
                exit(-1);
        }
}

void resctrl_take_exclusive_lock(int fd)
{
        int ret;

        /* take exclusive lock on resctrl filesystem */
        ret = flock(fd, LOCK_EX);
        if (ret) {
                perror("flock");
                exit(-1);
        }
}

void resctrl_release_lock(int fd)
{
        int ret;

        /* release lock on resctrl filesystem */
        ret = flock(fd, LOCK_UN);
        if (ret) {
                perror("flock");
                exit(-1);
        }
}

int main(void)
{
        int fd;

        fd = open("/sys/fs/resctrl", O_RDONLY | O_DIRECTORY);
        if (fd == -1) {
                perror("open");
                exit(-1);
        }
        resctrl_take_shared_lock(fd);
        /* code to read directory contents */
        resctrl_release_lock(fd);

        resctrl_take_exclusive_lock(fd);
        /* code to read and write directory contents */
        resctrl_release_lock(fd);

        return 0;
}

Examples for RDT Monitoring along with allocation usage:

Reading monitored data
----------------------
Reading an event file (e.g. mon_data/mon_L3_00/llc_occupancy) would
show the current snapshot of LLC occupancy of the corresponding MON
group or CTRL_MON group.


Example 1 (Monitor CTRL_MON group and subset of tasks in CTRL_MON group)
------------------------------------------------------------------------
On a two socket machine (one L3 cache per socket) with just four bits
for cache bit masks

# mount -t resctrl resctrl /sys/fs/resctrl
# cd /sys/fs/resctrl
# mkdir p0 p1
# echo "L3:0=3;1=c" > /sys/fs/resctrl/p0/schemata
# echo "L3:0=3;1=3" > /sys/fs/resctrl/p1/schemata
# echo 5678 > p1/tasks
# echo 5679 > p1/tasks

The default resource group is unmodified, so we have access to all parts
of all caches (its schemata file reads "L3:0=f;1=f").

Tasks that are under the control of group "p0" may only allocate from the
"lower" 50% on cache ID 0, and the "upper" 50% of cache ID 1.
Tasks in group "p1" use the "lower" 50% of cache on both sockets.

Create monitor groups and assign a subset of tasks to each monitor group.

# cd /sys/fs/resctrl/p1/mon_groups
# mkdir m11 m12
# echo 5678 > m11/tasks
# echo 5679 > m12/tasks

Fetch data (data shown in bytes)

# cat m11/mon_data/mon_L3_00/llc_occupancy
16234000
# cat m11/mon_data/mon_L3_01/llc_occupancy
14789000
# cat m12/mon_data/mon_L3_00/llc_occupancy
16789000

The parent CTRL_MON group shows the aggregated data.

# cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
31234000

Example 2 (Monitor a task from its creation)
--------------------------------------------
On a two socket machine (one L3 cache per socket)

# mount -t resctrl resctrl /sys/fs/resctrl
# cd /sys/fs/resctrl
# mkdir p0 p1

An RMID is allocated to the group as soon as it is created, hence the
<cmd> below is monitored from its creation.

# echo $$ > /sys/fs/resctrl/p1/tasks
# <cmd>

Fetch the data

# cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
31789000

Example 3 (Monitor without CAT support or before creating CAT groups)
---------------------------------------------------------------------

Assume a system like HSW has only CQM and no CAT support. In this case
resctrl will still mount but CTRL_MON directories cannot be created.
The user can still create different MON groups within the root group
and is thereby able to monitor all tasks, including kernel threads.

This can also be used to profile a job's cache footprint before
allocating it to a specific allocation group.

# mount -t resctrl resctrl /sys/fs/resctrl
# cd /sys/fs/resctrl
# mkdir mon_groups/m01
# mkdir mon_groups/m02

# echo 3478 > /sys/fs/resctrl/mon_groups/m01/tasks
# echo 2467 > /sys/fs/resctrl/mon_groups/m02/tasks

Monitor the groups separately and also get per domain data. From the
output below it is apparent that the tasks are mostly doing work on
domain(socket) 0.

# cat /sys/fs/resctrl/mon_groups/m01/mon_data/mon_L3_00/llc_occupancy
31234000
# cat /sys/fs/resctrl/mon_groups/m01/mon_data/mon_L3_01/llc_occupancy
34555
# cat /sys/fs/resctrl/mon_groups/m02/mon_data/mon_L3_00/llc_occupancy
31234000
# cat /sys/fs/resctrl/mon_groups/m02/mon_data/mon_L3_01/llc_occupancy
32789
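
When only the total footprint of a group matters, the per domain files
can be added up. The following is a minimal sketch that assumes the
"mon_data/mon_L3_*" layout described earlier; llc_occupancy_total is an
illustrative name, not an existing tool.

```shell
# llc_occupancy_total GROUP_DIR
# Print the sum of llc_occupancy (in bytes) over all L3 domains of the
# group whose directory is GROUP_DIR.
llc_occupancy_total() {
        total=0
        for f in "$1"/mon_data/mon_L3_*/llc_occupancy; do
                [ -r "$f" ] || continue         # no match or unreadable file
                read -r val < "$f"
                case "$val" in
                ''|*[!0-9]*) continue ;;        # skip non-numeric readings
                esac
                total=$((total + val))
        done
        echo "$total"
}
```

With the readings shown above, calling it on /sys/fs/resctrl/mon_groups/m01
would print 31268555.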


Example 4 (Monitor real time tasks)
-----------------------------------

A single socket system which has real time tasks running on cores 4-7
and non real time tasks on other cpus. We want to monitor the cache
occupancy of the real time threads on these cores.

# mount -t resctrl resctrl /sys/fs/resctrl
# cd /sys/fs/resctrl
# mkdir p1

Move the cpus 4-7 over to p1

# echo f0 > p1/cpus

View the llc occupancy snapshot

# cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
11234000