1 .. SPDX-License-Identifier: GPL-2.0-only
2 .. Copyright (C) 2022 Red Hat, Inc.
3 .. Copyright (C) 2022-2023 Isovalent, Inc.
5 ===============================================
6 BPF_MAP_TYPE_HASH, with PERCPU and LRU Variants
7 ===============================================
10 - ``BPF_MAP_TYPE_HASH`` was introduced in kernel version 3.19
11 - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduced in version 4.6
12 - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
13 were introduced in version 4.10
15 ``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCPU_HASH`` provide general
16 purpose hash map storage. Both the key and the value can be structs,
17 allowing for composite keys and values.
19 The kernel is responsible for allocating and freeing key/value pairs, up
20 to the max_entries limit that you specify. Hash maps use pre-allocation
21 of hash table elements by default. The ``BPF_F_NO_PREALLOC`` flag can be
22 used to disable pre-allocation when it is too memory expensive.
24 ``BPF_MAP_TYPE_PERCPU_HASH`` provides a separate value slot per
25 CPU. The per-cpu values are stored internally in an array.
27 The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
28 variants add LRU semantics to their respective hash tables. An LRU hash
29 will automatically evict the least recently used entries when the hash
30 table reaches capacity. An LRU hash maintains an internal LRU list that
31 is used to select elements for eviction. This internal LRU list is
32 shared across CPUs but it is possible to request a per CPU LRU list with
33 the ``BPF_F_NO_COMMON_LRU`` flag when calling ``bpf_map_create``. The
34 following table outlines the properties of LRU maps depending on the a
35 map type and the flags used to create the map.
37 ======================== ========================= ================================
38 Flag ``BPF_MAP_TYPE_LRU_HASH`` ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
39 ======================== ========================= ================================
40 **BPF_F_NO_COMMON_LRU** Per-CPU LRU, global map Per-CPU LRU, per-cpu map
41 **!BPF_F_NO_COMMON_LRU** Global LRU, global map Global LRU, per-cpu map
42 ======================== ========================= ================================
55 long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)
57 Hash entries can be added or updated using the ``bpf_map_update_elem()``
58 helper. This helper replaces existing elements atomically. The ``flags``
59 parameter can be used to control the update behaviour:
61 - ``BPF_ANY`` will create a new element or update an existing element
62 - ``BPF_NOEXIST`` will create a new element only if one did not already
64 - ``BPF_EXIST`` will update an existing element
66 ``bpf_map_update_elem()`` returns 0 on success, or negative error in
74 void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)
76 Hash entries can be retrieved using the ``bpf_map_lookup_elem()``
77 helper. This helper returns a pointer to the value associated with
78 ``key``, or ``NULL`` if no entry was found.
85 long bpf_map_delete_elem(struct bpf_map *map, const void *key)
87 Hash entries can be deleted using the ``bpf_map_delete_elem()``
88 helper. This helper will return 0 on success, or negative error in case
94 For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH``
95 the ``bpf_map_update_elem()`` and ``bpf_map_lookup_elem()`` helpers
96 automatically access the hash slot for the current CPU.
98 bpf_map_lookup_percpu_elem()
99 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
103 void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu)
105 The ``bpf_map_lookup_percpu_elem()`` helper can be used to lookup the
106 value in the hash slot for a specific CPU. Returns value associated with
107 ``key`` on ``cpu`` , or ``NULL`` if no entry was found or ``cpu`` is
113 Values stored in ``BPF_MAP_TYPE_HASH`` can be accessed concurrently by
114 programs running on different CPUs. Since Kernel version 5.1, the BPF
115 infrastructure provides ``struct bpf_spin_lock`` to synchronise access.
116 See ``tools/testing/selftests/bpf/progs/test_spin_lock.c``.
121 bpf_map_get_next_key()
122 ~~~~~~~~~~~~~~~~~~~~~~
126 int bpf_map_get_next_key(int fd, const void *cur_key, void *next_key)
128 In userspace, it is possible to iterate through the keys of a hash using
129 libbpf's ``bpf_map_get_next_key()`` function. The first key can be fetched by
130 calling ``bpf_map_get_next_key()`` with ``cur_key`` set to
131 ``NULL``. Subsequent calls will fetch the next key that follows the
132 current key. ``bpf_map_get_next_key()`` returns 0 on success, -ENOENT if
133 cur_key is the last key in the hash, or negative error in case of
136 Note that if ``cur_key`` gets deleted then ``bpf_map_get_next_key()``
137 will instead return the *first* key in the hash table which is
138 undesirable. It is recommended to use batched lookup if there is going
139 to be key deletion intermixed with ``bpf_map_get_next_key()``.
144 Please see the ``tools/testing/selftests/bpf`` directory for functional
145 examples. The code snippets below demonstrates API usage.
147 This example shows how to declare an LRU Hash with a struct key and a
152 #include <linux/bpf.h>
153 #include <bpf/bpf_helpers.h>
165 __uint(type, BPF_MAP_TYPE_LRU_HASH);
166 __uint(max_entries, 32);
167 __type(key, struct key);
168 __type(value, struct value);
169 } packet_stats SEC(".maps");
171 This example shows how to create or update hash values using atomic
176 static void update_stats(__u32 srcip, int bytes)
181 struct value *value = bpf_map_lookup_elem(&packet_stats, &key);
184 __sync_fetch_and_add(&value->packets, 1);
185 __sync_fetch_and_add(&value->bytes, bytes);
187 struct value newval = { 1, bytes };
189 bpf_map_update_elem(&packet_stats, &key, &newval, BPF_NOEXIST);
193 Userspace walking the map elements from the map declared above:
197 #include <bpf/libbpf.h>
200 static void walk_hash_elements(int map_fd)
202 struct key *cur_key = NULL;
208 err = bpf_map_get_next_key(map_fd, cur_key, &next_key);
212 bpf_map_lookup_elem(map_fd, &next_key, &value);
214 // Use key and value here
223 This section of the document is targeted at Linux developers and describes
224 aspects of the map implementations that are not considered stable ABI. The
225 following details are subject to change in future versions of the kernel.
227 ``BPF_MAP_TYPE_LRU_HASH`` and variants
228 --------------------------------------
230 Updating elements in LRU maps may trigger eviction behaviour when the capacity
231 of the map is reached. There are various steps that the update algorithm
232 attempts in order to enforce the LRU property which have increasing impacts on
233 other CPUs involved in the following operation attempts:
235 - Attempt to use CPU-local state to batch operations
236 - Attempt to fetch free nodes from global lists
237 - Attempt to pull any node from a global list and remove it from the hashmap
238 - Attempt to pull any node from any CPU's list and remove it from the hashmap
240 This algorithm is described visually in the following diagram. See the
241 description in commit 3a08c2fd7634 ("bpf: LRU List") for a full explanation of
242 the corresponding operations:
244 .. kernel-figure:: map_lru_hash_update.dot
245 :alt: Diagram outlining the LRU eviction steps taken during map update.
247 LRU hash eviction during map update for ``BPF_MAP_TYPE_LRU_HASH`` and
248 variants. See the dot file source for kernel function name code references.
250 Map updates start from the oval in the top right "begin ``bpf_map_update()``"
251 and progress through the graph towards the bottom where the result may be
252 either a successful update or a failure with various error codes. The key in
253 the top right provides indicators for which locks may be involved in specific
254 operations. This is intended as a visual hint for reasoning about how map
255 contention may impact update operations, though the map type and flags may
256 impact the actual contention on those locks, based on the logic described in
257 the table above. For instance, if the map is created with type
258 ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` and flags ``BPF_F_NO_COMMON_LRU`` then all map
259 properties would be per-cpu.