1 Rocker Network Switch Register Programming Guide
2 ************************************************
5 Copyright (c) Scott Feldman <sfeldma@gmail.com>
6 Copyright (c) Neil Horman <nhorman@tuxdriver.com>
7 Version 0.11, 12/29/2014
9 This program is free software; you can redistribute it and/or modify
10 it under the terms of the GNU General Public License as published by
11 the Free Software Foundation; either version 2 of the License, or
12 (at your option) any later version.
14 This program is distributed in the hope that it will be useful,
15 but WITHOUT ANY WARRANTY; without even the implied warranty of
16 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17 GNU General Public License for more details.
25 This document describes the hardware/software interface for the Rocker switch
26 device. The intended audience is authors of OS drivers and device emulation
29 Notations and Conventions
30 -------------------------
32 * In register descriptions, [n:m] indicates a range from bit n to bit m,
34 * Use of leading 0x indicates a hexadecimal number.
35 * Use of leading 0b indicates a binary number.
36 * The use of RSVD or Reserved indicates that a bit or field is reserved for
38 * Field width is in bytes, unless otherwise noted.
* Registers are (R) read-only, (R/W) read/write, (W) write-only, or (COR) clear
  on read.
41 * TLV values in network-byte-order are designated with (N).
44 PCI Configuration Registers
45 ===========================
47 PCI Configuration Space
48 -----------------------
50 Each switch instance registers as a PCI device with PCI configuration space::

    offset  width  description          value
    ---------------------------------------------
    0x0     2      Vendor ID            0x1b36
    0x2     2      Device ID            0x0006
    0x8     1      Revision ID          0x01
    0x9     3      Class code           0x2800
    0xF     1      Built-in self test
    0x10    4      Base address low
    0x14    4      Base address high
    0x2C    2      Subsystem vendor ID  *
    0x3D    1      Interrupt pin        0x00
    0x3F    1      Max latency          0x00

    * Assigned by sub-system implementation

79 Memory-Mapped Register Space
80 ============================
82 There are two memory-mapped BARs. BAR0 maps device register space and is
83 0x2000 in size. BAR1 maps MSI-X vector and PBA tables and is also 0x2000 in
84 size, allowing for 256 MSI-X vectors.
86 All registers are 4 or 8 bytes long. It is assumed host software will access 4
87 byte registers with one 4-byte access, and 8 byte registers with either two
88 4-byte accesses or a single 8-byte access. In the case of two 4-byte accesses,
89 access must be lower and then upper 4-bytes, in that order.
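
The access-ordering rule for 8-byte registers can be sketched as follows. This is a minimal illustration, not driver code: `fake_bar0`, `reg_write32`, `reg_write64` and `reg_read64` are hypothetical names, and a plain array stands in for the mapped BAR0 region. A real driver would use the platform's volatile MMIO accessors and convert to little-endian as needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 0x2000-byte BAR0 region, as 32-bit words. */
typedef struct {
    uint32_t words[0x2000 / 4];
} fake_bar0;

/* Write one 4-byte register at byte offset 'off'. */
static inline void reg_write32(fake_bar0 *bar, uint32_t off, uint32_t val)
{
    assert(off % 4 == 0 && off < 0x2000);
    bar->words[off / 4] = val;
}

/* Write an 8-byte register as two 4-byte accesses: lower half, then upper. */
static inline void reg_write64(fake_bar0 *bar, uint32_t off, uint64_t val)
{
    reg_write32(bar, off, (uint32_t)(val & 0xffffffffu)); /* lower 4 bytes */
    reg_write32(bar, off + 4, (uint32_t)(val >> 32));     /* upper 4 bytes */
}

/* Read an 8-byte register in the same order: lower half, then upper. */
static inline uint64_t reg_read64(const fake_bar0 *bar, uint32_t off)
{
    uint64_t lo = bar->words[off / 4];
    uint64_t hi = bar->words[off / 4 + 1];
    return lo | (hi << 32);
}
```
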
91 BAR0 device register space is organized as follows::
94 ------------------------------------------------------
95 0x0000-0x000f Bogus registers to catch misbehaving
96 drivers. Writes do nothing. Reads
98 0x0010-0x00ff Test registers
99 0x0300-0x03ff General purpose registers
100 0x1000-0x1fff Descriptor control
102 Holes in register space are reserved. Writes to reserved registers do nothing.
103 Reads to reserved registers read back as 0.
105 No fancy stuff like write-combining is enabled on any of the registers.
107 BAR1 MSI-X register space is organized as follows::
110 ------------------------------------------------------
111 0x0000-0x0fff MSI-X vector table (256 vectors total)
112 0x1000-0x1fff MSI-X PBA table
115 Interrupts, DMA, and Endianness
116 ===============================
121 The device supports only MSI-X interrupts. BAR1 memory-mapped region contains
122 the MSI-X vector and PBA tables, with support for up to 256 MSI-X vectors.
124 The vector assignment is::
127 -----------------------------------------------------
128 0 Command descriptor ring completion
129 1 Event descriptor ring completion
130 2 Test operation completion
132 4-255 Tx and Rx descriptor ring completion
An MSI-X vector table entry is 16 bytes::

    field       offset  width  description
    -------------------------------------------------------------
    lower_addr  0x0     4      [31:2]  message address[31:2]
                               [1:0]   Rsvd (4 byte alignment
                                       required)
    upper_addr  0x4     4      [31:19] Rsvd
                               [14:0]  message address[46:32]
    data        0x8     4      message data[31:0]
    control     0xc     4      [31:1]  Rsvd
                               [0]     mask (0 = enable,
                                       1 = disable)

150 Software should install the Interrupt Service Routine (ISR) before any ports
151 are enabled or any commands are issued on the command ring.
DMA operations are used for packet DMA to/from the CPU and for command and
event processing. Command processing includes statistical counters and table
dumps, table insertion/deletion, and more. Event processing provides an
asynchronous notification method for device-originating events. Each DMA
operation has a set of control registers to manage a descriptor ring. The
descriptor rings are allocated from contiguous host DMA-able memory, and
registers specify the ring's base address, size, and current head and tail
indices. Software always writes the head, and hardware always writes the tail.
The high-order bit of DMA_DESC_COMP_ERR marks hardware completion of a
descriptor. Software clears this bit when posting a descriptor to the ring,
and hardware sets it when the descriptor is complete.
Descriptor ring sizes must be a power of 2 and range from 2 to 64K entries.
Each ring's base address must be 8-byte aligned. Descriptors must be packed
within the ring, and each descriptor in each ring must also be aligned on an
8-byte boundary. Each descriptor ring has these registers::

    DMA_DESC_xxx_BASE_ADDR, offset 0x1000 + (x * 32), 64-bit, (R/W)
    DMA_DESC_xxx_SIZE,      offset 0x1008 + (x * 32), 32-bit, (R/W)
    DMA_DESC_xxx_HEAD,      offset 0x100c + (x * 32), 32-bit, (R/W)
    DMA_DESC_xxx_TAIL,      offset 0x1010 + (x * 32), 32-bit, (R)
    DMA_DESC_xxx_CTRL,      offset 0x1014 + (x * 32), 32-bit, (W)
    DMA_DESC_xxx_CREDITS,   offset 0x1018 + (x * 32), 32-bit, (R/W)
    DMA_DESC_xxx_RSVD1,     offset 0x101c + (x * 32), 32-bit, (R/W)

182 Where x is descriptor ring index::
Writing BASE_ADDR or SIZE resets HEAD and TAIL to zero. HEAD cannot be
written past TAIL, since doing so would wrap the ring. The ring is empty when
HEAD == TAIL and full when HEAD is one position behind TAIL. Both HEAD and
TAIL increment and wrap modulo the ring size.
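
The register offsets and the empty/full rules above can be expressed as a few small helpers. This is a sketch only; the function names are hypothetical and the code works on plain index values rather than on device registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-ring register offsets: fixed base plus (ring index * 32). */
static inline uint32_t ring_reg_base_addr(uint32_t x) { return 0x1000 + x * 32; }
static inline uint32_t ring_reg_size(uint32_t x)      { return 0x1008 + x * 32; }
static inline uint32_t ring_reg_head(uint32_t x)      { return 0x100c + x * 32; }
static inline uint32_t ring_reg_tail(uint32_t x)      { return 0x1010 + x * 32; }

/* Ring sizes must be a power of 2 between 2 and 64K entries. */
static inline bool ring_size_valid(uint32_t size)
{
    return size >= 2 && size <= 65536 && (size & (size - 1)) == 0;
}

/* Indices wrap modulo the ring size; a power-of-2 size allows masking. */
static inline uint32_t ring_next(uint32_t idx, uint32_t size)
{
    return (idx + 1) & (size - 1);
}

/* Empty: HEAD == TAIL. */
static inline bool ring_empty(uint32_t head, uint32_t tail)
{
    return head == tail;
}

/* Full: HEAD is one position behind TAIL. */
static inline bool ring_full(uint32_t head, uint32_t tail, uint32_t size)
{
    return ring_next(head, size) == tail;
}
```

Because one slot is always left unused to distinguish full from empty, a ring of N entries holds at most N - 1 posted descriptors.
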
208 ------------------------------------------------------------------------
209 [0] CTRL_RESET Reset the descriptor ring
All descriptor types share some common fields::

    field               width  description
    -------------------------------------------------------------------
    DMA_DESC_BUF_ADDR   8      Phys addr of desc payload, 8-byte
                               aligned
    DMA_DESC_COOKIE     8      Desc cookie for completion matching,
                               upper-most bit is reserved
    DMA_DESC_BUF_SIZE   2      Desc payload size in bytes
    DMA_DESC_TLV_SIZE   2      Desc payload total size in bytes
                               used for TLVs.  Must be <=
                               DMA_DESC_BUF_SIZE.
    DMA_DESC_COMP_ERR   2      Completion status of associated
                               desc payload.  High order bit is
                               clear on new descs, toggled by
                               hw for completed items.

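
A minimal sketch of how software might handle the completion bit in the 2-byte DMA_DESC_COMP_ERR field. The macro and function names are hypothetical, and treating the remaining 15 bits as the status value is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the high-order bit of the 16-bit COMP_ERR field. */
#define COMP_ERR_DONE_BIT 0x8000u

/* Software clears the bit before posting a descriptor to the ring. */
static inline uint16_t comp_err_clear_done(uint16_t comp_err)
{
    return (uint16_t)(comp_err & ~COMP_ERR_DONE_BIT);
}

/* Hardware sets the bit once the descriptor is complete. */
static inline bool comp_err_is_done(uint16_t comp_err)
{
    return (comp_err & COMP_ERR_DONE_BIT) != 0;
}

/* Assumed: the low 15 bits carry the completion status. */
static inline uint16_t comp_err_status(uint16_t comp_err)
{
    return (uint16_t)(comp_err & 0x7fffu);
}
```
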
229 To support forward- and backward-compatibility, descriptor and completion
230 payloads are specified in TLV format. Fields are packed with Type=field name,
231 Length=field length, and Value=field value. Software will ignore unknown fields
232 filled in by the switch. Likewise, the switch will ignore unknown fields
233 filled in by software.
235 Descriptor payload buffer is 8-byte aligned and TLVs are 8-byte aligned. The
236 value within a TLV is also 8-byte aligned. The (packed, 8 byte) TLV header is::
238 field width description
239 -----------------------------
241 len 2 TLV value length
244 The alignment requirements for descriptors and TLVs are to avoid unaligned
245 access exceptions in software. Note that the payload for each TLV is also
Figure 1 shows an example descriptor buffer with two TLVs::

                  <------- 8 bytes ------->

    8-byte +––––+ +–––––––––––+–––––+–––––+                      +–+
    align         |   type    | len | pad |  TLV#1 hdr            |
                  +–––––––––––+–––––+–––––+  (len=22)             |
                  |                       |                       |
                  |  value                |  TLV#1 value          |
                  |                       |  (padded to 8-byte    |
                  |                 +–––––+   alignment)          |
                  |                 |/////|                       |
    8-byte +––––+ +–––––––––––+–––––––––––+                       |
    align         |   type    | len | pad |  TLV#2 hdr      DESC_BUF_SIZE
                  +–––––+–––––+–––––+–––––+  (len=2)              |
                  |value|/////////////////|  TLV#2 value          |
                  +–––––+/////////////////|                       |
                  |///////////////////////|                       |
                  |///////////////////////|                       |
                  |///////////////////////|                       |
                  |////////unused/////////|                       |
                  |////////space//////////|                       |
                  |///////////////////////|                       |
                  |///////////////////////|                       |
                  |///////////////////////|                       |
                  +–––––––––––––––––––––––+                      +–+

277 TLVs can be nested within the NEST TLV type.
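
The sizes in Figure 1 can be reproduced with a small packing helper. This is a sketch under stated assumptions: the 8-byte header is taken to be a 4-byte type, a 2-byte len and a 2-byte pad (inferred from the figure), type and len are stored little-endian, and all names (`tlv_align`, `tlv_total_size`, `tlv_put`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLV_ALIGN   8u
#define TLV_HDR_LEN 8u /* assumed: 4-byte type, 2-byte len, 2-byte pad */

/* Round n up to the next multiple of 8. */
static inline size_t tlv_align(size_t n)
{
    return (n + TLV_ALIGN - 1) & ~(size_t)(TLV_ALIGN - 1);
}

/* Space one TLV occupies: header plus value padded to 8 bytes. */
static inline size_t tlv_total_size(size_t value_len)
{
    return TLV_HDR_LEN + tlv_align(value_len);
}

/*
 * Append a TLV at offset 'off' in 'buf' (capacity 'cap').
 * Returns the offset of the next TLV, or 0 if it does not fit.
 */
static inline size_t tlv_put(uint8_t *buf, size_t cap, size_t off,
                             uint32_t type, const void *value, uint16_t len)
{
    size_t total = tlv_total_size(len);

    if (off > cap || total > cap - off)
        return 0;
    memset(buf + off, 0, total);          /* zero header pad and value pad */
    buf[off + 0] = (uint8_t)(type);       /* type, little-endian */
    buf[off + 1] = (uint8_t)(type >> 8);
    buf[off + 2] = (uint8_t)(type >> 16);
    buf[off + 3] = (uint8_t)(type >> 24);
    buf[off + 4] = (uint8_t)(len);        /* len, little-endian */
    buf[off + 5] = (uint8_t)(len >> 8);
    if (len)
        memcpy(buf + off + TLV_HDR_LEN, value, len);
    return off + total;
}
```

With these rules, the first TLV in Figure 1 (len=22) occupies 32 bytes and the second (len=2) occupies 16 bytes, so the TLV portion of the buffer is 48 bytes.
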
282 MSI-X vectors used for descriptor ring completions use a credit mechanism for
283 efficient device, PCIe bus, OS and driver operations. Each descriptor ring has
284 a credit count which represents the number of outstanding descriptors to be
285 processed by the driver. As the device marks descriptors complete, the credit
286 count is incremented. As the driver processes those outstanding descriptors,
it returns credits back to the device. This way, the device knows the driver's
progress and can decide whether to fire the next interrupt.
289 When the credit count is zero, and the first descriptors are posted for the
290 driver, a single interrupt is fired. Once the interrupt is fired, the
291 interrupt is disabled (auto-masked*). In response to the interrupt, the driver
292 will process descriptors and PIO write a returned credit value for that
293 descriptor ring. If the driver returns all credits (the driver caught up with
294 the device and there is no outstanding work), then the interrupt is unmasked,
295 but not fired. If only partial credits are returned, the interrupt remains
296 masked but the device generates an interrupt, signaling the driver that more
297 outstanding work is available.
299 (* this masking is unrelated to the MSI-X interrupt mask register)
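
The credit scheme can be modeled from the device's point of view with two small functions. This is an illustrative sketch of the behavior described above, not the device implementation; `struct ring_irq` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device-side state for one ring's interrupt credits. */
struct ring_irq {
    uint32_t credits; /* completed descriptors not yet returned by driver */
    bool masked;      /* auto-mask state, unrelated to the MSI-X mask bit */
};

/* Device marks one descriptor complete. Returns true if an interrupt fires. */
static inline bool irq_on_complete(struct ring_irq *r)
{
    r->credits++;
    if (r->masked)
        return false; /* driver was already notified */
    r->masked = true; /* fire once, then auto-mask */
    return true;
}

/* Driver returns n credits. Returns true if the device fires again. */
static inline bool irq_on_credit_return(struct ring_irq *r, uint32_t n)
{
    assert(n <= r->credits);
    r->credits -= n;
    if (r->credits == 0) {
        r->masked = false; /* driver caught up: unmask, do not fire */
        return false;
    }
    return true; /* partial return: stay masked, signal more work */
}
```
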
304 Device registers are hard-coded to little-endian (LE). The driver should
305 convert to/from host endianness to LE for device register accesses.
307 Descriptors are LE. Descriptor buffer TLVs will have LE type and length
308 fields, but the value field can either be LE or network-byte-order, depending
309 on context. TLV values containing network packet data will be in network-byte
310 order. A TLV value containing a field or mask used to compare against network
311 packet data is network-byte order. For example, flow match fields (and masks)
312 are network-byte-order since they're matched directly, byte-by-byte, against
313 network packet data. All non-network-packet TLV multi-byte values will be LE.
315 TLV values in network-byte-order are designated with (N).
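
As an illustration of the two byte orders, the helpers below store values byte by byte so that the result does not depend on host endianness. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value little-endian (registers, descriptors, TLV type). */
static inline void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Load a 32-bit little-endian value. */
static inline uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store a 16-bit value in network byte order, for TLV values marked (N). */
static inline void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}
```

For example, a 4-byte port number of 100 is stored as 64 00 00 00, while a 2-byte (N) VLAN ID of 100 is stored as 00 64.
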
321 Rocker has several test registers to support troubleshooting register access,
322 interrupt generation, and DMA operations::
324 TEST_REG, offset 0x0010, 32-bit (R/W)
325 TEST_REG64, offset 0x0018, 64-bit (R/W)
326 TEST_IRQ, offset 0x0020, 32-bit (R/W)
327 TEST_DMA_ADDR, offset 0x0028, 64-bit (R/W)
328 TEST_DMA_SIZE, offset 0x0030, 32-bit (R/W)
329 TEST_DMA_CTRL, offset 0x0034, 32-bit (R/W)
331 Reads to TEST_REG and TEST_REG64 will read a value equal to twice the last
332 value written to the register. The 32-bit and 64-bit versions are for testing
333 32-bit and 64-bit host accesses.
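
A driver self-test can compare the value read back against the expected doubled value. The sketch below only computes the expectation; the function names are hypothetical, and it assumes the doubling wraps at the register width.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Expected TEST_REG read after writing 'written' (assumed to wrap at 32 bits). */
static inline uint32_t test_reg_expected(uint32_t written)
{
    return written * 2u;
}

/* Expected TEST_REG64 read after writing 'written' (assumed to wrap at 64 bits). */
static inline uint64_t test_reg64_expected(uint64_t written)
{
    return written * 2u;
}

/* True if the value read back matches the documented behavior. */
static inline bool test_reg_check(uint32_t written, uint32_t read_back)
{
    return read_back == test_reg_expected(written);
}
```
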
335 A vector can be written to TEST_IRQ and the device will generate an interrupt
338 To test basic DMA operations, allocate a DMA-able host buffer and put the
339 buffer address into TEST_DMA_ADDR and size into TEST_DMA_SIZE. Then, write to
340 TEST_DMA_CTRL to manipulate the buffer contents. TEST_DMA_CTRL operations are::
342 operation value description
343 -----------------------------------------------------------
344 TEST_DMA_CTRL_CLEAR 1 clear buffer
345 TEST_DMA_CTRL_FILL 2 fill buffer bytes with 0x96
346 TEST_DMA_CTRL_INVERT 4 invert bytes in buffer
Various buffer addresses and sizes should be tested to verify that no address
boundary issues exist. In particular, buffers that start on an odd 8-byte
boundary and/or span multiple pages should be tested.
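
To verify the result of a TEST_DMA_CTRL operation, the driver can apply the same operation to a local copy of the buffer and compare. The model below is a sketch: `test_dma_model` is a hypothetical name, and it assumes that "clear" writes zero bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    TEST_DMA_CTRL_CLEAR  = 1,
    TEST_DMA_CTRL_FILL   = 2,
    TEST_DMA_CTRL_INVERT = 4
};

/* Apply to a local buffer what the device is expected to do via DMA. */
static inline void test_dma_model(uint32_t op, uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        switch (op) {
        case TEST_DMA_CTRL_CLEAR:
            buf[i] = 0x00;            /* assumed: clear means zero */
            break;
        case TEST_DMA_CTRL_FILL:
            buf[i] = 0x96;
            break;
        case TEST_DMA_CTRL_INVERT:
            buf[i] = (uint8_t)~buf[i];
            break;
        default:
            break;                    /* unknown op: leave buffer as is */
        }
    }
}
```
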
356 Physical and Logical Ports
357 ------------------------------------
359 The switch supports up to 62 physical (front-panel) ports. Register
360 PORT_PHYS_COUNT returns the actual number of physical ports available::
362 PORT_PHYS_COUNT, offset 0x0304, 32-bit, (R)
364 In addition to front-panel ports, the switch supports logical ports for
Front-panel ports and logical tunnel ports are mapped into a single 32-bit port
space. A special CPU port is assigned port 0. The front-panel ports are
mapped to ports 1-62. A special loopback port is assigned port 63. Logical
tunnel ports are assigned ports 0x00010000-0x0001ffff.

To summarize the port assignments::
374 -------------------------------------------------------
375 0 CPU port (for packets to/from host CPU)
376 1-62 front-panel physical ports
379 0x00010000-0x0001ffff logical tunnel ports
380 0x00020000-0xffffffff RSVD
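
The port number space can be classified with a simple range check. This is a sketch with hypothetical names; port numbers not covered by the assignments above (including 64 through 0xffff) are treated as reserved here, which is an assumption.

```c
#include <assert.h>
#include <stdint.h>

enum port_kind {
    PORT_CPU,
    PORT_FRONT_PANEL,
    PORT_LOOPBACK,
    PORT_TUNNEL,
    PORT_RESERVED
};

/* Map a 32-bit port number onto the assignments described above. */
static inline enum port_kind port_classify(uint32_t port)
{
    if (port == 0)
        return PORT_CPU;
    if (port <= 62)
        return PORT_FRONT_PANEL;
    if (port == 63)
        return PORT_LOOPBACK;
    if (port >= 0x00010000u && port <= 0x0001ffffu)
        return PORT_TUNNEL;
    return PORT_RESERVED; /* assumed for everything else */
}
```
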
385 Switch front-panel ports operate in a mode. Currently, the only mode is
386 OF-DPA. OF-DPA[1] mode is based on OpenFlow Data Plane Abstraction (OF-DPA)
387 Abstract Switch Specification, Version 1.0, from Broadcom Corporation. To
388 set/get the mode for front-panel ports, see port settings, below.
393 Link status for all front-panel ports is available via PORT_PHYS_LINK_STATUS::
395 PORT_PHYS_LINK_STATUS, offset 0x0310, 64-bit, (R)
397 Value is port bitmap. Bits 0 and 63 always read 0. Bits 1-62
398 read 1 for link UP and 0 for link DOWN for respective front-panel ports.
400 Other properties for front-panel ports are available via DMA CMD descriptors::
402 Get PORT_SETTINGS descriptor:
404 field width description
405 ----------------------------------------------
406 PORT_SETTINGS 2 CMD_GET
407 PPORT 4 Physical port #
409 Get PORT_SETTINGS completion:
411 field width description
412 ----------------------------------------------
413 PPORT 4 Physical port #
414 SPEED 4 Current port interface speed, in Mbps
415 DUPLEX 1 1 = Full, 0 = Half
416 AUTONEG 1 1 = enabled, 0 = disabled
417 MACADDR 6 Port MAC address
419 LEARNING 1 MAC address learning on port
422 PHYS_NAME <var> Physical port name (string)
424 Set PORT_SETTINGS descriptor:
426 field width description
427 ----------------------------------------------
428 PORT_SETTINGS 2 CMD_SET
429 PPORT 4 Physical port #
430 SPEED 4 Port interface speed, in Mbps
431 DUPLEX 1 1 = Full, 0 = Half
432 AUTONEG 1 1 = enabled, 0 = disabled
433 MACADDR 6 Port MAC address
439 Front-panel ports are initially disabled, which means port ingress and egress
440 packets will be dropped. To enable or disable a port, use PORT_PHYS_ENABLE::
442 PORT_PHYS_ENABLE: offset 0x0318, 64-bit, (R/W)
444 Value is bitmap of first 64 ports. Bits 0 and 63 are ignored
445 and always read as 0. Write 1 to enable port; write 0 to disable it.
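
Both PORT_PHYS_LINK_STATUS and PORT_PHYS_ENABLE are 64-bit bitmaps indexed by port number, with bits 0 and 63 unused. A sketch of the bit manipulation follows; the names are hypothetical and the functions operate on plain values, leaving the register read and write to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 1-62 correspond to front-panel ports; bits 0 and 63 are unused. */
#define PPORT_BITMAP_MASK 0x7ffffffffffffffeULL

/* Bitmap bit for a front-panel port, or 0 if pport is out of range. */
static inline uint64_t pport_bit(unsigned pport)
{
    return (pport >= 1 && pport <= 62) ? (1ULL << pport) : 0;
}

/* Decode one port from a PORT_PHYS_LINK_STATUS value. */
static inline bool pport_link_up(uint64_t link_status, unsigned pport)
{
    return (link_status & pport_bit(pport)) != 0;
}

/* New PORT_PHYS_ENABLE value with 'pport' enabled. */
static inline uint64_t pport_enable(uint64_t cur, unsigned pport)
{
    return (cur | pport_bit(pport)) & PPORT_BITMAP_MASK;
}

/* New PORT_PHYS_ENABLE value with 'pport' disabled. */
static inline uint64_t pport_disable(uint64_t cur, unsigned pport)
{
    return (cur & ~pport_bit(pport)) & PPORT_BITMAP_MASK;
}
```

A driver would typically read PORT_PHYS_ENABLE, pass the value through `pport_enable()` or `pport_disable()`, and write the result back.
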
452 This section covers switch-wide register settings.
457 This register is used for low level control of the switch::
459 CONTROL: offset 0x0300, 32-bit, (W)
462 ------------------------------------------------------------------------
463 [0] CONTROL_RESET If set, device will perform reset
469 The switch has a SWITCH_ID to be used by software to uniquely identify the
472 SWITCH_ID: offset 0x0320, 64-bit, (R)
474 Value is opaque to switch software and no special encoding is implied.
480 Non-I/O asynchronous events from the device are notified to the host using the
481 event ring. The TLV structure for events is::
483 field width description
484 ---------------------------------------------------
485 TYPE 4 Event type, one of:
488 INFO <nest> Event info (details below)
493 When link status changes on a physical port, this event is generated::
495 field width description
496 ---------------------------------------------------
498 PPORT 4 Physical port
499 LINKUP 1 Link status:
When a packet ingresses on a port and the source MAC/VLAN isn't known to the
device, the device will generate this event. In response to the event, the
driver should install the MAC/VLAN for that port into the device's bridge
table. Once installed, the MAC/VLAN is known on the port and this event will
no longer be generated.
514 field width description
515 ---------------------------------------------------
517 PPORT 4 Physical port
522 CPU Packet Processing
523 =====================
525 Ingress packets directed to the host CPU for further processing are delivered
526 in the DMA RX ring. Likewise, host CPU originating packets destined to egress
527 on switch ports are scheduled by software using the DMA TX ring.
532 Software schedules packets for egress on switch ports using the DMA TX ring. A
533 TX descriptor buffer describes the packet location and size in host DMA-able
534 memory, the destination port, and any hardware-offload functions (such as L3
535 payload checksum offload). Software then bumps the descriptor head to signal
536 hardware of new Tx work. In response, hardware will DMA read Tx descriptors up
537 to head, DMA read descriptor buffer and packet data, perform offloading
538 functions, and finally frame packet on wire (network). Once packet processing
539 is complete, hardware will writeback status to descriptor(s) to signal to
540 software that Tx is complete and software resources (e.g. skb) backing packet
543 Figure 2 shows an example 3-fragment packet queued with one Tx descriptor. A
544 TLV is used for each packet fragment::
551 Tx ring +–––+ +–––––+ | | |
552 +–––––––––+ | | TLVs | +–––––––+ |
553 | +–––+ +––––––––+ pkt frag 2 |
554 | desc 0 | | +–––––+ +–––––––+ |
555 +–––––––––+ | TLVs | +–––+ | |
556 head+–+ | +––––––––+ | | |
557 | desc 1 | | +–––––+ +–––––––+ |pkt
558 +–––––––––+ | TLVs | | |
559 | | +––––––––+ | pkt frag 3 |
561 +–––––––––+ +–––+ | |
574 The TLVs for Tx descriptor buffer are::
576 field width description
577 ---------------------------------------------------------------------
578 PPORT 4 Destination physical port #
579 TX_OFFLOAD 1 Hardware offload modes:
581 1: insert IP csum (ipv4 only)
582 2: insert TCP/UDP csum
583 3: L3 csum calc and insert
584 into csum offset (TX_L3_CSUM_OFF)
585 16-bit 1's complement csum value.
586 IPv4 pseudo-header and IP
587 already calculated by OS
589 4: TSO (TCP Segmentation Offload)
590 TX_L3_CSUM_OFF 2 For L3 csum offload mode, the offset,
591 from the beginning of the packet,
592 of the csum field in the L3 header
593 TX_TSO_MSS 2 For TSO offload mode, the
594 Maximum Segment Size in bytes
595 TX_TSO_HDR_LEN 2 For TSO offload mode, the
596 length of ethernet, IP, and
597 TCP/UDP headers, including IP
599 TX_FRAGS <array> Packet fragments
600 TX_FRAG <nest> Packet fragment
601 TX_FRAG_ADDR 8 DMA address of packet fragment
602 TX_FRAG_LEN 2 Packet fragment length
604 Possible status return codes in descriptor on completion are::
607 --------------------------------------------------------------------
609 -ROCKER_ENXIO address or data read err on desc buf or packet
611 -ROCKER_EINVAL bad pport or TSO or csum offloading error
612 -ROCKER_ENOMEM no memory for internal staging tx fragment
Packets ingressing on switch ports that are not forwarded by the switch, but
rather directed to the host CPU for further processing, are delivered in the
DMA RX ring. Rx descriptor buffers are allocated by software and placed on the
ring. Hardware will fill Rx descriptor buffers with packet data, write the
completion, and signal to software that a new packet is ready. Since Rx packet
size is not known a priori, the Rx descriptor buffer must be allocated for
worst-case packet size. A single Rx descriptor will contain the entire Rx
packet data in one RX_FRAG. Other Rx TLVs describe any hardware offloads
performed on the packet, such as checksum validation.
627 The TLVs for Rx descriptor buffer are::

    field            width  description
    ---------------------------------------------------
    PPORT            4      Source physical port #
    RX_FLAGS         2      Packet parsing flags:
                              (1 << 0): IPv4 packet
                              (1 << 1): IPv6 packet
                              (1 << 2): csum calculated
                              (1 << 3): IPv4 csum good
                              (1 << 4): IP fragment
                              (1 << 7): TCP/UDP csum good
                              (1 << 8): Offload forward
    RX_CSUM          2      IP calculated checksum:
                              IPv4: IP payload csum
                              IPv6: header and payload csum
                            (Only valid if RX_FLAGS:csum calc is set)
    RX_FRAG_ADDR     8      DMA address of packet fragment
    RX_FRAG_MAX_LEN  2      Packet maximum fragment length
    RX_FRAG_LEN      2      Actual packet fragment length after receive

650 Offload forward RX_FLAG indicates the device has already forwarded the packet
651 so the host CPU should not also forward the packet.
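
A sketch of how a driver might interpret RX_FLAGS. The macro and function names are hypothetical; only the bit positions listed in the table above are defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the RX_FLAGS bits listed in the table. */
#define RX_FLAGS_IPV4               (1u << 0)
#define RX_FLAGS_IPV6               (1u << 1)
#define RX_FLAGS_CSUM_CALC          (1u << 2)
#define RX_FLAGS_IPV4_CSUM_GOOD     (1u << 3)
#define RX_FLAGS_IP_FRAG            (1u << 4)
#define RX_FLAGS_TCP_UDP_CSUM_GOOD  (1u << 7)
#define RX_FLAGS_FWD_OFFLOAD        (1u << 8)

/* RX_CSUM is only meaningful when the csum calculated flag is set. */
static inline bool rx_csum_valid(uint16_t flags)
{
    return (flags & RX_FLAGS_CSUM_CALC) != 0;
}

/* The host should not forward a packet the device already forwarded. */
static inline bool rx_host_should_forward(uint16_t flags)
{
    return (flags & RX_FLAGS_FWD_OFFLOAD) == 0;
}
```
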
653 Possible status return codes in descriptor on completion are::
656 --------------------------------------------------------------------
658 -ROCKER_ENXIO address or data read err on desc buf
659 -ROCKER_ENOMEM no memory for internal staging desc buf
660 -ROCKER_EMSGSIZE Rx descriptor buffer wasn't big enough to contain
661 packet data TLV and other TLVs.
667 OF-DPA mode allows the switch to offload flow packet processing functions to
668 hardware. An OpenFlow controller would communicate with an OpenFlow agent
669 installed on the switch. The OpenFlow agent would (directly or indirectly)
670 communicate with the Rocker switch driver, which in turn would program switch
671 hardware with flow functionality, as defined in OF-DPA. The block diagram is::
673 +–––––––––––––––----–––+
675 | Remote Controller |
676 +––––––––+––----–––––––+
To participate in flow functions, ports must be configured for OF-DPA mode
during switch initialization.
695 OF-DPA Flow Table Interface
696 ---------------------------
698 There are commands to add, modify, delete, and get stats of flow table entries.
699 The commands are issued using the DMA CMD descriptor ring. The following
700 commands are defined::
702 CMD_ADD: add an entry to flow table
703 CMD_MOD: modify an entry in flow table
704 CMD_DEL: delete an entry from flow table
705 CMD_GET_STATS: get stats for flow entry
707 TLVs for add and modify commands are::
709 field width description
710 ----------------------------------------------------
711 OF_DPA_CMD 2 CMD_[ADD|MOD]
712 OF_DPA_TBL 2 Flow table ID
717 40: multicast routing
720 OF_DPA_PRIORITY 4 Flow priority
721 OF_DPA_HARDTIME 4 Hard timeout for flow
722 OF_DPA_IDLETIME 4 Idle timeout for flow
723 OF_DPA_COOKIE 8 Cookie
725 Additional TLVs based on flow table ID:
727 Table ID 0: ingress port::
729 field width description
730 ----------------------------------------------------
731 OF_DPA_IN_PPORT 4 ingress physical port number
732 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
736 field width description
737 ----------------------------------------------------
738 OF_DPA_IN_PPORT 4 ingress physical port number
739 OF_DPA_VLAN_ID 2 (N) vlan ID
740 OF_DPA_VLAN_ID_MASK 2 (N) vlan ID mask
741 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
742 OF_DPA_NEW_VLAN_ID 2 (N) new vlan ID
744 Table ID 20: termination mac::
746 field width description
747 ----------------------------------------------------
748 OF_DPA_IN_PPORT 4 ingress physical port number
749 OF_DPA_IN_PPORT_MASK 4 ingress physical port number mask
750 OF_DPA_ETHERTYPE 2 (N) must be either 0x0800 or 0x86dd
751 OF_DPA_DST_MAC 6 (N) destination MAC
752 OF_DPA_DST_MAC_MASK 6 (N) destination MAC mask
753 OF_DPA_VLAN_ID 2 (N) vlan ID
754 OF_DPA_VLAN_ID_MASK 2 (N) vlan ID mask
755 OF_DPA_GOTO_TBL 2 only acceptable values are
756 unicast or multicast routing
758 OF_DPA_OUT_PPORT 2 if specified, must be
759 controller, set zero otherwise
761 Table ID 30: unicast routing::
763 field width description
764 ----------------------------------------------------
765 OF_DPA_ETHERTYPE 2 (N) must be either 0x0800 or 0x86dd
766 OF_DPA_DST_IP 4 (N) destination IPv4 address.
767 Must be unicast address
768 OF_DPA_DST_IP_MASK 4 (N) IP mask. Must be prefix mask
769 OF_DPA_DST_IPV6 16 (N) destination IPv6 address.
770 Must be unicast address
771 OF_DPA_DST_IPV6_MASK 16 (N) IPv6 mask. Must be prefix mask
772 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
773 OF_DPA_GROUP_ID 4 data for GROUP action must
774 be an L3 Unicast group entry
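
Since OF_DPA_DST_IP_MASK must be a prefix mask, software may want to validate a mask before issuing the command. The check below is a sketch with hypothetical function names. It takes the mask in host byte order; the value still has to be converted to network byte order when it is placed in the (N) TLV.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if 'mask' (host byte order) is contiguous ones followed by zeros. */
static inline bool ipv4_is_prefix_mask(uint32_t mask)
{
    uint32_t inv = ~mask;          /* a prefix mask inverts to 2^k - 1 */
    return (inv & (inv + 1)) == 0;
}

/* Number of leading one bits in 'mask' (host byte order). */
static inline int ipv4_prefix_len(uint32_t mask)
{
    int len = 0;

    while (mask & 0x80000000u) {
        len++;
        mask <<= 1;
    }
    return len;
}
```
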
776 Table ID 40: multicast routing::
778 field width description
779 ----------------------------------------------------
780 OF_DPA_ETHERTYPE 2 (N) must be either 0x0800 or 0x86dd
781 OF_DPA_VLAN_ID 2 (N) vlan ID
782 OF_DPA_SRC_IP 4 (N) source IPv4. Optional,
783 can contain IPv4 address,
784 must be completely masked
786 OF_DPA_SRC_IP_MASK 4 (N) IP Mask
787 OF_DPA_DST_IP 4 (N) destination IPv4 address.
788 Must be multicast address
789 OF_DPA_SRC_IPV6 16 (N) source IPv6 Address. Optional.
790 Can contain IPv6 address,
791 must be completely masked
793 OF_DPA_SRC_IPV6_MASK 16 (N) IPv6 mask.
794 OF_DPA_DST_IPV6 16 (N) destination IPv6 Address. Must
796 Must be multicast address
797 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
798 OF_DPA_GROUP_ID 4 data for GROUP action must
799 be an L3 multicast group entry
801 Table ID 50: bridging::
803 field width description
804 ----------------------------------------------------
805 OF_DPA_VLAN_ID 2 (N) vlan ID
806 OF_DPA_TUNNEL_ID 4 tunnel ID
807 OF_DPA_DST_MAC 6 (N) destination MAC
808 OF_DPA_DST_MAC_MASK 6 (N) destination MAC mask
809 OF_DPA_GOTO_TBL 2 goto table ID; zero to drop
810 OF_DPA_GROUP_ID 4 data for GROUP action must
811 be a L2 Interface, L2
813 or L2 Overlay group entry
815 OF_DPA_TUNNEL_LPORT 4 unicast Tenant Bridging
816 flows specify a tunnel
818 OF_DPA_OUT_PPORT 2 data for OUTPUT action,
819 restricted to CONTROLLER,
822 Table ID 60: acl policy::

    field                    width  description
    ----------------------------------------------------
    OF_DPA_IN_PPORT          4      ingress physical port number
    OF_DPA_IN_PPORT_MASK     4      ingress physical port number mask
    OF_DPA_ETHERTYPE         2      (N) ethertype
    OF_DPA_VLAN_ID           2      (N) vlan ID
    OF_DPA_VLAN_ID_MASK      2      (N) vlan ID mask
    OF_DPA_VLAN_PCP          2      (N) vlan Priority Code Point
    OF_DPA_VLAN_PCP_MASK     2      (N) vlan Priority Code Point mask
    OF_DPA_SRC_MAC           6      (N) source MAC
    OF_DPA_SRC_MAC_MASK      6      (N) source MAC mask
    OF_DPA_DST_MAC           6      (N) destination MAC
    OF_DPA_DST_MAC_MASK      6      (N) destination MAC mask
    OF_DPA_TUNNEL_ID         4      tunnel ID
    OF_DPA_SRC_IP            4      (N) source IPv4. Optional,
                                        can contain IPv4 address,
                                        must be completely masked
    OF_DPA_SRC_IP_MASK       4      (N) IP Mask
    OF_DPA_DST_IP            4      (N) destination IPv4 address.
                                        Must be multicast address
    OF_DPA_DST_IP_MASK       4      (N) IP Mask
    OF_DPA_SRC_IPV6          16     (N) source IPv6 Address. Optional.
                                        Can contain IPv6 address,
                                        must be completely masked
    OF_DPA_SRC_IPV6_MASK     16     (N) IPv6 mask
    OF_DPA_DST_IPV6          16     (N) destination IPv6 Address. Must
                                        be multicast address.
    OF_DPA_DST_IPV6_MASK     16     (N) IPv6 mask
    OF_DPA_SRC_ARP_IP        4      (N) source IPv4 address in the ARP
                                        payload. Only used if ethertype
    OF_DPA_SRC_ARP_IP_MASK   4      (N) IP Mask
    OF_DPA_IP_PROTO          1      IP protocol
    OF_DPA_IP_PROTO_MASK     1      IP protocol mask
    OF_DPA_IP_DSCP           1      DSCP
    OF_DPA_IP_DSCP_MASK      1      DSCP mask
    OF_DPA_IP_ECN_MASK       1      ECN mask
    OF_DPA_L4_SRC_PORT       2      (N) L4 source port, only for
    OF_DPA_L4_SRC_PORT_MASK  2      (N) L4 source port mask
    OF_DPA_L4_DST_PORT       2      (N) L4 destination port, only for
    OF_DPA_L4_DST_PORT_MASK  2      (N) L4 destination port mask
    OF_DPA_ICMP_TYPE         1      ICMP type, only if IP
    OF_DPA_ICMP_TYPE_MASK    1      ICMP type mask
    OF_DPA_ICMP_CODE         1      ICMP code
    OF_DPA_ICMP_CODE_MASK    1      ICMP code mask
    OF_DPA_IPV6_LABEL        4      (N) IPv6 flow label
    OF_DPA_IPV6_LABEL_MASK   4      (N) IPv6 flow label mask
    OF_DPA_GROUP_ID          4      data for GROUP action
    OF_DPA_QUEUE_ID_ACTION   1      write the queue ID
    OF_DPA_NEW_QUEUE_ID      1      queue ID
    OF_DPA_VLAN_PCP_ACTION   1      write the VLAN priority
    OF_DPA_NEW_VLAN_PCP      1      VLAN priority
    OF_DPA_IP_DSCP_ACTION    1      write the DSCP
    OF_DPA_NEW_IP_DSCP       1      new DSCP
    OF_DPA_TUNNEL_LPORT      4      restrict to valid tunnel
                                    logical port, set to 0
    OF_DPA_OUT_PPORT         2      data for OUTPUT action,
                                    restricted to CONTROLLER,
    OF_DPA_CLEAR_ACTIONS     4      if 1 packets matching flow are
                                    dropped (all other instructions

TLVs for flow delete and get stats commands are::
896 field width description
897 ---------------------------------------------------
898 OF_DPA_CMD 2 CMD_[DEL|GET_STATS]
899 OF_DPA_COOKIE 8 Cookie
901 On completion of get stats command, the descriptor buffer is written back with
904 field width description
905 ---------------------------------------------------
906 OF_DPA_STAT_DURATION 4 Flow duration
907 OF_DPA_STAT_RX_PKTS 8 Received packets
908 OF_DPA_STAT_TX_PKTS 8 Transmit packets
910 Possible status return codes in descriptor on completion are::
912 DESC_COMP_ERR command reason
913 --------------------------------------------------------------------
915 -ROCKER_EFAULT all head or tail index outside
917 -ROCKER_ENXIO all address or data read err on
919 -ROCKER_EMSGSIZE GET_STATS cmd descriptor buffer wasn't
920 big enough to contain write-back
922 -ROCKER_EINVAL all invalid parameters passed in
923 -ROCKER_EEXIST ADD entry already exists
924 -ROCKER_ENOSPC ADD no space left in flow table
925 -ROCKER_ENOENT MOD|DEL|GET_STATS cookie invalid
927 Group Table Interface
928 ---------------------
930 There are commands to add, modify, delete, and get stats of group table
931 entries. The commands are issued using the DMA CMD descriptor ring. The
932 following commands are defined::
934 CMD_ADD: add an entry to group table
935 CMD_MOD: modify an entry in group table
936 CMD_DEL: delete an entry from group table
937 CMD_GET_STATS: get stats for group entry
939 TLVs for add and modify commands are::
941 field width description
942 -----------------------------------------------------------
943 FLOW_GROUP_CMD 2 CMD_[ADD|MOD]
944 FLOW_GROUP_ID 2 Flow group ID
945 FLOW_GROUP_TYPE 1 Group type:
955 FLOW_VLAN_ID 2 Vlan ID (types 0, 3, 4, 6)
956 FLOW_L2_PORT 2 Port (types 0)
957 FLOW_INDEX 4 Index (all types but 0)
958 FLOW_OVERLAY_TYPE 1 Overlay sub-type (type 8):
959 0: Flood unicast tunnel
960 1: Flood multicast tunnel
961 2: Multicast unicast tunnel
962 3: Multicast multicast tunnel
963 FLOW_GROUP_ACTION nest
964 FLOW_GROUP_ID 2 next group ID in chain (all
966 FLOW_OUT_PORT 4 egress port (types 0, 8)
967 FLOW_POP_VLAN_TAG 1 strip outer VLAN tag (type 1
969 FLOW_VLAN_ID 2 (types 1, 5)
970 FLOW_SRC_MAC 6 (types 1, 2, 5)
971 FLOW_DST_MAC 6 (types 1, 2)
TLVs for group delete and get stats commands are::
975 field width description
976 -----------------------------------------------------------
977 FLOW_GROUP_CMD 2 CMD_[DEL|GET_STATS]
978 FLOW_GROUP_ID 2 Flow group ID
980 On completion of get stats command, the descriptor buffer is written back with
983 field width description
984 ---------------------------------------------------
985 FLOW_GROUP_ID 2 Flow group ID
986 FLOW_STAT_DURATION 4 Flow duration
987 FLOW_STAT_REF_COUNT 4 Flow reference count
988 FLOW_STAT_BUCKET_COUNT 4 Flow bucket count
990 Possible status return codes in descriptor on completion are::

    DESC_COMP_ERR     command            reason
    --------------------------------------------------------------------
    -ROCKER_EFAULT    all                head or tail index outside
    -ROCKER_ENXIO     all                address or data read err on
    -ROCKER_EMSGSIZE  GET_STATS          cmd descriptor buffer wasn't
                                         big enough to contain write-back
    -ROCKER_EINVAL    ADD|MOD            invalid parameters passed in
    -ROCKER_EEXIST    ADD                entry already exists
    -ROCKER_ENOSPC    ADD                no space left in flow table
    -ROCKER_ENOENT    MOD|DEL|GET_STATS  group ID invalid
    -ROCKER_EBUSY     DEL                group reference count non-zero
    -ROCKER_ENODEV    ADD                next group ID doesn't exist

1014 [1] OpenFlow Data Plane Abstraction (OF-DPA) Abstract Switch Specification,
1015 Version 1.0, from Broadcom Corporation, February 21, 2014.