浏览代码

Documentation: Add explanation for XPS using Rx-queue(s) map

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amritha Nambiar 7 年之前
父节点
当前提交
a4fd1f4bab
共有 2 个文件被更改,包括 61 次插入11 次删除
  1. 11 0
      Documentation/ABI/testing/sysfs-class-net-queues
  2. 50 11
      Documentation/networking/scaling.txt

+ 11 - 0
Documentation/ABI/testing/sysfs-class-net-queues

@@ -42,6 +42,17 @@ Description:
 		network device transmit queue. Possible vaules depend on the
 		network device transmit queue. Possible vaules depend on the
 		number of available CPU(s) in the system.
 		number of available CPU(s) in the system.
 
 
+What:		/sys/class/<iface>/queues/tx-<queue>/xps_rxqs
+Date:		June 2018
+KernelVersion:	4.18.0
+Contact:	netdev@vger.kernel.org
+Description:
+		Mask of the receive queue(s) currently enabled to participate
+		into the Transmit Packet Steering packet processing flow for this
+		network device transmit queue. Possible values depend on the
+		number of available receive queue(s) in the network device.
+		Default is disabled.
+
 What:		/sys/class/<iface>/queues/tx-<queue>/byte_queue_limits/hold_time
 What:		/sys/class/<iface>/queues/tx-<queue>/byte_queue_limits/hold_time
 Date:		November 2011
 Date:		November 2011
 KernelVersion:	3.3
 KernelVersion:	3.3

+ 50 - 11
Documentation/networking/scaling.txt

@@ -366,8 +366,13 @@ XPS: Transmit Packet Steering
 
 
 Transmit Packet Steering is a mechanism for intelligently selecting
 Transmit Packet Steering is a mechanism for intelligently selecting
 which transmit queue to use when transmitting a packet on a multi-queue
 which transmit queue to use when transmitting a packet on a multi-queue
-device. To accomplish this, a mapping from CPU to hardware queue(s) is
-recorded. The goal of this mapping is usually to assign queues
+device. This can be accomplished by recording two kinds of maps, either
+a mapping of CPU to hardware queue(s) or a mapping of receive queue(s)
+to hardware transmit queue(s).
+
+1. XPS using CPUs map
+
+The goal of this mapping is usually to assign queues
 exclusively to a subset of CPUs, where the transmit completions for
 exclusively to a subset of CPUs, where the transmit completions for
 these queues are processed on a CPU within this set. This choice
 these queues are processed on a CPU within this set. This choice
 provides two benefits. First, contention on the device queue lock is
 provides two benefits. First, contention on the device queue lock is
@@ -377,15 +382,40 @@ transmit queue). Secondly, cache miss rate on transmit completion is
 reduced, in particular for data cache lines that hold the sk_buff
 reduced, in particular for data cache lines that hold the sk_buff
 structures.
 structures.
 
 
-XPS is configured per transmit queue by setting a bitmap of CPUs that
-may use that queue to transmit. The reverse mapping, from CPUs to
-transmit queues, is computed and maintained for each network device.
-When transmitting the first packet in a flow, the function
-get_xps_queue() is called to select a queue. This function uses the ID
-of the running CPU as a key into the CPU-to-queue lookup table. If the
+2. XPS using receive queues map
+
+This mapping is used to pick transmit queue based on the receive
+queue(s) map configuration set by the administrator. A set of receive
+queues can be mapped to a set of transmit queues (many:many), although
+the common use case is a 1:1 mapping. This will enable sending packets
+on the same queue associations for transmit and receive. This is useful for
+busy polling multi-threaded workloads where there are challenges in
+associating a given CPU to a given application thread. The application
+threads are not pinned to CPUs and each thread handles packets
+received on a single queue. The receive queue number is cached in the
+socket for the connection. In this model, sending the packets on the same
+transmit queue corresponding to the associated receive queue has benefits
+in keeping the CPU overhead low. Transmit completion work is locked into
+the same queue-association that a given application is polling on. This
+avoids the overhead of triggering an interrupt on another CPU. When the
+application cleans up the packets during the busy poll, transmit completion
+may be processed along with it in the same thread context and so result in
+reduced latency.
+
+XPS is configured per transmit queue by setting a bitmap of
+CPUs/receive-queues that may use that queue to transmit. The reverse
+mapping, from CPUs to transmit queues or from receive-queues to transmit
+queues, is computed and maintained for each network device. When
+transmitting the first packet in a flow, the function get_xps_queue() is
+called to select a queue. This function uses the ID of the receive queue
+for the socket connection for a match in the receive queue-to-transmit queue
+lookup table. Alternatively, this function can also use the ID of the
+running CPU as a key into the CPU-to-queue lookup table. If the
 ID matches a single queue, that is used for transmission. If multiple
 ID matches a single queue, that is used for transmission. If multiple
 queues match, one is selected by using the flow hash to compute an index
 queues match, one is selected by using the flow hash to compute an index
-into the set.
+into the set. When selecting the transmit queue based on receive queue(s)
+map, the transmit device is not validated against the receive device as it
+requires expensive lookup operation in the datapath.
 
 
 The queue chosen for transmitting a particular flow is saved in the
 The queue chosen for transmitting a particular flow is saved in the
 corresponding socket structure for the flow (e.g. a TCP connection).
 corresponding socket structure for the flow (e.g. a TCP connection).
@@ -404,11 +434,15 @@ acknowledged.
 
 
 XPS is only available if the kconfig symbol CONFIG_XPS is enabled (on by
 XPS is only available if the kconfig symbol CONFIG_XPS is enabled (on by
 default for SMP). The functionality remains disabled until explicitly
 default for SMP). The functionality remains disabled until explicitly
-configured. To enable XPS, the bitmap of CPUs that may use a transmit
-queue is configured using the sysfs file entry:
+configured. To enable XPS, the bitmap of CPUs/receive-queues that may
+use a transmit queue is configured using the sysfs file entry:
 
 
+For selection based on CPUs map:
 /sys/class/net/<dev>/queues/tx-<n>/xps_cpus
 /sys/class/net/<dev>/queues/tx-<n>/xps_cpus
 
 
+For selection based on receive-queues map:
+/sys/class/net/<dev>/queues/tx-<n>/xps_rxqs
+
 == Suggested Configuration
 == Suggested Configuration
 
 
 For a network device with a single transmission queue, XPS configuration
 For a network device with a single transmission queue, XPS configuration
@@ -421,6 +455,11 @@ best CPUs to share a given queue are probably those that share the cache
 with the CPU that processes transmit completions for that queue
 with the CPU that processes transmit completions for that queue
 (transmit interrupts).
 (transmit interrupts).
 
 
+For transmit queue selection based on receive queue(s), XPS has to be
+explicitly configured mapping receive-queue(s) to transmit queue(s). If the
+user configuration for receive-queue map does not apply, then the transmit
+queue is selected based on the CPUs map.
+
 Per TX Queue rate limitation:
 Per TX Queue rate limitation:
 =============================
 =============================