7 years ago · fd5ff5f6f6
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -249,6 +249,103 @@ Memory Management and Command Submission
 
				 This sections covers all things related to the GEM implementation in the
			
 
				 i915 driver.
			
 
				 
			
 
				+Intel GPU Basics
			
 
				+----------------
			
 
				+
			
 
				+An Intel GPU has multiple engines. There are several engine types.
			
 
				+
			
 
				+- RCS is the engine for 3D rendering and compute; it is named
			
 
				+  `I915_EXEC_RENDER` in user space.
			
 
				+- BCS is a blitting (copy) engine; it is named `I915_EXEC_BLT` in user
			
 
				+  space.
			
 
				+- VCS is a video encode and decode engine; it is named `I915_EXEC_BSD`
			
 
				+  in user space.
			
 
				+- VECS is a video enhancement engine; it is named `I915_EXEC_VEBOX` in user
			
 
				+  space.
			
 
				+- The enumeration `I915_EXEC_DEFAULT` does not refer to a specific engine;
			
 
				+  instead it is to be used by user space to specify a default rendering
			
 
				+  engine (for 3D) that may or may not be the same as RCS.
			
 
				+
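The engine selectors listed above can be illustrated with a small decoder. This is only a sketch: the `SKETCH_*` names are hypothetical, and their values are assumed to mirror the `I915_EXEC_*` selectors in the i915 uapi header; check `include/uapi/drm/i915_drm.h` for the kernel you target.

```c
#include <stdint.h>

/* Hypothetical constants, assumed to mirror I915_EXEC_* in i915_drm.h. */
#define SKETCH_EXEC_RING_MASK 0x3f /* assumed: engine selector lives in the low flag bits */
#define SKETCH_EXEC_DEFAULT   0    /* assumed value of I915_EXEC_DEFAULT */
#define SKETCH_EXEC_RENDER    1    /* assumed value of I915_EXEC_RENDER  */
#define SKETCH_EXEC_BSD       2    /* assumed value of I915_EXEC_BSD     */
#define SKETCH_EXEC_BLT       3    /* assumed value of I915_EXEC_BLT     */
#define SKETCH_EXEC_VEBOX     4    /* assumed value of I915_EXEC_VEBOX   */

/* Map the engine selector found in execbuffer flags to the engine name
 * used in the text above. Bits outside the selector mask are ignored. */
static const char *sketch_engine_name(uint64_t flags)
{
	switch (flags & SKETCH_EXEC_RING_MASK) {
	case SKETCH_EXEC_DEFAULT: return "default";
	case SKETCH_EXEC_RENDER:  return "RCS";
	case SKETCH_EXEC_BSD:     return "VCS";
	case SKETCH_EXEC_BLT:     return "BCS";
	case SKETCH_EXEC_VEBOX:   return "VECS";
	default:                  return "unknown";
	}
}
```

Note that "default" is returned as its own case, since `I915_EXEC_DEFAULT` does not name a specific engine.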
			
 
				+The Intel GPU family is a family of integrated GPUs using Unified
			
 
				+Memory Architecture. To have the GPU "do work", user space will feed the
			
 
				+GPU batch buffers via one of the ioctls `DRM_IOCTL_I915_GEM_EXECBUFFER2`
			
 
				+or `DRM_IOCTL_I915_GEM_EXECBUFFER2_WR`. Most such batchbuffers will
			
 
				+instruct the GPU to perform work (for example rendering) and that work
			
 
				+needs memory from which to read and memory to which to write. All memory
			
 
				+is encapsulated within GEM buffer objects (usually created with the ioctl
			
 
				+`DRM_IOCTL_I915_GEM_CREATE`). An ioctl providing a batchbuffer for the GPU
			
 
				+to execute will also list all GEM buffer objects that the batchbuffer reads
			
 
				+and/or writes. For implementation details of memory management see
			
 
				+`GEM BO Management Implementation Details`_.
			
 
				+
			
 
				+The i915 driver allows user space to create a context via the ioctl
			
 
				+`DRM_IOCTL_I915_GEM_CONTEXT_CREATE` which is identified by a 32-bit
			
 
				+integer. Such a context should be viewed by user space as loosely
			
 
				+analogous to the idea of a CPU process of an operating system. The i915
			
 
				+driver guarantees that commands issued to a fixed context are to be
			
 
				+executed so that writes of a previously issued command are seen by
			
 
				+reads of following commands. Actions issued between different contexts
			
 
				+(even if from the same file descriptor) are NOT given that guarantee
			
 
				+and the only way to synchronize across contexts (even from the same
			
 
				+file descriptor) is through the use of fences. At least as far back as
			
 
				+Gen4, a context also carries with it a GPU HW context;
			
 
				+the HW context is essentially (or at least mostly) the state of a GPU.
			
 
				+In addition to the ordering guarantees, the kernel will restore GPU
			
 
				+state via the HW context when commands are issued to a context; this saves
			
 
				+user space the need to restore (at least most of) the GPU state at the
			
 
				+start of each batchbuffer. The non-deprecated ioctls to submit batchbuffer
			
 
				+work can pass that ID (in the lower bits of drm_i915_gem_execbuffer2::rsvd1)
			
 
				+to identify what context to use with the command.
			
 
				+
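As a rough illustration of how a context ID travels with a submission, the sketch below packs a 32-bit ID into the low bits of an `rsvd1`-like field. The struct and helper names are hypothetical stand-ins, not the real uapi definitions, and the 32-bit mask is an assumption based on the context being identified by a 32-bit integer.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two relevant fields of
 * struct drm_i915_gem_execbuffer2; the real struct has more members. */
struct sketch_execbuffer2 {
	uint64_t flags;
	uint64_t rsvd1;
};

/* Assumption: the context ID occupies the low 32 bits of rsvd1. */
#define SKETCH_CONTEXT_ID_MASK 0xffffffffull

/* Store the context ID in the low bits, leaving the upper bits untouched. */
static void sketch_set_context_id(struct sketch_execbuffer2 *eb, uint32_t ctx_id)
{
	eb->rsvd1 = (eb->rsvd1 & ~SKETCH_CONTEXT_ID_MASK) | ctx_id;
}

/* Read the context ID back out of the low bits. */
static uint32_t sketch_get_context_id(const struct sketch_execbuffer2 *eb)
{
	return (uint32_t)(eb->rsvd1 & SKETCH_CONTEXT_ID_MASK);
}
```

Preserving the upper bits is a choice made for this sketch only; real user space should use whatever helper the uapi header provides for this field.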
			
 
				+The GPU has its own memory management and address space. The kernel
			
 
				+driver maintains the memory translation table for the GPU. For older
			
 
				+GPUs (i.e. those before Gen8), there is a single such translation
			
 
				+table, a global Graphics Translation Table (GTT). For newer generation
			
 
				+GPUs, each context has its own translation table, called the Per-Process
			
 
				+Graphics Translation Table (PPGTT). Note that although
			
 
				+PPGTT is named per-process, it is actually per-context. When user space
			
 
				+submits a batchbuffer, the kernel walks the list of GEM buffer objects
			
 
				+used by the batchbuffer and guarantees that not only is the memory of
			
 
				+each such GEM buffer object resident but it is also present in the
			
 
				+(PP)GTT. If the GEM buffer object is not yet placed in the (PP)GTT,
			
 
				+then it is given an address. Two consequences of this are: the kernel
			
 
				+needs to edit the batchbuffer submitted to write the correct value of
			
 
				+the GPU address when a GEM BO is assigned a GPU address, and the kernel
			
 
				+might evict a different GEM BO from the (PP)GTT to make address room
			
 
				+for another GEM BO. Consequently, the ioctls submitting a batchbuffer
			
 
				+for execution also include a list of all locations within buffers that
			
 
				+refer to GPU-addresses so that the kernel can edit the buffer correctly.
			
 
				+This process is dubbed relocation.
			
 
				+
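The relocation step described above can be modelled in a few lines of user-space C. This is a toy model under stated assumptions, not the driver's implementation: the `sketch_reloc` struct is a simplified, hypothetical version of a relocation entry, addresses are taken to be 32 bits wide, and the batch is treated as an array of 32-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified relocation entry. */
struct sketch_reloc {
	size_t   dword_index; /* which 32-bit word of the batch holds the address */
	uint32_t target;      /* index of the target BO in the object list */
	uint32_t delta;       /* byte offset added to the target's GPU address */
};

/* Patch every listed location in the batch with the GPU address that the
 * target BO was assigned. All entries are validated first so the batch is
 * never left partially patched.
 * Returns the number of relocations applied, or -1 on a bad entry. */
static int sketch_apply_relocs(uint32_t *batch, size_t batch_dwords,
			       const struct sketch_reloc *relocs, size_t nrelocs,
			       const uint32_t *bo_gpu_addr, size_t nbos)
{
	size_t i;

	for (i = 0; i < nrelocs; i++) {
		if (relocs[i].dword_index >= batch_dwords ||
		    relocs[i].target >= nbos)
			return -1;
	}

	for (i = 0; i < nrelocs; i++)
		batch[relocs[i].dword_index] =
			bo_gpu_addr[relocs[i].target] + relocs[i].delta;

	return (int)nrelocs;
}
```

If a BO is later evicted and rebound at a different address, running the same pass again with the updated address table rewrites the same locations, which is why the list of locations has to accompany each submission.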
			
 
				+GEM BO Management Implementation Details
			
 
				+----------------------------------------
			
 
				+
			
 
				+.. kernel-doc:: drivers/gpu/drm/i915/i915_vma.h
			
 
				+   :doc: Virtual Memory Address
			
 
				+
			
 
				+Buffer Object Eviction
			
 
				+----------------------
			
 
				+
			
 
				+This section documents the interface functions for evicting buffer
			
 
				+objects to make space available in the virtual gpu address spaces. Note
			
 
				+that this is mostly orthogonal to shrinking buffer object caches, which
			
 
				+has the goal to make main memory (shared with the gpu through the
			
 
				+unified memory architecture) available.
			
 
				+
			
 
				+.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
			
 
				+   :internal:
			
 
				+
			
 
				+Buffer Object Memory Shrinking
			
 
				+------------------------------
			
 
				+
			
 
				+This section documents the interface function for shrinking memory usage
			
 
				+of buffer object caches. Shrinking is used to make main memory
			
 
				+available. Note that this is mostly orthogonal to evicting buffer
			
 
				+objects, which has the goal to make space in gpu virtual address spaces.
			
 
				+
			
 
				+.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
			
 
				+   :internal:
			
 
				+
			
 
				 Batchbuffer Parsing
			
 
				 -------------------
			
 
				 
			
@@ -312,29 +409,6 @@ Object Tiling IOCTLs
 
				 .. kernel-doc:: drivers/gpu/drm/i915/i915_gem_tiling.c
			
 
				    :doc: buffer object tiling
			
 
				 
			
 
				-Buffer Object Eviction
			
 
				-----------------------
			
 
				-
			
 
				-This section documents the interface functions for evicting buffer
			
 
				-objects to make space available in the virtual gpu address spaces. Note
			
 
				-that this is mostly orthogonal to shrinking buffer objects caches, which
			
 
				-has the goal to make main memory (shared with the gpu through the
			
 
				-unified memory architecture) available.
			
 
				-
			
 
				-.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
			
 
				-   :internal:
			
 
				-
			
 
				-Buffer Object Memory Shrinking
			
 
				-------------------------------
			
 
				-
			
 
				-This section documents the interface function for shrinking memory usage
			
 
				-of buffer object caches. Shrinking is used to make main memory
			
 
				-available. Note that this is mostly orthogonal to evicting buffer
			
 
				-objects, which has the goal to make space in gpu virtual address spaces.
			
 
				-
			
 
				-.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
			
 
				-   :internal:
			
 
				-
			
 
				 WOPCM
			
 
				 =====