@@ -249,6 +249,103 @@ Memory Management and Command Submission
This sections covers all things related to the GEM implementation in the
i915 driver.

+Intel GPU Basics
+----------------
+
+An Intel GPU has multiple engines. There are several engine types.
+
+- The RCS engine is for rendering 3D and performing compute; this is named
+  `I915_EXEC_RENDER` in user space.
+- BCS is a blitting (copy) engine; this is named `I915_EXEC_BLT` in user
+  space.
+- VCS is a video encode and decode engine; this is named `I915_EXEC_BSD`
+  in user space.
+- VECS is a video enhancement engine; this is named `I915_EXEC_VEBOX` in user
+  space.
+- The enumeration `I915_EXEC_DEFAULT` does not refer to a specific engine;
+  instead it is to be used by user space to specify a default rendering
+  engine (for 3D) that may or may not be the same as RCS.
+
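As a rough illustration of the list above, the following C sketch decodes an engine selector from an execbuffer flags word. The `SKETCH_*` constants are local stand-ins that are assumed to mirror the `I915_EXEC_*` engine selectors and ring mask from the i915 uAPI header; real code should include that header rather than redefine them. The helper `sketch_engine_name()` is hypothetical and exists only for this example.

```c
#include <stdint.h>

/* Local stand-ins, assumed to mirror the I915_EXEC_* engine selectors. */
enum sketch_engine {
	SKETCH_EXEC_DEFAULT = 0, /* no specific engine: default rendering engine */
	SKETCH_EXEC_RENDER  = 1, /* RCS: 3D rendering and compute */
	SKETCH_EXEC_BSD     = 2, /* VCS: video encode and decode */
	SKETCH_EXEC_BLT     = 3, /* BCS: blitting (copy) */
	SKETCH_EXEC_VEBOX   = 4, /* VECS: video enhancement */
};

/* Assumed width of the engine selector field in the flags word. */
#define SKETCH_EXEC_RING_MASK 0x3fu

/* Hypothetical helper: name the engine class selected by a flags word. */
static const char *sketch_engine_name(uint64_t flags)
{
	switch (flags & SKETCH_EXEC_RING_MASK) {
	case SKETCH_EXEC_DEFAULT:
		return "default";
	case SKETCH_EXEC_RENDER:
		return "RCS";
	case SKETCH_EXEC_BSD:
		return "VCS";
	case SKETCH_EXEC_BLT:
		return "BCS";
	case SKETCH_EXEC_VEBOX:
		return "VECS";
	default:
		return "unknown";
	}
}
```

Only the low bits covered by the mask select the engine in this sketch, so any other bits set in the flags word are ignored when decoding.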
+The Intel GPU family is a family of integrated GPUs using Unified
+Memory Access. To have the GPU "do work", user space feeds the
+GPU batchbuffers via one of the ioctls `DRM_IOCTL_I915_GEM_EXECBUFFER2`
+or `DRM_IOCTL_I915_GEM_EXECBUFFER2_WR`. Most such batchbuffers will
+instruct the GPU to perform work (for example rendering) and that work
+needs memory from which to read and memory to which to write. All memory
+is encapsulated within GEM buffer objects (usually created with the ioctl
+`DRM_IOCTL_I915_GEM_CREATE`). An ioctl providing a batchbuffer for the GPU
+to execute will also list all GEM buffer objects that the batchbuffer reads
+and/or writes. For implementation details of memory management see
+`GEM BO Management Implementation Details`_.
+
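The idea that a submission names every buffer object the batchbuffer touches can be sketched in plain C. The structures below are simplified, hypothetical stand-ins and not the real uAPI types; placing the batchbuffer last in the list is an assumption made for this example, and no ioctl is actually issued.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one entry of the object list. */
struct sketch_exec_object {
	uint32_t handle; /* GEM buffer object handle */
	uint64_t offset; /* GPU address hint, 0 if unknown */
};

/* Hypothetical, simplified stand-in for the submission request. */
struct sketch_execbuffer {
	const struct sketch_exec_object *buffers;
	uint32_t buffer_count;
	uint64_t flags; /* engine selector in this sketch */
};

/*
 * Fill @objs with every data BO followed by the batchbuffer itself
 * (assumption: batch goes last), then describe the list in @eb.
 * Returns 0 on success, -1 if @objs cannot hold all entries.
 */
static int sketch_build_submission(struct sketch_execbuffer *eb,
				   struct sketch_exec_object *objs, size_t cap,
				   const uint32_t *data_handles, size_t n_data,
				   uint32_t batch_handle, uint64_t engine)
{
	size_t i;

	if (n_data >= cap)
		return -1;

	for (i = 0; i < n_data; i++) {
		objs[i].handle = data_handles[i];
		objs[i].offset = 0;
	}
	objs[n_data].handle = batch_handle;
	objs[n_data].offset = 0;

	eb->buffers = objs;
	eb->buffer_count = (uint32_t)(n_data + 1);
	eb->flags = engine;
	return 0;
}
```

A caller would pass the handles of all buffers the batch reads or writes in `data_handles`; a real submission would then hand the filled request to one of the execbuffer ioctls named above.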
+The i915 driver allows user space to create a context via the ioctl
+`DRM_IOCTL_I915_GEM_CONTEXT_CREATE`; the context is identified by a 32-bit
+integer. Such a context should be viewed by user space as loosely
+analogous to the idea of a CPU process of an operating system. The i915
+driver guarantees that commands issued to a fixed context are
+executed so that writes of a previously issued command are seen by
+reads of following commands. Actions issued between different contexts
+(even if from the same file descriptor) are NOT given that guarantee
+and the only way to synchronize across contexts (even from the same
+file descriptor) is through the use of fences. At least as far back as
+Gen4, a context also carries with it a GPU HW context;
+the HW context is essentially (most of, at least) the state of a GPU.
+In addition to the ordering guarantees, the kernel will restore GPU
+state via the HW context when commands are issued to a context; this saves
+user space the need to restore (most of, at least) the GPU state at the
+start of each batchbuffer. The non-deprecated ioctls to submit batchbuffer
+work can pass that ID (in the lower bits of drm_i915_gem_execbuffer2::rsvd1)
+to identify what context to use with the command.
+
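Storing the context ID in the lower bits of a 64-bit field can be sketched as follows. The mask `SKETCH_CONTEXT_ID_MASK` is a local stand-in that assumes the ID occupies the low 32 bits, matching the 32-bit identifier described above; both helpers are hypothetical, and this sketch chooses to preserve the upper bits, which a real helper may not do.

```c
#include <stdint.h>

/* Assumption: the context ID occupies the low 32 bits of rsvd1. */
#define SKETCH_CONTEXT_ID_MASK 0xffffffffull

/* Hypothetical helper: store @ctx_id in the low bits, keep the rest. */
static inline void sketch_set_context_id(uint64_t *rsvd1, uint32_t ctx_id)
{
	*rsvd1 = (*rsvd1 & ~SKETCH_CONTEXT_ID_MASK) |
		 ((uint64_t)ctx_id & SKETCH_CONTEXT_ID_MASK);
}

/* Hypothetical helper: read the context ID back out of the field. */
static inline uint32_t sketch_get_context_id(uint64_t rsvd1)
{
	return (uint32_t)(rsvd1 & SKETCH_CONTEXT_ID_MASK);
}
```

In this sketch, calling `sketch_set_context_id(&rsvd1, id)` before submission selects the context, and commands submitted with the same ID fall under the ordering guarantee described above.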
+The GPU has its own memory management and address space. The kernel
+driver maintains the memory translation table for the GPU. For older
+GPUs (i.e. those before Gen8), there is a single global such translation
+table, a global Graphics Translation Table (GTT). For newer generation
+GPUs each context has its own translation table, called Per-Process
+Graphics Translation Table (PPGTT). Importantly, although
+PPGTT is named per-process, it is actually per context. When user space
+submits a batchbuffer, the kernel walks the list of GEM buffer objects
+used by the batchbuffer and guarantees that not only is the memory of
+each such GEM buffer object resident but it is also present in the
+(PP)GTT. If the GEM buffer object is not yet placed in the (PP)GTT,
+then it is given an address. Two consequences of this are: the kernel
+needs to edit the batchbuffer submitted to write the correct value of
+the GPU address when a GEM BO is assigned a GPU address, and the kernel
+might evict a different GEM BO from the (PP)GTT to make address room
+for another GEM BO. Consequently, the ioctls submitting a batchbuffer
+for execution also include a list of all locations within buffers that
+refer to GPU addresses so that the kernel can edit the buffer correctly.
+This process is dubbed relocation.
+
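The relocation step can be modelled in user-space C as patching addresses into a byte buffer. Everything here is a simplified, hypothetical stand-in: the structures are not the real uAPI types, addresses are assumed to be 64 bits wide and written in host byte order, and the "kernel" side is just an ordinary function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: a buffer object and the GPU address it was assigned. */
struct sketch_bo {
	uint32_t handle;
	uint64_t gpu_addr;
};

/* Hypothetical: one location in the batch that refers to a GPU address. */
struct sketch_reloc {
	uint32_t target_handle; /* BO whose address is needed */
	uint64_t offset;        /* byte offset of the slot in the batch */
	uint32_t delta;         /* byte offset into the target BO */
};

static const struct sketch_bo *sketch_find_bo(const struct sketch_bo *bos,
					      size_t n_bos, uint32_t handle)
{
	size_t i;

	for (i = 0; i < n_bos; i++)
		if (bos[i].handle == handle)
			return &bos[i];
	return NULL;
}

/*
 * Write the final GPU address of each relocation target into the batch.
 * Returns 0 on success, -1 on an unknown handle or out-of-range slot.
 */
static int sketch_apply_relocs(uint8_t *batch, size_t batch_len,
			       const struct sketch_reloc *relocs,
			       size_t n_relocs,
			       const struct sketch_bo *bos, size_t n_bos)
{
	size_t i;

	for (i = 0; i < n_relocs; i++) {
		const struct sketch_reloc *r = &relocs[i];
		const struct sketch_bo *bo =
			sketch_find_bo(bos, n_bos, r->target_handle);
		uint64_t addr;

		if (!bo)
			return -1;
		if (r->offset > batch_len ||
		    batch_len - r->offset < sizeof(addr))
			return -1;

		addr = bo->gpu_addr + r->delta;
		memcpy(batch + r->offset, &addr, sizeof(addr));
	}
	return 0;
}
```

If a buffer object is later evicted and placed at a different address, running the same relocation list again with the updated `gpu_addr` rewrites the affected slots, which is why the list of locations has to accompany the submission.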
+GEM BO Management Implementation Details
+----------------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_vma.h
+   :doc: Virtual Memory Address
+
+Buffer Object Eviction
+----------------------
+
+This section documents the interface functions for evicting buffer
+objects to make space available in the virtual gpu address spaces. Note
+that this is mostly orthogonal to shrinking buffer objects caches, which
+has the goal to make main memory (shared with the gpu through the
+unified memory architecture) available.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
+   :internal:
+
+Buffer Object Memory Shrinking
+------------------------------
+
+This section documents the interface function for shrinking memory usage
+of buffer object caches. Shrinking is used to make main memory
+available. Note that this is mostly orthogonal to evicting buffer
+objects, which has the goal to make space in gpu virtual address spaces.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
+   :internal:
+
Batchbuffer Parsing
-------------------

@@ -312,29 +409,6 @@ Object Tiling IOCTLs
.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_tiling.c
   :doc: buffer object tiling

-Buffer Object Eviction
-----------------------
-
-This section documents the interface functions for evicting buffer
-objects to make space available in the virtual gpu address spaces. Note
-that this is mostly orthogonal to shrinking buffer objects caches, which
-has the goal to make main memory (shared with the gpu through the
-unified memory architecture) available.
-
-.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
-   :internal:
-
-Buffer Object Memory Shrinking
-------------------------------
-
-This section documents the interface function for shrinking memory usage
-of buffer object caches. Shrinking is used to make main memory
-available. Note that this is mostly orthogonal to evicting buffer
-objects, which has the goal to make space in gpu virtual address spaces.
-
-.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
-   :internal:
-
WOPCM
=====