@@ -17,15 +17,18 @@ CONTENTS
 3. Structural Constraints
   3-1. Top-down
   3-2. No internal tasks
-4. Other Changes
-  4-1. [Un]populated Notification
-  4-2. Other Core Changes
-  4-3. Per-Controller Changes
-    4-3-1. blkio
-    4-3-2. cpuset
-    4-3-3. memory
-5. Planned Changes
-  5-1. CAP for resource control
+4. Delegation
+  4-1. Model of delegation
+  4-2. Common ancestor rule
+5. Other Changes
+  5-1. [Un]populated Notification
+  5-2. Other Core Changes
+  5-3. Per-Controller Changes
+    5-3-1. blkio
+    5-3-2. cpuset
+    5-3-3. memory
+6. Planned Changes
+  6-1. CAP for resource control
 
 
 1. Background
@@ -245,9 +248,72 @@ cgroup must create children and transfer all its tasks to the children
 before enabling controllers in its "cgroup.subtree_control" file.
 
 
-4. Other Changes
+4. Delegation
 
-4-1. [Un]populated Notification
+4-1. Model of delegation
+
+A cgroup can be delegated to a less privileged user by granting write
+access of the directory and its "cgroup.procs" file to the user.  Note
+that the resource control knobs in a given directory concern the
+resources of the parent and thus must not be delegated along with the
+directory.
+
+Once delegated, the user can build sub-hierarchy under the directory,
+organize processes as it sees fit and further distribute the resources
+it got from the parent.  The limits and other settings of all resource
+controllers are hierarchical and regardless of what happens in the
+delegated sub-hierarchy, nothing can escape the resource restrictions
+imposed by the parent.
+
+Currently, cgroup doesn't impose any restrictions on the number of
+cgroups in or nesting depth of a delegated sub-hierarchy; however,
+this may in the future be limited explicitly.
+
+
+4-2. Common ancestor rule
+
+On the unified hierarchy, to write to a "cgroup.procs" file, in
+addition to the usual write permission to the file and uid match, the
+writer must also have write access to the "cgroup.procs" file of the
+common ancestor of the source and destination cgroups.  This prevents
+delegatees from smuggling processes across disjoint sub-hierarchies.
+
+Let's say cgroups C0 and C1 have been delegated to user U0 who created
+C00, C01 under C0 and C10 under C1 as follows.
+
+  ~~~~~~~~~~~~~ - C0 - C00
+  ~ cgroup    ~      \ C01
+  ~ hierarchy ~
+  ~~~~~~~~~~~~~ - C1 - C10
+
+C0 and C1 are separate entities in terms of resource distribution
+regardless of their relative positions in the hierarchy.  The
+resources the processes under C0 are entitled to are controlled by
+C0's ancestors and may be completely different from C1.  It's clear
+that the intention of delegating C0 to U0 is allowing U0 to organize
+the processes under C0 and further control the distribution of C0's
+resources.
+
+On traditional hierarchies, if a task has write access to "tasks" or
+"cgroup.procs" file of a cgroup and its uid agrees with the target, it
+can move the target to the cgroup.  In the above example, U0 will not
+only be able to move processes in each sub-hierarchy but also across
+the two sub-hierarchies, effectively allowing it to violate the
+organizational and resource restrictions implied by the hierarchical
+structure above C0 and C1.
+
+On the unified hierarchy, let's say U0 wants to write the pid of a
+process which has a matching uid and is currently in C10 into
+"C00/cgroup.procs".  U0 obviously has write access to the file and
+migration permission on the process; however, the common ancestor of
+the source cgroup C10 and the destination cgroup C00 is above the
+points of delegation and U0 would not have write access to its
+"cgroup.procs" and thus be denied with -EACCES.
+
+
+5. Other Changes
+
+5-1. [Un]populated Notification
 
 cgroup users often need a way to determine when a cgroup's
 subhierarchy becomes empty so that it can be cleaned up.  cgroup
@@ -289,7 +355,7 @@ supported and the interface files "release_agent" and
 "notify_on_release" do not exist.
 
 
-4-2. Other Core Changes
+5-2. Other Core Changes
 
 - None of the mount options is allowed.
 
@@ -306,14 +372,14 @@ supported and the interface files "release_agent" and
 - The "cgroup.clone_children" file is removed.
 
 
-4-3. Per-Controller Changes
+5-3. Per-Controller Changes
 
-4-3-1. blkio
+5-3-1. blkio
 
 - blk-throttle becomes properly hierarchical.
 
 
-4-3-2. cpuset
+5-3-2. cpuset
 
 - Tasks are kept in empty cpusets after hotplug and take on the masks
   of the nearest non-empty ancestor, instead of being moved to it.
@@ -322,7 +388,7 @@ supported and the interface files "release_agent" and
   masks of the nearest non-empty ancestor.
 
 
-4-3-3. memory
+5-3-3. memory
 
 - use_hierarchy is on by default and the cgroup file for the flag is
   not created.
@@ -407,9 +473,9 @@ supported and the interface files "release_agent" and
 memory.low, memory.high, and memory.max will use the string "max" to
 indicate and set the highest possible value.
 
-5. Planned Changes
+6. Planned Changes
 
-5-1. CAP for resource control
+6-1. CAP for resource control
 
 Unified hierarchy will require one of the capabilities(7), which is
 yet to be decided, for all resource control related knobs.  Process