@@ -0,0 +1,200 @@
+
+On atomic types (atomic_t, atomic64_t and atomic_long_t).
+
+The atomic type provides an interface to the architecture's means of atomic
+RMW operations between CPUs (atomic operations on MMIO are not supported and
+can lead to fatal traps on some platforms).
+
+API
+---
+
+The 'full' API consists of (atomic64_ and atomic_long_ prefixes omitted for
+brevity):
+
+Non-RMW ops:
+
+  atomic_read(), atomic_set()
+  atomic_read_acquire(), atomic_set_release()
+
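+As a quick illustration of how the _acquire/_release variants pair up, here is
+a minimal message-passing sketch (assuming a plain int 'data' and an atomic_t
+'ready', both made-up names):
+
+  /* CPU0 */
+  WRITE_ONCE(data, 1);
+  atomic_set_release(&ready, 1);        /* orders the data store before the flag */
+
+  /* CPU1 */
+  while (!atomic_read_acquire(&ready))  /* pairs with the release store above */
+          cpu_relax();
+  r = READ_ONCE(data);                  /* guaranteed to observe 1 */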
+
+RMW atomic operations:
+
+Arithmetic:
+
+  atomic_{add,sub,inc,dec}()
+  atomic_{add,sub,inc,dec}_return{,_relaxed,_acquire,_release}()
+  atomic_fetch_{add,sub,inc,dec}{,_relaxed,_acquire,_release}()
+
+
+Bitwise:
+
+  atomic_{and,or,xor,andnot}()
+  atomic_fetch_{and,or,xor,andnot}{,_relaxed,_acquire,_release}()
+
+
+Swap:
+
+  atomic_xchg{,_relaxed,_acquire,_release}()
+  atomic_cmpxchg{,_relaxed,_acquire,_release}()
+  atomic_try_cmpxchg{,_relaxed,_acquire,_release}()
+
+
+Reference count (but please see refcount_t):
+
+  atomic_add_unless(), atomic_inc_not_zero()
+  atomic_sub_and_test(), atomic_dec_and_test()
+
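+These support the usual get/put pattern; a rough sketch (the 'obj' structure
+and its 'refs' member are made up, and new code should normally use refcount_t
+instead):
+
+  if (!atomic_inc_not_zero(&obj->refs))  /* get; fails once the count hit 0 */
+          return NULL;
+  /* ... use obj ... */
+  if (atomic_dec_and_test(&obj->refs))   /* put; true for the final reference */
+          kfree(obj);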
+
+Misc:
+
+  atomic_inc_and_test(), atomic_add_negative()
+  atomic_dec_unless_positive(), atomic_inc_unless_negative()
+
+
+Barriers:
+
+  smp_mb__{before,after}_atomic()
+
+
+
+SEMANTICS
+---------
+
+Non-RMW ops:
+
+The non-RMW ops are (typically) regular LOADs and STOREs and are canonically
+implemented using READ_ONCE(), WRITE_ONCE(), smp_load_acquire() and
+smp_store_release() respectively.
+
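+Roughly, the generic versions look like this (a sketch of the common pattern,
+not any one architecture's actual code):
+
+  static __always_inline int atomic_read(const atomic_t *v)
+  {
+          return READ_ONCE(v->counter);
+  }
+
+  static __always_inline void atomic_set(atomic_t *v, int i)
+  {
+          WRITE_ONCE(v->counter, i);
+  }
+
+with the _acquire/_release variants using smp_load_acquire() and
+smp_store_release() on v->counter instead.
+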
+The one detail to this is that atomic_set{}() should be observable to the RMW
+ops. That is:
+
+  C atomic-set
+
+  {
+    atomic_set(v, 1);
+  }
+
+  P1(atomic_t *v)
+  {
+    atomic_add_unless(v, 1, 0);
+  }
+
+  P2(atomic_t *v)
+  {
+    atomic_set(v, 0);
+  }
+
+  exists
+  (v=2)
+
+In this case we would expect the atomic_set() from CPU2 to either happen
+before the atomic_add_unless(), in which case that latter one would no-op, or
+_after_ in which case we'd overwrite its result. In no case is "2" a valid
+outcome.
+
+This is typically true on 'normal' platforms, where a regular competing STORE
+will invalidate a LL/SC or fail a CMPXCHG.
+
+The obvious case where this is not so is when we need to implement atomic ops
+with a lock:
+
+  CPU0                                          CPU1
+
+  atomic_add_unless(v, 1, 0);
+    lock();
+    ret = READ_ONCE(v->counter); // == 1
+                                                atomic_set(v, 0);
+    if (ret != u)
+      WRITE_ONCE(v->counter, ret + 1);
+    unlock();
+
+the typical solution is to then implement atomic_set{}() with atomic_xchg().
+
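+That is, something along these lines (a sketch; the value returned by the xchg
+is simply discarded):
+
+  static inline void atomic_set(atomic_t *v, int i)
+  {
+          (void)atomic_xchg(v, i); /* takes the same lock as the other RMW ops */
+  }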
+
+RMW ops:
+
+These come in various forms:
+
+ - plain operations without return value: atomic_{}()
+
+ - operations which return the modified value: atomic_{}_return()
+
+   these are limited to the arithmetic operations because those are
+   reversible. Bitops are irreversible and therefore the modified value
+   is of dubious utility.
+
+ - operations which return the original value: atomic_fetch_{}()
+
+ - swap operations: xchg(), cmpxchg() and try_cmpxchg()
+
+ - misc; the special purpose operations that are commonly used and would,
+   given the interface, normally be implemented using (try_)cmpxchg loops but
+   are time critical and can, (typically) on LL/SC architectures, be more
+   efficiently implemented.
+
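+The (try_)cmpxchg loops mentioned above typically look like this (a sketch;
+'v' is some atomic_t and func() stands in for whatever computes the new value):
+
+  int old, new, tmp;
+
+  old = atomic_read(&v);
+  for (;;) {
+          new = func(old);
+          tmp = atomic_cmpxchg(&v, old, new);
+          if (tmp == old)
+                  break;
+          old = tmp;
+  }
+
+or, using atomic_try_cmpxchg(), which updates 'old' on failure:
+
+  old = atomic_read(&v);
+  do {
+          new = func(old);
+  } while (!atomic_try_cmpxchg(&v, &old, new));
+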
+All these operations are SMP atomic; that is, the operations (for a single
+atomic variable) can be fully ordered and no intermediate state is lost or
+visible.
+
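+To make the difference between the plain, _return and _fetch forms concrete
+(a throw-away sketch; 'v', 'old' and 'new' are local):
+
+  atomic_t v = ATOMIC_INIT(4);
+  int old, new;
+
+  atomic_inc(&v);               /* v == 5, nothing returned */
+  old = atomic_fetch_inc(&v);   /* old == 5, v == 6 */
+  new = atomic_inc_return(&v);  /* new == 7, v == 7 */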
+
+ORDERING (go read memory-barriers.txt first)
+--------
+
+The rule of thumb:
+
+ - non-RMW operations are unordered;
+
+ - RMW operations that have no return value are unordered;
+
+ - RMW operations that have a return value are fully ordered;
+
+ - RMW operations that are conditional are unordered on FAILURE,
+   otherwise the above rules apply.
+
+Except of course when an operation has an explicit ordering like:
+
+ {}_relaxed: unordered
+ {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE
+ {}_release: the W of the RMW (or atomic_set) is a RELEASE
+
+Where 'unordered' is against other memory locations. Address dependencies are
+not defeated.
+
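+Putting the rules and the suffixes together (annotations only; 'v', 'old',
+'new' and 'ok' are placeholders):
+
+  atomic_inc(&v);                          /* no return value: unordered */
+  old = atomic_inc_return(&v);             /* returns a value: fully ordered */
+  old = atomic_inc_return_relaxed(&v);     /* explicitly unordered */
+  old = atomic_fetch_add_acquire(1, &v);   /* the load half is an ACQUIRE */
+  ok = atomic_try_cmpxchg(&v, &old, new);  /* unordered on failure, fully ordered on success */
+  atomic_set_release(&v, 0);               /* non-RMW, but the store is a RELEASE */
+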
+Fully ordered primitives are ordered against everything prior and everything
+subsequent. Therefore a fully ordered primitive is like having an smp_mb()
+before and an smp_mb() after the primitive.
+
+
+The barriers:
+
+  smp_mb__{before,after}_atomic()
+
+only apply to the RMW ops and can be used to augment/upgrade the ordering
+inherent to the used atomic op. These barriers provide a full smp_mb().
+
+These helper barriers exist because architectures have varying implicit
+ordering on their SMP atomic primitives. For example our TSO architectures
+provide fully ordered atomics and these barriers are no-ops.
+
+Thus:
+
+  atomic_fetch_add();
+
+is equivalent to:
+
+  smp_mb__before_atomic();
+  atomic_fetch_add_relaxed();
+  smp_mb__after_atomic();
+
+However the atomic_fetch_add() might be implemented more efficiently.
+
+Further, while something like:
+
+  smp_mb__before_atomic();
+  atomic_dec(&X);
+
+is a 'typical' RELEASE pattern, the barrier is strictly stronger than
+a RELEASE. Similarly for something like:
+
+