Эх сурвалжийг харах

Merge branch 'for-next/cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into cpuidle/3.18

These are the specific changes for ARM64 to make it possible to integrate the
DT based generic cpuidle driver in this tree.

It contains:
  * The documentation for the DT definitions for ARM
  * The refactoring of the cpu_suspend function for ARM64
  * Introduce the cpu_idle_init function for ARM64
  * Add the PSCI CPU SUSPEND based on the previous changes on cpu_suspend

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Daniel Lezcano 11 жил өмнө
parent
commit
f4ea5332c8

+ 8 - 0
Documentation/devicetree/bindings/arm/cpus.txt

@@ -219,6 +219,12 @@ nodes to be present and contain the properties described below.
 		Value type: <phandle>
 		Value type: <phandle>
 		Definition: Specifies the ACC[2] node associated with this CPU.
 		Definition: Specifies the ACC[2] node associated with this CPU.
 
 
+	- cpu-idle-states
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition:
+			# List of phandles to idle state nodes supported
+			  by this cpu [3].
 
 
 Example 1 (dual-cluster big.LITTLE system 32-bit):
 Example 1 (dual-cluster big.LITTLE system 32-bit):
 
 
@@ -415,3 +421,5 @@ cpus {
 --
 --
 [1] arm/msm/qcom,saw2.txt
 [1] arm/msm/qcom,saw2.txt
 [2] arm/msm/qcom,kpss-acc.txt
 [2] arm/msm/qcom,kpss-acc.txt
+[3] ARM Linux kernel documentation - idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt

+ 679 - 0
Documentation/devicetree/bindings/arm/idle-states.txt

@@ -0,0 +1,679 @@
+==========================================
+ARM idle states binding description
+==========================================
+
+==========================================
+1 - Introduction
+==========================================
+
+ARM systems contain HW capable of managing power consumption dynamically,
+where cores can be put in different low-power states (ranging from simple
+wfi to power gating) according to OS PM policies. The CPU states representing
+the range of dynamic idle states that a processor can enter at run-time, can be
+specified through device tree bindings representing the parameters required
+to enter/exit specific idle states on a given processor.
+
+According to the Server Base System Architecture document (SBSA, [3]), the
+power states an ARM CPU can be put into are identified by the following list:
+
+- Running
+- Idle_standby
+- Idle_retention
+- Sleep
+- Off
+
+The power states described in the SBSA document define the basic CPU states on
+top of which ARM platforms implement power management schemes that allow an OS
+PM implementation to put the processor in different idle states (which include
+states listed above; "off" state is not an idle state since it does not have
+wake-up capabilities, hence it is not considered in this document).
+
+Idle state parameters (eg entry latency) are platform specific and need to be
+characterized with bindings that provide the required information to OS PM
+code so that it can build the required tables and use them at runtime.
+
+The device tree binding definition for ARM idle states is the subject of this
+document.
+
+===========================================
+2 - idle-states definitions
+===========================================
+
+Idle states are characterized for a specific system through a set of
+timing and energy related properties, that underline the HW behaviour
+triggered upon idle states entry and exit.
+
+The following diagram depicts the CPU execution phases and related timing
+properties required to enter and exit an idle state:
+
+..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__..
+	    |          |           |          |          |
+
+	    |<------ entry ------->|
+	    |       latency        |
+					      |<- exit ->|
+					      |  latency |
+	    |<-------- min-residency -------->|
+		       |<-------  wakeup-latency ------->|
+
+		Diagram 1: CPU idle state execution phases
+
+EXEC:	Normal CPU execution.
+
+PREP:	Preparation phase before committing the hardware to idle mode
+	like cache flushing. This is abortable on pending wake-up
+	event conditions. The abort latency is assumed to be negligible
+	(i.e. less than the ENTRY + EXIT duration). If aborted, CPU
+	goes back to EXEC. This phase is optional. If not abortable,
+	this should be included in the ENTRY phase instead.
+
+ENTRY:	The hardware is committed to idle mode. This period must run
+	to completion up to IDLE before anything else can happen.
+
+IDLE:	This is the actual energy-saving idle period. This may last
+	between 0 and infinite time, until a wake-up event occurs.
+
+EXIT:	Period during which the CPU is brought back to operational
+	mode (EXEC).
+
+entry-latency: Worst case latency required to enter the idle state. The
+exit-latency may be guaranteed only after entry-latency has passed.
+
+min-residency: Minimum period, including preparation and entry, for a given
+idle state to be worthwhile energywise.
+
+wakeup-latency: Maximum delay between the signaling of a wake-up event and the
+CPU being able to execute normal code again. If not specified, this is assumed
+to be entry-latency + exit-latency.
+
+These timing parameters can be used by an OS in different circumstances.
+
+An idle CPU requires the expected min-residency time to select the most
+appropriate idle state based on the expected expiry time of the next IRQ
+(ie wake-up) that causes the CPU to return to the EXEC phase.
+
+An operating system scheduler may need to compute the shortest wake-up delay
+for CPUs in the system by detecting how long will it take to get a CPU out
+of an idle state, eg:
+
+wakeup-delay = exit-latency + max(entry-latency - (now - entry-timestamp), 0)
+
+In other words, the scheduler can make its scheduling decision by selecting
+(eg waking-up) the CPU with the shortest wake-up latency.
+The wake-up latency must take into account the entry latency if that period
+has not expired. The abortable nature of the PREP period can be ignored
+if it cannot be relied upon (e.g. the PREP deadline may occur much sooner than
+the worst case since it depends on the CPU operating conditions, ie caches
+state).
+
+An OS has to reliably probe the wakeup-latency since some devices can enforce
+latency constraints guarantees to work properly, so the OS has to detect the
+worst case wake-up latency it can incur if a CPU is allowed to enter an
+idle state, and possibly to prevent that to guarantee reliable device
+functioning.
+
+The min-residency time parameter deserves further explanation since it is
+expressed in time units but must factor in energy consumption coefficients.
+
+The energy consumption of a cpu when it enters a power state can be roughly
+characterised by the following graph:
+
+               |
+               |
+               |
+           e   |
+           n   |                                      /---
+           e   |                               /------
+           r   |                        /------
+           g   |                  /-----
+           y   |           /------
+               |       ----
+               |      /|
+               |     / |
+               |    /  |
+               |   /   |
+               |  /    |
+               | /     |
+               |/      |
+          -----|-------+----------------------------------
+              0|       1                              time(ms)
+
+		Graph 1: Energy vs time example
+
+The graph is split in two parts delimited by time 1ms on the X-axis.
+The graph curve with X-axis values = { x | 0 < x < 1ms } has a steep slope
+and denotes the energy costs incurred whilst entering and leaving the idle
+state.
+The graph curve in the area delimited by X-axis values = {x | x > 1ms } has
+shallower slope and essentially represents the energy consumption of the idle
+state.
+
+min-residency is defined for a given idle state as the minimum expected
+residency time for a state (inclusive of preparation and entry) after
+which choosing that state become the most energy efficient option. A good
+way to visualise this, is by taking the same graph above and comparing some
+states energy consumptions plots.
+
+For sake of simplicity, let's consider a system with two idle states IDLE1,
+and IDLE2:
+
+          |
+          |
+          |
+          |                                                  /-- IDLE1
+       e  |                                              /---
+       n  |                                         /----
+       e  |                                     /---
+       r  |                                /-----/--------- IDLE2
+       g  |                    /-------/---------
+       y  |        ------------    /---|
+          |       /           /----    |
+          |      /        /---         |
+          |     /    /----             |
+          |    / /---                  |
+          |   ---                      |
+          |  /                         |
+          | /                          |
+          |/                           |                  time
+       ---/----------------------------+------------------------
+          |IDLE1-energy < IDLE2-energy | IDLE2-energy < IDLE1-energy
+                                       |
+                                IDLE2-min-residency
+
+		Graph 2: idle states min-residency example
+
+In graph 2 above, that takes into account idle states entry/exit energy
+costs, it is clear that if the idle state residency time (ie time till next
+wake-up IRQ) is less than IDLE2-min-residency, IDLE1 is the better idle state
+choice energywise.
+
+This is mainly down to the fact that IDLE1 entry/exit energy costs are lower
+than IDLE2.
+
+However, the lower power consumption (ie shallower energy curve slope) of idle
+state IDLE2 implies that after a suitable time, IDLE2 becomes more energy
+efficient.
+
+The time at which IDLE2 becomes more energy efficient than IDLE1 (and other
+shallower states in a system with multiple idle states) is defined
+IDLE2-min-residency and corresponds to the time when energy consumption of
+IDLE1 and IDLE2 states breaks even.
+
+The definitions provided in this section underpin the idle states
+properties specification that is the subject of the following sections.
+
+===========================================
+3 - idle-states node
+===========================================
+
+ARM processor idle states are defined within the idle-states node, which is
+a direct child of the cpus node [1] and provides a container where the
+processor idle states, defined as device tree nodes, are listed.
+
+- idle-states node
+
+	Usage: Optional - On ARM systems, it is a container of processor idle
+			  states nodes. If the system does not provide CPU
+			  power management capabilities or the processor just
+			  supports idle_standby an idle-states node is not
+			  required.
+
+	Description: idle-states node is a container node, where its
+		     subnodes describe the CPU idle states.
+
+	Node name must be "idle-states".
+
+	The idle-states node's parent node must be the cpus node.
+
+	The idle-states node's child nodes can be:
+
+	- one or more state nodes
+
+	Any other configuration is considered invalid.
+
+	An idle-states node defines the following properties:
+
+	- entry-method
+		Value type: <stringlist>
+		Usage and definition depend on ARM architecture version.
+			# On ARM v8 64-bit this property is required and must
+			  be one of:
+			   - "psci" (see bindings in [2])
+			# On ARM 32-bit systems this property is optional
+
+The nodes describing the idle states (state) can only be defined within the
+idle-states node, any other configuration is considered invalid and therefore
+must be ignored.
+
+===========================================
+4 - state node
+===========================================
+
+A state node represents an idle state description and must be defined as
+follows:
+
+- state node
+
+	Description: must be child of the idle-states node
+
+	The state node name shall follow standard device tree naming
+	rules ([5], 2.2.1 "Node names"), in particular state nodes which
+	are siblings within a single common parent must be given a unique name.
+
+	The idle state entered by executing the wfi instruction (idle_standby
+	SBSA,[3][4]) is considered standard on all ARM platforms and therefore
+	must not be listed.
+
+	With the definitions provided above, the following list represents
+	the valid properties for a state node:
+
+	- compatible
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Must be "arm,idle-state".
+
+	- local-timer-stop
+		Usage: See definition
+		Value type: <none>
+		Definition: if present the CPU local timer control logic is
+			    lost on state entry, otherwise it is retained.
+
+	- entry-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency in
+			    microseconds required to enter the idle state.
+			    The exit-latency-us duration may be guaranteed
+			    only after entry-latency-us has passed.
+
+	- exit-latency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to exit the idle state.
+
+	- min-residency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing minimum residency duration
+			    in microseconds, inclusive of preparation and
+			    entry, for this idle state to be considered
+			    worthwhile energy wise (refer to section 2 of
+			    this document for a complete description).
+
+	- wakeup-latency-us:
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing maximum delay between the
+			    signaling of a wake-up event and the CPU being
+			    able to execute normal code again. If omitted,
+			    this is assumed to be equal to:
+
+				entry-latency-us + exit-latency-us
+
+			    It is important to supply this value on systems
+			    where the duration of PREP phase (see diagram 1,
+			    section 2) is non-neglibigle.
+			    In such systems entry-latency-us + exit-latency-us
+			    will exceed wakeup-latency-us by this duration.
+
+	In addition to the properties listed above, a state node may require
+	additional properties specifics to the entry-method defined in the
+	idle-states node, please refer to the entry-method bindings
+	documentation for properties definitions.
+
+===========================================
+4 - Examples
+===========================================
+
+Example 1 (ARM 64-bit, 16-cpu system, PSCI enable-method):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <2>;
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@10000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU5: cpu@10001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU6: cpu@10100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU7: cpu@10101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU8: cpu@100000000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU9: cpu@100000001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU10: cpu@100000100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU11: cpu@100000101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU12: cpu@100010000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU13: cpu@100010001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU14: cpu@100010100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU15: cpu@100010101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_RETENTION_0_0: cpu-retention-0-0 {
+			compatible = "arm,idle-state";
+			arm,psci-suspend-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <80>;
+		};
+
+		CLUSTER_RETENTION_0: cluster-retention-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <250>;
+			wakeup-latency-us = <130>;
+		};
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x0010000>;
+			entry-latency-us = <250>;
+			exit-latency-us = <500>;
+			min-residency-us = <950>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x1010000>;
+			entry-latency-us = <600>;
+			exit-latency-us = <1100>;
+			min-residency-us = <2700>;
+			wakeup-latency-us = <1500>;
+		};
+
+		CPU_RETENTION_1_0: cpu-retention-1-0 {
+			compatible = "arm,idle-state";
+			arm,psci-suspend-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <90>;
+		};
+
+		CLUSTER_RETENTION_1: cluster-retention-1 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <270>;
+			wakeup-latency-us = <100>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x0010000>;
+			entry-latency-us = <70>;
+			exit-latency-us = <100>;
+			min-residency-us = <300>;
+			wakeup-latency-us = <150>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			arm,psci-suspend-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1200>;
+			min-residency-us = <3500>;
+			wakeup-latency-us = <1300>;
+		};
+	};
+
+};
+
+Example 2 (ARM 32-bit, 8-cpu system, two clusters):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <1>;
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x0>;
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x1>;
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@2 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x2>;
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@3 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x3>;
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x100>;
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU5: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x101>;
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU6: cpu@102 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x102>;
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU7: cpu@103 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x103>;
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	idle-states {
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			entry-latency-us = <200>;
+			exit-latency-us = <100>;
+			min-residency-us = <400>;
+			wakeup-latency-us = <250>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			entry-latency-us = <500>;
+			exit-latency-us = <1500>;
+			min-residency-us = <2500>;
+			wakeup-latency-us = <1700>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			entry-latency-us = <300>;
+			exit-latency-us = <500>;
+			min-residency-us = <900>;
+			wakeup-latency-us = <600>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			local-timer-stop;
+			entry-latency-us = <800>;
+			exit-latency-us = <2000>;
+			min-residency-us = <6500>;
+			wakeup-latency-us = <2300>;
+		};
+	};
+
+};
+
+===========================================
+5 - References
+===========================================
+
+[1] ARM Linux Kernel documentation - CPUs bindings
+    Documentation/devicetree/bindings/arm/cpus.txt
+
+[2] ARM Linux Kernel documentation - PSCI bindings
+    Documentation/devicetree/bindings/arm/psci.txt
+
+[3] ARM Server Base System Architecture (SBSA)
+    http://infocenter.arm.com/help/index.jsp
+
+[4] ARM Architecture Reference Manuals
+    http://infocenter.arm.com/help/index.jsp
+
+[5] ePAPR standard
+    https://www.power.org/documentation/epapr-version-1-1/

+ 13 - 1
Documentation/devicetree/bindings/arm/psci.txt

@@ -50,6 +50,16 @@ Main node optional properties:
 
 
  - migrate       : Function ID for MIGRATE operation
  - migrate       : Function ID for MIGRATE operation
 
 
+Device tree nodes that require usage of PSCI CPU_SUSPEND function (ie idle
+state nodes, as per bindings in [1]) must specify the following properties:
+
+- arm,psci-suspend-param
+		Usage: Required for state nodes[1] if the corresponding
+                       idle-states node entry-method property is set
+                       to "psci".
+		Value type: <u32>
+		Definition: power_state parameter to pass to the PSCI
+			    suspend call.
 
 
 Example:
 Example:
 
 
@@ -64,7 +74,6 @@ Case 1: PSCI v0.1 only.
 		migrate		= <0x95c10003>;
 		migrate		= <0x95c10003>;
 	};
 	};
 
 
-
 Case 2: PSCI v0.2 only
 Case 2: PSCI v0.2 only
 
 
 	psci {
 	psci {
@@ -88,3 +97,6 @@ Case 3: PSCI v0.2 and PSCI v0.1.
 
 
 		...
 		...
 	};
 	};
+
+[1] Kernel documentation - ARM idle states bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt

+ 3 - 0
arch/arm64/include/asm/cpu_ops.h

@@ -28,6 +28,8 @@ struct device_node;
  *		enable-method property.
  *		enable-method property.
  * @cpu_init:	Reads any data necessary for a specific enable-method from the
  * @cpu_init:	Reads any data necessary for a specific enable-method from the
  *		devicetree, for a given cpu node and proposed logical id.
  *		devicetree, for a given cpu node and proposed logical id.
+ * @cpu_init_idle: Reads any data necessary to initialize CPU idle states from
+ *		devicetree, for a given cpu node and proposed logical id.
  * @cpu_prepare: Early one-time preparation step for a cpu. If there is a
  * @cpu_prepare: Early one-time preparation step for a cpu. If there is a
  *		mechanism for doing so, tests whether it is possible to boot
  *		mechanism for doing so, tests whether it is possible to boot
  *		the given CPU.
  *		the given CPU.
@@ -47,6 +49,7 @@ struct device_node;
 struct cpu_operations {
 struct cpu_operations {
 	const char	*name;
 	const char	*name;
 	int		(*cpu_init)(struct device_node *, unsigned int);
 	int		(*cpu_init)(struct device_node *, unsigned int);
+	int		(*cpu_init_idle)(struct device_node *, unsigned int);
 	int		(*cpu_prepare)(unsigned int);
 	int		(*cpu_prepare)(unsigned int);
 	int		(*cpu_boot)(unsigned int);
 	int		(*cpu_boot)(unsigned int);
 	void		(*cpu_postboot)(void);
 	void		(*cpu_postboot)(void);

+ 13 - 0
arch/arm64/include/asm/cpuidle.h

@@ -0,0 +1,13 @@
+#ifndef __ASM_CPUIDLE_H
+#define __ASM_CPUIDLE_H
+
+#ifdef CONFIG_CPU_IDLE
+extern int cpu_init_idle(unsigned int cpu);
+#else
+static inline int cpu_init_idle(unsigned int cpu)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
+#endif

+ 1 - 0
arch/arm64/include/asm/suspend.h

@@ -21,6 +21,7 @@ struct sleep_save_sp {
 	phys_addr_t save_ptr_stash_phys;
 	phys_addr_t save_ptr_stash_phys;
 };
 };
 
 
+extern int __cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
 extern void cpu_resume(void);
 extern int cpu_suspend(unsigned long);
 extern int cpu_suspend(unsigned long);
 
 

+ 1 - 0
arch/arm64/kernel/Makefile

@@ -26,6 +26,7 @@ arm64-obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o
 arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
 arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
 arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
 arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
 arm64-obj-$(CONFIG_ARM64_CPU_SUSPEND)	+= sleep.o suspend.o
 arm64-obj-$(CONFIG_ARM64_CPU_SUSPEND)	+= sleep.o suspend.o
+arm64-obj-$(CONFIG_CPU_IDLE)		+= cpuidle.o
 arm64-obj-$(CONFIG_JUMP_LABEL)		+= jump_label.o
 arm64-obj-$(CONFIG_JUMP_LABEL)		+= jump_label.o
 arm64-obj-$(CONFIG_KGDB)		+= kgdb.o
 arm64-obj-$(CONFIG_KGDB)		+= kgdb.o
 arm64-obj-$(CONFIG_EFI)			+= efi.o efi-stub.o efi-entry.o
 arm64-obj-$(CONFIG_EFI)			+= efi.o efi-stub.o efi-entry.o

+ 31 - 0
arch/arm64/kernel/cpuidle.c

@@ -0,0 +1,31 @@
+/*
+ * ARM64 CPU idle arch support
+ *
+ * Copyright (C) 2014 ARM Ltd.
+ * Author: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/of.h>
+#include <linux/of_device.h>
+
+#include <asm/cpuidle.h>
+#include <asm/cpu_ops.h>
+
+int cpu_init_idle(unsigned int cpu)
+{
+	int ret = -EOPNOTSUPP;
+	struct device_node *cpu_node = of_cpu_device_node_get(cpu);
+
+	if (!cpu_node)
+		return -ENODEV;
+
+	if (cpu_ops[cpu] && cpu_ops[cpu]->cpu_init_idle)
+		ret = cpu_ops[cpu]->cpu_init_idle(cpu_node, cpu);
+
+	of_node_put(cpu_node);
+	return ret;
+}

+ 104 - 0
arch/arm64/kernel/psci.c

@@ -21,6 +21,7 @@
 #include <linux/reboot.h>
 #include <linux/reboot.h>
 #include <linux/pm.h>
 #include <linux/pm.h>
 #include <linux/delay.h>
 #include <linux/delay.h>
+#include <linux/slab.h>
 #include <uapi/linux/psci.h>
 #include <uapi/linux/psci.h>
 
 
 #include <asm/compiler.h>
 #include <asm/compiler.h>
@@ -28,6 +29,7 @@
 #include <asm/errno.h>
 #include <asm/errno.h>
 #include <asm/psci.h>
 #include <asm/psci.h>
 #include <asm/smp_plat.h>
 #include <asm/smp_plat.h>
+#include <asm/suspend.h>
 #include <asm/system_misc.h>
 #include <asm/system_misc.h>
 
 
 #define PSCI_POWER_STATE_TYPE_STANDBY		0
 #define PSCI_POWER_STATE_TYPE_STANDBY		0
@@ -65,6 +67,8 @@ enum psci_function {
 	PSCI_FN_MAX,
 	PSCI_FN_MAX,
 };
 };
 
 
+static DEFINE_PER_CPU_READ_MOSTLY(struct psci_power_state *, psci_power_state);
+
 static u32 psci_function_id[PSCI_FN_MAX];
 static u32 psci_function_id[PSCI_FN_MAX];
 
 
 static int psci_to_linux_errno(int errno)
 static int psci_to_linux_errno(int errno)
@@ -93,6 +97,18 @@ static u32 psci_power_state_pack(struct psci_power_state state)
 		 & PSCI_0_2_POWER_STATE_AFFL_MASK);
 		 & PSCI_0_2_POWER_STATE_AFFL_MASK);
 }
 }
 
 
+static void psci_power_state_unpack(u32 power_state,
+				    struct psci_power_state *state)
+{
+	state->id = (power_state & PSCI_0_2_POWER_STATE_ID_MASK) >>
+			PSCI_0_2_POWER_STATE_ID_SHIFT;
+	state->type = (power_state & PSCI_0_2_POWER_STATE_TYPE_MASK) >>
+			PSCI_0_2_POWER_STATE_TYPE_SHIFT;
+	state->affinity_level =
+			(power_state & PSCI_0_2_POWER_STATE_AFFL_MASK) >>
+			PSCI_0_2_POWER_STATE_AFFL_SHIFT;
+}
+
 /*
 /*
  * The following two functions are invoked via the invoke_psci_fn pointer
  * The following two functions are invoked via the invoke_psci_fn pointer
  * and will not be inlined, allowing us to piggyback on the AAPCS.
  * and will not be inlined, allowing us to piggyback on the AAPCS.
@@ -199,6 +215,63 @@ static int psci_migrate_info_type(void)
 	return err;
 	return err;
 }
 }
 
 
+static int __maybe_unused cpu_psci_cpu_init_idle(struct device_node *cpu_node,
+						 unsigned int cpu)
+{
+	int i, ret, count = 0;
+	struct psci_power_state *psci_states;
+	struct device_node *state_node;
+
+	/*
+	 * If the PSCI cpu_suspend function hook has not been initialized
+	 * idle states must not be enabled, so bail out
+	 */
+	if (!psci_ops.cpu_suspend)
+		return -EOPNOTSUPP;
+
+	/* Count idle states */
+	while ((state_node = of_parse_phandle(cpu_node, "cpu-idle-states",
+					      count))) {
+		count++;
+		of_node_put(state_node);
+	}
+
+	if (!count)
+		return -ENODEV;
+
+	psci_states = kcalloc(count, sizeof(*psci_states), GFP_KERNEL);
+	if (!psci_states)
+		return -ENOMEM;
+
+	for (i = 0; i < count; i++) {
+		u32 psci_power_state;
+
+		state_node = of_parse_phandle(cpu_node, "cpu-idle-states", i);
+
+		ret = of_property_read_u32(state_node,
+					   "arm,psci-suspend-param",
+					   &psci_power_state);
+		if (ret) {
+			pr_warn(" * %s missing arm,psci-suspend-param property\n",
+				state_node->full_name);
+			of_node_put(state_node);
+			goto free_mem;
+		}
+
+		of_node_put(state_node);
+		pr_debug("psci-power-state %#x index %d\n", psci_power_state,
+							    i);
+		psci_power_state_unpack(psci_power_state, &psci_states[i]);
+	}
+	/* Idle states parsed correctly, initialize per-cpu pointer */
+	per_cpu(psci_power_state, cpu) = psci_states;
+	return 0;
+
+free_mem:
+	kfree(psci_states);
+	return ret;
+}
+
 static int get_set_conduit_method(struct device_node *np)
 static int get_set_conduit_method(struct device_node *np)
 {
 {
 	const char *method;
 	const char *method;
@@ -436,8 +509,39 @@ static int cpu_psci_cpu_kill(unsigned int cpu)
 #endif
 #endif
 #endif
 #endif
 
 
+static int psci_suspend_finisher(unsigned long index)
+{
+	struct psci_power_state *state = __get_cpu_var(psci_power_state);
+
+	return psci_ops.cpu_suspend(state[index - 1],
+				    virt_to_phys(cpu_resume));
+}
+
+static int __maybe_unused cpu_psci_cpu_suspend(unsigned long index)
+{
+	int ret;
+	struct psci_power_state *state = __get_cpu_var(psci_power_state);
+	/*
+	 * idle state index 0 corresponds to wfi, should never be called
+	 * from the cpu_suspend operations
+	 */
+	if (WARN_ON_ONCE(!index))
+		return -EINVAL;
+
+	if (state->type == PSCI_POWER_STATE_TYPE_STANDBY)
+		ret = psci_ops.cpu_suspend(state[index - 1], 0);
+	else
+		ret = __cpu_suspend(index, psci_suspend_finisher);
+
+	return ret;
+}
+
 const struct cpu_operations cpu_psci_ops = {
 const struct cpu_operations cpu_psci_ops = {
 	.name		= "psci",
 	.name		= "psci",
+#ifdef CONFIG_CPU_IDLE
+	.cpu_init_idle	= cpu_psci_cpu_init_idle,
+	.cpu_suspend	= cpu_psci_cpu_suspend,
+#endif
 #ifdef CONFIG_SMP
 #ifdef CONFIG_SMP
 	.cpu_init	= cpu_psci_cpu_init,
 	.cpu_init	= cpu_psci_cpu_init,
 	.cpu_prepare	= cpu_psci_cpu_prepare,
 	.cpu_prepare	= cpu_psci_cpu_prepare,

+ 34 - 13
arch/arm64/kernel/sleep.S

@@ -49,28 +49,39 @@
 	orr	\dst, \dst, \mask		// dst|=(aff3>>rs3)
 	orr	\dst, \dst, \mask		// dst|=(aff3>>rs3)
 	.endm
 	.endm
 /*
 /*
- * Save CPU state for a suspend.  This saves callee registers, and allocates
- * space on the kernel stack to save the CPU specific registers + some
- * other data for resume.
+ * Save CPU state for a suspend and execute the suspend finisher.
+ * On success it will return 0 through cpu_resume - ie through a CPU
+ * soft/hard reboot from the reset vector.
+ * On failure it returns the suspend finisher return value or force
+ * -EOPNOTSUPP if the finisher erroneously returns 0 (the suspend finisher
+ * is not allowed to return, if it does this must be considered failure).
+ * It saves callee registers, and allocates space on the kernel stack
+ * to save the CPU specific registers + some other data for resume.
  *
  *
  *  x0 = suspend finisher argument
  *  x0 = suspend finisher argument
+ *  x1 = suspend finisher function pointer
  */
  */
-ENTRY(__cpu_suspend)
+ENTRY(__cpu_suspend_enter)
 	stp	x29, lr, [sp, #-96]!
 	stp	x29, lr, [sp, #-96]!
 	stp	x19, x20, [sp,#16]
 	stp	x19, x20, [sp,#16]
 	stp	x21, x22, [sp,#32]
 	stp	x21, x22, [sp,#32]
 	stp	x23, x24, [sp,#48]
 	stp	x23, x24, [sp,#48]
 	stp	x25, x26, [sp,#64]
 	stp	x25, x26, [sp,#64]
 	stp	x27, x28, [sp,#80]
 	stp	x27, x28, [sp,#80]
+	/*
+	 * Stash suspend finisher and its argument in x20 and x19
+	 */
+	mov	x19, x0
+	mov	x20, x1
 	mov	x2, sp
 	mov	x2, sp
 	sub	sp, sp, #CPU_SUSPEND_SZ	// allocate cpu_suspend_ctx
 	sub	sp, sp, #CPU_SUSPEND_SZ	// allocate cpu_suspend_ctx
-	mov	x1, sp
+	mov	x0, sp
 	/*
 	/*
-	 * x1 now points to struct cpu_suspend_ctx allocated on the stack
+	 * x0 now points to struct cpu_suspend_ctx allocated on the stack
 	 */
 	 */
-	str	x2, [x1, #CPU_CTX_SP]
-	ldr	x2, =sleep_save_sp
-	ldr	x2, [x2, #SLEEP_SAVE_SP_VIRT]
+	str	x2, [x0, #CPU_CTX_SP]
+	ldr	x1, =sleep_save_sp
+	ldr	x1, [x1, #SLEEP_SAVE_SP_VIRT]
 #ifdef CONFIG_SMP
 #ifdef CONFIG_SMP
 	mrs	x7, mpidr_el1
 	mrs	x7, mpidr_el1
 	ldr	x9, =mpidr_hash
 	ldr	x9, =mpidr_hash
@@ -82,11 +93,21 @@ ENTRY(__cpu_suspend)
 	ldp	w3, w4, [x9, #MPIDR_HASH_SHIFTS]
 	ldp	w3, w4, [x9, #MPIDR_HASH_SHIFTS]
 	ldp	w5, w6, [x9, #(MPIDR_HASH_SHIFTS + 8)]
 	ldp	w5, w6, [x9, #(MPIDR_HASH_SHIFTS + 8)]
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
-	add	x2, x2, x8, lsl #3
+	add	x1, x1, x8, lsl #3
 #endif
 #endif
-	bl	__cpu_suspend_finisher
+	bl	__cpu_suspend_save
+	/*
+	 * Grab suspend finisher in x20 and its argument in x19
+	 */
+	mov	x0, x19
+	mov	x1, x20
+	/*
+	 * We are ready for power down, fire off the suspend finisher
+	 * in x1, with argument in x0
+	 */
+	blr	x1
         /*
         /*
-	 * Never gets here, unless suspend fails.
+	 * Never gets here, unless suspend finisher fails.
 	 * Successful cpu_suspend should return from cpu_resume, returning
 	 * Successful cpu_suspend should return from cpu_resume, returning
 	 * through this code path is considered an error
 	 * through this code path is considered an error
 	 * If the return value is set to 0 force x0 = -EOPNOTSUPP
 	 * If the return value is set to 0 force x0 = -EOPNOTSUPP
@@ -103,7 +124,7 @@ ENTRY(__cpu_suspend)
 	ldp	x27, x28, [sp, #80]
 	ldp	x27, x28, [sp, #80]
 	ldp	x29, lr, [sp], #96
 	ldp	x29, lr, [sp], #96
 	ret
 	ret
-ENDPROC(__cpu_suspend)
+ENDPROC(__cpu_suspend_enter)
 	.ltorg
 	.ltorg
 
 
 /*
 /*

+ 29 - 19
arch/arm64/kernel/suspend.c

@@ -9,22 +9,19 @@
 #include <asm/suspend.h>
 #include <asm/suspend.h>
 #include <asm/tlbflush.h>
 #include <asm/tlbflush.h>
 
 
-extern int __cpu_suspend(unsigned long);
+extern int __cpu_suspend_enter(unsigned long arg, int (*fn)(unsigned long));
 /*
 /*
- * This is called by __cpu_suspend() to save the state, and do whatever
+ * This is called by __cpu_suspend_enter() to save the state, and do whatever
  * flushing is required to ensure that when the CPU goes to sleep we have
  * flushing is required to ensure that when the CPU goes to sleep we have
  * the necessary data available when the caches are not searched.
  * the necessary data available when the caches are not searched.
  *
  *
- * @arg: Argument to pass to suspend operations
- * @ptr: CPU context virtual address
- * @save_ptr: address of the location where the context physical address
- *            must be saved
+ * ptr: CPU context virtual address
+ * save_ptr: address of the location where the context physical address
+ *           must be saved
  */
  */
-int __cpu_suspend_finisher(unsigned long arg, struct cpu_suspend_ctx *ptr,
-			   phys_addr_t *save_ptr)
+void notrace __cpu_suspend_save(struct cpu_suspend_ctx *ptr,
+				phys_addr_t *save_ptr)
 {
 {
-	int cpu = smp_processor_id();
-
 	*save_ptr = virt_to_phys(ptr);
 	*save_ptr = virt_to_phys(ptr);
 
 
 	cpu_do_suspend(ptr);
 	cpu_do_suspend(ptr);
@@ -35,8 +32,6 @@ int __cpu_suspend_finisher(unsigned long arg, struct cpu_suspend_ctx *ptr,
 	 */
 	 */
 	__flush_dcache_area(ptr, sizeof(*ptr));
 	__flush_dcache_area(ptr, sizeof(*ptr));
 	__flush_dcache_area(save_ptr, sizeof(*save_ptr));
 	__flush_dcache_area(save_ptr, sizeof(*save_ptr));
-
-	return cpu_ops[cpu]->cpu_suspend(arg);
 }
 }
 
 
 /*
 /*
@@ -56,15 +51,15 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 }
 }
 
 
 /**
 /**
- * cpu_suspend
+ * cpu_suspend() - function to enter a low-power state
+ * @arg: argument to pass to CPU suspend operations
  *
  *
- * @arg: argument to pass to the finisher function
+ * Return: 0 on success, -EOPNOTSUPP if CPU suspend hook not initialized, CPU
+ * operations back-end error code otherwise.
  */
  */
 int cpu_suspend(unsigned long arg)
 int cpu_suspend(unsigned long arg)
 {
 {
-	struct mm_struct *mm = current->active_mm;
-	int ret, cpu = smp_processor_id();
-	unsigned long flags;
+	int cpu = smp_processor_id();
 
 
 	/*
 	/*
 	 * If cpu_ops have not been registered or suspend
 	 * If cpu_ops have not been registered or suspend
@@ -72,6 +67,21 @@ int cpu_suspend(unsigned long arg)
 	 */
 	 */
 	if (!cpu_ops[cpu] || !cpu_ops[cpu]->cpu_suspend)
 	if (!cpu_ops[cpu] || !cpu_ops[cpu]->cpu_suspend)
 		return -EOPNOTSUPP;
 		return -EOPNOTSUPP;
+	return cpu_ops[cpu]->cpu_suspend(arg);
+}
+
+/*
+ * __cpu_suspend
+ *
+ * arg: argument to pass to the finisher function
+ * fn: finisher function pointer
+ *
+ */
+int __cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
+{
+	struct mm_struct *mm = current->active_mm;
+	int ret;
+	unsigned long flags;
 
 
 	/*
 	/*
 	 * From this point debug exceptions are disabled to prevent
 	 * From this point debug exceptions are disabled to prevent
@@ -86,7 +96,7 @@ int cpu_suspend(unsigned long arg)
 	 * page tables, so that the thread address space is properly
 	 * page tables, so that the thread address space is properly
 	 * set-up on function return.
 	 * set-up on function return.
 	 */
 	 */
-	ret = __cpu_suspend(arg);
+	ret = __cpu_suspend_enter(arg, fn);
 	if (ret == 0) {
 	if (ret == 0) {
 		cpu_switch_mm(mm->pgd, mm);
 		cpu_switch_mm(mm->pgd, mm);
 		flush_tlb_all();
 		flush_tlb_all();
@@ -95,7 +105,7 @@ int cpu_suspend(unsigned long arg)
 		 * Restore per-cpu offset before any kernel
 		 * Restore per-cpu offset before any kernel
 		 * subsystem relying on it has a chance to run.
 		 * subsystem relying on it has a chance to run.
 		 */
 		 */
-		set_my_cpu_offset(per_cpu_offset(cpu));
+		set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
 
 
 		/*
 		/*
 		 * Restore HW breakpoint registers to sane values
 		 * Restore HW breakpoint registers to sane values