Power management updates for 5.2-rc1

- Fix the handling of Performance and Energy Bias Hint (EPB) on
    Intel processors and expose it to user space via sysfs to avoid
    having to access it through the generic MSR I/F (Rafael Wysocki).
 
  - Improve the handling of global turbo changes made by the platform
    firmware in the intel_pstate driver (Rafael Wysocki).
 
  - Convert some slow-path static_cpu_has() callers to boot_cpu_has()
    in cpufreq (Borislav Petkov).
 
  - Fix the frequency calculation loop in the armada-37xx cpufreq
    driver (Gregory CLEMENT).
 
  - Fix possible object reference leaks in multuple cpufreq drivers
    (Wen Yang).
 
  - Fix kerneldoc comment in the centrino cpufreq driver (dongjian).
 
  - Clean up the ACPI and maple cpufreq drivers (Viresh Kumar, Mohan
    Kumar).
 
  - Add support for lx2160a and ls1028a to the qoriq cpufreq driver
    (Vabhav Sharma, Yuantian Tang).
 
  - Fix kobject memory leak in the cpufreq core (Viresh Kumar).
 
  - Simplify the IOwait boosting in the schedutil cpufreq governor
    and rework the TSC cpufreq notifier on x86 (Rafael Wysocki).
 
  - Clean up the cpufreq core and statistics code (Yue Hu, Kyle Lin).
 
  - Improve the cpufreq documentation, add SPDX license tags to
    some PM documentation files and unify copyright notices in
    them (Rafael Wysocki).
 
  - Add support for "CPU" domains to the generic power domains (genpd)
    framework and provide low-level PSCI firmware support for that
    feature (Ulf Hansson).
 
  - Rearrange the PSCI firmware support code and add support for
    SYSTEM_RESET2 to it (Ulf Hansson, Sudeep Holla).
 
  - Improve genpd support for devices in multiple power domains (Ulf
    Hansson).
 
  - Unify target residency for the AFTR and coupled AFTR states in the
    exynos cpuidle driver (Marek Szyprowski).
 
  - Introduce new helper routine in the operating performance points
    (OPP) framework (Andrew-sh.Cheng).
 
  - Add support for passing on-die termination (ODT) and auto power
    down parameters from the kernel to Trusted Firmware-A (TF-A) to
    the rk3399_dmc devfreq driver (Enric Balletbo i Serra).
 
  - Add tracing to devfreq (Lukasz Luba).
 
  - Make the exynos-bus devfreq driver suspend all devices on system
    shutdown (Marek Szyprowski).
 
  - Fix a few minor issues in the devfreq subsystem and clean it up
    somewhat (Enric Balletbo i Serra, MyungJoo Ham, Rob Herring,
    Saravana Kannan, Yangtao Li).
 
  - Improve system wakeup diagnostics (Stephen Boyd).
 
  - Rework filesystem sync messages emitted during system suspend and
    hibernation (Harry Pan).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAlzQEwUSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxxXwP/jrxikIXdCOV3CJVioV0NetyebwlOqYp
 UsIA7lQBfZ/DY6dHw/oKuAT9LP01vcFg6XGe83Alkta9qczR5KZ/MYHFNSZXjXjL
 kEvIMBCS/oykaBuW+Xn9am8Ke3Yq/rBSTKWVom3vzSQY0qvZ9GBwPDrzw+k63Zhz
 P3afB4ThyY0e9ftgw4HvSSNm13Kn0ItUIQOdaLatXMMcPqP5aAdnUma5Ibinbtpp
 rpTHuHKYx7MSjaCg6wl3kKTJeWbQP4wYO2ISZqH9zEwQgdvSHeFAvfPKTegUkmw9
 uUsQnPD1JvdglOKovr2muehD1Ur+zsjKDf2OKERkWsWXHPyWzA/AqaVv1mkkU++b
 KaWaJ9pE86kGlJ3EXwRbGfV0dM5rrl+dUUQW6nPI1XJnIOFlK61RzwAbqI26F0Mz
 AlKxY4jyPLcM3SpQz9iILqyzHQqB67rm29XvId/9scoGGgoqEI4S+v6LYZqI3Vx6
 aeSRu+Yof7p5w4Kg5fODX+HzrtMnMrPmLUTXhbExfsYZMi7hXURcN6s+tMpH0ckM
 4yiIpnNGCKUSV4vxHBm8XJdAuUnR4Vcz++yFslszgDVVvw5tkvF7SYeHZ6HqcQVm
 af9HdWzx3qajs/oyBwdRBedZYDnP1joC5donBI2ofLeF33NA7TEiPX8Zebw8XLkv
 fNikssA7PGdv
 =nY9p
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These fix the (Intel-specific) Performance and Energy Bias Hint (EPB)
  handling and expose it to user space via sysfs, fix and clean up
  several cpufreq drivers, add support for two new chips to the qoriq
  cpufreq driver, fix, simplify and clean up the cpufreq core and the
  schedutil governor, add support for "CPU" domains to the generic power
  domains (genpd) framework and provide low-level PSCI firmware support
  for that feature, fix the exynos cpuidle driver and fix a couple of
  issues in the devfreq subsystem and clean it up.

  Specifics:

   - Fix the handling of Performance and Energy Bias Hint (EPB) on Intel
     processors and expose it to user space via sysfs to avoid having to
     access it through the generic MSR I/F (Rafael Wysocki).

   - Improve the handling of global turbo changes made by the platform
     firmware in the intel_pstate driver (Rafael Wysocki).

   - Convert some slow-path static_cpu_has() callers to boot_cpu_has()
     in cpufreq (Borislav Petkov).

   - Fix the frequency calculation loop in the armada-37xx cpufreq
     driver (Gregory CLEMENT).

   - Fix possible object reference leaks in multuple cpufreq drivers
     (Wen Yang).

   - Fix kerneldoc comment in the centrino cpufreq driver (dongjian).

   - Clean up the ACPI and maple cpufreq drivers (Viresh Kumar, Mohan
     Kumar).

   - Add support for lx2160a and ls1028a to the qoriq cpufreq driver
     (Vabhav Sharma, Yuantian Tang).

   - Fix kobject memory leak in the cpufreq core (Viresh Kumar).

   - Simplify the IOwait boosting in the schedutil cpufreq governor and
     rework the TSC cpufreq notifier on x86 (Rafael Wysocki).

   - Clean up the cpufreq core and statistics code (Yue Hu, Kyle Lin).

   - Improve the cpufreq documentation, add SPDX license tags to some PM
     documentation files and unify copyright notices in them (Rafael
     Wysocki).

   - Add support for "CPU" domains to the generic power domains (genpd)
     framework and provide low-level PSCI firmware support for that
     feature (Ulf Hansson).

   - Rearrange the PSCI firmware support code and add support for
     SYSTEM_RESET2 to it (Ulf Hansson, Sudeep Holla).

   - Improve genpd support for devices in multiple power domains (Ulf
     Hansson).

   - Unify target residency for the AFTR and coupled AFTR states in the
     exynos cpuidle driver (Marek Szyprowski).

   - Introduce new helper routine in the operating performance points
     (OPP) framework (Andrew-sh.Cheng).

   - Add support for passing on-die termination (ODT) and auto power
     down parameters from the kernel to Trusted Firmware-A (TF-A) to the
     rk3399_dmc devfreq driver (Enric Balletbo i Serra).

   - Add tracing to devfreq (Lukasz Luba).

   - Make the exynos-bus devfreq driver suspend all devices on system
     shutdown (Marek Szyprowski).

   - Fix a few minor issues in the devfreq subsystem and clean it up
     somewhat (Enric Balletbo i Serra, MyungJoo Ham, Rob Herring,
     Saravana Kannan, Yangtao Li).

   - Improve system wakeup diagnostics (Stephen Boyd).

   - Rework filesystem sync messages emitted during system suspend and
     hibernation (Harry Pan)"

* tag 'pm-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits)
  cpufreq: Fix kobject memleak
  cpufreq: armada-37xx: fix frequency calculation for opp
  cpufreq: centrino: Fix centrino_setpolicy() kerneldoc comment
  cpufreq: qoriq: add support for lx2160a
  x86: tsc: Rework time_cpufreq_notifier()
  PM / Domains: Allow to attach a CPU via genpd_dev_pm_attach_by_id|name()
  PM / Domains: Search for the CPU device outside the genpd lock
  PM / Domains: Drop unused in-parameter to some genpd functions
  PM / Domains: Use the base device for driver_deferred_probe_check_state()
  cpufreq: qoriq: Add ls1028a chip support
  PM / Domains: Enable genpd_dev_pm_attach_by_id|name() for single PM domain
  PM / Domains: Allow OF lookup for multi PM domain case from ->attach_dev()
  PM / Domains: Don't kfree() the virtual device in the error path
  cpufreq: Move ->get callback check outside of __cpufreq_get()
  PM / Domains: remove unnecessary unlikely()
  cpufreq: Remove needless bios_limit check in show_bios_limit()
  drivers/cpufreq/acpi-cpufreq.c: This fixes the following checkpatch warning
  firmware/psci: add support for SYSTEM_RESET2
  PM / devfreq: add tracing for scheduling work
  trace: events: add devfreq trace event file
  ...
This commit is contained in:
Linus Torvalds 2019-05-06 19:40:31 -07:00
commit 8f5e823f91
79 changed files with 1241 additions and 391 deletions

View file

@ -1,3 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
.. |struct cpufreq_policy| replace:: :c:type:`struct cpufreq_policy <cpufreq_policy>`
.. |intel_pstate| replace:: :doc:`intel_pstate <intel_pstate>`
@ -5,9 +8,10 @@
CPU Performance Scaling
=======================
::
:Copyright: |copy| 2017 Intel Corporation
:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Copyright (c) 2017 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Concept of CPU Performance Scaling
======================================
@ -396,8 +400,8 @@ RT or deadline scheduling classes, the governor will increase the frequency to
the allowed maximum (that is, the ``scaling_max_freq`` policy limit). In turn,
if it is invoked by the CFS scheduling class, the governor will use the
Per-Entity Load Tracking (PELT) metric for the root control group of the
given CPU as the CPU utilization estimate (see the `Per-entity load tracking`_
LWN.net article for a description of the PELT mechanism). Then, the new
given CPU as the CPU utilization estimate (see the *Per-entity load tracking*
LWN.net article [1]_ for a description of the PELT mechanism). Then, the new
CPU frequency to apply is computed in accordance with the formula
f = 1.25 * ``f_0`` * ``util`` / ``max``
@ -698,4 +702,8 @@ hardware feature (e.g. all Intel ones), even if the
:c:macro:`CONFIG_X86_ACPI_CPUFREQ_CPB` configuration option is set.
.. _Per-entity load tracking: https://lwn.net/Articles/531853/
References
==========
.. [1] Jonathan Corbet, *Per-entity load tracking*,
https://lwn.net/Articles/531853/

View file

@ -1,3 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
.. |struct cpuidle_state| replace:: :c:type:`struct cpuidle_state <cpuidle_state>`
.. |cpufreq| replace:: :doc:`CPU Performance Scaling <cpufreq>`
@ -5,9 +8,10 @@
CPU Idle Time Management
========================
::
:Copyright: |copy| 2018 Intel Corporation
:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Copyright (c) 2018 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Concepts
========

View file

@ -1,3 +1,5 @@
.. SPDX-License-Identifier: GPL-2.0
================
Power Management
================

View file

@ -0,0 +1,41 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
======================================
Intel Performance and Energy Bias Hint
======================================
:Copyright: |copy| 2019 Intel Corporation
:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
.. kernel-doc:: arch/x86/kernel/cpu/intel_epb.c
:doc: overview
Intel Performance and Energy Bias Attribute in ``sysfs``
========================================================
The Intel Performance and Energy Bias Hint (EPB) value for a given (logical) CPU
can be checked or updated through a ``sysfs`` attribute (file) under
:file:`/sys/devices/system/cpu/cpu<N>/power/`, where the CPU number ``<N>``
is allocated at the system initialization time:
``energy_perf_bias``
Shows the current EPB value for the CPU in a sliding scale 0 - 15, where
a value of 0 corresponds to a hint preference for highest performance
and a value of 15 corresponds to the maximum energy savings.
In order to update the EPB value for the CPU, this attribute can be
written to, either with a number in the 0 - 15 sliding scale above, or
with one of the strings: "performance", "balance-performance", "normal",
"balance-power", "power" that represent values reflected by their
meaning.
This attribute is present for all online CPUs supporting the EPB
feature.
Note that while the EPB interface to the processor is defined at the logical CPU
level, the physical register backing it may be shared by multiple CPUs (for
example, SMT siblings or cores in one package). For this reason, updating the
EPB value for one CPU may cause the EPB values for other CPUs to change.

View file

@ -1,10 +1,13 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
===============================================
``intel_pstate`` CPU Performance Scaling Driver
===============================================
::
:Copyright: |copy| 2017 Intel Corporation
Copyright (c) 2017 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
General Information
@ -20,11 +23,10 @@ you have not done that yet.]
For the processors supported by ``intel_pstate``, the P-state concept is broader
than just an operating frequency or an operating performance point (see the
`LinuxCon Europe 2015 presentation by Kristen Accardi <LCEU2015_>`_ for more
LinuxCon Europe 2015 presentation by Kristen Accardi [1]_ for more
information about that). For this reason, the representation of P-states used
by ``intel_pstate`` internally follows the hardware specification (for details
refer to `Intel® 64 and IA-32 Architectures Software Developers Manual
Volume 3: System Programming Guide <SDM_>`_). However, the ``CPUFreq`` core
refer to Intel Software Developers Manual [2]_). However, the ``CPUFreq`` core
uses frequencies for identifying operating performance points of CPUs and
frequencies are involved in the user space interface exposed by it, so
``intel_pstate`` maps its internal representation of P-states to frequencies too
@ -561,9 +563,9 @@ or to pin every task potentially sensitive to them to a specific CPU.]
On the majority of systems supported by ``intel_pstate``, the ACPI tables
provided by the platform firmware contain ``_PSS`` objects returning information
that can be used for CPU performance scaling (refer to the `ACPI specification`_
for details on the ``_PSS`` objects and the format of the information returned
by them).
that can be used for CPU performance scaling (refer to the ACPI specification
[3]_ for details on the ``_PSS`` objects and the format of the information
returned by them).
The information returned by the ACPI ``_PSS`` objects is used by the
``acpi-cpufreq`` scaling driver. On systems supported by ``intel_pstate``
@ -728,6 +730,14 @@ P-state is called, the ``ftrace`` filter can be set to to
<idle>-0 [000] ..s. 2537.654843: intel_pstate_set_pstate <-intel_pstate_timer_func
.. _LCEU2015: http://events.linuxfoundation.org/sites/events/files/slides/LinuxConEurope_2015.pdf
.. _SDM: http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
.. _ACPI specification: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
References
==========
.. [1] Kristen Accardi, *Balancing Power and Performance in the Linux Kernel*,
http://events.linuxfoundation.org/sites/events/files/slides/LinuxConEurope_2015.pdf
.. [2] *Intel® 64 and IA-32 Architectures Software Developers Manual Volume 3: System Programming Guide*,
http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
.. [3] *Advanced Configuration and Power Interface Specification*,
https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf

View file

@ -1,10 +1,14 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
===================
System Sleep States
===================
::
:Copyright: |copy| 2017 Intel Corporation
:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Copyright (c) 2017 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sleep states are global low-power states of the entire system in which user
space code cannot be executed and the overall system activity is significantly

View file

@ -1,10 +1,14 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
===========================
Power Management Strategies
===========================
::
:Copyright: |copy| 2017 Intel Corporation
:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Copyright (c) 2017 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Linux kernel supports two major high-level power management strategies.

View file

@ -1,3 +1,5 @@
.. SPDX-License-Identifier: GPL-2.0
============================
System-Wide Power Management
============================

View file

@ -1,3 +1,5 @@
.. SPDX-License-Identifier: GPL-2.0
==============================
Working-State Power Management
==============================
@ -8,3 +10,4 @@ Working-State Power Management
cpuidle
cpufreq
intel_pstate
intel_epb