Merge drm/drm-next into drm-misc-next

Backmerging to get v6.5-rc2.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Thomas Zimmermann 2023-07-24 15:44:47 +02:00
commit 61b7369483
13117 changed files with 448659 additions and 149115 deletions


@ -254,6 +254,7 @@ ForEachMacros:
- 'for_each_free_mem_range'
- 'for_each_free_mem_range_reverse'
- 'for_each_func_rsrc'
- 'for_each_group_device'
- 'for_each_group_evsel'
- 'for_each_group_member'
- 'for_each_hstate'

.gitattributes

@ -2,3 +2,4 @@
*.[ch] diff=cpp
*.dts diff=dts
*.dts[io] diff=dts
*.rs diff=rust

.gitignore

@ -16,7 +16,6 @@
*.bin
*.bz2
*.c.[012]*.*
*.cover
*.dt.yaml
*.dtb
*.dtbo
@ -34,7 +33,6 @@
*.lz4
*.lzma
*.lzo
*.mbx
*.mod
*.mod.c
*.o
@ -51,7 +49,6 @@
*.symversions
*.tab.[ch]
*.tar
*.usyms
*.xz
*.zst
Module.symvers
@ -112,7 +109,6 @@ modules.order
#
/include/config/
/include/generated/
/include/ksym/
/arch/*/include/generated/
# stgit generated dirs


@ -5,7 +5,8 @@
# same person appearing not to be so or badly displayed. Also allows for
# old email addresses to map to new email addresses.
#
# For format details, see "MAPPING AUTHORS" in "man git-shortlog".
# For format details, see "man gitmailmap" or "MAPPING AUTHORS" in
# "man git-shortlog" on older systems.
#
# Please keep this list dictionary sorted.
#
@ -70,6 +71,8 @@ Baolin Wang <baolin.wang@linux.alibaba.com> <baolin.wang@unisoc.com>
Baolin Wang <baolin.wang@linux.alibaba.com> <baolin.wang7@gmail.com>
Bart Van Assche <bvanassche@acm.org> <bart.vanassche@sandisk.com>
Bart Van Assche <bvanassche@acm.org> <bart.vanassche@wdc.com>
Ben Dooks <ben-linux@fluff.org> <ben.dooks@simtec.co.uk>
Ben Dooks <ben-linux@fluff.org> <ben.dooks@sifive.com>
Ben Gardner <bgardner@wabtec.com>
Ben M Cahill <ben.m.cahill@intel.com>
Ben Widawsky <bwidawsk@kernel.org> <ben@bwidawsk.net>
@ -175,12 +178,17 @@ Gustavo Padovan <padovan@profusion.mobi>
Hanjun Guo <guohanjun@huawei.com> <hanjun.guo@linaro.org>
Heiko Carstens <hca@linux.ibm.com> <h.carstens@de.ibm.com>
Heiko Carstens <hca@linux.ibm.com> <heiko.carstens@de.ibm.com>
Heiko Stuebner <heiko@sntech.de> <heiko.stuebner@bqreaders.com>
Heiko Stuebner <heiko@sntech.de> <heiko.stuebner@theobroma-systems.com>
Heiko Stuebner <heiko@sntech.de> <heiko.stuebner@vrull.eu>
Henk Vergonet <Henk.Vergonet@gmail.com>
Henrik Kretzschmar <henne@nachtwindheim.de>
Henrik Rydberg <rydberg@bitmath.org>
Herbert Xu <herbert@gondor.apana.org.au>
Huacai Chen <chenhuacai@kernel.org> <chenhc@lemote.com>
Huacai Chen <chenhuacai@kernel.org> <chenhuacai@loongson.cn>
J. Bruce Fields <bfields@fieldses.org> <bfields@redhat.com>
J. Bruce Fields <bfields@fieldses.org> <bfields@citi.umich.edu>
Jacob Shin <Jacob.Shin@amd.com>
Jaegeuk Kim <jaegeuk@kernel.org> <jaegeuk@google.com>
Jaegeuk Kim <jaegeuk@kernel.org> <jaegeuk.kim@samsung.com>
@ -238,6 +246,7 @@ John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
John Stultz <johnstul@us.ibm.com>
<jon.toppins+linux@gmail.com> <jtoppins@cumulusnetworks.com>
<jon.toppins+linux@gmail.com> <jtoppins@redhat.com>
Jonas Gorski <jonas.gorski@gmail.com> <jogo@openwrt.org>
Jordan Crouse <jordan@cosmicpenguin.net> <jcrouse@codeaurora.org>
<josh@joshtriplett.org> <josh@freedesktop.org>
<josh@joshtriplett.org> <josh@kernel.org>
@ -271,6 +280,10 @@ Krzysztof Kozlowski <krzk@kernel.org> <k.kozlowski@samsung.com>
Krzysztof Kozlowski <krzk@kernel.org> <krzysztof.kozlowski@canonical.com>
Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Kuogee Hsieh <quic_khsieh@quicinc.com> <khsieh@codeaurora.org>
Lee Jones <lee@kernel.org> <joneslee@google.com>
Lee Jones <lee@kernel.org> <lee.jones@canonical.com>
Lee Jones <lee@kernel.org> <lee.jones@linaro.org>
Lee Jones <lee@kernel.org> <lee@ubuntu.com>
Leonard Crestez <leonard.crestez@nxp.com> Leonard Crestez <cdleonard@gmail.com>
Leonardo Bras <leobras.c@gmail.com> <leonardo@linux.ibm.com>
Leonard Göhrs <l.goehrs@pengutronix.de>
@ -297,6 +310,7 @@ Marek Behún <kabel@kernel.org> <marek.behun@nic.cz>
Marek Behún <kabel@kernel.org> Marek Behun <marek.behun@nic.cz>
Mark Brown <broonie@sirena.org.uk>
Mark Starovoytov <mstarovo@pm.me> <mstarovoitov@marvell.com>
Markus Schneider-Pargmann <msp@baylibre.com> <mpa@pengutronix.de>
Mark Yao <markyao0591@gmail.com> <mark.yao@rock-chips.com>
Martin Kepplinger <martink@posteo.de> <martin.kepplinger@ginzinger.com>
Martin Kepplinger <martink@posteo.de> <martin.kepplinger@puri.sm>


@ -383,6 +383,12 @@ E: tomas@nocrew.org
W: http://tomas.nocrew.org/
D: dsp56k device driver
N: Srivatsa S. Bhat
E: srivatsa@csail.mit.edu
D: Maintainer of Generic Paravirt-Ops subsystem
D: Maintainer of VMware hypervisor interface
D: Maintainer of VMware virtual PTP clock driver (ptp_vmw)
N: Ross Biro
E: ross.biro@gmail.com
D: Original author of the Linux networking code


@ -1,11 +1,11 @@
What: /sys/o2cb
Date: Dec 2005
KernelVersion: 2.6.16
Contact: ocfs2-devel@oss.oracle.com
Contact: ocfs2-devel@lists.linux.dev
Description: Ocfs2-tools looks at 'interface-revision' for versioning
information. Each logmask/ file controls a set of debug prints
and can be written into with the strings "allow", "deny", or
"off". Reading the file returns the current state.
Was renamed to /sys/fs/o2cb/
Users: ocfs2-tools. It's sufficient to mail proposed changes to
ocfs2-devel@oss.oracle.com.
ocfs2-devel@lists.linux.dev.


@ -1,10 +1,10 @@
What: /sys/o2cb symlink
Date: May 2011
KernelVersion: 3.0
Contact: ocfs2-devel@oss.oracle.com
Contact: ocfs2-devel@lists.linux.dev
Description: This is a symlink: /sys/o2cb to /sys/fs/o2cb. The symlink is
removed when new versions of ocfs2-tools which know to look
in /sys/fs/o2cb are sufficiently prevalent. Don't code new
software to look here, it should try /sys/fs/o2cb instead.
Users: ocfs2-tools. It's sufficient to mail proposed changes to
ocfs2-devel@oss.oracle.com.
ocfs2-devel@lists.linux.dev.


@ -1,10 +1,10 @@
What: /sys/fs/o2cb/
Date: Dec 2005
KernelVersion: 2.6.16
Contact: ocfs2-devel@oss.oracle.com
Contact: ocfs2-devel@lists.linux.dev
Description: Ocfs2-tools looks at 'interface-revision' for versioning
information. Each logmask/ file controls a set of debug prints
and can be written into with the strings "allow", "deny", or
"off". Reading the file returns the current state.
Users: ocfs2-tools. It's sufficient to mail proposed changes to
ocfs2-devel@oss.oracle.com.
ocfs2-devel@lists.linux.dev.


@ -0,0 +1,7 @@
What: /sys/bus/wmi/devices/05901221-D566-11D1-B2F0-00A0C9062910[-X]/bmof
Date: Jun 2017
KernelVersion: 4.13
Description:
Binary MOF metadata used to describe the details of available ACPI WMI interfaces.
See Documentation/wmi/devices/wmi-bmof.rst for details.


@ -3,19 +3,32 @@ Date: September 2022
KernelVersion: 6.1
Contact: Armin Wolf <W_Armin@gmx.de>
Description:
This file contains the contents of the fan sensor information buffer,
which contains fan sensor entries and a terminating character (0xFF).
This file contains the contents of the fan sensor information
buffer, which contains fan sensor entries and a terminating
character (0xFF).
Each fan sensor entry consists of three bytes with an unknown meaning,
interested people may use this file for reverse-engineering.
Each fan sensor entry contains:
- fan type (single byte)
- fan speed in RPM (two bytes, little endian)
See Documentation/wmi/devices/dell-wmi-ddv.rst for details.
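As a rough illustration of the fan sensor entry layout described above, the buffer could be decoded along the following lines. This is only a sketch, not the driver's code: the function name `parse_fan_entries` is hypothetical, and it assumes that the 0xFF terminator sits where the next entry's type byte would otherwise start.

```python
import struct


def parse_fan_entries(buf: bytes):
    """Decode (fan_type, rpm) pairs from a fan sensor information buffer.

    Assumption: entries are 3 bytes each (one type byte followed by a
    little-endian 16-bit RPM value) and the list ends with a 0xFF byte.
    """
    entries = []
    offset = 0
    while offset < len(buf) and buf[offset] != 0xFF:
        if offset + 3 > len(buf):
            raise ValueError("truncated fan sensor entry")
        # "<BH": little-endian, no padding, u8 type + u16 speed in RPM
        fan_type, rpm = struct.unpack_from("<BH", buf, offset)
        entries.append((fan_type, rpm))
        offset += 3
    return entries
```

The buffer would typically be read in binary mode from the debugfs file; the exact path depends on the WMI device name.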
What: /sys/kernel/debug/dell-wmi-ddv-<wmi_device_name>/thermal_sensor_information
Date: September 2022
KernelVersion: 6.1
Contact: Armin Wolf <W_Armin@gmx.de>
Description:
This file contains the contents of the thermal sensor information buffer,
which contains thermal sensor entries and a terminating character (0xFF).
This file contains the contents of the thermal sensor information
buffer, which contains thermal sensor entries and a terminating
character (0xFF).
Each thermal sensor entry consists of five bytes with an unknown meaning,
interested people may use this file for reverse-engineering.
Each thermal sensor entry contains:
- thermal type (single byte)
- current temperature (single byte)
- min. temperature (single byte)
- max. temperature (single byte)
- unknown field (single byte)
See Documentation/wmi/devices/dell-wmi-ddv.rst for details.
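The five-byte thermal sensor entries described above could be split up in a similar way. This is a sketch under assumptions: `ThermalEntry` and `parse_thermal_entries` are hypothetical names, every field is kept as a raw unsigned byte because the text above does not state units or signedness, and the 0xFF terminator is assumed to appear in the position of a type byte.

```python
from typing import NamedTuple


class ThermalEntry(NamedTuple):
    thermal_type: int
    current: int   # raw byte; unit and signedness are not stated above
    minimum: int
    maximum: int
    unknown: int


def parse_thermal_entries(buf: bytes):
    """Split a thermal sensor information buffer into 5-byte entries."""
    entries = []
    for offset in range(0, len(buf), 5):
        if buf[offset] == 0xFF:       # terminating character
            break
        chunk = buf[offset:offset + 5]
        if len(chunk) < 5:
            raise ValueError("truncated thermal sensor entry")
        entries.append(ThermalEntry(*chunk))
    return entries
```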


@ -95,3 +95,25 @@ Description:
This file does not exist if the HBA driver does not implement
support for the SATA NCQ priority feature, regardless of the
device support for this feature.
What: /sys/block/*/device/cdl_supported
Date: May, 2023
KernelVersion: v6.5
Contact: linux-scsi@vger.kernel.org
Description:
(RO) Indicates if the device supports the command duration
limits feature found in some ATA and SCSI devices.
What: /sys/block/*/device/cdl_enable
Date: May, 2023
KernelVersion: v6.5
Contact: linux-scsi@vger.kernel.org
Description:
(RW) For a device supporting the command duration limits
feature, write to the file to turn on or off the feature.
By default this feature is turned off.
Writing "1" to this file enables the use of command duration
limits for read and write commands in the kernel and turns on
the feature on the device. Writing "0" disables the feature.
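A minimal sketch of how user space might toggle the feature through the two attributes above. The helper name `set_cdl_enabled` is hypothetical, and the check assumes that `cdl_supported` reads back as "1" for a supporting device, which the description above does not spell out.

```python
from pathlib import Path


def set_cdl_enabled(device_dir, enable: bool) -> bool:
    """Write "1" or "0" to cdl_enable if the device reports CDL support.

    Returns False without writing when cdl_supported is not "1"
    (assumed encoding of "supported").
    """
    device_dir = Path(device_dir)
    supported = (device_dir / "cdl_supported").read_text().strip()
    if supported != "1":
        return False
    (device_dir / "cdl_enable").write_text("1" if enable else "0")
    return True
```

For example, `set_cdl_enabled("/sys/block/sda/device", True)` would try to enable the feature on a disk named `sda` (an illustrative device name); writing to sysfs normally requires root privileges.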


@ -90,6 +90,60 @@ Description:
counter does not freeze at the boundary points, but
counts continuously throughout.
interrupt on terminal count:
The output signal is initially low, and will remain low
until the counter reaches zero. The output signal then
goes high and remains high until a new preset value is
set.
hardware retriggerable one-shot:
The output signal is initially high. The output signal
will go low by a trigger input signal, and will remain
low until the counter reaches zero. The output will then
go high and remain high until the next trigger. A
trigger results in loading the counter to the preset
value and setting the output signal low, thus starting
the one-shot pulse.
rate generator:
The output signal is initially high. When the counter
has decremented to 1, the output signal goes low for one
clock pulse. The output signal then goes high again, the
counter is reloaded to the preset value, and the process
repeats in a periodic manner as such.
square wave mode:
The output signal is initially high.
If the initial count is even, the counter is decremented
by two on succeeding clock pulses. When the count
expires, the output signal changes value and the
counter is reloaded to the preset value. The process
repeats in periodic manner as such.
If the initial count is odd, the initial count minus one
(an even number) is loaded and then is decremented by
two on succeeding clock pulses. One clock pulse after
the count expires, the output signal goes low and the
counter is reloaded to the preset value minus one.
Succeeding clock pulses decrement the count by two. When
the count expires, the output goes high again and the
counter is reloaded to the preset value minus one. The
process repeats in a periodic manner as such.
software triggered strobe:
The output signal is initially high. When the count
expires, the output will go low for one clock pulse and
then go high again. The counting sequence is "triggered"
by setting the preset value.
hardware triggered strobe:
The output signal is initially high. Counting is started
by a trigger input signal. When the count expires, the
output signal will go low for one clock pulse and then
go high again. A trigger results in loading the counter
to the preset value.
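To make the "rate generator" description above more concrete, here is a simplified software model of its output. It is an illustration only, not driver code: the function name `rate_generator_output` is hypothetical, each list element stands for one clock pulse (1 = high, 0 = low), and the exact alignment of the low pulse with the decrement is an assumption of this model.

```python
def rate_generator_output(preset: int, clocks: int):
    """Model the 'rate generator' output level over a number of clock pulses."""
    if preset < 2:
        raise ValueError("this sketch assumes a preset of at least 2")
    count = preset
    levels = []
    for _ in range(clocks):
        if count == 1:
            levels.append(0)    # output goes low for exactly one clock pulse
            count = preset      # counter is reloaded and the cycle repeats
        else:
            levels.append(1)    # output stays high while counting down
            count -= 1
    return levels
```

With a preset of 4 the model yields a period of four clock pulses, three high and one low.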
What: /sys/bus/counter/devices/counterX/countY/count_mode_available
What: /sys/bus/counter/devices/counterX/countY/error_noise_available
What: /sys/bus/counter/devices/counterX/countY/function_available


@ -58,6 +58,54 @@ Description:
affinity for this device.
What: /sys/bus/cxl/devices/memX/security/state
Date: June, 2023
KernelVersion: v6.5
Contact: linux-cxl@vger.kernel.org
Description:
(RO) Reading this file will display the CXL security state for
that device. Such states can be: 'disabled', 'sanitize', when
a sanitization is currently underway; or those available only
for persistent memory: 'locked', 'unlocked' or 'frozen'. This
sysfs entry is select/poll capable from userspace to notify
upon completion of a sanitize operation.
What: /sys/bus/cxl/devices/memX/security/sanitize
Date: June, 2023
KernelVersion: v6.5
Contact: linux-cxl@vger.kernel.org
Description:
(WO) Write a boolean 'true' string value to this attribute to
sanitize the device to securely re-purpose or decommission it.
This is done by ensuring that all user data and meta-data,
whether it resides in persistent capacity, volatile capacity,
or the LSA, is made permanently unavailable by whatever means
is appropriate for the media type. This functionality requires
the device to not be actively decoding any HPA ranges.
What: /sys/bus/cxl/devices/memX/security/erase
Date: June, 2023
KernelVersion: v6.5
Contact: linux-cxl@vger.kernel.org
Description:
(WO) Write a boolean 'true' string value to this attribute to
secure erase user data by changing the media encryption keys for
all user data areas of the device.
What: /sys/bus/cxl/devices/memX/firmware/
Date: April, 2023
KernelVersion: v6.5
Contact: linux-cxl@vger.kernel.org
Description:
(RW) Firmware uploader mechanism. The different files under
this directory can be used to upload and activate new
firmware for CXL devices. The interfaces under this are
documented in sysfs-class-firmware.
What: /sys/bus/cxl/devices/*/devtype
Date: June, 2021
KernelVersion: v5.14


@ -292,6 +292,16 @@ Description:
which is marked with early_stop has failed to initialize, it will ignore
all future connections until this attribute is clear.
What: /sys/bus/usb/devices/.../<hub_interface>/port<X>/state
Date: June 2023
Contact: Roy Luo <royluo@google.com>
Description:
Indicates current state of the USB device attached to the port.
Valid states are: 'not-attached', 'attached', 'powered',
'reconnecting', 'unauthenticated', 'default', 'addressed',
'configured', and 'suspended'. This file supports poll() to
monitor the state change from user space.
What: /sys/bus/usb/devices/.../power/usb2_lpm_l1_timeout
Date: May 2013
Contact: Mathias Nyman <mathias.nyman@linux.intel.com>


@ -243,8 +243,8 @@ Description:
index:
Used with HDD and NVME authentication to set the drive index
that is being referenced (e.g hdd0, hdd1 etc)
This attribute defaults to device 0.
that is being referenced (e.g. hdd1, hdd2, etc.)
This attribute defaults to device 1.
certificate, signature, save_signature:
These attributes are used for certificate based authentication. This is


@ -0,0 +1,5 @@
What: /sys/class/leds/<led>/dim
Date: May 2023
Description: 64-level DIM current. If you write a negative value or
"auto", the dim will be calculated according to the
brightness.


@ -13,6 +13,11 @@ Description:
Specifies the duration of the LED blink in milliseconds.
Defaults to 50 ms.
With hw_control ON, the interval value MUST be set to the
default value and cannot be changed.
Trying to set any value in this specific mode will return
an EINVAL error.
What: /sys/class/leds/<led>/link
Date: Dec 2017
KernelVersion: 4.16
@ -39,6 +44,9 @@ Description:
If set to 1, the LED will blink for the milliseconds specified
in interval to signal transmission.
With hw_control ON, the blink interval is controlled by hardware
and won't reflect the value set in interval.
What: /sys/class/leds/<led>/rx
Date: Dec 2017
KernelVersion: 4.16
@ -50,3 +58,84 @@ Description:
If set to 1, the LED will blink for the milliseconds specified
in interval to signal reception.
With hw_control ON, the blink interval is controlled by hardware
and won't reflect the value set in interval.
What: /sys/class/leds/<led>/hw_control
Date: Jun 2023
KernelVersion: 6.5
Contact: linux-leds@vger.kernel.org
Description:
Communicates whether the LED trigger modes are driven by hardware
or whether the software fallback is used.
If 0, the LED is using software fallback to blink.
If 1, the LED is using hardware control to blink and signal the
requested modes.
What: /sys/class/leds/<led>/link_10
Date: Jun 2023
KernelVersion: 6.5
Contact: linux-leds@vger.kernel.org
Description:
Signal the link speed state of 10Mbps of the named network device.
If set to 0 (default), the LED's normal state is off.
If set to 1, the LED's normal state reflects the link state
speed of 10Mbps of the named network device.
Setting this value also immediately changes the LED state.
What: /sys/class/leds/<led>/link_100
Date: Jun 2023
KernelVersion: 6.5
Contact: linux-leds@vger.kernel.org
Description:
Signal the link speed state of 100Mbps of the named network device.
If set to 0 (default), the LED's normal state is off.
If set to 1, the LED's normal state reflects the link state
speed of 100Mbps of the named network device.
Setting this value also immediately changes the LED state.
What: /sys/class/leds/<led>/link_1000
Date: Jun 2023
KernelVersion: 6.5
Contact: linux-leds@vger.kernel.org
Description:
Signal the link speed state of 1000Mbps of the named network device.
If set to 0 (default), the LED's normal state is off.
If set to 1, the LED's normal state reflects the link state
speed of 1000Mbps of the named network device.
Setting this value also immediately changes the LED state.
What: /sys/class/leds/<led>/half_duplex
Date: Jun 2023
KernelVersion: 6.5
Contact: linux-leds@vger.kernel.org
Description:
Signal the link half duplex state of the named network device.
If set to 0 (default), the LED's normal state is off.
If set to 1, the LED's normal state reflects the link half
duplex state of the named network device.
Setting this value also immediately changes the LED state.
What: /sys/class/leds/<led>/full_duplex
Date: Jun 2023
KernelVersion: 6.5
Contact: linux-leds@vger.kernel.org
Description:
Signal the link full duplex state of the named network device.
If set to 0 (default), the LED's normal state is off.
If set to 1, the LED's normal state reflects the link full
duplex state of the named network device.
Setting this value also immediately changes the LED state.


@ -62,7 +62,7 @@ Description:
What: /sys/class/net/<iface>/qmi/pass_through
Date: January 2021
KernelVersion: 5.12
Contact: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Contact: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com>
Description:
Boolean. Default: 'N'


@ -59,3 +59,55 @@ Description: (RW) Control the allocated buffer watermark of outbound packets.
The available tune data is [0, 1, 2]. Writing a negative value
will return an error, and out of range values will be converted
to 2. The value indicates a probable level of the event.
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/root_port_filters
Date: May 2023
KernelVersion: 6.5
Contact: Yicong Yang <yangyicong@hisilicon.com>
Description: This directory contains the files providing the PCIe Root Port filters
information used for PTT trace. Each file is named after the supported
Root Port device name <domain>:<bus>:<device>.<function>.
See the description of the "filter" in Documentation/trace/hisi-ptt.rst
for more information.
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/root_port_filters/multiselect
Date: May 2023
KernelVersion: 6.5
Contact: Yicong Yang <yangyicong@hisilicon.com>
Description: (Read) Indicates if this kind of filter can be selected at the same
time as other filters, or must be used on its own. 1 indicates
the former case and 0 indicates the latter.
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/root_port_filters/<bdf>
Date: May 2023
KernelVersion: 6.5
Contact: Yicong Yang <yangyicong@hisilicon.com>
Description: (Read) Indicates the filter value of this Root Port filter, which
can be used to control the TLP headers to trace by the PTT trace.
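A sketch of how the filter directory described above might be enumerated from user space; the same layout applies to requester_filters. The helper name `read_ptt_filters` is hypothetical. It assumes that `multiselect` reads back as "1" or "0" and that each filter file holds a single number that Python's `int(text, 0)` can parse (decimal or 0x-prefixed hexadecimal); the value format is not stated in this file.

```python
from pathlib import Path


def read_ptt_filters(filter_dir):
    """Return (multiselect, {device_name: filter_value}) for a filter directory."""
    filter_dir = Path(filter_dir)
    multiselect = (filter_dir / "multiselect").read_text().strip() == "1"
    filters = {}
    for entry in sorted(filter_dir.iterdir()):
        if entry.name == "multiselect" or not entry.is_file():
            continue
        # Each remaining file is named after a device (<bdf>); the parsing
        # of its content is an assumption of this sketch.
        filters[entry.name] = int(entry.read_text().strip(), 0)
    return multiselect, filters
```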
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/requester_filters
Date: May 2023
KernelVersion: 6.5
Contact: Yicong Yang <yangyicong@hisilicon.com>
Description: This directory contains the files providing the PCIe Requester filters
information used for PTT trace. Each file is named after the supported
Endpoint device name <domain>:<bus>:<device>.<function>.
See the description of the "filter" in Documentation/trace/hisi-ptt.rst
for more information.
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/requester_filters/multiselect
Date: May 2023
KernelVersion: 6.5
Contact: Yicong Yang <yangyicong@hisilicon.com>
Description: (Read) Indicates if this kind of filter can be selected at the same
time as other filters, or must be used on its own. 1 indicates
the former case and 0 indicates the latter.
What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/requester_filters/<bdf>
Date: May 2023
KernelVersion: 6.5
Contact: Yicong Yang <yangyicong@hisilicon.com>
Description: (Read) Indicates the filter value of this Requester filter, which
can be used to control the TLP headers to trace by the PTT trace.


@ -670,7 +670,7 @@ Description: Preferred MTE tag checking mode
"async" Prefer asynchronous mode
================ ==============================================
See also: Documentation/arm64/memory-tagging-extension.rst
See also: Documentation/arch/arm64/memory-tagging-extension.rst
What: /sys/devices/system/cpu/nohz_full
Date: Apr 2015


@ -1,4 +1,4 @@
What: /sys/bus/platform/drivers/eud/.../enable
What: /sys/bus/platform/drivers/qcom_eud/.../enable
Date: February 2022
Contact: Souradeep Chowdhury <quic_schowdhu@quicinc.com>
Description:


@ -27,7 +27,18 @@ Description: (RW) Reports the current configuration of the QAT device.
* sym;asym: the device is configured for running crypto
services
* asym;sym: identical to sym;asym
* dc: the device is configured for running compression services
* sym: the device is configured for running symmetric crypto
services
* asym: the device is configured for running asymmetric crypto
services
* asym;dc: the device is configured for running asymmetric
crypto services and compression services
* dc;asym: identical to asym;dc
* sym;dc: the device is configured for running symmetric crypto
services and compression services
* dc;sym: identical to sym;dc
It is possible to set the configuration only if the device
is in the `down` state (see /sys/bus/pci/devices/<BDF>/qat/state)
@ -47,3 +58,38 @@ Description: (RW) Reports the current configuration of the QAT device.
dc
This attribute is only available for qat_4xxx devices.
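Because the service strings listed for this attribute treat reordered pairs as identical (for example sym;asym and asym;sym), a tool that compares configurations may want to normalize them first. The following is a sketch derived only from that list: `KNOWN_SERVICES` and `normalize_services` are hypothetical names, and the limit of two services per configuration is an assumption based on the values shown, not a statement about the driver.

```python
KNOWN_SERVICES = frozenset({"sym", "asym", "dc"})


def normalize_services(config: str) -> frozenset:
    """Turn a service string such as "asym;sym" into an order-free set."""
    parts = config.strip().split(";")
    services = frozenset(parts)
    if len(services) != len(parts):
        raise ValueError("duplicate service in configuration")
    if not services <= KNOWN_SERVICES:
        raise ValueError("unknown service in configuration")
    if len(services) > 2:
        # The documented values combine at most two services.
        raise ValueError("too many services in configuration")
    return services
```

Two strings then describe the same configuration when their normalized sets compare equal.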
What: /sys/bus/pci/devices/<BDF>/qat/pm_idle_enabled
Date: June 2023
KernelVersion: 6.5
Contact: qat-linux@intel.com
Description: (RW) This configuration option provides a way to force the device into remaining in
the MAX power state.
If idle support is enabled the device will transition to the `MIN` power state when
idle; otherwise it will stay in the `MAX` power state.
Write to the file to enable or disable idle support.
The values are:
* 0: idle support is disabled
* 1: idle support is enabled
Default value is 1.
It is possible to set the pm_idle_enabled value only if the device
is in the `down` state (see /sys/bus/pci/devices/<BDF>/qat/state)
The following example shows how to change the pm_idle_enabled of
a device::
# cat /sys/bus/pci/devices/<BDF>/qat/state
up
# cat /sys/bus/pci/devices/<BDF>/qat/pm_idle_enabled
1
# echo down > /sys/bus/pci/devices/<BDF>/qat/state
# echo 0 > /sys/bus/pci/devices/<BDF>/qat/pm_idle_enabled
# echo up > /sys/bus/pci/devices/<BDF>/qat/state
# cat /sys/bus/pci/devices/<BDF>/qat/pm_idle_enabled
0
This attribute is only available for qat_4xxx devices.


@ -994,7 +994,7 @@ Description: This file shows the amount of physical memory needed
What: /sys/bus/platform/drivers/ufshcd/*/rpm_lvl
What: /sys/bus/platform/devices/*.ufs/rpm_lvl
Date: September 2014
Contact: Subhash Jadavani <subhashj@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This entry could be used to set or show the UFS device
runtime power management level. The current driver
implementation supports 7 levels with next target states:
@ -1021,7 +1021,7 @@ Description: This entry could be used to set or show the UFS device
What: /sys/bus/platform/drivers/ufshcd/*/rpm_target_dev_state
What: /sys/bus/platform/devices/*.ufs/rpm_target_dev_state
Date: February 2018
Contact: Subhash Jadavani <subhashj@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This entry shows the target power mode of an UFS device
for the chosen runtime power management level.
@ -1030,7 +1030,7 @@ Description: This entry shows the target power mode of an UFS device
What: /sys/bus/platform/drivers/ufshcd/*/rpm_target_link_state
What: /sys/bus/platform/devices/*.ufs/rpm_target_link_state
Date: February 2018
Contact: Subhash Jadavani <subhashj@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This entry shows the target state of an UFS UIC link
for the chosen runtime power management level.
@ -1039,7 +1039,7 @@ Description: This entry shows the target state of an UFS UIC link
What: /sys/bus/platform/drivers/ufshcd/*/spm_lvl
What: /sys/bus/platform/devices/*.ufs/spm_lvl
Date: September 2014
Contact: Subhash Jadavani <subhashj@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This entry could be used to set or show the UFS device
system power management level. The current driver
implementation supports 7 levels with next target states:
@ -1066,7 +1066,7 @@ Description: This entry could be used to set or show the UFS device
What: /sys/bus/platform/drivers/ufshcd/*/spm_target_dev_state
What: /sys/bus/platform/devices/*.ufs/spm_target_dev_state
Date: February 2018
Contact: Subhash Jadavani <subhashj@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This entry shows the target power mode of an UFS device
for the chosen system power management level.
@ -1075,7 +1075,7 @@ Description: This entry shows the target power mode of an UFS device
What: /sys/bus/platform/drivers/ufshcd/*/spm_target_link_state
What: /sys/bus/platform/devices/*.ufs/spm_target_link_state
Date: February 2018
Contact: Subhash Jadavani <subhashj@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This entry shows the target state of an UFS UIC link
for the chosen system power management level.
@ -1084,7 +1084,7 @@ Description: This entry shows the target state of an UFS UIC link
What: /sys/bus/platform/drivers/ufshcd/*/monitor/monitor_enable
What: /sys/bus/platform/devices/*.ufs/monitor/monitor_enable
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows the status of performance monitor enablement
and it can be used to start/stop the monitor. When the monitor
is stopped, the performance data collected is also cleared.
@ -1092,7 +1092,7 @@ Description: This file shows the status of performance monitor enablement
What: /sys/bus/platform/drivers/ufshcd/*/monitor/monitor_chunk_size
What: /sys/bus/platform/devices/*.ufs/monitor/monitor_chunk_size
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file tells the monitor to focus on requests transferring
data of specific chunk size (in Bytes). 0 means any chunk size.
It can only be changed when monitor is disabled.
@ -1100,7 +1100,7 @@ Description: This file tells the monitor to focus on requests transferring
What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_total_sectors
What: /sys/bus/platform/devices/*.ufs/monitor/read_total_sectors
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows how many sectors (in 512 Bytes) have been
sent from device to host after monitor gets started.
@ -1109,7 +1109,7 @@ Description: This file shows how many sectors (in 512 Bytes) have been
What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_total_busy
What: /sys/bus/platform/devices/*.ufs/monitor/read_total_busy
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows how long (in micro seconds) has been spent
sending data from device to host after monitor gets started.
@ -1118,7 +1118,7 @@ Description: This file shows how long (in micro seconds) has been spent
What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_nr_requests
What: /sys/bus/platform/devices/*.ufs/monitor/read_nr_requests
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows how many read requests have been sent after
monitor gets started.
@ -1127,7 +1127,7 @@ Description: This file shows how many read requests have been sent after
What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_max
What: /sys/bus/platform/devices/*.ufs/monitor/read_req_latency_max
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows the maximum latency (in micro seconds) of
read requests after monitor gets started.
@ -1136,7 +1136,7 @@ Description: This file shows the maximum latency (in micro seconds) of
What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_min
What: /sys/bus/platform/devices/*.ufs/monitor/read_req_latency_min
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows the minimum latency (in micro seconds) of
read requests after monitor gets started.
@ -1145,7 +1145,7 @@ Description: This file shows the minimum latency (in micro seconds) of
What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_avg
What: /sys/bus/platform/devices/*.ufs/monitor/read_req_latency_avg
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows the average latency (in micro seconds) of
read requests after monitor gets started.
@ -1154,7 +1154,7 @@ Description: This file shows the average latency (in micro seconds) of
What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_sum
What: /sys/bus/platform/devices/*.ufs/monitor/read_req_latency_sum
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows the total latency (in micro seconds) of
read requests sent after monitor gets started.
@ -1163,7 +1163,7 @@ Description: This file shows the total latency (in micro seconds) of
What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_total_sectors
What: /sys/bus/platform/devices/*.ufs/monitor/write_total_sectors
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows how many sectors (in 512 Bytes) have been sent
from host to device after monitor gets started.
@ -1172,7 +1172,7 @@ Description: This file shows how many sectors (in 512 Bytes) have been sent
What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_total_busy
What: /sys/bus/platform/devices/*.ufs/monitor/write_total_busy
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows how long (in micro seconds) has been spent
sending data from host to device after monitor gets started.
@ -1181,7 +1181,7 @@ Description: This file shows how long (in micro seconds) has been spent
What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_nr_requests
What: /sys/bus/platform/devices/*.ufs/monitor/write_nr_requests
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows how many write requests have been sent after
monitor gets started.
@ -1190,7 +1190,7 @@ Description: This file shows how many write requests have been sent after
What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_max
What: /sys/bus/platform/devices/*.ufs/monitor/write_req_latency_max
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows the maximum latency (in micro seconds) of write
requests after monitor gets started.
@ -1199,7 +1199,7 @@ Description: This file shows the maximum latency (in micro seconds) of write
What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_min
What: /sys/bus/platform/devices/*.ufs/monitor/write_req_latency_min
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows the minimum latency (in micro seconds) of write
requests after monitor gets started.
@ -1208,7 +1208,7 @@ Description: This file shows the minimum latency (in micro seconds) of write
What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_avg
What: /sys/bus/platform/devices/*.ufs/monitor/write_req_latency_avg
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows the average latency (in micro seconds) of write
requests after monitor gets started.
@ -1217,7 +1217,7 @@ Description: This file shows the average latency (in micro seconds) of write
What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_sum
What: /sys/bus/platform/devices/*.ufs/monitor/write_req_latency_sum
Date: January 2021
Contact: Can Guo <cang@codeaurora.org>
Contact: Can Guo <quic_cang@quicinc.com>
Description: This file shows the total latency (in micro seconds) of write
requests after monitor gets started.
@ -1226,7 +1226,7 @@ Description: This file shows the total latency (in micro seconds) of write
What: /sys/bus/platform/drivers/ufshcd/*/device_descriptor/wb_presv_us_en
What: /sys/bus/platform/devices/*.ufs/device_descriptor/wb_presv_us_en
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows if preserve user-space was configured
The file is read only.
@ -1234,7 +1234,7 @@ Description: This entry shows if preserve user-space was configured
What: /sys/bus/platform/drivers/ufshcd/*/device_descriptor/wb_shared_alloc_units
What: /sys/bus/platform/devices/*.ufs/device_descriptor/wb_shared_alloc_units
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the shared allocated units of WB buffer
The file is read only.
@ -1242,7 +1242,7 @@ Description: This entry shows the shared allocated units of WB buffer
What: /sys/bus/platform/drivers/ufshcd/*/device_descriptor/wb_type
What: /sys/bus/platform/devices/*.ufs/device_descriptor/wb_type
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the configured WB type.
0x1 for shared buffer mode. 0x0 for dedicated buffer mode.
@ -1251,7 +1251,7 @@ Description: This entry shows the configured WB type.
What: /sys/bus/platform/drivers/ufshcd/*/geometry_descriptor/wb_buff_cap_adj
What: /sys/bus/platform/devices/*.ufs/geometry_descriptor/wb_buff_cap_adj
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the total user-space decrease in shared
buffer mode.
The value of this parameter is 3 for TLC NAND when SLC mode
@ -1262,7 +1262,7 @@ Description: This entry shows the total user-space decrease in shared
What: /sys/bus/platform/drivers/ufshcd/*/geometry_descriptor/wb_max_alloc_units
What: /sys/bus/platform/devices/*.ufs/geometry_descriptor/wb_max_alloc_units
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the Maximum total WriteBooster Buffer size
which is supported by the entire device.
@ -1271,7 +1271,7 @@ Description: This entry shows the Maximum total WriteBooster Buffer size
What: /sys/bus/platform/drivers/ufshcd/*/geometry_descriptor/wb_max_wb_luns
What: /sys/bus/platform/devices/*.ufs/geometry_descriptor/wb_max_wb_luns
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the maximum number of luns that can support
WriteBooster.
@ -1280,7 +1280,7 @@ Description: This entry shows the maximum number of luns that can support
What: /sys/bus/platform/drivers/ufshcd/*/geometry_descriptor/wb_sup_red_type
What: /sys/bus/platform/devices/*.ufs/geometry_descriptor/wb_sup_red_type
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: The supportability of user space reduction mode
and preserve user space mode.
00h: WriteBooster Buffer can be configured only in
@ -1295,7 +1295,7 @@ Description: The supportability of user space reduction mode
What: /sys/bus/platform/drivers/ufshcd/*/geometry_descriptor/wb_sup_wb_type
What: /sys/bus/platform/devices/*.ufs/geometry_descriptor/wb_sup_wb_type
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: The supportability of WriteBooster Buffer type.
=== ==========================================================
@ -1310,7 +1310,7 @@ Description: The supportability of WriteBooster Buffer type.
What: /sys/bus/platform/drivers/ufshcd/*/flags/wb_enable
What: /sys/bus/platform/devices/*.ufs/flags/wb_enable
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the status of WriteBooster.
== ============================
@ -1323,7 +1323,7 @@ Description: This entry shows the status of WriteBooster.
What: /sys/bus/platform/drivers/ufshcd/*/flags/wb_flush_en
What: /sys/bus/platform/devices/*.ufs/flags/wb_flush_en
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows if flush is enabled.
== =================================
@ -1336,7 +1336,7 @@ Description: This entry shows if flush is enabled.
What: /sys/bus/platform/drivers/ufshcd/*/flags/wb_flush_during_h8
What: /sys/bus/platform/devices/*.ufs/flags/wb_flush_during_h8
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: Flush WriteBooster Buffer during hibernate state.
== =================================================
@ -1351,7 +1351,7 @@ Description: Flush WriteBooster Buffer during hibernate state.
What: /sys/bus/platform/drivers/ufshcd/*/attributes/wb_avail_buf
What: /sys/bus/platform/devices/*.ufs/attributes/wb_avail_buf
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the amount of unused WriteBooster buffer
available.
@ -1360,7 +1360,7 @@ Description: This entry shows the amount of unused WriteBooster buffer
What: /sys/bus/platform/drivers/ufshcd/*/attributes/wb_cur_buf
What: /sys/bus/platform/devices/*.ufs/attributes/wb_cur_buf
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the amount of unused current buffer.
The file is read only.
@ -1368,7 +1368,7 @@ Description: This entry shows the amount of unused current buffer.
What: /sys/bus/platform/drivers/ufshcd/*/attributes/wb_flush_status
What: /sys/bus/platform/devices/*.ufs/attributes/wb_flush_status
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the flush operation status.
@ -1385,7 +1385,7 @@ Description: This entry shows the flush operation status.
What: /sys/bus/platform/drivers/ufshcd/*/attributes/wb_life_time_est
What: /sys/bus/platform/devices/*.ufs/attributes/wb_life_time_est
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows an indication of the WriteBooster Buffer
lifetime based on the amount of performed program/erase cycles
@ -1399,7 +1399,7 @@ Description: This entry shows an indication of the WriteBooster Buffer
What: /sys/class/scsi_device/*/device/unit_descriptor/wb_buf_alloc_units
Date: June 2020
Contact: Asutosh Das <asutoshd@codeaurora.org>
Contact: Asutosh Das <quic_asutoshd@quicinc.com>
Description: This entry shows the configured size of WriteBooster buffer.
0400h corresponds to 4GB.
@ -1426,6 +1426,17 @@ Description: This entry shows the status of WriteBooster buffer flushing
If flushing is enabled, the device executes the flush
operation when the command queue is empty.
What: /sys/bus/platform/drivers/ufshcd/*/wb_flush_threshold
What: /sys/bus/platform/devices/*.ufs/wb_flush_threshold
Date: June 2023
Contact: Lu Hongfei <luhongfei@vivo.com>
Description:
wb_flush_threshold represents the threshold for flushing WriteBooster buffer,
whose value is expressed in units of 10%, such as '1' representing 10%,
'2' representing 20%, and so on.
If avail_wb_buff < wb_flush_threshold, it indicates that WriteBooster buffer needs to
be flushed, otherwise it is not necessary.
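The comparison described for wb_flush_threshold can be sketched as follows. This is only an illustration, not the driver's code: the function names are hypothetical, and it assumes that avail_wb_buff is reported in the same 10% units as the threshold.

```python
def threshold_to_percent(wb_flush_threshold: int) -> int:
    """Convert a threshold in 10% units to a percentage ('1' -> 10, '2' -> 20)."""
    return wb_flush_threshold * 10


def wb_flush_needed(avail_wb_buff: int, wb_flush_threshold: int) -> bool:
    """Return True when the WriteBooster buffer should be flushed.

    Both arguments are assumed to use the same 10% units, so the check
    is the plain comparison quoted in the description.
    """
    return avail_wb_buff < wb_flush_threshold
```

With a threshold of 4 (40%), an available buffer of 3 (30%) asks for a flush, while 4 or more does not.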
What: /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_version
What: /sys/bus/platform/devices/*.ufs/device_descriptor/hpb_version
Date: June 2021

View file

@ -1,13 +1,13 @@
What: /sys/fs/ocfs2/
Date: April 2008
Contact: ocfs2-devel@oss.oracle.com
Contact: ocfs2-devel@lists.linux.dev
Description:
The /sys/fs/ocfs2 directory contains knobs used by the
ocfs2-tools to interact with the filesystem.
What: /sys/fs/ocfs2/max_locking_protocol
Date: April 2008
Contact: ocfs2-devel@oss.oracle.com
Contact: ocfs2-devel@lists.linux.dev
Description:
The /sys/fs/ocfs2/max_locking_protocol file displays version
of ocfs2 locking supported by the filesystem. This version
@ -28,7 +28,7 @@ Description:
What: /sys/fs/ocfs2/loaded_cluster_plugins
Date: April 2008
Contact: ocfs2-devel@oss.oracle.com
Contact: ocfs2-devel@lists.linux.dev
Description:
The /sys/fs/ocfs2/loaded_cluster_plugins file describes
the available plugins to support ocfs2 cluster operation.
@ -48,7 +48,7 @@ Description:
What: /sys/fs/ocfs2/active_cluster_plugin
Date: April 2008
Contact: ocfs2-devel@oss.oracle.com
Contact: ocfs2-devel@lists.linux.dev
Description:
The /sys/fs/ocfs2/active_cluster_plugin displays which
cluster plugin is currently in use by the filesystem.
@ -65,7 +65,7 @@ Description:
What: /sys/fs/ocfs2/cluster_stack
Date: April 2008
Contact: ocfs2-devel@oss.oracle.com
Contact: ocfs2-devel@lists.linux.dev
Description:
The /sys/fs/ocfs2/cluster_stack file contains the name
of current ocfs2 cluster stack. This value is set by
@ -86,4 +86,4 @@ Description:
stack return an error.
Users:
ocfs2-tools <ocfs2-tools-devel@oss.oracle.com>
ocfs2-tools <ocfs2-tools-devel@lists.linux.dev>

View file

@ -3,5 +3,7 @@ Date: September 2022
KernelVersion: 6.1
Contact: Armin Wolf <W_Armin@gmx.de>
Description:
Reports the Dell ePPID (electronic Dell Piece Part Identification)
Reports the Dell ePPID (electronic Piece Part Identification)
of the ACPI battery.
See Documentation/wmi/devices/dell-wmi-ddv.rst for details.

View file

@ -75,3 +75,12 @@ KernelVersion: 6.4
Contact: "Liming Sun <limings@nvidia.com>"
Description:
The file used to access the BlueField boot fifo.
What: /sys/bus/platform/devices/MLNXBF04:00/rsh_log
Date: May 2023
KernelVersion: 6.4
Contact: "Liming Sun <limings@nvidia.com>"
Description:
The file used to write BlueField boot log with the format
"[INFO|WARN|ERR|ASSERT ]<msg>". Log level 'INFO' is used by
default if not specified.

View file

@ -88,13 +88,10 @@ commands can be used::
# echo 0x104c > functions/pci_epf_ntb/func1/vendorid
# echo 0xb00d > functions/pci_epf_ntb/func1/deviceid
In order to configure NTB specific attributes, a new sub-directory to func1
should be created::
# mkdir functions/pci_epf_ntb/func1/pci_epf_ntb.0/
The NTB function driver will populate this directory with various attributes
that can be configured by the user::
The PCI endpoint framework also automatically creates a sub-directory in the
function attribute directory. This sub-directory has the same name as the name
of the function device and is populated with the following NTB specific
attributes that can be configured by the user::
# ls functions/pci_epf_ntb/func1/pci_epf_ntb.0/
db_count mw1 mw2 mw3 mw4 num_mws

View file

@ -84,13 +84,10 @@ commands can be used::
# echo 0x1957 > functions/pci_epf_vntb/func1/vendorid
# echo 0x0809 > functions/pci_epf_vntb/func1/deviceid
In order to configure NTB specific attributes, a new sub-directory to func1
should be created::
# mkdir functions/pci_epf_vntb/func1/pci_epf_vntb.0/
The NTB function driver will populate this directory with various attributes
that can be configured by the user::
The PCI endpoint framework also automatically creates a sub-directory in the
function attribute directory. This sub-directory has the same name as the name
of the function device and is populated with the following NTB specific
attributes that can be configured by the user::
# ls functions/pci_epf_vntb/func1/pci_epf_vntb.0/
db_count mw1 mw2 mw3 mw4 num_mws
@ -103,7 +100,7 @@ A sample configuration for NTB function is given below::
# echo 1 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/num_mws
# echo 0x100000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1
A sample configuration for virtual NTB driver for virutal PCI bus::
A sample configuration for virtual NTB driver for virtual PCI bus::
# echo 0x1957 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_vid
# echo 0x080A > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_pid

View file

@ -290,7 +290,7 @@ PCI_IRQ_MSI or PCI_IRQ_MSIX flags.
List of device drivers MSI(-X) APIs
===================================
The PCI/MSI subystem has a dedicated C file for its exported device driver
The PCI/MSI subsystem has a dedicated C file for its exported device driver
APIs — `drivers/pci/msi/api.c`. The following functions are exported:
.. kernel-doc:: drivers/pci/msi/api.c

View file

@ -364,7 +364,7 @@ Note, however, not all failures are truly "permanent". Some are
caused by over-heating, some by a poorly seated card. Many
PCI error events are caused by software bugs, e.g. DMA's to
wild addresses or bogus split transactions due to programming
errors. See the discussion in powerpc/eeh-pci-error-recovery.txt
errors. See the discussion in Documentation/powerpc/eeh-pci-error-recovery.rst
for additional detail on real-life experience of the causes of
software errors.

View file

@ -16,62 +16,61 @@ Overview
About this guide
----------------
This guide describes the basics of the PCI Express Advanced Error
This guide describes the basics of the PCI Express (PCIe) Advanced Error
Reporting (AER) driver and provides information on how to use it, as
well as how to enable the drivers of endpoint devices to conform with
PCI Express AER driver.
well as how to enable the drivers of Endpoint devices to conform with
the PCIe AER driver.
What is the PCI Express AER Driver?
-----------------------------------
What is the PCIe AER Driver?
----------------------------
PCI Express error signaling can occur on the PCI Express link itself
or on behalf of transactions initiated on the link. PCI Express
PCIe error signaling can occur on the PCIe link itself
or on behalf of transactions initiated on the link. PCIe
defines two error reporting paradigms: the baseline capability and
the Advanced Error Reporting capability. The baseline capability is
required of all PCI Express components providing a minimum defined
required of all PCIe components providing a minimum defined
set of error reporting requirements. Advanced Error Reporting
capability is implemented with a PCI Express advanced error reporting
capability is implemented with a PCIe Advanced Error Reporting
extended capability structure providing more robust error reporting.
The PCI Express AER driver provides the infrastructure to support PCI
Express Advanced Error Reporting capability. The PCI Express AER
driver provides three basic functions:
The PCIe AER driver provides the infrastructure to support PCIe Advanced
Error Reporting capability. The PCIe AER driver provides three basic
functions:
- Gathers the comprehensive error information if errors occurred.
- Reports error to the users.
- Performs error recovery actions.
AER driver only attaches root ports which support PCI-Express AER
capability.
The AER driver only attaches to Root Ports and RCECs that support the PCIe
AER capability.
User Guide
==========
Include the PCI Express AER Root Driver into the Linux Kernel
-------------------------------------------------------------
Include the PCIe AER Root Driver into the Linux Kernel
------------------------------------------------------
The PCI Express AER Root driver is a Root Port service driver attached
to the PCI Express Port Bus driver. If a user wants to use it, the driver
has to be compiled. Option CONFIG_PCIEAER supports this capability. It
depends on CONFIG_PCIEPORTBUS, so pls. set CONFIG_PCIEPORTBUS=y and
CONFIG_PCIEAER = y.
The PCIe AER driver is a Root Port service driver attached
via the PCIe Port Bus driver. If a user wants to use it, the driver
must be compiled. It is enabled with CONFIG_PCIEAER, which
depends on CONFIG_PCIEPORTBUS.
Load PCI Express AER Root Driver
--------------------------------
Load PCIe AER Root Driver
-------------------------
Some systems have AER support in firmware. Enabling Linux AER support at
the same time the firmware handles AER may result in unpredictable
the same time the firmware handles AER would result in unpredictable
behavior. Therefore, Linux does not handle AER events unless the firmware
grants AER control to the OS via the ACPI _OSC method. See the PCI FW 3.0
grants AER control to the OS via the ACPI _OSC method. See the PCI Firmware
Specification for details regarding _OSC usage.
AER error output
----------------
When a PCIe AER error is captured, an error message will be output to
console. If it's a correctable error, it is output as a warning.
console. If it's a correctable error, it is output as an info message.
Otherwise, it is printed as an error. So users could choose different
log level to filter out correctable error messages.
@ -82,9 +81,9 @@ Below shows an example::
0000:50:00.0: [20] Unsupported Request (First)
0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100
In the example, 'Requester ID' means the ID of the device who sends
the error message to root port. Pls. refer to pci express specs for
other fields.
In the example, 'Requester ID' means the ID of the device that sent
the error message to the Root Port. Please refer to PCIe specs for other
fields.
AER Statistics / Counters
-------------------------
@ -96,65 +95,56 @@ Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
Developer Guide
===============
To enable AER aware support requires a software driver to configure
the AER capability structure within its device and to provide callbacks.
To enable error recovery, a software driver must provide callbacks.
To support AER better, developers need understand how AER does work
firstly.
To support AER better, developers need to understand how AER works.
PCI Express errors are classified into two types: correctable errors
and uncorrectable errors. This classification is based on the impacts
PCIe errors are classified into two types: correctable errors
and uncorrectable errors. This classification is based on the impact
of those errors, which may result in degraded performance or function
failure.
Correctable errors pose no impacts on the functionality of the
interface. The PCI Express protocol can recover without any software
interface. The PCIe protocol can recover without any software
intervention or any loss of data. These errors are detected and
corrected by hardware. Unlike correctable errors, uncorrectable
corrected by hardware.
Unlike correctable errors, uncorrectable
errors impact functionality of the interface. Uncorrectable errors
can cause a particular transaction or a particular PCI Express link
can cause a particular transaction or a particular PCIe link
to be unreliable. Depending on those error conditions, uncorrectable
errors are further classified into non-fatal errors and fatal errors.
Non-fatal errors cause the particular transaction to be unreliable,
but the PCI Express link itself is fully functional. Fatal errors, on
but the PCIe link itself is fully functional. Fatal errors, on
the other hand, cause the link to be unreliable.
When AER is enabled, a PCI Express device will automatically send an
error message to the PCIe root port above it when the device captures
When PCIe error reporting is enabled, a device will automatically send an
error message to the Root Port above it when it captures
an error. The Root Port, upon receiving an error reporting message,
internally processes and logs the error message in its PCI Express
capability structure. Error information being logged includes storing
internally processes and logs the error message in its AER
Capability structure. Error information being logged includes storing
the error reporting agent's requestor ID into the Error Source
Identification Registers and setting the error bits of the Root Error
Status Register accordingly. If AER error reporting is enabled in Root
Error Command Register, the Root Port generates an interrupt if an
Status Register accordingly. If AER error reporting is enabled in the Root
Error Command Register, the Root Port generates an interrupt when an
error is detected.
Note that the errors as described above are related to the PCI Express
Note that the errors as described above are related to the PCIe
hierarchy and links. These errors do not include any device specific
errors because device specific errors will still get sent directly to
the device driver.
Configure the AER capability structure
--------------------------------------
AER aware drivers of PCI Express component need change the device
control registers to enable AER. They also could change AER registers,
including mask and severity registers. Helper function
pci_enable_pcie_error_reporting could be used to enable AER. See
section 3.3.
Provide callbacks
-----------------
callback reset_link to reset pci express link
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
callback reset_link to reset PCIe link
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This callback is used to reset the pci express physical link when a
fatal error happens. The root port aer service driver provides a
default reset_link function, but different upstream ports might
have different specifications to reset pci express link, so all
upstream ports should provide their own reset_link functions.
This callback is used to reset the PCIe physical link when a
fatal error happens. The Root Port AER service driver provides a
default reset_link function, but different Upstream Ports might
have different specifications to reset the PCIe link, so
Upstream Port drivers may provide their own reset_link functions.
Section 3.2.2.2 provides more detailed info on when to call
reset_link.
@ -162,24 +152,24 @@ reset_link.
PCI error-recovery callbacks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The PCI Express AER Root driver uses error callbacks to coordinate
The PCIe AER Root driver uses error callbacks to coordinate
with downstream device drivers associated with a hierarchy in question
when performing error recovery actions.
Data struct pci_driver has a pointer, err_handler, to point to
pci_error_handlers who consists of a couple of callback function
pointers. AER driver follows the rules defined in
pci-error-recovery.txt except pci express specific parts (e.g.
reset_link). Pls. refer to pci-error-recovery.txt for detailed
pointers. The AER driver follows the rules defined in
pci-error-recovery.rst except PCIe-specific parts (e.g.
reset_link). Please refer to pci-error-recovery.rst for detailed
definitions of the callbacks.
Below sections specify when to call the error callback functions.
The sections below specify when to call the error callback functions.
Correctable errors
~~~~~~~~~~~~~~~~~~
Correctable errors pose no impacts on the functionality of
the interface. The PCI Express protocol can recover without any
the interface. The PCIe protocol can recover without any
software intervention or any loss of data. These errors do not
require any recovery actions. The AER driver clears the device's
correctable error status register accordingly and logs these errors.
@ -190,12 +180,12 @@ Non-correctable (non-fatal and fatal) errors
If an error message indicates a non-fatal error, performing link reset
at upstream is not required. The AER driver calls error_detected(dev,
pci_channel_io_normal) to all drivers associated within a hierarchy in
question. for example::
question. For example::
EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort
Endpoint <==> Downstream Port B <==> Upstream Port A <==> Root Port
If Upstream port A captures an AER error, the hierarchy consists of
Downstream port B and EndPoint.
If Upstream Port A captures an AER error, the hierarchy consists of
Downstream Port B and Endpoint.
A driver may return PCI_ERS_RESULT_CAN_RECOVER,
PCI_ERS_RESULT_DISCONNECT, or PCI_ERS_RESULT_NEED_RESET, depending on
@ -212,36 +202,11 @@ to reset the link. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER
and reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
to mmio_enabled.
helper functions
----------------
::
int pci_enable_pcie_error_reporting(struct pci_dev *dev);
pci_enable_pcie_error_reporting enables the device to send error
messages to root port when an error is detected. Note that devices
don't enable the error reporting by default, so device drivers need
call this function to enable it.
::
int pci_disable_pcie_error_reporting(struct pci_dev *dev);
pci_disable_pcie_error_reporting disables the device to send error
messages to root port when an error is detected.
::
int pci_aer_clear_nonfatal_status(struct pci_dev *dev);`
pci_aer_clear_nonfatal_status clears non-fatal errors in the uncorrectable
error status register.
Frequent Asked Questions
------------------------
Q:
What happens if a PCI Express device driver does not provide an
What happens if a PCIe device driver does not provide an
error recovery handler (pci_driver->err_handler is equal to NULL)?
A:
@ -257,24 +222,6 @@ A:
Fatal error recovery will fail if the errors are reported by the
upstream ports who are attached by the service driver.
Q:
How does this infrastructure deal with driver that is not PCI
Express aware?
A:
This infrastructure calls the error callback functions of the
driver when an error happens. But if the driver is not aware of
PCI Express, the device might not report its own errors to root
port.
Q:
What modifications will that driver need to make it compatible
with the PCI Express AER Root driver?
A:
It could call the helper functions to enable AER in devices and
cleanup uncorrectable status register. Pls. refer to section 3.3.
Software error injection
========================
@ -296,5 +243,5 @@ from:
https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/
More information about aer-inject can be found in the document comes
with its source code.
More information about aer-inject can be found in the document in
its source code.

View file

@ -2071,41 +2071,7 @@ call.
Because RCU avoids interrupting idle CPUs, it is illegal to execute an
RCU read-side critical section on an idle CPU. (Kernels built with
``CONFIG_PROVE_RCU=y`` will splat if you try it.) The RCU_NONIDLE()
macro and ``_rcuidle`` event tracing is provided to work around this
restriction. In addition, rcu_is_watching() may be used to test
whether or not it is currently legal to run RCU read-side critical
sections on this CPU. I learned of the need for diagnostics on the one
hand and RCU_NONIDLE() on the other while inspecting idle-loop code.
Steven Rostedt supplied ``_rcuidle`` event tracing, which is used quite
heavily in the idle loop. However, there are some restrictions on the
code placed within RCU_NONIDLE():
#. Blocking is prohibited. In practice, this is not a serious
restriction given that idle tasks are prohibited from blocking to
begin with.
#. Although nesting RCU_NONIDLE() is permitted, they cannot nest
indefinitely deeply. However, given that they can be nested on the
order of a million deep, even on 32-bit systems, this should not be a
serious restriction. This nesting limit would probably be reached
long after the compiler OOMed or the stack overflowed.
#. Any code path that enters RCU_NONIDLE() must sequence out of that
same RCU_NONIDLE(). For example, the following is grossly
illegal:
::
1 RCU_NONIDLE({
2 do_something();
3 goto bad_idea; /* BUG!!! */
4 do_something_else();});
5 bad_idea:
It is just as illegal to transfer control into the middle of
RCU_NONIDLE()'s argument. Yes, in theory, you could transfer in
as long as you also transferred out, but in practice you could also
expect to get sharply worded review comments.
``CONFIG_PROVE_RCU=y`` will splat if you try it.)
It is similarly socially unacceptable to interrupt an ``nohz_full`` CPU
running in userspace. RCU must therefore track ``nohz_full`` userspace

View file

@ -1117,7 +1117,6 @@ All: lockdep-checked RCU utility APIs::
RCU_LOCKDEP_WARN
rcu_sleep_check
RCU_NONIDLE
All: Unchecked RCU-protected pointer access::

View file

@ -103,7 +103,7 @@ allows a persistent, OS independent way of storing the user defined SSDTs. There
is also work underway to implement EFI support for loading user defined SSDTs
and using this method will make it easier to convert to the EFI loading
mechanism when that will arrive. To enable it, the
CONFIG_EFI_CUSTOM_SSDT_OVERLAYS shoyld be chosen to y.
CONFIG_EFI_CUSTOM_SSDT_OVERLAYS should be chosen to y.
In order to load SSDTs from an EFI variable the ``"efivar_ssdt=..."`` kernel
command line parameter can be used (the name has a limitation of 16 characters).

View file

@ -508,9 +508,6 @@ cache_miss_collisions
cache miss, but raced with a write and data was already present (usually 0
since the synchronization for cache misses was rewritten)
cache_readaheads
Count of times readahead occurred.
Sysfs - cache set
~~~~~~~~~~~~~~~~~

View file

@ -297,7 +297,7 @@ Lock order is as follows::
Page lock (PG_locked bit of page->flags)
mm->page_table_lock or split pte_lock
lock_page_memcg (memcg->move_lock)
folio_memcg_lock (memcg->move_lock)
mapping->i_pages lock
lruvec->lru_lock.

View file

@ -1580,6 +1580,13 @@ PAGE_SIZE multiple when read back.
Healthy workloads are not expected to reach this limit.
memory.swap.peak
A read-only single value file which exists on non-root
cgroups.
The max swap usage recorded for the cgroup and its
descendants since the creation of the cgroup.
memory.swap.max
A read-write single value file which exists on non-root
cgroups. The default is "max".
@ -2022,31 +2029,33 @@ that attribute:
no-change
Do not modify the I/O priority class.
none-to-rt
For requests that do not have an I/O priority class (NONE),
change the I/O priority class into RT. Do not modify
the I/O priority class of other requests.
promote-to-rt
For requests that have a non-RT I/O priority class, change it into RT.
Also change the priority level of these requests to 4. Do not modify
the I/O priority of requests that have priority class RT.
restrict-to-be
For requests that do not have an I/O priority class or that have I/O
priority class RT, change it into BE. Do not modify the I/O priority
class of requests that have priority class IDLE.
priority class RT, change it into BE. Also change the priority level
of these requests to 0. Do not modify the I/O priority class of
requests that have priority class IDLE.
idle
Change the I/O priority class of all requests into IDLE, the lowest
I/O priority class.
none-to-rt
Deprecated. Just an alias for promote-to-rt.
The following numerical values are associated with the I/O priority policies:
+-------------+---+
| no-change | 0 |
+-------------+---+
| none-to-rt | 1 |
+-------------+---+
| rt-to-be | 2 |
+-------------+---+
| all-to-idle | 3 |
+-------------+---+
+----------------+---+
| no-change | 0 |
+----------------+---+
| rt-to-be | 2 |
+----------------+---+
| all-to-idle | 3 |
+----------------+---+
The numerical value that corresponds to each I/O priority class is as follows:
@ -2062,9 +2071,13 @@ The numerical value that corresponds to each I/O priority class is as follows:
The algorithm to set the I/O priority class for a request is as follows:
- Translate the I/O priority class policy into a number.
- Change the request I/O priority class into the maximum of the I/O priority
class policy number and the numerical I/O priority class.
- If I/O priority class policy is promote-to-rt, change the request I/O
priority class to IOPRIO_CLASS_RT and change the request I/O priority
level to 4.
- If I/O priority class policy is not promote-to-rt, translate the I/O priority
class policy into a number, then change the request I/O priority class
into the maximum of the I/O priority class policy number and the numerical
I/O priority class.
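The algorithm above can be sketched in a few lines of Python. This is an illustration only, not the kernel implementation. The function name ``apply_policy`` is hypothetical, the class numbers are those of the kernel's ioprio definitions (NONE=0, RT=1, BE=2, IDLE=3), and the sketch assumes that the table rows ``rt-to-be`` and ``all-to-idle`` correspond to the ``restrict-to-be`` and ``idle`` policies. The handling of priority levels follows the per-policy descriptions rather than the two-step list.

```python
# I/O priority class numbers, as in the kernel's ioprio definitions.
IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE = 0, 1, 2, 3

# Policy numbers from the table; the key names are an assumption that maps
# the table rows "rt-to-be" and "all-to-idle" onto the attribute names.
POLICY_NUMBER = {"no-change": 0, "restrict-to-be": 2, "idle": 3}


def apply_policy(policy, prio_class, level):
    """Return the (class, level) a request ends up with under a policy."""
    if policy == "promote-to-rt":
        # Requests that are already RT are left untouched.
        if prio_class == IOPRIO_CLASS_RT:
            return prio_class, level
        return IOPRIO_CLASS_RT, 4

    # All other policies: take the maximum of the policy number and the
    # numerical I/O priority class of the request.
    new_class = max(POLICY_NUMBER[policy], prio_class)

    # restrict-to-be also resets the level of the requests it moves to BE.
    if policy == "restrict-to-be" and new_class != prio_class:
        level = 0
    return new_class, level
```

Because higher class numbers mean lower priority, taking the maximum can only demote a request, which is why promote-to-rt needs its own branch.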
PID
---
@ -2437,7 +2450,7 @@ Miscellaneous controller provides 3 interface files. If two misc resources (res_
res_b 10
misc.current
A read-only flat-keyed file shown in the non-root cgroups. It shows
A read-only flat-keyed file shown in all cgroups. It shows
the current usage of the resources in the cgroup and its children.::
$ cat misc.current

View file

@ -67,6 +67,16 @@ Optional feature parameters:
Perform the replacement only if bio->bi_opf has all the
selected flags set.
random_read_corrupt <probability>
During <down interval>, replace a random byte in a read bio
with a random value. probability is an integer between
0 and 1000000000 meaning 0% to 100% probability of corruption.
random_write_corrupt <probability>
During <down interval>, replace a random byte in a write bio
with a random value. probability is an integer between
0 and 1000000000 meaning 0% to 100% probability of corruption.
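As a hedged illustration of the probability scale, the sketch below converts a percentage into the integer argument, assuming the mapping between 0 and 1000000000 is linear. The helper name ``corrupt_probability_arg`` is hypothetical and not part of dm-flakey.

```python
# Full scale of the <probability> argument: 1000000000 means 100%.
PROBABILITY_SCALE = 1_000_000_000


def corrupt_probability_arg(percent):
    """Convert a percentage (0..100) into the integer <probability> value."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Assumes a linear mapping from 0%..100% onto 0..PROBABILITY_SCALE.
    return round(percent * PROBABILITY_SCALE / 100)
```

For instance, a 50% chance of corruption corresponds to the argument 500000000.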
Examples:
Replaces the 32nd byte of READ bios with the value 1::

View file

@ -25,7 +25,7 @@ mode it calculates and verifies the integrity tag internally. In this
mode, the dm-integrity target can be used to detect silent data
corruption on the disk or in the I/O path.
There's an alternate mode of operation where dm-integrity uses bitmap
There's an alternate mode of operation where dm-integrity uses a bitmap
instead of a journal. If a bit in the bitmap is 1, the corresponding
region's data and integrity tags are not synchronized - if the machine
crashes, the unsynchronized regions will be recalculated. The bitmap mode
@ -38,6 +38,15 @@ the device. But it will only format the device if the superblock contains
zeroes. If the superblock is neither valid nor zeroed, the dm-integrity
target can't be loaded.
Accesses to the on-disk metadata area containing checksums (aka tags) are
buffered using dm-bufio. When an access to any given metadata area
occurs, each unique metadata area gets its own buffer(s). The buffer size
is capped at the size of the metadata area, but may be smaller, thereby
requiring multiple buffers to represent the full metadata area. A smaller
buffer size will produce a smaller resulting read/write operation to the
metadata area for small reads/writes. The metadata is still read even in
a full write to the data covered by a single buffer.
To use the target for the first time:
1. overwrite the superblock with zeroes
@ -93,7 +102,7 @@ journal_sectors:number
device. If the device is already formatted, the value from the
superblock is used.
interleave_sectors:number
interleave_sectors:number (default 32768)
The number of interleaved sectors. This value is rounded down to
a power of two. If the device is already formatted, the value from
the superblock is used.
@ -102,20 +111,16 @@ meta_device:device
Don't interleave the data and metadata on the device. Use a
separate device for metadata.
buffer_sectors:number
The number of sectors in one buffer. The value is rounded down to
a power of two.
buffer_sectors:number (default 128)
The number of sectors in one metadata buffer. The value is rounded
down to a power of two.
The tag area is accessed using buffers, the buffer size is
configurable. The large buffer size means that the I/O size will
be larger, but there could be less I/Os issued.
journal_watermark:number
journal_watermark:number (default 50)
The journal watermark in percents. When the size of the journal
exceeds this watermark, the thread that flushes the journal will
be started.
commit_time:number
commit_time:number (default 10000)
Commit time in milliseconds. When this time passes, the journal is
written. The journal is also written immediately if the FLUSH
request is received.
@ -163,11 +168,10 @@ journal_mac:algorithm(:key) (the key is optional)
the journal. Thus, modified sector number would be detected at
this stage.
block_size:number
The size of a data block in bytes. The larger the block size the
block_size:number (default 512)
The size of a data block in bytes. The larger the block size the
less overhead there is for per-block integrity metadata.
Supported values are 512, 1024, 2048 and 4096 bytes. If not
specified the default block size is 512 bytes.
Supported values are 512, 1024, 2048 and 4096 bytes.
sectors_per_bit:number
In the bitmap mode, this parameter specifies the number of
@ -209,6 +213,12 @@ table and swap the tables with suspend and resume). The other arguments
should not be changed when reloading the target because the layout of disk
data depend on them and the reloaded target would be non-functional.
For example, on a device using the default interleave_sectors of 32768, a
block_size of 512, and an internal_hash of crc32c with a tag size of 4
bytes, it will take 128 KiB of tags to track a full data area, requiring
256 sectors of metadata per data area. With the default buffer_sectors of
128, that means there will be 2 buffers per metadata area, or 2 buffers
per 16 MiB of data.
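The arithmetic in this example can be reproduced with a short Python sketch. The function name ``metadata_layout`` and the returned keys are made up for illustration, and the sketch assumes 512-byte sectors and one tag per data block, as in the example.

```python
SECTOR_SIZE = 512  # bytes per sector, as assumed in the example


def metadata_layout(interleave_sectors=32768, block_size=512,
                    tag_size=4, buffer_sectors=128):
    """Compute tag and buffer sizes for one interleaved data area."""
    data_bytes = interleave_sectors * SECTOR_SIZE
    blocks = data_bytes // block_size           # one tag per data block
    tag_bytes = blocks * tag_size
    # Round up to whole sectors and whole buffers.
    metadata_sectors = -(-tag_bytes // SECTOR_SIZE)
    buffers = -(-metadata_sectors // buffer_sectors)
    return {
        "data_bytes": data_bytes,
        "tag_bytes": tag_bytes,
        "metadata_sectors": metadata_sectors,
        "buffers": buffers,
    }
```

With the defaults this gives 131072 bytes (128 KiB) of tags, 256 metadata sectors and 2 buffers for each 16 MiB data area, matching the figures in the text.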
Status line:
@ -286,7 +296,8 @@ The layout of the formatted block device:
Each run contains:
* tag area - it contains integrity tags. There is one tag for each
sector in the data area
sector in the data area. The size of this area is always 4KiB or
greater.
* data area - it contains data sectors. The number of data sectors
in one run must be a power of two. log2 of this value is stored
in the superblock.

View file

@ -1,17 +1,17 @@
acpi= [HW,ACPI,X86,ARM64]
acpi= [HW,ACPI,X86,ARM64,RISCV64]
Advanced Configuration and Power Interface
Format: { force | on | off | strict | noirq | rsdt |
copy_dsdt }
force -- enable ACPI if default was off
on -- enable ACPI but allow fallback to DT [arm64]
on -- enable ACPI but allow fallback to DT [arm64,riscv64]
off -- disable ACPI if default was on
noirq -- do not use ACPI for IRQ routing
strict -- Be less tolerant of platforms that are not
strictly ACPI specification compliant.
rsdt -- prefer RSDT over (default) XSDT
copy_dsdt -- copy DSDT to memory
For ARM64, ONLY "acpi=off", "acpi=on" or "acpi=force"
are available
For ARM64 and RISCV64, ONLY "acpi=off", "acpi=on" or
"acpi=force" are available
See also Documentation/power/runtime_pm.rst, pci=noacpi
@ -304,7 +304,7 @@
EL0 is indicated by /sys/devices/system/cpu/aarch32_el0
and hot-unplug operations may be restricted.
See Documentation/arm64/asymmetric-32bit.rst for more
See Documentation/arch/arm64/asymmetric-32bit.rst for more
information.
amd_iommu= [HW,X86-64]
@ -323,6 +323,7 @@
option with care.
pgtbl_v1 - Use v1 page table for DMA-API (Default).
pgtbl_v2 - Use v2 page table for DMA-API.
irtcachedis - Disable Interrupt Remapping Table (IRT) caching.
amd_iommu_dump= [HW,X86-64]
Enable AMD IOMMU driver option to dump the ACPI table
@ -429,6 +430,9 @@
arm64.nosme [ARM64] Unconditionally disable Scalable Matrix
Extension support
arm64.nomops [ARM64] Unconditionally disable Memory Copy and Memory
Set instructions support
ataflop= [HW,M68k]
atarimouse= [HW,MOUSE] Atari Mouse
@ -818,20 +822,6 @@
Format:
<first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
cpu0_hotplug [X86] Turn on CPU0 hotplug feature when
CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
Some features depend on CPU0. Known dependencies are:
1. Resume from suspend/hibernate depends on CPU0.
Suspend/hibernate will fail if CPU0 is offline and you
need to online CPU0 before suspend/hibernate.
2. PIC interrupts also depend on CPU0. CPU0 can't be
removed if a PIC interrupt is detected.
It's said poweroff/reboot may depend on CPU0 on some
machines although I haven't seen such issues so far
after CPU0 is offline on a few tested machines.
If the dependencies are under your control, you can
turn on cpu0_hotplug.
cpuidle.off=1 [CPU_IDLE]
disable the cpuidle sub-system
@ -852,6 +842,12 @@
on every CPU online, such as boot, and resume from suspend.
Default: 10000
cpuhp.parallel=
[SMP] Enable/disable parallel bringup of secondary CPUs
Format: <bool>
Default is enabled if CONFIG_HOTPLUG_PARALLEL=y. Otherwise
the parameter has no effect.
crash_kexec_post_notifiers
Run kdump after running panic-notifiers and dumping
kmsg. This only for the users who doubt kdump always
@ -2117,6 +2113,16 @@
disable
Do not enable intel_pstate as the default
scaling driver for the supported processors
active
Use intel_pstate driver to bypass the scaling
governors layer of cpufreq and provide its own
algorithms for P-state selection. There are two
P-state selection algorithms provided by
intel_pstate in the active mode: powersave and
performance. The way they both operate depends
on whether or not the hardware managed P-states
(HWP) feature has been enabled in the processor
and possibly on the processor model.
passive
Use intel_pstate as a scaling driver, but configure it
to work with generic cpufreq governors (instead of
@ -2551,12 +2557,13 @@
If the value is 0 (the default), KVM will pick a period based
on the ratio, such that a page is zapped after 1 hour on average.
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
Default is 1 (enabled)
kvm-amd.nested= [KVM,AMD] Control nested virtualization feature in
KVM/SVM. Default is 1 (enabled).
kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU)
for all guests.
Default is 1 (enabled) if in 64-bit or 32-bit PAE mode.
kvm-amd.npt= [KVM,AMD] Control KVM's use of Nested Page Tables,
a.k.a. Two-Dimensional Page Tables. Default is 1
(enabled). Disabled by KVM if hardware lacks support
for NPT.
kvm-arm.mode=
[KVM,ARM] Select one of KVM/arm64's modes of operation.
@ -2602,30 +2609,33 @@
Format: <integer>
Default: 5
kvm-intel.ept= [KVM,Intel] Disable extended page tables
(virtualized MMU) support on capable Intel chips.
Default is 1 (enabled)
kvm-intel.ept= [KVM,Intel] Control KVM's use of Extended Page Tables,
a.k.a. Two-Dimensional Page Tables. Default is 1
(enabled). Disabled by KVM if hardware lacks support
for EPT.
kvm-intel.emulate_invalid_guest_state=
[KVM,Intel] Disable emulation of invalid guest state.
Ignored if kvm-intel.enable_unrestricted_guest=1, as
guest state is never invalid for unrestricted guests.
This param doesn't apply to nested guests (L2), as KVM
never emulates invalid L2 guest state.
Default is 1 (enabled)
[KVM,Intel] Control whether to emulate invalid guest
state. Ignored if kvm-intel.enable_unrestricted_guest=1,
as guest state is never invalid for unrestricted
guests. This param doesn't apply to nested guests (L2),
as KVM never emulates invalid L2 guest state.
Default is 1 (enabled).
kvm-intel.flexpriority=
[KVM,Intel] Disable FlexPriority feature (TPR shadow).
Default is 1 (enabled)
[KVM,Intel] Control KVM's use of FlexPriority feature
(TPR shadow). Default is 1 (enabled). Disabled by KVM if
hardware lacks support for it.
kvm-intel.nested=
[KVM,Intel] Enable VMX nesting (nVMX).
Default is 0 (disabled)
[KVM,Intel] Control nested virtualization feature in
KVM/VMX. Default is 1 (enabled).
kvm-intel.unrestricted_guest=
[KVM,Intel] Disable unrestricted guest feature
(virtualized real and unpaged mode) on capable
Intel chips. Default is 1 (enabled)
[KVM,Intel] Control KVM's use of unrestricted guest
feature (virtualized real and unpaged mode). Default
is 1 (enabled). Disabled by KVM if EPT is disabled or
hardware lacks support for it.
kvm-intel.vmentry_l1d_flush=[KVM,Intel] Mitigation for L1 Terminal Fault
CVE-2018-3620.
@ -2639,9 +2649,10 @@
Default is cond (do L1 cache flush in specific instances)
kvm-intel.vpid= [KVM,Intel] Disable Virtual Processor Identification
feature (tagged TLBs) on capable Intel chips.
Default is 1 (enabled)
kvm-intel.vpid= [KVM,Intel] Control KVM's use of Virtual Processor
Identification feature (tagged TLBs). Default is 1
(enabled). Disabled by KVM if hardware lacks support
for it.
l1d_flush= [X86,INTEL]
Control mitigation for L1D based snooping vulnerability.
@ -3423,6 +3434,10 @@
[HW] Make the MicroTouch USB driver use raw coordinates
('y', default) or cooked coordinates ('n')
mtrr=debug [X86]
Enable printing debug information related to MTRR
registers at boot time.
mtrr_chunk_size=nn[KMG] [X86]
used for mtrr cleanup. It is largest continuous chunk
that could hold holes aka. UC entries.
@ -3702,8 +3717,8 @@
nohibernate [HIBERNATION] Disable hibernation and resume.
nohlt [ARM,ARM64,MICROBLAZE,SH] Forces the kernel to busy wait
in do_idle() and not use the arch_cpu_idle()
nohlt [ARM,ARM64,MICROBLAZE,MIPS,SH] Forces the kernel to
busy wait in do_idle() and not use the arch_cpu_idle()
implementation; requires CONFIG_GENERIC_IDLE_POLL_SETUP
to be effective. This is useful on platforms where the
sleep(SH) or wfi(ARM,ARM64) instructions do not work
@ -3838,7 +3853,7 @@
nosmp [SMP] Tells an SMP kernel to act as a UP kernel,
and disable the IO APIC. legacy for "maxcpus=0".
nosmt [KNL,S390] Disable symmetric multithreading (SMT).
nosmt [KNL,MIPS,S390] Disable symmetric multithreading (SMT).
Equivalent to smt=1.
[KNL,X86] Disable symmetric multithreading (SMT).
@ -4049,7 +4064,7 @@
extra details on the taint flags that users can pick
to compose the bitmask to assign to panic_on_taint.
panic_on_warn panic() instead of WARN(). Useful to cause kdump
panic_on_warn=1 panic() instead of WARN(). Useful to cause kdump
on a WARN().
parkbd.port= [HW] Parallel port number the keyboard adapter is
@ -4736,43 +4751,6 @@
the propagation of recent CPU-hotplug changes up
the rcu_node combining tree.
rcutree.use_softirq= [KNL]
If set to zero, move all RCU_SOFTIRQ processing to
per-CPU rcuc kthreads. Defaults to a non-zero
value, meaning that RCU_SOFTIRQ is used by default.
Specify rcutree.use_softirq=0 to use rcuc kthreads.
But note that CONFIG_PREEMPT_RT=y kernels disable
this kernel boot parameter, forcibly setting it
to zero.
rcutree.rcu_fanout_exact= [KNL]
Disable autobalancing of the rcu_node combining
tree. This is used by rcutorture, and might
possibly be useful for architectures having high
cache-to-cache transfer latencies.
rcutree.rcu_fanout_leaf= [KNL]
Change the number of CPUs assigned to each
leaf rcu_node structure. Useful for very
large systems, which will choose the value 64,
and for NUMA systems with large remote-access
latencies, which will choose a value aligned
with the appropriate hardware boundaries.
rcutree.rcu_min_cached_objs= [KNL]
Minimum number of objects which are cached and
maintained per one CPU. Object size is equal
to PAGE_SIZE. The cache allows to reduce the
pressure to page allocator, also it makes the
whole algorithm to behave better in low memory
condition.
rcutree.rcu_delay_page_cache_fill_msec= [KNL]
Set the page-cache refill delay (in milliseconds)
in response to low-memory conditions. The range
of permitted values is in the range 0:100000.
rcutree.jiffies_till_first_fqs= [KNL]
Set delay from grace-period initialization to
first attempt to force quiescent states.
@ -4811,21 +4789,6 @@
When RCU_NOCB_CPU is set, also adjust the
priority of NOCB callback kthreads.
rcutree.rcu_divisor= [KNL]
Set the shift-right count to use to compute
the callback-invocation batch limit bl from
the number of callbacks queued on this CPU.
The result will be bounded below by the value of
the rcutree.blimit kernel parameter. Every bl
callbacks, the softirq handler will exit in
order to allow the CPU to do other work.
Please note that this callback-invocation batch
limit applies only to non-offloaded callback
invocation. Offloaded callbacks are instead
invoked in the context of an rcuoc kthread, which
scheduler will preempt as it does any other task.
rcutree.nocb_nobypass_lim_per_jiffy= [KNL]
On callback-offloaded (rcu_nocbs) CPUs,
RCU reduces the lock contention that would
@ -4839,14 +4802,6 @@
the ->nocb_bypass queue. The definition of "too
many" is supplied by this kernel boot parameter.
rcutree.rcu_nocb_gp_stride= [KNL]
Set the number of NOCB callback kthreads in
each group, which defaults to the square root
of the number of CPUs. Larger numbers reduce
the wakeup overhead on the global grace-period
kthread, but increases that same overhead on
each group's NOCB grace-period kthread.
rcutree.qhimark= [KNL]
Set threshold of queued RCU callbacks beyond which
batch limiting is disabled.
@ -4864,6 +4819,56 @@
on rcutree.qhimark at boot time and to zero to
disable more aggressive help enlistment.
rcutree.rcu_delay_page_cache_fill_msec= [KNL]
Set the page-cache refill delay (in milliseconds)
in response to low-memory conditions. The range
of permitted values is in the range 0:100000.
rcutree.rcu_divisor= [KNL]
Set the shift-right count to use to compute
the callback-invocation batch limit bl from
the number of callbacks queued on this CPU.
The result will be bounded below by the value of
the rcutree.blimit kernel parameter. Every bl
callbacks, the softirq handler will exit in
order to allow the CPU to do other work.
Please note that this callback-invocation batch
limit applies only to non-offloaded callback
invocation. Offloaded callbacks are instead
invoked in the context of an rcuoc kthread, which
scheduler will preempt as it does any other task.
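The batch-limit computation described for rcutree.rcu_divisor can be sketched as follows. This is a simplified model of the description only, and the function name ``batch_limit`` is hypothetical.

```python
def batch_limit(queued_callbacks, rcu_divisor, blimit):
    """Model of the callback-invocation batch limit bl.

    The number of queued callbacks is shifted right by rcu_divisor,
    and the result is bounded below by blimit.
    """
    return max(queued_callbacks >> rcu_divisor, blimit)
```

With a shift count of 7 and a lower bound of 10, a CPU with 1000 queued callbacks gets the lower bound (1000 >> 7 is 7), while a CPU with 100000 queued callbacks gets a limit of 781.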
rcutree.rcu_fanout_exact= [KNL]
Disable autobalancing of the rcu_node combining
tree. This is used by rcutorture, and might
possibly be useful for architectures having high
cache-to-cache transfer latencies.
rcutree.rcu_fanout_leaf= [KNL]
Change the number of CPUs assigned to each
leaf rcu_node structure. Useful for very
large systems, which will choose the value 64,
and for NUMA systems with large remote-access
latencies, which will choose a value aligned
with the appropriate hardware boundaries.
rcutree.rcu_min_cached_objs= [KNL]
Minimum number of objects which are cached and
maintained per one CPU. Object size is equal
to PAGE_SIZE. The cache allows to reduce the
pressure to page allocator, also it makes the
whole algorithm to behave better in low memory
condition.
rcutree.rcu_nocb_gp_stride= [KNL]
Set the number of NOCB callback kthreads in
each group, which defaults to the square root
of the number of CPUs. Larger numbers reduce
the wakeup overhead on the global grace-period
kthread, but increases that same overhead on
each group's NOCB grace-period kthread.
rcutree.rcu_kick_kthreads= [KNL]
Cause the grace-period kthread to get an extra
wake_up() if it sleeps three times longer than
@ -4871,6 +4876,13 @@
This wake_up() will be accompanied by a
WARN_ONCE() splat and an ftrace_dump().
rcutree.rcu_resched_ns= [KNL]
Limit the time spent invoking a batch of RCU
callbacks to the specified number of nanoseconds.
By default, this limit is checked only once
every 32 callbacks in order to limit the pain
inflicted by local_clock() overhead.
rcutree.rcu_unlock_delay= [KNL]
In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels,
this specifies an rcu_read_unlock()-time delay
@ -4885,6 +4897,16 @@
rcu_node tree with an eye towards determining
why a new grace period has not yet started.
rcutree.use_softirq= [KNL]
If set to zero, move all RCU_SOFTIRQ processing to
per-CPU rcuc kthreads. Defaults to a non-zero
value, meaning that RCU_SOFTIRQ is used by default.
Specify rcutree.use_softirq=0 to use rcuc kthreads.
But note that CONFIG_PREEMPT_RT=y kernels disable
this kernel boot parameter, forcibly setting it
to zero.
rcuscale.gp_async= [KNL]
Measure performance of asynchronous
grace-period primitives such as call_rcu().
@ -5087,8 +5109,17 @@
rcutorture.stall_cpu_block= [KNL]
Sleep while stalling if set. This will result
in warnings from preemptible RCU in addition
to any other stall-related activity.
in warnings from preemptible RCU in addition to
any other stall-related activity. Note that
in kernels built with CONFIG_PREEMPTION=n and
CONFIG_PREEMPT_COUNT=y, this parameter will
cause the CPU to pass through a quiescent state.
Given CONFIG_PREEMPTION=n, this will suppress
RCU CPU stall warnings, but will instead result
in scheduling-while-atomic splats.
Use of this module parameter results in splats.
rcutorture.stall_cpu_holdoff= [KNL]
Time to wait (s) after boot before inducing stall.
@ -5452,7 +5483,12 @@
port and the regular usb controller gets disabled.
root= [KNL] Root filesystem
See name_to_dev_t comment in init/do_mounts.c.
Usually this is a block device specifier of some kind,
see the early_lookup_bdev comment in
block/early-lookup.c for details.
Alternatively this can be "ram" for the legacy initial
ramdisk, "nfs" and "cifs" for root on a network file
system, or "mtd" and "ubi" for mounting from raw flash.
rootdelay= [KNL] Delay (in seconds) to pause before attempting to
mount the root filesystem
@ -5735,7 +5771,7 @@
1: Fast pin select (default)
2: ATC IRMode
smt= [KNL,S390] Set the maximum number of threads (logical
smt= [KNL,MIPS,S390] Set the maximum number of threads (logical
CPUs) to use per physical CPU on systems capable of
symmetric multithreading (SMT). Will be capped to the
actual hardware limit.
@ -6563,6 +6599,12 @@
unknown_nmi_panic
[X86] Cause panic on unknown NMI.
unwind_debug [X86-64]
Enable unwinder debug output. This can be
useful for debugging certain unwinder error
conditions, including corrupt stacks and
bad/missing unwinder metadata.
usbcore.authorized_default=
[USB] Default USB device authorization:
(default -1 = authorized except for wireless USB,
@ -6931,6 +6973,18 @@
it can be updated at runtime by writing to the
corresponding sysfs file.
workqueue.cpu_intensive_thresh_us=
Per-cpu work items which run for longer than this
threshold are automatically considered CPU intensive
and excluded from concurrency management to prevent
them from noticeably delaying other per-cpu work
items. Default is 10000 (10ms).
If CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel
will report the work functions which violate this
threshold repeatedly. They are likely good
candidates for using WQ_UNBOUND workqueues instead.
workqueue.disable_numa
By default, all work items queued to unbound
workqueues are affine to the NUMA nodes they're

View file

@ -10,8 +10,8 @@ Introduction
============
This file documents the driver for the Rockchip ISP1 that is part of RK3288
and RK3399 SoCs. The driver is located under drivers/staging/media/rkisp1
and uses the Media-Controller API.
and RK3399 SoCs. The driver is located under drivers/media/platform/rockchip/
rkisp1 and uses the Media-Controller API.
Revisions
=========

View file

@ -119,9 +119,9 @@ set size has chronologically changed.::
Data Access Pattern Aware Memory Management
===========================================
Below three commands make every memory region of size >=4K that doesn't
accessed for >=60 seconds in your workload to be swapped out. ::
The command below swaps out every memory region of size >=4K that has not been
accessed for >=60 seconds in your workload. ::
$ echo "#min-size max-size min-acc max-acc min-age max-age action" > test_scheme
$ echo "4K max 0 0 60s max pageout" >> test_scheme
$ damo schemes -c test_scheme <pid of your workload>
$ sudo damo schemes --damos_access_rate 0 0 --damos_sz_region 4K max \
--damos_age 60s max --damos_action pageout \
<pid of your workload>

View file

@ -10,9 +10,8 @@ DAMON provides below interfaces for different users.
`This <https://github.com/awslabs/damo>`_ is for privileged people such as
system administrators who want a just-working human-friendly interface.
Using this, users can use DAMON's major features in a human-friendly way.
It may not be highly tuned for special cases, though. It supports both
virtual and physical address spaces monitoring. For more detail, please
refer to its `usage document
It may not be highly tuned for special cases, though. For more detail,
please refer to its `usage document
<https://github.com/awslabs/damo/blob/next/USAGE.md>`_.
- *sysfs interface.*
:ref:`This <sysfs_interface>` is for privileged user space programmers who
@ -20,11 +19,7 @@ DAMON provides below interfaces for different users.
features by reading from and writing to special sysfs files. Therefore,
you can write and use your personalized DAMON sysfs wrapper programs that
reads/writes the sysfs files instead of you. The `DAMON user space tool
<https://github.com/awslabs/damo>`_ is one example of such programs. It
supports both virtual and physical address spaces monitoring. Note that this
interface provides only simple :ref:`statistics <damos_stats>` for the
monitoring results. For detailed monitoring results, DAMON provides a
:ref:`tracepoint <tracepoint>`.
<https://github.com/awslabs/damo>`_ is one example of such programs.
- *debugfs interface. (DEPRECATED!)*
:ref:`This <debugfs_interface>` is almost identical to :ref:`sysfs interface
<sysfs_interface>`. This is deprecated, so users should move to the
@ -139,7 +134,7 @@ scheme of the kdamond. Writing ``clear_schemes_tried_regions`` to ``state``
file clears the DAMON-based operating scheme action tried regions directory for
each DAMON-based operation scheme of the kdamond. For details of the
DAMON-based operation scheme action tried regions directory, please refer to
:ref:tried_regions section <sysfs_schemes_tried_regions>`.
:ref:`tried_regions section <sysfs_schemes_tried_regions>`.
If the state is ``on``, reading ``pid`` shows the pid of the kdamond thread.
@ -259,12 +254,9 @@ be equal or smaller than ``start`` of directory ``N+1``.
contexts/<N>/schemes/
---------------------
For usual DAMON-based data access aware memory management optimizations, users
would normally want the system to apply a memory management action to a memory
region of a specific access pattern. DAMON receives such formalized operation
schemes from the user and applies those to the target memory regions. Users
can get and set the schemes by reading from and writing to files under this
directory.
The directory for DAMON-based Operation Schemes (:ref:`DAMOS
<damon_design_damos>`). Users can get and set the schemes by reading from and
writing to files under this directory.
In the beginning, this directory has only one file, ``nr_schemes``. Writing a
number (``N``) to the file creates the number of child directories named ``0``
@ -277,12 +269,12 @@ In each scheme directory, five directories (``access_pattern``, ``quotas``,
``watermarks``, ``filters``, ``stats``, and ``tried_regions``) and one file
(``action``) exist.
The ``action`` file is for setting and getting what action you want to apply to
memory regions having specific access pattern of the interest. The keywords
that can be written to and read from the file and their meaning are as below.
The ``action`` file is for setting and getting the scheme's :ref:`action
<damon_design_damos_action>`. The keywords that can be written to and read
from the file and their meaning are as below.
Note that support of each action depends on the running DAMON operations set
`implementation <sysfs_contexts>`.
:ref:`implementation <sysfs_contexts>`.
- ``willneed``: Call ``madvise()`` for the region with ``MADV_WILLNEED``.
Supported by ``vaddr`` and ``fvaddr`` operations set.
@ -304,32 +296,21 @@ Note that support of each action depends on the running DAMON operations set
schemes/<N>/access_pattern/
---------------------------
The target access pattern of each DAMON-based operation scheme is constructed
with three ranges including the size of the region in bytes, number of
monitored accesses per aggregate interval, and number of aggregated intervals
for the age of the region.
The directory for the target access :ref:`pattern
<damon_design_damos_access_pattern>` of the given DAMON-based operation scheme.
Under the ``access_pattern`` directory, three directories (``sz``,
``nr_accesses``, and ``age``) each having two files (``min`` and ``max``)
exist. You can set and get the access pattern for the given scheme by writing
to and reading from the ``min`` and ``max`` files under ``sz``,
``nr_accesses``, and ``age`` directories, respectively.
``nr_accesses``, and ``age`` directories, respectively. Note that the ``min``
and the ``max`` form a closed interval.
schemes/<N>/quotas/
-------------------
Optimal ``target access pattern`` for each ``action`` is workload dependent, so
not easy to find. Worse yet, setting a scheme of some action too aggressive
can cause severe overhead. To avoid such overhead, users can limit time and
size quota for each scheme. In detail, users can ask DAMON to try to use only
up to specific time (``time quota``) for applying the action, and to apply the
action to only up to specific amount (``size quota``) of memory regions having
the target access pattern within a given time interval (``reset interval``).
When the quota limit is expected to be exceeded, DAMON prioritizes found memory
regions of the ``target access pattern`` based on their size, access frequency,
and age. For personalized prioritization, users can set the weights for the
three properties.
The directory for the :ref:`quotas <damon_design_damos_quotas>` of the given
DAMON-based operation scheme.
Under ``quotas`` directory, three files (``ms``, ``bytes``,
``reset_interval_ms``) and one directory (``weights``) having three files
@ -337,23 +318,26 @@ Under ``quotas`` directory, three files (``ms``, ``bytes``,
You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and
``reset interval`` in milliseconds by writing the values to the three files,
respectively. You can also set the prioritization weights for size, access
frequency, and age in per-thousand unit by writing the values to the three
files under the ``weights`` directory.
respectively. Then, DAMON tries to use only up to ``time quota`` milliseconds
for applying the ``action`` to memory regions of the ``access_pattern``, and to
apply the action to only up to ``bytes`` bytes of memory regions within the
``reset_interval_ms``. Setting both ``ms`` and ``bytes`` zero disables the
quota limits.
You can also set the :ref:`prioritization weights
<damon_design_damos_quotas_prioritization>` for size, access frequency, and age
in per-thousand unit by writing the values to the three files under the
``weights`` directory.
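As a minimal sketch of how the three quota files might be set from a wrapper program, the Python helper below writes one value to each file. The helper name ``set_quotas`` is hypothetical, and the caller is assumed to pass the path of the scheme's ``quotas`` directory; the sketch does not hard-code any sysfs path.

```python
from pathlib import Path


def set_quotas(quotas_dir, time_ms, size_bytes, reset_interval_ms):
    """Write the time quota, size quota and reset interval of a scheme.

    quotas_dir is the scheme's ``quotas`` directory, supplied by the caller.
    """
    values = {
        "ms": time_ms,
        "bytes": size_bytes,
        "reset_interval_ms": reset_interval_ms,
    }
    for name, value in values.items():
        (Path(quotas_dir) / name).write_text(str(value))
```

Calling it with ``time_ms=0`` and ``size_bytes=0`` corresponds to the case described above where the quota limits are disabled.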
schemes/<N>/watermarks/
-----------------------
To allow easy activation and deactivation of each scheme based on system
status, DAMON provides a feature called watermarks. The feature receives five
values called ``metric``, ``interval``, ``high``, ``mid``, and ``low``. The
``metric`` is the system metric such as free memory ratio that can be measured.
If the metric value of the system is higher than the value in ``high`` or lower
than ``low`` at the memoent, the scheme is deactivated. If the value is lower
than ``mid``, the scheme is activated.
The directory for the :ref:`watermarks <damon_design_damos_watermarks>` of the
given DAMON-based operation scheme.
Under the watermarks directory, five files (``metric``, ``interval_us``,
``high``, ``mid``, and ``low``) for setting each value exist. You can set and
``high``, ``mid``, and ``low``) for setting the metric, the time interval
between checks of the metric, and the three watermarks exist. You can set and
get the five values by writing to the files, respectively.
Keywords and meanings of those that can be written to the ``metric`` file are
@ -367,12 +351,8 @@ The ``interval`` should written in microseconds unit.
schemes/<N>/filters/
--------------------
Users may know more than the kernel about specific types of memory. In such
cases, users may manage that memory themselves and hence do not want DAMOS to
interfere with it. Users could limit DAMOS by setting the access pattern of the
scheme and/or the monitoring regions for this purpose, but that can be
inefficient in some cases. In such cases, users can set filters that are not
driven by access patterns using files in this directory.
The directory for the :ref:`filters <damon_design_damos_filters>` of the given
DAMON-based operation scheme.
In the beginning, this directory has only one file, ``nr_filters``. Writing a
number (``N``) to the file creates the number of child directories named ``0``
@ -432,13 +412,17 @@ starting from ``0`` under this directory. Each directory contains files
exposing detailed information about each of the memory region that the
corresponding scheme's ``action`` has tried to be applied under this directory,
during next :ref:`aggregation interval <sysfs_monitoring_attrs>`. The
information includes address range, ``nr_accesses``, , and ``age`` of the
region.
information includes address range, ``nr_accesses``, and ``age`` of the region.
The directories will be removed when another special keyword,
``clear_schemes_tried_regions``, is written to the relevant
``kdamonds/<N>/state`` file.
This directory is intended for investigating schemes' behaviors and for
efficient, query-like retrieval of data access monitoring results. For the
latter use case in particular, users can set the ``action`` to ``stat`` and
set the ``access pattern`` to the pattern they want to query.
tried_regions/<N>/
------------------
@ -600,15 +584,10 @@ update.
Schemes
-------
For usual DAMON-based data access aware memory management optimizations, users
would simply want the system to apply a memory management action to a memory
region of a specific access pattern. DAMON receives such formalized operation
schemes from the user and applies those to the target processes.
Users can get and set the schemes by reading from and writing to ``schemes``
debugfs file. Reading the file also shows the statistics of each scheme. To
the file, each of the schemes should be represented in each line in below
form::
Users can get and set the DAMON-based operation :ref:`schemes
<damon_design_damos>` by reading from and writing to the ``schemes`` debugfs
file. Reading the file also shows the statistics of each scheme. When writing
to the file, each scheme should be given on its own line in the form below::
<target access pattern> <action> <quota> <watermarks>
@ -617,8 +596,9 @@ You can disable schemes by simply writing an empty string to the file.
Target Access Pattern
~~~~~~~~~~~~~~~~~~~~~
The ``<target access pattern>`` is constructed with three ranges in below
form::
The target access :ref:`pattern <damon_design_damos_access_pattern>` of the
scheme. The ``<target access pattern>`` is constructed with three ranges in
below form::
min-size max-size min-acc max-acc min-age max-age
@ -631,9 +611,9 @@ closed interval.
Action
~~~~~~
The ``<action>`` is a predefined integer for memory management actions, which
DAMON will apply to the regions having the target access pattern. The
supported numbers and their meanings are as below.
The ``<action>`` is a predefined integer for memory management :ref:`actions
<damon_design_damos_action>`. The supported numbers and their meanings are as
below.
- 0: Call ``madvise()`` for the region with ``MADV_WILLNEED``. Ignored if
``target`` is ``paddr``.
@ -649,10 +629,8 @@ supported numbers and their meanings are as below.
Quota
~~~~~
Optimal ``target access pattern`` for each ``action`` is workload dependent, so
not easy to find. Worse yet, setting a scheme of some action too aggressive
can cause severe overhead. To avoid such overhead, users can limit time and
size quota for the scheme via the ``<quota>`` in below form::
Users can set the :ref:`quotas <damon_design_damos_quotas>` of the given scheme
via the ``<quota>`` in below form::
<ms> <sz> <reset interval> <priority weights>
@ -662,19 +640,17 @@ the action to memory regions of the ``target access pattern`` within the
``<sz>`` bytes of memory regions within the ``<reset interval>``. Setting both
``<ms>`` and ``<sz>`` zero disables the quota limits.
When the quota limit is expected to be exceeded, DAMON prioritizes found memory
regions of the ``target access pattern`` based on their size, access frequency,
and age. For personalized prioritization, users can set the weights for the
three properties in ``<priority weights>`` in below form::
For the :ref:`prioritization <damon_design_damos_quotas_prioritization>`, users
can set the weights for the three properties in ``<priority weights>`` in below
form::
<size weight> <access frequency weight> <age weight>
Watermarks
~~~~~~~~~~
Some schemes would need to run based on current value of the system's specific
metrics like free memory ratio. For such cases, users can specify watermarks
for the condition.::
Users can specify :ref:`watermarks <damon_design_damos_watermarks>` of the
given scheme via ``<watermarks>`` in below form::
<metric> <check interval> <high mark> <middle mark> <low mark>
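As a rough illustration of how the pieces above fit together, the following sketch joins the fields of one scheme in the documented order. The helper name ``damos_scheme_line`` is hypothetical, and every value in the example is a placeholder rather than a recommended setting; only action ``0`` (``MADV_WILLNEED``) is taken from the list above.

```python
def damos_scheme_line(pattern, action, quota, weights, watermarks):
    """Join the fields of one scheme in the order described above.

    pattern:    min-size max-size min-acc max-acc min-age max-age
    quota:      ms sz reset-interval
    weights:    size, access frequency and age weights
    watermarks: metric check-interval high mid low
    """
    expected = (
        ("pattern", pattern, 6),
        ("quota", quota, 3),
        ("weights", weights, 3),
        ("watermarks", watermarks, 5),
    )
    for name, values, count in expected:
        if len(values) != count:
            raise ValueError(f"{name} needs {count} values, got {len(values)}")
    fields = [*pattern, action, *quota, *weights, *watermarks]
    return " ".join(str(field) for field in fields)


# Placeholder values only; they are not recommended settings.
line = damos_scheme_line(
    pattern=(4096, 8192, 0, 5, 10, 20),
    action=0,
    quota=(10, 1048576, 1000),
    weights=(0, 3, 7),
    watermarks=(0, 0, 0, 0, 0),
)
print(line)
```

The resulting line has 18 space-separated fields and could then be written to the ``schemes`` file.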
@ -797,10 +773,12 @@ root directory only.
Tracepoint for Monitoring Results
=================================
DAMON provides the monitoring results via a tracepoint,
``damon:damon_aggregated``. While the monitoring is turned on, you could
record the tracepoint events and show results using tracepoint supporting tools
like ``perf``. For example::
Users can get the monitoring results via the :ref:`tried_regions
<sysfs_schemes_tried_regions>` or a tracepoint, ``damon:damon_aggregated``.
While the tried regions directory is useful for getting a snapshot, the
tracepoint is useful for getting a full record of the results. While the
monitoring is turned on, you could record the tracepoint events and show
results using tracepoint supporting tools like ``perf``. For example::
# echo on > monitor_on
# perf record -e damon:damon_aggregated &

View file

@ -0,0 +1,68 @@
.. SPDX-License-Identifier: GPL-2.0
======================================
CXL Performance Monitoring Unit (CPMU)
======================================
The CXL rev 3.0 specification provides a definition of CXL Performance
Monitoring Unit in section 13.2: Performance Monitoring.
CXL components (e.g. Root Port, Switch Upstream Port, End Point) may have
any number of CPMU instances. CPMU capabilities are fully discoverable from
the devices. The specification provides event definitions for all CXL protocol
message types and a set of additional events for things commonly counted on
CXL devices (e.g. DRAM events).
CPMU driver
===========
The CPMU driver registers a perf PMU with the name pmu_mem<X>.<Y> on the CXL bus
representing the Yth CPMU for memX.
/sys/bus/cxl/devices/pmu_mem<X>.<Y>
The associated PMU is registered as
/sys/bus/event_source/devices/cxl_pmu_mem<X>.<Y>
In common with other CXL bus devices, the id has no specific meaning and the
relationship to a specific CXL device should be established via the device
parent of the device on the CXL bus.
The PMU driver provides a description of available events and filter options in
sysfs. The "format" directory describes all formats of the config (event vendor
id, group id and mask), config1 (threshold, filter enables) and config2 (filter
parameters) fields of the perf_event_attr structure. The "events" directory
describes all documented events shown in perf list.
The events shown in perf list are the most fine grained events with a single
bit of the event mask set. More general events may be enabled by setting
multiple mask bits in config. For example, all Device to Host Read Requests
may be captured on a single counter by setting the bits for all of
* d2h_req_rdcurr
* d2h_req_rdown
* d2h_req_rdshared
* d2h_req_rdany
* d2h_req_rdownnodata
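Combining several single-bit events into one counter amounts to OR-ing their mask bits. The sketch below shows the arithmetic only: the helper name ``combined_event`` is hypothetical, and the vendor id, group id and bit values are placeholders, not the real encodings. The actual values should be read from the "events" directory in sysfs.

```python
from functools import reduce
from operator import or_


def combined_event(pmu, vid, gid, masks):
    """Build a perf event string whose mask is the OR of the given bits."""
    mask = reduce(or_, masks, 0)
    return f"{pmu}/vid={vid:#x},gid={gid:#x},mask={mask:#x}/"


# Placeholder single-bit masks for the five events listed above.
# These are NOT the real bit assignments.
D2H_READ_MASKS = {
    "d2h_req_rdcurr": 0x01,
    "d2h_req_rdown": 0x02,
    "d2h_req_rdshared": 0x04,
    "d2h_req_rdany": 0x08,
    "d2h_req_rdownnodata": 0x10,
}

# vid and gid are placeholders as well.
event = combined_event("cxl_pmu_mem0.0", 0x1, 0x2, D2H_READ_MASKS.values())
print(event)
```

The resulting string has the same ``vid=...,gid=...,mask=...`` shape as the vendor specific example in this document and could be passed to ``perf stat -a -e``.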
Example of usage::
$# perf list
cxl_pmu_mem0.0/clock_ticks/ [Kernel PMU event]
cxl_pmu_mem0.0/d2h_req_rdshared/ [Kernel PMU event]
cxl_pmu_mem0.0/h2d_req_snpcur/ [Kernel PMU event]
cxl_pmu_mem0.0/h2d_req_snpdata/ [Kernel PMU event]
cxl_pmu_mem0.0/h2d_req_snpinv/ [Kernel PMU event]
-----------------------------------------------------------
$# perf stat -a -e cxl_pmu_mem0.0/clock_ticks/ -e cxl_pmu_mem0.0/d2h_req_rdshared/
Vendor specific events may also be available and if so can be used via
$# perf stat -a -e cxl_pmu_mem0.0/vid=VID,gid=GID,mask=MASK/
The driver does not support sampling so "perf record" is unsupported.
It only supports system-wide counting so attaching to a task is
unsupported.

View file

@ -56,14 +56,14 @@ Example usage of perf::
For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
as PMU v1, but some new functions are added to the hardware.
(a) L3C PMU supports filtering by core/thread within the cluster which can be
1. L3C PMU supports filtering by core/thread within the cluster which can be
specified as a bitmap::
$# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5
This will only count the operations from core/thread 0 and 1 in this cluster.
(b) Tracetag allows the user to choose to count only read, write or atomic
2. Tracetag allows the user to choose to count only read, write or atomic
operations via the tt_req parameter in perf. The default value counts all
operations. tt_req is 3 bits: 3'b100 represents read operations, 3'b101
represents write operations, 3'b110 represents atomic store operations and
@ -73,14 +73,16 @@ represents write operations, 3'b110 represents atomic store operations and
This will only count the read operations in this cluster.
(c) Datasrc allows the user to check where the data comes from. It is 5 bits.
3. Datasrc allows the user to check where the data comes from. It is 5 bits.
Some important codes are as follows:
5'b00001: comes from L3C in this die;
5'b01000: comes from L3C in the cross-die;
5'b01001: comes from L3C which is in another socket;
5'b01110: comes from the local DDR;
5'b01111: comes from the cross-die DDR;
5'b10000: comes from cross-socket DDR;
- 5'b00001: comes from L3C in this die;
- 5'b01000: comes from L3C in the cross-die;
- 5'b01001: comes from L3C which is in another socket;
- 5'b01110: comes from the local DDR;
- 5'b01111: comes from the cross-die DDR;
- 5'b10000: comes from cross-socket DDR;
and so on. This is mainly helpful for finding the data source nearest to the
CPU cores. If datasrc_cfg is used on a multi-chip system, datasrc_skt shall be
configured in the perf command::
@ -88,15 +90,25 @@ configured in perf command::
$# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/,
hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5
(d)Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
4. Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
contains several Compute Clusters (CCLs). The I/O dies are called Super I/O
clusters (SICL) containing multiple I/O clusters (ICLs). Each CCL/ICL in the
SoC has a unique ID. Each ID is 11 bits, including a 6-bit SCCL-ID and a 5-bit
CCL/ICL-ID. For an I/O die, the ICL-ID takes one of the following values:
5'b00000: I/O_MGMT_ICL;
5'b00001: Network_ICL;
5'b00011: HAC_ICL;
5'b10000: PCIe_ICL;
- 5'b00000: I/O_MGMT_ICL;
- 5'b00001: Network_ICL;
- 5'b00011: HAC_ICL;
- 5'b10000: PCIe_ICL;
5. uring_channel: UC PMU events 0x47~0x59 support filtering by tx request
uring channel. It is 2 bits. Some important codes are as follows:
- 2'b11: count the events which are sent to the uring_ext (MATA) channel;
- 2'b01: is the same as 2'b11;
- 2'b10: count the events which are sent to the uring (non-MATA) channel;
- 2'b00: default value, count the events which are sent to both the uring and
uring_ext channels;
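The 11-bit CCL/ICL ID layout described in item 4 can be sketched as below. This is a minimal illustration under a stated assumption: the text gives the field widths but not their order, so placing the 6-bit SCCL-ID in the upper bits is an assumption, and the names ``make_cluster_id`` and ``FULL_ID_MASK`` are hypothetical.

```python
SCCL_ID_BITS = 6
CLUSTER_ID_BITS = 5

# Mask covering all 11 ID bits.
FULL_ID_MASK = (1 << (SCCL_ID_BITS + CLUSTER_ID_BITS)) - 1

# ICL-ID codes for I/O dies, as listed in item 4.
ICL_IDS = {
    "I/O_MGMT_ICL": 0b00000,
    "Network_ICL": 0b00001,
    "HAC_ICL": 0b00011,
    "PCIe_ICL": 0b10000,
}


def make_cluster_id(sccl_id, cluster_id):
    """Pack a 6-bit SCCL-ID and a 5-bit CCL/ICL-ID into one 11-bit ID.

    Assumption: the SCCL-ID occupies the upper 6 bits.
    """
    if not 0 <= sccl_id < (1 << SCCL_ID_BITS):
        raise ValueError("SCCL-ID must fit in 6 bits")
    if not 0 <= cluster_id < (1 << CLUSTER_ID_BITS):
        raise ValueError("CCL/ICL-ID must fit in 5 bits")
    return (sccl_id << CLUSTER_ID_BITS) | cluster_id


# Example: the PCIe ICL of a hypothetical SCCL number 3.
pcie_icl_id = make_cluster_id(3, ICL_IDS["PCIe_ICL"])
print(hex(pcie_icl_id), hex(FULL_ID_MASK))
```

A value built this way, together with a mask such as ``FULL_ID_MASK``, is the kind of ID and mask pair the following paragraph refers to.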
Users could configure IDs to count data coming from a specific CCL/ICL, by setting
srcid_cmd & srcid_msk, and data destined for a specific CCL/ICL by setting

View file

@ -21,3 +21,4 @@ Performance monitor support
alibaba_pmu
nvidia-pmu
meson-ddr-pmu
cxl

View file

@ -5,7 +5,7 @@
Intel Uncore Frequency Scaling
==============================
:Copyright: |copy| 2022 Intel Corporation
:Copyright: |copy| 2022-2023 Intel Corporation
:Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
@ -58,3 +58,58 @@ Each package_*_die_* contains the following attributes:
``current_freq_khz``
This attribute is used to get the current uncore frequency.
SoCs with TPMI (Topology Aware Register and PM Capsule Interface)
-----------------------------------------------------------------
An SoC can contain multiple power domains, each with an individual mesh
partition or a collection of mesh partitions. Such a partition is called a
fabric cluster.
Certain types of meshes need to run at the same frequency, so they are
placed in the same fabric cluster. The benefit of a fabric cluster is that it
offers a scalable mechanism to deal with partitioned fabrics in an SoC.
The current sysfs interface supports controls at package and die level.
This interface is not enough to support more granular control at
fabric cluster level.
SoCs that support TPMI (Topology Aware Register and PM Capsule
Interface) can have multiple power domains. Each power domain can
contain one or more fabric clusters.
To represent controls at fabric cluster level in addition to the
controls at package and die level (like systems without TPMI
support), sysfs is enhanced. This granular interface is presented in
sysfs with directory names prefixed with "uncore". For example:
uncore00, uncore01 etc.
The scope of control is specified by attributes "package_id", "domain_id"
and "fabric_cluster_id" in the directory.
Attributes in each directory:
``domain_id``
This attribute is used to get the power domain id of this instance.
``fabric_cluster_id``
This attribute is used to get the fabric cluster id of this instance.
``package_id``
This attribute is used to get the package id of this instance.
The other attributes are same as presented at package_*_die_* level.
In most current use cases, "max_freq_khz" and "min_freq_khz"
are updated at the "package_*_die_*" level. This model will still be supported
with the following approach:
When the user uses controls at the "package_*_die_*" level, every fabric
cluster in that package and die is affected. For example, if the user changes
"max_freq_khz" in package_00_die_00, then "max_freq_khz" for every uncore*
directory with the same package id will be updated. In this case the user can
still update "max_freq_khz" at each uncore* level, which is more restrictive.
Similarly, the user can update "min_freq_khz" at the "package_*_die_*" level
to apply it at each uncore* level.
Support for "current_freq_khz" is available only at each fabric cluster
level (i.e., in uncore* directory).
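To see which scope each uncore* directory controls, the three scope attributes can be read from every directory. The sketch below is illustrative only: the helper name ``uncore_scopes`` is hypothetical, the root directory is passed in by the caller, and the demonstration runs against a throwaway directory tree with made-up ids rather than the real sysfs.

```python
import tempfile
from pathlib import Path

SCOPE_ATTRS = ("package_id", "domain_id", "fabric_cluster_id")


def uncore_scopes(root):
    """Map each uncore* directory under `root` to its scope attributes."""
    scopes = {}
    for entry in sorted(Path(root).glob("uncore*")):
        if entry.is_dir():
            scopes[entry.name] = {
                attr: int((entry / attr).read_text()) for attr in SCOPE_ATTRS
            }
    return scopes


# Demonstrate on a temporary directory tree instead of the real sysfs.
# The ids below are made up for the demonstration.
with tempfile.TemporaryDirectory() as root:
    for name, ids in {"uncore00": (0, 0, 0), "uncore01": (0, 1, 0)}.items():
        directory = Path(root) / name
        directory.mkdir()
        for attr, value in zip(SCOPE_ATTRS, ids):
            (directory / attr).write_text(f"{value}\n")
    scopes = uncore_scopes(root)

print(scopes)
```

On a real system, the caller would pass the directory that holds the uncore* entries as ``root``.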

View file

@ -949,7 +949,7 @@ user space can read performance monitor counter registers directly.
The default value is 0 (access disabled).
See Documentation/arm64/perf.rst for more information.
See Documentation/arch/arm64/perf.rst for more information.
pid_max

View file

@ -386,8 +386,8 @@ Default : 0 (for compatibility reasons)
txrehash
--------
Controls default hash rethink behaviour on listening socket when SO_TXREHASH
option is set to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
Controls default hash rethink behaviour on socket when SO_TXREHASH option is set
to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
If set to 1 (default), hash rethink is performed on listening socket.
If set to 0, hash rethink is not performed.
