u-boot/drivers
Maciej W. Rozycki a398a51ccc pci: Work around PCIe link training failures
Attempt to handle cases with a downstream port of a PCIe switch where
link training never completes and the link continues switching between
speeds indefinitely with the data link layer never reaching the active
state.

It has been observed with a downstream port of the ASMedia ASM2824 Gen 3
switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2
switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device,
P/N 41433, wired to a SiFive HiFive Unmatched board.  In this setup the
switches are supposed to negotiate the link speed of preferably 5.0GT/s,
falling back to 2.5GT/s.

However the link continues oscillating between the two speeds, at the
rate of 34-35 times per second, with link training reported repeatedly
active ~84% of the time, e.g.:

02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode])
[...]
	Bus: primary=02, secondary=05, subordinate=05, sec-latency=0
[...]
	Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00
[...]
		LnkSta:	Speed 5GT/s (downgraded), Width x1 (ok)
			TrErr- Train+ SlotClk+ DLActive- BWMgmt+ ABWMgmt-
[...]
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
[...]

However, forcibly limiting the target link speed to 2.5GT/s with the
upstream ASM2824 device makes the two switches communicate correctly:

02:03.0 PCI bridge [0604]: ASMedia Technology Inc. ASM2824 PCIe Gen3 Packet Switch [1b21:2824] (rev 01) (prog-if 00 [Normal decode])
[...]
	Bus: primary=02, secondary=05, subordinate=09, sec-latency=0
[...]
	Capabilities: [80] Express (v2) Downstream Port (Slot+), MSI 00
[...]
		LnkSta:	Speed 2.5GT/s (downgraded), Width x1 (ok)
			TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
[...]
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis+, Selectable De-emphasis: -3.5dB
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
[...]

and then:

05:00.0 PCI bridge [0604]: Pericom Semiconductor PI7C9X2G304 EL/SL PCIe2 3-Port/4-Lane Packet Switch [12d8:2304] (rev 05) (prog-if 00 [Normal decode])
[...]
	Bus: primary=05, secondary=06, subordinate=09, sec-latency=0
[...]
	Capabilities: [c0] Express (v2) Upstream Port, MSI 00
[...]
		LnkSta:	Speed 2.5GT/s (downgraded), Width x1 (downgraded)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
[...]
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
[...]

Make use of this observation then and attempt to detect the inability to
negotiate the link speed automatically, and then handle it by hand.  Use
the Data Link Layer Link Active status flag as the primary indicator of
successful link speed negotiation, but given that the flag is optional
for hardware to implement (the ASM2824 does have it though), fall back
to checking for the mandatory Link Bandwidth Management Status flag,
which shows that the link speed or width has been changed in an attempt
to correct unreliable link operation (the ASM2824 does set it too).
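
The check described above could be sketched as follows.  This is only
an illustration of one reading of the text, not the code of this change:
the `link_regs` structure and the `link_is_suspect` helper are
hypothetical names, while the bit masks follow the PCI Express register
layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register bits as laid out by the PCI Express Base Specification. */
#define LNKCAP_DLLLARC	0x00100000u	/* DLL Link Active Reporting Capable */
#define LNKSTA_DLLLA	0x2000u		/* Data Link Layer Link Active */
#define LNKSTA_LBMS	0x4000u		/* Link Bandwidth Management Status */

/* Hypothetical snapshot of the two registers consulted by the check. */
struct link_regs {
	uint32_t lnkcap;	/* Link Capabilities */
	uint16_t lnksta;	/* Link Status */
};

/*
 * Return true if the link looks suspect and deserves further polling.
 * Where DLLLA is implemented, trust it; otherwise fall back to LBMS.
 */
bool link_is_suspect(const struct link_regs *regs)
{
	assert(regs != NULL);

	if (regs->lnkcap & LNKCAP_DLLLARC)
		return !(regs->lnksta & LNKSTA_DLLLA);

	return (regs->lnksta & LNKSTA_LBMS) != 0;
}
```

Assuming the usual field encoding, the failing ASM2824 port quoted above
would read a Link Status of 0x5812 and be flagged by this helper, while
the same port limited to 2.5GT/s would read 0x3011 and pass.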

If these checks indicate that the link may not operate correctly, then poll
the Data Link Layer Link Active status flag along with the Link Training
flag for the duration of 200ms to see if the link has stabilised, that
is either that the Data Link Layer Link Active status flag has been set
or that Link Training has been inactive during at least the second half
of the interval.
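
A minimal sketch of that stability test follows, with time simulated
rather than measured.  The `read_lnksta_fn` accessor, the `link_stable`
function and the three simulated links are all hypothetical and exist
for illustration only; real code would read the Link Status register
from configuration space and delay between samples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LNKSTA_LT	0x0800u	/* Link Training */
#define LNKSTA_DLLLA	0x2000u	/* Data Link Layer Link Active */

#define POLL_MS		200	/* polling interval named in the text */

/* Hypothetical accessor: Link Status as sampled at simulated time t_ms. */
typedef uint16_t (*read_lnksta_fn)(void *ctx, int t_ms);

/*
 * Sample once per simulated millisecond.  The link counts as stable if
 * DLLLA is ever seen set, or if no sample in the second half of the
 * interval has LT set.
 */
bool link_stable(read_lnksta_fn read_lnksta, void *ctx)
{
	int last_lt_ms = -1;	/* time of the last sample with LT set */
	int t_ms;

	assert(read_lnksta != NULL);

	for (t_ms = 0; t_ms < POLL_MS; t_ms++) {
		uint16_t lnksta = read_lnksta(ctx, t_ms);

		if (lnksta & LNKSTA_DLLLA)
			return true;
		if (lnksta & LNKSTA_LT)
			last_lt_ms = t_ms;
	}

	return last_lt_ms < POLL_MS / 2;
}

/* Simulated link that keeps retraining for most of every 29ms cycle. */
uint16_t oscillating_link(void *ctx, int t_ms)
{
	(void)ctx;
	return (t_ms % 29 < 24) ? LNKSTA_LT : 0;
}

/* Simulated link without DLLLA that stops training after 50ms. */
uint16_t settling_link(void *ctx, int t_ms)
{
	(void)ctx;
	return (t_ms < 50) ? LNKSTA_LT : 0;
}

/* Simulated link that trains for 30ms and then reports DLLLA. */
uint16_t active_link(void *ctx, int t_ms)
{
	(void)ctx;
	return (t_ms < 30) ? LNKSTA_LT : LNKSTA_DLLLA;
}
```

Here `link_stable(oscillating_link, NULL)` reports failure, while the
other two simulated links are accepted: one through DLLLA, the other
through a quiet second half of the interval.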

If that has indicated failure, restrict the target speed to 2.5GT/s,
request a link retrain and check again if the link has stabilised.  If
that does not work either, then restore the original speed setting and
claim defeat, otherwise we are done.
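
The fallback sequence can be illustrated against a simulated port.  The
`fake_port` model, `fake_retrain` and `fixup_link` are hypothetical
names, and the model assumes, for the sake of the example, that the
faulty link only trains when the Target Link Speed is at or below some
threshold.  Only the Link Control 2 field layout is taken from the PCI
Express register definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LNKCTL2_TLS		0x000fu	/* Target Link Speed field */
#define LNKCTL2_TLS_2_5GT	0x0001u	/* Target Link Speed 2.5GT/s */

/* Hypothetical model of a downstream port with a troublesome link. */
struct fake_port {
	uint16_t lnkctl2;		/* Link Control 2 register */
	unsigned int max_working_tls;	/* highest TLS that trains; 0 = none */
	unsigned int retrains;		/* number of retrain requests seen */
};

/* Simulated "request a retrain, then check for stability" step. */
bool fake_retrain(struct fake_port *p)
{
	unsigned int tls = p->lnkctl2 & LNKCTL2_TLS;

	p->retrains++;
	return p->max_working_tls != 0 && tls <= p->max_working_tls;
}

/*
 * Restrict the target speed to 2.5GT/s and retrain.  Keep the
 * restriction if the link comes up, otherwise restore the original
 * setting and report failure.
 */
bool fixup_link(struct fake_port *p)
{
	uint16_t saved;

	assert(p != NULL);

	saved = p->lnkctl2;
	p->lnkctl2 = (uint16_t)((saved & ~LNKCTL2_TLS) | LNKCTL2_TLS_2_5GT);
	if (fake_retrain(p))
		return true;

	p->lnkctl2 = saved;
	return false;
}
```

With a port whose target speed starts at 8GT/s (field value 3) and
whose link only trains at 2.5GT/s, `fixup_link` succeeds and leaves the
field at 1.  With a link that never trains, the original register value
is put back.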

NB interestingly enough with the ASM2824 vs PI7C9X2G304 configuration
referred to above asking the ASM2824 to retrain with a higher target link
speed once the 2.5GT/s speed has been negotiated makes the two devices
successfully negotiate 5.0GT/s.  Lifting the 2.5GT/s speed restriction
would however prevent our workaround from working with an OS that issues
a reset and that is unaware of the problem.  This is because the devices
would then try to negotiate a higher link speed from scratch and fail,
while the sticky property of the Target Link Speed setting will keep the
2.5GT/s speed restriction across a reset.

Keep the 2.5GT/s speed restriction then, conservatively, if functional
once applied.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Stefan Roese <sr@denx.de>
2022-01-14 12:26:42 -05:00
adc dm: define LOG_CATEGORY for all uclass 2021-07-06 10:38:03 -06:00
ata scsi: ceva: rename the resource name to match the linux kernel one 2021-11-09 17:18:23 +05:30
axi WS cleanup: remove trailing empty lines 2021-09-30 08:08:56 -04:00
bios_emulator pci: Drop DM_PCI check from bios_emul 2021-08-05 19:46:35 -04:00
block Prepare v2022.01-rc4 2021-12-20 17:12:04 -05:00
bootcount bootcount: add a new driver with syscon as backend 2021-08-22 11:04:52 +02:00
bus bus: ti-sysc: change in a normal driver 2021-03-22 19:23:27 +13:00
button dm: define LOG_CATEGORY for all uclass 2021-07-06 10:38:03 -06:00
cache cache: sifive: Fix -Wint-to-pointer-cast warning 2021-10-20 10:59:09 +08:00
clk treewide: invaild -> invalid 2022-01-13 07:57:49 -05:00
core dm: core: Switch order of pinctrl and power domain calls 2022-01-13 09:13:41 -07:00
cpu sandbox: correct cpu nodes 2021-09-25 09:46:15 -06:00
crypto crypto: aspeed: Add AST2600 ACRY support 2021-11-17 17:05:00 -05:00
ddr ddr: marvell: a38x: fix SPLIT_OUT_MIX state decision 2022-01-14 11:39:15 +01:00
demo demo: migrate uclass to livetree 2021-10-05 08:50:15 -04:00
dfu dfu: newline after updating 2021-11-07 18:36:56 +01:00
dma treewide: invaild -> invalid 2022-01-13 07:57:49 -05:00
fastboot fastboot: fix partition name truncation in environment lookup 2021-10-12 16:48:38 -04:00
firmware Prepare v2022.01-rc3 2021-11-29 12:00:57 -05:00
fpga arm: socfpga: arria10: Enable double peripheral RBF configuration 2021-12-17 12:58:01 +08:00
gpio Convert CONFIG_KIRKWOOD_GPIO to Kconfig 2021-12-27 16:20:19 -05:00
hwspinlock treewide: invaild -> invalid 2022-01-13 07:57:49 -05:00
i2c Convert CONFIG_SYS_IMMR to Kconfig 2021-12-27 08:41:38 -05:00
input Convert CONFIG_KEYBOARD to Kconfig 2021-12-05 09:26:26 -07:00
iommu iommu: Add Apple DART driver 2021-10-31 08:46:44 -04:00
led dm: define LOG_CATEGORY for all uclass 2021-07-06 10:38:03 -06:00
mailbox treewide: invaild -> invalid 2022-01-13 07:57:49 -05:00
memory keystone2: Move CONFIG_AEMIF_CNTRL_BASE out of CONFIG namespace 2021-09-27 21:38:34 -04:00
misc treewide: invaild -> invalid 2022-01-13 07:57:49 -05:00
mmc mmc: dwmmc: return a proper error code when busy 2022-01-12 09:56:40 +09:00
mtd mtd: nand: pxa3xx: use marvell, prefix for custom DT properties 2022-01-14 07:47:57 +01:00
mux treewide: invaild -> invalid 2022-01-13 07:57:49 -05:00
net drivers/net/fec_mxc.c: Fix spelling of "resetting". 2022-01-13 07:57:49 -05:00
nvme Revert "nvme: Enable FUA" 2021-11-18 20:18:34 -05:00
pch treewide: Simply conditions with the new OF_REAL 2021-09-25 09:46:15 -06:00
pci pci: Work around PCIe link training failures 2022-01-14 12:26:42 -05:00
pci_endpoint dm: define LOG_CATEGORY for all uclass 2021-07-06 10:38:03 -06:00
phy treewide: invaild -> invalid 2022-01-13 07:57:49 -05:00
pinctrl pinctrl: stmfx: define LOG_CATEGORY 2021-11-30 11:20:34 +01:00
power - disable CONFIG_NET_RANDOM_ETHADDR when unnecessary on amlogic based configs 2022-01-09 07:56:31 -05:00
pwm exynos: pwm: Deal with a PWM at 100% 2021-11-09 11:57:22 +09:00
qe configs: fsl: migrate FMAN/QE specific defines to Kconfig 2021-11-09 17:18:23 +05:30
ram ram: stm32mp1: remove __maybe_unused on stm32mp1_ddr_setup 2021-11-30 16:43:28 +01:00
reboot-mode reboot-mode: migrate uclass to livetree 2021-10-05 08:50:15 -04:00
remoteproc remoteproc: migrate uclass to livetree 2021-10-05 08:50:15 -04:00
reset treewide: invaild -> invalid 2022-01-13 07:57:49 -05:00
rng Kconfig: Remove all default n/no options 2021-08-31 17:47:49 -04:00
rtc rtc: ds1337: fix compatible string typo 2021-11-11 19:02:44 -05:00
scsi sata: Rename SATA_SUPPORT to SATA 2021-09-04 12:26:02 -04:00
serial fdt: Drop SPL_BUILD macro 2022-01-13 09:13:41 -07:00
smem dm: define LOG_CATEGORY for all uclass 2021-07-06 10:38:03 -06:00
soc soc: xilinx: versal: Add soc_xilinx_versal driver 2021-08-26 08:08:11 +02:00
sound dm: define LOG_CATEGORY for all uclass 2021-07-06 10:38:03 -06:00
spi - disable CONFIG_NET_RANDOM_ETHADDR when unnecessary on amlogic based configs 2022-01-09 07:56:31 -05:00
spmi spmi: msm: add arbiter version 5 support 2021-10-31 08:46:44 -04:00
sysinfo sysinfo: rcar3: Add Renesas R-Car Gen3 sysinfo driver 2021-07-20 23:33:54 +02:00
sysreset sysreset: watchdog: watchdog cannot power off 2021-12-26 06:49:14 +01:00
tee tee: optee: remove unused duplicated login Id macros 2021-11-23 13:53:03 -05:00
thermal WS cleanup: remove SPACE(s) followed by TAB 2021-09-30 09:08:16 -04:00
timer Finish conversion of CONFIG_SYS_CLK_FREQ to Kconfig 2021-12-27 16:20:18 -05:00
tpm tis: fix tpm_tis_remove() 2021-11-30 14:11:05 +02:00
ufs dm: define LOG_CATEGORY for all uclass 2021-07-06 10:38:03 -06:00
usb drivers/usb/gadget/dwc2_udc_otg.c: Fix spelling of "resetting". 2022-01-13 07:57:50 -05:00
video - disable CONFIG_NET_RANDOM_ETHADDR when unnecessary on amlogic based configs 2022-01-09 07:56:31 -05:00
virtio pci: Drop DM_PCI 2021-09-13 18:23:13 -04:00
w1 arm: Remove zmx25 board and ARCH_MX25 2021-10-01 21:08:18 -04:00
w1-eeprom dm: define LOG_CATEGORY for all uclass 2021-07-06 10:38:03 -06:00
watchdog watchdog: Add a driver for the Apple watchdog 2022-01-13 06:55:46 +01:00
xen WS cleanup: remove trailing empty lines 2021-09-30 08:08:56 -04:00
Kconfig iommu: Add IOMMU uclass 2021-10-31 08:46:44 -04:00
Makefile iommu: Add IOMMU uclass 2021-10-31 08:46:44 -04:00