Star64_linux/drivers/gpu/drm/amd/amdgpu
Emily Deng 8ee3a52e3f drm/gpu-sched: fix force APP kill hang(v4)
issue:
VMC page faults occur if an application is force-killed during a
3dmark test. The cause is that in entity_fini() we manually signal
all jobs in the entity's queue, which confuses the sync/dependency
mechanism:

1) A page fault occurs in SDMA's clear job, which operates on the
shadow buffer: the shadow buffer's GART table is cleaned by
ttm_bo_release() because the fence in its reservation was fake-signaled
by entity_fini() when SIGKILL was received.

2) A page fault occurs in the gfx job because, during the lifetime
of the gfx job, we manually fake-signal all jobs from its entity
in entity_fini(). The unmapping/clear-PTE job that depends on those
result fences is therefore satisfied, so SDMA starts clearing the PTEs,
which leads to a GFX page fault.

fix:
1) At least wait for all already-scheduled jobs to complete in
entity_fini() when SIGKILL is the cause.

2) If a signaled fence tries to clear some entity's dependency, set
that entity guilty to prevent its jobs from really running, since the
dependency was only fake-signaled.
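
The second point of the fix can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the scheduler's code: `toy_fence`, `toy_dep_entity`, and every function below are hypothetical names, and `-ECANCELED` is a placeholder for whatever error the real code attaches to a fake-signaled fence.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence: signaled flag plus an error code. */
struct toy_fence {
    bool signaled;
    int error;              /* 0 = really completed, <0 = fake-signaled */
};

/* Hypothetical stand-in for an entity waiting on one dependency. */
struct toy_dep_entity {
    const struct toy_fence *dependency;
    bool guilty;
};

/* Signal a fence once, recording whether it completed or was cancelled. */
void toy_fence_signal(struct toy_fence *f, int error)
{
    assert(f != NULL && !f->signaled);
    f->error = error;
    f->signaled = true;
}

/*
 * Try to clear the entity's dependency. Returns false while the fence is
 * still pending. If the fence was signaled with an error, the dependency
 * is cleared but the entity is flagged guilty.
 */
bool toy_dep_try_clear(struct toy_dep_entity *e)
{
    assert(e != NULL);
    if (e->dependency == NULL)
        return true;
    if (!e->dependency->signaled)
        return false;
    if (e->dependency->error != 0)
        e->guilty = true;
    e->dependency = NULL;
    return true;
}

/* A job may only really run if nothing is pending and the entity is clean. */
bool toy_dep_should_run(const struct toy_dep_entity *e)
{
    return e->dependency == NULL && !e->guilty;
}
```

In this model an entity whose dependency completed normally proceeds, while one whose dependency was cancelled is unblocked but never executes, which is the behaviour the fix describes.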

v2:
Split drm_sched_entity_fini() into two functions:
1) The first one does the waiting, removes the entity from the
runqueue, and returns an error when the process was killed.
2) The second one then goes over the entity, installs it as the
completion signal for the remaining jobs, and signals all jobs
with an error code.
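
The two-function split described above can be sketched as a user-space toy model. All names here (`toy_entity`, `toy_entity_flush`, `toy_entity_cleanup`) are hypothetical, the "wait" is simulated by a state change, and the error codes `-EINTR` and `-ECANCELED` are placeholders rather than the values the scheduler uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_JOBS 8

enum toy_job_state { TOY_QUEUED, TOY_SCHEDULED, TOY_DONE };

struct toy_job {
    enum toy_job_state state;
    int error;                /* 0 on success, negative errno otherwise */
};

struct toy_entity {
    struct toy_job jobs[TOY_MAX_JOBS];
    size_t num_jobs;
    bool on_runqueue;
    int fini_status;          /* result of phase one, read by phase two */
};

/*
 * Phase one: let jobs that were already scheduled finish, remove the
 * entity from the runqueue, and report an error if the process was killed.
 */
int toy_entity_flush(struct toy_entity *e, bool killed)
{
    assert(e != NULL);
    for (size_t i = 0; i < e->num_jobs; i++) {
        if (e->jobs[i].state == TOY_SCHEDULED)
            e->jobs[i].state = TOY_DONE;    /* stands in for a real wait */
    }
    e->on_runqueue = false;
    e->fini_status = killed ? -EINTR : 0;
    return e->fini_status;
}

/*
 * Phase two: if phase one reported a kill, complete every job that never
 * reached the hardware with an error code. Returns how many were cancelled.
 */
size_t toy_entity_cleanup(struct toy_entity *e)
{
    size_t cancelled = 0;

    assert(e != NULL && !e->on_runqueue);
    if (e->fini_status == 0)
        return 0;
    for (size_t i = 0; i < e->num_jobs; i++) {
        if (e->jobs[i].state == TOY_QUEUED) {
            e->jobs[i].error = -ECANCELED;
            e->jobs[i].state = TOY_DONE;
            cancelled++;
        }
    }
    return cancelled;
}
```

Keeping the phases separate lets a caller do other teardown work between them, so jobs already running finish against valid state and only never-started jobs are completed with an error.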

v3:
1) Replace fini1 and fini2 with better names.
2) Call the first part before the VM teardown in
amdgpu_driver_postclose_kms() and the second part
after the VM teardown.
3) Keep the original function drm_sched_entity_fini() to
refine the code.

v4:
1)Rename entity->finished to entity->last_scheduled;
2)Rename drm_sched_entity_fini_job_cb() to
drm_sched_entity_kill_jobs_cb();
3)Pass NULL to drm_sched_entity_fini_job_cb() if -ENOENT;
4)Replace the type of entity->fini_status with "int";
5)Remove the check about entity->finished.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:17 -05:00
..
amdgpu.h drm/gpu-sched: fix force APP kill hang(v4) 2018-05-15 13:43:17 -05:00
amdgpu_acp.c drm/amdgpu: Get pci resource directly through adev 2018-04-11 13:07:50 -05:00
amdgpu_acp.h
amdgpu_acpi.c
amdgpu_afmt.c
amdgpu_amdkfd.c
amdgpu_amdkfd.h
amdgpu_amdkfd_fence.c
amdgpu_amdkfd_gfx_v7.c
amdgpu_amdkfd_gfx_v8.c
amdgpu_amdkfd_gpuvm.c
amdgpu_atombios.c
amdgpu_atombios.h
amdgpu_atomfirmware.c
amdgpu_atomfirmware.h
amdgpu_atpx_handler.c
amdgpu_benchmark.c
amdgpu_bios.c
amdgpu_bo_list.c
amdgpu_cgs.c drm/amdgpu: remove duplicate cg/pg wrapper functions 2018-04-11 13:07:53 -05:00
amdgpu_connectors.c
amdgpu_connectors.h
amdgpu_cs.c drm/amdgpu: fix and cleanup cpu visible VRAM handling 2018-05-15 13:43:05 -05:00
amdgpu_ctx.c drm/gpu-sched: fix force APP kill hang(v4) 2018-05-15 13:43:17 -05:00
amdgpu_debugfs.c drm/amdgpu: Use dpm_enabled as dpm state flag 2018-04-11 13:07:49 -05:00
amdgpu_debugfs.h
amdgpu_device.c drm/amd/display: Remove PRE_VEGA flag 2018-05-15 13:43:06 -05:00
amdgpu_display.c drm/amdgpu: Move GEM BO to drm_framebuffer 2018-04-11 13:07:56 -05:00
amdgpu_display.h
amdgpu_dpm.c drm/amdgpu: Set pm_display_cfg in non-dc mode 2018-04-11 13:07:51 -05:00
amdgpu_dpm.h drm/amdgpu: Set pm_display_cfg in non-dc mode 2018-04-11 13:07:51 -05:00
amdgpu_drv.c drm/amdgpu: Fix memory leaks at amdgpu_init() error path 2018-04-03 13:08:46 -05:00
amdgpu_drv.h
amdgpu_encoders.c
amdgpu_fb.c drm/amdgpu: Move GEM BO to drm_framebuffer 2018-04-11 13:07:56 -05:00
amdgpu_fence.c drm/amdgpu: drop compute ring timeout setting for non-sriov only (v2) 2018-04-03 12:52:56 -05:00
amdgpu_gart.c
amdgpu_gart.h
amdgpu_gds.h
amdgpu_gem.c drm/amdgpu: Don't change preferred domian when fallback GTT v6 2018-04-11 13:08:00 -05:00
amdgpu_gfx.c
amdgpu_gfx.h
amdgpu_gmc.h
amdgpu_gtt_mgr.c
amdgpu_i2c.c
amdgpu_i2c.h
amdgpu_ib.c
amdgpu_ids.c
amdgpu_ids.h
amdgpu_ih.c
amdgpu_ih.h
amdgpu_ioc32.c
amdgpu_irq.c
amdgpu_irq.h
amdgpu_job.c
amdgpu_kms.c drm/gpu-sched: fix force APP kill hang(v4) 2018-05-15 13:43:17 -05:00
amdgpu_mn.c
amdgpu_mn.h
amdgpu_mode.h drm/amdgpu: Move GEM BO to drm_framebuffer 2018-04-11 13:07:56 -05:00
amdgpu_object.c drm/amdgpu: Free VGA stolen memory as soon as possible. 2018-05-15 13:43:16 -05:00
amdgpu_object.h drm/amdgpu: Free VGA stolen memory as soon as possible. 2018-05-15 13:43:16 -05:00
amdgpu_pll.c
amdgpu_pll.h
amdgpu_pm.c drm/amdgpu: add documentation on hwmon interfaces exposed (v3) 2018-04-11 13:07:57 -05:00
amdgpu_pm.h
amdgpu_prime.c
amdgpu_psp.c drm/amdgpu: fix null pointer panic with direct fw loading on gpu reset 2018-05-15 13:43:03 -05:00
amdgpu_psp.h
amdgpu_queue_mgr.c
amdgpu_ring.c drm/amdgpu: add emit_reg_write_reg_wait ring callback 2018-05-15 13:43:13 -05:00
amdgpu_ring.h drm/amdgpu: add emit_reg_write_reg_wait ring callback 2018-05-15 13:43:13 -05:00
amdgpu_sa.c
amdgpu_sched.c
amdgpu_sched.h
amdgpu_sync.c
amdgpu_sync.h
amdgpu_test.c
amdgpu_trace.h
amdgpu_trace_points.c
amdgpu_ttm.c drm/amdgpu: Free VGA stolen memory as soon as possible. 2018-05-15 13:43:16 -05:00
amdgpu_ttm.h drm/amdgpu: Free VGA stolen memory as soon as possible. 2018-05-15 13:43:16 -05:00
amdgpu_ucode.c
amdgpu_ucode.h
amdgpu_uvd.c
amdgpu_uvd.h
amdgpu_vce.c drm/amdgpu: Added support for MV packet 2018-04-11 13:08:02 -05:00
amdgpu_vce.h
amdgpu_vcn.c
amdgpu_vcn.h
amdgpu_vf_error.c
amdgpu_vf_error.h
amdgpu_virt.c
amdgpu_virt.h
amdgpu_vm.c
amdgpu_vm.h
amdgpu_vram_mgr.c
atom.c
atom.h
atombios_crtc.c
atombios_crtc.h
atombios_dp.c
atombios_dp.h
atombios_encoders.c
atombios_encoders.h
atombios_i2c.c
atombios_i2c.h
ci_dpm.c drm/amdgpu: Use dpm_enabled as dpm state flag 2018-04-11 13:07:49 -05:00
ci_dpm.h
ci_smc.c
cik.c drm/amdgpu/cik: implement asic need_full_reset callback 2018-04-11 13:07:58 -05:00
cik.h
cik_dpm.h
cik_ih.c
cik_ih.h
cik_sdma.c drm/amdgpu/sdma: fix mask in emit_pipeline_sync 2018-04-03 12:52:58 -05:00
cik_sdma.h
cikd.h
clearstate_ci.h
clearstate_defs.h
clearstate_gfx9.h
clearstate_si.h
clearstate_vi.h
cz_ih.c
cz_ih.h
dce_v6_0.c drm/amdgpu: Move GEM BO to drm_framebuffer 2018-04-11 13:07:56 -05:00
dce_v6_0.h
dce_v8_0.c drm/amdgpu: Move GEM BO to drm_framebuffer 2018-04-11 13:07:56 -05:00
dce_v8_0.h
dce_v10_0.c drm/amdgpu: Move GEM BO to drm_framebuffer 2018-04-11 13:07:56 -05:00
dce_v10_0.h
dce_v11_0.c drm/amdgpu: Move GEM BO to drm_framebuffer 2018-04-11 13:07:56 -05:00
dce_v11_0.h
dce_virtual.c drm/amdgpu: Move GEM BO to drm_framebuffer 2018-04-11 13:07:56 -05:00
dce_virtual.h
df_v1_7.c drm/amdgpu/df: implement df v1_7 callback functions 2018-04-11 13:07:54 -05:00
df_v1_7.h drm/amdgpu/df: implement df v1_7 callback functions 2018-04-11 13:07:54 -05:00
emu_soc.c
gfx_v6_0.c drm/amdgpu: Add support for SRBM selection v3 2018-04-03 13:08:44 -05:00
gfx_v6_0.h
gfx_v7_0.c drm/amdgpu: Add support for SRBM selection v3 2018-04-03 13:08:44 -05:00
gfx_v7_0.h
gfx_v8_0.c drm/amdgpu: Add support for SRBM selection v3 2018-04-03 13:08:44 -05:00
gfx_v8_0.h
gfx_v9_0.c drm/amdgpu/gfx9: add emit_reg_write_reg_wait ring callback (v2) 2018-05-15 13:43:13 -05:00
gfx_v9_0.h
gfxhub_v1_0.c
gfxhub_v1_0.h
gmc_v6_0.c drm/amdgpu: Free VGA stolen memory as soon as possible. 2018-05-15 13:43:16 -05:00
gmc_v6_0.h
gmc_v7_0.c drm/amdgpu: Free VGA stolen memory as soon as possible. 2018-05-15 13:43:16 -05:00
gmc_v7_0.h
gmc_v8_0.c drm/amdgpu: Free VGA stolen memory as soon as possible. 2018-05-15 13:43:16 -05:00
gmc_v8_0.h
gmc_v9_0.c drm/amdgpu: Free VGA stolen memory as soon as possible. 2018-05-15 13:43:16 -05:00
gmc_v9_0.h
iceland_ih.c
iceland_ih.h
iceland_sdma_pkt_open.h
Kconfig
kv_dpm.c drm/amdgpu: Use dpm_enabled as dpm state flag 2018-04-11 13:07:49 -05:00
kv_dpm.h
kv_smc.c
Makefile drm/amdgpu/df: implement df v1_7 callback functions 2018-04-11 13:07:54 -05:00
mmhub_v1_0.c
mmhub_v1_0.h
mmsch_v1_0.h
mxgpu_ai.c
mxgpu_ai.h
mxgpu_vi.c
mxgpu_vi.h
nbio_v6_1.c
nbio_v6_1.h
nbio_v7_0.c
nbio_v7_0.h
ObjectID.h
ppsmc.h
psp_gfx_if.h
psp_v3_1.c
psp_v3_1.h
psp_v10_0.c
psp_v10_0.h
r600_dpm.h
sdma_v2_4.c drm/amdgpu/sdma: fix mask in emit_pipeline_sync 2018-04-03 12:52:58 -05:00
sdma_v2_4.h
sdma_v3_0.c drm/amdgpu/sdma: fix mask in emit_pipeline_sync 2018-04-03 12:52:58 -05:00
sdma_v3_0.h
sdma_v4_0.c drm/amdgpu/sdma4: add emit_reg_write_reg_wait ring callback (v2) 2018-05-15 13:43:14 -05:00
sdma_v4_0.h
si.c drm/amdgpu/si: implement asic need_full_reset callback 2018-04-11 13:07:57 -05:00
si.h
si_dma.c
si_dma.h
si_dpm.c drm/amdgpu: Use dpm_enabled as dpm state flag 2018-04-11 13:07:49 -05:00
si_dpm.h
si_enums.h
si_ih.c
si_ih.h
si_smc.c
sid.h
sislands_smc.h
soc15.c drm/amdgpu/gfx9: cache DB_DEBUG2 and make it available to userspace 2018-05-15 13:43:11 -05:00
soc15.h
soc15_common.h
soc15d.h
tonga_ih.c
tonga_ih.h
tonga_sdma_pkt_open.h
uvd_v4_2.c drm/amdgpu: Use dpm_enabled as dpm state flag 2018-04-11 13:07:49 -05:00
uvd_v4_2.h
uvd_v5_0.c
uvd_v5_0.h
uvd_v6_0.c
uvd_v6_0.h
uvd_v7_0.c drm/amdgpu/uvd7: add emit_reg_write_reg_wait ring callback 2018-05-15 13:43:14 -05:00
uvd_v7_0.h
vce_v2_0.c
vce_v2_0.h
vce_v3_0.c drm/amdgpu: Add APU support in vi_set_vce_clocks 2018-05-15 13:43:08 -05:00
vce_v3_0.h
vce_v4_0.c drm/amdgpu/vce4: add emit_reg_write_reg_wait ring callback 2018-05-15 13:43:15 -05:00
vce_v4_0.h
vcn_v1_0.c drm/amdgpu/vcn1: add emit_reg_write_reg_wait ring callback 2018-05-15 13:43:15 -05:00
vcn_v1_0.h
vega10_ih.c
vega10_ih.h
vega10_reg_init.c drm/amdgpu: add MP1 and THM hw ip base reg offset 2018-05-15 13:43:04 -05:00
vega10_sdma_pkt_open.h
vi.c drm/amdgpu: Add APU support in vi_set_vce_clocks 2018-05-15 13:43:08 -05:00
vi.h
vi_dpm.h
vid.h