Star64_linux/fs/btrfs
Josef Bacik 351cbf6e44 btrfs: use nofs allocations for running delayed items
Zygo reported the following lockdep splat while testing the balance
patches

======================================================
WARNING: possible circular locking dependency detected
5.6.0-c6f0579d496a+ #53 Not tainted
------------------------------------------------------
kswapd0/1133 is trying to acquire lock:
ffff888092f622c0 (&delayed_node->mutex){+.+.}, at: __btrfs_release_delayed_node+0x7c/0x5b0

but task is already holding lock:
ffffffff8fc5f860 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}:
       fs_reclaim_acquire.part.91+0x29/0x30
       fs_reclaim_acquire+0x19/0x20
       kmem_cache_alloc_trace+0x32/0x740
       add_block_entry+0x45/0x260
       btrfs_ref_tree_mod+0x6e2/0x8b0
       btrfs_alloc_tree_block+0x789/0x880
       alloc_tree_block_no_bg_flush+0xc6/0xf0
       __btrfs_cow_block+0x270/0x940
       btrfs_cow_block+0x1ba/0x3a0
       btrfs_search_slot+0x999/0x1030
       btrfs_insert_empty_items+0x81/0xe0
       btrfs_insert_delayed_items+0x128/0x7d0
       __btrfs_run_delayed_items+0xf4/0x2a0
       btrfs_run_delayed_items+0x13/0x20
       btrfs_commit_transaction+0x5cc/0x1390
       insert_balance_item.isra.39+0x6b2/0x6e0
       btrfs_balance+0x72d/0x18d0
       btrfs_ioctl_balance+0x3de/0x4c0
       btrfs_ioctl+0x30ab/0x44a0
       ksys_ioctl+0xa1/0xe0
       __x64_sys_ioctl+0x43/0x50
       do_syscall_64+0x77/0x2c0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&delayed_node->mutex){+.+.}:
       __lock_acquire+0x197e/0x2550
       lock_acquire+0x103/0x220
       __mutex_lock+0x13d/0xce0
       mutex_lock_nested+0x1b/0x20
       __btrfs_release_delayed_node+0x7c/0x5b0
       btrfs_remove_delayed_node+0x49/0x50
       btrfs_evict_inode+0x6fc/0x900
       evict+0x19a/0x2c0
       dispose_list+0xa0/0xe0
       prune_icache_sb+0xbd/0xf0
       super_cache_scan+0x1b5/0x250
       do_shrink_slab+0x1f6/0x530
       shrink_slab+0x32e/0x410
       shrink_node+0x2a5/0xba0
       balance_pgdat+0x4bd/0x8a0
       kswapd+0x35a/0x800
       kthread+0x1e9/0x210
       ret_from_fork+0x3a/0x50

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&delayed_node->mutex);
                               lock(fs_reclaim);
  lock(&delayed_node->mutex);

 *** DEADLOCK ***

3 locks held by kswapd0/1133:
 #0: ffffffff8fc5f860 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
 #1: ffffffff8fc380d8 (shrinker_rwsem){++++}, at: shrink_slab+0x1e8/0x410
 #2: ffff8881e0e6c0e8 (&type->s_umount_key#42){++++}, at: trylock_super+0x1b/0x70

stack backtrace:
CPU: 2 PID: 1133 Comm: kswapd0 Not tainted 5.6.0-c6f0579d496a+ #53
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
 dump_stack+0xc1/0x11a
 print_circular_bug.isra.38.cold.57+0x145/0x14a
 check_noncircular+0x2a9/0x2f0
 ? print_circular_bug.isra.38+0x130/0x130
 ? stack_trace_consume_entry+0x90/0x90
 ? save_trace+0x3cc/0x420
 __lock_acquire+0x197e/0x2550
 ? btrfs_inode_clear_file_extent_range+0x9b/0xb0
 ? register_lock_class+0x960/0x960
 lock_acquire+0x103/0x220
 ? __btrfs_release_delayed_node+0x7c/0x5b0
 __mutex_lock+0x13d/0xce0
 ? __btrfs_release_delayed_node+0x7c/0x5b0
 ? __asan_loadN+0xf/0x20
 ? pvclock_clocksource_read+0xeb/0x190
 ? __btrfs_release_delayed_node+0x7c/0x5b0
 ? mutex_lock_io_nested+0xc20/0xc20
 ? __kasan_check_read+0x11/0x20
 ? check_chain_key+0x1e6/0x2e0
 mutex_lock_nested+0x1b/0x20
 ? mutex_lock_nested+0x1b/0x20
 __btrfs_release_delayed_node+0x7c/0x5b0
 btrfs_remove_delayed_node+0x49/0x50
 btrfs_evict_inode+0x6fc/0x900
 ? btrfs_setattr+0x840/0x840
 ? do_raw_spin_unlock+0xa8/0x140
 evict+0x19a/0x2c0
 dispose_list+0xa0/0xe0
 prune_icache_sb+0xbd/0xf0
 ? invalidate_inodes+0x310/0x310
 super_cache_scan+0x1b5/0x250
 do_shrink_slab+0x1f6/0x530
 shrink_slab+0x32e/0x410
 ? do_shrink_slab+0x530/0x530
 ? do_shrink_slab+0x530/0x530
 ? __kasan_check_read+0x11/0x20
 ? mem_cgroup_protected+0x13d/0x260
 shrink_node+0x2a5/0xba0
 balance_pgdat+0x4bd/0x8a0
 ? mem_cgroup_shrink_node+0x490/0x490
 ? _raw_spin_unlock_irq+0x27/0x40
 ? finish_task_switch+0xce/0x390
 ? rcu_read_lock_bh_held+0xb0/0xb0
 kswapd+0x35a/0x800
 ? _raw_spin_unlock_irqrestore+0x4c/0x60
 ? balance_pgdat+0x8a0/0x8a0
 ? finish_wait+0x110/0x110
 ? __kasan_check_read+0x11/0x20
 ? __kthread_parkme+0xc6/0xe0
 ? balance_pgdat+0x8a0/0x8a0
 kthread+0x1e9/0x210
 ? kthread_create_worker_on_cpu+0xc0/0xc0
 ret_from_fork+0x3a/0x50

This is because we hold that delayed node's mutex while doing tree
operations.  Fix this by just wrapping the searches in nofs.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-25 16:26:00 +01:00
..
tests btrfs: kill the subvol_srcu 2020-03-23 17:02:00 +01:00
acl.c
async-thread.c Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
async-thread.h Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
backref.c btrfs: do not resolve backrefs for roots that are being deleted 2020-03-23 17:03:51 +01:00
backref.h btrfs: relocation: Use btrfs_find_all_leafs to locate data extent parent tree leaves 2020-03-23 17:01:57 +01:00
block-group.c btrfs: add RCU locks around block group initialization 2020-03-23 17:01:53 +01:00
block-group.h
block-rsv.c btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
block-rsv.h btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
btrfs_inode.h btrfs: introduce per-inode file extent tree 2020-03-23 17:01:24 +01:00
check-integrity.c btrfs: remove buffer_heads form super block mirror integrity checking 2020-03-23 17:01:40 +01:00
check-integrity.h btrfs: remove btrfsic_submit_bh() 2020-03-23 17:01:39 +01:00
compression.c
compression.h
ctree.c btrfs: inline checksum name and driver definitions 2020-03-23 17:01:52 +01:00
ctree.h btrfs: kill the subvol_srcu 2020-03-23 17:02:00 +01:00
delalloc-space.c btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
delalloc-space.h
delayed-inode.c btrfs: use nofs allocations for running delayed items 2020-03-25 16:26:00 +01:00
delayed-inode.h btrfs: delayed-inode: Replace zero-length array with flexible-array member 2020-03-23 17:01:53 +01:00
delayed-ref.c btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
delayed-ref.h
dev-replace.c btrfs: sysfs, rename device_link add/remove functions 2020-03-23 17:01:35 +01:00
dev-replace.h
dir-item.c
discard.c
discard.h
disk-io.c btrfs: kill the subvol_srcu 2020-03-23 17:02:00 +01:00
disk-io.h btrfs: move the root freeing stuff into btrfs_put_root 2020-03-23 17:01:59 +01:00
export.c btrfs: kill the subvol_srcu 2020-03-23 17:02:00 +01:00
export.h btrfs: export helpers for subvolume name/id resolution 2020-03-23 17:01:42 +01:00
extent-io-tree.h btrfs: switch to per-transaction pinned extents 2020-03-23 17:01:38 +01:00
extent-tree.c btrfs: do not use readahead for running delayed refs 2020-03-23 17:03:50 +01:00
extent_io.c btrfs: move the root freeing stuff into btrfs_put_root 2020-03-23 17:01:59 +01:00
extent_io.h btrfs: make the extent buffer leak check per fs info 2020-03-23 17:01:58 +01:00
extent_map.c Btrfs: fix race between using extent maps and merging them 2020-02-12 17:16:46 +01:00
extent_map.h
file-item.c btrfs: add helper to get the end offset of a file extent item 2020-03-23 17:01:56 +01:00
file.c btrfs: kill the subvol_srcu 2020-03-23 17:02:00 +01:00
free-space-cache.c btrfs: simplify error handling in __btrfs_write_out_cache() 2020-03-23 17:01:43 +01:00
free-space-cache.h
free-space-tree.c btrfs: move the root freeing stuff into btrfs_put_root 2020-03-23 17:01:59 +01:00
free-space-tree.h
inode-item.c
inode-map.c btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
inode-map.h
inode.c btrfs: kill the subvol_srcu 2020-03-23 17:02:00 +01:00
ioctl.c btrfs: Remove async_transid from btrfs_mksubvol/create_subvol/create_snapshot 2020-03-23 17:02:00 +01:00
Kconfig
locking.c btrfs: Implement DREW lock 2020-03-23 17:01:43 +01:00
locking.h btrfs: Implement DREW lock 2020-03-23 17:01:43 +01:00
lzo.c
Makefile Btrfs: move all reflink implementation code into its own file 2020-03-23 17:01:54 +01:00
misc.h
ordered-data.c btrfs: drop argument tree from btrfs_lock_and_flush_ordered_range 2020-03-23 17:01:34 +01:00
ordered-data.h btrfs: drop argument tree from btrfs_lock_and_flush_ordered_range 2020-03-23 17:01:34 +01:00
orphan.c
print-tree.c
print-tree.h
props.c btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
props.h
qgroup.c btrfs: move the root freeing stuff into btrfs_put_root 2020-03-23 17:01:59 +01:00
qgroup.h btrfs: destroy qgroup extent records on transaction abort 2020-02-19 00:35:54 +01:00
raid56.c btrfs: use struct_size to calculate size of raid hash table 2020-03-23 17:01:44 +01:00
raid56.h
rcu-string.h btrfs: rcu-string: Replace zero-length array with flexible-array member 2020-03-23 17:01:53 +01:00
reada.c
ref-verify.c btrfs: fix ref-verify to catch operations on 0 ref extents 2020-03-23 17:01:56 +01:00
ref-verify.h
reflink.c Btrfs: implement full reflink support for inline extents 2020-03-23 17:01:54 +01:00
reflink.h Btrfs: move all reflink implementation code into its own file 2020-03-23 17:01:54 +01:00
relocation.c btrfs: track reloc roots based on their commit root bytenr 2020-03-23 17:03:51 +01:00
root-tree.c btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
scrub.c btrfs: scrub: Replace zero-length array with flexible-array member 2020-03-23 17:01:54 +01:00
send.c btrfs: kill the subvol_srcu 2020-03-23 17:02:00 +01:00
send.h
space-info.c btrfs: account ticket size at add/delete time 2020-03-23 17:01:55 +01:00
space-info.h btrfs: account ticket size at add/delete time 2020-03-23 17:01:55 +01:00
struct-funcs.c
super.c btrfs: adjust message level for unrecognized mount option 2020-03-23 17:01:45 +01:00
sysfs.c btrfs: sysfs: Use scnprintf() instead of snprintf() 2020-03-23 18:14:47 +01:00
sysfs.h btrfs: sysfs, rename device_link add/remove functions 2020-03-23 17:01:35 +01:00
transaction.c btrfs: hold a ref on the root on the dead roots list 2020-03-23 17:01:59 +01:00
transaction.h btrfs: switch to per-transaction pinned extents 2020-03-23 17:01:38 +01:00
tree-checker.c
tree-checker.h
tree-defrag.c
tree-log.c btrfs: move the root freeing stuff into btrfs_put_root 2020-03-23 17:01:59 +01:00
tree-log.h
ulist.c
ulist.h
uuid-tree.c btrfs: bail out of uuid tree scanning if we're closing 2020-03-23 17:01:41 +01:00
volumes.c btrfs: relocation: add error injection points for cancelling balance 2020-03-23 17:01:54 +01:00
volumes.h btrfs: introduce chunk allocation policy 2020-03-23 17:01:48 +01:00
xattr.c
xattr.h
zlib.c
zstd.c