summaryrefslogtreecommitdiff
path: root/fs/btrfs/backref.c
diff options
context:
space:
mode:
authorQu Wenruo <wqu@suse.com>2025-10-19 11:15:27 +1030
committerDavid Sterba <dsterba@suse.com>2025-11-24 22:24:52 +0100
commitc7b478504b2e5a8e428eac4c16925d52c8deb6bd (patch)
tree2a2d8ca80af75ecdd9fed2ffd312b20e4629200e /fs/btrfs/backref.c
parent02a7e90797be89ff4f6bdf1d1fbab26964b0c13a (diff)
btrfs: scrub: cancel the run if the process or fs is being frozen
It's a known bug that btrfs scrub/dev-replace can prevent the system from suspending. There are at least two factors involved: - Holding super_block::s_writers for the whole scrub/dev-replace duration We hold that percpu rw semaphore through mnt_want_write_file() for the whole scrub/dev-replace duration. That will prevent the fs being frozen, which can be initiated by either the user (e.g. fsfreeze) or power management suspend/hibernate. - Stuck in the kernel space for a long time During suspend all user processes (and some kernel threads) will be frozen. But if a user space progress has fallen into kernel (scrub ioctl) and do not return for a long time, it will make process freezing time out. Unfortunately scrub/dev-replace is a long running ioctl, and it will prevent the btrfs process from returning to the user space, thus make PM suspend/hibernate time out. Address them in one go: - Introduce a new helper should_cancel_scrub() Which includes the existing cancel request and new fs/process freezing checks. Here we have to check both fs and process freezing for PM suspend/hibernate. PM can be configured to freeze filesystems before processes. (The current default is not to freeze filesystems, but planned to freeze the filesystems as the new default.) Checking only fs freezing will fail PM without fs freezing, as the process freezing will time out. Checking only process freezing will fail PM with fs freezing since the fs freezing happens before process freezing. And the return value will indicate the reason, -ECANCLED for the explicitly canceled runs, and -EINTR for fs freeze or PM reasons. - Cancel the run if should_cancel_scrub() is true Unfortunately canceling is the only feasible solution here, pausing is not possible as we will still stay in the kernel space thus will still prevent the process from being frozen. This will cause a user impacting behavior change: Dev-replace can be interrupted by PM, and there is no way to resume but start from the beginning again. This means dev-replace may fail on newer kernels, and end users will need extra steps like using systemd-inhibit to prevent suspend/hibernate, to get back the old uninterrupted behavior. This behavior change will need extra documentation updates and communication with projects involving scrub/dev-replace including btrfs-progs. Reviewed-by: Filipe Manana <fdmanana@suse.com> Link: https://lore.kernel.org/linux-btrfs/d93b2a2d-6ad9-4c49-809f-11d769a6f30a@app.fastmail.com/ Reported-by: Chris Murphy <lists@colorremedies.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'fs/btrfs/backref.c')
0 files changed, 0 insertions, 0 deletions