mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
linux/rust/kernel
Danilo Krummrich f744201c61 rust: devres: fix race in Devres::drop()
In Devres::drop() we first remove the devres action and then drop the
wrapped device resource.

The design goal is to give the owner of a Devres object control over when
the device resource is dropped, but limit the overall scope to the
corresponding device being bound to a driver.

However, there's a race that was introduced with commit 8ff656643d
("rust: devres: remove action in `Devres::drop`"), but has also been
(partially) present since the initial version.

In Devres::drop(), the devres action is removed successfully and
subsequently the destructor of the wrapped device resource runs.
However, there is no guarantee that the destructor of the wrapped device
resource completes before the driver core is done unbinding the
corresponding device.

If, in Devres::drop(), the devres action can't be removed, it means that
the devres callback has been executed already, or is still running
concurrently. In the latter case, either Devres::drop() or the devres
callback wins the race to revoke the Revocable. If Devres::drop() wins,
we (again) have no guarantee that the destructor of the wrapped device
resource completes before the driver core is done unbinding the
corresponding device.

CPU0					CPU1
------------------------------------------------------------------------
Devres::drop() {			Devres::devres_callback() {
   self.data.revoke() {			   this.data.revoke() {
      is_available.swap() == true
					      is_available.swap() == false
					   }
					}

					// [...]
					// device fully unbound
      drop_in_place() {
         // release device resource
      }
   }
}

Depending on the specific device resource, this can potentially lead to
use-after-free bugs.
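
The interleaving above depends on the swap in Revocable::revoke(): exactly
one caller observes the flag as still set, and the other returns right away
without knowing whether the winner has finished releasing the resource. A
minimal sketch in plain Rust (std only), assuming a hypothetical `Flag` type
that models nothing but the is_available atomic; this is not the kernel's
Revocable:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical model of the `is_available` flag inside `Revocable`.
/// It deliberately leaves out the wrapped data and the RCU handling.
pub struct Flag {
    is_available: AtomicBool,
}

impl Flag {
    pub fn new() -> Self {
        Self { is_available: AtomicBool::new(true) }
    }

    /// Returns `true` only for the single caller that flipped the flag.
    /// Every other caller gets `false` and does not wait for anything.
    pub fn revoke(&self) -> bool {
        self.is_available.swap(false, Ordering::AcqRel)
    }
}
```

Calling revoke() twice on the same `Flag` yields true and then false. The
second call does not wait for any work started by the first, which is the gap
described above.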

In order to fix this, implement the following logic.

In the devres callback, we're always good when we get to revoke the
device resource ourselves, i.e. Revocable::revoke() returns true.

If Revocable::revoke() returns false, it means that Devres::drop() is
concurrently dropping the device resource already, and we have to wait
for Devres::drop() to signal that it finished dropping the device
resource.

Note that if we hit the case where we need to wait for the completion of
Devres::drop() in the devres callback, it means that we're actually
racing with a concurrent Devres::drop() call, which already started
revoking the device resource for us. This is rather unlikely, and since
Devres::drop() is then doing our work, we just need to wait for it to
finish. Hence, there should not be any additional overhead from that.

(Actually, for now it's even better if Devres::drop() does the work for
us, since it can bypass the synchronize_rcu() call implied by
Revocable::revoke(), but this goes away anyways once I get to implement
the split devres callback approach, which allows us to first flip the
atomics of all registered Devres objects of a certain device, execute a
single synchronize_rcu() and then drop all revocable objects.)

In Devres::drop() we try to revoke the device resource. If that is *not*
successful, it means that the devres callback already did and we're good.

Otherwise, we try to remove the devres action, which, if successful,
means that we're good, since the device resource has just been revoked
by us *before* we removed the devres action successfully.

If the devres action could not be removed, it means that the devres
callback must be running concurrently, hence we signal that the device
resource has been revoked by us, using the completion.
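
The logic for both sides can be sketched in plain Rust (std only). This is a
model under stated assumptions, not the kernel code: `Completion` is a
Mutex/Condvar stand-in for the kernel completion, `Inner`, `action_pending`,
`driver_core_unbind()` and `drop_devres()` are hypothetical names, and the
returned strings exist only to make the taken path visible.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Condvar, Mutex};

/// Hypothetical stand-in for the kernel completion.
pub struct Completion {
    done: Mutex<bool>,
    cond: Condvar,
}

impl Completion {
    pub fn new() -> Self {
        Self { done: Mutex::new(false), cond: Condvar::new() }
    }
    pub fn complete_all(&self) {
        *self.done.lock().unwrap() = true;
        self.cond.notify_all();
    }
    pub fn wait_for_completion(&self) {
        let mut done = self.done.lock().unwrap();
        while !*done {
            done = self.cond.wait(done).unwrap();
        }
    }
}

/// Hypothetical model of the state shared by a Devres object and its action.
pub struct Inner {
    is_available: AtomicBool,   // models the flag inside Revocable
    action_pending: AtomicBool, // models "the devres action is still registered"
    revoked: Completion,        // signalled by the drop side when it is done
}

impl Inner {
    pub fn new() -> Self {
        Self {
            is_available: AtomicBool::new(true),
            action_pending: AtomicBool::new(true),
            revoked: Completion::new(),
        }
    }

    /// Models Revocable::revoke(): `true` only for the caller that won.
    fn revoke(&self) -> bool {
        self.is_available.swap(false, Ordering::AcqRel)
    }

    /// Models taking the action off the list; only one side can succeed.
    fn take_action(&self) -> bool {
        self.action_pending.swap(false, Ordering::AcqRel)
    }

    /// Models the driver core unbinding the device. Returns `None` if the
    /// action was removed before, so the callback never runs.
    pub fn driver_core_unbind(&self) -> Option<&'static str> {
        if !self.take_action() {
            return None;
        }
        // From here on this is the devres callback.
        if self.revoke() {
            Some("callback revoked")
        } else {
            // The drop side won the revoke; wait until it is finished.
            self.revoked.wait_for_completion();
            Some("callback waited")
        }
    }

    /// Models Devres::drop().
    pub fn drop_devres(&self) -> &'static str {
        if !self.revoke() {
            return "callback revoked first";
        }
        // The real code has released the device resource at this point.
        if self.take_action() {
            "action removed"
        } else {
            // The callback runs concurrently and may be waiting for us.
            self.revoked.complete_all();
            "signalled callback"
        }
    }
}
```

In this model, whichever side loses the revoke either has nothing left to do
(drop side) or blocks until the winner signals the completion (callback side),
so unbinding cannot finish while the resource is still being released.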

This makes it safe to drop a Devres object from any task and at any point
in time, which is one of the design goals.

Fixes: 76c01ded72 ("rust: add devres abstraction")
Reported-by: Alice Ryhl <aliceryhl@google.com>
Closes: https://lore.kernel.org/lkml/aD64YNuqbPPZHAa5@google.com/
Reviewed-by: Benno Lossin <lossin@kernel.org>
Link: https://lore.kernel.org/r/20250612121817.1621-4-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13 23:47:53 +02:00
alloc rust: alloc: add missing Markdown code span 2025-05-25 22:58:35 +02:00
block rust: convert raw URLs to Markdown autolinks in comments 2025-05-12 00:20:25 +02:00
drm Rust changes for v6.16 2025-06-04 21:18:37 -07:00
fs rust: file: improve safety comments 2025-05-30 07:12:05 +02:00
list rust: list: Fix typo much in arc.rs 2025-05-29 23:35:44 +02:00
mm mm: rust: make CONFIG_MMU ifdefs more narrow 2025-05-31 22:46:12 -07:00
net net: phy: pass PHY driver to .match_phy_device OP 2025-05-21 15:56:09 -07:00
sync rust: completion: implement initial abstraction 2025-06-13 23:46:56 +02:00
time rust: workaround bindgen issue with forward references to enum types 2025-05-22 15:39:16 +02:00
.gitignore rust: jump_label: skip formatting generated file 2024-11-20 13:32:42 -05:00
alloc.rs rust: alloc: add missing Markdown code spans 2025-05-25 22:58:35 +02:00
auxiliary.rs Rust changes for v6.16 2025-06-04 21:18:37 -07:00
block.rs rust: block: introduce kernel::block::mq module 2024-06-14 07:45:04 -06:00
build_assert.rs rust: add build_error! to the prelude 2025-01-10 00:19:09 +01:00
clk.rs rust: clk: Add initial abstractions 2025-05-19 12:55:40 +05:30
configfs.rs Rust changes for v6.16 2025-06-04 21:18:37 -07:00
cpu.rs rust: cpu: Add from_cpu() 2025-05-20 10:04:06 +05:30
cpufreq.rs Rust changes for v6.16 2025-06-04 21:18:37 -07:00
cpumask.rs rust: cpumask: Add initial abstractions 2025-05-19 12:55:40 +05:30
cred.rs cred,rust: mark Credential methods inline 2025-03-04 17:07:49 -05:00
device_id.rs rust: use absolute paths in macros referencing core and kernel 2025-05-23 00:12:14 +02:00
device.rs Rust changes for v6.16 2025-06-04 21:18:37 -07:00
devres.rs rust: devres: fix race in Devres::drop() 2025-06-13 23:47:53 +02:00
dma.rs Rust changes for v6.16 2025-06-04 21:18:37 -07:00
driver.rs rust: make pin-init its own crate 2025-03-16 21:59:19 +01:00
error.rs Rust changes for v6.15 2025-03-30 17:03:26 -07:00
faux.rs rust/kernel/faux: mark Registration methods inline 2025-03-11 10:42:23 +01:00
firmware.rs rust: firmware: Use ffi::c_char type in FwFunc 2025-04-14 14:13:23 +02:00
fs.rs rust: file: add Rust abstraction for struct file 2024-09-30 13:02:28 +02:00
generated_arch_static_branch_asm.rs.S rust: jump_label: skip formatting generated file 2024-11-20 13:32:42 -05:00
init.rs Rust changes for v6.15 2025-03-30 17:03:26 -07:00
io.rs rust: io: rename io::Io accessors 2025-02-22 15:44:19 +01:00
ioctl.rs rust: start using the #[expect(...)] attribute 2024-10-07 21:39:57 +02:00
jump_label.rs rust: jump_label: skip formatting generated file 2024-11-20 13:32:42 -05:00
kunit.rs rust: add kunit_tests to the prelude 2025-05-27 20:09:59 +02:00
lib.rs Rust changes for v6.16 2025-06-04 21:18:37 -07:00
list.rs rust: list: Add examples for linked list 2025-05-22 12:00:52 +02:00
miscdevice.rs Char/Misc/IIO pull request for 6.16-rc1 2025-06-06 11:50:47 -07:00
mm.rs mm: rust: make CONFIG_MMU ifdefs more narrow 2025-05-31 22:46:12 -07:00
net.rs rust: core abstractions for network PHY drivers 2023-12-15 09:35:50 +00:00
of.rs rust: of: add of::DeviceId abstraction 2024-12-20 17:21:04 +01:00
opp.rs rust: opp: Move cfg(CONFIG_OF) attribute to the top of doc test 2025-05-27 15:29:17 +02:00
page.rs rust: page: optimize rust symbol generation for Page 2025-05-12 00:20:25 +02:00
pci.rs Rust changes for v6.16 2025-06-04 21:18:37 -07:00
pid_namespace.rs rust: add PidNamespace 2024-10-08 15:44:36 +02:00
platform.rs Rust changes for v6.16 2025-06-04 21:18:37 -07:00
prelude.rs rust: add kunit_tests to the prelude 2025-05-27 20:09:59 +02:00
print.rs rust: replace rustdoc references to alloc::format 2025-05-12 00:20:25 +02:00
rbtree.rs rust: retain pointer mut-ness in container_of! 2025-05-28 18:54:09 +02:00
revocable.rs rust: revocable: indicate whether data has been revoked already 2025-06-13 23:46:59 +02:00
security.rs lsm,rust: reword "destroy" -> "release" in SecurityCtx 2025-03-04 15:44:46 -05:00
seq_file.rs Rust changes for v6.15 2025-03-30 17:03:26 -07:00
sizes.rs rust: sizes: add commonly used constants 2024-08-30 10:27:34 +01:00
static_assert.rs rust: use absolute paths in macros referencing core and kernel 2025-05-23 00:12:14 +02:00
std_vendor.rs rust: convert raw URLs to Markdown autolinks in comments 2025-05-12 00:20:25 +02:00
str.rs rust: str: take advantage of the -> Result support in KUnit #[test]'s 2025-05-27 20:09:59 +02:00
sync.rs rust: completion: implement initial abstraction 2025-06-13 23:46:56 +02:00
task.rs rust: task: add missing Markdown code spans and intra-doc links 2025-05-25 22:58:35 +02:00
time.rs rust: time: Introduce Instant type 2025-04-29 15:31:07 +02:00
tracepoint.rs rust: add tracepoint support 2024-11-04 16:21:44 -05:00
transmute.rs rust: kernel: move FromBytes and AsBytes traits to a new transmute module 2024-10-10 00:33:42 +02:00
types.rs Rust changes for v6.16 2025-06-04 21:18:37 -07:00
uaccess.rs Alloc changes for v6.16 2025-05-18 20:56:03 +02:00
workqueue.rs rust: workqueue: remove HasWork::OFFSET 2025-05-29 01:34:52 +02:00
xarray.rs rust: xarray: Add an abstraction for XArray 2025-05-01 11:37:59 +02:00