On Thu, Apr 27, 2023 at 03:19:12PM +0100, Richard W.M. Jones wrote:
We've long known that nbdkit-memory-plugin with the default
sparse
array allocator suffers because a global lock must be taken whenever
any read or write operation is performed. This commit aims to safely
improve performance by converting the lock into a read-write lock.
I searched for other approaches, and found this to be an interesting read:
https://stackoverflow.com/questions/2407558/pthreads-reader-writer-locks-...
One of the suggestions there is using fcntl() locking (which does
support upgrade attempts) instead of pthread_rwlock*. Might be
interesting to code up and compare an approach along those lines.
+++ b/common/allocators/sparse.c
@@ -129,11 +129,20 @@ DEFINE_VECTOR_TYPE (l1_dir, struct l1_entry);
struct sparse_array {
struct allocator a; /* Must come first. */
- /* This lock is highly contended. When hammering the memory plugin
- * with 8 fio threads, about 30% of the total system time was taken
- * just waiting for this lock. Fixing this is quite hard.
+ /* The shared (read) lock must be held if you just want to access
+ * the data without modifying any of the L1/L2 metadata or
+ * allocating or freeing any page.
+ *
+ * To modify the L1/L2 metadata including allocating or freeing any
+ * page you must hold the exclusive (write) lock.
+ *
+ * Because POSIX rwlocks are not upgradable this presents a problem.
+ * We solve it below by speculatively performing the request while
+ * holding the shared lock, but if we run into an operation that
+ * needs to update the metadata, we restart the entire request
+ * holding the exclusive lock.
*/
- pthread_mutex_t lock;
+ pthread_rwlock_t lock;
But as written, this does look like a nice division of labor that
reduces serialization in the common case. And yes, it does look like
posix_rwlock is a bit heavier-weight than bare mutex, which explains
the slowdown on the read-heavy workload even while improving the
write-heavy workload.
Reviewed-by: Eric Blake <eblake(a)redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization:
qemu.org |
libvirt.org