[cross-project PATCH v2] NBD 64-bit extensions
by Eric Blake
This is a cover letter for a set of multi-project patch series all
designed to implement 64-bit operations in NBD, and demonstrate
interoperability of the new extension between projects.
v1 of the project was attempted nearly a year ago:
https://lists.nongnu.org/archive/html/qemu-devel/2021-12/msg00453.html
Since then, I've addressed a lot of preliminary cleanups in libnbd to
make it easier to test things, and incorporated review comments issued
on v1, including:
- no orthogonality between simple/structured replies and 64-bit mode:
once extended headers are negotiated, all transmission traffic (both
from client to server and server to client) uses just one header
size
- add support for the client to pass a payload on commands to the
server, and demonstrate it by further implementing a way to pass a
flag with NBD_CMD_BLOCK_STATUS that says the client is passing a
payload to request status of just a subset of the negotiated
contexts, rather than all possible contexts that were earlier
negotiated during NBD_OPT_SET_META_CONTEXT
- tweaks to the header layouts: tweak block status to provide 64-bit
flags values (although "base:allocation" will remain usable with
just 32-bit flags); reply with offset rather than padding and with
fields rearranged for maximal sharing between client and server
layouts
- word-smithing of the NBD proposed protocol; split things into
several smaller patches, where we can choose how much to take
- more unit tests added in qemu and libnbd
- the series end with RFC patches on whether to support 64-bit hole
responses to NBD_CMD_READ, even though our current limitations say
servers don't have to accept more than a 32M read request and
therefore will never have that big of a hole to read)
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
1 year, 8 months
[V2V PATCH 0/5] Bring support for virtio-scsi back to Windows
by Andrey Drobyshev
Since commits b28cd1dc ("Remove requested_guestcaps / rcaps"), f0afc439
("Remove guestcaps_block_type Virtio_SCSI") support for installing
virtio-scsi driver is missing in virt-v2v. AFAIU plans and demands for
bringing this feature back have been out there for a while. E.g. I've
found a corresponding issue which is still open [1].
The code in b28cd1dc, f0afc439 was removed due to removing the old in-place
support. However, having the new in-place support present and bringing
this same code (partially) back with several additions and improvements,
I'm able to successfully convert and boot a Win guest with a virtio-scsi
disk controller. So please consider the following implementation of
this feature.
[1] https://github.com/libguestfs/virt-v2v/issues/12
v2v:
Andrey Drobyshev (2):
Revert "Remove guestcaps_block_type Virtio_SCSI"
convert_windows: add Inject_virtio_win.Virtio_SCSI as a possible block
type
convert/convert.ml | 2 +-
convert/convert_linux.ml | 9 +++++++--
convert/convert_windows.ml | 1 +
convert/target_bus_assignment.ml | 1 +
lib/create_ovf.ml | 1 +
lib/types.ml | 3 ++-
lib/types.mli | 2 +-
output/openstack_image_properties.ml | 7 +++++++
9 files changed, 22 insertions(+), 6 deletions(-)
common:
Andrey Drobyshev (2):
inject_virtio_win: add Virtio_SCSI to block_type
inject_virtio_win: make virtio-scsi the default block driver
Roman Kagan (1):
inject_virtio_win: match only vendor/device
mlcustomize/inject_virtio_win.ml | 25 ++++++++++++++++---------
mlcustomize/inject_virtio_win.mli | 2 +-
2 files changed, 17 insertions(+), 10 deletions(-)
--
2.31.1
1 year, 8 months
Checksums and other verification
by Richard W.M. Jones
https://github.com/kubevirt/containerized-data-importer/issues/1520
Hi Eric,
We had a question from the Kubevirt team related to the above issue.
The question is roughly if it's possible to calculate the checksum of
an image as an nbdkit filter and/or in the qemu block layer.
Supplemental #1: could qemu-img convert calculate a checksum as it goes
along?
Supplemental #2: could we detect various sorts of common errors, such
a webserver that is incorrectly configured and serves up an error page
containing "<html>"; or something which is supposed to be a disk image
but does not "look like" (in some ill-defined sense) a disk image,
eg. it has no partition table.
I'm not sure if qemu has any existing features covering the above (and
I know for sure that nbdkit doesn't).
One issue is that calculating a checksum involves a linear scan of the
image, although we can at least skip holes.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit
1 year, 8 months
[libnbd PATCH v5 0/5] vector: introduce DEFINE_POINTER_VECTOR_TYPE()
by Laszlo Ersek
V4:
http://mid.mail-archive.com/20230224043937.392235-1-lersek@redhat.com
Please see the Notes section in each patch for the updates.
Laszlo
Laszlo Ersek (5):
common/include: add TYPE_IS_POINTER() macro
common/include: extract STATIC_ASSERT() macro
force semicolon after DEFINE_VECTOR_TYPE() macro invocations
vector: introduce DEFINE_POINTER_VECTOR_TYPE()
convert string_vector_(iter(free) + reset()) to string_vector_empty()
common/include/Makefile.am | 1 +
common/include/checked-overflow.h | 8 ++--
common/include/compiler-macros.h | 31 ++++++++++++-
common/include/static-assert.h | 48 ++++++++++++++++++++
common/include/test-array-size.c | 45 ++++++------------
common/utils/string-vector.h | 2 +-
common/utils/test-vector.c | 3 +-
common/utils/vector.h | 42 ++++++++++++++++-
copy/file-ops.c | 2 +-
copy/nbd-ops.c | 2 +-
dump/dump.c | 2 +-
fuse/nbdfuse.h | 2 +-
generator/states-connect-socket-activation.c | 9 ++--
info/list.c | 2 +-
info/map.c | 2 +-
info/show.c | 3 +-
lib/handle.c | 12 ++---
lib/uri.c | 2 +-
lib/utils.c | 6 +--
ublk/nbdublk.h | 2 +-
ublk/tgt.c | 2 +-
21 files changed, 157 insertions(+), 71 deletions(-)
create mode 100644 common/include/static-assert.h
base-commit: 594f7a738e0aba23eda37965c96b3df6b8f76960
1 year, 8 months
[libnbd PATCH v3 00/29] pass LISTEN_FDNAMES with systemd socket activation
by Laszlo Ersek
Rich posted v2 at:
https://listman.redhat.com/archives/libguestfs/2023-January/030555.html
This is my stab at the feature, with a number of changes (that I felt
were prerequisites) prepended. The series is structured as follows. Note
that almost every segment of the series strongly depends on prior
segment(s); in other words, the patches are in reasonably strict
dependency order.
- Patches #1 through #4 are generic, mostly mechanical, tree-wide
cleanups, dealing with whitespace and reserved identifiers.
- Patch #5 extracts a vector usage pattern we use 11 times in the tree
to a new vector API called "name##_empty".
- Patches #6 through #8 introduce an assert() variant that is
async-signal-safe. This means we can make assertions in a child
process before reaching one of the exec functions, even if the parent
process was multi-threaded at the time of fork().
- Patches #9 and #10 introduce an execvpe() variant that is
async-signal-safe.
- Patches #11 through #25 implement the same series of cleanups *twice*,
once for CONNECT_SA.START ("socket activation"), and another time for
CONNECT_COMMAND.START. The table below describes this sub-structure:
change summary CONNECT_SA CONNECT_COMMAND
---------------------------------------------- ---------- ---------------
minor thinkos and warts #11 - #13 #18 - #21
check syscalls for errors in the child process #14 #22
centralize resource release #15 #23
plug resource leak on error #16 #24
replace execvp() call with fork-safe variant #17 #25
- Patches #26 through #29 implement the LISTEN_FDNAMES passing. This
sub-series includes two patches from Rich's v2 series (with proper
attribution).
The series buils, and passes "make check" and "make check-valgrind", at
every stage.
In the Notes sections of some of the patches, there are lines of the
form "context:-U<number>" or "context:-W". I've always wanted to control
the number of context lines at the *individual* patch level -- for
reviewing some patches, the default 3 is just fine, but for reviewing
many other patches, significantly larger contexts are beneficial. I've
finally bitten the bullet and written a (rather awkward) script that
parses this "context:" note out of each patch (where "-W" is the short
form of "--function--context") and applies it to formatting. That's why
the context size varies over the series.
The larger context size may make it harder to apply the series from the
list if the master branch advances meanwhile. I don't expect this to
happen, but in any case, I've passed "--base=master" to
git-format-patch, so that the base commit (the current HEAD of the
master branch) be noted at the end of the cover letter. That way the
series can be applied precisely, and then rebased with a separate local
step (because git has more information that way).
Laszlo
Laszlo Ersek (27):
use space consistently in function and function-like macro invocations
generator/C.ml: use space consistently in func. and func.-like macro
calls
socket activation: rename sa_(tmpdir|sockpath) to
sact_(tmpdir|sockpath)
ocaml: rename "sa_u" to "saddr_u"
vector: (mostly) factor out DEFINE_VECTOR_EMPTY
lib/utils: introduce xwrite() as a more robust write()
lib/utils: add async-signal-safe assert()
lib/utils: add unit test for async-signal-safe assert()
lib/utils: introduce async-signal-safe execvpe()
lib/utils: add unit tests for async-signal-safe execvpe()
socket activation: fix error message upon asprintf() failure
socket activation: clean up responsibilities of prep.sock.act.env.()
socket activation: avoid manipulating the sign bit
socket activation: check syscalls for errors in the child process
socket activation: centralize resource release
socket activation: plug AF_UNIX socket address (filesystem) leak on
error
socket activation: replace execvp() call with fork-safe variant
CONNECT_COMMAND.START: fix small comment thinko about socket pair
usage
CONNECT_COMMAND.START: set the NBD error when fcntl() fails
CONNECT_COMMAND.START: use symbolic constants for fd#0 and fd#1
CONNECT_COMMAND.START: sanitize close() calls in the child process
CONNECT_COMMAND.START: check syscalls for errors in the child process
CONNECT_COMMAND.START: centralize resource release
CONNECT_COMMAND.START: plug child process leak on error
CONNECT_COMMAND.START: replace execvp() call with fork-safe variant
socket activation: generalize environment construction
socket activation: set LISTEN_FDNAMES
Richard W.M. Jones (2):
common/include: Copy ascii-ctype functions from nbdkit
generator: Add APIs to get/set the socket activation socket name
.gitignore | 4 +
common/include/Makefile.am | 6 +
common/include/array-size.h | 2 +-
common/include/ascii-ctype.h | 75 ++++
common/include/byte-swapping.h | 24 +-
common/include/checked-overflow.h | 42 +-
common/include/compiler-macros.h | 2 +-
common/include/iszero.h | 2 +-
common/include/minmax.h | 4 +-
common/include/test-array-size.c | 26 +-
common/include/test-ascii-ctype.c | 88 ++++
common/utils/const-string-vector.h | 2 +-
common/utils/nbdkit-string.h | 2 +-
common/utils/string-vector.h | 3 +-
common/utils/test-human-size.c | 10 +-
common/utils/test-vector.c | 7 +-
common/utils/vector.h | 45 +-
configure.ac | 5 +
copy/file-ops.c | 2 +-
copy/main.c | 2 +-
copy/nbdcopy.h | 2 +-
dump/dump.c | 2 +-
examples/copy-libev.c | 30 +-
examples/list-exports.c | 2 +-
examples/strict-structured-reads.c | 2 +-
examples/threaded-reads-and-writes.c | 2 +-
fuse/nbdfuse.c | 2 +-
generator/API.ml | 49 +++
generator/C.ml | 20 +-
generator/states-connect-socket-activation.c | 294 +++++++++----
generator/states-connect.c | 123 ++++--
info/main.c | 2 +-
info/show.c | 3 +-
interop/interop.c | 2 +-
lib/Makefile.am | 48 ++-
lib/errors.c | 6 +-
lib/handle.c | 80 +++-
lib/internal.h | 112 +++--
lib/nbd-protocol.h | 12 +-
lib/opt.c | 4 +-
lib/test-fork-safe-assert.c | 63 +++
lib/test-fork-safe-assert.sh | 31 ++
lib/test-fork-safe-execvpe.c | 117 ++++++
lib/test-fork-safe-execvpe.sh | 270 ++++++++++++
lib/uri.c | 2 +-
lib/utils.c | 430 +++++++++++++++++++-
ocaml/helpers.c | 8 +-
ocaml/nbd-c.h | 6 +-
tests/eflags.c | 6 +-
tests/get-size.c | 4 +-
tests/newstyle-limited.c | 6 +-
tests/oldstyle.c | 4 +-
ublk/nbdublk.c | 4 +-
ublk/tgt.c | 20 +-
54 files changed, 1771 insertions(+), 350 deletions(-)
create mode 100644 common/include/ascii-ctype.h
create mode 100644 common/include/test-ascii-ctype.c
create mode 100644 lib/test-fork-safe-assert.c
create mode 100755 lib/test-fork-safe-assert.sh
create mode 100644 lib/test-fork-safe-execvpe.c
create mode 100755 lib/test-fork-safe-execvpe.sh
base-commit: 5a02c7d2cc6a201f9e5531c0c20c2f3c22b805a2
1 year, 8 months
[nbdkit PATCH v2] server: Don't assert on send if client hangs up early
by Eric Blake
libnbd's copy/copy-nbd-error.sh was triggering an assertion failure in
nbdkit (this is the command that triggered the failure, at least with
nbdcopy 1.14.2; although future nbdcopy may be fixed independently to
more gracefully exit):
$ nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M error-pread-rate=0.5 ] null:
...
nbdkit: pattern.2: debug: error-inject: pread count=262144 offset=4718592
nbdkit: pattern.2: debug: pattern: pread count=262144 offset=4718592
nbdkit: pattern.1: debug: error-inject: pread count=262144 offset=4456448
nbdkit: pattern.1: error: injecting EIO error into pread
nbdkit: pattern.1: debug: sending error reply: Input/output error
nbdkit: pattern.3: debug: error-inject: pread count=262144 offset=4980736
nbdkit: pattern.3: debug: pattern: pread count=262144 offset=4980736
nbdkit: pattern.4: error: write data: NBD_CMD_READ: Broken pipe
nbdkit: pattern.4: debug: exiting worker thread pattern.4
nbdkit: connections.c:402: raw_send_socket: Assertion `sock >= 0' failed.
When there are multiple queued requests, and the client hangs up
abruptly as soon as the first error response is seen (rather than
waiting until all other queued responses are flushed then sending
NBD_CMD_DISC), another thread that has a queued response ready to send
sees EPIPE (the client is no longer reading) and closes the socket,
then the third thread triggers the assertion failure because it was
not expecting to attempt a send to a closed socket. Thus, my claim
that sock would always be non-negative after collecting data from the
plugin was wrong. The bug can only happen for a server with
NBDKIT_THREAD_MODEL_PARALLEL, as that is the only time a single client
can have more than one worker thread; and even with parallel service,
it is possible to avoid the bug by using 'nbdkit --threads=1' (at the
expense of performance, because one client can no longer benefit from
interleaved requests).
For the nbdcopy example, --exit-with-parent means it's just an
annoyance (nbdcopy has already exited, and no further client will be
connecting); but for a longer-running nbdkit server accepting multiple
clients (particularly likely when advertising MULTI-CONN, although
that is not a prerequisite), it means any one client can easily
trigger the SIGABRT by intentionally queueing multiple NBD_CMD_READ
then disconnecting early, and thereby kill nbdkit and starve other
clients. Note that SIGABRT is the most likely outcome; however, there
is also a small window for something else: if client 1's first thread
triggers conn->close(SHUT_WR) while holding read_lock, there is a
window betwen shutdown(conn->sockout) and conn->sockout=-1 where the
client's second thread holding write_lock can copy a stale
non-negative value of sock=conn->sockout. If during this window a 2nd
client connects, an fd allocated as part of the second client can end
up with the same value as the stale 'sock' in the first client, where
we send data to a completely unrelated socket rather than triggering
the assertion failure.
Whether this bug rises to the level of CVE depends on whether you
consider one client being able to starve others a privilege escalation
(if you are not using TLS, there are other ways for a bad client to
starve peers; if you are using TLS, then the starved client has the
same credentials as the client that caused the SIGABRT or data
corruption so there is no privilege boundary escalation).
Partially reverts 26fd5e42; protocol_recv_request_send_reply() must
once again have a return status, because it is essential that when the
client has multiple worker threads, conn->sockout not be modified
except under the write_lock, and altering it is easier at a central
point where we do not risk deadlock with read_lock or status_lock.
Add some testsuite coverage that does not depend on current nbdcopy's
abrubt disconnect behavior. The TLS case happens to pass even without
this patch, but the plaintext version demonstrates the same core dump
that triggered my efforts on this patch.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2173054
Fixes: daef505e ("server: Give client EOF when we are done writing", v1.32.4)
---
tests/Makefile.am | 6 +++
server/internal.h | 19 +++++---
server/connections.c | 27 ++++++++---
server/protocol.c | 85 +++++++++++++++-------------------
server/public.c | 7 ++-
server/test-public.c | 4 +-
tests/test-client-death-tls.sh | 81 ++++++++++++++++++++++++++++++++
tests/test-client-death.sh | 60 ++++++++++++++++++++++++
8 files changed, 224 insertions(+), 65 deletions(-)
create mode 100755 tests/test-client-death-tls.sh
create mode 100755 tests/test-client-death.sh
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 15832138..d4717383 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -256,6 +256,8 @@ TESTS += \
test-swap.sh \
test-disconnect.sh \
test-disconnect-tls.sh \
+ test-client-death.sh \
+ test-client-death-tls.sh \
test-shutdown.sh \
test-nbdkit-backend-debug.sh \
test-read-password.sh \
@@ -299,6 +301,8 @@ EXTRA_DIST += \
test-read-password-plugin.c \
test-disconnect.sh \
test-disconnect-tls.sh \
+ test-client-death.sh \
+ test-client-death-tls.sh \
test-shutdown.sh \
test-single-from-file.sh \
test-single-sh.sh \
@@ -406,6 +410,8 @@ test_disconnect_plugin_la_LDFLAGS = \
$(NULL)
test_disconnect_plugin_la_LIBADD = $(IMPORT_LIBRARY_ON_WINDOWS)
+test-client-death-tls.sh: keys.psk
+
# check_LTLIBRARIES won't build a shared library (see automake manual).
# So we have to do this and add a dependency.
noinst_LTLIBRARIES += \
diff --git a/server/internal.h b/server/internal.h
index 77fdc21d..af5d8677 100644
--- a/server/internal.h
+++ b/server/internal.h
@@ -1,5 +1,5 @@
/* nbdkit
- * Copyright (C) 2013-2022 Red Hat Inc.
+ * Copyright (C) 2013-2023 Red Hat Inc.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
@@ -244,10 +244,13 @@ typedef enum {
} conn_status;
struct connection {
- pthread_mutex_t request_lock;
- pthread_mutex_t read_lock;
- pthread_mutex_t write_lock;
- pthread_mutex_t status_lock;
+ /* Listed in precedence order: do not grab earlier locks in this list
+ * while holding a later lock.
+ */
+ pthread_mutex_t request_lock; /* Forces serialization of requests */
+ pthread_mutex_t read_lock; /* Read entire client payload off wire */
+ pthread_mutex_t write_lock; /* Protect sockout, write response to wire */
+ pthread_mutex_t status_lock; /* Track current status of client */
conn_status status;
int status_pipe[2]; /* track status changes via poll when nworkers > 1 */
@@ -269,14 +272,16 @@ struct connection {
const char *exportname;
int sockin, sockout;
+ /* If nworkers > 1, only call this while read_lock is held */
connection_recv_function recv;
+ /* If nworkers > 1, only call these while write_lock is held */
connection_send_function send;
connection_close_function close;
};
extern void handle_single_connection (int sockin, int sockout);
extern conn_status connection_get_status (void);
-extern void connection_set_status (conn_status value);
+extern bool connection_set_status (conn_status value);
/* protocol-handshake.c */
extern int protocol_handshake (void);
@@ -291,7 +296,7 @@ extern int protocol_handshake_oldstyle (void);
extern int protocol_handshake_newstyle (void);
/* protocol.c */
-extern void protocol_recv_request_send_reply (void);
+extern bool protocol_recv_request_send_reply (void);
/* The context ID of base:allocation. As far as I can tell it doesn't
* matter what this is as long as nbdkit always returns the same
diff --git a/server/connections.c b/server/connections.c
index 4d776f2a..6d9f6a96 100644
--- a/server/connections.c
+++ b/server/connections.c
@@ -1,5 +1,5 @@
/* nbdkit
- * Copyright (C) 2013-2022 Red Hat Inc.
+ * Copyright (C) 2013-2023 Red Hat Inc.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
@@ -80,11 +80,14 @@ connection_get_status (void)
return r;
}
-/* Update the status if the new value is lower than the existing value. */
-void
+/* Update the status if the new value is lower than the existing value.
+ * Return true if the caller should shutdown.
+ */
+bool
connection_set_status (conn_status value)
{
GET_CONN;
+ bool ret = false;
if (conn->nworkers &&
pthread_mutex_lock (&conn->status_lock))
@@ -99,12 +102,13 @@ connection_set_status (conn_status value)
debug ("failed to notify pipe-to-self: %m");
}
if (conn->status >= STATUS_CLIENT_DONE && value < STATUS_CLIENT_DONE)
- conn->close (SHUT_WR);
+ ret = true;
conn->status = value;
}
if (conn->nworkers &&
pthread_mutex_unlock (&conn->status_lock))
abort ();
+ return ret;
}
struct worker_data {
@@ -126,7 +130,10 @@ connection_worker (void *data)
free (worker);
while (!quit && connection_get_status () > STATUS_CLIENT_DONE)
- protocol_recv_request_send_reply ();
+ if (protocol_recv_request_send_reply ()) {
+ ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->write_lock);
+ conn->close (SHUT_WR);
+ }
debug ("exiting worker thread %s", threadlocal_get_name ());
free (name);
return NULL;
@@ -178,7 +185,8 @@ handle_single_connection (int sockin, int sockout)
/* No need for a separate thread. */
debug ("handshake complete, processing requests serially");
while (!quit && connection_get_status () > STATUS_CLIENT_DONE)
- protocol_recv_request_send_reply ();
+ if (protocol_recv_request_send_reply ())
+ conn->close (SHUT_WR);
}
else {
/* Create thread pool to process requests. */
@@ -400,7 +408,10 @@ raw_send_socket (const void *vbuf, size_t len, int flags)
ssize_t r;
int f = 0;
- assert (sock >= 0);
+ if (sock < 0) {
+ errno = EBADF;
+ return -1;
+ }
#ifdef MSG_MORE
if (flags & SEND_MORE)
f |= MSG_MORE;
@@ -492,6 +503,8 @@ raw_recv (void *vbuf, size_t len)
/* There's no place in the NBD protocol to send back errors from
* close, so this function ignores errors.
+ *
+ * If @how == SHUT_WR and conn->nworkers > 1, the caller holds write_lock.
*/
static void
raw_close (int how)
diff --git a/server/protocol.c b/server/protocol.c
index cc1e4ed8..58f865ef 100644
--- a/server/protocol.c
+++ b/server/protocol.c
@@ -1,5 +1,5 @@
/* nbdkit
- * Copyright (C) 2013-2022 Red Hat Inc.
+ * Copyright (C) 2013-2023 Red Hat Inc.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
@@ -362,7 +362,7 @@ nbd_errno (int error, uint16_t flags)
}
}
-static void
+static bool
send_simple_reply (uint64_t handle, uint16_t cmd, uint16_t flags,
const char *buf, uint32_t count,
uint32_t error)
@@ -380,8 +380,7 @@ send_simple_reply (uint64_t handle, uint16_t cmd, uint16_t flags,
r = conn->send (&reply, sizeof reply, f);
if (r == -1) {
nbdkit_error ("write reply: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
/* Send the read data buffer. */
@@ -389,12 +388,13 @@ send_simple_reply (uint64_t handle, uint16_t cmd, uint16_t flags,
r = conn->send (buf, count, 0);
if (r == -1) {
nbdkit_error ("write data: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
+ return connection_set_status (STATUS_DEAD);
}
}
+ return false;
}
-static void
+static bool
send_structured_reply_read (uint64_t handle, uint16_t cmd,
const char *buf, uint32_t count, uint64_t offset)
{
@@ -420,8 +420,7 @@ send_structured_reply_read (uint64_t handle, uint16_t cmd,
r = conn->send (&reply, sizeof reply, SEND_MORE);
if (r == -1) {
nbdkit_error ("write reply: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
/* Send the offset + read data buffer. */
@@ -429,15 +428,15 @@ send_structured_reply_read (uint64_t handle, uint16_t cmd,
r = conn->send (&offset_data, sizeof offset_data, SEND_MORE);
if (r == -1) {
nbdkit_error ("write data: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
r = conn->send (buf, count, 0);
if (r == -1) {
nbdkit_error ("write data: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
+ return connection_set_status (STATUS_DEAD);
}
+ return false;
}
/* Convert a list of extents into NBD_REPLY_TYPE_BLOCK_STATUS blocks.
@@ -522,7 +521,7 @@ extents_to_block_descriptors (struct nbdkit_extents *extents,
return blocks;
}
-static void
+static bool
send_structured_reply_block_status (uint64_t handle,
uint16_t cmd, uint16_t flags,
uint32_t count, uint64_t offset,
@@ -542,10 +541,8 @@ send_structured_reply_block_status (uint64_t handle,
blocks = extents_to_block_descriptors (extents, flags, count, offset,
&nr_blocks);
- if (blocks == NULL) {
- connection_set_status (STATUS_DEAD);
- return;
- }
+ if (blocks == NULL)
+ return connection_set_status (STATUS_DEAD);
reply.magic = htobe32 (NBD_STRUCTURED_REPLY_MAGIC);
reply.handle = handle;
@@ -557,8 +554,7 @@ send_structured_reply_block_status (uint64_t handle,
r = conn->send (&reply, sizeof reply, SEND_MORE);
if (r == -1) {
nbdkit_error ("write reply: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
/* Send the base:allocation context ID. */
@@ -566,8 +562,7 @@ send_structured_reply_block_status (uint64_t handle,
r = conn->send (&context_id, sizeof context_id, SEND_MORE);
if (r == -1) {
nbdkit_error ("write reply: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
/* Send each block descriptor. */
@@ -576,12 +571,13 @@ send_structured_reply_block_status (uint64_t handle,
i == nr_blocks - 1 ? 0 : SEND_MORE);
if (r == -1) {
nbdkit_error ("write reply: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
+ return connection_set_status (STATUS_DEAD);
}
}
+ return false;
}
-static void
+static bool
send_structured_reply_error (uint64_t handle, uint16_t cmd, uint16_t flags,
uint32_t error)
{
@@ -600,8 +596,7 @@ send_structured_reply_error (uint64_t handle, uint16_t cmd, uint16_t flags,
r = conn->send (&reply, sizeof reply, SEND_MORE);
if (r == -1) {
nbdkit_error ("write error reply: %m");
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
/* Send the error. */
@@ -610,12 +605,14 @@ send_structured_reply_error (uint64_t handle, uint16_t cmd, uint16_t flags,
r = conn->send (&error_data, sizeof error_data, 0);
if (r == -1) {
nbdkit_error ("write data: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
+ return connection_set_status (STATUS_DEAD);
}
/* No human readable error message at the moment. */
+ return false;
}
-void
+/* Do a recv/send sequence. Return true if the caller should shutdown. */
+bool
protocol_recv_request_send_reply (void)
{
GET_CONN;
@@ -634,24 +631,21 @@ protocol_recv_request_send_reply (void)
r = conn->recv (&request, sizeof request);
cs = connection_get_status ();
if (cs <= STATUS_CLIENT_DONE)
- return;
+ return false;
if (r == -1) {
nbdkit_error ("read request: %m");
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
if (r == 0) {
debug ("client closed input socket, closing connection");
- connection_set_status (STATUS_CLIENT_DONE); /* disconnect */
- return;
+ return connection_set_status (STATUS_CLIENT_DONE); /* disconnect */
}
magic = be32toh (request.magic);
if (magic != NBD_REQUEST_MAGIC) {
nbdkit_error ("invalid request: 'magic' field is incorrect (0x%x)",
magic);
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
flags = be16toh (request.flags);
@@ -662,16 +656,14 @@ protocol_recv_request_send_reply (void)
if (cmd == NBD_CMD_DISC) {
debug ("client sent %s, closing connection", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_CLIENT_DONE); /* disconnect */
- return;
+ return connection_set_status (STATUS_CLIENT_DONE); /* disconnect */
}
/* Validate the request. */
if (!validate_request (cmd, flags, offset, count, &error)) {
if (cmd == NBD_CMD_WRITE &&
skip_over_write_buffer (conn->sockin, count) < 0) {
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
goto send_reply;
}
@@ -685,8 +677,7 @@ protocol_recv_request_send_reply (void)
error = ENOMEM;
if (cmd == NBD_CMD_WRITE &&
skip_over_write_buffer (conn->sockin, count) < 0) {
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
goto send_reply;
}
@@ -711,8 +702,7 @@ protocol_recv_request_send_reply (void)
}
if (r == -1) {
nbdkit_error ("read data: %s: %m", name_of_nbd_cmd (cmd));
- connection_set_status (STATUS_DEAD);
- return;
+ return connection_set_status (STATUS_DEAD);
}
}
}
@@ -731,7 +721,7 @@ protocol_recv_request_send_reply (void)
/* Send the reply packet. */
send_reply:
if (connection_get_status () < STATUS_CLIENT_DONE)
- return;
+ return false;
if (error != 0) {
/* Since we're about to send only the limited NBD_E* errno to the
@@ -752,14 +742,15 @@ protocol_recv_request_send_reply (void)
(cmd == NBD_CMD_READ || cmd == NBD_CMD_BLOCK_STATUS)) {
if (!error) {
if (cmd == NBD_CMD_READ)
- send_structured_reply_read (request.handle, cmd, buf, count, offset);
+ return send_structured_reply_read (request.handle, cmd, buf, count,
+ offset);
else /* NBD_CMD_BLOCK_STATUS */
- send_structured_reply_block_status (request.handle, cmd, flags,
- count, offset, extents);
+ return send_structured_reply_block_status (request.handle, cmd, flags,
+ count, offset, extents);
}
else
- send_structured_reply_error (request.handle, cmd, flags, error);
+ return send_structured_reply_error (request.handle, cmd, flags, error);
}
else
- send_simple_reply (request.handle, cmd, flags, buf, count, error);
+ return send_simple_reply (request.handle, cmd, flags, buf, count, error);
}
diff --git a/server/public.c b/server/public.c
index c2f67451..74ac18a1 100644
--- a/server/public.c
+++ b/server/public.c
@@ -1,5 +1,5 @@
/* nbdkit
- * Copyright (C) 2013-2022 Red Hat Inc.
+ * Copyright (C) 2013-2023 Red Hat Inc.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
@@ -1107,5 +1107,8 @@ nbdkit_disconnect (int force)
debug ("no connection in this thread, ignoring disconnect request");
return;
}
- connection_set_status (force ? STATUS_DEAD : STATUS_SHUTDOWN);
+ if (connection_set_status (force ? STATUS_DEAD : STATUS_SHUTDOWN)) {
+ ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->write_lock);
+ conn->close (SHUT_WR);
+ }
}
diff --git a/server/test-public.c b/server/test-public.c
index e1c48cba..e0249fc7 100644
--- a/server/test-public.c
+++ b/server/test-public.c
@@ -1,5 +1,5 @@
/* nbdkit
- * Copyright (C) 2018-2022 Red Hat Inc.
+ * Copyright (C) 2018-2023 Red Hat Inc.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
@@ -96,7 +96,7 @@ connection_get_status (void)
abort ();
}
-void
+bool
connection_set_status (conn_status v)
{
abort ();
diff --git a/tests/test-client-death-tls.sh b/tests/test-client-death-tls.sh
new file mode 100755
index 00000000..49b3ec18
--- /dev/null
+++ b/tests/test-client-death-tls.sh
@@ -0,0 +1,81 @@
+#!/usr/bin/env bash
+# nbdkit
+# Copyright (C) 2019-2023 Red Hat Inc.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution.
+#
+# * Neither the name of Red Hat nor the names of its contributors may be
+# used to endorse or promote products derived from this software without
+# specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
+# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+# SUCH DAMAGE.
+
+source ./functions.sh
+set -x
+
+requires nbdsh -c 'exit(not h.supports_tls())'
+
+
+# Does the nbdkit binary support TLS?
+if ! nbdkit --dump-config | grep -sq tls=yes; then
+ echo "$0: nbdkit built without TLS support"
+ exit 77
+fi
+
+# Did we create the PSK keys file?
+# Probably 'certtool' is missing.
+if [ ! -s keys.psk ]; then
+ echo "$0: PSK keys file was not created by the test harness"
+ exit 77
+fi
+
+sock=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX)
+files="client-death-tls.pid $sock"
+cleanup_fn rm -f $files
+
+# Start long-running nbdkit
+start_nbdkit -P client-death-tls.pid --tls require --tls-psk=keys.psk \
+ -U $sock memory 2M
+
+pid=`cat client-death-tls.pid`
+
+# We can't use 'nbdsh -u "$uri" because of nbd_set_uri_allow_local_file.
+# Run a client that abandons several in-flight requests, each large enough
+# that we should see EPIPE on one handler while other handlers are still
+# waiting to send their response.
+nbdsh -c '
+h.set_tls(nbd.TLS_REQUIRE)
+h.set_tls_psk_file("keys.psk")
+h.set_tls_username("qemu")
+h.connect_unix("'"$sock"'")
+
+buf = nbd.Buffer(2*1024*1024)
+c1 = h.aio_pread(buf, 0)
+c2 = h.aio_pread(buf, 0)
+c3 = h.aio_pread(buf, 0)
+c4 = h.aio_pread(buf, 0)
+'
+
+# nbdkit should still be running
+sleep 1
+kill -s 0 $pid
diff --git a/tests/test-client-death.sh b/tests/test-client-death.sh
new file mode 100755
index 00000000..d7f57ca9
--- /dev/null
+++ b/tests/test-client-death.sh
@@ -0,0 +1,60 @@
+#!/usr/bin/env bash
+# nbdkit
+# Copyright (C) 2019-2023 Red Hat Inc.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution.
+#
+# * Neither the name of Red Hat nor the names of its contributors may be
+# used to endorse or promote products derived from this software without
+# specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
+# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+# SUCH DAMAGE.
+
+source ./functions.sh
+set -x
+
+requires_nbdsh_uri
+
+sock=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX)
+files="client-death.pid $sock"
+cleanup_fn rm -f $files
+
+# Start long-running nbdkit
+start_nbdkit -P client-death.pid -U $sock memory 2M
+
+pid=`cat client-death.pid`
+
+# Run a client that abandons several in-flight requests, each large enough
+# that we should see EPIPE on one handler while other handlers are still
+# waiting to send their response.
+nbdsh -u "nbd+unix:///?socket=$sock" -c '
+buf = nbd.Buffer(2*1024*1024)
+c1 = h.aio_pread(buf, 0)
+c2 = h.aio_pread(buf, 0)
+c3 = h.aio_pread(buf, 0)
+c4 = h.aio_pread(buf, 0)
+'
+
+# nbdkit should still be running
+sleep 1
+kill -s 0 $pid
--
2.39.2
1 year, 8 months
[libnbd PATCH v4 0/5] vector: introduce DEFINE_POINTER_VECTOR_TYPE()
by Laszlo Ersek
This reworks the single
[libnbd PATCH v3 05/29] vector: (mostly) factor out DEFINE_VECTOR_EMPTY
http://mid.mail-archive.com/20230215141158.2426855-6-lersek@redhat.com
according to the review comments I received from Eric and Rich in that
subthread.
Clearly, the more closely I look at anything, the more refactoring
opportunities emerge. :/ At least, although I didn't like the amount of
work needed, I quite like the results. :)
This series too is bisectable.
Laszlo
Laszlo Ersek (5):
common/include: add TYPE_IS_POINTER() macro
common/include: extract STATIC_ASSERT() macro
remove semicolon after DEFINE_VECTOR_TYPE() macro invocations
vector: introduce DEFINE_POINTER_VECTOR_TYPE()
convert string_vector_(iter(free) + reset()) to string_vector_empty()
common/include/Makefile.am | 1 +
common/include/checked-overflow.h | 8 ++--
common/include/compiler-macros.h | 31 ++++++++++++-
common/include/static-assert.h | 48 ++++++++++++++++++++
common/include/test-array-size.c | 45 ++++++------------
common/utils/const-string-vector.h | 2 +-
common/utils/nbdkit-string.h | 2 +-
common/utils/string-vector.h | 2 +-
common/utils/test-vector.c | 7 ++-
common/utils/vector.h | 31 +++++++++++++
copy/nbdcopy.h | 2 +-
generator/states-connect-socket-activation.c | 9 ++--
info/show.c | 3 +-
lib/handle.c | 12 ++---
lib/internal.h | 2 +-
lib/utils.c | 6 +--
16 files changed, 145 insertions(+), 66 deletions(-)
create mode 100644 common/include/static-assert.h
base-commit: da8887fd4d7ddc1f9cf2f4a3bc20fd51578fa565
1 year, 8 months
no way to force-close NBD handle in nbdsh
by Eric Blake
I'm out of time this weekend, but while trying to write a test for an
nbdkit bug fix (a nasty assertion failure when the client disconnects
uncleanly without NBD_CMD_DISC, which is what nbdcopy does if it gets
an EIO read response from the server), I realized it is extremely hard
to trigger this using nbdsh, but easy to do in C or the Go bindings.
In the Go bindings, we intentionally coded things so that the Go
structure knows if its underlying C pointer is live or not; it exposes
a way to force early closure, then the bindings return a
closed_handle_error() if early closure has happened. We probably need
to support something similar in the Python bindings.
As a short-term hack, I tried directly calling h.__del__() - this is
not a good idea, as our current __del__ implementation is not designed
to be called twice, and when the later garbage collection calls it
again, we segfault trying to free invalid memory. But less drastic
measures, such as:
import gc
del h
h = None
gc.collect()
were still insufficient to trigger a normal __del__ of h, because I
can't figure out how to force the garbage collector to see that I want
to close the handle but keep python open.
In the end, I did get a working test written (by just quit()ing python
instead of trying to keep it alive). See my next email for the nbdkit
patch that spawned my request here.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
1 year, 9 months