RFC for NBD protocol extension: extended headers
by Eric Blake
In response to this mail, I will be cross-posting a series of patches
to multiple projects as a proof-of-concept implementation and request
for comments on a new NBD protocol extension, called
NBD_OPT_EXTENDED_HEADERS. With this in place, it will be possible for
clients to request 64-bit zero, trim, cache, and block status
operations when supported by the server.
Not yet complete: an implementation of this in nbdkit. I also plan to
tweak libnbd's 'nbdinfo --map' and 'nbdcopy' to take advantage of the
larger block status results. With 64-bit commands, we may also want
to make it easier for servers to advertise the actual maximum size
they are willing to accept for each command (for example,
a server may be happy with a full 64-bit block status, but still want
to limit non-fast zero and cache to a smaller size to avoid denial of
service).
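To make the motivation concrete, here is a minimal sketch of why a 64-bit length field matters. The compact layout follows the existing 28-byte NBD request header; the extended layout is only an assumption for illustration (the struct name, field order, and 32-byte size are hypothetical and not taken from this mail), and requests_needed() is a helper invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Existing compact NBD request header: 28 bytes on the wire. */
struct compact_request {
  uint32_t magic;    /* NBD_REQUEST_MAGIC, 0x25609513 */
  uint16_t flags;
  uint16_t type;
  uint64_t handle;
  uint64_t offset;
  uint32_t length;   /* 32-bit: one request covers less than 4 GiB */
} __attribute__ ((packed));

/* HYPOTHETICAL extended header: same fields, but a 64-bit length.
 * The real wire format is whatever the spec patches define. */
struct extended_request_sketch {
  uint32_t magic;    /* would need a new magic value; not given here */
  uint16_t flags;
  uint16_t type;
  uint64_t handle;
  uint64_t offset;
  uint64_t length;   /* 64-bit: one request can cover a whole export */
} __attribute__ ((packed));

static_assert (sizeof (struct compact_request) == 28,
               "compact header is 28 bytes");
static_assert (sizeof (struct extended_request_sketch) == 32,
               "sketched extended header is 32 bytes");

/* Number of requests needed to cover 'total' bytes when a single
 * request may carry at most 'max_len' bytes (max_len must be nonzero). */
static uint64_t
requests_needed (uint64_t total, uint64_t max_len)
{
  if (total == 0)
    return 0;
  return (total - 1) / max_len + 1;
}
```

For instance, zeroing a 1 TiB export takes requests_needed (1ULL << 40, UINT32_MAX) == 257 requests with a 32-bit length, but a single request with a 64-bit length, assuming the server accepts a request that large.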
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
2 years, 1 month
[PATCH libnbd] ublk: Add new nbdublk program
by Richard W.M. Jones
This patch adds simple support for a ublk-based NBD client.
It is also available here:
https://gitlab.com/rwmjones/libnbd/-/tree/nbdublk/ublk
ublk is a way to write Linux block device drivers in userspace:
https://lwn.net/Articles/903855/
For simplicity of implementation and because I don't currently
understand the thread model of ublksrv, this only implements
synchronous calls for now. It should be possible to extend this to a
fully asynchronous client without too much difficulty.
It does appear to work, at least for simple cases. I have created
filesystems, files, etc. on a ublk device backed by an nbdkit RAM disk.
For example, on one machine do:
$ nbdkit memory 1G
On the client machine with the right kernel etc [see below] do:
# modprobe ublk_drv
# nbdublk /dev/ublkb0 nbd://remote
# ublk list
# blockdev --getsize64 /dev/ublkb0
# mke2fs /dev/ublkb0
# ...
# ublk del -n 0
Testing this is not for the fainthearted. I would start with a
throwaway Fedora Rawhide virtual machine, fully upgraded. You will
need to recompile the kernel with CONFIG_BLK_DEV_UBLK=m.
You will need to upgrade to liburing 2.2 (I pushed this to Rawhide a
few days ago).
You will need to download & compile: https://github.com/ming1/ubdsrv
Apply this patch to libnbd and compile it with:
export PKG_CONFIG_PATH=$HOME/ubdsrv
export CFLAGS="$CFLAGS -I$HOME/ubdsrv/include"
export CXXFLAGS="$CXXFLAGS -I$HOME/ubdsrv/include"
export LDFLAGS="$LDFLAGS -L$HOME/ubdsrv/lib"
./configure
make
(Check that ublk dependencies are found and nbdublk is compiled)
You will then be able to run nbdublk from the compile directory using:
sudo ./run nbdublk --help
Rich.
2 years, 1 month
[libnbd PATCH v2 00/12] Improve NBD_OPT_ control
by Eric Blake
v1 was here (under the subject Smarter nbd_opt_info)
https://listman.redhat.com/archives/libguestfs/2022-August/029641.html
Since then, I've done a lot. The original patch 1/2 is now split
across 7/12 and 9/12; while original patch 2/2 is expanded across
several patches to test more scenarios. Several new prerequisite
patches were added to address smaller issues one at a time and with
better commit messages. Where possible, I have kept tests separate
from the patch introducing a fix to allow temporary reordering at that
point in the series to prove the test catches the issue. It was not
possible on tests for new APIs, nor on patch 2/12 where the only test
I could come up with involves hacking a one-off nbdkit that does not
behave like a normal server; it was also hard to justify on 3/12 where
reordering the patches results in a test skip rather than failure,
where distinguishing the difference requires reading logs to see
whether an API failed client-side or server-side.
Maybe we want to add an API to track how many packets are sent over
the wire (client->server, and server->client), to make it easier to
actually catch when things are filtered client-side, or how often a
server uses multiple reply chunks to answer a single client request.
I've also fixed a bug in nbd_can_meta_context that was discovered
while responding to v1 reviews, and completed part of the work on
adding the new APIs hinted at earlier (still to add: APIs for
nbd_opt_structured_replies and nbd_opt_starttls).
Eric Blake (12):
internal: Use vector instead of linked list for meta_contexts
api: Fix nbd_can_meta_context if NBD_OPT_SET_META_CONTEXT fails
api: Allow nbd_opt_list_meta_context without SR
api: Add nbd_set_request_meta_context()
tests: Language port of nbd_set_request_meta_context() tests
info: Explicitly skip NBD_OPT_SET_META_CONTEXT in --list mode
api: Make nbd_opt_list_meta_context stateless
tests: Add coverage for stateless nbd_opt_list_meta_context
api: Reset state on changed nbd_set_export_name()
tests: Add coverage for nbd_set_export_name fix
api: Add nbd_[aio_]opt_set_meta_context
tests: Language port of nbd_opt_set_meta_context() tests
lib/internal.h | 19 +-
generator/API.ml | 174 ++++++++++++-
generator/states-newstyle-opt-go.c | 1 +
generator/states-newstyle-opt-meta-context.c | 77 +++---
generator/states-newstyle.c | 1 +
generator/states-reply-structured.c | 11 +-
lib/flags.c | 27 +-
lib/handle.c | 26 ++
lib/opt.c | 53 +++-
lib/rw.c | 2 +-
python/t/110-defaults.py | 1 +
python/t/120-set-non-defaults.py | 2 +
python/t/230-opt-info.py | 36 ++-
python/t/240-opt-list-meta.py | 29 ++-
python/t/250-opt-set-meta.py | 126 ++++++++++
ocaml/tests/Makefile.am | 5 +-
ocaml/tests/test_110_defaults.ml | 2 +
ocaml/tests/test_120_set_non_defaults.ml | 3 +
ocaml/tests/test_230_opt_info.ml | 43 +++-
ocaml/tests/test_240_opt_list_meta.ml | 34 ++-
ocaml/tests/test_250_opt_set_meta.ml | 150 +++++++++++
tests/Makefile.am | 9 +
tests/opt-info.c | 91 ++++++-
tests/opt-list-meta.c | 104 +++++++-
tests/opt-set-meta | 210 ++++++++++++++++
tests/opt-set-meta.c | 236 ++++++++++++++++++
.gitignore | 1 +
golang/Makefile.am | 3 +-
golang/libnbd_110_defaults_test.go | 8 +
golang/libnbd_120_set_non_defaults_test.go | 12 +
golang/libnbd_230_opt_info_test.go | 111 ++++++++-
golang/libnbd_240_opt_list_meta_test.go | 106 ++++++--
golang/libnbd_250_opt_set_meta_test.go | 248 +++++++++++++++++++
info/list.c | 3 +-
info/show.c | 3 +-
35 files changed, 1813 insertions(+), 154 deletions(-)
create mode 100644 python/t/250-opt-set-meta.py
create mode 100644 ocaml/tests/test_250_opt_set_meta.ml
create mode 100755 tests/opt-set-meta
create mode 100644 tests/opt-set-meta.c
create mode 100644 golang/libnbd_250_opt_set_meta_test.go
--
2.37.2
2 years, 2 months
[libnbd PATCH] RFC: api: Add nbd_supports_vsock()
by Eric Blake
Similar to nbd_supports_tls(), it is nice to know from a feature probe
whether we are likely to have VSOCK support before even trying more
expensive APIs like nbd_connect_uri("nbd+vsock://...").
---
https://bugzilla.redhat.com/show_bug.cgi?id=2069558 documents a case
where AF_VSOCK is compiled, but vsock still doesn't work because
vsock_loopback is not loaded. Should we make nbd_supports_vsock()
return true only when ALL aspects of vsock are known to be working?
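As a rough illustration of the two levels of checking in question, the sketch below separates a compile-time test (the same #ifdef AF_VSOCK check the patch uses) from a runtime attempt to create a socket. The function names are hypothetical and not part of libnbd. Creating a socket successfully is a necessary condition only; it does not show that a transport such as vsock_loopback is loaded.

```c
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

/* Pull in the Linux vsock header when the compiler can find it. */
#if defined(__has_include)
# if __has_include(<linux/vm_sockets.h>)
#  include <linux/vm_sockets.h>
# endif
#endif

/* Compile-time answer: was AF_VSOCK defined when this was built? */
static bool
vsock_compiled_in (void)
{
#ifdef AF_VSOCK
  return true;
#else
  return false;
#endif
}

/* Runtime answer: will the running kernel hand out an AF_VSOCK
 * socket right now?  This does not attempt to connect, so a missing
 * transport module can still make a later connect fail. */
static bool
vsock_socket_creatable (void)
{
#ifdef AF_VSOCK
  int fd = socket (AF_VSOCK, SOCK_STREAM, 0);
  if (fd == -1)
    return false;
  close (fd);
  return true;
#else
  return false;
#endif
}
```

A stricter nbd_supports_vsock() could combine both checks, at the cost of making a cheap feature probe depend on the state of the running kernel.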
---
generator/API.ml | 25 ++++++++++++++++++++++---
lib/handle.c | 17 +++++++++++++++++
2 files changed, 39 insertions(+), 3 deletions(-)
diff --git a/generator/API.ml b/generator/API.ml
index b377b9f..7ec85ca 100644
--- a/generator/API.ml
+++ b/generator/API.ml
@@ -1393,7 +1393,8 @@ "connect_uri", {
=item C<nbds+vsock:>
Connect over the C<AF_VSOCK> transport, without or with
-TLS respectively.
+TLS respectively. You can use L<nbd_supports_vsock(3)> to
+see if this build of libnbd supports C<AF_VSOCK>.
=back
@@ -1494,7 +1495,8 @@ "connect_uri", {
see_also = [URLLink "https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md";
Link "aio_connect_uri";
Link "set_export_name"; Link "set_tls";
- Link "set_opt_mode"; Link "get_uri"];
+ Link "set_opt_mode"; Link "get_uri";
+ Link "supports_vsock"; Link "supports_uri"];
};
"connect_unix", {
@@ -1522,8 +1524,12 @@ "connect_vsock", {
C<cid> should be C<2> (to connect to the host), and C<port> might be
C<10809> or another port number assigned to you by the host
administrator.
+
+Not all systems support C<AF_VSOCK>; to determine if libnbd was
+built on a system with vsock support, see L<nbd_supports_vsock(3)>.
" ^ blocking_connect_call_description;
- see_also = [Link "aio_connect_vsock"; Link "set_opt_mode"];
+ see_also = [Link "aio_connect_vsock"; Link "set_opt_mode";
+ Link "supports_vsock"];
};
"connect_tcp", {
@@ -3061,6 +3067,16 @@ "supports_tls", {
see_also = [Link "set_tls"];
};
+ "supports_vsock", {
+ default_call with
+ args = []; ret = RBool; is_locked = false; may_set_error = false;
+ shortdesc = "true if libnbd was compiled with support for AF_VSOCK";
+ longdesc = "\
+Returns true if libnbd was compiled with support for the AF_VSOCK family
+of sockets, or false if not.";
+ see_also = [Link "connect_vsock"; Link "connect_uri"];
+ };
+
"supports_uri", {
default_call with
args = []; ret = RBool; is_locked = false; may_set_error = false;
@@ -3233,6 +3249,9 @@ let first_version =
"set_request_block_size", (1, 12);
"get_request_block_size", (1, 12);
+ (* Added in 1.13.x development cycle, will be stable and supported in 1.14. *)
+ "supports_vsock", (1, 14);
+
(* These calls are proposed for a future version of libnbd, but
* have not been added to any released version so far.
"get_tls_certificates", (1, ??);
diff --git a/lib/handle.c b/lib/handle.c
index 8713e18..ac64fe9 100644
--- a/lib/handle.c
+++ b/lib/handle.c
@@ -28,6 +28,12 @@
#include <sys/types.h>
#include <sys/wait.h>
+#ifdef HAVE_LINUX_VM_SOCKETS_H
+#include <linux/vm_sockets.h>
+#elif HAVE_SYS_VSOCK_H
+#include <sys/vsock.h>
+#endif
+
#include "internal.h"
static void
@@ -483,6 +489,17 @@ nbd_unlocked_supports_tls (struct nbd_handle *h)
#endif
}
+/* NB: is_locked = false, may_set_error = false. */
+int
+nbd_unlocked_supports_vsock (struct nbd_handle *h)
+{
+#ifdef AF_VSOCK
+ return 1;
+#else
+ return 0;
+#endif
+}
+
/* NB: is_locked = false, may_set_error = false. */
int
nbd_unlocked_supports_uri (struct nbd_handle *h)
--
2.37.2
2 years, 2 months
[p2v PATCH 0/4] improve the "virt-p2v in a VM" tests
by Laszlo Ersek
"make run-virt-p2v-in-a-vm" and "make run-virt-p2v-in-an-nvme-vm" suffer
from two problems: (a) possibly duplicate XFS UUIDs between the
$PHYSICAL_MACHINE disk and the "virt-p2v.img" boot media, (b) slow guest
(= fake "physical machine") boot. This series fixes these issues.
Tested with actual conversions (libvirt output) and post-conversion
booting.
Laszlo
Laszlo Ersek (4):
Makefile.am: factor out "make-physical-machine.sh"
make-physical-machine.sh: set root password to "p2v-phys"
make-physical-machine.sh: regenerate filesystem UUIDs
Makefile.am: speed up the boot phase of the "virt-p2v in a VM" tests
Makefile.am | 9 +-
.gitignore | 1 +
make-physical-machine.sh | 117 ++++++++++++++++++++
3 files changed, 124 insertions(+), 3 deletions(-)
create mode 100755 make-physical-machine.sh
2 years, 2 months
[libnbd PATCH 0/2] Smarter nbd_opt_info behavior
by Eric Blake
While trying to add a new API nbd_opt_set_meta_context(), I noticed
some existing oddities with nbd_opt_list_meta_context() and
nbd_opt_info().
Eric Blake (2):
api: Better use of nbd_internal_reset_size_and_flags
squash test
generator/states-newstyle-opt-meta-context.c | 7 +++----
lib/handle.c | 4 ++++
tests/opt-info.c | 20 +++++++++++++-------
3 files changed, 20 insertions(+), 11 deletions(-)
--
2.37.2
2 years, 2 months
shebang usage in bash scripts
by Laszlo Ersek
Hi,
most shell scripts in the v2v projects start with a shebang like this:
#!/bin/bash -
I *think* I understand the intent of the single hyphen, but (a) it seems unnecessary, (b) even if we insisted, using the double-hyphen separator "--" is much more idiomatic (even though the shell, per POSIX, is supposed to interpret "-" identically to "--").
Regarding why I think the hyphen is unnecessary:
- setuid shell scripts are not a thing on any platform we (should) care about
- the script name is not reinterpreted as an option *anyway*
Consider:
cat >-v <<EOT
#!/bin/bash
echo hello \$1
EOT
chmod +x -- -v
PATH=$PWD:$PATH -v world
--> hello world
That is, when we run the script under the name "-v" (with it being on the PATH) and with the single command line argument "world", we'd naively assume that the shebang translated to:
/bin/bash -v world
where "-v" came from the script's name, and "world" would be taken as the name of a shell script to execute. Thus, the idea would be to prevent this (i.e., to pass options to the interpreter by renaming or (sym)linking the shell script).
But that's not what we actually see; what we see is consistent with the command:
/bin/bash -- -v world
--> hello world
In fact, if I append "sleep 1000" to the script, I can also fetch:
$ hexdump -C /proc/10975/cmdline
00000000 62 61 73 68 00 2d 2d 00 2d 76 00 77 6f 72 6c 64 |bash.--.-v.world|
00000010 00 |.|
00000011
So glibc and/or the Linux kernel already inserts the "--" option/operand delimiter!
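For readers who want to repeat the /proc inspection without reading hex, here is a small sketch that turns the NUL-separated contents of /proc/PID/cmdline into a readable string. The helper name render_cmdline is made up for this example. It only decodes the buffer; it says nothing about which component inserted the "--".

```c
#include <stddef.h>

/* Convert a NUL-separated argument buffer (as read from
 * /proc/PID/cmdline) into a space-separated, NUL-terminated string.
 * Returns the number of arguments, or -1 if 'out' is too small. */
static int
render_cmdline (const char *buf, size_t len, char *out, size_t outlen)
{
  size_t o = 0;
  int nargs = 0;

  if (outlen == 0)
    return -1;

  for (size_t i = 0; i < len; i++) {
    char c = buf[i];

    if (c == '\0') {
      nargs++;
      if (i + 1 == len)       /* final terminator: no separator needed */
        break;
      c = ' ';
    }
    if (o + 1 >= outlen)      /* keep room for the trailing NUL */
      return -1;
    out[o++] = c;
  }

  /* Count a last argument that lacks its terminating NUL. */
  if (len > 0 && buf[len - 1] != '\0')
    nargs++;

  out[o] = '\0';
  return nargs;
}
```

Fed the 17 bytes from the hexdump above, it produces the string "bash -- -v world" and reports four arguments.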
I intend to contribute a shell script to virt-p2v; do I need to use the hyphen in the shebang? (If so, I prefer the double-hyphen.)
Thanks!
Laszlo
2 years, 3 months