RFC for NBD protocol extension: extended headers
by Eric Blake
In response to this mail, I will be cross-posting a series of patches
to multiple projects as a proof-of-concept implementation and request
for comments on a new NBD protocol extension, called
NBD_OPT_EXTENDED_HEADERS. With this in place, it will be possible for
clients to request 64-bit zero, trim, cache, and block status
operations when supported by the server.
Not yet complete: an implementation of this in nbdkit. I also plan to
tweak libnbd's 'nbdinfo --map' and 'nbdcopy' to take advantage of the
larger block status results. Also, with 64-bit commands, we may want
to also make it easier to let servers advertise an actual maximum size
they are willing to accept for the commands in question (for example,
a server may be happy with a full 64-bit block status, but still want
to limit non-fast zero and cache to a smaller limit to avoid denial of
service).
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
2 years, 3 months
[PATCH libnbd 0/9] golang: Safer, easier to use, and faster AioBuffer
by Nir Soffer
Improve AioBuffer to make it safer, easier to use, and faster when integrating
with other Go APIs.
New Go specific APIs:
- MakeAioBufferZero() - creates a new buffer using calloc(), to make it easy
and efficient to use a zeroed buffer.
- AioBuffer.Slice() - create a slice backed by the underlying buffer without
copying the contents of the buffer.
Performance improments:
- FromBytes() is 3 time faster
- Code using Bytes() should use Slice() now. aio_copy example shows up to 260%
speedup.
Improve testing:
- New AioBuffer tests
- New AioBuffer benchmarks
Documention:
- AioBuffer is fully documnted now.
Nir Soffer (9):
golang: tests: Add test for AioBuffer
golang: aio_buffer.go: Make it safer to use
golang: aio_buffer.go: Add missing documentation
golang: aio_buffer.go: Add MakeAioBufferZero()
golang: aio_buffer.go: Add Slice()
golang: tests: Use AioBuffer.Slice()
golang: aio_buffer.go: Speed up FromBytes()
golang: aio_buffer.go: Benchmark copy flows
golang: examples: aio_copy: Simplify using AioBuffer
golang/Makefile.am | 1 +
golang/aio_buffer.go | 39 ++++-
golang/examples/aio_copy/aio_copy.go | 29 +---
golang/libnbd_500_aio_pread_test.go | 2 +-
golang/libnbd_510_aio_pwrite_test.go | 8 +-
golang/libnbd_620_aio_buffer_test.go | 236 +++++++++++++++++++++++++++
6 files changed, 281 insertions(+), 34 deletions(-)
create mode 100644 golang/libnbd_620_aio_buffer_test.go
--
2.34.1
2 years, 11 months
[PATCH 0/5] Fix rhv-upload output
by Nir Soffer
Fix problems in new rhv-upload implementation:
- The plugin does not flush to all connections in flush()
- The plugin does not close all connections in cleanup()
- Idle connections are closed in imageio server, and we don't have a safe way
to recover.
- virt-v2v try to get disk allocation using imageio output, but imageio output
does not support extents. Even if imageio output will support extents, the
call is done after the transfer was finalized so it does not have access to
storage.
Problems not fixed yet:
- Image transfer is finalized *before* closing the connection to imageio - this
will always time out with RHV < 4.4.9, and succeeds by mistake with RHV 4.4.9
due to a regression that will be fixed in 4.4.10. This will be a non-issue in
next RHV version[1]. To support older RHV versions, virt-v2v must finalize
the image transfer *after* closing the output.
Tested on RHEL 8.6 with upstream nbdkit and libnbd.
[1] https://github.com/oVirt/ovirt-imageio/pull/15
Fixes https://bugzilla.redhat.com/2032324
Nir Soffer (5):
output/rhv-upload-plugin: Fix flush and close
v2v/lib/util.ml: Get disk allocation from input
output/rhv-upload-plugin: Extract send_flush() helper
output/rhv-upload-plugin: Track http last request time
output/rhv-upload-plugin: Keep connections alive
lib/utils.ml | 2 +-
output/rhv-upload-plugin.py | 151 ++++++++++++++++++++++++++----------
2 files changed, 113 insertions(+), 40 deletions(-)
--
2.33.1
2 years, 11 months
Re: [Libguestfs] [libguestfs/libguestfs] guestfs_copy_in fails with error: copy_in: tar subprocess failed: tar: .: file changed as we read it: errno 0 (Issue #75)
by Richard W.M. Jones
On Wed, Jan 26, 2022 at 09:31:14PM -0800, anemade wrote:
> I am using libguestfs Golang binding APIs(version 1.44)
> Followed this document https://libguestfs.org/guestfs-golang.3.html to create
> the disk, add the disk, format the partition, launch the appliance and
> performing some copy_in and copy_out operations.
>
> While doing copy_in, I am seeing this strange issue
>
> error: copy_in: tar subprocess failed: tar: .: file changed as we read it:
> errno 0
>
> This issue is not all time reproducible. It comes like 1 out of 10
> runs. In my case, data is stable while doing copy_in. There is
> absolutely no change in the data while guestfs_copy_in operation
> going on. Any leads to issue or anything that I need to understand
> for copy_in or copy_out?
Can you share exactly how you are using copy_in? A small
reproducer would be good.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
2 years, 11 months
[PATCH nbdkit] plugins: python: Add error example
by Nir Soffer
This plugin simulates errors in pread, pwrite, and extents. This is
useful for testing error handling in NBD clients, and understanding how
plugin exceptions are reported to the NBD client.
---
plugins/python/Makefile.am | 1 +
plugins/python/examples/error.py | 77 ++++++++++++++++++++++++++++++++
2 files changed, 78 insertions(+)
create mode 100644 plugins/python/examples/error.py
diff --git a/plugins/python/Makefile.am b/plugins/python/Makefile.am
index eecd7e89..e6e6c9e6 100644
--- a/plugins/python/Makefile.am
+++ b/plugins/python/Makefile.am
@@ -27,20 +27,21 @@
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
include $(top_srcdir)/common-rules.mk
EXTRA_DIST = \
nbdkit-python-plugin.pod \
examples/file.py \
+ examples/error.py \
examples/imageio.py \
examples/ramdisk.py \
examples/url.py \
$(NULL)
if HAVE_PYTHON
plugin_LTLIBRARIES = nbdkit-python-plugin.la
nbdkit_python_plugin_la_SOURCES = \
diff --git a/plugins/python/examples/error.py b/plugins/python/examples/error.py
new file mode 100644
index 00000000..0331a83e
--- /dev/null
+++ b/plugins/python/examples/error.py
@@ -0,0 +1,77 @@
+# Example Python plugin.
+#
+# This plugin simulates errors for testing NBD client error hanlding.
+# Every odd call will fail, and every even call will succeed, unless
+# there a real error accesing the specified file.
+#
+# This example can be freely used for any purpose.
+
+# Run it from the build directory like this:
+#
+# ./nbdkit -f -v python ./plugins/python/examples/error.py file=test.img
+#
+# Or run it after installing nbdkit like this:
+#
+# nbdkit -f -v python ./plugins/python/examples/error.py file=test.img
+#
+# The -f -v arguments are optional. They cause the server to stay in
+# the foreground and print debugging, which is useful when testing.
+
+import os
+
+API_VERSION = 2
+
+filename = None
+calls = 0
+
+
+def config(key, value):
+ global filename
+ assert key == "file"
+ filename = value
+
+
+def open(readonly):
+ flags = os.O_RDONLY if readonly else os.O_RDWR
+ return {"fd": os.open(filename, flags)}
+
+
+def can_extents(h):
+ return True
+
+
+def get_size(h):
+ return os.stat(h["fd"]).st_size
+
+
+def extents(h, count, offset, flags):
+ global calls
+ calls += 1
+ if calls % 2:
+ raise RuntimeError(f"extents error offset={offset} count={count}")
+
+ # We don't really support extents, so we report the entire file as
+ # data.
+ return [(offset, count, 0)]
+
+
+def pread(h, buf, offset, flags):
+ global calls
+ calls += 1
+ if calls % 2:
+ raise RuntimeError(f"pread error offset={offset} count={len(buf)}")
+
+ os.lseek(h["fd"], offset, os.SEEK_SET)
+ n = os.readv(h['fd'], [buf])
+ assert n == len(buf)
+
+
+def pwrite(h, buf, offset, flags):
+ global calls
+ calls += 1
+ if calls % 2:
+ raise RuntimeError(f"pwrite error offset={offset} count={len(buf)}")
+
+ os.lseek(h["fd"], offset, os.SEEK_SET)
+ n = os.writev(h['fd'], [buf])
+ assert n == len(buf)
--
2.34.1
2 years, 11 months
[PATCH nbdkit v2 0/4] common/allocators: Always align mlock buffer
by Richard W.M. Jones
Earlier patch was reviewed here:
https://listman.redhat.com/archives/libguestfs/2022-January/msg00172.html
For this series I pulled out two smaller changes into separate commits
(patches 1 & 2).
I then decided to modify the vector library to add a new
<vector>_reserve_page_aligned function which always allocates
page-aligned memory (patch 3). Currently this is using
posix_memalign, maybe in future using mmap since there seems to be
some dispute about whether calling mlock on anonymous memory is safe.
This makes the change to common/allocators/malloc.c quite
straightforward (patch 4), at the cost of making the change to the
vector library rather headache-inducing. At least it is confined to a
new function and doesn't affect any existing callers!
Rich.
2 years, 11 months
[PATCH libnbd] copy: Implement destination preferred block size
by Richard W.M. Jones
[NB: I think this is a failed attempt, so shoudn't go upstream, and
doesn't need to be reviewed.]
When nbdcopy writes to an NBD server it ignores the server's
minimum/preferred block size. This hasn't caused a problem til now,
but it turns out that in at least one case we care about (writing to
qemu-nbd backed by a compressed qcow2 file) we must obey the minimum &
preferred block size of 64K.
For the longer story on that see this thread:
https://lists.nongnu.org/archive/html/qemu-devel/2022-01/threads.html#06108
This patch attempts to fix this. The uncontroversial part of this
patch adds a global "blocksize" variable which is the destination
preferred block size.
The tricky part of this patch tries to ensure that writes to the
destination obey this block size constraint.
Since nbdcopy is driven by the extent map read from the source, the
theory behind this implementation is that we read the extent map and
then try to "adjust" it so that extents which are not aligned to the
block size grow, shrink or are deleted. It proved to be very
difficult to get that right, but you can see the implementation in the
new function "adjust_extents_for_blocksize".
Unfortunately not only is this difficult to implement, but the theory
is wrong. Read requests are actually split up into smaller
subrequests on write (look for "copy_subcommand" in
multi-thread-copying.c). So this doesn't really solve the problem.
So I think in my second version I will look at adjusting the NBD
destination driver (nbd-ops.c) directly so that it turns unaligned
writes into buffered read/modify/write operations (which can be
optimized quite a lot because we understand the write pattern and know
that the main program doesn't go backwards within blocks).
Rich.
2 years, 11 months
[v2v PATCH 0/7] Fix "virtio-transitional" regression for Windows guests
by Laszlo Ersek
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2043333
My series merged as commit range f9d5448d2efe..f0cea012d018 (for RHBZ
1942325) caused a regression when converting Windows guests.
Namely, osinfo_os_get_all_devices(), the core of that series, only
reports devices that an OS supports "out of the box". So, for modern
Windows releases, Q35 is reported, but virtio1.0-net is never reported,
as the latter requires external drivers. As a result, the series in
question (incorrectly) requires virtio-transitional for Windows guests.
The current series fixes the issue as follows:
- The first two patches fix existent bugs in our "copy_from_libosinfo"
function.
- Patches #3 and #4 OCaml-ify the osinfo_device_driver_get_devices()
function from libosinfo.
- Patch #5 improves libosinfo-related logging.
- Patch #6 extracts the "best virtio driver for the guest" logic from
"copy_from_libosinfo".
- Patch #7 grabs the device list supported by the "best driver", and
concatenates it with the list of devices supported by the OS "out of
the box". The unified list is then checked for both Q35 and virtio-1.0
support.
Note: I still don't have a local environment for testing this against
actual (and, long-term!) Windows guests. I'll do that later, and/or
prepare a RHEL9 scratch build for Virt-tools QE at RH, for testing this
series.
Thanks,
Laszlo
Laszlo Ersek (7):
convert/windows_virtio: fix copy_from_libosinfo <-> VIRTIO_WIN
priority
convert/windows_virtio: map 32-bit arch name from libguestfs to osinfo
convert/libosinfo: factor out v2v_osinfo_device_list_to_value_list()
convert/libosinfo: retrieve the device list for OsinfoDeviceDriver
convert/libosinfo_utils: debug-log the devices supported by a driver
convert/libosinfo_utils: extract "best_driver" from
"windows_virtio.ml"
convert/convert_windows: consult "best driver"'s dev list for
virtio-1.0
convert/convert_windows.ml | 15 +++-
convert/libosinfo-c.c | 132 ++++++++++++++++++---------------
convert/libosinfo.ml | 19 ++---
convert/libosinfo.mli | 19 ++---
convert/libosinfo_utils.ml | 56 +++++++++++---
convert/libosinfo_utils.mli | 18 ++++-
convert/windows_virtio.ml | 51 ++-----------
docs/virt-v2v.pod | 24 ++++--
tests/test-v2v-cdrom.expected | 2 +-
tests/test-v2v-floppy.expected | 2 +-
tests/test-v2v-i-ova.xml | 8 +-
11 files changed, 197 insertions(+), 149 deletions(-)
base-commit: 68af35d48ca845133ede948d36ee351d171e3de8
--
2.19.1.3.g30247aa5d201
2 years, 11 months
[PATCH libnbd 0/2] Fix error handling in copy-libev example
by Nir Soffer
Like all examples, and ndbcopy, error handlng was completly broken, creating
corrupted image if the soruce server failed to read, or the destination server
failed to write or zero.
Additionally the example used uninitialized buffers, which together with broken
error hanlding, can lead to leaking sesitive data the the destinatino server.
Nir Soffer (2):
examples: copy-libev.c: Clear buffers before use
examples: copy-libev.c: Fix error handling
examples/copy-libev.c | 28 +++++++++++++++++++++++++++-
1 file changed, 27 insertions(+), 1 deletion(-)
--
2.34.1
2 years, 11 months
Re: [Libguestfs] Block alignment of qcow2 compress driver
by Eric Blake
Adding libnbd list (libguestfs) in cc
On Fri, Jan 28, 2022 at 11:56:19AM +0000, Richard W.M. Jones wrote:
>
> I hacked nbdcopy to ignore block alignment (the error actually comes
> from libnbd refusing to send the unaligned request, not from
> qemu-nbd), and indeed qemu-nbd accepts the unaligned request without
> complaint.
And only after I already replied to your other email, did I then see
your followup recommending to read this one instead ;)
The NBD spec says the client is non-complying when sending under-sized
requests. If the server accepts it anyway (presumably with RMW
performance pessimizations, as qemu-nbd does), that's a QoI bonus.
>
> Eric - maybe having some flag for nbdcopy to ignore unaligned requests
> when we know the server doesn't care (qemu-nbd) would work?
Yeah, that might make sense - a command-line option for stating "I
know the server has a nice QoI feature, and I don't mind the
performance pessimization".
Another thing to consider: the way the NBD spec is written, the rules
about a client sending unaligned requests being non-compliant only
applies to a client that requested block size information in the first
place. If the client did not request block alignment information, the
server should honor anything at alignment of 512 or above (even if it
would prefer a larger minimum); performance may suffer, but this is
needed to cater to older clients that don't know how to request
alignments - and what's more, qemu-nbd specifically has code that
changes what it advertises if the client did not query (that is, an
advertisement of 64k is ONLY possible if the client requested
alignment details).
So maybe the question becomes whether libnbd needs a knob on whether
to request alignment information. Libnbd commit 9e9c74755 (libnbd
v1.3.10) is where I added the code to unconditionally query for
alignment info. Given that we now know of a case where NOT querying
causes qemu-nbd to behave differently in what it advertises, adding
such a knob to the API makes total sense, at which point, 'nbdcopy
--ignore-align' becomes a way to request that the client not request
alignment in the first place, rather than as a way to call
nbd_set_strict_mode() to turn off alignment checking.
Looks like I have some API work do propose in libnbd...
>
> Rich.
>
> --- a/copy/nbd-ops.c
> +++ b/copy/nbd-ops.c
> @@ -59,6 +59,10 @@ open_one_nbd_handle (struct rw_nbd *rwn)
> exit (EXIT_FAILURE);
> }
>
> + uint32_t sm = nbd_get_strict_mode (nbd);
> + sm &= ~LIBNBD_STRICT_ALIGN;
> + nbd_set_strict_mode (nbd, sm);
> +
> nbd_set_debug (nbd, verbose);
>
> if (extents && rwn->d == READING &&
>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
2 years, 12 months