On Mon, Jan 06, 2025 at 04:12:08PM +0000, Richard W.M. Jones wrote:
The QueryAllocatedBlocks API has (another) frustrating feature. It
can only query whole "chunks" (128 sectors). If the disk size is not
aligned to the chunk size (say the size was 129 sectors) then there's
a bit at the end which cannot be queried.
I've hit this in qemu as well - but at least that is open source.
There are two obvious approaches: truncate the file until it is
aligned (since the API won't let you query the tail, pretend the tail
doesn't exist), or emulate the file being rounded up to the next
aligned boundary (special-casing queries of the partial block, and
ensuring that reads see zeroes after the real size and that writes
can't alter bytes beyond the real size). The latter is tricker than
the former, but arguably a bit nicer as it lets the user get at every
real byte of the file. So now to review the rest of your patch and
see which method you chose...
Furthermore, the API gives
an error in this case instead of being helpful:
VixDiskLib_QueryAllocatedBlocks: One of the parameters was invalid
Lame, but par for the course.
Fixes:
https://issues.redhat.com/browse/RHEL-71694
---
tests/Makefile.am | 5 +-
plugins/vddk/worker.c | 43 ++++++++++++-
tests/test-vddk-real-unaligned-chunk.sh | 82 +++++++++++++++++++++++++
3 files changed, 126 insertions(+), 4 deletions(-)
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 29c762b636..713951c741 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -202,7 +202,8 @@ if HAVE_VDDK
check-vddk:
$(MAKE) check TESTS="test-vddk-real.sh \
test-vddk-real-dump-plugin.sh \
- test-vddk-real-create.sh"
+ test-vddk-real-create.sh \
+ test-vddk-real-unaligned-chunk.sh"
endif HAVE_VDDK
#----------------------------------------------------------------------
@@ -1207,6 +1208,7 @@ TESTS += \
test-vddk-password-interactive.sh \
test-vddk-real-create.sh \
test-vddk-real-dump-plugin.sh \
+ test-vddk-real-unaligned-chunk.sh \
test-vddk-real.sh \
test-vddk-reexec.sh \
test-vddk-run.sh \
@@ -1239,6 +1241,7 @@ EXTRA_DIST += \
test-vddk-password-interactive.sh \
test-vddk-real-create.sh \
test-vddk-real-dump-plugin.sh \
+ test-vddk-real-unaligned-chunk.sh \
test-vddk-real.sh \
test-vddk-reexec.sh \
test-vddk-run.sh \
diff --git a/plugins/vddk/worker.c b/plugins/vddk/worker.c
index bc015d1618..12a68d30e4 100644
--- a/plugins/vddk/worker.c
+++ b/plugins/vddk/worker.c
@@ -392,10 +392,11 @@ do_extents (struct command *cmd, struct vddk_handle *h)
const uint64_t offset = cmd->offset;
const bool req_one = cmd->req_one;
struct nbdkit_extents *extents = cmd->ptr;
- uint64_t position, end, start_sector;
+ VixError err;
+ VixDiskLibInfo *info;
+ uint64_t position, start_sector, capacity, last_queryable_sector, end;
position = offset;
- end = offset + count;
/* We can only query whole chunks. Therefore start with the
* first chunk before offset.
@@ -403,8 +404,33 @@ do_extents (struct command *cmd, struct vddk_handle *h)
start_sector =
ROUND_DOWN (offset, VIXDISKLIB_MIN_CHUNK_SIZE * VIXDISKLIB_SECTOR_SIZE)
/ VIXDISKLIB_SECTOR_SIZE;
+
+ /* Get the size of the underlying disk. */
+ VDDK_CALL_START (VixDiskLib_GetInfo, "handle, &info")
+ err = VixDiskLib_GetInfo (h->handle, &info);
+ VDDK_CALL_END (VixDiskLib_GetInfo, 0);
+ if (err != VIX_OK) {
+ VDDK_ERROR (err, "VixDiskLib_GetInfo");
+ return -1;
+ }
+ capacity = info->capacity; /* in sectors! */
+ VDDK_CALL_START (VixDiskLib_FreeInfo, "info")
+ VixDiskLib_FreeInfo (info);
+ VDDK_CALL_END (VixDiskLib_FreeInfo, 0);
Is this code we should call once when we first .open the device (and
stash it in the vddk_handle), rather than querying every time .extents
is called?
+
+ /* Calculate the end that we're going to query, normally this is
+ * offset + count. However since chunks are larger than sectors,
+ * for a disk which has size (capacity) which is not aligned to the
+ * chunk size there is a part of the disk at the end that we can
+ * never query. Reduce 'end' to the maximum possible queryable part
+ * of the disk, and we'll deal with the unaligned bit after the loop
+ * (RHEL-71694).
+ */
+ end = offset + count;
+ last_queryable_sector = ROUND_DOWN (capacity, VIXDISKLIB_MIN_CHUNK_SIZE);
+ end = MIN (end, last_queryable_sector * VIXDISKLIB_SECTOR_SIZE);
You are reducing end here...
+
while (start_sector * VIXDISKLIB_SECTOR_SIZE < end) {
- VixError err;
uint32_t i;
uint64_t nr_chunks, nr_sectors;
VixDiskLibBlockList *block_list;
@@ -474,6 +500,17 @@ do_extents (struct command *cmd, struct vddk_handle *h)
break;
}
+ /* If 'end' spanned beyond the last chunk of the disk, then we
+ * reduced it above to avoid reading a chunk that extends beyond the
+ * end of the underlying disk. We have to synthesize an allocated
+ * block here, which is what VDDK's example code does
+ * (doc/samples/diskLib/vixDiskLibSample.cpp: DoGetAllocatedBlocks).
+ */
+ if (end < offset + count) {
+ if (add_extent (extents, &position, offset + count, false) == -1)
+ return -1;
+ }
...and then synthesizing a final extent for everything beyond end,
whether or not you were asked to provide that much information. But
that is safe - we intentionally designed add_extent() to take as much
or as little as you want to provide, and synthesizing the allocated
chunk always is easier than deciding whether or not you reduced end.
So I agree that you special-cased the tail chunk correctly (the latter
approach of the two I mentioned above).
+++ b/tests/test-vddk-real-unaligned-chunk.sh
+
+# Read the map using VDDK.
+export d
+nbdkit -rfv vddk libdir="$vddkdir" \
+ $PWD/$d/test.vmdk \
+ --run 'nbdinfo --map "$uri" > $d/map'
+cat $d/map
+
+# Note a few features of the expected map. The first 3 chunks (3*128
+# sectors) are allocated, followed by a single hole chunk. Then the
+# last 3 unaligned sectors appear allocated (even though they are not)
+# because we could not read them using the QueryAllocatedBlocks API so
+# we had to assume allocated.
+test "$(cat $d/map)" = "\
+ 0 196608 0 data
+ 196608 65536 3 hole,zero
+ 262144 1536 0 data"
Annoying but harmless - it is always safer to assume allocated when we
can't prove otherwise. Test looks good.
ACK series.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:
qemu.org |
libguestfs.org