On 3/28/19 11:18 AM, Richard W.M. Jones wrote:
This uses lseek SEEK_DATA/SEEK_HOLE to search for allocated data and
holes in the underlying file.
---
plugins/file/file.c | 141 ++++++++++++++++++++++++++++++++++---
tests/Makefile.am | 5 ++
tests/test-file-extents.sh | 57 +++++++++++++++
3 files changed, 193 insertions(+), 10 deletions(-)
+#ifdef SEEK_HOLE
+/* Extents. */
+
+static int
+file_can_extents (void *handle)
+{
+ struct handle *h = handle;
+ off_t r;
+
+ /* A simple test to see whether SEEK_HOLE etc is likely to work on
+ * the current filesystem.
+ */
+ pthread_mutex_lock (&lseek_lock);
+ r = lseek (h->fd, 0, SEEK_HOLE);
+ pthread_mutex_unlock (&lseek_lock);
+ if (r == -1) {
+ nbdkit_debug ("extents disabled: lseek: SEEK_HOLE: %m");
+ return 0;
+ }
+ return 1;
+}
Should we also return 0 if the lseek() returned the offset of EOF? More
concretely, what if we test:
r = lseek(h->fd, lseek(h->fd, 0, SEEK_HOLE), SEEK_DATA);
If r is non-negative, then we have a sparse file; if the inner SEEK_HOLE
failed then the outer lseek(, -1, SEEK_DATA) should fail with EINVAL;
while if SEEK_HOLE succeeded but reported the offset of EOF, then
SEEK_DATA should fail due to ENXIO and we know the file has no holes at
all. In favor of the double check: a completely non-sparse file is just
wasting time on further lseek() calls that won't tell us anything
different than our default of all data (and which may be helpful for
things like tmpfs that have abysmal lseek() performance). Perhaps a
reason against: if we expect NBD_CMD_TRIM/NBD_CMD_WRITE_ZEROES or even a
third-party to be punching holes in the meantime, then opting out of
future lseek() won't see those later additions of holes.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization:
qemu.org |
libvirt.org