On 07/27/2018 04:03 PM, Nir Soffer wrote:
>When using block device on RHEL 7.5, file plugin fails to zero with this
>error (copied from strace):
>
>[pid 39551] fallocate(8, FALLOC_FL_ZERO_RANGE, 1536, 64000) = -1 ENODEV (No such
device)
>
>This is expected error according to the manual:
>
>ENODEV fd does not refer to a regular file or a directory. (If fd is a
>pipe or FIFO, a different error results.)
The man page is out-of-date; newer kernels support
FALLOC_FL_ZERO_RANGE on block devices, in a manner that leaves the
pages marked allocated (that is, no discard happens). The kernel
also supports FALLOC_FL_PUNCH_HOLE to guarantee a read-zeroes result
(using discard, if that works), and
FALLOC_FL_PUNCH_HOLE|FALLOC_FL_NO_HIDE_STALE (which only discards,
and no longer guarantees a read-zeroes result). But you are also
right that in all cases, the errno returned when the operation
cannot be supported is not always EOPNOSUPP.
>
>Treat this error as EOPNOSUPP.
>
>Tested only on Fedora 28; I don't know how to build nbdkit on RHEL, but
>the change is pretty trivial.
>---
> plugins/file/file.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/plugins/file/file.c b/plugins/file/file.c
>index b6e33de..a7c07fb 100644
>--- a/plugins/file/file.c
>+++ b/plugins/file/file.c
>@@ -243,7 +243,7 @@ file_zero (void *handle, uint32_t count, uint64_t offset, int
may_trim)
> if (may_trim) {
> r = fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> offset, count);
>- if (r == -1 && errno != EOPNOTSUPP) {
>+ if (r == -1 && errno != EOPNOTSUPP && errno != ENODEV) {
> nbdkit_error ("zero: %m");
> }
> /* PUNCH_HOLE is older; if it is not supported, it is likely that
Given the recent thread on qemu about the different fallocate flags
in relation to block devices, we may need to revisit the logic in
this function to more closely follow the flags (use
FALLOC_FL_ZERO_RANGE when may_trim is false, and
FALLOC_FL_PUNCH_HOLE when it is true; as well as tweak the
.discard() callback to use
FALLOC_FL_PUNCH_HOLE|FALLOC_FL_NO_HIDE_STALE).
https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg05135.html
>@@ -254,7 +254,7 @@ file_zero (void *handle, uint32_t count, uint64_t offset, int
may_trim)
> #ifdef FALLOC_FL_ZERO_RANGE
> r = fallocate (h->fd, FALLOC_FL_ZERO_RANGE, offset, count);
>- if (r == -1 && errno != EOPNOTSUPP) {
>+ if (r == -1 && errno != EOPNOTSUPP && errno != ENODEV) {
> nbdkit_error ("zero: %m");
> }
> #else
Are we sure that the caller still sees the correct EOPNOTSUPP errno
value that it uses in deciding whether to attempt the manual
fallbacks? You may need to explicitly set errno.
>@@ -288,11 +288,11 @@ file_trim (void *handle, uint32_t count, uint64_t offset)
> struct handle *h = handle;
> /* Trim is advisory; we don't care if it fails for anything other
>- * than EIO or EPERM. */
>+ * than EIO, EPERM, or ENODEV (kernel 3.10) */
> r = fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> offset, count);
> if (r < 0) {
>- if (errno != EPERM && errno != EIO) {
>+ if (errno != EPERM && errno != EIO && errno != ENODEV) {
This last hunk looks wrong. We want to ignore ENODEV errors in this
code path (the same way we are already ignoring EOPNOTSUPP errors).
I should have read the mailing list first ...
I pushed Nir's initial patch. Let me know if additional
changes are required.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.