On 08/02/2018 02:05 PM, Nir Soffer wrote:
When using file systems not supporting ZERO_RANGE (e.g. NFS 4.2) or
block device on kernel < 4.9, we used to call fallocate() for every
zero, fail with EOPNOTSUPP, and fallback to manual zeroing. When
trimming, we used to try unsupported fallocate() on every call.
Change file handle to remember if punching holes or zeroing range are
supported, and avoid unsupported calls.
- can_trim changed to report the actual capability once we tried to
punch a hole. I don't think this is useful yet.
The documentation states that filters (or the nbdkit engine itself) are
free to cache a per-connection value of .can_trim(), and thus the plugin
should keep the value stable per connection - in part, because they
affect what is advertised to the client, but the advertisement happens
only once as part of the client's initial connection. Changing it after
the fact on a given connection won't change the client's behavior
(turning the feature on is pointless as the client already remembers it
was off; turning the feature off is detrimental as the client already
assumes it works). So the best you can do is make subsequent
connections more efficient after the initial connection has learned state.
- zero changed to:
1. If we can punch hole and may trim, try PUNCH_HOLE
2. If we can zero range, try ZERO_RANGE
3. Fall back to manual writing
- trim changed to:
1. If we can punch hole, try PUNCH_HOLE
2. Succeed
Seems reasonable from the description.
@@ -118,6 +119,8 @@ file_config_complete (void)
/* The per-connection handle. */
struct handle {
int fd;
+ bool can_punch_hole;
+ bool can_zero_range;
Would it be better to make these tri-state rather than merely bool?
(Indeterminate, supported, known to fail)
};
/* Create the per-connection handle. */
@@ -146,6 +149,18 @@ file_open (int readonly)
return NULL;
}
+#ifdef FALLOC_FL_PUNCH_HOLE
+ h->can_punch_hole = true;
+#else
+ h->can_punch_hole = false;
If you use tri-state, then the existence of the macro results in an
initialization of indeterminate, whereas the absence results in known to
fail.
+#endif
+
+#ifdef FALLOC_FL_ZERO_RANGE
+ h->can_zero_range = true;
+#else
+ h->can_zero_range = false;
Likewise.
+#endif
+
return h;
}
@@ -189,19 +204,15 @@ file_get_size (void *handle)
return statbuf.st_size;
}
+/* Trim is advisory, but we prefer to advertise it only when we can actually
+ * (attempt to) punch holes. Before we tried to punch a hole, report true if
+ * FALLOC_FL_PUNCH_HOLE is defined before we did the first call. Once we tried
+ * to punch a hole, report the actual cappability of the underlying file. */
s/cappability/capability/
If you use a tri-state, then report true if the variable is
indeterminate or works, false if known to fail.
static int
file_can_trim (void *handle)
{
- /* Trim is advisory, but we prefer to advertise it only when we can
- * actually (attempt to) punch holes. Since not all filesystems
- * support all fallocate modes, it would be nice if we had a way
- * from fpathconf() to definitively learn what will work on a given
- * fd for a more precise answer; oh well. */
The comment about fpathconf() would still be nice to keep in some form.
-#ifdef FALLOC_FL_PUNCH_HOLE
- return 1;
-#else
- return 0;
-#endif
+ struct handle *h = handle;
+ return h->can_punch_hole;
I'm a bit worried about whether changing the return value within the
context of a single connection is wise, or if we need to further
maintain a per-connection state that is initialized according to the
overall plugin state.
}
Also, it might be nice to add a .can_zero() callback, so that nbdkit
won't even waste time calling into .zero() if we know we will just be
deferring right back to a full write (either because the macro is not
available, or because we encountered a failure trying to use it on a
previous connection).
@@ -301,27 +320,30 @@ file_flush (void *handle)
static int
file_trim (void *handle, uint32_t count, uint64_t offset)
{
- int r = -1;
#ifdef FALLOC_FL_PUNCH_HOLE
struct handle *h = handle;
+ int r = -1;
+
+ if (h->can_punch_hole) {
+ r = do_fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+ offset, count);
Should we also use FALLOC_FL_NO_HIDE_STALE here, if it is available?
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization:
qemu.org |
libvirt.org