The NBD protocol supports Forced Unit Access (FUA) as a more efficient
way to wait for just one write to land in persistent storage, rather
than all outstanding writes at the time of a flush. It is modeled after
the kernel's block I/O flag of the same name. While we can emulate
the proper semantics with a full-blown flush, some plugins can pass
the FUA flag on to the end storage and thereby avoid some overhead.

This patch introduces new callbacks and documentation for those
callbacks; the implementation that takes advantage of the new
callbacks will come in later patches.

Signed-off-by: Eric Blake <eblake(a)redhat.com>
---
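Illustration only, not part of this patch: since the implementation
lands in later patches, here is a minimal sketch of the semantics
documented below (the .can_fua default, and emulating FUA by calling
.flush after a plain .pwrite). Everything named sketch_* or toy_* is
hypothetical, struct sketch_plugin is a cut-down stand-in for
struct nbdkit_plugin, and the .can_flush default shown (true iff
.flush is defined) is an assumption. The real server code may well be
structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down stand-in for struct nbdkit_plugin: only the
 * callbacks that matter for FUA on writes are kept. */
struct sketch_plugin {
  int (*can_flush) (void *handle);
  int (*can_fua) (void *handle);
  int (*pwrite) (void *handle, const void *buf, uint32_t count,
                 uint64_t offset);
  int (*pwrite_fua) (void *handle, const void *buf, uint32_t count,
                     uint64_t offset, int fua);
  int (*flush) (void *handle);
};

/* Assumed default for .can_flush: true iff .flush is defined. */
static int
sketch_can_flush (const struct sketch_plugin *p, void *handle)
{
  if (p->can_flush)
    return p->can_flush (handle);
  return p->flush != NULL;
}

/* Default for .can_fua as documented: true iff .pwrite_fua is defined
 * or .can_flush returns true.  An explicit .can_fua always wins. */
static int
sketch_can_fua (const struct sketch_plugin *p, void *handle)
{
  if (p->can_fua)
    return p->can_fua (handle);
  if (p->pwrite_fua)
    return 1;
  return sketch_can_flush (p, handle);
}

/* Write with an optional FUA request.  Prefer the native callback;
 * otherwise do a plain write and, if FUA was requested, flush before
 * reporting success.  Returns 0 on success, -1 on error. */
static int
sketch_pwrite (const struct sketch_plugin *p, void *handle,
               const void *buf, uint32_t count, uint64_t offset, int fua)
{
  assert (p != NULL);
  if (p->pwrite_fua)
    return p->pwrite_fua (handle, buf, count, offset, fua);
  if (!p->pwrite)
    return -1;                  /* no write support at all */
  if (p->pwrite (handle, buf, count, offset) == -1)
    return -1;
  if (!fua)
    return 0;
  if (!p->flush)
    return -1;                  /* FUA should not have been advertised */
  return p->flush (handle);     /* emulated FUA: flush everything */
}

/* Toy plugin state and callbacks, used only to exercise the sketch:
 * they count calls instead of touching any storage. */
struct toy { int writes; int flushes; };

static int
toy_pwrite (void *h, const void *buf, uint32_t count, uint64_t offset)
{
  (void) buf; (void) count; (void) offset;
  ((struct toy *) h)->writes++;
  return 0;
}

static int
toy_flush (void *h)
{
  ((struct toy *) h)->flushes++;
  return 0;
}
```

With only toy_pwrite and toy_flush filled in, sketch_can_fua reports
true and a FUA write costs one write plus one flush; a plugin that
sets pwrite_fua instead would receive the flag directly and could
avoid the full flush.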
docs/nbdkit-plugin.pod | 77 +++++++++++++++++++++++++++++++++++++++++++++++--
docs/nbdkit.pod | 9 ++++--
include/nbdkit-plugin.h | 8 ++++-
3 files changed, 88 insertions(+), 6 deletions(-)
diff --git a/docs/nbdkit-plugin.pod b/docs/nbdkit-plugin.pod
index 9abf75f..cb936f7 100644
--- a/docs/nbdkit-plugin.pod
+++ b/docs/nbdkit-plugin.pod
@@ -359,7 +359,7 @@ If there is an error, C<.can_write> should call C<nbdkit_error> with
an error message and return C<-1>.
This callback is not required. If omitted, then we return true iff a
-C<.pwrite> callback has been defined.
+C<.pwrite> or C<.pwrite_fua> callback has been defined.
=head2 C<.can_flush>
@@ -400,7 +400,28 @@ If there is an error, C<.can_trim> should call C<nbdkit_error> with an
error message and return C<-1>.
This callback is not required. If omitted, then we return true iff a
-C<.trim> callback has been defined.
+C<.trim> or C<.trim_fua> callback has been defined.
+
+=head2 C<.can_fua>
+
+ int can_fua (void *handle);
+
+This is called during the option negotiation phase to find out if the
+plugin supports the Forced Unit Access (FUA) flag on write and trim
+requests.
+
+If there is an error, C<.can_fua> should call C<nbdkit_error> with an
+error message and return C<-1>.
+
+This callback is not required. If omitted, then we return true iff
+either the C<.pwrite_fua> callback has been defined or C<.can_flush>
+returns true (in the latter case, FUA semantics are emulated by nbdkit
+calling C<.flush> before completing any write or trim operation with
+the FUA flag set).
+
+Note that if this defaults to true and C<.can_trim> also returns true,
+the plugin must provide either C<.flush> or C<.trim_fua> for correct
+FUA semantics.
=head2 C<.pread>
@@ -442,6 +463,21 @@ recovered from), C<.pwrite> should call C<nbdkit_error> with an error
message, and C<nbdkit_set_error> to record an appropriate error
(unless C<errno> is sufficient), then return C<-1>.
+If the plugin can provide efficient Forced Unit Access (FUA) semantics,
+it should define C<.pwrite_fua> instead.
+
+=head2 C<.pwrite_fua>
+
+ int pwrite_fua (void *handle, const void *buf, uint32_t count, uint64_t offset, int fua);
+
+This callback has the same requirements as C<.pwrite>, with the
+additional parameter C<fua> set to a non-zero value if the client
+wants FUA semantics (where the command must not return until the
+actions of the write have landed in persistent storage). If the
+plugin cannot provide efficient FUA, but C<.can_flush> returns true
+and C<.can_fua> does not return false, then client requests for FUA
+semantics are emulated by nbdkit calling C<.flush>.
+
=head2 C<.flush>
int flush (void *handle);
@@ -455,6 +491,11 @@ If there is an error, C<.flush> should call C<nbdkit_error> with an
error message, and C<nbdkit_set_error> to record an appropriate error
(unless C<errno> is sufficient), then return C<-1>.
+Note that C<.flush> can be called both by the client doing an explicit
+flush request, and by nbdkit when emulating Forced Unit Access (FUA)
+semantics after a write or trim where the plugin did not provide FUA
+callbacks (C<.pwrite_fua>, C<.zero_fua>, and C<.trim_fua>).
+
=head2 C<.trim>
int trim (void *handle, uint32_t count, uint64_t offset);
@@ -467,6 +508,21 @@ If there is an error, C<.trim> should call C<nbdkit_error> with an
error message, and C<nbdkit_set_error> to record an appropriate error
(unless C<errno> is sufficient), then return C<-1>.
+If the plugin can provide efficient Forced Unit Access (FUA) semantics,
+it should define C<.trim_fua> instead.
+
+=head2 C<.trim_fua>
+
+ int trim_fua (void *handle, uint32_t count, uint64_t offset, int fua);
+
+This callback has the same requirements as C<.trim>, with the
+additional parameter C<fua> set to a non-zero value if the client
+wants FUA semantics (where the command must not return until the
+actions of the trim have landed in persistent storage). If the plugin
+cannot provide efficient FUA, but C<.can_flush> returns true and
+C<.can_fua> does not return false, then client requests for FUA
+semantics are emulated by nbdkit calling C<.flush>.
+
=head2 C<.zero>
int zero (void *handle, uint32_t count, uint64_t offset, int may_trim);
@@ -488,6 +544,21 @@ If there is an error, C<.zero> should call C<nbdkit_error> with an
error message, and C<nbdkit_set_error> to record an appropriate error
(unless C<errno> is sufficient), then return C<-1>.
+If the plugin can provide efficient Forced Unit Access (FUA) semantics,
+it should define C<.zero_fua> instead.
+
+=head2 C<.zero_fua>
+
+ int zero_fua (void *handle, uint32_t count, uint64_t offset, int may_trim, int fua);
+
+This callback has the same requirements as C<.zero>, with the
+additional parameter C<fua> set to a non-zero value if the client
+wants FUA semantics (where the command must not return until the
+actions of the zero request have landed in persistent storage). If the
+plugin cannot provide efficient FUA, but C<.can_flush> returns true
+and C<.can_fua> does not return false, then client requests for FUA
+semantics are emulated by nbdkit calling C<.flush>.
+
=head1 THREADS
Each nbdkit plugin must declare its thread safety model by defining
@@ -711,7 +782,7 @@ Pino Toscano
=head1 COPYRIGHT
-Copyright (C) 2013-2017 Red Hat Inc.
+Copyright (C) 2013-2018 Red Hat Inc.
=head1 LICENSE
diff --git a/docs/nbdkit.pod b/docs/nbdkit.pod
index 1687ac9..2b868d9 100644
--- a/docs/nbdkit.pod
+++ b/docs/nbdkit.pod
@@ -785,7 +785,12 @@ information about that plugin, eg:
[etc]
Plugins which ship with nbdkit usually have the same version as the
-corresponding nbdkit binary.
+corresponding nbdkit binary. The nbdkit binary will always be able
+to utilize plugins compiled against an older version of the header;
+however, there are cases where a newer plugin may not be fully
+supported by an older nbdkit binary (for example, a plugin that
+supplies C<.pwrite_fua> but not C<.pwrite> may not support writes
+when loaded by the older nbdkit).
=head2 Detect if a plugin is installed
@@ -895,7 +900,7 @@ Pino Toscano
=head1 COPYRIGHT
-Copyright (C) 2013-2017 Red Hat Inc.
+Copyright (C) 2013-2018 Red Hat Inc.
=head1 LICENSE
diff --git a/include/nbdkit-plugin.h b/include/nbdkit-plugin.h
index 2ec3b15..d3b0050 100644
--- a/include/nbdkit-plugin.h
+++ b/include/nbdkit-plugin.h
@@ -1,5 +1,5 @@
/* nbdkit
- * Copyright (C) 2013-2017 Red Hat Inc.
+ * Copyright (C) 2013-2018 Red Hat Inc.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -96,6 +96,12 @@ struct nbdkit_plugin {
void (*dump_plugin) (void);
+ int (*can_fua) (void *handle);
+ int (*pwrite_fua) (void *handle, const void *buf, uint32_t count,
+ uint64_t offset, int fua);
+ int (*zero_fua) (void *handle, uint32_t count, uint64_t offset, int may_trim,
+ int fua);
+ int (*trim_fua) (void *handle, uint32_t count, uint64_t offset, int fua);
/* int (*set_exportname) (void *handle, const char *exportname); */
};
--
2.14.3