[PATCH] v2v: Implement SSH password authentication for Xen and VMX over SSH.
by Richard W.M. Jones
This isn't quite the full thing. I think that Pino is also working on
replacing the ssh and scp commands in the v2v/input_vmx.ml file with
libssh. Without those changes, -i vmx will still issue raw ssh and
scp commands, which will use ssh-agent (or keyboard-interactive).
The Xen input method doesn't use raw ssh and scp commands, so that one
is OK.
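With the patch, usage for the Xen input would look something like
this (a sketch, not taken from the patch; the URI and paths are made
up, and I'm assuming the existing -ip option, which supplies a file
containing the password, is what applies here):

  echo -n 'secret' > /tmp/passwd
  virt-v2v -i libvirt -ic 'xen+ssh://root@xen.example.com' guestname \
      -ip /tmp/passwd -o local -os /var/tmp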
Rich.
nbdkit, VDDK, extents, readahead, etc
by Richard W.M. Jones
As I've spent really too long today investigating this, I want to
document it in a public email, even though there's nothing really
that interesting here. One thing you find from searching Google for
VDDK 6.7 / VixDiskLib_QueryAllocatedBlocks issues is that we must be
one of the very few users out there. The other thing is that it's
quite broken.
All testing was done using two baremetal servers connected back to
back through a gigabit ethernet switch. I used upstream qemu and
nbdkit from git as of today. I used a single test Fedora guest with a
16G thin provisioned disk with about 1.6G allocated.
Observations:
(1) VDDK hangs for a really long time when using the nbdkit --run
option.
It specifically hangs for exactly 120 seconds doing:
nbdkit: debug: VixDiskLib: Resolve host.
This seems to be a bug in VDDK, possibly connected with the fact that
we fork after initializing VDDK but before doing the
VixDiskLib_ConnectEx. I suspect it's something to do with the PID
changing.
It would be fair to deduct 2 minutes from all timings below.
(2) VDDK cannot use VixDiskLib_QueryAllocatedBlocks if the disk is
opened for writes. It fails with this uninformative error:
nbdkit: vddk[1]: error: [NFC ERROR] NfcFssrvrProcessErrorMsg: received NFC error 13 from server: NfcFssrvrOpen: Failed to open '[datastore1] Fedora 28/Fedora 28.vmdk'
nbdkit: vddk[1]: error: [NFC ERROR] NfcFssrvrClientOpen: received unexpected message 4 from server
nbdkit: vddk[1]: debug: VixDiskLib: Detected DiskLib error 290 (NBD_ERR_GENERIC).
nbdkit: vddk[1]: debug: VixDiskLib: VixDiskLibQueryBlockList: Fail to start query process. Error 1 (Unknown error) (DiskLib error 290: NBD_ERR_GENERIC) at 543.
nbdkit: vddk[1]: debug: can_extents: VixDiskLib_QueryAllocatedBlocks test failed, extents support will be disabled: original error: Unknown error
The last debug statement is from nbdkit itself indicating that because
VixDiskLib_QueryAllocatedBlocks didn't work, extents support is
disabled.
To work around this you can use nbdkit --readonly. However, I don't
understand why that should be necessary, except perhaps it's just an
undocumented limitation of VDDK. For all the cases _we_ care about
we're using --readonly, so that's lucky.
(3) Using nbdkit-noextents-filter and nbdkit-stats-filter we can
nicely measure the benefits of extents:
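The measurement command was along these lines (a reconstruction, not
the exact command line used; the server details, thumbprint and
datastore path are made up, and password=+FILE reads the password
from a file):

  nbdkit --readonly -U - \
      --filter=noextents --filter=stats \
      vddk libdir=/path/to/vmware-vix-disklib-distrib \
           server=esxi.example.com user=root password=+/tmp/passwd \
           thumbprint=xx:xx:xx vm=moref=2 \
           "file=[datastore1] Fedora 28/Fedora 28.vmdk" \
      statsfile=/dev/stderr \
      --run 'qemu-img convert -p $nbd -O raw /var/tmp/out.img'

Drop --filter=noextents for the extents-enabled run, and add
--filter=readahead in front of the other filters for the tests in
point (4) below.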
With noextents (ie. force full copy):
elapsed time: 323.815 s
read: 8194 ops, 17179869696 bytes, 4.24437e+08 bits/s
Without noextents (ie. rely on qemu-img skipping sparse bits):
elapsed time: 237.41 s
read: 833 ops, 1734345216 bytes, 5.84423e+07 bits/s
extents: 70 ops, 135654246400 bytes, 4.57114e+09 bits/s
Note that if you deduct 120 seconds (see point (1) above) from these
times, the comparison goes from 203s -> 117s, about a 40% saving. We
can likely do better by supporting block status requests larger than
32 bits, and by qemu not using NBD_CMD_FLAG_REQ_ONE.
(4) We can also add nbdkit-readahead-filter in both cases to see if
that helps or not:
With noextents and readahead:
elapsed time: 325.358 s
read: 265 ops, 17179869184 bytes, 4.22423e+08 bits/s
As expected the readahead filter greatly reduces the number of iops.
But in this back-to-back configuration VDDK requests are relatively
cheap so no time is saved.
Without noextents, with readahead:
elapsed time: 252.608 s
read: 96 ops, 1927282688 bytes, 6.10363e+07 bits/s
extents: 70 ops, 135654246400 bytes, 4.29612e+09 bits/s
Readahead is detrimental in this case, as expected: this filter works
best when reads are purely sequential, and if they are not it will
tend to prefetch extra data. Notice that the number of bytes read is
larger here than in the earlier test.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
[PATCH] daemon: drop error message check in do_part_expand_gpt
by Denis Plotnikov
part-expand-gpt takes extreme caution and doesn't proceed to writing
to the disk if the preliminary dry run of sgdisk has generated any
warnings on stdout.
This blocks the use of part-expand-gpt after a disk shrink (with disk
resize being the main use case for part-expand-gpt), because the
sgdisk dry run produces a warning in that case.
So remove the excessive safety check, and leave it up to the caller.
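For reference, the sequence this enables is shrinking a disk and then
moving the backup GPT structures to the new end of the disk, roughly
(a sketch mirroring the new test; the image name is made up):

  qemu-img resize --shrink disk.img 50M
  guestfish -a disk.img run : part-expand-gpt /dev/sda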
Signed-off-by: Denis Plotnikov <dplotnikov(a)virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan(a)virtuozzo.com>
---
daemon/parted.c | 20 +-------------------
tests/gdisk/test-expand-gpt.pl | 24 +++++++++++++++++++++---
2 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/daemon/parted.c b/daemon/parted.c
index 070ed4790..2cc714d64 100644
--- a/daemon/parted.c
+++ b/daemon/parted.c
@@ -699,26 +699,8 @@ do_part_expand_gpt(const char *device)
{
CLEANUP_FREE char *err = NULL;
- /* If something is broken, sgdisk may try to correct it.
- * (e.g. recreate partition table and so on).
- * We do not want such behavior, so dry-run at first.*/
int r = commandf (NULL, &err, COMMAND_FLAG_FOLD_STDOUT_ON_STDERR,
- "sgdisk", "--pretend", "-e", device, NULL);
-
- if (r == -1) {
- reply_with_error ("%s --pretend -e %s: %s", "sgdisk", device, err);
- return -1;
- }
- if (err && strlen(err) != 0) {
- /* Unexpected actions. */
- reply_with_error ("%s --pretend -e %s: %s", "sgdisk", device, err);
- return -1;
- }
- free(err);
-
- /* Now we can do a real run. */
- r = commandf (NULL, &err, COMMAND_FLAG_FOLD_STDOUT_ON_STDERR,
- "sgdisk", "-e", device, NULL);
+ "sgdisk", "-e", device, NULL);
if (r == -1) {
reply_with_error ("%s -e %s: %s", "sgdisk", device, err);
diff --git a/tests/gdisk/test-expand-gpt.pl b/tests/gdisk/test-expand-gpt.pl
index 4d647f1af..f17d034ee 100755
--- a/tests/gdisk/test-expand-gpt.pl
+++ b/tests/gdisk/test-expand-gpt.pl
@@ -54,11 +54,29 @@ sub tests {
my $end_sectors = 100 * 1024 * 2 - $output;
die unless $end_sectors <= 34;
- # Negative tests.
+ # Negative test.
eval { $g->part_expand_gpt ("/dev/sdb") };
die unless $@;
- eval { $g->part_expand_gpt ("/dev/sda1") };
- die unless $@;
+
+ $g->close ();
+
+ # Disk shrink test
+ die if system ("qemu-img resize --shrink disk_gpt.img 50M &>/dev/null");
+
+ $g = Sys::Guestfs->new ();
+
+ $g->add_drive ("disk_gpt.img", format => "qcow2");
+ $g->launch ();
+
+ die if $g->part_expand_gpt ("/dev/sda");
+
+ my $output = $g->debug ("sh", ["sgdisk", "-p", "/dev/sda"]);
+ die if $output eq "";
+ $output =~ s/\n/ /g;
+ $output =~ s/.*last usable sector is (\d+).*/$1/g;
+
+ my $end_sectors = 50 * 1024 * 2 - $output;
+ die unless $end_sectors <= 34;
}
eval { tests() };
--
2.17.2
[PATCH] v2v: -o rhv-upload: check whether the cluster exists
by Pino Toscano
In the precheck script, check that the target cluster actually exists.
This avoids errors when creating the VM after the data has been copied.
---
v2v/rhv-upload-precheck.py | 10 ++++++++++
v2v/test-v2v-o-rhv-upload-module/ovirtsdk4/__init__.py | 7 +++++++
2 files changed, 17 insertions(+)
diff --git a/v2v/rhv-upload-precheck.py b/v2v/rhv-upload-precheck.py
index 2798a29dd..b79f91b4a 100644
--- a/v2v/rhv-upload-precheck.py
+++ b/v2v/rhv-upload-precheck.py
@@ -70,4 +70,14 @@ if len(vms) > 0:
raise RuntimeError("VM already exists with name ‘%s’, id ‘%s’" %
(params['output_name'], vm.id))
+# Check whether the specified cluster exists.
+clusters_service = system_service.clusters_service()
+clusters = clusters_service.list(
+ search='name=%s' % params['rhv_cluster'],
+ case_sensitive=True,
+)
+if len(clusters) == 0:
+ raise RuntimeError("The cluster ‘%s’ does not exist" %
+ (params['rhv_cluster']))
+
# Otherwise everything is OK, exit with no error.
diff --git a/v2v/test-v2v-o-rhv-upload-module/ovirtsdk4/__init__.py b/v2v/test-v2v-o-rhv-upload-module/ovirtsdk4/__init__.py
index 8d1058d67..cc4224ccd 100644
--- a/v2v/test-v2v-o-rhv-upload-module/ovirtsdk4/__init__.py
+++ b/v2v/test-v2v-o-rhv-upload-module/ovirtsdk4/__init__.py
@@ -39,6 +39,9 @@ class Connection(object):
return SystemService()
class SystemService(object):
+ def clusters_service(self):
+ return ClustersService()
+
def data_centers_service(self):
return DataCentersService()
@@ -54,6 +57,10 @@ class SystemService(object):
def vms_service(self):
return VmsService()
+class ClustersService(object):
+ def list(self, search=None, case_sensitive=False):
+ return ["Default"]
+
class DataCentersService(object):
def list(self, search=None, case_sensitive=False):
return []
--
2.20.1
[PATCH v2v 1/2] v2v: windows: Add a helper function for installing Powershell firstboot scripts.
by Richard W.M. Jones
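For reference, a caller would use the new helper something like this
(a sketch, not part of the patch; the filename and script contents
are made up, and g and inspect are assumed to be the usual
Guestfs.guestfs handle and Types.inspect value in scope):

  let code = [
    "# PowerShell fragment run at first boot in the guest";
    "Write-Output 'v2v firstboot example'";
  ] in
  Windows.install_firstboot_powershell g inspect "example.ps1" code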
---
v2v/windows.ml | 24 ++++++++++++++++++++++++
v2v/windows.mli | 6 ++++++
2 files changed, 30 insertions(+)
diff --git a/v2v/windows.ml b/v2v/windows.ml
index 23d589b00..dde64e677 100644
--- a/v2v/windows.ml
+++ b/v2v/windows.ml
@@ -19,6 +19,7 @@
open Printf
open Common_gettext.Gettext
+open Std_utils
open Tools_utils
open Utils
@@ -45,3 +46,26 @@ and check_app { Guestfs.app2_name = name;
publisher =~ rex_avg_tech
and (=~) str rex = PCRE.matches rex str
+
+(* Unfortunately Powershell scripts cannot be directly executed
+ * (unless some system config changes are made which for other
+ * reasons we don't want to do) and so we have to run this via
+ * a regular batch file.
+ *)
+let install_firstboot_powershell g { Types.i_windows_systemroot; i_root }
+ filename code =
+ let tempdir = sprintf "%s/Temp" i_windows_systemroot in
+ g#mkdir_p tempdir;
+ let code = String.concat "\r\n" code ^ "\r\n" in
+ g#write (sprintf "%s/%s" tempdir filename) code;
+
+ (* Powershell interpreter. Should we check this exists? XXX *)
+ let ps_exe =
+ i_windows_systemroot ^
+ "\\System32\\WindowsPowerShell\\v1.0\\powershell.exe" in
+
+ (* Windows path to the Powershell script. *)
+ let ps_path = i_windows_systemroot ^ "\\Temp\\" ^ filename in
+
+ let fb = sprintf "%s -ExecutionPolicy ByPass -file %s" ps_exe ps_path in
+ Firstboot.add_firstboot_script g i_root filename fb
diff --git a/v2v/windows.mli b/v2v/windows.mli
index 016ef2a78..6db7874b0 100644
--- a/v2v/windows.mli
+++ b/v2v/windows.mli
@@ -21,3 +21,9 @@
val detect_antivirus : Types.inspect -> bool
(** Return [true] if anti-virus (AV) software was detected in
this Windows guest. *)
+
+val install_firstboot_powershell : Guestfs.guestfs -> Types.inspect ->
+ string -> string list -> unit
+(** [install_firstboot_powershell g inspect filename code] installs a
+ Powershell script (the lines of code) as a firstboot script in
+ the Windows VM. *)
--
2.20.1
virt-v2v slow when running inside the VM
by Sureshkumar Kaliannan
Hi,
I'm trying to create a clone of a physical Windows machine using p2v.
My goal is to create a cloning-tools VM that has the libguestfs tools
installed and acts as the converter.
VM conversion works just fine, but the conversion rate is
significantly slower (about 1/3 of the rate) when running inside the
VM compared to when virt-v2v is run on the same bare-metal host.
On the host:
./virt-p2v-20190405-w1f4efxy/virt-v2v-conversion-log.txt:virtual copying
rate: 615.9 M bits/sec
./virt-p2v-20190405-w1f4efxy/virt-v2v-conversion-log.txt:real copying rate:
181.8 M bits/sec
From the guest VM (on the same bare-metal host):
virt-p2v-20190405-95azj89j/virt-v2v-conversion-log.txt:virtual copying
rate: 185.1 M bits/sec
virt-p2v-20190405-95azj89j/virt-v2v-conversion-log.txt:real copying rate:
62.7 M bits/sec
I understand that several factors come into play, but I tried to make
the VM comparable by making sure enough CPU / memory is given to it.
I also played with adjusting the disk cache modes for the VM
(cache=none, cache=unsafe). When the conversion happens there is not
much load, and there are no other VMs on this machine.
I ruled out the disk being the bottleneck, because when I do a
"virt-v2v -i disk" conversion the VM is only slightly slower: for the
same disk image, virt-v2v running on the host took 75 sec whereas in
the VM it took 100 sec.
How should I go about debugging this performance issue? Any pointers
would be helpful.
thanks
Suresh
[PATCH] v2v: warn when the guest has direct network interfaces (RHBZ#1518539)
by Pino Toscano
virt-v2v obviously cannot convert these kinds of devices, since they
are specific to the hypervisor host. Thus, emit a warning about the
presence of direct network interfaces, so at least this can be
noticed when converting a guest.
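The sort of libvirt XML that the new check matches looks like this
(an illustrative example; the device name is made up):

  <interface type='direct'>
    <source dev='eth0' mode='bridge'/>
    <model type='virtio'/>
  </interface>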
---
v2v/parse_libvirt_xml.ml | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/v2v/parse_libvirt_xml.ml b/v2v/parse_libvirt_xml.ml
index 9cf4c496b..14cd82afd 100644
--- a/v2v/parse_libvirt_xml.ml
+++ b/v2v/parse_libvirt_xml.ml
@@ -499,6 +499,24 @@ let parse_libvirt_xml ?conn xml =
)
in
+ (* Check for direct attachments to physical network interfaces.
+ * (RHBZ#1518539)
+ *)
+ let () =
+ let obj = Xml.xpath_eval_expression xpathctx "/domain/devices/interface[@type='direct']" in
+ let nr_nodes = Xml.xpathobj_nr_nodes obj in
+ if nr_nodes > 0 then (
+ (* Sadly fn_ in ocaml-gettext seems broken, and always returns the
+ * singular string no matter what. Work around this by using a simple
+ * string with sn_ (which works), and outputting it as a whole.
+ *)
+ let msg = sn_ "this guest has a direct network interface which will be ignored"
+ "this guest has direct network interfaces which will be ignored"
+ nr_nodes in
+ warning "%s" msg
+ )
+ in
+
({
s_hypervisor = hypervisor;
s_name = name; s_orig_name = name;
--
2.20.1
[PATCH] v2v: update documentation on nbdkit (RHBZ#1605242)
by Pino Toscano
nbdkit >= 1.6 always ships with the VDDK plugin built, so recommend
that version instead of recommending building nbdkit from source.
---
v2v/virt-v2v-input-vmware.pod | 28 ++--------------------------
1 file changed, 2 insertions(+), 26 deletions(-)
diff --git a/v2v/virt-v2v-input-vmware.pod b/v2v/virt-v2v-input-vmware.pod
index 2b6dbaeec..b3ebda182 100644
--- a/v2v/virt-v2v-input-vmware.pod
+++ b/v2v/virt-v2v-input-vmware.pod
@@ -197,32 +197,8 @@ library is permitted by the license.
=item 2.
-You must also compile nbdkit, enabling the VDDK plugin. nbdkit E<ge>
-1.1.25 is recommended, but it is usually best to compile from the git
-tree.
-
-=over 4
-
-=item *
-
-L<https://github.com/libguestfs/nbdkit>
-
-=item *
-
-L<https://github.com/libguestfs/nbdkit/tree/master/plugins/vddk>
-
-=back
-
-Compile nbdkit as described in the sources (see link above).
-
-You do B<not> need to run C<make install> because you can run nbdkit
-from its source directory. The source directory has a shell script
-called F<nbdkit> which runs the locally built copy of nbdkit and its
-plugins. So set C<$PATH> to point to the nbdkit top build directory
-(that is, the directory containing the shell script called F<nbdkit>),
-eg:
-
- export PATH=/path/to/nbdkit-1.1.x:$PATH
+nbdkit E<ge> 1.6 is recommended, as it ships with the VDDK plugin
+enabled unconditionally.
=item 3.
--
2.20.1
[supermin PATCH 0/5] rpm: fix package selection w/ multilib
by Pino Toscano
This patch series fixes the way supermin sorts the list of installed
packages when resolving a name, picking the right package for the host
architecture.
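For context, on a multilib host a package name can resolve to several
installed packages, one per architecture, and supermin must pick the
one matching the host (illustrative output):

  $ rpm -q --qf '%{NAME}.%{ARCH}\n' glibc
  glibc.i686
  glibc.x86_64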
Pino Toscano (5):
rpm: do not unpack parameters
rpm: fix version comparison
rpm: query the RPM architecture
rpm: fix package sorting (RHBZ#1696822)
utils: remove unused 'compare_architecture' function
src/librpm-c.c | 10 ++++++++++
src/librpm.ml | 1 +
src/librpm.mli | 3 +++
src/ph_rpm.ml | 44 ++++++++++++++++++++++++++++++++++++++------
src/utils.ml | 27 ---------------------------
src/utils.mli | 3 ---
6 files changed, 52 insertions(+), 36 deletions(-)
--
2.20.1
[RFC PATCH] protocol: Add NBD_CMD_FLAG_FAST_ZERO
by Eric Blake
While it may be counterintuitive at first, the introduction of
NBD_CMD_WRITE_ZEROES and NBD_CMD_BLOCK_STATUS has caused a performance
regression in qemu [1], when copying a sparse file. When the
destination file must contain the same contents as the source, but it
is not known in advance whether the destination started life with all
zero content, then there are cases where it is faster to request a
bulk zero of the entire device followed by writing only the portions
of the device that are to contain data, as that results in fewer I/O
transactions overall. In fact, there are even situations where
trimming the entire device prior to writing zeroes may be faster than
a bare write zero request [2]. However, if a bulk zero request ever
falls back to the same speed as a normal write, a bulk pre-zeroing
algorithm is actually a pessimization, as it ends up writing portions
of the disk twice.
[1] https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg06389.html
[2] https://github.com/libguestfs/nbdkit/commit/407f8dde
Hence, it is desirable to have a way for clients to specify that a
particular write zero request is being attempted for a fast wipe, and
get an immediate failure if the zero request would otherwise take the
same time as a write. Conversely, if the client is not performing a
pre-initialization pass, it is still more efficient in terms of
network traffic to send NBD_CMD_WRITE_ZEROES requests where the
server implements the fallback to the slower write, than it is for the
client to have to perform the fallback to send NBD_CMD_WRITE with a
zeroed buffer.
Add a protocol flag and corresponding transmission advertisement flag
to make it easier for clients to inform the server of their intent. If
the server advertises NBD_FLAG_SEND_FAST_ZERO, then it promises two
things: to perform a fallback to write when the client does not
request NBD_CMD_FLAG_FAST_ZERO (so that the client benefits from the
lower network overhead); and to fail quickly with ENOTSUP if the
client requested the flag but the server cannot write zeroes more
efficiently than a normal write (so that the client is not penalized
with the time of writing data areas of the disk twice).
Note that the semantics are chosen so that servers should advertise
the new flag whether or not they have fast zeroing (that is, this is
NOT the server advertising that it has fast zeroes, but rather
advertising that the client can get feedback as needed on whether
zeroing is fast). It is also intentional that the new advertisement
includes a new errno value, ENOTSUP, with rules that this error should
not be returned for any pre-existing behaviors, must not happen when
the client does not request a fast zero, and must be returned quickly
if the client requested fast zero but anything other than the error
would not be fast; while leaving it possible for clients to
distinguish other errors like EINVAL if alignment constraints are not
met. Clients should not send the flag unless the server advertised
support, but well-behaved servers should already be reporting EINVAL
to unrecognized flags. If the server does not advertise the new
feature, clients can safely fall back to assuming that writing zeroes
is no faster than normal writes.
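To make the intended use concrete, the client-side logic for a
pre-zeroing pass might look like this (a sketch in C; the nbd_*
functions are hypothetical stand-ins for a client library, not a real
API — only the control flow matters):

  #include <stdbool.h>
  #include <stdint.h>
  #include <errno.h>

  /* Hypothetical client library calls. */
  extern bool nbd_can_fast_zero (void);  /* NBD_FLAG_SEND_FAST_ZERO? */
  extern int nbd_zero (uint64_t offset, uint64_t count, bool fast_only);

  /* Return true if the whole device was zeroed quickly, in which case
   * the copy loop afterwards only has to write the data extents.
   */
  static bool
  try_bulk_pre_zero (uint64_t size)
  {
    if (!nbd_can_fast_zero ())
      return false;            /* Cannot probe; assume zeroing is slow. */
    if (nbd_zero (0, size, true) == -1) {
      if (errno == ENOTSUP)    /* Server: no faster than plain writes. */
        return false;
      return false;            /* Other errors: also skip pre-zeroing. */
    }
    return true;
  }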
Note that the Linux fallocate(2) interface may or may not be powerful
enough to easily determine if zeroing will be efficient - in
particular, FALLOC_FL_ZERO_RANGE in isolation does NOT give that
insight; for block devices, it is known that ioctl(BLKZEROOUT) does
NOT have a way for userspace to probe if it is efficient or slow. But
with enough demand, the kernel may add another FALLOC_FL_ flag to use
with FALLOC_FL_ZERO_RANGE, and/or appropriate ioctls with guaranteed
ENOTSUP failures if a fast path cannot be taken. If a server cannot
easily determine if write zeroes will be efficient, it is better off
not advertising NBD_FLAG_SEND_FAST_ZERO.
Signed-off-by: Eric Blake <eblake(a)redhat.com>
---
I will not push this without both:
- a positive review (for example, we may decide that burning another
NBD_FLAG_* is undesirable, and that we should instead have some sort
of NBD_OPT_ handshake for determining when the server supports
NBD_CMD_FLAG_FAST_ZERO)
- a reference client and server implementation (probably both via qemu,
since it was qemu that raised the problem in the first place)
doc/proto.md | 44 +++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 43 insertions(+), 1 deletion(-)
diff --git a/doc/proto.md b/doc/proto.md
index 8aaad96..1107766 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -1059,6 +1059,17 @@ The field has the following format:
which support the command without advertising this bit, and
conversely that this bit does not guarantee that the command will
succeed or have an impact.
+- bit 11, `NBD_FLAG_SEND_FAST_ZERO`: allow clients to detect whether
+ `NBD_CMD_WRITE_ZEROES` is efficient. The server MUST set this
+ transmission flag to 1 if the `NBD_CMD_WRITE_ZEROES` request
+ supports the `NBD_CMD_FLAG_FAST_ZERO` flag, and MUST set this
+ transmission flag to 0 if `NBD_FLAG_SEND_WRITE_ZEROES` is not
+ set. Servers SHOULD NOT set this transmission flag if there is no
+ quick way to determine whether a particular write zeroes request
+ will be efficient, but the lack of an efficient write zero
+ implementation SHOULD NOT prevent a server from setting this
+ flag. Clients MUST NOT set the `NBD_CMD_FLAG_FAST_ZERO` request flag
+ unless this transmission flag is set.
Clients SHOULD ignore unknown flags.
@@ -1636,6 +1647,12 @@ valid may depend on negotiation during the handshake phase.
MUST NOT send metadata on more than one extent in the reply. Client
implementors should note that using this flag on multiple contiguous
requests is likely to be inefficient.
+- bit 4, `NBD_CMD_FLAG_FAST_ZERO`; valid during
+ `NBD_CMD_WRITE_ZEROES`. If set, but the server cannot perform the
+ write zeroes any faster than it would for an equivalent
+ `NBD_CMD_WRITE`, then the server MUST fail quickly with an error of
+ `ENOTSUP`. The client MUST NOT set this unless the server advertised
+ `NBD_FLAG_SEND_FAST_ZERO`.
##### Structured reply flags
@@ -2004,7 +2021,10 @@ The following request types exist:
reached permanent storage, unless `NBD_CMD_FLAG_FUA` is in use.
A client MUST NOT send a write zeroes request unless
- `NBD_FLAG_SEND_WRITE_ZEROES` was set in the transmission flags field.
+ `NBD_FLAG_SEND_WRITE_ZEROES` was set in the transmission flags
+ field. Additionally, a client MUST NOT send the
+ `NBD_CMD_FLAG_FAST_ZERO` flag unless `NBD_FLAG_SEND_FAST_ZERO` was
+ set in the transmission flags field.
By default, the server MAY use trimming to zero out the area, even
if it did not advertise `NBD_FLAG_SEND_TRIM`; but it MUST ensure
@@ -2014,6 +2034,23 @@ The following request types exist:
same area will not cause fragmentation or cause failure due to
insufficient space.
+ If the server advertised `NBD_FLAG_SEND_FAST_ZERO` but
+ `NBD_CMD_FLAG_FAST_ZERO` is not set, then the server MUST NOT fail
+ with `ENOTSUP`, even if the operation is no faster than a
+ corresponding `NBD_CMD_WRITE`. Conversely, if
+ `NBD_CMD_FLAG_FAST_ZERO` is set, the server MUST fail quickly with
+ `ENOTSUP` unless the request can be serviced more efficiently than
+ a corresponding `NBD_CMD_WRITE`. The server's determination of
+ efficiency MAY depend on whether the request was suitably aligned,
+ on whether the `NBD_CMD_FLAG_NO_HOLE` flag was present, or even on
+ whether a previous `NBD_CMD_TRIM` had been performed on the
+ region. If the server did not advertise
+ `NBD_FLAG_SEND_FAST_ZERO`, then it SHOULD NOT fail with `ENOTSUP`,
+ regardless of the speed of servicing a request, and SHOULD fail
+ with `EINVAL` if the `NBD_CMD_FLAG_FAST_ZERO` flag was set. A
+ server MAY advertise `NBD_FLAG_SEND_FAST_ZERO` whether or not it
+ can perform efficient zeroing.
+
If an error occurs, the server MUST set the appropriate error code
in the error field.
@@ -2114,6 +2151,7 @@ The following error values are defined:
* `EINVAL` (22), Invalid argument.
* `ENOSPC` (28), No space left on device.
* `EOVERFLOW` (75), Value too large.
+* `ENOTSUP` (95), Operation not supported.
* `ESHUTDOWN` (108), Server is in the process of being shut down.
The server SHOULD return `ENOSPC` if it receives a write request
@@ -2125,6 +2163,10 @@ request is not aligned to advertised minimum block sizes. Finally, it
SHOULD return `EPERM` if it receives a write or trim request on a
read-only export.
+The server SHOULD NOT return `ENOTSUP` except as documented in
+response to `NBD_CMD_WRITE_ZEROES` when `NBD_CMD_FLAG_FAST_ZERO` is
+supported.
+
The server SHOULD return `EINVAL` if it receives an unknown command.
The server SHOULD return `EINVAL` if it receives an unknown command flag. It
--
2.20.1