On 3/8/19 4:04 AM, Richard W.M. Jones wrote:
This is about the simplest implementation of NBD Structured Replies
(SR) that we can do right now.
It accepts NBD_OPT_STRUCTURED_REPLIES when negotiated by the client,
but only sends back the simplest possible SR when required to by
NBD_CMD_READ. The rest of the time it will send back simple replies
as before. We do not modify the plugin API so plugins are unable to
send complex SRs.
Yep, sounds compliant, and similar to how I added it in qemu. I'll read
through this one in detail, but interoperability with qemu is already a
good sign that it's workable.
Also we do not handle human-readable error messages yet because that
would require changes in how we handle nbdkit_error().
Also we do not understand NBD_CMD_FLAG_DF, but that seems to be OK
because (a) we don't advertize the feature and (b) we only send back a
single chunk anyway.
Or, we COULD advertise it because we always honor it (but that's a
larger diffstat, and thus at odds with "minimal implementation"). Either
way works.
+/* Structured reply types. */
+extern const char *name_of_nbd_reply_type (int);
+#define NBD_REPLY_TYPE_NONE 0
+#define NBD_REPLY_TYPE_OFFSET_DATA 1
+#define NBD_REPLY_TYPE_OFFSET_HOLE 2
+#define NBD_REPLY_TYPE_BLOCK_STATUS 3
+#define NBD_REPLY_TYPE_ERROR 32769
+#define NBD_REPLY_TYPE_ERROR_OFFSET 32770
Worth writing these later ones in hex or via a helper macro that does
((1 << 15) | value)? Or would that mess up the generated
protocol-to-lookup magic?
+++ b/plugins/nbd/nbd.c
@@ -345,7 +345,7 @@ nbd_request (struct handle *h, uint16_t flags, uint16_t type,
uint64_t offset,
static int
nbd_reply_raw (struct handle *h, int *fd)
{
- struct reply rep;
+ struct simple_reply rep;
struct transaction *trans;
void *buf;
uint32_t count;
@@ -353,7 +353,7 @@ nbd_reply_raw (struct handle *h, int *fd)
*fd = -1;
if (read_full (h->fd, &rep, sizeof rep) < 0)
return nbd_mark_dead (h);
- if (be32toh (rep.magic) != NBD_REPLY_MAGIC)
+ if (be32toh (rep.magic) != NBD_SIMPLE_REPLY_MAGIC)
return nbd_mark_dead (h);
nbdkit_debug ("received reply for cookie %#" PRIx64 ", status
%s",
rep.handle, name_of_nbd_error(be32toh (rep.error)));
Teaching the nbd plugin to negotiate structured replies as the client is
obviously a separate patch, I'm fine if you leave that to me :)
+static int
+send_structured_reply_read (struct connection *conn,
+ uint64_t handle, uint16_t cmd,
+ const char *buf, uint32_t count, uint64_t offset)
+{
+ /* Once we are really using structured replies and sending data back
+ * in chunks, we'll be able to grab the write lock for each chunk,
+ * allowing other threads to interleave replies. As we're not doing
+ * that yet we acquire the lock for the whole function.
+ */
Fair enough.
+ ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->write_lock);
+ struct structured_reply reply;
+ struct structured_reply_offset_data offset_data;
+ int r;
+
+ assert (cmd == NBD_CMD_READ);
+
+ reply.magic = htobe32 (NBD_STRUCTURED_REPLY_MAGIC);
+ reply.handle = handle;
+ reply.flags = htobe16 (NBD_REPLY_FLAG_DONE);
+ reply.type = htobe16 (NBD_REPLY_TYPE_OFFSET_DATA);
+ reply.length = htobe32 (sizeof offset_data + count);
This line is correct, but I had to remind myself of C precedence rules
on this one; writing 'count + sizeof offset_data' instead has the same
effect without worrying whether sizeof binds with higher or lower
precedence than +.
@@ -1317,40 +1449,33 @@ recv_request_send_reply (struct connection
*conn)
/* Send the reply packet. */
send_reply:
- {
- ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->write_lock);
- if (get_status (conn) < 0)
- return -1;
- reply.magic = htobe32 (NBD_REPLY_MAGIC);
- reply.handle = request.handle;
- reply.error = htobe32 (nbd_errno (error));
+ if (get_status (conn) < 0)
+ return -1;
Hmm, previously get_status() was checked under lock. But since it is a
thread-local variable, we should be safe grabbing it while unlocked.
ACK.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization:
qemu.org |
libvirt.org