Re: [Libguestfs] [libnbd PATCH v4 2/4] generator: Finish parsing structured header in states-reply.c

Friday, 9 June 2023

On 6/9/23 04:17, Eric Blake wrote:
...
 Splitting the parse of a 20-byte structured reply header across two
 source files (16 bytes overlapping with simple replies in
 states-reply.c, the remaining 4 bytes in states-reply-structured.c) is
 confusing.  The upcoming addition of extended headers will reuse the
 payload parsing portion of structured replies, but will parse its
 32-byte header completely in states-reply.c.  So it is better to have
 a single file in charge of parsing all three header sizes (once the
 third type is introduced), where we then farm out to the payload
 parsers. 
"Farm out" is a transitive phrasal verb, but we don't have an object for
it. "Fan out" perhaps? (Or add an object.)

...

 To do that, rename REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY to
 REPLY.CHECK_REPLY_MAGIC and have it pick up the setup from
 REPLY.STRUCTURED_REPLY.START, rename REPLY 
The last REPLY is superfluous

...
 REPLY.STRUCTURED_REPLY.RECV_REMAINING to
 REPLY.RECV_STRUCTURED_REMAINING across files, and merge
 REPLY.STRUCTURED_REPLY.CHECK into what would otherwise be the
 now-empty REPLY.STRUCTURED_REPLY.START. 
A different way to describe the same restructuring (I needed this
wording for explaining the code movement to myself):

- We have (prepare, receive, parse) triplets.

- Before the patch, our starter triplet is (REPLY.START,
REPLY.RECV_REPLY, REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY). These are all
in "states-reply.c". In the last component of that triplet (the
"parse"
component), we branch either to SIMPLE_REPLY.START
(states-reply-simple.c) for actual parsing, or we start another triplet,
for the remaining bytes of a structured reply:
(REPLY.STRUCTURED_REPLY.START, REPLY.STRUCTURED_REPLY.RECV_REMAINING,
REPLY.STRUCTURED_REPLY.CHECK); these are all in

- The patch effectively redistributes the second triplet. The prepare
stage REPLY.STRUCTURED_REPLY.START is merged into the last (parse) stage
of the first triplet, i.e., into REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY
(and it is renamed to REPLY.CHECK_REPLY_MAGIC). The receive stage of the
second triplet is moved to "states-reply.c", and renamed to
REPLY.RECV_STRUCTURED_REMAINING. The last (parse) stage of the second
triplet is kept in "states-reply-structured.c", but renamed to
REPLY.STRUCTURED_REPLY.START.

- Thus "states-reply-structured.c" loses two states (prepare and
receive); the first is absorbed by "states-reply.c" into an existing
state, and the second is gained by "states-reply.c" as a new state.

I agree this is an improvement, because now we finish reading all reply
headers in a single C file.

...

 Signed-off-by: Eric Blake <eblake(a)redhat.com&gt;
 ---
  generator/state_machine.ml          | 29 ++++++++-----------
  generator/states-reply.c            | 43 ++++++++++++++++++++++-------
  generator/states-reply-structured.c | 26 -----------------
  3 files changed, 44 insertions(+), 54 deletions(-)

 diff --git a/generator/state_machine.ml b/generator/state_machine.ml
 index 126159b9..b5485aec 100644
 --- a/generator/state_machine.ml
 +++ b/generator/state_machine.ml
 @@ -769,8 +769,15 @@ and

    State {
      default_state with
 -    name = "CHECK_SIMPLE_OR_STRUCTURED_REPLY";
 -    comment = "Check if the reply is a simple or structured reply";
 +    name = "CHECK_REPLY_MAGIC";
 +    comment = "Check if the reply has expected magic";
 +    external_events = [];
 +  }; 
So this remains in "states-reply.c" and absorbs the original prep logic
of REPLY.STRUCTURED_REPLY.START, from "states-reply-structured.c".

...
 +
 +  State {
 +    default_state with
 +    name = "RECV_STRUCTURED_REMAINING";
 +    comment = "Receiving the remaining part of a structured reply header";
      external_events = [];
    };

This is effectively moved from "states-reply-structured.c" (originally
REPLY.STRUCTURED_REPLY.RECV_REMAINING) to "states-reply.c".

...
 @@ -804,28 +811,14 @@ and
    };
  ]

 -(* Receiving a structured reply from the server.
 +(* Receiving a structured reply payload from the server.
   * Implementation: generator/states-reply-structured.c
   *)
  and structured_reply_state_machine = [
    State {
      default_state with
      name = "START";
 -    comment = "Prepare to receive the remaining part of a structured reply";
 -    external_events = [];
 -  };
 - 
Yes, moved & merged into CHECK_REPLY_MAGIC.

...
 -  State {
 -    default_state with
 -    name = "RECV_REMAINING";
 -    comment = "Receiving the remaining part of a structured reply";
 -    external_events = [];
 -  }; 
Yes, moved and renamed to RECV_STRUCTURED_REMAINING.

...
 -
 -  State {
 -    default_state with
 -    name = "CHECK";
 -    comment = "Parse a structured reply from the server";
 +    comment = "Start parsing a structured reply payload from the server";
      external_events = [];
    };

Stays in "states-reply-structured.c", renamed from CHECK to START.

...
 diff --git a/generator/states-reply.c b/generator/states-reply.c
 index 2c77658b..87f17bae 100644
 --- a/generator/states-reply.c
 +++ b/generator/states-reply.c
 @@ -64,11 +64,11 @@  REPLY.START:
    ssize_t r;

    /* We read all replies initially as if they are simple replies, but
 -   * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below.  This
 -   * works because the structured_reply header is larger, and because
 -   * the last member of a simple reply, cookie, is coincident between
 -   * the two structs (an intentional design decision in the NBD spec
 -   * when structured replies were added).
 +   * check the magic in CHECK_REPLY_MAGIC below.  This works because
 +   * the structured_reply header is larger, and because the last
 +   * member of a simple reply, cookie, is coincident between the two
 +   * structs (an intentional design decision in the NBD spec when
 +   * structured replies were added).
     */
    STATIC_ASSERT (offsetof (struct nbd_handle, sbuf.simple_reply.cookie) ==
                   offsetof (struct nbd_handle, sbuf.sr.structured_reply.cookie), 
OK, only the state name is updated, and the comment is reflowed.

...
 @@ -113,11 +113,11 @@  REPLY.RECV_REPLY:
    switch (recv_into_rbuf (h)) {
    case -1: SET_NEXT_STATE (%.DEAD); return 0;
    case 1: SET_NEXT_STATE (%.READY); return 0;
 -  case 0: SET_NEXT_STATE (%CHECK_SIMPLE_OR_STRUCTURED_REPLY);
 +  case 0: SET_NEXT_STATE (%CHECK_REPLY_MAGIC);
    }
    return 0; 
OK, state rename.

...

 - REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY:
 + REPLY.CHECK_REPLY_MAGIC:
    struct command *cmd;
    uint32_t magic;
    uint64_t cookie; 
Ditto.

...
 @@ -127,7 +127,18 @@  REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY:
      SET_NEXT_STATE (%SIMPLE_REPLY.START);
    }
    else if (magic == NBD_STRUCTURED_REPLY_MAGIC) {
 -    SET_NEXT_STATE (%STRUCTURED_REPLY.START);
 +    /* We've only read the bytes that fill simple_reply.  The
 +     * structured_reply is longer, so prepare to read the remaining
 +     * bytes.  We depend on the memory aliasing in union sbuf to
 +     * overlay the two reply types.
 +     */
 +    STATIC_ASSERT (sizeof h->sbuf.simple_reply ==
 +                   offsetof (struct nbd_structured_reply, length),
 +                   simple_structured_overlap);
 +    assert (h->rbuf == (char *)&h->sbuf + sizeof h->sbuf.simple_reply);
 +    h->rlen = sizeof h->sbuf.sr.structured_reply;
 +    h->rlen -= sizeof h->sbuf.simple_reply; 
This is being absorbed from REPLY.STRUCTURED_REPLY.START. The comment's
wording is slightly updated. OK.

The code is identical.

...
 +    SET_NEXT_STATE (%RECV_STRUCTURED_REMAINING); 
Having dealt with the "prepare" stage right here, we can now read the
remaining bytes; that gets renamed from
REPLY.STRUCTURED_REPLY.RECV_REMAINING, and moved to this file.

...
    }
    else {
      /* We've probably lost synchronization. */
 @@ -141,8 +152,9 @@  REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY:
      return 0;
    }

 -  /* NB: This works for both simple and structured replies because the
 -   * handle (our cookie) is stored at the same offset.  See the
 +  /* NB: This works for both simple and structured replies, even
 +   * though we haven't finished reading the structured header yet,
 +   * because the cookie is stored at the same offset.  See the
     * STATIC_ASSERT above in state REPLY.START that confirmed this.
     */
    h->chunks_received++; 
Good!

...
 @@ -157,6 +169,17 @@  REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY:
    h->reply_cmd = cmd;
    return 0;

 + REPLY.RECV_STRUCTURED_REMAINING:
 +  switch (recv_into_rbuf (h)) {
 +  case -1: SET_NEXT_STATE (%.DEAD); return 0;
 +  case 1:
 +    save_reply_state (h);
 +    SET_NEXT_STATE (%.READY);
 +    return 0;
 +  case 0: SET_NEXT_STATE (%STRUCTURED_REPLY.START);
 +  }
 +  return 0;
 +
   REPLY.FINISH_COMMAND:
    struct command *prev_cmd, *cmd;
    uint64_t cookie; 
Moved nearly identically from REPLY.STRUCTURED_REPLY.RECV_REMAINING. The
state itself is renamed, and the state we jump to (for parsing) is also
renamed. Good.

"Nearly" identically: the code movement sneakily fixes a whitespace
error in the original! :) Double space after "case 0:".

...
 diff --git a/generator/states-reply-structured.c
b/generator/states-reply-structured.c
 index 205a236d..3a7a03fd 100644
 --- a/generator/states-reply-structured.c
 +++ b/generator/states-reply-structured.c
 @@ -46,32 +46,6 @@ structured_reply_in_bounds (uint64_t offset, uint32_t length,

  STATE_MACHINE {
   REPLY.STRUCTURED_REPLY.START:
 -  /* We've only read the bytes that fill simple_reply.  The
 -   * structured_reply is longer, so read the remaining part.  We
 -   * depend on the memory aliasing in union sbuf to overlay the two
 -   * reply types.
 -   */
 -  STATIC_ASSERT (sizeof h->sbuf.simple_reply ==
 -                 offsetof (struct nbd_structured_reply, length),
 -                 simple_structured_overlap);
 -  assert (h->rbuf == (char *)&h->sbuf + sizeof h->sbuf.simple_reply);
 -  h->rlen = sizeof h->sbuf.sr.structured_reply;
 -  h->rlen -= sizeof h->sbuf.simple_reply;
 -  SET_NEXT_STATE (%RECV_REMAINING);
 -  return 0;
 -
 - REPLY.STRUCTURED_REPLY.RECV_REMAINING:
 -  switch (recv_into_rbuf (h)) {
 -  case -1: SET_NEXT_STATE (%.DEAD); return 0;
 -  case 1:
 -    save_reply_state (h);
 -    SET_NEXT_STATE (%.READY);
 -    return 0;
 -  case 0:  SET_NEXT_STATE (%CHECK);
 -  }
 -  return 0;
 -
 - REPLY.STRUCTURED_REPLY.CHECK:
    struct command *cmd = h->reply_cmd;
    uint16_t flags, type;
    uint32_t length; 
Yes.

Hellishly difficult to review, but correct, as far as I can tell, and
certainly an improvement for the codebase.

Reviewed-by: Laszlo Ersek <lersek(a)redhat.com&gt;

I'll continue with the rest of the series sometime later.

Thanks,
Laszlo

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [libnbd PATCH v4 2/4] generator: Finish parsing structured header in states-reply.c