On Fri, Jun 09, 2023 at 05:22:36PM +0200, Laszlo Ersek wrote:
On 6/9/23 04:17, Eric Blake wrote:
> Splitting the parse of a 20-byte structured reply header across two
> source files (16 bytes overlapping with simple replies in
> states-reply.c, the remaining 4 bytes in states-reply-structured.c) is
> confusing. The upcoming addition of extended headers will reuse the
> payload parsing portion of structured replies, but will parse its
> 32-byte header completely in states-reply.c. So it is better to have
> a single file in charge of parsing all three header sizes (once the
> third type is introduced), where we then farm out to the payload
> parsers.
"Farm out" is a transitive phrasal verb, but we don't have an object for
it. "Fan out" perhaps? (Or add an object.)
Fan out works for me (and yeah, I can see how this one is more of a
localized idiom than a common English phrase); mentioning the payload
parsers states-reply-simple and states-reply-structured by name can't
hurt either.
>
> To do that, rename REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY to
> REPLY.CHECK_REPLY_MAGIC and have it pick up the setup from
> REPLY.STRUCTURED_REPLY.START, rename REPLY
The last REPLY is superfluous
The wall of ALL CAPS makes it harder to read. Your rewording for the
commit message was very nice; I've lifted quite a bit of it.
> REPLY.STRUCTURED_REPLY.RECV_REMAINING to
> REPLY.RECV_STRUCTURED_REMAINING across files, and merge
> REPLY.STRUCTURED_REPLY.CHECK into what would otherwise be the
> now-empty REPLY.STRUCTURED_REPLY.START.
A different way to describe the same restructuring (I needed this
wording for explaining the code movement to myself):
- We have (prepare, receive, parse) triplets.
More generically, our state machine is separated between phases of
preparing h->rbuf/rlen (where to read to), then a re-entrant state
that reads into the buffer (can be re-started as many times when
partial reads block), then once h->rlen reaches zero post-process the
data. Sometimes, when we read into a buffer, we immediately parse it
in the same function; other times, we merely SET_NEXT to a parser
state to parse the data (part of it depends on how much code reuse we
want to have, and whether embedding the parser inside a
switch(recv_into_rbuf) is going to be confusing because the parser
itself also wants a switch, but C lacks Java's 'break label;'
construct). But yes, for structured headers, we do indeed start with
two state triples that break out those actions, and have some
flexibility to consolidate things or split out more states if it makes
the code easier to read.
- Before the patch, our starter triplet is (REPLY.START,
REPLY.RECV_REPLY, REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY). These are all
in "states-reply.c". In the last component of that triplet (the
"parse"
component), we branch either to SIMPLE_REPLY.START
(states-reply-simple.c) for actual parsing, or we start another triplet,
for the remaining bytes of a structured reply:
(REPLY.STRUCTURED_REPLY.START, REPLY.STRUCTURED_REPLY.RECV_REMAINING,
REPLY.STRUCTURED_REPLY.CHECK); these are all in
states-reply-structured.c
- The patch effectively redistributes the second triplet. The prepare
stage REPLY.STRUCTURED_REPLY.START is merged into the last (parse) stage
of the first triplet, i.e., into REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY
(and it is renamed to REPLY.CHECK_REPLY_MAGIC). The receive stage of the
second triplet is moved to "states-reply.c", and renamed to
REPLY.RECV_STRUCTURED_REMAINING. The last (parse) stage of the second
triplet is kept in "states-reply-structured.c", but renamed to
REPLY.STRUCTURED_REPLY.START.
Partly because the state machine generator requires that any
sub-hierarchy of states must have a START block (when a parent state
group wants to call into the subhierarchy, it must use the START
state).
- Thus "states-reply-structured.c" loses two states (prepare and
receive); the first is absorbed by "states-reply.c" into an existing
state, and the second is gained by "states-reply.c" as a new state.
I agree this is an improvement, because now we finish reading all reply
headers in a single C file.
[...]
Hellishly difficult to review, but correct, as far as I can tell, and
certainly an improvement for the codebase.
Reviewed-by: Laszlo Ersek <lersek(a)redhat.com>
Thanks for taking the time on this one. I tweaked the commit message,
then pushed this as 2e34ceb8
I'll continue with the rest of the series sometime later.
Thanks,
Laszlo
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization:
qemu.org |
libvirt.org