On 05/01/22 12:49, Richard W.M. Jones wrote:
On Sun, May 01, 2022 at 09:04:41AM +0200, Laszlo Ersek wrote:
> In guestfs_readdir(), the daemon currently sends each XDR-encoded
> "guestfs_int_dirent" to the library with a separate send_file_write()
> call.
>
> Determine the largest encoded size (from the longest filename that a
> "guestfs_int_dirent" could carry, from readdir()'s "struct
dirent"), and
> batch up the XDR encodings until the next encoding might not fit in
> GUESTFS_MAX_CHUNK_SIZE. Call send_file_write() only then.
>
> Bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=1674392
> Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
> ---
> daemon/readdir.c | 38 ++++++++++++++++----
> 1 file changed, 32 insertions(+), 6 deletions(-)
>
> diff --git a/daemon/readdir.c b/daemon/readdir.c
> index 9ab0b0aec955..3084ba939c10 100644
> --- a/daemon/readdir.c
> +++ b/daemon/readdir.c
> @@ -35,6 +35,9 @@ do_internal_readdir (const char *dir)
> DIR *dirstream;
> void *xdr_buf;
> XDR xdr;
> + struct dirent fill;
> + guestfs_int_dirent v;
> + unsigned max_encoded;
>
> /* Prepare to fail. */
> ret = -1;
> @@ -55,6 +58,20 @@ do_internal_readdir (const char *dir)
> }
> xdrmem_create (&xdr, xdr_buf, GUESTFS_MAX_CHUNK_SIZE, XDR_ENCODE);
>
> + /* Calculate the max number of bytes a "guestfs_int_dirent" can be
encoded to.
> + */
> + memset (fill.d_name, 'a', sizeof fill.d_name - 1);
> + fill.d_name[sizeof fill.d_name - 1] = '\0';
> + v.ino = INT64_MAX;
> + v.ftyp = '?';
> + v.name = fill.d_name;
> + if (!xdr_guestfs_int_dirent (&xdr, &v)) {
> + fprintf (stderr, "xdr_guestfs_int_dirent failed\n");
> + goto release_xdr;
> + }
> + max_encoded = xdr_getpos (&xdr);
> + xdr_setpos (&xdr, 0);
> +
> /* Send an "OK" reply, before starting the file transfer. */
> reply (NULL, NULL);
>
> @@ -63,7 +80,6 @@ do_internal_readdir (const char *dir)
> */
> for (;;) {
> struct dirent *d;
> - guestfs_int_dirent v;
>
> errno = 0;
> d = readdir (dirstream);
> @@ -94,22 +110,32 @@ do_internal_readdir (const char *dir)
> v.ftyp = 'u';
> #endif
>
> + /* Flush "xdr_buf" if we may not have enough room for encoding
"v". */
> + if (GUESTFS_MAX_CHUNK_SIZE - xdr_getpos (&xdr) < max_encoded) {
> + if (send_file_write (xdr_buf, xdr_getpos (&xdr)) != 0)
> + break;
> +
> + xdr_setpos (&xdr, 0);
> + }
> +
> if (!xdr_guestfs_int_dirent (&xdr, &v)) {
> fprintf (stderr, "xdr_guestfs_int_dirent failed\n");
> break;
> }
> -
> - if (send_file_write (xdr_buf, xdr_getpos (&xdr)) != 0)
> - break;
> -
> - xdr_setpos (&xdr, 0);
> }
>
> + /* Flush "xdr_buf" if the loop completed successfully and
"xdr_buf" is not
> + * empty. */
> + if (ret == 0 && xdr_getpos (&xdr) > 0 &&
> + send_file_write (xdr_buf, xdr_getpos (&xdr)) != 0)
> + ret = -1;
> +
> /* Finish or cancel the transfer. Note that if (ret == -1) because the library
> * canceled, we still need to cancel back!
> */
> send_file_end (ret == -1);
>
> +release_xdr:
> xdr_destroy (&xdr);
> free (xdr_buf);
Interesting - does it make things faster?
Yes it does, although the runtime of "ls -l" over guestmount is
dominated by the individual stat'ing of the files returned by readdir.
Interestingly, when (for an independent reason) I later searched the git
log for GUESTFS_MAX_CHUNK_SIZE, I stumbled upon Pino's commit
9aef2c7e3a5b ("daemon: tsk: properly use GUESTFS_MAX_CHUNK_SIZE",
2017-03-03). That commit was apparently driven by the same idea.
Anyway as it doesn't complicate things very much, ACK.