On Mon, Sep 20, 2021 at 06:25:34PM +0200, Laszlo Ersek wrote:
On 09/20/21 13:04, Richard W.M. Jones wrote:
> For example 1024 is returned as "1K".
>
> This does not attempt to handle decimals or SI units. If the number
> isn't some multiple of a power of 1024 then it is returned as bytes (a
> flag is available to indicate this).
>
> I looked at both the gnulib and qemu versions of this function. The
> gnulib version is not under a license which is compatible with libnbd
> and is also really complicated, although it does handle fractions and
> SI units. The qemu version is essentially just frexp + sprintf and
> doesn't attempt to convert to the human-readable version reversibly.
> ---
> .gitignore | 1 +
> common/utils/Makefile.am | 10 +++-
> common/utils/human-size.c | 54 +++++++++++++++++++++
> common/utils/human-size.h | 49 +++++++++++++++++++
> common/utils/test-human-size.c | 89 ++++++++++++++++++++++++++++++++++
> 5 files changed, 201 insertions(+), 2 deletions(-)
>
> diff --git a/.gitignore b/.gitignore
> index 2aa1fd99..5fc59677 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -31,6 +31,7 @@ Makefile.in
> /bash/nbdcopy
> /bash/nbdfuse
> /bash/nbdinfo
> +/common/utils/test-human-size
> /common/utils/test-vector
> /compile
> /config.cache
> diff --git a/common/utils/Makefile.am b/common/utils/Makefile.am
> index 1ca4a370..b273ada1 100644
> --- a/common/utils/Makefile.am
> +++ b/common/utils/Makefile.am
> @@ -34,6 +34,8 @@ include $(top_srcdir)/common-rules.mk
> noinst_LTLIBRARIES = libutils.la
>
> libutils_la_SOURCES = \
> + human-size.c \
> + human-size.h \
> vector.c \
> vector.h \
> version.c \
> @@ -50,8 +52,12 @@ libutils_la_LIBADD = \
>
> # Unit tests.
>
> -TESTS = test-vector
> -check_PROGRAMS = test-vector
> +TESTS = test-human-size test-vector
> +check_PROGRAMS = test-human-size test-vector
> +
> +test_human_size_SOURCES = test-human-size.c human-size.c human-size.h
> +test_human_size_CPPFLAGS = -I$(srcdir)
> +test_human_size_CFLAGS = $(WARNINGS_CFLAGS)
>
> test_vector_SOURCES = test-vector.c vector.c vector.h
> test_vector_CPPFLAGS = -I$(srcdir)
> diff --git a/common/utils/human-size.c b/common/utils/human-size.c
> new file mode 100644
> index 00000000..772f2489
> --- /dev/null
> +++ b/common/utils/human-size.c
> @@ -0,0 +1,54 @@
> +/* nbd client library in userspace
> + * Copyright (C) 2020-2021 Red Hat Inc.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
(1) General question: should we adopt "SPDX-License-Identifier"s? They
are more succinct.
I'd usually follow what qemu or libvirt are doing, and as far as I can
see they are not using these.
> +
> +#include <config.h>
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdbool.h>
> +#include <stdint.h>
> +#include <inttypes.h>
> +
> +#include "human-size.h"
> +
> +char *
> +human_size (char *buf, uint64_t bytes, bool *human)
> +{
> + static const char *ext[] = { "E", "P", "T",
"G", "M", "K", "" };
(2) The following would be more frugal:
static const char ext[][2] = { "E", "P", "T",
"G", "M", "K", "" };
(each pointer in the current array is 4 bytes or 8 bytes).
Good idea - I'll try this.
> + size_t i;
> +
> + if (buf == NULL) {
> + buf = malloc (HUMAN_SIZE_LONGEST);
> + if (buf == NULL)
> + return NULL;
> + }
> +
> + /* Work out which extension to use, if any. */
> + for (i = 6; i >= 0; --i) {
> + if (bytes == 0 || (bytes & 1023) != 0)
> + break;
> + bytes /= 1024;
> + }
(3) "size_t" is an unsigned type. When the loop is entered with i==0,
and the "break" is not taken, "i" will flip back to ((size_t)-1), a
very
large positive integer. In other words, the controlling expression of
this loop is unfalsifiable.
Damn! I thought GCC would warn me about that ...
Functionally, that's not a problem; when we enter the loop with
i==0, we
will have shifted out 6*10 = 60 bits on the least significant end, we
only have 4 bits left, and so (bytes == 0 || (bytes & 1023) != 0) will
definitely fire. It's just that the controlling expression of the loop
is misleading (it suggests it has some role, when it doesn't).
But it's still potentially dangerous code. I'll see if there's a way
to fix this. I see you've got a better suggestion below.
(4) Style: I think bit-and (&) should be coupled with bit-shift
(>>), or
else division (/) should be coupled with remainder (%).
OK
(5) It is not necessary to keep the zero check in the loop body. Once
we
determine (before the loop) that the input is nonzero, we know it must
have at least one bit that is set. So we can just continue shifting out
runs of ten zero bits, until we hit a run of ten bits that is not
all-bits-zero.
Thus, how about:
i = 6;
if (bytes != 0) {
while ((bytes & 1023) == 0) {
bytes >>= 10;
i--;
}
}
> +
> + /* Set to flag to true if we're going to add a human-readable extension. */
(4) "Set [the] flag to true"?
Oops, yes.
> + if (human)
> + *human = ext[i][0] != '\0';
> +
> + snprintf (buf, HUMAN_SIZE_LONGEST, "%" PRIu64 "%s", bytes,
ext[i]);
> + return buf;
> +}
> diff --git a/common/utils/human-size.h b/common/utils/human-size.h
> new file mode 100644
> index 00000000..9ee78803
> --- /dev/null
> +++ b/common/utils/human-size.h
> @@ -0,0 +1,49 @@
> +/* nbd client library in userspace
> + * Copyright (C) 2020-2021 Red Hat Inc.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef LIBNBD_HUMAN_SIZE_H
> +#define LIBNBD_HUMAN_SIZE_H
> +
> +#include <stdbool.h>
> +#include <stdint.h>
> +
> +/* If you allocate a buffer of at least this length in bytes and pass
> + * it as the first parameter to human_size, then it will not overrun.
> + */
> +#define HUMAN_SIZE_LONGEST 64
(5) The integer constant expression
((sizeof (uint64_t) * 8 + 2) / 3 + 1)
would be more frugal (but we might not care).
If we take the number of three-bit groups in the word, and divide that
by three -- rounded up --, we get the number of necessary octal digits.
The number of decimal digits needed never exceeds the number of octal
digits needed, so this is safe. Add one character for the NUL terminator.)
Yeah I was just guessing here. FWIW gnulib really does the hard work:
https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/human.h;h=...
> +
> +/* Convert bytes to a human-readable string.
> + *
> + * This is roughly the opposite of nbdkit_parse_size. It will convert
> + * multiples of powers of 1024 to the appropriate human size with the
> + * right extension like 'M' or 'G'. Anything that cannot be
converted
> + * is returned as bytes. The *human flag is set to true if the output
> + * was abbreviated to a human-readable size, or false if it is just
> + * bytes.
> + *
> + * If buf == NULL, a buffer is allocated and returned. In this case
> + * the returned buffer must be freed.
> + *
> + * buf may also be allocated by the caller, in which case it must be
> + * at least HUMAN_SIZE_LONGEST bytes.
> + *
> + * On error the function returns an error and sets errno.
> + */
> +extern char *human_size (char *buf, uint64_t bytes, bool *human);
> +
> +#endif /* LIBNBD_HUMAN_SIZE_H */
> diff --git a/common/utils/test-human-size.c b/common/utils/test-human-size.c
> new file mode 100644
> index 00000000..d35a21bf
> --- /dev/null
> +++ b/common/utils/test-human-size.c
> @@ -0,0 +1,89 @@
> +/* nbd client library in userspace
> + * Copyright (C) 2020-2021 Red Hat Inc.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include <config.h>
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdbool.h>
> +#include <stdint.h>
> +#include <inttypes.h>
> +#include <string.h>
> +
> +#include "human-size.h"
> +
> +static unsigned errors = 0;
> +
> +static void
> +test (uint64_t bytes, const char *expected, bool expected_human_flag)
> +{
> + char actual[HUMAN_SIZE_LONGEST];
> + bool actual_human_flag;
> +
> + human_size (actual, bytes, &actual_human_flag);
> +
> + if (strcmp (actual, expected) == 0 ||
> + actual_human_flag != expected_human_flag) {
(6) I don't understand this condition. If I understand the branch *body*
correctly, this is the "test succeeded" case. For that, we need:
strcmp (actual, expected) == 0 &&
actual_human_flag == expected_human_flag
The negation of this condition (for the "test failed" case) would be:
strcmp (actual, expected) != 0 ||
actual_human_flag != expected_human_flag
Hmm this is a mess, let me fix this too ...
> + printf ("test-human-size: %" PRIu64 " ->
\"%s\" (%s) OK\n",
> + bytes, actual, actual_human_flag ? "true" :
"false");
> + fflush (stdout);
> + }
> + else {
> + fprintf (stderr,
> + "test-human-size: error: test case %" PRIu64
> + "expected \"%s\" (%s) "
(7) Missing space character in the format string between PRIu64 and
"expected".
Ok.
Thanks,
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top