Re: [Libguestfs] [PATCH libnbd] nbdfuse: New tool to present a network block device in a FUSE filesystem.

Monday, 14 October 2019

On 10/12/19 9:21 AM, Richard W.M. Jones wrote:
...
 This program allows you to turn a network block device source into a
 FUSE filesystem containing a virtual file:

    $ nbdkit memory 128M
    $ mkdir mp
    $ nbdfuse mp/ramdisk nbd://localhost &
    $ ls -l mp
    total 0
    -rw-rw-rw-. 1 rjones rjones 134217728 Oct 12 15:09 ramdisk
    $ dd if=/dev/urandom bs=1M count=128 of=mp/ramdisk conv=notrunc,nocreat
    128+0 records in
    128+0 records out
    134217728 bytes (134 MB, 128 MiB) copied, 3.10171 s, 43.3 MB/s
    $ fusermount -u mp

Cool!

...
 There are still some shortcomings, such as lack of zero and trim
 support.  These are documented in the TODO file.

...
 +++ b/README
 @@ -82,6 +82,8 @@ Optional:

    * Python >= 3.3 to build the Python 3 bindings and NBD shell (nbdsh).

 + * FUSE to build the nbdfuse program. 
Minimum version?

...
 +++ b/docs/libnbd.pod
 @@ -840,6 +840,7 @@
   L<https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md>.

   L<libnbd-security(3)>,
   L<nbdsh(1)>,
 +L<nbdfuse(1)>, 
Worth sorting these two alphabetically?

...
   L<qemu(1)>.

...
 +++ b/fuse/nbdfuse.c 
...
 +
 +#define FUSE_USE_VERSION 26
 +
 +#include <fuse.h>
 +#include <fuse_lowlevel.h>
 +
 +#include <libnbd.h>
 +
 +#define MAX_REQUEST_SIZE (64 * 1024 * 1024) 
Although this works with nbdkit, qemu-nbd doesn't like more than 32M. 
(We really should find time to teach nbdkit/libnbd about block size 
reporting, but that's a bigger project...)

...
 +
 +static struct nbd_handle *nbd;
 +static bool readonly = false; 
Looks funny to initialize a static variable to 0 in a block of static 
variables with no initializers (C guarantees 0-initialization even if 
you aren't explicit).

...
 +static char *mountpoint, *filename;
 +static const char *pidfile;
 +static char *fuse_options;
 +static struct fuse_chan *ch;
 +static struct fuse *fuse;
 +static struct timespec start_t;
 +static uint64_t size;
 + 
...
 +static void __attribute__((noreturn))
 +usage (FILE *fp, int exitcode)
 +{
 +  fprintf (fp,
 +"    nbdfuse [-r] MOUNTPOINT[/FILENAME] URI\n" 
Do we want to use any #ifdefs to avoid advertising URI support on the 
command line when libnbd is compiled without libxml2?

...
 +"Other modes:\n"
 +"    nbdfuse MOUNTPOINT[/FILENAME] --command CMD [ARGS ...]\n"
 +"    nbdfuse MOUNTPOINT[/FILENAME] --socket-activation CMD [ARGS ...]\n"
 +"    nbdfuse MOUNTPOINT[/FILENAME] --fd N\n"
 +"    nbdfuse MOUNTPOINT[/FILENAME] --tcp HOST PORT\n"
 +"    nbdfuse MOUNTPOINT[/FILENAME] --unix SOCKET\n" 
No mention of nbdfuse -o or -P.

...
 +"\n"
 +"Please read the nbdfuse(1) manual page for full usage.\n"
 +);
 +  exit (exitcode); 
nbdfuse --help > /dev/full

exits with status 0 because we didn't check for error on stdout/stderr. 
That's a corner case, and many programs don't care about it, but it's 
worth deciding if we want to care.
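
If we do decide to care, something along these lines might be enough. This is only a sketch under assumptions: `check_stream` is a hypothetical helper name, not something in the patch, and it only reports the failure rather than deciding what to print about it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of the patch): flush FP and report
 * whether this or any earlier write to it failed.
 * Returns 0 on success, -1 on error.
 */
static int
check_stream (FILE *fp)
{
  assert (fp != NULL);

  /* fflush pushes any buffered help text to the underlying fd, which
   * is where ENOSPC from /dev/full would show up; ferror catches a
   * failure recorded by an earlier write.
   */
  if (fflush (fp) == EOF || ferror (fp))
    return -1;
  return 0;
}
```

usage() could call this on fp just before exit() and switch to EXIT_FAILURE when it returns -1.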

...
 +}
 +
 +static void
 +display_version (void)
 +{
 +  printf ("%s %s\n", PACKAGE_NAME, PACKAGE_VERSION);
 +}
 +
 +static void
 +fuse_help (const char *prog)
 +{
 +  static struct fuse_operations null_operations;
 +  const char *tmp_argv[] = { prog, "--help", NULL };
 +  fuse_main (2, (char **) tmp_argv, &null_operations, NULL);
 +  exit (EXIT_SUCCESS);
 +}
 +
 +static bool
 +is_directory (const char *path)
 +{
 +  struct stat statbuf;
 +
 +  if (stat (path, &statbuf) == -1)
 +    return false;
 +  return S_ISDIR (statbuf.st_mode); 
Accepts a symlink-to-directory, but that's fine by me.

...
 +}
 +
 +int
 +main (int argc, char *argv[])
 +{
 +  enum {
 +    MODE_URI,
 +    MODE_COMMAND,
 +    MODE_FD,
 +    MODE_SOCKET_ACTIVATION,
 +    MODE_TCP,
 +    MODE_UNIX,
 +  } mode = MODE_URI;
 +  enum {
 +    HELP_OPTION = CHAR_MAX + 1,
 +    FUSE_HELP_OPTION,
 +  };
 +  /* Note the "+" means we stop processing as soon as we get to the
 +   * first non-option argument (the mountpoint) and then we parse the
 +   * rest of the command line without getopt.
 +   */
 +  const char *short_options = "+o:P:rV";
 +  const struct option long_options[] = {
 +    { "fuse-help",          no_argument,       NULL, FUSE_HELP_OPTION },
 +    { "help",               no_argument,       NULL, HELP_OPTION },
 +    { "pidfile",            required_argument, NULL, 'P' },
 +    { "pid-file",           required_argument, NULL, 'P' },
 +    { "readonly",           no_argument,       NULL, 'r' },
 +    { "read-only",          no_argument,       NULL, 'r' },
 +    { "version",            no_argument,       NULL, 'V' }, 
Worth a long-option synonym for -o?

...
 +  /* The next parameter is either a URI or a mode switch. */
 +  if (strcmp (argv[optind], "--command") == 0 ||
 +      strcmp (argv[optind], "--cmd") == 0) {
 +    mode = MODE_COMMAND;
 +    optind++;
 +  } 
Is it worth using getopt_long() in this section for allowing unambiguous 
prefix spellings (--c for example) and/or a short option (-c for example)?
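
A rough sketch of what that could look like, assuming a second getopt_long() pass that starts at the argument after the mountpoint. The helper name `parse_mode`, the `MODE_INVALID` value and the single-letter codes are made up for illustration; the synonyms (`--cmd`, `--systemd-socket-activation`) are left out because how a prefix that matches two synonyms is treated may differ between libc implementations.

```c
#include <getopt.h>
#include <stddef.h>
#include <unistd.h>

enum mode {
  MODE_URI, MODE_COMMAND, MODE_FD, MODE_SOCKET_ACTIVATION,
  MODE_TCP, MODE_UNIX,
  MODE_INVALID,                 /* hypothetical: unknown mode switch */
};

/* Hypothetical helper: examine argv[optind] and return the mode.
 * getopt_long accepts any unambiguous prefix of a long option, so
 * "--c" selects --command.  The leading "+" stops parsing at the
 * first non-option, so a plain URI is left at argv[optind].
 */
static enum mode
parse_mode (int argc, char *argv[])
{
  static const struct option mode_options[] = {
    { "command",           no_argument, NULL, 'c' },
    { "fd",                no_argument, NULL, 'f' },
    { "socket-activation", no_argument, NULL, 's' },
    { "tcp",               no_argument, NULL, 't' },
    { "unix",              no_argument, NULL, 'u' },
    { NULL, 0, NULL, 0 }
  };

  switch (getopt_long (argc, argv, "+c", mode_options, NULL)) {
  case -1:  return MODE_URI;    /* not an option: treat as a URI */
  case 'c': return MODE_COMMAND;
  case 'f': return MODE_FD;
  case 's': return MODE_SOCKET_ACTIVATION;
  case 't': return MODE_TCP;
  case 'u': return MODE_UNIX;
  default:  return MODE_INVALID;
  }
}
```

On return, optind already points past the mode switch (or still at the URI), so the existing "enough parameters" check could stay as it is.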

...
 +  else if (strcmp (argv[optind], "--socket-activation") == 0 ||
 +           strcmp (argv[optind], "--systemd-socket-activation") == 0) {
 +    mode = MODE_SOCKET_ACTIVATION;
 +    optind++;
 +  } 
On the same theme, '--socket' as a synonym is easier to type than 
--socket-activation.

...
 +  else if (strcmp (argv[optind], "--fd") == 0) {
 +    mode = MODE_FD;
 +    optind++;
 +  }
 +  else if (strcmp (argv[optind], "--tcp") == 0) {
 +    mode = MODE_TCP;
 +    optind++;
 +  }
 +  else if (strcmp (argv[optind], "--unix") == 0) {
 +    mode = MODE_UNIX;
 +    optind++;
 +  }
 +  else if (argv[optind][0] == '-') {
 +    fprintf (stderr, "%s: unknown mode: %s\n\n", argv[0], argv[optind]);
 +    usage (stderr, EXIT_FAILURE);
 +  }
 +
 +  /* Check there are enough parameters following given the mode. */
 +  switch (mode) {
 +  case MODE_URI:
 +  case MODE_FD:
 +  case MODE_UNIX:
 +    if (argc - optind != 1)
 +      usage (stderr, EXIT_FAILURE);
 +    break;
 +  case MODE_TCP:
 +    if (argc - optind != 2)
 +      usage (stderr, EXIT_FAILURE);
 +    break;
 +  case MODE_COMMAND:
 +  case MODE_SOCKET_ACTIVATION:
 +    if (argc - optind < 1)
 +      usage (stderr, EXIT_FAILURE);
 +    break;
 +  }
 +  /* At this point we know the command line is valid, and so can start
 +   * opening FUSE and libnbd.
 +   */
 +
 +  /* Create the libnbd handle. */
 +  nbd = nbd_create ();
 +  if (nbd == NULL) {
 +    fprintf (stderr, "%s\n", nbd_get_error ());
 +    exit (EXIT_FAILURE);
 +  }
 +
 +  /* Connect to the NBD server synchronously. */
 +  switch (mode) { 
...
 +
 +  case MODE_FD:
 +    if (sscanf (argv[optind], "%d", &fd) != 1) { 
Overflow is undetected.
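
One possible way to catch it, as a sketch only: `parse_fd` is a hypothetical helper, not something from the patch, and rejecting negative values and trailing junk is an extra assumption about what we want here.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical replacement for the sscanf call: parse a non-negative
 * file descriptor from STR into *FD.  Returns 0 on success, or -1 if
 * STR has no digits, has trailing characters, is negative, or does
 * not fit in an int.
 */
static int
parse_fd (const char *str, int *fd)
{
  char *end;
  long val;

  assert (str != NULL && fd != NULL);

  errno = 0;
  val = strtol (str, &end, 10);
  if (end == str || *end != '\0')       /* no digits, or trailing junk */
    return -1;
  if (errno == ERANGE || val < 0 || val > INT_MAX)
    return -1;                          /* overflow or out of range */

  *fd = (int) val;
  return 0;
}
```

Unlike `sscanf (..., "%d", ...)`, strtol() reports out-of-range input via ERANGE, and the explicit INT_MAX check covers platforms where long is wider than int.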

...
 +      fprintf (stderr, "%s: could not parse file descriptor: %s\n\n",
 +               argv[0], argv[optind]);
 +      exit (EXIT_FAILURE);
 +    }
 +    if (nbd_connect_socket (nbd, fd) == -1) {
 +      fprintf (stderr, "%s\n", nbd_get_error ());
 +      exit (EXIT_FAILURE);
 +    }
 +    break;
 + 
...
 +
 +  /* Create the FUSE args. */
 +  if (fuse_opt_add_arg (&fuse_args, argv[0]) == -1) {
 +  fuse_opt_error:
 +    perror ("fuse_opt_add_arg");
 +    exit (EXIT_FAILURE);
 +  }
 +
 +  if (fuse_options) {
 +    if (fuse_opt_add_arg (&fuse_args, "-o") == -1 ||
 +        fuse_opt_add_arg (&fuse_args, fuse_options) == -1)
 +      goto fuse_opt_error;
 +  }
 +
 +  /* Create the FUSE mountpoint. */
 +  ch = fuse_mount (mountpoint, &fuse_args);
 +  if (ch == NULL) {
 +    fprintf (stderr,
 +             "%s: fuse_mount failed: see error messages above", argv[0]);
 +    exit (EXIT_FAILURE);
 +  }
 +
 +  /* Set F_CLOEXEC on the channel.  Some versions of libfuse don't do
 +   * this.
 +   */
 +  fd = fuse_chan_fd (ch);
 +  if (fd >= 0) {
 +    int flags = fcntl (fd, F_GETFD, 0);
 +    if (flags >= 0)
 +      fcntl (fd, F_SETFD, flags & ~FD_CLOEXEC); 
Doesn't check for (unlikely) error.
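
A hedged sketch of what checking might look like; `set_cloexec` is a hypothetical helper name, not from the patch. Note that it ORs FD_CLOEXEC in, which is what the comment above describes, whereas the quoted line masks the flag out with `& ~FD_CLOEXEC`, so that may deserve a second look as well.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set the close-on-exec flag on FD.
 * Returns 0 on success, or -1 with errno set by fcntl on failure.
 */
static int
set_cloexec (int fd)
{
  int flags = fcntl (fd, F_GETFD, 0);

  if (flags == -1)
    return -1;
  /* OR the flag in so any other descriptor flags are preserved. */
  if (fcntl (fd, F_SETFD, flags | FD_CLOEXEC) == -1)
    return -1;
  return 0;
}
```

Whether a failure here should be fatal or just a warning is a separate decision; the sketch only makes the error visible to the caller.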

...
 +  }
 +
 +  /* Create the FUSE handle. */
 +  fuse = fuse_new (ch, &fuse_args,
 +                   &fuse_operations, sizeof fuse_operations, NULL);
 +  if (!fuse) {
 +    perror ("fuse_new");
 +    exit (EXIT_FAILURE);
 +  }
 +  fuse_opt_free_args (&fuse_args);
 +
 +  /* Catch signals since they can leave the mountpoint in a funny
 +   * state.  To exit the program callers must use ‘fusermount -u’.  We
 +   * also must be careful not to call exit(2) in this program until we
 +   * have unmounted the filesystem below.
 +   */
 +  memset (&sa, 0, sizeof sa);
 +  sa.sa_handler = SIG_IGN;
 +  sa.sa_flags = SA_RESTART;
 +  sigaction (SIGPIPE, &sa, NULL);
 +  sigaction (SIGINT, &sa, NULL);
 +  sigaction (SIGQUIT, &sa, NULL);
 +
 +  /* Ready to serve, write pidfile. */
 +  if (pidfile) {
 +    fp = fopen (pidfile, "w");
 +    if (fp) {
 +      fprintf (fp, "%ld", (long) getpid ());
 +      fclose (fp);
 +    }
 +  }
 +
 +  /* Enter the main loop. */
 +  r = fuse_loop (fuse);
 +  if (r != 0)
 +    perror ("fuse_loop");
 +
 +  /* Close FUSE. */
 +  fuse_unmount (mountpoint, ch);
 +  fuse_destroy (fuse);
 +
 +  /* Close NBD handle. */
 +  nbd_close (nbd);
 +
 +  free (mountpoint);
 +  free (filename);
 +  free (fuse_options);
 +
 +  exit (r == 0 ? EXIT_SUCCESS : EXIT_FAILURE); 
Looks deceptively simple :)

...
 +}
 +
 +/* Wraps calls to libnbd functions and automatically checks for a
 + * returns errors in the format required by FUSE.  It also prints out 
Missing a word or two after 'checks for a'

...
 + * the full error message on stderr, so that we don't lose it.
 + */
 +#define CHECK_NBD_ERROR(CALL)                                   \
 +  do { if ((CALL) == -1) return check_nbd_error (); } while (0)
 +static int
 +check_nbd_error (void)
 +{
 +  int err;
 +
 +  fprintf (stderr, "%s\n", nbd_get_error ());
 +  err = nbd_get_errno ();
 +  if (err != 0)
 +    return -err;
 +  else
 +    return -EIO;
 +}
 +
 +static int
 +nbdfuse_getattr (const char *path, struct stat *statbuf)
 +{
 +  const int mode = readonly ? 0444 : 0666;
 +
 +  memset (statbuf, 0, sizeof (struct stat));
 +
 +  /* We're probably making some Linux-specific assumptions here, but
 +   * this file is not compiled on non-Linux systems.
 +   */
 +  statbuf->st_atim = start_t;
 +  statbuf->st_mtim = start_t;
 +  statbuf->st_ctim = start_t;
 +  statbuf->st_uid = geteuid ();
 +  statbuf->st_gid = getegid (); 
Comment is interesting if true.  However, a Google search for 'man 
fuse_main' pulls up https://man.openbsd.org/fuse_main.3 as its first 
hit, so I think FUSE has graduated to non-Linux systems, and we may have 
to revisit this later.

...
 +static int
 +nbdfuse_readdir (const char *path, void *buf,
 +                 fuse_fill_dir_t filler,
 +                 off_t offset, struct fuse_file_info *fi)
 +{
 +  if (strcmp (path, "/") != 0)
 +    return -ENOENT;
 +
 +  filler (buf, ".", NULL, 0);
 +  filler (buf, "..", NULL, 0);
 +  filler (buf, filename, NULL, 0);
 + 
Does FUSE have a way to populate d_type during readdir (DT_DIR for '.', 
'..', DT_REG for filename)?

...
 +static int
 +nbdfuse_write (const char *path, const char *buf,
 +               size_t count, off_t offset,
 +               struct fuse_file_info *fi)
 +{
 +  /* Probably shouldn't happen because of nbdfuse_open check. */
 +  if (readonly)
 +    return -EACCES; 
Is EROFS any better here?

...
 +++ b/fuse/nbdfuse.pod
 @@ -0,0 +1,262 @@
 +=head1 NAME
 +
 +nbdfuse - present a network block device in a FUSE filesystem
 +
 +=head1 SYNOPSIS
 +
 + nbdfuse [-o FUSE-OPTION] [-P PIDFILE] [-r]
 +         MOUNTPOINT[/FILENAME] URI 
This synopsis looks better than the one in usage().

...
 +
 +The NBD device itself can be local or remote and is specified by an
 +NBD URI (like C<nbd://localhost>, see L<nbd_connect_uri(3)>) or
 +various other modes.
 +
 +Use C<fusermount -u MOUNTPOINT> to unmount the filesystem after you
 +have used it. 
Does umount(8) call into fusermount correctly?

...
 +
 +This program is similar in concept to L<nbd-client(8)> (which turns
 +NBD into F</dev/nbdX> device nodes), except: 
Is it worth mentioning qemu-nbd(8) alongside nbd-client(8)?

...
 +
 +=over 4
 +
 +=item *
 +
 +nbd-client is faster because it uses a special kernel module
 +
 +=item *
 +
 +nbd-client requires root, but nbdfuse can be used by any user
 +
 +=item *
 +
 +nbdfuse virtual files can be mounted anywhere in the filesystem
 +
 +=item *
 +
 +nbdfuse uses libnbd to talk to the NBD server
 +
 +=item *
 +
 +nbdfuse requires FUSE support in the kernel
 +
 +=back 
Decent list.

...
 +
 +=head1 EXAMPLES
 +
 +=head2 Present a remote NBD server as a local file
 +
 +If there is a remote NBD server running on C<example.com> at the
 +default NBD port number (10809) then you can turn it into a local file
 +by doing:
 +
 + $ mkdir dir
 + $ nbdfuse dir nbd://example.com &
 + $ ls -l dir/
 + total 0
 + -rw-rw-rw-. 1 nbd nbd 1073741824 Jan  1 10:10 nbd
 +
 +The file is called F<dir/nbd> and you can read and write to it as if
 +it is a normal file.  Note that writes to the file will write to the
 +remote NBD server.  After using it, unmount it:
 +
 + $ fusermount -u dir
 + $ rmdir dir
 +
 +=head2 Use nbdkit to create a file backed by a temporary RAM disk
 +
 +L<nbdkit(1)> has an I<-s> option allowing it to serve over
 +stdin/stdout.  You can combine this with nbdfuse as follows:
 +
 + $ mkdir dir
 + $ nbdfuse dir/ramdisk --command nbdkit -s memory 1G &
 + $ ls -l dir/
 + total 0
 + -rw-rw-rw-. 1 nbd nbd 1073741824 Jan  1 10:10 ramdisk
 + $ dd if=/dev/urandom bs=1M count=100 of=mp/ramdisk conv=notrunc,nocreat
 + 100+0 records in
 + 100+0 records out
 + 104857600 bytes (105 MB, 100 MiB) copied, 2.08319 s, 50.3 MB/s
 +
 +When you have finished with the RAM disk, you can unmount it as below
 +which will cause nbdkit to exit and the RAM disk contents to be
 +discarded:
 +
 + $ fusermount -u dir
 + $ rmdir dir 
What a fun way to use memory :)

...
 +
 +=head2 Use qemu-nbd to read and modify a qcow2 file
 +
 +L<qemu-nbd(8)> cannot serve over stdin/stdout, but it can use systemd
 +socket activation.  You can combine this with nbdfuse and use it to
 +open any file format which qemu understands:
 +
 + $ mkdir dir
 + $ nbdfuse dir/file.raw \
 +           --socket-activation qemu-nbd -f qcow2 file.qcow2 &
 + $ ls -l dir/
 + total 0
 + -rw-rw-rw-. 1 nbd nbd 1073741824 Jan  1 10:10 file.raw
 +
 +File F<dir/file.raw> is in raw format, backed by F<file.qcow2>.  Any
 +changes made to F<dir/file.raw> are reflected into the qcow2 file.  To
 +unmount the file do:
 +
 + $ fusermount -u dir
 + $ rmdir dir
 + 
The real power shines through - we have used the FUSE kernel module for 
user-space mounting of a qcow2 image, instead of the nbd kernel module 
for root-only mounting of a qcow2 image ;)

...
 +Some potentially useful FUSE options:
 +
 +=over 4
 +
 +=item B<-o> B<allow_other>
 +
 +Allow other users to see the filesystem.  This option has no effect
 +unless you enable it globally in F</etc/fuse.conf>.
 +
 +=item B<-o> B<kernel_cache>
 +
 +Allow the kernel to cache files (reduces the number of reads that have
 +to go through the L<libnbd(3)> API).  This is generally a good idea if
 +you can afford the extra memory usage.
 +
 +=item B<-o> B<uid=>N B<-o> B<gid=>N
 +
 +Use these options to map UIDs and GIDs. 
Does this line up with the stats we reported earlier in getattr()?

...
 +
 +=back
 +
 +=item B<-P> PIDFILE
 +
 +=item B<--pidfile> PIDFILE
 +
 +When nbdfuse is ready to serve, write the nbdfuse process ID (PID) to
 +F<PIDFILE>.  This can be used in scripts to wait until nbdfuse is
 +ready.  Note you mustn't try to kill nbdfuse.  Use C<fusermount -u> to
 +unmount the mountpoint which will cause nbdfuse to exit cleanly.
 +
 +=item B<-r>
 +
 +=item B<--readonly>
 +
 +Access the network block device read-only.  The virtual file will have
 +read-only permissions, and any writes will return errors.
 +
 +=item B<--socket-activation> CMD [ARGS ...]
 +
 +Select systemd socket activation mode.  This is similar to
 +I<--command>, but is used for servers like L<qemu-nbd(8)> which
 +support systemd socket activation.  See L</EXAMPLES> above and
 +L<nbd_connect_systemd_socket_activation(3)>.
 +
 +=item B<--tcp> HOST PORT
 +
 +Select TCP mode.  Connect to an NBD server on a host and port over an
 +unencrypted TCP socket.  See also L<nbd_connect_tcp(3)>. 
How hard would it be to support encryption?  Obviously, the fuse-mounted 
file will be unencrypted, but having libnbd connect to an encrypted NBD 
server could prove useful.

...
 +
 +=item B<--unix> SOCKET
 +
 +Select Unix mode.  Connect to an NBD server on a Unix domain socket.
 +See also L<nbd_connect_unix(3)>.
 +
 +=item B<-V>
 +
 +=item B<--version>
 +
 +Display the package name and version and exit.
 +
 +=back
 +
 +=head1 NOTES
 +
 +=head2 Loop mounting
 +
 +It is tempting (and possible) to loop mount the file.  However this
 +will be very slow and may sometimes deadlock.  Better alternatives are
 +to use either L<nbd-client(8)>, or more securely L<libguestfs(3)>, 
Worth mentioning qemu-nbd(8) alongside nbd-client(8)?

...
 +L<guestfish(1)> or L<guestmount(1)> which can all access NBD servers.
 +
 +=head2 As a way to access NBD servers
 +
 +You can use this to access NBD servers, but it is usually better (and
 +definitely much faster) to use L<libnbd(3)> directly instead.  To
 +access NBD servers from the command line, look at L<nbdsh(1)>.
 + 
Overall looks like a fun wrapper, to demonstrate how many layers we can 
shuffle data through to produce/consume it in the format of interest ;)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
