Re: [Libguestfs] [PATCH libnbd v3 3/3] copy: Evict pages from the page cache when writing to local files.

Thursday, 25 February 2021

On 2/25/21 11:34 AM, Richard W.M. Jones wrote:
...
 When writing to a file or block device, we are always writing new
 (ie. previously uncached) data.  This commit ensures that very little
 of that data will be in the page cache after nbdcopy finishes by
 evicting it as we go along.  This ensures that the host page cache is
 largely unchanged for other host processes.

 This uses Linus's technique described here:
 https://stackoverflow.com/a/3756466
 but instead of using 2 windows, it uses a configurable larger number
 of windows (in this case 8).

Here you state configurable...

...

 Before this commit:

   $ rm /var/tmp/pattern ; sync ; time ./run nbdcopy [ nbdkit pattern 32G ] /var/tmp/pattern && cachestats /var/tmp/pattern
   real	0m34.852s
   user	0m18.368s
   sys	0m33.117s
   pages in cache: 7090389/8388608 (84.5%)  [filesize=33554432.0K, pagesize=4K]

 Notice that the newly written file ends up in the cache, thus trashing
 the page cache on the host.

 After this commit:

   $ rm /var/tmp/pattern ; sync ; time ./run nbdcopy [ nbdkit pattern 32G ] /var/tmp/pattern && cachestats /var/tmp/pattern
   real	0m38.721s
   user	0m18.837s
   sys	0m40.654s
   pages in cache: 65536/8388608 (0.8%)  [filesize=33554432.0K, pagesize=4K]

 The newly written file does not disturb the page cache.  However there
 is about 11% slow down.

I suspect that is because we end up waiting longer for flushing actions
to complete before evicting things from cache.  Do we want this to be an
opt-in/out knob on the command line?  If so, which way should we lean
for the default value of that knob?

...
 @@ -159,7 +165,60 @@ page_cache_evict (struct rw_file *rwf, uint64_t orig_offset, size_t orig_len)
      len -= n;
    }
  }
 -#endif
 +#endif /* PAGE_CACHE_MAPPING */
 +
 +#ifdef EVICT_WRITES
 +/* Prepare to evict file contents from the page cache when writing.
 + * We cannot do this directly (as for reads above) because we have to
 + * wait for Linux to finish writing the pages to disk.  Therefore the
 + * strategy is to (1) tell Linux to begin writing asynchronously and
 + * (2) evict the previous pages, which have hopefully been written
 + * already by the time we get here.  We have to maintain window(s) per
 + * thread.
 + *
 + * For more information see https://stackoverflow.com/a/3756466 and
 + * the links to Linus's advice from that entry.
 + */

I'm less familiar with this interface (having never used it before), but
your usage patterns appear to match the man page and reference materials.
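
For readers following along, here is a minimal sketch of the windowed
technique the comment describes, assuming Linux's sync_file_range(2) and
posix_fadvise(2).  It is not the code from the patch: the names
write_window, evict_state, window_push and evict_after_write are
hypothetical, and only NR_WINDOWS and its value of 8 come from the patch.

```c
#define _GNU_SOURCE             /* needed for sync_file_range */
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>

#define NR_WINDOWS 8            /* same value as the #define in the patch */

/* Hypothetical bookkeeping: one recently written byte range. */
struct write_window {
  uint64_t offset;
  size_t len;
};

/* Hypothetical per-thread state: a ring of the last NR_WINDOWS writes.
 * Must be zero-initialized before first use.
 */
struct evict_state {
  struct write_window windows[NR_WINDOWS];
  unsigned next;                /* slot that will be overwritten next */
  unsigned count;               /* number of valid slots */
};

/* Record a new window.  If the ring was already full, copy the oldest
 * window into *evicted and return 1, otherwise return 0.
 */
int
window_push (struct evict_state *st, uint64_t offset, size_t len,
             struct write_window *evicted)
{
  int full = st->count == NR_WINDOWS;

  if (full)
    *evicted = st->windows[st->next];   /* the oldest entry */
  else
    st->count++;
  st->windows[st->next].offset = offset;
  st->windows[st->next].len = len;
  st->next = (st->next + 1) % NR_WINDOWS;
  return full;
}

/* Call after each successful write of [offset, offset+len) to fd.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
evict_after_write (int fd, struct evict_state *st,
                   uint64_t offset, size_t len)
{
  struct write_window old;
  int err;

  /* (1) Ask the kernel to begin writing the new range asynchronously. */
  if (sync_file_range (fd, offset, len, SYNC_FILE_RANGE_WRITE) == -1)
    return -1;

  /* (2) Evict the oldest window, which has hopefully reached the disk
   * by now.  Waiting here is where any slowdown would come from.
   */
  if (window_push (st, offset, len, &old)) {
    if (sync_file_range (fd, old.offset, old.len,
                         SYNC_FILE_RANGE_WAIT_BEFORE |
                         SYNC_FILE_RANGE_WRITE |
                         SYNC_FILE_RANGE_WAIT_AFTER) == -1)
      return -1;
    /* posix_fadvise returns an error number rather than setting errno. */
    err = posix_fadvise (fd, old.offset, old.len, POSIX_FADV_DONTNEED);
    if (err != 0) {
      errno = err;
      return -1;
    }
  }
  return 0;
}
```

In this sketch nothing is evicted until the ring fills, and the last
NR_WINDOWS ranges are never evicted, which corresponds to point (b) in
the comment on NR_WINDOWS further down in the patch.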

...
 +
 +/* Increasing the number of windows gives better performance since
 + * writes are given more time to make it to disk before we have to
 + * pause to do the page cache eviction.  But a larger number of
 + * windows means less success overall since (a) more page cache is
 + * used as the program runs, and (b) we don't evict any writes which
 + * are still pending when the program exits.
 + */
 +#define NR_WINDOWS 8

...but here you have a #define.  Are you missing a command line option,
or saving it for a later patch on top?

Otherwise it looks reasonable, once you decide what command-line tuning
it might need (as the choice between speed vs. cache clobbering may be
something users want to make).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
