On Fri, Feb 19, 2021 at 01:58:04PM -0600, Eric Blake wrote:
On 2/19/21 1:41 PM, Richard W.M. Jones wrote:
>>> +#if defined (HAVE_POSIX_FADVISE) && defined (POSIX_FADV_DONTNEED)
>>> + /* On Linux this will evict the pages we just read from the page cache.
*/
>>> + posix_fadvise (fd, offset, r, POSIX_FADV_DONTNEED);
>>
>> I don't think this is a good idea, since this affects the current
>> page cache, for the entire host.
>>
>> So if the host is having an image in cache for good reason,
>> running nbdcopy will drop the cache since nbdcopy does not need
>> it, but maybe the host will need that cache after running
>> nbdcopy.
>>
>> The right way to avoid polluting the page cache is to bypass the
>> cache using O_DIRECT, so nbdcopy is not using or affecting the
>> page cache. This can be useful if the user can enable this with
>> a flag.
After thinking about this a bit more I'm not too convinced this would
affect many people. It would require a situation where multiple
processes on the host are all reading the same file that is also being
processed by nbdcopy. It's possible, but for RHV/OpenStack/etc there
are lots of hypervisors running but generally confined to their own
sets of disks.
> The trouble with O_DIRECT is it's a pain to use correctly,
and I guess
> doesn't use readahead or the existing cache (kind of the opposite
> problem).
>
> However I take your point that we probably ought not to remove files
> from the cache that are already there. I somehow thought that
> POSIX_FADV_DONTNEED only affected the current 'fd', but looking at the
> Linux impl I see it affects the whole system.
>
> This argues for having a flag to enable this.
BTW it's extremely hairy, racy and probably slow, but it seems
possible using mmap + mincore to determine which pages are resident,
and so in theory avoid using POSIX_FADV_DONTNEED for those pages.
If I understand correctly, unconditionally attempting
FADV_SEQUENTIAL
should always should be fine both for reads (from existing file) and
writes (when copying to a new file), so its only FADV_DONTNEED for
clearing out the cache that needs a flag?
Looking at the code, FADV_SEQUENTIAL just doubles the size of
readahead:
https://github.com/torvalds/linux/blob/f40ddce88593482919761f74910f42f4b8...
I assume it's only used for reads and would not affect writes, but
it's hard to follow the code. I'm sure it should be safe though.
So yes FADV_DONTNEED seems like it's the only one that would need a
command line option to enable or disable.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages.
http://libguestfs.org