On 1/4/19 3:48 AM, Richard W.M. Jones wrote:
> The original plan was to have a background thread doing the reclaim.
> However that cannot work given the design of filters, because a
> background thread cannot access the next_ops struct, which is only
> available during requests.
>
> Therefore we spread the work over the request threads.  Each blk_*
> function checks whether there is work to do, and if there is, will
> reclaim up to two blocks from the cache (thus ensuring that we always
> make forward progress, since each blk_* function can only add at most
> one block to the cache).
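
For anyone following along, I imagine the per-request reclaim ends up
looking roughly like the sketch below (all of the identifiers here are
my invention for illustration, not code from the patch):

  /* Hypothetical sketch: called at the start of each blk_* function.
   * Reclaims at most two cached blocks, so the cache shrinks (or at
   * worst stays level) even though the caller may add one block.
   */
  static void
  reclaim (void)
  {
    size_t i;

    for (i = 0; i < 2; ++i) {
      if (cache_used <= cache_max)   /* no work to do */
        return;
      evict_one_block ();            /* punch hole + clear bitmap bits */
    }
  }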
> Another large change is that the cache block size can no longer be
> fixed.  We must use a block size which is at least as large as the
> filesystem block size so that FALLOC_FL_PUNCH_HOLE works.  To do
> this, test the filesystem block size and set blksize dynamically to
> MAX (4096, filesystem block size).
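
Presumably the dynamic block size comes from fstatvfs; a minimal
sketch under that assumption (MAX as in nbdkit's minmax.h; the
function name set_blksize is mine, not the patch's):

  #include <sys/statvfs.h>

  /* Hypothetical sketch: pick a block size no smaller than the
   * filesystem block size so FALLOC_FL_PUNCH_HOLE can work.
   */
  static int
  set_blksize (int fd)
  {
    struct statvfs buf;

    if (fstatvfs (fd, &buf) == -1) {
      nbdkit_error ("fstatvfs: %m");
      return -1;
    }
    blksize = MAX (4096, buf.f_bsize);
    return 0;
  }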
>
> This also adds a test
Incomplete sentence?
> ---
> +++ b/filters/cache/cache.h
> @@ -34,12 +34,7 @@
>  #ifndef NBDKIT_CACHE_H
>  #define NBDKIT_CACHE_H
>
> -/* Size of a block in the cache.  A 4K block size means that we need
> - * 64 MB of memory to store the bitmaps for a 1 TB underlying image.
> - * It is also smaller than the usual hole size for sparse files, which
> - * means we have no reason to call next_ops->zero.
> - */
> -#define BLKSIZE 4096
This comment is now gone...
> @@ -278,14 +344,14 @@ cache_zero (struct nbdkit_next_ops *next_ops, void *nxdata,
>    uint8_t *block;
>    bool need_flush = false;
>
> -  block = malloc (BLKSIZE);
> +  block = malloc (blksize);
>    if (block == NULL) {
>      *err = errno;
>      nbdkit_error ("malloc: %m");
>      return -1;
>    }
>
> -  flags &= ~NBDKIT_FLAG_MAY_TRIM; /* See BLKSIZE comment above. */
> +  flags &= ~NBDKIT_FLAG_MAY_TRIM; /* See blksize comment above. */
...which makes this comment stale.  What's more, if we are able to punch
holes, maybe we should consider using a fourth mode in our bitmap.  Right
now we have 00 for uncached, 01 for clean, 11 for dirty (and allocated);
we could add 10 for dirty-but-known-zero by punching a hole locally at
the time of the client's NBD_CMD_WRITE_ZEROES with MAY_TRIM, then later
calling next_ops->zero() when flushing a known-zero block back to the
underlying plugin, all possible since we are now sized for holes.  But
other than cleaning up the comment, the rest of this paragraph deserves
a separate commit.
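
In sketch form (a hypothetical encoding, just to illustrate the idea;
the patch itself only has the three states):

  /* Hypothetical four-state bitmap entry. */
  enum block_state {
    BLOCK_UNCACHED   = 0, /* 00: not in the cache */
    BLOCK_CLEAN      = 1, /* 01: cached, matches the plugin */
    BLOCK_ZERO_DIRTY = 2, /* 10: known zero, not yet flushed */
    BLOCK_DIRTY      = 3, /* 11: cached and modified */
  };

  /* At flush time a 10 block could use next_ops->zero instead of
   * writing a buffer of zeroes through next_ops->pwrite.
   */
  if (state == BLOCK_ZERO_DIRTY)
    r = next_ops->zero (nxdata, blksize, blknum * blksize, 0, err);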
I'm not spotting any obvious problems now.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization:  qemu.org | libvirt.org