On 9/5/20 8:56 AM, Eric Blake wrote:
Thinking about it more, it is probably easiest to just declare that
we
guarantee the cleanup callback is called regardless of failure mode, for
all closures, and that we merely had a memory leak in earlier libnbd
releases. Yes, this means that we have a risk of older apps hitting a
double free if they cleaned up after a cookie of -1 to compensate for
our failure to cleanup; but it is unlikely, since the code for an app to
call the cleanup manually is more verbose, and since good apps are
unlikely to call functions that are going to fail client-side to trigger
the problem in the first place. I'll post a patch, including coverage
in tests/closure-lifetimes.c, to demonstrate what I mean.
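To make the proposed contract concrete, here is a toy model in Python (FakeHandle and call_with_closure are made up for illustration; this is not the real libnbd API) of the rule that the cleanup callback fires exactly once, even when the call fails client-side and returns a cookie of -1, so the app must never clean up manually:

```python
class FakeHandle:
    """Toy model of the closure-lifetime rule: the cleanup callback
    runs exactly once on every path, including client-side failure."""

    def call_with_closure(self, closure, cleanup, fail=False):
        try:
            if fail:
                return -1      # client-side failure; closure never invoked
            closure()
            return 1           # fake cookie for the successful call
        finally:
            cleanup()          # guaranteed: runs on success AND failure

calls = []
h = FakeHandle()
cookie = h.call_with_closure(lambda: None, lambda: calls.append("cleanup"),
                             fail=True)
# Even though the call failed, cleanup already ran; freeing again
# in the app would be the double free described above.
assert cookie == -1 and calls == ["cleanup"]
```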
Proof that common clients have a memory leak (rather than a risk of a
double free error), and thus that this is worth fixing by guaranteeing
closure cleanup even on client-side failures:
$ export count=10
$ nbdkit null 1m --run 'export uri; /usr/bin/time -v nbdsh -c "
h.set_request_structured_replies(False)
def f (c, o, e, err):
  pass
import os
h.connect_uri(os.environ[\"uri\"])
for _ in range(int(os.environ[\"count\"])):
  try:
    h.block_status(512, 0, f)
  except nbd.Error as ex:
    pass
print(\"got here\")
"' 2>&1 | grep '\(here\|resident\)'
got here
print("got here")
Maximum resident set size (kbytes): 14416
Average resident set size (kbytes): 0
$ count=10000
$ nbdkit null 1m --run 'export uri; /usr/bin/time -v nbdsh -c "
h.set_request_structured_replies(False)
def f (c, o, e, err):
  pass
import os
h.connect_uri(os.environ[\"uri\"])
for _ in range(int(os.environ[\"count\"])):
  try:
    h.block_status(512, 0, f)
  except nbd.Error as ex:
    pass
print(\"got here\")
"' 2>&1 | grep '\(here\|resident\)'
got here
print("got here")
Maximum resident set size (kbytes): 17796
Average resident set size (kbytes): 0
Ideally, increasing the loop iterations should have no effect on the max
memory usage. Replacing the h.set_request_structured_replies(False)
line with h.add_meta_context(nbd.CONTEXT_BASE_ALLOCATION) (so that the
block status request succeeds instead) proves the point: after that
change, I'm seeing 17296k with count=1 and 17284k with count=10000 (that
is, the memory usage is higher and somewhat less deterministic because
there are now server interactions, but it definitely no longer scales up
as count increases).
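The same scaling check can be done without /usr/bin/time. This generic sketch (nothing libnbd-specific; the leaky function merely stands in for closures that are never freed) uses Python's tracemalloc to show how a leak reveals itself as memory growth proportional to the iteration count:

```python
import tracemalloc

leaked = []  # stands in for closures that the library never frees

def leaky(n):
    for _ in range(n):
        leaked.append(bytearray(512))

def measure(fn, n):
    """Return bytes still allocated after running fn(n)."""
    tracemalloc.start()
    fn(n)
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current

small = measure(leaky, 10)
big = measure(leaky, 10000)
# A leak scales with iteration count; fixed overhead would not.
assert big > small * 10
```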
And in proving that, I found several _other_ bugs, now fixed: Python.ml
was mapping Bool incorrectly (so that
h.set_request_structured_replies(False) was often setting the value to
true instead). That warranted testsuite coverage of functions previously
uncalled by the Python or OCaml testsuites, which in turn flushed out
bugs in the OCaml NBD.set_tls and NBD.set_handshake_flags.
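For the curious, here is what a Bool mapping bug of this class typically looks like (an illustrative Python sketch, not the actual Python.ml code): the binding layer tests for the presence of a value rather than its truth value, so False gets marshalled as true:

```python
def buggy_c_bool(py_value):
    # WRONG: any object that was supplied, including False,
    # is marshalled as C true.
    return 1 if py_value is not None else 0

def fixed_c_bool(py_value):
    # RIGHT: convert via the object's actual truth value.
    return 1 if py_value else 0

assert buggy_c_bool(False) == 1   # False silently turns into true
assert fixed_c_bool(False) == 0
assert fixed_c_bool(True) == 1
```

This is why round-trip tests of every boolean setter against its getter are worth having in the binding testsuites.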
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.  +1-919-301-3226
Virtualization: qemu.org | libvirt.org