On 26.09.2016 19:20, Carl-Daniel Hailfinger wrote:
On 26.09.2016 14:29, Richard W.M. Jones wrote:
> On Mon, Sep 26, 2016 at 02:18:02PM +0200, Carl-Daniel Hailfinger wrote:
>> Hi,
>>
>> has anyone ever run "make check" from nbd against nbdkit with a python
>> plugin? I usually get segfaults during such a run, and sometimes various
>> other errors happen before the segfault, suggesting that some memory
>> corruption is underway.
>> AFAICS a pure python plugin should not be able to cause memory corruption.
> Correct, a python plugin should not cause memory corruption,
> and nbdkit shouldn't segfault ever.
>
> Did you get a stack trace from C (not from Python)?
The core files were useless, but I ran nbdkit in gdb and got something...
(gdb) run -n -f /software/nbdkit/plugins/python/.libs/nbdkit-python-plugin.so
script=/software/nbdkit/plugins/python/ssd-ftl-wearleveling.py disksize=52428800
Starting program: /software/nbdkit/src/nbdkit -n -f
/software/nbdkit/plugins/python/.libs/nbdkit-python-plugin.so
script=/software/nbdkit/plugins/python/ssd-ftl-wearleveling.py disksize=52428800
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, py_config (key=0x7fffffffe2e0 "script", value=0x7fffffffe2e7
"/software/nbdkit/plugins/python/ssd-ftl-wearleveling.py") at python.c:107
107 {
(gdb) info shared
>From To Syms Read Shared Object Library
0x00007ffff7ddaae0 0x00007ffff7df5490 Yes /lib64/ld-linux-x86-64.so.2
0x00007ffff7bd6ed0 0x00007ffff7bd79ce Yes /lib/x86_64-linux-gnu/libdl.so.2
0x00007ffff79bd9f0 0x00007ffff79ca471 Yes
/lib/x86_64-linux-gnu/libpthread.so.0
0x00007ffff7612520 0x00007ffff7757183 Yes /lib/x86_64-linux-gnu/libc.so.6
0x00007ffff73f0000 0x00007ffff73f13e7 Yes
/software/nbdkit/plugins/python/.libs/nbdkit-python-plugin.so
0x00007ffff6ed70c0 0x00007ffff7065782 Yes (*)
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
0x00007ffff6c73e00 0x00007ffff6c83bf8 Yes /lib/x86_64-linux-gnu/libz.so.1
0x00007ffff6a6ff10 0x00007ffff6a70804 Yes /lib/x86_64-linux-gnu/libutil.so.1
0x00007ffff676e610 0x00007ffff67dd056 Yes /lib/x86_64-linux-gnu/libm.so.6
(*): Shared library is missing debugging information.
(gdb) c
Continuing.
[New Thread 0x7ffff62eb700 (LWP 24498)]
[Thread 0x7ffff62eb700 (LWP 24498) exited]
[New Thread 0x7ffff62eb700 (LWP 24516)]
[Thread 0x7ffff62eb700 (LWP 24516) exited]
[New Thread 0x7ffff62eb700 (LWP 24519)]
[Thread 0x7ffff62eb700 (LWP 24519) exited]
[New Thread 0x7ffff62eb700 (LWP 24536)]
[Thread 0x7ffff62eb700 (LWP 24536) exited]
[New Thread 0x7ffff62eb700 (LWP 24549)]
[Thread 0x7ffff62eb700 (LWP 24549) exited]
[New Thread 0x7ffff62eb700 (LWP 24561)]
[New Thread 0x7ffff4c26700 (LWP 24580)]
nbdkit: python[6]: error: ssd-ftl-wearleveling.py: callback failed: close
[Thread 0x7ffff62eb700 (LWP 24561) exited]
nbdkit: python[7]: error: invalid flush request: expecting offset and length == 0
nbdkit: python[7]: error: write reply: Connection reset by peer
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff4c26700 (LWP 24580)]
0x00007ffff7015306 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
(gdb) bt
#0 0x00007ffff7015306 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#1 0x00007ffff7018059 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#2 0x00007ffff7018059 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#3 0x00007ffff701954d in PyEval_EvalCodeEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#4 0x00007ffff7017dd8 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#5 0x00007ffff701954d in PyEval_EvalCodeEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#6 0x00007ffff704e6d0 in ?? () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#7 0x00007ffff6fbad43 in PyObject_Call () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#8 0x00007ffff7033577 in PyEval_CallObjectWithKeywords () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#9 0x00007ffff73f0af0 in py_close (handle=0x625338) at python.c:234
#10 0x0000000000405803 in plugin_close (conn=conn@entry=0x7fffe8000910) at plugins.c:377
#11 0x00000000004037ec in free_connection (conn=0x7fffe8000910) at connections.c:147
#12 0x0000000000404476 in _handle_single_connection (sockout=<optimized out>,
sockin=<optimized out>) at connections.c:99
#13 handle_single_connection (sockin=<optimized out>, sockout=<optimized
out>) at connections.c:109
#14 0x0000000000405e73 in start_thread (datav=0x7fffffffdd40) at sockets.c:220
#15 0x00007ffff79c0184 in start_thread (arg=0x7ffff4c26700) at pthread_create.c:312
#16 0x00007ffff76ed37d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
And another one... my close function pickles all python objects which
were accessed in the plugin. Maybe that's what exposes the memory
corruption on close.
(gdb) run -n -f /software/nbdkit/plugins/python/.libs/nbdkit-python-plugin.so
script=/software/nbdkit/plugins/python/ssd-ftl-wearleveling.py disksize=52428800
Starting program: /software/nbdkit/src/nbdkit -n -f
/software/nbdkit/plugins/python/.libs/nbdkit-python-plugin.so
script=/software/nbdkit/plugins/python/ssd-ftl-wearleveling.py disksize=52428800
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff62eb700 (LWP 24790)]
[Thread 0x7ffff62eb700 (LWP 24790) exited]
[New Thread 0x7ffff62eb700 (LWP 24802)]
[Thread 0x7ffff62eb700 (LWP 24802) exited]
[New Thread 0x7ffff62eb700 (LWP 24804)]
[Thread 0x7ffff62eb700 (LWP 24804) exited]
[New Thread 0x7ffff62eb700 (LWP 24822)]
[Thread 0x7ffff62eb700 (LWP 24822) exited]
[New Thread 0x7ffff62eb700 (LWP 24836)]
[Thread 0x7ffff62eb700 (LWP 24836) exited]
[New Thread 0x7ffff62eb700 (LWP 24848)]
[New Thread 0x7ffff4bea700 (LWP 24984)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff62eb700 (LWP 24848)]
0x00007ffff6fd19cc in PyObject_GetAttr () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
(gdb) bt
#0 0x00007ffff6fd19cc in PyObject_GetAttr () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#1 0x00007ffff7015b11 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#2 0x00007ffff701954d in PyEval_EvalCodeEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#3 0x00007ffff7017dd8 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#4 0x00007ffff7018059 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#5 0x00007ffff7018059 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#6 0x00007ffff7018059 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#7 0x00007ffff7018059 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#8 0x00007ffff7018059 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#9 0x00007ffff7018059 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#10 0x00007ffff7018059 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#11 0x00007ffff701954d in PyEval_EvalCodeEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#12 0x00007ffff7017dd8 in PyEval_EvalFrameEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#13 0x00007ffff701954d in PyEval_EvalCodeEx () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#14 0x00007ffff704e6d0 in ?? () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#15 0x00007ffff6fbad43 in PyObject_Call () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#16 0x00007ffff7033577 in PyEval_CallObjectWithKeywords () from
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#17 0x00007ffff73f0af0 in py_close (handle=0x625338) at python.c:234
#18 0x0000000000405803 in plugin_close (conn=conn@entry=0x7ffff0000910) at plugins.c:377
#19 0x00000000004037ec in free_connection (conn=0x7ffff0000910) at connections.c:147
#20 0x00000000004040f6 in _handle_single_connection (sockout=<optimized out>,
sockin=<optimized out>) at connections.c:95
#21 handle_single_connection (sockin=<optimized out>, sockout=<optimized out>)
at connections.c:109
#22 0x0000000000405e73 in start_thread (datav=0x7fffffffdd40) at sockets.c:220
#23 0x00007ffff79c0184 in start_thread (arg=0x7ffff62eb700) at pthread_create.c:312
#24 0x00007ffff76ed37d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Will try instrumenting the close function next.
Regards,
Carl-Daniel
>> Examples of nbdkit logs for running "make check" or
subsets of it from
>> nbd against nbdkit:
>>
>> nbdkit -n -f python script=ssd-ftl-wearleveling.py disksize=52428800
>> nbdkit: python[6]: error: ssd-ftl-wearleveling.py: callback failed: close
>> nbdkit: python[7]: error: invalid flush request: expecting offset and
>> length == 0
>> nbdkit: python[7]: error: write reply: Connection reset by peer
>> Segmentation fault
>>
>> nbdkit -n -f python script=ssd-ftl-wearleveling.py disksize=52428800
>> Fatal Python error: GC object already tracked
>> Segmentation fault
>>
>> nbdkit -n -f python script=ssd-ftl-wearleveling.py disksize=52428800
>> Segmentation fault
>>
>> nbdkit -n -f python script=ssd-ftl-wearleveling.py disksize=52428800
>> nbdkit: python[1]: error: invalid flush request: expecting offset and
>> length == 0
>> nbdkit: python[1]: error: write reply: Broken pipe
>> nbdkit: python[3]: error: client sent NBD_OPT_ABORT to abort the connection
>> nbdkit: python[6]: error: ssd-ftl-wearleveling.py: callback failed: close
>> Traceback (most recent call last):
>> File "ssd-ftl-wearleveling.py", line 221, in close
>> pickle.dump(lbahist, f)
>> File "/usr/lib/python2.7/pickle.py", line 1370, in dump
>> Segmentation fault (core dumped)
>>
>> My close function just dumps the disk image and write access history
>> with python pickle. It does not have any return statement, and because
>> of that I seriously doubt that said close function can fail.
>>
>> I'm now trying to replicate this with example.py. Will report back.