On Tue, Aug 30, 2022 at 05:13:46PM +0100, Richard W.M. Jones wrote:
On Tue, Aug 30, 2022 at 11:29:26PM +0800, Ming Lei wrote:
> On Tue, Aug 30, 2022 at 03:38:50PM +0100, Richard W.M. Jones wrote:
> > On Tue, Aug 30, 2022 at 03:12:23PM +0800, Ming Lei wrote:
> > > The patch sent in last email may cause io hang on MQ, and follows the fixed
> > > version:
> >
> > I split this into two commits and cleaned them up and posted them here:
> >
> > https://gitlab.com/rwmjones/libnbd/-/commits/nbdublk/
> >
> > Unfortunately this doesn't work for me. When I do various filesystem
> > operations like git clone and a compile I see some subtle disk errors
> > and eventually it deadlocks, so I guess there is some problem.
>
> OK, care to provide more details about the reproducer? Like how backend
> is setup, MQ/SQ is used, disk size, ...
My test script is attached. $1 == "ublk".

It basically just clones a Linux repo and compiles it. It hangs
either during the clone or early in the build, and there are various
"scary messages" from git which might indicate disk corruption.

The NBD server is:

  nbdkit -f memory 24G

running on the hypervisor ("nbd://pick").
> I have cloned linux kernel source tree on nbdublk disk and built it with
> fedora 36 config for ~20min, so far so good. In my setting, backend is
> 'nbdkit file /dev/sda(virtio-scsi)', nbdublk is single queue.
Can you see if you can reproduce a hang with the source from:

  https://gitlab.com/rwmjones/libnbd/-/commits/nbdublk/

I may have made a mistake when rebasing your patch or fixing it up to
remove compiler warnings.
My test used your tree directly, and I compared it with my native
tree; they are basically the same.
Today I will set up and run the test using your approach.
Thanks,
Ming