On Sat, Nov 07, 2015 at 12:21:29AM -0600, Jason Pepas wrote:
Hi,
So I've been hacking together an nbdkit plugin (similar to the "file"
plugin, but it splits the file up into chunks):
https://github.com/pepaslabs/nbdkit-chunks-plugin
I got it to the point of being a working prototype. Then I threw it
onto a raspberry pi, which it turns out only has a 50/50 shot of
fallocate() working correctly.
I'm checking the return code of fallocate(), and my chunks_pwrite()
returns -1 if it fails. No problems there.
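(For reference, the error path looks roughly like this. It's a minimal
sketch rather than the plugin's actual code; ensure_chunk(), the path
handling, and CHUNK_SIZE are illustrative assumptions:)

/* Sketch of creating a chunk file on first write.  fallocate(2)
 * returns -1 and sets errno on failure; kernels or filesystems
 * without support fail with EOPNOTSUPP.  (posix_fallocate(3) would
 * fall back to writing zeroes instead.) */
#define _GNU_SOURCE
#include <stdio.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define CHUNK_SIZE 262144        /* 256 KiB, as in the ls output below */

static int
ensure_chunk (const char *path)
{
  int fd = open (path, O_RDWR | O_CREAT, 0600);
  if (fd == -1)
    return -1;

  if (fallocate (fd, 0, 0, CHUNK_SIZE) == -1) {
    int err = errno;
    fprintf (stderr, "error: Unable to fallocate '%s'\n", path);
    close (fd);
    errno = err;                 /* preserve errno for the caller */
    return -1;                   /* the pwrite callback then returns -1 */
  }

  close (fd);
  return 0;
}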
When I run mkfs.ext2 /dev/nbd0 on the client, I see this on the nbd-server:
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030723'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030724'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030725'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030726'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030727'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000030728'
nbdkit: chunks[1]: error: Unable to fallocate
'/home/cell/nbds/default/chunks/00000000000000031232'
Indeed, there is definitely a problem with fallocate, as some of the
chunks are the correct size (256k), and some are zero length:
cell@pi1$ pwd
/home/cell/nbds/default/chunks
cell@pi1$ ls -l | tail
-rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032256
-rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032257
-rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032258
-rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032259
-rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032260
-rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032261
-rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032262
-rw------- 1 cell cell 262144 Nov 7 06:01 00000000000000032263
-rw------- 1 cell cell 0 Nov 7 06:01 00000000000000032264
-rw------- 1 cell cell 0 Nov 7 06:01 00000000000000032767
But the fallocate failure itself isn't my real concern. The problem is
that, alarmingly, mkfs.ext2 isn't fazed by this at all:
root@debian:~# nbd-client pi1 10809 /dev/nbd0
Negotiation: ..size = 8192MB
bs=1024, sz=8589934592 bytes
root@debian:~# mkfs.ext2 /dev/nbd0
mke2fs 1.42.12 (29-Aug-2014)
Creating filesystem with 2097152 4k blocks and 524288 inodes
Filesystem UUID: 2230269c-6d2a-4927-93df-d9dd9f4fa40c
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
root@debian:~#
However, the nbd-client's dmesg is chock full of errors:
[ 9832.409219] block nbd0: Other side returned error (22)
[ 9832.457401] block nbd0: Other side returned error (22)
[ 9832.503100] block nbd0: Other side returned error (22)
[ 9832.542457] block nbd0: Other side returned error (22)
[ 9832.590394] block nbd0: Other side returned error (22)
[ 9832.642393] block nbd0: Other side returned error (22)
[ 9832.681455] block nbd0: Other side returned error (22)
[ 9832.721355] block nbd0: Other side returned error (22)
[ 9832.722676] quiet_error: 15129 callbacks suppressed
[ 9832.722679] Buffer I/O error on device nbd0, logical block 6293248
[ 9832.724274] lost page write due to I/O error on nbd0
[ 9832.724282] Buffer I/O error on device nbd0, logical block 6293249
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293250
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293251
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293252
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293253
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293254
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293255
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293256
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.725110] Buffer I/O error on device nbd0, logical block 6293257
[ 9832.725110] lost page write due to I/O error on nbd0
[ 9832.743111] block nbd0: Other side returned error (22)
[ 9832.744420] blk_update_request: 125 callbacks suppressed
[ 9832.744422] end_request: I/O error, dev nbd0, sector 12587008
[ 9832.758203] block nbd0: Other side returned error (22)
[ 9832.759513] end_request: I/O error, dev nbd0, sector 12845056
[ 9832.777635] block nbd0: Other side returned error (22)
[ 9832.779511] end_request: I/O error, dev nbd0, sector 12849160
[ 9832.805950] block nbd0: Other side returned error (22)
[ 9832.810278] end_request: I/O error, dev nbd0, sector 12849416
[ 9832.846880] block nbd0: Other side returned error (22)
So, my question/concern is: how is it that the nbd-client's kernel is
correctly detecting massive I/O errors (error 22 is EINVAL), but
apparently not passing them through to mkfs.ext2?
It's definitely not good, but I don't think it can be nbdkit, since
nbd-client is seeing the errors.
Or perhaps mkfs.ext2 doesn't check for I/O errors? That's a bit hard
to believe...
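[Aside: one possible explanation -- an assumption here, not something
verified in this thread -- is that mkfs writes through the page cache,
so each write() "succeeds" immediately and the device error only
surfaces during asynchronous writeback (hence the "lost page write"
messages above); an application only sees it by checking the return of
fsync()/close(). A minimal sketch of that behavior, assuming /dev/nbd0
is backed by a failing server:]

/* Buffered write() to a block device usually just dirties the page
 * cache and returns success; a deferred writeback error is reported
 * at fsync() or close(), typically as EIO. */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int
main (void)
{
  char buf[4096];
  memset (buf, 0xaa, sizeof buf);

  int fd = open ("/dev/nbd0", O_WRONLY);
  if (fd == -1) { perror ("open"); return 1; }

  if (write (fd, buf, sizeof buf) == -1)  /* usually "succeeds" */
    perror ("write");

  if (fsync (fd) == -1)          /* deferred error shows up here */
    perror ("fsync");
  if (close (fd) == -1)
    perror ("close");
  return 0;
}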
How about running 'strace mkfs.ext2 ..' to see if any system calls are
returning errors? That would show you whether nbd-client is throwing
the errors away, or whether mkfs is getting the errors and ignoring
them (seems pretty unlikely, but you never know).
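Something like this (the exact flag set is illustrative):

  strace -f -e trace=write,pwrite64,fsync,close mkfs.ext2 /dev/nbd0 2>&1 | grep '= -1'

Any error that actually reaches mkfs would show up as a syscall
returning -1 with EIO (or similar).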
After that, it'd be down to tracing where the errors end up in the
kernel.
Rich.
Anyway, I'm sure someone on this list has run into similar issues, so I
thought I'd reach out before I go too far down a rabbit hole.
Thanks,
Jason Pepas
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top