On Thu, 2011-11-10 at 13:30 +0000, Richard W.M. Jones wrote:
On Thu, Nov 10, 2011 at 12:16:20PM +0000, Mark McLoughlin wrote:
> Hey Rich,
>
> On Wed, 2011-11-09 at 18:23 +0000, Richard W.M. Jones wrote:
> > At the moment OpenStack uses kpartx and nbd to resize filesystems and
> > inject files to guests. I sincerely hope they don't allow untrusted
> > users to upload guest images / AMIs :-(
>
> I'm not saying the current situation is ideal, but could you talk me
> through exactly what the concerns are with what OpenStack is currently
> doing with potentially untrusted images?
A lot of what I say below does concern untrusted images. I don't know
to what extent clouds allow untrusted images to be uploaded. They can
be on EC2, but I don't know about OpenStack.
> Is it this one?
>
> http://libguestfs.org/guestfs.3.html#security_of_mounting_filesystems
>
> "there are very many filesystem drivers in the kernel, and many of
> them are infrequently used and not much developer attention has been
> paid to the code. Linux userspace helps potential crackers by
> detecting the filesystem type and automatically choosing the right
> VFS driver, even if that filesystem type is obscure or unexpected for
> the administrator."
If I'm reading the code in nova/virt/disk.py correctly, what happens
now is this command is invoked:
mount <kpartx_device> <tmpdir>
In this scenario (no -t option) libblkid is invoked to examine the
device. It suggests a filesystem type (e.g. "minix", "ext2", ...) and
mount then behaves as if it had been invoked with that type:
mount -t <suggested_type> <kpartx_device> <tmpdir>
*Any* filesystem that can be identified by libblkid and is compiled
into the current host kernel is fair game. (I counted about 30
filesystem types supported by Fedora, including such common ones as
Amiga Fast Filesystem, and the BeOS Filesystem).
A bug in the mount or other code in any of these could be a host
kernel exploit. Not just theoretical either: RHBZ#635266 would allow
at least a DoS attack if any of the utilities that OpenStack runs does
a 'statfs/statvfs' call on the filesystem. (There's a patch for this
bug but it's not included in Linux 3.1.)
> I guess passing e.g. '-t ext2,ext3' to the mount command would mitigate
> this?
Yes, they need to choose a list of filesystems which are well enough
tested that bugs won't exist.
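
As a rough illustration of the '-t' restriction discussed above, here is a minimal Python sketch. It is not the code in nova/virt/disk.py: the function names and the ext2/ext3 allowlist are assumptions taken from the suggestion in this thread, and the right list would have to be chosen by the deployer.

```python
import subprocess

# Assumption: the allowlist suggested in the thread; adjust as needed.
ALLOWED_FS_TYPES = ("ext2", "ext3")


def build_mount_argv(device, mountpoint, fs_types=ALLOWED_FS_TYPES):
    """Build a mount command line that names the filesystem types explicitly.

    With -t given, mount only tries the listed types, so an obscure driver
    (minix, affs, befs, ...) is never picked just because the image asks.
    """
    if not fs_types:
        raise ValueError("refusing to mount without an explicit type list")
    for fs_type in fs_types:
        # Keep the list to plain type names; no commas or option syntax.
        if not fs_type.isalnum():
            raise ValueError("suspicious filesystem type: %r" % (fs_type,))
    return ["mount", "-t", ",".join(fs_types), device, mountpoint]


def mount_restricted(device, mountpoint, fs_types=ALLOWED_FS_TYPES):
    """Hypothetical wrapper: run the restricted mount (needs root)."""
    subprocess.check_call(build_mount_argv(device, mountpoint, fs_types))
```

This only narrows the set of kernel drivers exposed to the image. A bug in the allowed drivers themselves would still be reachable.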
Hmmm.
Really the way to avoid exploits is to run libguestfs over libvirt
over sVirt. We don't yet have this working, although IBM are looking
into it. But they're still in a much better position using libguestfs
because having another kernel and qemu gives them some isolation.
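
For comparison, file injection through the libguestfs Python bindings might look roughly like the sketch below. It assumes the "guestfs" module is installed and that the image is in raw format. The helper names and the ext2/ext3 allowlist are hypothetical, and this is plain libguestfs, not the libvirt/sVirt setup mentioned above.

```python
# Assumption: same allowlist as suggested in the thread.
ALLOWED_FS_TYPES = frozenset(["ext2", "ext3"])


def pick_allowed_filesystems(filesystems, allowed=ALLOWED_FS_TYPES):
    """Return the devices whose detected type is in the allowlist, sorted.

    `filesystems` is a device -> type mapping (or a list of pairs), as
    returned by GuestFS.list_filesystems().
    """
    return sorted(dev for dev, fstype in dict(filesystems).items()
                  if fstype in allowed)


def inject_file(image_path, guest_path, content, image_format="raw"):
    """Hypothetical helper: write `content` to `guest_path` in the image."""
    import guestfs  # imported lazily so the helper above works without it

    g = guestfs.GuestFS(python_return_dict=True)
    try:
        # Naming the format explicitly avoids qemu's format auto-detection.
        g.add_drive_opts(image_path, format=image_format, readonly=0)
        # The filesystem is handled by a separate appliance kernel under
        # qemu, not by the host kernel.
        g.launch()
        candidates = pick_allowed_filesystems(g.list_filesystems())
        if len(candidates) != 1:
            raise RuntimeError(
                "expected exactly one allowed filesystem, got %r"
                % (candidates,))
        g.mount(candidates[0], "/")
        g.write(guest_path, content)
        g.umount_all()
        g.shutdown()
    finally:
        g.close()
```

No kpartx, loop device or host-side mount is involved here, so there is nothing to clean up as root on the host if the call fails.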
> Any other glaring issues with what it's doing?
kpartx runs as root and creates DM linear mappings. That's a "needs
root" / "clean up" problem already.
But also kpartx will happily create up to 256 linear mappings from one
device, and it has code to support IBM DASD, Solaris, Mac, UnixWare
and other schemes that wouldn't be found on common guests. I bet you
one Italian Euro that there's an exploit to be found in there.
resize2fs is also run (as non-root, I think) on the untrusted
filesystem, and that would be my next place to look for problems.
Thanks for all that Rich. My takeaways are:
  1) The current file injection and disk resizing code in OpenStack
     doesn't provide sufficient protection against the possibility of
     users exploiting vulnerabilities in the kernel or core OS userspace
     utilities.

     However, there's no known vulnerability here that needs an urgent
     response (e.g. filing a CVE) - i.e. it's not like the issue with
     using qemu's disk format auto-detection.

  2) Restricting the set of guest filesystems we support would
     eliminate one of the most likely sources of potential
     vulnerabilities.

  3) Using libguestfs (and later, using it over libvirt/sVirt) would
     provide much greater protection along with the potential to
     support things like LVM inside guest images.

Thanks,
Mark.