I managed to get a reproduction with udevd --debug.
I just waited for g.inspect_os() (per g.launch()) to fail return anything
(/dev/rhel/root symlink doesn't exist) in 32 concurrent executions
Attached output (~300KB).
Thanks!
Sam
On Tue, Apr 6, 2021 at 7:29 PM Sam Eiderman <sameid(a)google.com> wrote:
FYI, in debian the rules are a bit in a different flavor:
01-md-raid-creating.rules
50-firmware.rules
50-udev-default.rules
55-dm.rules
56-lvm.rules
60-block.rules
60-cdrom_id.rules
60-drm.rules
60-evdev.rules
60-input-id.rules
60-persistent-alsa.rules
60-persistent-input.rules
60-persistent-storage-dm.rules
60-persistent-storage-tape.rules
60-persistent-storage.rules
60-persistent-v4l.rules
60-sensor.rules
60-serial.rules
63-md-raid-arrays.rules
64-btrfs-dm.rules
64-btrfs.rules
64-md-raid-assembly.rules
69-lvm-metad.rules
69-md-clustered-confirm-device.rules
70-joystick.rules
70-mouse.rules
70-power-switch.rules
70-touchpad.rules
70-uaccess.rules
71-seat.rules
73-seat-late.rules
73-special-net-names.rules
73-usb-net-by-mac.rules
75-net-description.rules
75-probe_mtd.rules
78-sound-card.rules
80-debian-compat.rules
80-drivers.rules
80-net-setup-link.rules
82-gfs2-withdraw.rules
85-hwclock.rules
95-dm-notify.rules
99-systemd.rules
So maybe there is some difference here.
Sam
On Mon, Apr 5, 2021 at 9:31 PM Richard W.M. Jones <rjones(a)redhat.com>
wrote:
> On Mon, Apr 05, 2021 at 05:49:17PM +0300, Sam Eiderman wrote:
> > Yes, attached below (it's quite long)
>
> Thanks - just looking at the lines where the /dev/rhel/swap
> device appears or its various synonyms:
>
> > 2021-03-07 10:58:46 T libguestfs - 0 - appliance - drwxr-xr-x 2 0 0 60
> Mar 7 10:58 rhel
> [...]
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - /dev/mapper:
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - total 0
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - crw------- 1 0 0 10,
> 236 Mar
> > 7 10:58 control
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - lrwxrwxrwx 1 0 0
> 7 Mar
> > 7 10:58 rhel-root -> ../dm-2
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - lrwxrwxrwx 1 0 0
> 7 Mar
> > 7 10:58 rhel-swap -> ../dm-1
>
> This is fairly early on during appliance boot where the init script is
> listing out /dev/ recursively
> (
>
https://github.com/libguestfs/libguestfs/blob/b18ac489db76a700f2168ae6eb6...
> ).
> This is also after we did udevadm settle in the init script. It's
> interesting that at this point /dev/mapper/rhel-swap exists correctly,
> but /dev/rhel/ is an empty directory.
>
> That's very much not expected ...
>
> A bit later in the init script we list out PVs/VGs/LVs and everything
> looks normal:
>
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - + lvm pvs
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - PV VG
> Fmt
> > Attr PSize PFree
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - /dev/sda2 rhel
> lvm2
> > a-- <15.00g 0
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - /dev/sdb vg_myvg
> lvm2
> > a-- 496.00m 0
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - + lvm vgs
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - VG #PV #LV
> #SN Attr
> > VSize VFree
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - rhel 1 2
> 0 wz--n-
> > <15.00g 0
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - vg_myvg 1 1
> 0 wz--n-
> > 496.00m 0
> > 2021-03-07 10:58:47 T libguestfs - 0 - appliance - + lvm lvs
> > 2021-03-07 10:58:48 T libguestfs - 0 - appliance - LV VG
> Attr
> > LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
> > 2021-03-07 10:58:48 T libguestfs - 0 - appliance - root rhel
> -wi-a-----
> > 13.39g
> > 2021-03-07 10:58:48 T libguestfs - 0 - appliance - swap rhel
> -wi-a-----
> > 1.60g
> > 2021-03-07 10:58:48 T libguestfs - 0 - appliance - lv_var vg_myvg
> -wi-a-----
> > 496.00m
>
> A little bit further on while we're executing the guestfs_inspect_os()
> API call, we have this snippet:
>
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: lvm 'lvs'
> '-o'
> > 'vg_name,lv_name' '-S' 'lv_role=public &&
lv_skip_activation!=yes'
> > '--noheadings' '--separator' '/'
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: lvm
> returned 0
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: lvm: stdout:
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - rhel/root
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - rhel/swap
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - vg_myvg/lv_var
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: blkid '-c'
> '/dev/
> > null' '-o' 'value' '-s' 'TYPE'
'/dev/rhel/root'
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: blkid
> returned 0
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: blkid:
> stdout:
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - xfs
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: blkid '-c'
> '/dev/
> > null' '-o' 'value' '-s' 'TYPE'
'/dev/rhel/swap'
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: blkid
> returned 2
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: blkid '-c'
> '/dev/
> > null' '-o' 'value' '-s' 'TYPE'
'/dev/vg_myvg/lv_var'
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: blkid
> returned 0
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - command: blkid:
> stdout:
> > 2021-03-07 10:58:49 T libguestfs - 0 - appliance - xfs
>
> If you look closely it's clear that /dev/rhel/root has been created,
> /dev/vg_myvg/lv_var has been created (that was also an empty directory
> earlier on), but /dev/rhel/swap has *not* been created--notice how the
> blkid command exits with code 2.
>
> And obviously by the time we get down here ...
>
> > 2021-03-07 10:58:53 T libguestfs - 0 - appliance - rhel/root
> > 2021-03-07 10:58:53 T libguestfs - 0 - appliance - rhel/swap
> > 2021-03-07 10:58:53 T libguestfs - 0 - appliance - vg_myvg/lv_var
> > 2021-03-07 10:58:53 T libguestfs - 0 - appliance - guestfsd: error:
> stat: /dev/
> > rhel/swap: No such file or directory
>
> ... it still doesn't exist.
>
> We ran a udevadm --debug settle command before this, so if it wasn't
> going to be created because of a delayed udev event then I'd expect it
> should have done.
>
> I'm not much closer to understanding why the /dev/VG/LV nodes are
> created so much later/delayed than the /dev/mapper/VG-LV nodes. As
> far as I can tell they should be created by udev
> (/usr/lib/udev/rules.d/11-dm-lvm.rules in Fedora).
>
> Would suggest enabling udevadm debugging (uncomment #--debug in
> appliance/init).
>
> It would also be good to have some more concrete reproducer: you might
> have appliance/init kill PID 1 if /dev/rhel/swap does not exist, then
> repeatedly run g.launch() to see if you can make it fail, all the time
> while trying to capture the log of the failure with udev debugging
> turned on.
>
> Rich.
>
> --
> Richard Jones, Virtualization Group, Red Hat
>
http://people.redhat.com/~rjones
> Read my programming and virtualization blog:
http://rwmj.wordpress.com
> virt-df lists disk usage of guests without needing to install any
> software inside the virtual machine. Supports Linux and Windows.
>
http://people.redhat.com/~rjones/virt-df/
>
>