[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd





On 07/05/2012 10:51 AM, Kevin Wolf wrote:
Am 05.07.2012 16:22, schrieb Corey Bryant:
For some examples:

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
now 0 so qemu closes it

Doesn't the refcount belong to an fdset rather than a single fd? As long
as the fdset has still a reference, it's possible to reach the other fds
in the set through a reopen. For example:

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
of fdset1, in use by monitor, refcount 1
2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
member of fdset, in use by monitor, refcount is still 1
3. client calls 'device-add' with /dev/fdset/1 as the backing filename
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
4. client crashes, so all tracked fdsets are visited; fdset1 gets its
refcount decreased to 1, and both member fds 4 and 5 stay open.

If you had refcounted a single fd, fd 5 would be closed now and qemu
couldn't switch to O_RDONLY any more.


If the O_RDONLY is for a future command, it would make more sense to add
that fd before that future command.  Or are you passing it at step 2
because it is needed for the probe re-open issue?

Come on, requiring realistic examples isn't fair. ;-)

Heh :)


Swap steps 2 and 3 in the example, then step 3 is just the preparation
for a command that uses the new fd. The problem stays the same. Or a
live commit operation could be in flight so that qemu must be able to
switch on completion without libvirt interaction.

Good point.

I thought it would be useful to run through the examples again with each fdset having a refcount, and each fd having an in-use flag. One difference though is that refcount is only incremented/decremented when qemu_open/qemu_close are called. I think this works and covers Kevin's concerns. Actually it might be a bit simpler too.

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so it's in-use flag is on; in-use flag is turned off and fd=4 is left open because it is still in use by the block device (refcount is 1) 4. client re-establishes QMP connection, so all tracked fds are visited, and in-use flags are turned back on; 'query-fds' lets client learn about fd=4 still being open as part of fdset1

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by the block device (refcount is 1) 4. client crashes, so all tracked fds are visited; fd=4 is not in-use but refcount is 1 so it is not closed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 0
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu marks fd=4 as no longer in-use by the monitor; refcount is 0 so all fds in fdset1 are closed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not yet been passed to 'remove-fd', so their in-use flags are on; in-use flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in fdset1

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with refcount still 0; fd=5's in-use flag is turned on
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and fd=4 is closed since the refcount is 0; fd=5 remains open

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client calls 'device-add' with /dev/fdset/1 as the backing filename and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1
4. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, but fd=4 remains open because refcount of fdset1 is 1
5. client calls 'remove-fd fdset=1 fd=5', so qemu marks fd=5 as no
longer in-use by the monitor, and fd=5 remains open because refcount of fdset1 is 1 6. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both fds are closed since refcount is 0 and their in-use flags are off

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 with refcount of 0; fd=4's in-use flag is turned on 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 with refcount still 0; fd=5's in-use flag is turned on 3. client calls 'device-add' with /dev/fdset/1 as the backing filename and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1 4. client crashes, so all tracked fds are visited; fd=4 and fd=5 have not yet been passed to 'remove-fd', so their in-use flags are on; in-use flags are turned off and both fds are left open because the set is still in use by the block device (refcount is 1) 5. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both fds are closed since refcount is 0 and their in-use flags are off


There are enough reasons that an fd in an fdset exists and is not
opened, but I can't think of one why it should be dropped.

In use by whom? If it's still in use in qemu (as in "in-use flag would
be set") and we have a refcount of zero, then that's a bug.


In use by qemu.  I don't think it's a bug.  I think there are situations
where refcount gets to zero but qemu is still using the fd.

I think the refcount being non-zero _is_ what defines an fd as being in
use by qemu (including use by the monitor).  Any place you have to close
an fd before reopening it is dangerous; the safe way is always to open
with the new permissions before closing the old permissions.

There is one case I'm aware of where we need to be careful: Before
opening an image, qemu may probe the format. In this case, the image
gets opened twice, and the first close comes before the second open. I'm
not entirely sure how hard it would be to get rid of that behaviour.

If we can't get rid of it, we have a small window that the refcount
doesn't really cover, and if we weren't careful it would become racy.
This is why I mentioned earlier that maybe we need to defer the refcount
decrease a bit. However, I can't see how the in-use flag would make a
difference there. If the refcount can't cover it, the in-use flag can't
either.

Yeah this is a problem.  Could we introduce another flag to cover this?

Adding more refcounts or flags is not a problem, but it doesn't solve it
either. The hard question is when to set that flag.

I believe it may be easier to just change the block layer so that it
opens files only once during bdrv_open().

Kevin


--
Regards,
Corey



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]