Re: [libvirt] dhcp6, radvd, ip6tables, etc. (update)

On 10/29/2012 04:08 PM, Laine Stump wrote:
On 10/29/2012 08:26 AM, Gene Czarcinski wrote:
On 10/27/2012 03:18 PM, Gene Czarcinski wrote:
OK, I have the basic implementation for libvirt support of dhcp6. Let
me say again that 98% of the work was already done.  There is still a
bunch of work to do, which includes writing some tests, understanding
how things such as bootp, dhcp-host, etc. should be supported with
dhcp6, as well as the items I discuss below.

1.  Right now, the only way that dhcp6 is in effect is if there is no
dhcp4 range definition.  This will be fixed/expanded so that, at a
minimum, you can have both a dhcp4 and dhcp6 on the same interface.
However, it appears to be easier to just pass to dnsmasq ANY/EVERY
dhcp4 range or dhcp6 range defined in the xml.

Comments?  Any input on which approach to use or avoid?
For the current situation, the implementation is for one (the first)
IPv4 dhcp and one (the first) IPv6 dhcp.  This introduces enough
little gotchas that need to be worked out.
I think that is the proper thing to do for now. As discussed earlier,
before supporting dhcp on multiple subnets of the same protocol (ipv6 vs
ipv4) we would need to decide why and how we want to do that - IPs
assigned from different subnets need to be matched with the IP address
of that subnet, and it will take a more complicated dnsmasq commandline
to do that, iirc.
I cannot think of a good reason to have multiple IPv4 or IPv6 dhcp-range specifications. Some day, someone will come up with a good reason but, for right now, I believe that one IPv4 and one IPv6 dhcp-range specification is one of those "good enough" answers.
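
A minimal sketch of the "first IPv4 range plus first IPv6 range" selection, in Python rather than libvirt's C. The function name and the dict layout for each <ip> definition are hypothetical, and this is not the code in the patch. The only dnsmasq detail relied on is that an IPv6 dhcp-range may carry a prefix length as an extra field.

```python
import ipaddress


def first_dhcp_ranges(ip_defs):
    """Return dnsmasq dhcp-range lines for the first IPv4 and first IPv6 range.

    ip_defs is a hypothetical list of dicts, one per <ip> element, e.g.
    {"start": "192.168.122.2", "end": "192.168.122.254", "prefix": 24}.
    Definitions without a range (no "start" key) are skipped.
    """
    chosen = {}  # IP version (4 or 6) -> config line
    for ip in ip_defs:
        if "start" not in ip:
            continue
        version = ipaddress.ip_address(ip["start"]).version
        if version in chosen:
            continue  # only the first range per protocol is used
        if version == 6:
            # For IPv6, dnsmasq accepts the prefix length as an extra field.
            line = f"dhcp-range={ip['start']},{ip['end']},{ip['prefix']}"
        else:
            line = f"dhcp-range={ip['start']},{ip['end']}"
        chosen[version] = line
    # Emit IPv4 first, then IPv6, whichever of them exist.
    return [chosen[v] for v in (4, 6) if v in chosen]
```

Any second range of the same protocol is silently ignored here; a real implementation would probably want to log or reject it instead.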

2.  I have modified radvd so both stateful (dhcp6) and stateless
(SLAAC) addressing are supported with radvd for the default route.
This is done on an interface basis (that is the way it works).  So if
any dhcp6 range is specified, then stateful is used.  The way this is
implemented will make it easy to add some tests verifying that the
configuration parameters are working.  I intend this to be an
expansion to networkxml2argvtest since it has the xml specification
files which determine both dnsmasq and radvd configuration parameters.

NC ... working fine.
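
As an illustration of the stateful/stateless switch in item 2, here is a rough Python sketch that renders a radvd interface stanza. The function name, the bridge name and the prefix are made up for the example, and the exact set of options the patch writes may differ. The sketch assumes that stateful mode sets the managed flag and turns off autonomous (SLAAC) address configuration for the prefix.

```python
def radvd_interface_conf(ifname, prefix, stateful):
    """Render a radvd.conf interface stanza (hypothetical helper).

    stateful=True  -> guests are told to use DHCPv6 for addresses
    stateful=False -> guests autoconfigure addresses via SLAAC
    """
    def onoff(flag):
        return "on" if flag else "off"

    lines = [
        f"interface {ifname}",
        "{",
        "  AdvSendAdvert on;",
        f"  AdvManagedFlag {onoff(stateful)};",
        f"  AdvOtherConfigFlag {onoff(stateful)};",
        f"  prefix {prefix}",
        "  {",
        "    AdvOnLink on;",
        # SLAAC is only wanted when no dhcp6 range is defined.
        f"    AdvAutonomous {onoff(not stateful)};",
        "  };",
        "};",
    ]
    return "\n".join(lines) + "\n"
```

In both modes radvd still sends the router advertisement, which is where the guest gets its default route.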
3. After completing what I thought was code that should result in a
guest getting dhcp6 addresses, it was not working.  Once more it took
me a little time to realize that ip6tables rules were blocking it. [I
have been down this path before, you would think I would realize the
problem sooner.]

3a. In looking over the ip6tables rules, I saw a whole bunch of
additions at the top of the INPUT chain which were accepts for
udp/tcp port 53.  In looking at the code in bridge_driver.c, I found
that, every time a network device was started, 3 FORWARD rules and 2
INPUT rules were added, but, when the network device was destroyed,
only the 3 FORWARD rules were removed.  I believe this is a bug (but
not high priority) and I will be submitting a separate patch to fix it.

3b. There are two different approaches for the rule which allows the
dhcp6 server to work.  I could add (actually insert) one rule to the
INPUT chain which accepted the packet if it is "-d ff02::1:2 --dport
547".  Or, I could add (insert) a rule specifying "-i virbr__" for
every IPv6 device which would be removed when the device was destroyed.
OBE - I chose the approach of adding (and removing) a rule per
interface.  The rule adds "--dport 547" but does NOT specify "-d
I haven't looked at how dhcp6 works, but if it's anything like dhcp4, the
IP address is irrelevant and shouldn't be included in the rule. As long
as your rule specifies both the interface and port, that should be fine
(take a look at the rules already being added for dhcp4) (and no, I have
absolutely no idea why we add a rule to allow *tcp* on the dhcp port.
It's just been that way since the first day I set eyes on the code).
Well, ff02::1:2 does have some meaning in dhcp6.
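
A sketch of the per-interface rule, assuming it matches only on the bridge interface and UDP destination port 547. The helper name is hypothetical and libvirt builds these rules in C in bridge_driver.c, not like this. Building the insert and delete commands from the same rule specification is one way to avoid the kind of leftover rules described in 3a.

```python
def dhcp6_input_rule(action, bridge):
    """Build an ip6tables argv for the per-interface DHCPv6 accept rule.

    action is "insert" (network start) or "delete" (network destroy).
    The rule body is identical for both, so the delete always matches
    the rule that was inserted.
    """
    flag = {"insert": "-I", "delete": "-D"}[action]
    return [
        "ip6tables", flag, "INPUT",
        "-i", bridge,          # restrict to this virtual network's bridge
        "-p", "udp",
        "--dport", "547",      # DHCPv6 server port
        "-j", "ACCEPT",
    ]
```

For example, `" ".join(dhcp6_input_rule("insert", "virbr0"))` gives the command line to run when the network starts, and the "delete" form undoes it.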

From what I have seen, "well behaved" clients always use port 546 and the server always uses port 547. But dnsmasq has some comments/code which indicate that not all clients are "well behaved."

In dhcp6, a little four-step dhcpv6 dance is performed to establish the client's address:

1. dhcpv6 solicit:  from=fe80::client:546  to=ff02::1:2:547
2. dhcpv6 advertise:  from=fe80::server:547  to=fe80::client:546
3. dhcpv6 request:  from=fe80::client:546  to=ff02::1:2:547
4. dhcpv6 reply:  from=fe80::server:547  to=fe80::client:546

Or, in other words: (1) need dhcpv6, (2) I serve it, (3) OK, give me one, and (4) here it is.
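
The four-message exchange can be written down as data to see which packets the host's INPUT chain has to let through. This is only a model for a well-behaved client; the tuple layout and function name are invented for the example. The message type numbers are the ones assigned by the DHCPv6 specification.

```python
from collections import namedtuple

Msg = namedtuple("Msg", "name msg_type src_port dst dst_port")

ALL_DHCP_AGENTS = "ff02::1:2"  # multicast group for DHCPv6 servers and relays
CLIENT_PORT, SERVER_PORT = 546, 547

# The exchange as seen with a well-behaved client.
EXCHANGE = [
    Msg("solicit", 1, CLIENT_PORT, ALL_DHCP_AGENTS, SERVER_PORT),
    Msg("advertise", 2, SERVER_PORT, "client link-local", CLIENT_PORT),
    Msg("request", 3, CLIENT_PORT, ALL_DHCP_AGENTS, SERVER_PORT),
    Msg("reply", 7, SERVER_PORT, "client link-local", CLIENT_PORT),
]


def host_must_accept(exchange):
    """Names of the messages that arrive at the server (dnsmasq) side."""
    return [m.name for m in exchange if m.dst_port == SERVER_PORT]
```

Only the solicit and request arrive at the host, both addressed to ff02::1:2 on port 547, which is why a rule on the interface plus "--dport 547" is enough.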

Since dnsmasq does its own packet filtering and with bind-interfaces having a real meaning, it all works (assuming that radvd has the right configuration).

This works. With the radvd configuration and a dhcp-range specified for
an ipv6 subnet, a guest will get a dhcp6 address and RA default route.
Interesting - so both radvd and dnsmasq are involved, correct?
Yes, why not? Sometime in the future this should be reconsidered and either everything is done by dnsmasq or it is at least an option. I must say that having dnsmasq do everything does have appeal ... one less dependency.

4.  After getting all of this working to my satisfaction, my next
mountain to climb is VM ... it really does not like network xml
definitions which include a dhcp-range for an ipv6 definition.


NOTE:  I am implementing all of this assuming that my previous
patches have been accepted ... the ones for creating a dnsmasq
conf-file for parameters rather than using the dnsmasq command-line.
I have no problem with the "convert from long commandline to conf file"
patch except for the bit that points to a "conf directory" where user
supplied conf files can be added. Aside from that part needing to be in
a separate patch, if we're going to add that kind of configurability, we
need to do it in a way that will allow us to easily see that the user is
playing outside the fence (otherwise we spend a lot of time chasing
"bugs" that end up being caused by user-supplied options).
Originally, I wanted the conf-dir so that I could pop in/out some configuration changes that would happen when dnsmasq re-read the configuration. Well, it does not work that way and dnsmasq has to be restarted for some of the more interesting changes. Given this, I believe that conf-dir serves no useful purpose and should be removed and I will remove it and resubmit the patch.
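
For reference, the command-line to conf-file conversion discussed above amounts to writing each long option on its own line without the leading "--". The sketch below assumes exactly that mapping and nothing more; the function name and the (name, value) input format are hypothetical and are not what the patch uses.

```python
def to_conf_file(options):
    """Turn dnsmasq long options into conf-file text.

    options is a list of (long_name, value) pairs; value is None for
    options that are plain flags.
    """
    lines = []
    for name, value in options:
        name = name.lstrip("-")  # conf-file entries drop the leading "--"
        lines.append(name if value is None else f"{name}={value}")
    return "\n".join(lines) + "\n"
```

So `[("--strict-order", None), ("--listen-address", "192.168.122.1")]` becomes a two-line file that dnsmasq can read via its conf-file option.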

Because we're in freeze right now I haven't spent a lot of time
discussing that, but planned to send a message about it when I get a minute.
Right now I am working on getting dhcpv6 functional and, while what I have works, there is still more to do.

I am sure that someone could spend the time refitting the dhcp6
patches to the old code but why get aggravated?  If you folks do not
want to do things that way, fine, please say so.  But if it is going
to be accepted, then I would like some indication of this.
5. As far as I can tell (or at least this is for dnsmasq),
"dhcp-no-override", "enable-tftp", "tftp-root=", and "dhcp-boot=" are
all IPv4 only and thus ignored for IPv6 in bridge_driver.  I have not
looked to see what network_conf.c does.
"what network_conf.c does"? Well, it of course doesn't deal directly
with those options, but the config that feeds into some of those options
is parsed in virNetworkIPParseXML(), and is only done if the <ip>
element is ipv4. But then you've already seen that code if you have dhcp
working for ipv6 - the <dhcp> element is also only parsed for ipv4. The
format-side code doesn't have that extra check; I guess I figured that
if there was no way to configure ipv6 with dhcp or tftp, it was safe to
assume any ip element with dhcp or tftp info was ipv4 anyway.
I have now dived into network_conf.c and a little into dnsmasq.c. Yet again I was surprised because most of what was needed for dhcpv6 was a little tweaking here and there.

To support dhcp-host for IPv6, I assumed that for IPv6 no MAC address would be specified since it does not have a defined meaning in DHCPv6. Therefore, in dnsmasq.c/hostsfileAdd(), if mac==NULL, I use the IPv6 format of <hostname>,[<ipv6-addr>] whereas for IPv4 it is either MAC,<ipv4-addr>,<hostname> or MAC,<ipv4-addr>.

This way most of the code works as is. Dnsmasq has lots of options and different ways that dhcp-host= can be specified, but this is simple and I know it works.
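
The formatting rule just described, as a small Python sketch. The real code is C in dnsmasq.c/hostsfileAdd(), so the function name and signature here are only for illustration.

```python
def dhcp_host_line(ip, mac=None, name=None):
    """Format one dhcp-hostsfile entry.

    No MAC means an IPv6 entry: <hostname>,[<ipv6-addr>]
    With a MAC it is an IPv4 entry: MAC,<ipv4-addr>[,<hostname>]
    """
    if mac is None:
        if name is None:
            raise ValueError("an IPv6 entry needs a hostname")
        return f"{name},[{ip}]"
    if name is not None:
        return f"{mac},{ip},{name}"
    return f"{mac},{ip}"
```

The square brackets around the IPv6 address are what lets dnsmasq tell the address apart from the other comma-separated fields.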

6.  Handling of the info for addn-hosts file and the dhcp-hostsfile.
This currently works because things are forced so that one and only
one IPv4 dhcp definition will be handled.  With the addition of IPv6
dhcp, things fall apart.

6a. addn-hosts:  The addn-hosts file is similar to the /etc/hosts file
in both form and function.  The <dns>-<host> specification is done on
an interface basis and, thus, the processing of the data and creation
of the file should only be done once.

6b. dhcp-hostsfile (dhcp-host=):  This needs to be done at least for
every ip definition that is processed for dhcp.  Initially, this will
be for dhcp4 only until I can figure out how to do it for dhcp6.

6c. Thus, networkBuildDnsmasqHostsfile() needs to be split into two
functions [one for addn-hosts and one for dhcp-hosts]. Additionally,
all the functions which call dnsmasqSave() need to be reworked.
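
To make the addn-hosts half of that split concrete, here is a sketch that reads the <dns> <host> elements once per network and produces /etc/hosts-style lines. It is Python with a hypothetical function name, not the libvirt implementation, and it assumes the existing XML form where each <host> carries an ip attribute and one or more <hostname> children.

```python
import xml.etree.ElementTree as ET


def addn_hosts_lines(network_xml):
    """Return addn-hosts lines ("<ip> <name> [<name>...]") for a network.

    Runs once per network, independent of how many <ip> elements
    (IPv4 or IPv6) have dhcp definitions.
    """
    root = ET.fromstring(network_xml)
    lines = []
    for host in root.findall("./dns/host"):
        names = [h.text.strip() for h in host.findall("hostname") if h.text]
        if names:
            lines.append(" ".join([host.get("ip")] + names))
    return lines
```

The dhcp-hosts half would then be a separate function called for each ip definition that has dhcp enabled.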

I've actually never liked the "dnsmasqContext" concept, as it seems like
overkill and has conceptual problems such as what you've described. I
would be just as happy with replacements that were simpler and easier to
deal with. I think part of what complicates dnsmasq.[hc] is that it's in
the util directory, so it isn't allowed to understand the contents of
virNetworkDef, and must instead be sent the list of hosts in a simpler
format. If, instead, there was a file src/network/bridge_dnsmasq.[hc],
that could have functions that took virNetworkDef as an arg, and just
immediately return a string (or, in a separate function, write to a
file). That should simplify calling, and writing tests. And existing
dnsmasq-related functions in bridge_driver.c could be moved there as
well, reducing clutter in bridge_driver.c. (Of course I'm saying all
this without ever seriously considering it, just talking off the cuff,
so I may be completely wrong :-)
Right now I have fixed things up so they work. I would like to leave this as an exercise for someone else [or at least a later time].

Besides, isn't bridge_driver.c pretty much for dnsmasq only?

7. So far, the only thing I have done involving the xml specification
is to enable <dhcp> for IPv6.  However, the xml to specify a dns
addn-hosts appears, IMHO, to be overly verbose and complicated.
It is made that way because a single IP address may have multiple
hostnames associated with it, and we want to avoid having multiple
methods of describing the same thing. What you propose in the next
couple sentences was already proposed, tried, and rejected when dns host
support was originally added.

   So, while allowing the current xml to be valid, I suggest adding an
alternate form which is similar to that used for dhcp-host.  An
example is "<host ip='' name='one' />"

There are a few places in libvirt's XML where the same thing can be
expressed in two different ways, but that is only done when necessary
because the existing XML is unable to completely describe the new
functionality but backward compatibility is required. It creates all
sorts of problems when formatting the config back into XML though (which
of the two do you choose? Or do you do both? Either of these is a bad
answer), therefore we definitely don't want to do that except in cases
where it is absolutely necessary; this isn't one of those cases.
This is why I have tried to tweak and bend things so it works (and because it mostly worked before).

And now, as the saying goes, one more thing.

I now realize that I am going to need to get into virsh net-update since I am adding things to the xml specification and net-update will need to differentiate between dhcp4 and dhcp6 changes.

Another thought that occurs to me is whether any consideration has been given to having a "virsh net-restart" which would just restart dnsmasq and radvd. Typing stuff in for the command line of net-update is a little prone to typos. Wouldn't having net-edit and net-restart do what is intended for net-update? Maybe there is a way to have net-update do the equivalent of net-edit/net-restart. For example, if you only did "virsh net-update <network>" it would do it.

BTW, as I mentioned in another message, net-update for <dns> <host> does not work.


