[libvirt] [RFC] Support for CPUID masking v2

Tue Sep 22 12:41:08 UTC 2009

[ Sending again as my mail from yesterday seems to not have gone out :-( ]

On Fri, Sep 04, 2009 at 04:58:25PM +0200, Jiri Denemark wrote:
> Hi,
> 
> This is an attempt to provide similar flexibility to CPU ID masking without
> being x86-specific and unfriendly to users. As suggested by Dan, we need a way
> to specify both CPU flags and topology to achieve this goal.

  Right, thanks for trying to get this rolling :-)

> Firstly, CPU topology and all (actually all that libvirt knows about) CPU
> features have to be advertised in host capabilities:
> 
>     <host>
>         <cpu>
>             ...
>             <features>
>                 <feature>NAME</feature>
>             </features>
>             <topology>
>                 <sockets>NUMBER_OF_SOCKETS</sockets>
>                 <cores>CORES_PER_SOCKET</cores>
>                 <threads>THREADS_PER_CORE</threads>
>             </topology>

  <topology sockets="x" cores="y" threads="z"/>

would work too, and would leave room to extend it in a completely
different way later, by using subelements, if CPU architectures were to
evolve drastically.

>         </cpu>
>         ...
>     </host>
> 
> I'm not 100% sure we should represent CPU features as <feature>NAME</feature>
> especially because some features are currently advertised as <NAME/>. However,
> extending XML schema every time a new feature is introduced doesn't look like
> a good idea at all. The problem is we can't get rid of <NAME/>-style features,
> which would result in redundancy:
> 
>     <features>
>         <vmx/>
>         <feature>vmx</feature>
>     </features>

  I'm not afraid of that. It's not ideal, but since those are
virtualization-related features, having them separated sounds fine.
We just can't grow the schemas and parsing code to accommodate a
different element for each different name.

  IMHO the worst part is the definition of the names.
First, there are going to be a lot of them, and second, the names alone,
if you rely just on the /proc/cpuinfo output, may not be sufficient in
the absolute.

  Registries are a nightmare by definition, and we should not add
a registry of features in libvirt, nor try to attach any semantics to
those names. So I'm afraid we are left with just sampling/dumping
/proc/cpuinfo and leaving the mess to the kernel. The feature list will
grow quite long but that's fine IMHO.
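
  To make the "just sample /proc/cpuinfo" idea concrete, here is a
minimal sketch in Python. It assumes the x86 Linux layout, where the
names sit on a line starting with "flags"; other architectures label
that line differently. The function names and the sample text are
hypothetical, not libvirt code.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the flag names from the first 'flags' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def features_xml(flags):
    """Render flags as <feature>NAME</feature> elements, sorted for stable output."""
    lines = ["<features>"]
    lines += ["    <feature>%s</feature>" % name for name in sorted(flags)]
    lines.append("</features>")
    return "\n".join(lines)


# Hypothetical, truncated sample of /proc/cpuinfo content.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "flags\t\t: fpu vme de pse tsc msr pae\n"
)

if __name__ == "__main__":
    print(features_xml(parse_cpu_flags(SAMPLE)))
```

  No meaning is attached to any name here; the list is passed through
as the kernel reports it.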

> But I think it's better than changing the schema to add new features.

  Yeah that's unmaintainable.

> Secondly, drivers which support detailed CPU specification have to advertise
> it in guest capabilities. In case <features> are meant to be hypervisor
> features, then it could look like:
> 
>     <guest>
>         ...
>         <features>
>             <cpu/>
>         </features>
>     </guest>

  Somehow we will get the same mess. I assume the QEMU interface can
provide that list, right?
  I'm also wondering whether it could depend on the machine type. I
hope not, i.e. that the emulated CPU features are not also dependent on
the emulated hardware...

> But if they are meant to be CPU features, we need to come up with something
> else:
> 
>     <guest>
>         ...
>         <cpu_selection/>
>     </guest>

  Something like
      <guest>
         <cpu model="foo">
           <features>fpu vme de pse tsc msr pae mce cx8 apic</features>
         </cpu>
         <cpu model="bar">
           <features>fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca</features>
         </cpu>
      </guest>

hoping it doesn't go per machine !
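
  As a rough illustration of how a management application might consume
such a list, here is a small Python sketch. It assumes the
space-separated <features> format suggested above, which is only a
proposal; the function name models_from_caps is hypothetical.

```python
import xml.etree.ElementTree as ET

# Guest capabilities fragment in the format suggested above.
GUEST_CAPS = """
<guest>
  <cpu model="foo">
    <features>fpu vme de pse tsc msr pae mce cx8 apic</features>
  </cpu>
  <cpu model="bar">
    <features>fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca</features>
  </cpu>
</guest>
"""


def models_from_caps(xml_text):
    """Map each advertised CPU model name to its set of feature names."""
    root = ET.fromstring(xml_text.strip())
    return {
        cpu.get("model"): set((cpu.findtext("features") or "").split())
        for cpu in root.findall("cpu")
    }


if __name__ == "__main__":
    models = models_from_caps(GUEST_CAPS)
    # Features that "bar" offers on top of "foo".
    print(sorted(models["bar"] - models["foo"]))
```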

> I'm not sure how to deal with named CPUs suggested by Dan. Either we need to
> come up with global set of named CPUs and document what they mean or let
> drivers specify their own named CPUs and advertise them through guest
> capabilities:
> 
>     <guest>
>         ...
>         <cpu model="NAME">
>             <feature>NAME</feature>
>             ...
>         </cpu>
>     </guest>

  Again, I would not build the registry in libvirt itself, at least as a
first approach. Let the drivers provide the named CPUs if available, and
expose them in the capabilities for the given guest type.
  If we really start to see duplication, then maybe we can provide a
helper. We could certainly provide utility APIs in utils/ to extract the
set of flags and the topology information, but I would let the drivers
be responsible for the list in the end.

> The former approach would make matching named CPUs with those defined by a
> hypervisor (such as qemu) quite hard. The latter could bring the need for
> hardcoding features provided by specific CPU models or, in case we decide not
> to provide a list of features for each CPU model, it can complicate
> transferring a domain from one hypervisor to another.
> 
> 
> And finally, CPU may be configured in domain XML configuration:
> 
> <domain>
>     ...
>     <cpu model="NAME">
>         <topology>
>             <sockets>NUMBER_OF_SOCKETS</sockets>
>             <cores>CORES_PER_SOCKET</cores>
>             <threads>THREADS_PER_CORE</threads>
>         </topology>
> 

  <topology sockets="x" cores="y" threads="z"/>

  Might be better. In any case it should be kept consistent with the
capabilities section format.

>         <feature name="NAME" mode="set|check" value="on|off"/>
>     </cpu>
> </domain>
> 
> Mode 'check' checks physical CPU for the feature and refuses the domain to
> start if it doesn't match. VCPU feature is set to the same value. Mode 'set'
> just sets the VCPU feature.

  Okay, I expect NAME, for either the model or a feature, to be taken
from the list shown by capabilities, so that there is no guessing
involved and management apps can build a UI based on data available
dynamically from libvirt (libvirt hopefully fetching it from the kernel
or the hypervisor itself).
  With your followup mail, it seems mode="set|check" value="on|off"
should really be sufficient.
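
  To check my reading of the set/check semantics, here is a minimal
Python sketch. It assumes the attribute form of <topology> suggested
above and a host flag set obtained elsewhere; the function names and the
choice of ValueError are hypothetical, not an existing libvirt API.

```python
import xml.etree.ElementTree as ET

# Domain <cpu> fragment using the proposed syntax (attribute-style topology).
DOMAIN_CPU = """
<cpu model="NAME">
    <topology sockets="1" cores="2" threads="2"/>
    <feature name="vmx" mode="check" value="on"/>
    <feature name="sse3" mode="set" value="off"/>
</cpu>
"""


def vcpu_count(cpu):
    """Total logical CPUs implied by the topology: sockets * cores * threads."""
    topo = cpu.find("topology")
    return int(topo.get("sockets")) * int(topo.get("cores")) * int(topo.get("threads"))


def resolve_features(cpu, host_flags):
    """Return {feature: enabled} for the VCPU.

    mode="check": the host must match the requested value, otherwise the
    domain is refused (ValueError here); the VCPU gets that same value.
    mode="set": the VCPU feature is set without looking at the host.
    """
    result = {}
    for feat in cpu.findall("feature"):
        name = feat.get("name")
        wanted = feat.get("value") == "on"
        if feat.get("mode") == "check" and (name in host_flags) != wanted:
            raise ValueError("host CPU does not match %s=%s" % (name, feat.get("value")))
        result[name] = wanted
    return result


if __name__ == "__main__":
    cpu = ET.fromstring(DOMAIN_CPU.strip())
    print(vcpu_count(cpu), resolve_features(cpu, {"vmx", "sse3"}))
```

  With a host lacking vmx, resolve_features() raises and the domain
would not start, while the sse3 "set" entry never consults the host.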

> 
> Final note: <topology> could also be called <cpu_topology> to avoid confusion
> with NUMA <topology>, which is used in host capabilities. However, I prefer
> <cpu><topology>...</topology></cpu> over
> <cpu><cpu_topology>...</cpu_topology></cpu>.

  Agreed,

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel at veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/