Daniel Veillard wrote:
1) Provide a function describing the topology as an XML instance:

    char * virNodeGetTopology(virConnectPtr conn);

which would return an XML instance as in virConnectGetCapabilities. I toyed with the idea of extending virConnectGetCapabilities() to add a topology section in case of NUMA support at the hypervisor level, but it looked to me as though the two might be used at different times, so separating them might be a bit cleaner. I could be convinced otherwise.
I'd definitely prefer to extend the virConnectGetCapabilities XML. It avoids changing the remote driver and language bindings, and really callers only need to pull capabilities once per connection.
---------------------------------
<topology>
  <cells num='2'>
    <cell id='0'>
      <cpus num='2'>
        <cpu id='0'/>
        <cpu id='1'/>
      </cpus>
      <memory size='2097152'/>
    </cell>
    <cell id='1'>
      <cpus num='2'>
        <cpu id='2'/>
        <cpu id='3'/>
      </cpus>
      <memory size='2097152'/>
    </cell>
  </cells>
</topology>
---------------------------------

A few things to note:
- the <cells> element lists the top sibling cells
- the <cell> element describes as children the available resources, like the list of CPUs and the size of the local memory; that could be extended with disk descriptions too, <disk dev='/dev/sdb'/>, and possibly other special devices (no idea what ATM).
- in case of a deeper hierarchical topology one may need to be able to name sub-cells, and the format could be extended, for example as

  <cells num='2'>
    <cells num='2'>
      <cell id='1'> ... </cell>
      <cell id='2'> ... </cell>
    </cells>
    <cells num='2'>
      <cell id='3'> ... </cell>
      <cell id='4'> ... </cell>
    </cells>
  </cells>

  But that can be discussed/changed when the need arises :-)
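To make the flat two-cell format above concrete, here is a minimal sketch of how a driver might render it from an in-memory description. Everything named example_* is hypothetical and not part of libvirt; the struct layout, the assumption that each cell's CPU ids are contiguous, and the buffer-based interface are chosen only for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical in-memory form of one <cell>; not a libvirt type. */
struct example_cell {
    int id;
    int first_cpu;          /* CPU ids assumed contiguous, for brevity */
    int ncpus;
    unsigned long mem_size; /* value for <memory size='...'/>, unit as in the proposal */
};

/* Append formatted text at *off; returns 0 on success, -1 if it would not fit. */
static int example_append(char *buf, size_t cap, size_t *off, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(*off < cap);
    va_start(ap, fmt);
    n = vsnprintf(buf + *off, cap - *off, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= cap - *off)
        return -1;
    *off += (size_t)n;
    return 0;
}

/* Render the flat <topology> form; returns the string length, or -1 on overflow. */
int example_format_topology(const struct example_cell *cells, int ncells,
                            char *buf, size_t cap)
{
    size_t off = 0;
    int i, c;

    if (cap == 0)
        return -1;
    if (example_append(buf, cap, &off,
                       "<topology>\n  <cells num='%d'>\n", ncells) < 0)
        return -1;
    for (i = 0; i < ncells; i++) {
        if (example_append(buf, cap, &off,
                           "    <cell id='%d'>\n      <cpus num='%d'>\n",
                           cells[i].id, cells[i].ncpus) < 0)
            return -1;
        for (c = 0; c < cells[i].ncpus; c++) {
            if (example_append(buf, cap, &off, "        <cpu id='%d'/>\n",
                               cells[i].first_cpu + c) < 0)
                return -1;
        }
        if (example_append(buf, cap, &off,
                           "      </cpus>\n      <memory size='%lu'/>\n    </cell>\n",
                           cells[i].mem_size) < 0)
            return -1;
    }
    if (example_append(buf, cap, &off, "  </cells>\n</topology>\n") < 0)
        return -1;
    return (int)off;
}
```

The nested <cells> variant would need a recursive structure instead of a flat array, which this sketch deliberately leaves out.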
Especially note that 4 (or more) socket AMDs have a topology like this, with two different penalties for reaching nodes which are one and two hops away. Do we have a way to describe the penalties along different paths?
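One way the penalties could be expressed is a cell-to-cell distance table. The sketch below is only an illustration of that idea, not something in the proposal: the four-cell ring layout, the numeric values (10 local, 16 one hop, 22 two hops), and all example_* names are invented.

```c
#include <assert.h>
#include <limits.h>

#define EXAMPLE_NCELLS 4

/* Hypothetical penalty table for four cells wired as a ring 0-1-3-2-0.
 * example_distance[a][b] is the relative cost for cell a to reach memory
 * on cell b. Invented values: 10 = local, 16 = one hop, 22 = two hops. */
static const int example_distance[EXAMPLE_NCELLS][EXAMPLE_NCELLS] = {
    { 10, 16, 16, 22 },
    { 16, 10, 22, 16 },
    { 16, 22, 10, 16 },
    { 22, 16, 16, 10 },
};

/* Pick the cell with at least `needed` free memory that is cheapest to
 * reach from cell `from`. Returns the cell id, or -1 if none fits. */
int example_pick_cell(int from, const unsigned long free_mem[EXAMPLE_NCELLS],
                      unsigned long needed)
{
    int best = -1;
    int best_dist = INT_MAX;
    int cell;

    assert(from >= 0 && from < EXAMPLE_NCELLS);
    for (cell = 0; cell < EXAMPLE_NCELLS; cell++) {
        if (free_mem[cell] < needed)
            continue;                       /* not enough room on this cell */
        if (example_distance[from][cell] < best_dist) {
            best_dist = example_distance[from][cell];
            best = cell;
        }
    }
    return best;
}
```

With a table like this a placement tool can tell a one-hop neighbour from a two-hop one, which the plain nested <cells> grouping cannot express on its own.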
2) Function to get the free memory of a given cell:

    unsigned long virNodeGetCellFreeMemory(virConnectPtr conn, int cell);

That's relatively simple and would match the request from the initial mail, but I'm wondering a bit. If the program tries to do a best placement, it will usually run that request for a number of cells, no? Maybe a call returning the memory amounts for a range of cells would be more appropriate.
Yes, I guess they'd want to get the free memory for all nodes. But IBM will have a better idea about this.
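A range-style call might look roughly like the sketch below. The function name, its signature, and the static table standing in for the hypervisor are all assumptions made for illustration; nothing here is an existing libvirt entry point.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the hypervisor: free memory per cell. Invented values,
 * in whatever unit the real call would use. */
static const unsigned long example_cell_free[] = {
    1048576UL, 524288UL, 2097152UL, 262144UL
};
#define EXAMPLE_HOST_CELLS \
    ((int)(sizeof(example_cell_free) / sizeof(example_cell_free[0])))

/* Hypothetical range query: fill out[0..n-1] with the free memory of cells
 * start_cell, start_cell+1, ... and return n, the number of entries written
 * (at most max_cells). Returns -1 on invalid arguments. */
int example_get_cells_free_memory(unsigned long *out, int start_cell, int max_cells)
{
    int n = 0;

    if (out == NULL || start_cell < 0 || start_cell >= EXAMPLE_HOST_CELLS ||
        max_cells <= 0)
        return -1;

    while (n < max_cells && start_cell + n < EXAMPLE_HOST_CELLS) {
        out[n] = example_cell_free[start_cell + n];
        n++;
    }
    assert(n >= 1 && n <= max_cells);
    return n;
}
```

A caller wanting all nodes would pass start_cell 0 and a buffer sized for the cell count reported in the topology, so one round trip replaces a per-cell loop.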
Rich.

-- 
Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom.
Registered in England and Wales under Company Registration No. 03798903