[augeas-devel] improving performance of aug_get() and aug_match() with large datasets

Laine Stump laine at redhat.com
Mon Oct 5 13:54:05 UTC 2015


(David - I wrote this over the weekend, then turned on my redhat.com 
email connection just as I was getting ready to send it, got your 
replies, and see that you anticipated what I tried / asked about at the 
end of the email. Nice!)

On 10/02/2015 02:50 PM, Laine Stump wrote:
> On 10/02/2015 02:32 PM, David Lutterkort wrote:
>> On Thu, Oct 1, 2015 at 11:44 AM, Laine Stump <laine at redhat.com
>> <mailto:laine at redhat.com>> wrote:
>>
>>     On 09/22/2015 03:18 PM, Laine Stump wrote:
>>     I played
>>     around a bit in gdb and found that most of the time now seems to be
>>     spent in one call to aug_match():
>>
>>
>>        r = aug_match(aug, path, "/files/etc/sysconfig/network-scripts/*[
>>     DEVICE = 'br1' or BRIDGE = 'br1' or MASTER = 'br1' or MASTER =
>>     ../*[BRIDGE = 'br1']/DEVICE ]/DEVICE");
>>
>>     (this is the result of a call to netcf's aug_fmt_match() in the
>>     netcf function aug_get_xml_for_nif())
>>
>>     When I step over that call to aug_match(), there is a very
>>     noticeable pause before the gdb prompt comes back, while continuing
>>     from that point all the way through virt-manager's "get all
>>     interfaces" loop back to the next call to aug_get_xml_for_nif()
>>     (including several other calls to aug_match() that have much simpler
>>     search expressions) seems to happen instantly.
>>
>>     So apparently doing a match against all ifcfg files based on this
>>     complex match expression is really slowing us down. Any ideas on how
>>     to either make this expression simpler, or alternately how to get
>>     augeas doing the search more quickly?
>>
>>
>> Was that with the performance stuff I did a few days ago ? (You'd need
>> Augeas HEAD for that)
>
> No, I am running the augeas that comes with Fedora 22 (1.4.0-1) (or 
> alternately, the one that comes with RHEL6.7 - an ancient 1.0.0). Let 
> me see if I can successfully make augeas rpms from upstream (in the 
> middle of "make distcheck" right now) and see if there's a difference 
> with the latest code.

I've taken a bit more controlled approach to my benchmarking, and have 
found that your latest patches to augeas do indeed make a huge 
improvement (and also that the patch to netcf did have a big effect on 
"virsh iface-list --all", but didn't have much effect on virt-manager - 
this was a misperception due to mixing up the numbers for two different 
configs).

For the now-standard test system with 514 bridges and 514 vlans, here are 
the numbers, all in min:sec ("list" = "virsh iface-list --all", "dump" = 
"virsh iface-dumpxml" of all interfaces, and "virt-manager" is the time it 
takes for libvirtd CPU usage to drop below 50% after starting virt-manager):


augeas netcf  libvirt list   dump    virt-manager
------ -----  ------- ------ ------- ------------
1.4.0  0.2.8  1.2.20  1:37.6 13:46.6 15:37
upstrm 0.2.8  1.2.20  1:04.7 07:34.8 08:41
upstrm upstrm 1.2.20  0:03.7 06:40.3 06:46
upstrm upstrm upstrm  0:02.0 06:39.5 06:39

(the upstream change in netcf is to call aug_load() at most once per 
second, and the upstream change in libvirt is to avoid calling 
ncf_if_mac_string() multiple times for each interface during iface-list).
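
To illustrate the idea behind the netcf change, here is a minimal sketch of 
limiting a reload to at most once per interval. This is not netcf's actual 
code: the struct and function names are hypothetical, and the caller is 
assumed to call aug_load() itself whenever the helper returns 1.

```c
#include <time.h>

/* Hypothetical helper (not taken from netcf): remembers when the tree
 * was last reloaded so that repeated requests can be skipped. */
struct reload_throttle {
    time_t last_load;   /* time of the most recent reload */
    int    loaded;      /* non-zero once a first reload has happened */
};

/* Returns 1 if the caller should reload now (and records the time),
 * 0 if the previous reload is less than min_interval seconds old. */
static int throttle_should_reload(struct reload_throttle *t, time_t now,
                                  time_t min_interval)
{
    if (t->loaded && now - t->last_load < min_interval)
        return 0;
    t->last_load = now;
    t->loaded = 1;
    return 1;
}
```

A caller would pass time(NULL) as "now" and 1 as "min_interval", and only 
invoke aug_load() when the function returns 1.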

In the case of virt-manager, the application becomes responsive after 
roughly the "list" time, then stops using CPU at the end of the 
"virt-manager" time, so the netcf change has a big effect on the amount of 
time until virt-manager is usable, and the augeas change has an even bigger 
effect on how long it takes for the system to settle down.


So just as an experiment, I tried removing the most complicated term:

   "MASTER = ../*[BRIDGE = 'br1']/DEVICE"

from the search expression. When I did this, the time for a "virsh 
iface-dumpxml" of all interfaces dropped from 6min39.5sec down to 15.3 
seconds!

augeas netcf  libvirt list   dump    virt-manager
------ -----  ------- ------ ------- ------------
upstrm upstrm upstrm  0:02.1 00:15.3 00:17

Any bright ideas on how to either make that search term execute faster, 
or alternately replace it with something simpler? (A first thought is 
that maybe it would be faster to do a two-staged search where we first 
look for everything with BRIDGE=="br1", then retrieve the DEVICE of all 
those matches, then search for MASTER==[any of the DEVICEs found in the 
first step]. Or maybe not; hard for me to say without trying.)

*EDIT* - David already replied showing how to do exactly this and indicating 
another drastic improvement, so now I'll be implementing his suggested 
changes in netcf.
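
For anyone following along, here is a rough sketch of the two-stage idea 
described above. It is not David's suggestion (that is in his reply, not in 
this message) and it does not call Augeas at all: it works on a hypothetical 
in-memory array standing in for the parsed ifcfg files, just to show the 
shape of the search. All of the type and function names are made up for 
this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one parsed ifcfg file. A NULL field means
 * the key is absent. In netcf these values live in the Augeas tree. */
struct ifcfg {
    const char *device;
    const char *bridge;
    const char *master;
};

/* NULL-safe string comparison: an absent key never matches. */
static int str_eq(const char *a, const char *b)
{
    return a != NULL && b != NULL && strcmp(a, b) == 0;
}

/* Finds the configs selected by
 *   *[DEVICE = name or BRIDGE = name or MASTER = name
 *     or MASTER = ../*[BRIDGE = name]/DEVICE]/DEVICE
 * in two passes instead of re-scanning every file for every candidate.
 * scratch and out must each have room for n entries.
 * Writes the indexes of matching configs to out and returns the count. */
static size_t two_stage_match(const struct ifcfg *cfg, size_t n,
                              const char *name,
                              const char **scratch, size_t *out)
{
    size_t nports = 0, nout = 0;

    /* Stage 1: collect the DEVICE of everything attached to the bridge. */
    for (size_t i = 0; i < n; i++)
        if (str_eq(cfg[i].bridge, name) && cfg[i].device != NULL)
            scratch[nports++] = cfg[i].device;

    /* Stage 2: a single pass using the (usually short) stage 1 list. */
    for (size_t i = 0; i < n; i++) {
        int hit = str_eq(cfg[i].device, name) ||
                  str_eq(cfg[i].bridge, name) ||
                  str_eq(cfg[i].master, name);
        for (size_t j = 0; !hit && j < nports; j++)
            hit = str_eq(cfg[i].master, scratch[j]);
        /* The expression ends in /DEVICE, so a config without a DEVICE
         * key produces no result. */
        if (hit && cfg[i].device != NULL)
            out[nout++] = i;
    }
    return nout;
}
```

The second pass costs roughly (number of files) x (number of devices 
attached to the bridge), where the single expression presumably re-evaluates 
the inner "../*[BRIDGE = 'br1']/DEVICE" term against every file for each 
candidate. Whether that accounts for all of the measured difference is an 
assumption on my part.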




