
Re: [augeas-devel] improving performance of aug_get() and aug_match() with large datasets



On Thu, Oct 1, 2015 at 11:44 AM, Laine Stump <laine redhat com> wrote:

But 13 (or even 8) minutes is still a very long time, so I played around a bit in gdb and found that most of the time now seems to be spent in one call to aug_match():


  r = aug_match(aug, path, "/files/etc/sysconfig/network-scripts/*[ DEVICE = 'br1' or BRIDGE = 'br1' or MASTER = 'br1' or MASTER = ../*[BRIDGE = 'br1']/DEVICE ]/DEVICE");

Whoever wrote that code must have thought they were incredibly clever with this query ;)

There are a few ways in which I think this can be sped up: for one, rather than use 'or', we can build an intermediate nodeset for the first three terms by matching

(1) /files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'br1']/DEVICE

The last term in that 'or' is very expensive since it constitutes a nested loop, with "/files/etc/sysconfig/network-scripts/*" being the outer loop ("for each ifcfg file") and "../*[BRIDGE = 'br1']/DEVICE" being the inner loop ("for each ifcfg file see if it is a BRIDGE and return its DEVICE"). That can be made a little more targeted by using

(2) /files/etc/sysconfig/network-scripts/*/MASTER[ . = ../*[BRIDGE = 'br1']/DEVICE ]

so that we only trigger the inner loop for ifcfg files that actually have a MASTER entry. This helps if you don't have bonds - I suspect, if there are any bonds on the system, the query will still be very expensive.
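
To make the cost difference concrete, here is a rough C model of the two strategies. It is only a sketch and says nothing about how Augeas actually evaluates path expressions: struct ifcfg, match_naive() and match_targeted() are made up for illustration, and the model assumes the last 'or' term gets evaluated for every ifcfg file, as described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of one ifcfg file; NULL means "key absent". */
struct ifcfg {
    const char *device;
    const char *bridge;
    const char *master;
};

/* NULL-safe string equality. */
static int eq(const char *a, const char *b)
{
    return a != NULL && b != NULL && strcmp(a, b) == 0;
}

/* Inner loop: look at every file for one that has BRIDGE = bridge and
 * DEVICE = master. Every file looked at is counted in *visits. */
static int master_on_bridge(const struct ifcfg *f, size_t n,
                            const char *master, const char *bridge,
                            size_t *visits)
{
    int found = 0;
    for (size_t i = 0; i < n; i++) {
        (*visits)++;
        if (eq(f[i].bridge, bridge) && eq(f[i].device, master))
            found = 1;
    }
    return found;
}

/* Model of the original query: the inner loop runs for every file. */
size_t match_naive(const struct ifcfg *f, size_t n, const char *bridge,
                   size_t *visits)
{
    size_t hits = 0;
    *visits = 0;
    for (size_t i = 0; i < n; i++) {
        int inner = master_on_bridge(f, n, f[i].master, bridge, visits);
        if (eq(f[i].device, bridge) || eq(f[i].bridge, bridge) ||
            eq(f[i].master, bridge) || inner)
            hits++;
    }
    return hits;
}

/* Model of query (2): the inner loop runs only for files with a MASTER. */
size_t match_targeted(const struct ifcfg *f, size_t n, const char *bridge,
                      size_t *visits)
{
    size_t hits = 0;
    *visits = 0;
    for (size_t i = 0; i < n; i++) {
        int inner = 0;
        if (f[i].master != NULL)
            inner = master_on_bridge(f, n, f[i].master, bridge, visits);
        if (eq(f[i].device, bridge) || eq(f[i].bridge, bridge) ||
            eq(f[i].master, bridge) || inner)
            hits++;
    }
    return hits;
}
```

In this model, n files of which m have a MASTER entry cost n*n inner-loop visits with the naive version and m*n with the targeted one, which is why the gain disappears once most files belong to a bond.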

Making these two changes brings the time for the aug_match down from 680ms to ~ 40ms on my machine, using NUMVLANS=514. The query that I ran for the latter was as follows ('|' produces the union of two nodesets):

(3) (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*/MASTER[ . = ../*[BRIDGE = 'brvlan42']/DEVICE ])/DEVICE

Even better would be if we knew whether we need the whole MASTER business - my recollection of this is dim, but I believe this query tries to find the bond device for which the bridge is a slave. It might be faster for netcf to run a query for that separately and then instead of query (2) do something like

(4) /files/etc/sysconfig/network-scripts/*/MASTER[ . = '$master1' or . = '$master2' ...]/DEVICE

Mocking this up with a query that assumes there are 'bond0' and 'bond1' on the system brings the time for the query from ~ 40ms to 4ms on my machine.
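
If netcf went this way, it would have to assemble the path expression from the list of bond names it found. A minimal sketch of that in C, assuming the names contain no single quotes; build_bridge_query() and IFCFG_DIR are hypothetical names for illustration, not anything that exists in netcf or Augeas:

```c
#include <stddef.h>
#include <string.h>

#define IFCFG_DIR "/files/etc/sysconfig/network-scripts"

/* Append s to buf at offset *len; returns 0 on success, -1 if it won't fit. */
static int append_str(char *buf, size_t size, size_t *len, const char *s)
{
    size_t n = strlen(s);
    if (*len + n + 1 > size)
        return -1;
    memcpy(buf + *len, s, n + 1);   /* copies the trailing NUL as well */
    *len += n;
    return 0;
}

/* Build a path expression of the same shape as the mocked-up query:
 * the union of "files mentioning the bridge" and "MASTER entries naming
 * one of the known bonds", followed by /DEVICE.
 * Assumes bridge and masters[] contain no single quotes.
 * Returns 0 on success, -1 if buf is too small. */
int build_bridge_query(char *buf, size_t size, const char *bridge,
                       const char *const *masters, size_t nmasters)
{
    size_t len = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';

    if (append_str(buf, size, &len,
                   "(" IFCFG_DIR "/*[(DEVICE|BRIDGE|MASTER) = '") ||
        append_str(buf, size, &len, bridge) ||
        append_str(buf, size, &len, "']"))
        return -1;

    if (nmasters > 0) {
        if (append_str(buf, size, &len, "|" IFCFG_DIR "/*/MASTER[ "))
            return -1;
        for (size_t i = 0; i < nmasters; i++) {
            if ((i > 0 && append_str(buf, size, &len, " or ")) ||
                append_str(buf, size, &len, ". = '") ||
                append_str(buf, size, &len, masters[i]) ||
                append_str(buf, size, &len, "'"))
                return -1;
        }
        if (append_str(buf, size, &len, " ]"))
            return -1;
    }

    return append_str(buf, size, &len, ")/DEVICE");
}
```

Called with bridge "brvlan42" and the bonds "bond0" and "bond1", this produces the same string as the third match command in the attached file; the result would then be handed to aug_match() as usual.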

Be warned: my memory of ifcfg-* files especially around bonding is kinda hazy, and I might have screwed up these queries ...

Attached is a file of Augeas commands that I ran through 'augtool -e -r /var/tmp/bridges-root'; all timings were from changing aug_match to print the time taken from just after calling api_entry() to just before api_exit() against current HEAD.
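
For anybody who wants to reproduce the timings, the instrumentation can be as simple as the following. This is a sketch of the general idea and not the actual change I made to aug_match; run_timed() and elapsed_ms() are hypothetical helpers, and it assumes a POSIX system with clock_gettime().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Milliseconds between two CLOCK_MONOTONIC readings, rounded down. */
static long long elapsed_ms(const struct timespec *start,
                            const struct timespec *end)
{
    long long ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
                 + (end->tv_nsec - start->tv_nsec);
    return ns / 1000000LL;
}

/* Call fn(arg), print how long it took in the same "Time: Nms" format
 * as the transcript, and pass fn's return value through. */
int run_timed(int (*fn)(void *), void *arg)
{
    struct timespec start, end;
    int r;

    clock_gettime(CLOCK_MONOTONIC, &start);
    r = fn(arg);
    clock_gettime(CLOCK_MONOTONIC, &end);

    fprintf(stderr, "Time: %lldms\n", elapsed_ms(&start, &end));
    return r;
}
```

CLOCK_MONOTONIC is used rather than wall-clock time so that clock adjustments during the run don't skew the numbers.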

David

# Original
match /files/etc/sysconfig/network-scripts/*[ DEVICE = 'brvlan42' or BRIDGE = 'brvlan42' or MASTER = 'brvlan42' or MASTER = ../*[BRIDGE = 'brvlan42']/DEVICE ]/DEVICE
#
#
# Turn first three 'or' terms into an intermediate nodeset, and trigger
# the inner loop for MASTER only if there actually are bonds
match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*/MASTER[ . = ../*[BRIDGE = 'brvlan42']/DEVICE ])/DEVICE
#
#
# Assuming we have two bonds on the system
match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*/MASTER[ . = 'bond0' or . = 'bond1' ])/DEVICE
citron:augeas (master)>./src/try -e -r /var/tmp/bridges-for-vlans/root/
augtool> # Original
augtool> match /files/etc/sysconfig/network-scripts/*[ DEVICE = 'brvlan42' or BRIDGE = 'brvlan42' or MASTER = 'brvlan42' or MASTER = ../*[BRIDGE = 'brvlan42']/DEVICE ]/DEVICE
aug_match(/files/etc/sysconfig/network-scripts/*[ DEVICE = 'brvlan42' or BRIDGE = 'brvlan42' or MASTER = 'brvlan42' or MASTER = ../*[BRIDGE = 'brvlan42']/DEVICE ]/DEVICE) = 2
Time: 661ms
/files/etc/sysconfig/network-scripts/ifcfg-p14p1.42/DEVICE = p14p1.42
/files/etc/sysconfig/network-scripts/ifcfg-brvlan42/DEVICE = brvlan42
augtool> #
augtool> #
augtool> # Turn first three 'or' terms into an intermediate nodeset, and trigger
augtool> # the inner loop for MASTER only if there actually are bonds
augtool> match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*/MASTER[ . = ../*[BRIDGE = 'brvlan42']/DEVICE ])/DEVICE
aug_match((/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*/MASTER[ . = ../*[BRIDGE = 'brvlan42']/DEVICE ])/DEVICE) = 2
Time: 36ms
/files/etc/sysconfig/network-scripts/ifcfg-p14p1.42/DEVICE = p14p1.42
/files/etc/sysconfig/network-scripts/ifcfg-brvlan42/DEVICE = brvlan42
augtool> #
augtool> #
augtool> # Assuming we have two bonds on the system
augtool> match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*/MASTER[ . = 'bond0' or . = 'bond1' ])/DEVICE
aug_match((/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*/MASTER[ . = 'bond0' or . = 'bond1' ])/DEVICE) = 2
Time: 4ms
/files/etc/sysconfig/network-scripts/ifcfg-p14p1.42/DEVICE = p14p1.42
/files/etc/sysconfig/network-scripts/ifcfg-brvlan42/DEVICE = brvlan42
augtool>
