[linux-lvm] LVM2 scalability within volume group

Dave Olien dmo at osdl.org
Wed Mar 17 17:36:38 UTC 2004


Greetings,

I'm doing some evaluation of LVM2 on large systems, with lots of disk
devices.  I'm currently using LVM2.2.00.08, along with device-mapper.1.00.07.
I plan to eventually upgrade to the latest CVS trees for LVM2.  I recall
earlier mail saying it's faster than what I'm using.

The first thing I noticed is that creating a VG with lots of PVs is a bad
idea.  Creating a volume group with one PV takes about 12 seconds of
elapsed time.  Adding a new PV initially takes about 5 seconds, but this
grows to about 15 seconds when adding the 40th PV and 25 seconds when
adding the 60th.  Adding the 200th PV takes about 6 minutes.
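
Something like the following loop reproduces those numbers; the device
names and the VG name here are hypothetical, not my actual configuration:

  pvcreate /dev/sdb1
  time vgcreate bigvg /dev/sdb1                # ~12 seconds for the first PV
  for dev in /dev/sdc1 /dev/sdd1 /dev/sde1     # ...continuing through PV 200
  do
      pvcreate "$dev"
      time vgextend bigvg "$dev"               # ~5 seconds at first,
  done                                         # ~6 minutes by the 200th PV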

Activating this volume group (with 200 PVs) takes about 48 minutes.
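
That figure is just the elapsed time of the usual activation command,
with the same hypothetical VG name as above:

  time vgchange -ay bigvg    # ~48 minutes with 200 PVs in the VG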

As a workaround, I can spread the PVs out among the 99 available VGs.
The individual VGs seem to be independent of one another, so a large
number of VGs with a few PVs each performs OK.
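
A sketch of that workaround, again with hypothetical names, putting a
few PVs in each VG instead of 200 in one:

  pvcreate /dev/sdb1 /dev/sdc1
  vgcreate vg00 /dev/sdb1 /dev/sdc1
  pvcreate /dev/sdd1 /dev/sde1
  vgcreate vg01 /dev/sdd1 /dev/sde1
  # ...and so on, up through vg98 if need be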

But still, shouldn't adding PVs to a VG scale better than this?
I'm guessing the current lack of scaling comes from the way redundant
metadata is stored.  Are redundant copies of the metadata updated on
every PV within the VG whenever a new PV is added?  Even so, why should
200 reads/writes take so long?

Having redundant copies of metadata is a good thing.  But how about
allowing the administrator to set a limit on the degree of redundancy when
a VG is created?  You could limit a VG to, for example, 10 redundant
copies.  Then adding PVs beyond the 10th would incur less overhead.
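
If I'm reading the tools right, something close to this may already be
possible per-PV: pvcreate appears to accept a --metadatacopies option
(0, 1, or 2), so PVs created with 0 copies should carry no metadata for
vgextend to rewrite.  A sketch, assuming that option is available in
this version:

  pvcreate --metadatacopies 1 /dev/sdb1   # holds a copy of the VG metadata
  pvcreate --metadatacopies 0 /dev/sdc1   # holds no metadata copy
  vgcreate bigvg /dev/sdb1 /dev/sdc1      # later metadata updates touch
                                          # only /dev/sdb1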

Am I missing something important?

Thanks!
Dave Olien



