[linux-lvm] inconsistency between thin pool metadata mapped_blocks and lvs output
John Hamilton
john.l.hamilton at gmail.com
Fri May 11 17:09:44 UTC 2018
Thanks for the response.
>Is this everything?
Yes, that is everything in the metadata XML dump; I just removed all of
the *_mapping entries for brevity. For the lvs output I removed other
logical volumes that aren't related to this pool.
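For reference, here is my back-of-the-envelope arithmetic on the dump (just a quick Python sketch of my own, not an LVM tool): data_block_size is in 512-byte sectors, so 128 sectors means a 64 KiB chunk, and the two mapped_blocks figures convert like this.

```python
# My own arithmetic on the figures from the metadata dump.
SECTOR = 512
chunk_bytes = 128 * SECTOR                 # data_block_size = 128 sectors

mapped = {1: 258767, 8: 15616093}          # dev_id -> mapped_blocks

for dev_id, blocks in mapped.items():
    print(f"dev_id {dev_id}: {blocks * chunk_bytes / 2**30:.1f} GiB")

total = sum(mapped.values()) * chunk_bytes / 2**30
print(f"total mapped: {total:.1f} GiB")    # ~969 GiB
```

That total is nowhere near 69% of the 2.44t pool, which is why the lvs number surprised me.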
>Is this a pool used by docker, which does not (did not) use LVM to manage
thin-volumes?
It's not docker, but it is an application called serviced that uses
docker's library for managing the volumes.
>LVM just queries DM, and displays whatever that provides
Yeah, it looks like dmsetup status output matches lvs:
myvg-my--pool: 0 5242880000 thin-pool 70 207941/4145152 29018611/40960000 - rw discard_passdown queue_if_no_space -
myvg-my--pool_tdata: 0 4194304000 linear
myvg-my--pool_tdata: 4194304000 1048576000 linear
myvg-my--pool_tmeta: 0 33161216 linear
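For what it's worth, the percentages can be recomputed from that status line (a quick sketch in Python; the fifth and sixth fields are the metadata and data USED/TOTAL pairs):

```python
# Recompute the lvs-style percentages from the dmsetup status line above.
status = ("0 5242880000 thin-pool 70 207941/4145152 "
          "29018611/40960000 - rw discard_passdown queue_if_no_space -")

fields = status.split()
meta_used, meta_total = (int(n) for n in fields[4].split("/"))
data_used, data_total = (int(n) for n in fields[5].split("/"))

print(f"Meta%: {100 * meta_used / meta_total:.2f}")   # ~5.02
print(f"Data%: {100 * data_used / data_total:.2f}")   # ~70.85
```

The numbers are slightly higher than the lvs output quoted below, presumably just because the two samples were taken at different times.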
>What is kernel/lvm version?
# uname -r
3.10.0-693.21.1.el7.x86_64
# lvm version
LVM version: 2.02.171(2)-RHEL7 (2017-05-03)
Library version: 1.02.140-RHEL7 (2017-05-03)
Driver version: 4.35.0
Configuration: ./configure --build=x86_64-redhat-linux-gnu
--host=x86_64-redhat-linux-gnu --program-prefix=
--disable-dependency-tracking --prefix=/usr --exec-prefix=/usr
--bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc
--datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64
--libexecdir=/usr/libexec --localstatedir=/var
--sharedstatedir=/var/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-default-dm-run-dir=/run
--with-default-run-dir=/run/lvm --with-default-pid-dir=/run
--with-default-locking-dir=/run/lock/lvm --with-usrlibdir=/usr/lib64
--enable-lvm1_fallback --enable-fsadm --with-pool=internal
--enable-write_install --with-user= --with-group= --with-device-uid=0
--with-device-gid=6 --with-device-mode=0660 --enable-pkgconfig
--enable-applib --enable-cmdlib --enable-dmeventd
--enable-blkid_wiping --enable-python2-bindings
--with-cluster=internal --with-clvmd=corosync --enable-cmirrord
--with-udevdir=/usr/lib/udev/rules.d --enable-udev_sync
--with-thin=internal --enable-lvmetad --with-cache=internal
--enable-lvmpolld --enable-lvmlockd-dlm --enable-lvmlockd-sanlock
--enable-dmfilemapd
>Is thin_check_executable configured in lvm.conf?
Yes
I also just found out that they apparently ran thin_check recently and got
a message about a corrupt superblock, but didn't repair it. They were
still able to re-activate the pool though. We'll run a repair as soon as we
get a chance and see if that fixes it.
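The plan, roughly, would be something like the following (a sketch only; the names come from this thread, the pool must be inactive for the repair step, and I'd double-check lvconvert(8) and thin_ls(8) before running anything on real data):

```shell
# Deactivate the pool, then let LVM run thin_repair into spare metadata:
lvchange -an myvg/my-pool
lvconvert --repair myvg/my-pool
lvchange -ay myvg/my-pool

# Per Joe's suggestion, account for snapshot sharing with thin_ls.
# (On a live pool you would first reserve a metadata snapshot with
# `dmsetup message myvg-my--pool 0 reserve_metadata_snap` and pass
# --metadata-snap; see thin_ls(8).)
thin_ls --format "DEV,MAPPED_BLOCKS,EXCLUSIVE_BLOCKS,SHARED_BLOCKS" \
    /dev/mapper/myvg-my--pool_tmeta
```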
Thanks,
John
On Fri, May 11, 2018 at 3:54 AM Marian Csontos <mcsontos at redhat.com> wrote:
> On 05/11/2018 10:21 AM, Joe Thornber wrote:
> > On Thu, May 10, 2018 at 07:30:09PM +0000, John Hamilton wrote:
> >> I saw something today that I don't understand and I'm hoping somebody
> >> can help. We had a ~2.5TB thin pool that was showing 69% data
> >> utilization in lvs:
> >>
> >> # lvs -a
> >> LV VG Attr LSize Pool Origin Data%
> >> Meta% Move Log Cpy%Sync Convert
> >> my-pool myvg twi-aotz-- 2.44t 69.04 4.90
> >> [my-pool_tdata] myvg Twi-ao---- 2.44t
> >> [my-pool_tmeta] myvg ewi-ao---- 15.81g
>
> Is this everything? Is this a pool used by docker, which does not (did
> not) use LVM to manage thin-volumes?
>
> >> However, when I dump the thin pool metadata and look at the
> >> mapped_blocks for the 2 devices in the pool, I can only account for
> >> about 950GB. Here is the superblock and device entries from the
> >> metadata xml. There are no other devices listed in the metadata:
> >>
> >> <superblock uuid="" time="34" transaction="68" flags="0" version="2"
> >> data_block_size="128" nr_data_blocks="0">
> >> <device dev_id="1" mapped_blocks="258767" transaction="0"
> >> creation_time="0" snap_time="14">
> >> <device dev_id="8" mapped_blocks="15616093" transaction="27"
> >> creation_time="15" snap_time="34">
> >>
> >> That first device looks like it has about 16GB allocated to it and
> >> the second device about 950GB. So, I would expect lvs to show
> >> somewhere between 950G and 966G. Is something wrong, or am I
> >> misunderstanding how to read the metadata dump? Where is the other
> >> 700 or so GB that lvs is showing used?
> >
> > The non-zero snap_time suggests that you're using snapshots. In which
> > case it could just be that there is common data shared between volumes
> > that is getting counted more than once.
> >
> > You can confirm this using the thin_ls tool and specifying a format
> > line that includes EXCLUSIVE_BLOCKS or SHARED_BLOCKS. LVM doesn't take
> > shared blocks into account because it has to scan all the metadata to
> > calculate what's shared.
>
> LVM just queries DM, and displays whatever that provides. You could see
> that in `dmsetup status` output, there are two pairs of '/' separated
> entries - first is metadata usage (USED_BLOCKS/ALL_BLOCKS), second data
> usage (USED_CHUNKS/ALL_CHUNKS).
>
> So the error lies somewhere between dmsetup and kernel.
>
> What is kernel/lvm version?
> Is thin_check_executable configured in lvm.conf?
>
> -- Martian
>