[vfio-users] Allocating hugepages to a specific numa node on boot

Ryan Flagler ryan.flagler at gmail.com
Wed May 17 19:48:12 UTC 2017


I forgot to add the following: you also need to update your
/etc/default/grub entry with the matching hugepagesz=2M parameter, as shown
below. The service won't activate unless you configure this, since the unit
conditions on that kernel command line option.

GRUB_CMDLINE_LINUX_DEFAULT="nomodeset quiet splash intel_iommu=on
hugepagesz=2M pci_stub.ids=10de:13c2,10de:0fbb,104c:8241"
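
After editing /etc/default/grub you also need to regenerate the grub
configuration so the new parameters land on the kernel command line, and
after the reboot you can confirm the kernel sees 2MB hugepages. A minimal
sketch for Ubuntu:

sudo update-grub                  # rebuild /boot/grub/grub.cfg from /etc/default/grub
grep Hugepagesize /proc/meminfo   # should report 2048 kB once the new cmdline is active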

On Wed, May 17, 2017 at 2:44 PM Ryan Flagler <ryan.flagler at gmail.com> wrote:

> So, I just wanted to share the process I found for allocating hugepages
> from memory tied to a specific numa node on your system. Previously, my
> /etc/default/grub looked like this.
>
> GRUB_CMDLINE_LINUX_DEFAULT="nomodeset quiet splash intel_iommu=on
> hugepages=6144 pci_stub.ids=10de:13c2,10de:0fbb,104c:8241"
>
> This would allocate 12GB of hugepages (utilizing the default size of
> 2MB/hugepage). However, my allocation would look like this.
>
> cat /sys/devices/system/node/node*/meminfo | fgrep Huge
> Node 0 AnonHugePages:     20480 kB
> Node 0 HugePages_Total:  3072
> Node 0 HugePages_Free:   3072
> Node 0 HugePages_Surp:      0
> Node 1 AnonHugePages:         0 kB
> Node 1 HugePages_Total:  3072
> Node 1 HugePages_Free:   3072
> Node 1 HugePages_Surp:      0
>
> As you can see, half of the hugepages were allocated to each of my numa
> nodes. In my VM configuration I had all of the VM's CPUs pinned to numa
> node 1 and allocated 8GB of memory to the VM, but with only 6GB of
> hugepages on node 1, some of its hugepage memory was coming from numa
> node 1 and some from numa node 0.
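>
> (For reference, if you're using libvirt you can check from the host where
> the vCPUs and guest memory are bound. This is just a sketch assuming a
> hypothetical domain name "gamingvm"; substitute your own.)
> virsh vcpupin gamingvm    # lists the host CPUs each vCPU is pinned to
> virsh numatune gamingvm   # shows the host numa nodeset the guest memory is bound to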
>
> If you're dynamically allocating hugepages on the fly, you can specify
> which numa node the memory comes from by writing to that node's sysfs
> entry, something like this.
> echo 6144 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
>
> The problem is that memory on numa node1 may already be too fragmented to
> give you 6144 contiguous 2MB pages by the time you need them (the
> read-back check below shows how much you actually got).
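>
> (A quick way to see how much of a request the kernel actually managed to
> reserve is to write the count and read it straight back; just a sketch,
> using node1 and the 2MB pool:)
> echo 6144 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
> cat /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages   # may come back lower if memory is fragmented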
>
> I came across the following page, which documents how to allocate
> hugepages to specific numa nodes at boot on RHEL7 using systemd.
>
> http://fibrevillage.com/sysadmin/536-how-to-enable-and-config-hugepage-and-transparent-hugepages-on-rhel-centos
>
> Thankfully, I'm running Ubuntu 16.04, which also uses systemd. There
> were a couple of differences, so here's what I did.
>
> Create /lib/systemd/system/hugetlb-gigantic-pages.service with the
> following contents
> [Unit]
> Description=HugeTLB Gigantic Pages Reservation
> DefaultDependencies=no
> Before=dev-hugepages.mount
> ConditionPathExists=/sys/devices/system/node
> ConditionKernelCommandLine=hugepagesz=2M
> [Service]
> Type=oneshot
> RemainAfterExit=yes
> ExecStart=/lib/systemd/hugetlb-reserve-pages
> [Install]
> WantedBy=sysinit.target
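>
> (If you want to sanity-check the unit before rebooting, systemd can lint
> it for you; a small sketch, assuming the file was saved to the path above:)
> systemd-analyze verify /lib/systemd/system/hugetlb-gigantic-pages.service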
>
> Create /lib/systemd/hugetlb-reserve-pages with the following contents
> #!/bin/sh
> nodes_path=/sys/devices/system/node/
> if [ ! -d "$nodes_path" ]; then
>     echo "ERROR: $nodes_path does not exist"
>     exit 1
> fi
>
> # Reserve $1 2MB hugepages on the numa node named by $2
> reserve_pages()
> {
>     echo "$1" > "$nodes_path/$2/hugepages/hugepages-2048kB/nr_hugepages"
> }
>
> # This example reserves 2 2MB pages on node0 and 1 2MB page on node1.
> # You can modify it to your needs or add more lines to reserve memory on
> # other nodes. Don't forget to uncomment the lines, otherwise they won't
> # be executed.
> # reserve_pages 2 node0
> # reserve_pages 1 node1
> reserve_pages 6144 node1
>
> Note the uncommented line at the bottom, which reserves all 6144 pages
> (12GB of 2MB pages) on numa node1.
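>
> (You can also run the script once by hand to confirm it does what you
> expect; keep in mind a long-running system may be too fragmented to hand
> out the full amount, which is exactly why reserving at boot is preferable:)
> sudo sh /lib/systemd/hugetlb-reserve-pages
> cat /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages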
>
> Make the script executable and enable the service
> chmod +x /lib/systemd/hugetlb-reserve-pages
> systemctl enable hugetlb-gigantic-pages
>
> Reboot
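>
> (If the pages don't show up after the reboot, first confirm the new kernel
> command line actually carries the hugepagesz=2M parameter the unit
> conditions on, for example:)
> grep -o 'hugepagesz=[^ ]*' /proc/cmdline   # should print hugepagesz=2M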
>
> After reboot, I saw the following.
> cat /sys/devices/system/node/node*/meminfo | fgrep Huge
> Node 0 AnonHugePages:    169984 kB
> Node 0 HugePages_Total:     0
> Node 0 HugePages_Free:      0
> Node 0 HugePages_Surp:      0
> Node 1 AnonHugePages:    184320 kB
> Node 1 HugePages_Total:  6144
> Node 1 HugePages_Free:   6144
> Node 1 HugePages_Surp:      0
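>
> (If you have the numactl tools installed, numastat gives a similar
> per-node view in one shot; a sketch, assuming the package is present:)
> numastat -m | grep -i huge   # per-node HugePages_Total/Free/Surp plus AnonHugePages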
>
> And after starting my VM with all CPUs pinned on Node 1 and with 8GB of
> memory, I see this.
> cat /sys/devices/system/node/node*/meminfo | fgrep Huge
> Node 0 AnonHugePages:    724992 kB
> Node 0 HugePages_Total:     0
> Node 0 HugePages_Free:      0
> Node 0 HugePages_Surp:      0
> Node 1 AnonHugePages:    270336 kB
> Node 1 HugePages_Total:  6144
> Node 1 HugePages_Free:   2048
> Node 1 HugePages_Surp:      0
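>
> (To double-check that the guest's hugepage memory really landed on node 1,
> you can also look at the qemu process itself; a sketch, assuming the usual
> qemu-system-x86_64 binary name and the numactl tools:)
> numastat -p $(pidof qemu-system-x86_64)   # per-node breakdown of the process, huge pages included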
>
> Lastly, here is the status output of the systemd job.
> service hugetlb-gigantic-pages status
> ● hugetlb-gigantic-pages.service - HugeTLB Gigantic Pages Reservation
>    Loaded: loaded (/lib/systemd/system/hugetlb-gigantic-pages.service;
> enabled; vendor preset: enabled)
>    Active: active (exited) since Wed 2017-05-17 12:01:49 CDT; 3min 17s ago
>   Process: 872 ExecStart=/lib/systemd/hugetlb-reserve-pages (code=exited,
> status=0/SUCCESS)
>  Main PID: 872 (code=exited, status=0/SUCCESS)
>     Tasks: 0
>    Memory: 0B
>       CPU: 0
>    CGroup: /system.slice/hugetlb-gigantic-pages.service
>
> Hopefully this helps someone else!
>