How to support GDPR compliance with automation

Last Updated: December 31, 2018

In May 2018, the European Union (EU) will start enforcing the biggest change in data privacy regulation in the last 20 years. The EU General Data Protection Regulation (GDPR) will affect a variety of roles in every EU entity that processes personal data.

GDPR privacy violations can be complicated to detect, correct, and avoid. At scale and in agile DevOps environments, compliance may be very challenging to achieve without automation. Moreover, existing tools to collect, aggregate, and protect data may lack native capabilities to comply with GDPR. However, they may offer application programming interfaces (APIs) and other programmatic access methods that can be automated to support compliance.

IT organizations can help data protection officers meet the challenges of enforcing GDPR compliance in many ways. An easy-to-use automation platform featuring centralized management and a modular architecture for a variety of use cases can provide robust GDPR support.

Red Hat® Ansible® Automation, an enterprise open source automation platform, can be used independently or with other tools to provide support for critical compliance capabilities, such as automated data discovery and mapping, automated protection of sensitive data, automated disposal of noncompliant data, automated delivery of customer data to meet Right of Access inquiries, deletion of customer data to meet Right to Erasure inquiries, or automated export of customer data to meet Right to Portability requests.


The GDPR is the European Union’s reform of its privacy framework that aims to harmonize its disparate national data privacy laws. It introduces extensive obligations for companies that collect, use, or otherwise process personal data.

Under the GDPR, personal data can be processed only if at least one lawful basis applies: either the data subject has consented to the processing of personal data for one or more specific purposes, or the processing is necessary:

  • For the performance of a contract to which the data subject is party or to take steps at the request of the data subject prior to entering into a contract.
  • For compliance with a legal obligation to which the controller is subject.
  • To protect the vital interests of the data subject or of another natural person.
  • For the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.
  • For the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child.

The scope of the GDPR goes beyond territorial boundaries of the EU to include organizations that are offering services or goods to individuals in the EU or are monitoring an individual’s behavior in the EU.

Companies found in violation of the GDPR can be fined up to 4% of their global annual revenue or €20 million, whichever is higher.


In June 2017, Gartner released two strategic assumptions:

  1. On 25 May 2018, less than 50% of all organizations impacted will fully comply with the GDPR.
  2. Before 2020, we will have seen a multimillion Euro regulatory sanction for GDPR noncompliance.

According to Gartner, these assumptions are based on the complexity of detecting, correcting, and avoiding privacy violations, especially at scale and in agile DevOps environments. Privacy violations are:

  • Difficult to detect because, more often than not, customer data is collected through various channels across geographies and lines of business, as well as across on-premises, cloud, and hybrid infrastructures, and it is not always aggregated through a unified data management strategy. This sprawl makes it challenging to assess what data is available, what data is of legitimate interest and sensitive enough to be protected, and what data must be disposed of to comply with GDPR.
  • Difficult to correct because solutions depend on performing large-scale customer data activities, such as applying pseudonymization, tokenization, or erasure. When data systems lack native capabilities for these tasks, the next best option is using existing APIs and other programmatic data access methods.
  • Difficult to avoid because continuous compliance requires significant changes in how data is acquired, a way to centrally aggregate data (for example, data lakes), and a system to automate compliance verification and reporting.

In addition, the EU has not been overly prescriptive about the tools and operational frameworks necessary for compliance. The most obvious solutions for complying with the GDPR are privacy management tools (such as integrated risk management), consent and cookie management tools, and privacy control tools, including data life-cycle management and pseudonymization tools such as tokenization or masking. However, these tools must be automated to sustain compliance at scale.


Automation is a versatile and powerful capability that can be used to accomplish a broad range of IT tasks, such as provisioning compute, network, and storage infrastructure, configuring operating systems, provisioning multitier applications, and enforcing security policy and compliance rules.

There is a significant chance that your IT organization is already using one or more automation tools for some of these tasks, or as part of strategic initiatives like DevOps. The more flexible and easy to use your automation tool is, the easier it is to use it to address GDPR compliance.

In fact, according to Joerg Fritsch, research director of Gartner’s Security and Risk Management Strategies team, “Technical professionals can support the GDPR program by using technology to make a process repeatable and scalable.”

Red Hat Ansible Automation is an IT automation platform featuring an easy-to-learn language, powerful centralized management features, and a modular architecture to address GDPR compliance issues in multiple scenarios.


At scale, even the process of discovering data becomes challenging. Red Hat Ansible Automation can simplify the discovery and mapping of electronically stored information (ESI), for example by deploying e-discovery products across an entire IT infrastructure.

Red Hat Ansible Engine is agentless and connects to target systems through Secure Shell (SSH). IT teams can write automation workflows, or Ansible Playbooks, that use a simple language to describe and dictate how the e-discovery agent is deployed and executed on target systems.

In the following example, an Ansible Playbook ensures that the agent of an e-discovery solution is installed and starts properly. In case the agent is not running as expected, the playbook sends a warning message:


- name: Ensure operating discovery agent
  hosts: all
  become: yes
  tasks:
    # Render the agent configuration from a Jinja2 template
    - name: Ensure latest ediscovery configuration
      template:
        src: templates/ediscovery-agent.j2
        dest: /etc/ediscovery-agent.conf

    # The generic package module is assumed here; use yum or apt if preferred
    - name: Ensure latest ediscovery agent is installed
      package:
        name: ediscovery-agent
        state: latest

    # "changed" is reported only if the service had to be started
    - name: Ensure ediscovery agent is actually running
      service:
        name: ediscovery-agent
        state: started
      register: ediscovery_status

    - name: notify admins if agent was not already running
      slack:
        token: token/generatedby/slack
        msg: "{{ inventory_hostname }} had no running eDiscovery agent!"
      when: ediscovery_status.changed


Red Hat Ansible Engine works across on-premises and public cloud infrastructures, supporting both Linux® and Windows operating systems, ensuring broader access to corporate ESI.

While Ansible Engine manages playbook execution, Red Hat Ansible Tower offers a centralized view of which playbooks have been executed, in which environment, providing instant visibility into what systems are yet to be analyzed and where data collection failed to start (Figure 1).

Figure 1. Red Hat Ansible Tower offers a centralized playbook interface



Even when e-discovery tools cannot be automated, the automation engine by itself can be used to partially accomplish the goal. For example, Ansible can tag resources, then store a map of resources’ physical locations in a configuration management database (CMDB) to address data sovereignty for the GDPR.


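- name: map important files into the CMDB
  hosts: all
  tasks:
    # The Windows-style path suggests win_find; the C: system drive is an assumption
    - name: get list with all important files
      win_find:
        paths: C:\Users
        recurse: True
        patterns: ['*.doc', '*.docx', '*.xls', '*.xlsx', '*.pdf']
      register: all_files

    # Sent from the control node, because the uri module needs Python on the host that runs it
    - name: Submit file list to CMDB via REST
      uri:
        url: "https://your.cmdb.example.com/rest/api/2/{{ inventory_hostname }}/files"
        method: POST
        body: "{{ all_files.files | map(attribute='path') | list }}"
        body_format: json
      delegate_to: localhost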


As the GDPR does not dictate the specific tools and methods IT organizations must use to stay compliant, sensitive data can be protected in a variety of ways depending on your risk assessment. In some cases, advanced protection approaches, like encryption, might be necessary. In this scenario, automation is critical to successful implementation.

Ansible can apply device encryption to all machines that match certain GDPR criteria in the inventory. In the following playbook, Ansible performs the task without native integration with third-party tools:


- name: ensure encrypted device is opened and mounted
  hosts: all
  become: yes
  tasks:
    - name: check if device is available
      stat:
        path: /dev/mapper/lukscrypt
      register: stat_cryptdata

    # no_log keeps the passphrase out of the task output and logs
    - name: decrypt device if necessary
      shell: echo -n "{{ luks_pwd }}" | cryptsetup luksOpen /dev/md/0 lukscrypt
      when: not stat_cryptdata.stat.exists
      no_log: true

    - name: mount crypted device
      mount:
        path: /data
        src: /dev/mapper/lukscrypt
        fstype: xfs  # assumed; the mount module requires fstype when state is mounted
        state: mounted
        opts: noauto


Regardless of the compliance methods used, organizations must prove their ongoing compliance. Automation can be used in conjunction with self-healing management solutions, such as Red Hat Insights, which provides real-time assessment of protected systems and automated remediation for noncompliant scenarios.

In Figure 2, Red Hat Insights has identified a security vulnerability that can be fixed by executing an Ansible Playbook. The playbook is automatically generated and made available for download and manual execution on the affected targets. Alternatively, it can be imported in Ansible Tower for centralized execution.

Figure 2. Red Hat Insights security vulnerability identification




Once sensitive data has been discovered, mapped, and protected, all other noncompliant data can be discarded. This task can be accomplished at scale by writing specific playbooks that Ansible Engine will execute on target systems as necessary.

Automation is especially valuable in this scenario, as noncompliant data might be stored in backup media and offline storage that is extremely resource-intensive to process manually.

For example, Ansible can execute a playbook that accesses backup servers, temporarily mounts a defined device, and ensures that all files from a given list are not present. Finally, the playbook can send a report with the list of deleted files via email:


- name: ensure given files are not present anywhere
  hosts: backup-servers
  become: yes
  tasks:
    - name: mount backup dirs
      mount:
        path: /mnt/backup
        src: /dev/dm0
        fstype: xfs
        state: mounted

    # forbiddenfiles.txt holds one whitespace-separated path per entry
    - name: ensure files from forbidden list are deleted
      file:
        path: "{{ item }}"
        state: absent
      loop: "{{ lookup('file', 'forbiddenfiles.txt').split() }}"
      register: found_files

    # Only items that were actually deleted report "changed"
    - name: send report of deleted files via mail
      mail:
        host: smtp.gmail.com
        port: 587
        username:   # SMTP account name (value not given in the source)
        password: "{{ secret_from_vault }}"
        to: Backup Reporting <backup.reporting@example.com>
        subject: Backup file deletion report
        body: "System {{ ansible_hostname }} deleted the files {{ found_files.results | selectattr('changed') | map(attribute='item') | list }}"
      when: found_files.changed

    - name: umount backup dirs
      mount:
        path: /mnt/backup
        state: unmounted



The automated processes in the three use cases described can be combined in a master playbook, then executed manually or automatically, on an ad hoc or recurring basis, against any new IT system deployed within your IT environment.
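
As a minimal sketch of such a master playbook, the three use cases could be chained with Ansible's import_playbook directive. The file names below are hypothetical and assume each playbook above has been saved to its own file:

```yaml
# gdpr-compliance.yml (hypothetical file name)
# Each imported file is assumed to contain one of the playbooks shown above.
- name: Deploy and verify the e-discovery agent
  import_playbook: ediscovery-agent.yml

- name: Open and mount encrypted storage
  import_playbook: encrypted-storage.yml

- name: Dispose of noncompliant files on backup servers
  import_playbook: backup-disposal.yml
```

A run restricted to newly deployed systems might look like `ansible-playbook -i inventory gdpr-compliance.yml --limit new-hosts`, where `new-hosts` is an assumed inventory group. For recurring execution, the same playbook could be scheduled as a job in Ansible Tower.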

Red Hat Ansible Automation can be used in a variety of additional compliance scenarios, such as automated delivery of customer data upon Right of Access inquiries by the data subject (GDPR Art. 15), the deletion of customer data upon Right to Erasure—or right to be forgotten—inquiries by the data subject (GDPR Art. 17), or the automated export of customer data in a desired format upon a Right to Portability request by the data subject (GDPR Art. 20).
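
To illustrate one of these scenarios, the following sketch handles a Right to Erasure request by calling a REST API with the uri module. The CRM endpoint, the `customer_id` and `crm_api_token` variables, and the accepted status codes are all assumptions for illustration; a real system of record would define its own interface:

```yaml
- name: Erase a data subject's record on request (GDPR Art. 17)
  hosts: localhost
  gather_facts: no
  vars:
    # Hypothetical API base URL; replace with your system of record
    crm_api: https://crm.example.com/api/v1
  tasks:
    - name: Delete the customer record
      uri:
        url: "{{ crm_api }}/customers/{{ customer_id }}"
        method: DELETE
        headers:
          Authorization: "Bearer {{ crm_api_token }}"
        # 404 is accepted so that a repeated request stays idempotent
        status_code: [200, 204, 404]
      register: erasure_result

    - name: Report the outcome for the audit trail
      debug:
        msg: "Erasure request for customer {{ customer_id }} returned HTTP {{ erasure_result.status }}"
```

The customer identifier would be passed at run time, for example with `-e customer_id=12345`, and the token would normally come from Ansible Vault rather than the command line.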

To get started with Red Hat Ansible Automation, visit redhat.com/en/technologies/management/ansible/get-started.

Need help developing an automation strategy for compliance? Get started with Red Hat Consulting at redhat.com/consulting.

For additional information about the GDPR, review the following resources: