Osquery is an open source, cross-platform tool that allows you to obtain information about your system using a SQL query language. My previous article explained how to use Osquery to query data about a system interactively. Running a query when needed is an excellent way to become comfortable with Osquery's SQL language and provides a convenient method for quick data collection on a system.

However, the real power of Osquery is its ability to scrape data about a system regularly. Osquery can run as a daemon and execute scheduled queries, allowing you to collect and process data on a regular cadence and respond to changes in the state of your systems.

Run a basic scheduled query

Setting up a basic scheduled query involves adding the query to Osquery's configuration file and starting the Osquery daemon. The default configuration file is located at /etc/osquery/osquery.conf, although you can point the daemon at a different file by passing the --config_path flag to the service.

The configuration is a JSON object that specifies certain global options and defines a schedule of queries to execute. The example file below will run a query every five seconds to obtain the user ID (UID), username, and shell for any users with a UID greater than or equal to 1000:

[root@fedora ~]# cat /etc/osquery/osquery.conf 
{
  "options": {
    "host_identifier": "hostname"
  },
  "schedule": {
    "users": {
      "query": "SELECT uid,username,shell FROM users WHERE uid >= 1000;",
      "interval": 5
    }
  }
}
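
Before starting the daemon, it can help to sanity-check the file. The following is a minimal Python sketch, not part of Osquery itself: the function name check_schedule and the EXAMPLE string are hypothetical, and the rules it enforces (every scheduled entry has a non-empty "query" and a positive integer "interval") are assumptions that may be stricter than what Osquery actually accepts.

```python
import json

def check_schedule(config_text):
    """Return a list of problems found in an Osquery-style config string."""
    problems = []
    config = json.loads(config_text)  # raises ValueError on malformed JSON
    schedule = config.get("schedule", {})
    if not schedule:
        problems.append("no scheduled queries defined")
    for name, entry in schedule.items():
        query = entry.get("query")
        if not isinstance(query, str) or not query.strip():
            problems.append(f"{name}: missing 'query'")
        interval = entry.get("interval")
        # Assumption: intervals are plain positive integers (seconds).
        if not isinstance(interval, int) or isinstance(interval, bool) or interval <= 0:
            problems.append(f"{name}: 'interval' must be a positive integer")
    return problems

# The same configuration as the example file above.
EXAMPLE = """
{
  "options": {"host_identifier": "hostname"},
  "schedule": {
    "users": {
      "query": "SELECT uid,username,shell FROM users WHERE uid >= 1000;",
      "interval": 5
    }
  }
}
"""

print(check_schedule(EXAMPLE))
```

An empty list means the sketch found nothing to complain about. To check the real file, read /etc/osquery/osquery.conf and pass its contents to the function.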

Once the configuration file is in place, you can use the osqueryctl command to start, restart, or stop the Osquery daemon:

[root@fedora ~]# osqueryctl start

The daemon will start and begin executing the scheduled queries. By default, Osquery logs to the filesystem, writing JSON output to /var/log/osquery/osqueryd.results.log. Each log entry contains metadata, such as the query name and execution time, along with the columns of data that the query returned:

[root@fedora ~]# cat /var/log/osquery/osqueryd.results.log | jq
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:40:10 2022 UTC",
  "unixTime": 1664901610,
  "epoch": 0,
  "counter": 0,
  "numerics": false,
  "columns": {
    "shell": "/sbin/nologin",
    "uid": "65534",
    "username": "nobody"
  },
  "action": "added"
}
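
If you want to consume these entries from a script rather than read them with jq, the sketch below parses the sample entry above and pulls out a few fields. It is illustrative only: the summarize function is a hypothetical helper, and it assumes each entry occupies a single line in the log file and that column values arrive as strings, as they do in the sample where "numerics" is false.

```python
import json
from datetime import datetime, timezone

# The sample entry shown above, collapsed onto one line.
LINE = (
    '{"name":"users","hostIdentifier":"fedora",'
    '"calendarTime":"Tue Oct  4 16:40:10 2022 UTC","unixTime":1664901610,'
    '"epoch":0,"counter":0,"numerics":false,'
    '"columns":{"shell":"/sbin/nologin","uid":"65534","username":"nobody"},'
    '"action":"added"}'
)

def summarize(line):
    """Extract a few fields from one results-log entry."""
    entry = json.loads(line)
    # unixTime is seconds since the epoch; convert it to an aware UTC datetime.
    when = datetime.fromtimestamp(entry["unixTime"], tz=timezone.utc)
    columns = entry["columns"]
    return {
        "query": entry["name"],
        "host": entry["hostIdentifier"],
        "time": when.isoformat(),
        "action": entry["action"],
        "uid": int(columns["uid"]),  # strings in the sample, so convert explicitly
        "username": columns["username"],
    }

print(summarize(LINE))
```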

Scheduled queries provide a differential between each point in time when the query was run. The "action" field in the JSON log indicates whether a row in the table was added or removed since the query was last run. You can see this in action by adding two users to the system and looking at the query results:

[root@fedora ~]# useradd testuser
[root@fedora ~]# useradd testuser2

The query results show the addition of two users. The action is added, indicating that these rows of data have been added since the last query:

[root@fedora ~]# cat /var/log/osquery/osqueryd.results.log | jq
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:40:10 2022 UTC",
  "unixTime": 1664901610,
  "epoch": 0,
  "counter": 0,
  "numerics": false,
  "columns": {
    "shell": "/sbin/nologin",
    "uid": "65534",
    "username": "nobody"
  },
  "action": "added"
}
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:41:35 2022 UTC",
  "unixTime": 1664901695,
  "epoch": 0,
  "counter": 1,
  "numerics": false,
  "columns": {
    "shell": "/bin/bash",
    "uid": "1000",
    "username": "testuser"
  },
  "action": "added"
}
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:41:40 2022 UTC",
  "unixTime": 1664901700,
  "epoch": 0,
  "counter": 2,
  "numerics": false,
  "columns": {
    "shell": "/bin/bash",
    "uid": "1001",
    "username": "testuser2"
  },
  "action": "added"
}

Finally, you can delete the users from the system, and Osquery will report them as removed:

[root@fedora ~]# userdel -r testuser
[root@fedora ~]# userdel -r testuser2

The next set of query results shows that the rows (and therefore the users) have been removed since the last run of the query:

[root@fedora ~]# tail -n 2 /var/log/osquery/osqueryd.results.log | jq
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:47:10 2022 UTC",
  "unixTime": 1664902030,
  "epoch": 0,
  "counter": 3,
  "numerics": false,
  "columns": {
    "shell": "/bin/bash",
    "uid": "1000",
    "username": "testuser"
  },
  "action": "removed"
}
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:47:15 2022 UTC",
  "unixTime": 1664902035,
  "epoch": 0,
  "counter": 4,
  "numerics": false,
  "columns": {
    "shell": "/bin/bash",
    "uid": "1001",
    "username": "testuser2"
  },
  "action": "removed"
}

This differential approach allows you to build tooling that monitors Osquery logs and reports on system state changes, which is very useful for observability. Being able to answer questions about what changed on a system, and when, is important for security, incident response, troubleshooting, and outage response.
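
As a rough illustration of such tooling, the Python sketch below replays differential entries in order to reconstruct the current set of rows. It is a sketch under stated assumptions, not an official Osquery client: replay and event are hypothetical helpers, the log is assumed to hold one JSON object per line, and uid is assumed to identify a row uniquely.

```python
import json

def replay(lines, query_name="users"):
    """Apply added/removed events in order and return the resulting rows."""
    current = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("name") != query_name:
            continue
        row = entry["columns"]
        key = row["uid"]  # assumption: uid uniquely identifies a row
        if entry["action"] == "added":
            current[key] = row
        elif entry["action"] == "removed":
            current.pop(key, None)
    return current

def event(action, uid, username, shell="/bin/bash"):
    """Build a minimal log line shaped like the entries shown above."""
    return json.dumps({
        "name": "users",
        "columns": {"shell": shell, "uid": uid, "username": username},
        "action": action,
    })

# The same sequence of events as in the walkthrough above.
SAMPLE = [
    event("added", "65534", "nobody", "/sbin/nologin"),
    event("added", "1000", "testuser"),
    event("added", "1001", "testuser2"),
    event("removed", "1000", "testuser"),
    event("removed", "1001", "testuser2"),
]

remaining = replay(SAMPLE)
print(sorted(row["username"] for row in remaining.values()))
```

To run the same logic against a real host, open /var/log/osquery/osqueryd.results.log and pass the file object to replay, since iterating over a file yields one line at a time.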

Scheduled snapshots

A differential query doesn't always make sense for a particular dataset. Sometimes, you need a complete point-in-time view of the data that a query returns. For example, a query to monitor memory utilization should return a complete picture each time it runs. Osquery supports this type of scheduled query with the "snapshot" parameter in the query's configuration.

The configuration below schedules an additional query to run every 15 seconds and collect data about memory utilization on the system:

# cat /etc/osquery/osquery.conf 
{
  "options": {
    "host_identifier": "hostname"
  },
  "schedule": {
    "users": {
      "query": "SELECT uid,username,shell FROM users WHERE uid >= 1000;",
      "interval": 5
    },
    "memory_info": {
      "query": "SELECT memory_total,memory_free,memory_available,buffers FROM memory_info;",
      "interval": 15,
      "snapshot": true
    }
  }
}

Once you modify the configuration, you must restart Osquery for the changes to take effect:

# osqueryctl restart

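Snapshot results are saved as JSON to a different log file located at /var/log/osquery/osqueryd.snapshots.log. Each JSON object in this log file contains a complete picture of the query results at the time the query was executed. Instead of a "columns" object with an "added" or "removed" action, each entry carries the full result set in a "snapshot" array, and the "action" field is set to "snapshot":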

# cat /var/log/osquery/osqueryd.snapshots.log  | jq
{
  "snapshot": [
    {
      "buffers": "1708032",
      "memory_available": "1471823872",
      "memory_free": "855764992",
      "memory_total": "2066640896"
    }
  ],
  "action": "snapshot",
  "name": "memory_info",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 17:12:28 2022 UTC",
  "unixTime": 1664903548,
  "epoch": 0,
  "counter": 0,
  "numerics": false
}
{
  "snapshot": [
    {
      "buffers": "1708032",
      "memory_available": "1471827968",
      "memory_free": "855764992",
      "memory_total": "2066640896"
    }
  ],
  "action": "snapshot",
  "name": "memory_info",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 17:12:42 2022 UTC",
  "unixTime": 1664903562,
  "epoch": 0,
  "counter": 0,
  "numerics": false
}

Snapshots allow you to collect a complete view of data at a specific point in time, and they are excellent for queries where a differential view of data doesn't make sense.
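
Because each snapshot entry is self-contained, a script can compute derived values from a single line. The sketch below estimates memory utilization from one entry. It is illustrative only: memory_used_percent is a hypothetical helper, "used" is defined here as memory_total minus memory_available (an assumption, since other definitions are possible), and the entry is assumed to occupy one line of the log.

```python
import json

def memory_used_percent(line):
    """Return used memory as a percentage of total for one snapshot entry."""
    entry = json.loads(line)
    if entry.get("action") != "snapshot":
        raise ValueError("not a snapshot entry")
    row = entry["snapshot"][0]  # memory_info returns a single row in the sample
    total = int(row["memory_total"])  # values are strings in the sample output
    available = int(row["memory_available"])
    # Assumption: "used" means total minus available.
    return round(100 * (total - available) / total, 1)

# Values copied from the first snapshot entry shown above.
LINE = json.dumps({
    "snapshot": [{
        "buffers": "1708032",
        "memory_available": "1471823872",
        "memory_free": "855764992",
        "memory_total": "2066640896",
    }],
    "action": "snapshot",
    "name": "memory_info",
    "unixTime": 1664903548,
})

print(memory_used_percent(LINE))
```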

Wrap up

In this article, you extended your Osquery knowledge to build scheduled queries that regularly collect data about a system. You learned how to run Osquery as a daemon and saw how scheduled queries can provide either a differential view or a point-in-time snapshot of the system state.

Osquery is a very powerful tool, and this two-part series has only scratched the surface of its capabilities. If you are interested in learning more about Osquery, check out the official documentation for a deeper dive into Osquery's underlying architecture and features.


About the author

Anthony Critelli is a Linux systems engineer with interests in automation, containerization, tracing, and performance. He started his professional career as a network engineer and eventually made the switch to the Linux systems side of IT. He holds a B.S. and an M.S. from the Rochester Institute of Technology.
