Osquery is an open source, cross-platform tool that allows you to obtain information about your system using SQL queries. My previous article explained how to use Osquery to query data about a system interactively. Running a query when needed is an excellent way to become comfortable with Osquery's SQL language and provides a convenient method for quick data collection on a system.

However, the real power of Osquery is its ability to scrape data about a system regularly. Osquery can run as a daemon and execute scheduled queries, allowing you to collect and process data on a regular cadence and respond to changes in the state of your systems.

Run a basic scheduled query

Setting up a basic scheduled query involves adding the query to Osquery's configuration file and starting the Osquery daemon. The default configuration file is located at /etc/osquery/osquery.conf, although you can change this by passing flags to the service.

The configuration is a JSON object that specifies certain global options and defines a schedule of queries to execute. The example file below will run a query every five seconds to obtain the user ID (UID), username, and shell for any users with a UID greater than or equal to 1000:

[root@fedora ~]# cat /etc/osquery/osquery.conf 
{
  "options": {
    "host_identifier": "hostname"
  },
  "schedule": {
    "users": {
      "query": "SELECT uid,username,shell FROM users WHERE uid >= 1000;",
      "interval": 5
    }
  }
}
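Because a typo in the configuration is easy to miss, it can help to sanity-check the file before starting the daemon. The following is a minimal Python sketch, not part of Osquery itself: the function name summarize_schedule is hypothetical, and it assumes the file is strict JSON and that every scheduled query carries the "query" and "interval" keys used in this article.

```python
import json

# Same content as the example /etc/osquery/osquery.conf shown above.
EXAMPLE_CONFIG = """
{
  "options": {"host_identifier": "hostname"},
  "schedule": {
    "users": {
      "query": "SELECT uid,username,shell FROM users WHERE uid >= 1000;",
      "interval": 5
    }
  }
}
"""


def summarize_schedule(config_text):
    """Return a list of (name, interval_seconds, is_snapshot) tuples.

    Hypothetical helper: raises json.JSONDecodeError on malformed JSON and
    ValueError when a scheduled query lacks "query" or "interval".
    """
    config = json.loads(config_text)
    summary = []
    for name, entry in config.get("schedule", {}).items():
        if "query" not in entry or "interval" not in entry:
            raise ValueError(
                f"scheduled query {name!r} needs both 'query' and 'interval'"
            )
        is_snapshot = bool(entry.get("snapshot", False))
        summary.append((name, int(entry["interval"]), is_snapshot))
    return summary


if __name__ == "__main__":
    for name, interval, is_snapshot in summarize_schedule(EXAMPLE_CONFIG):
        kind = "snapshot" if is_snapshot else "differential"
        print(f"{name}: every {interval}s ({kind})")
```

To check a real file, read it with open("/etc/osquery/osquery.conf").read() and pass the text to the same function.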

Once the configuration file is in place, you can use the osqueryctl command to start, restart, or stop the Osquery daemon:

[root@fedora ~]# osqueryctl start

The daemon will start and begin executing the scheduled queries. Osquery logs to the filesystem by default by sending JSON output to a log file located at /var/log/osquery/osqueryd.results.log. Each log entry contains metadata, such as the query execution time and the columns of data that the query returned:

[root@fedora ~]# cat /var/log/osquery/osqueryd.results.log | jq
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:40:10 2022 UTC",
  "unixTime": 1664901610,
  "epoch": 0,
  "counter": 0,
  "numerics": false,
  "columns": {
    "shell": "/sbin/nologin",
    "uid": "65534",
    "username": "nobody"
  },
  "action": "added"
}
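If you want to process these entries in a program rather than read them with jq, each one can be parsed as ordinary JSON. The sketch below is illustrative only: parse_result is a hypothetical helper, and the sample line is the entry shown above collapsed onto one line. Note that the column values arrive as strings in this output (the entry reports "numerics": false), so the sketch converts the UID explicitly.

```python
import json
from datetime import datetime, timezone

# The log entry shown above, written as a single line of JSON.
SAMPLE_LINE = (
    '{"name":"users","hostIdentifier":"fedora",'
    '"calendarTime":"Tue Oct  4 16:40:10 2022 UTC","unixTime":1664901610,'
    '"epoch":0,"counter":0,"numerics":false,'
    '"columns":{"shell":"/sbin/nologin","uid":"65534","username":"nobody"},'
    '"action":"added"}'
)


def parse_result(line):
    """Turn one results-log entry into a small dict with typed values."""
    entry = json.loads(line)
    columns = entry["columns"]
    return {
        "query": entry["name"],
        "host": entry["hostIdentifier"],
        # unixTime is seconds since the epoch; keep the result in UTC.
        "when": datetime.fromtimestamp(entry["unixTime"], tz=timezone.utc),
        "action": entry["action"],
        "uid": int(columns["uid"]),  # logged as a string in this example
        "username": columns["username"],
    }


if __name__ == "__main__":
    parsed = parse_result(SAMPLE_LINE)
    print(parsed["when"].isoformat(), parsed["action"], parsed["username"])
```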

Scheduled queries provide a differential between each point in time when the query was run. The "action" field in the JSON log indicates whether a row in the table was added or removed since the query was last run. You can see this in action by adding two users to the system and looking at the query results:

[root@fedora ~]# useradd testuser
[root@fedora ~]# useradd testuser2

The query results show the addition of two users. The action is added, indicating that these rows of data have been added since the last query:

[root@fedora ~]# cat /var/log/osquery/osqueryd.results.log | jq
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:40:10 2022 UTC",
  "unixTime": 1664901610,
  "epoch": 0,
  "counter": 0,
  "numerics": false,
  "columns": {
    "shell": "/sbin/nologin",
    "uid": "65534",
    "username": "nobody"
  },
  "action": "added"
}
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:41:35 2022 UTC",
  "unixTime": 1664901695,
  "epoch": 0,
  "counter": 1,
  "numerics": false,
  "columns": {
    "shell": "/bin/bash",
    "uid": "1000",
    "username": "testuser"
  },
  "action": "added"
}
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:41:40 2022 UTC",
  "unixTime": 1664901700,
  "epoch": 0,
  "counter": 2,
  "numerics": false,
  "columns": {
    "shell": "/bin/bash",
    "uid": "1001",
    "username": "testuser2"
  },
  "action": "added"
}

Finally, you can delete the users from the system, and Osquery will report them as removed:

[root@fedora ~]# userdel -r testuser
[root@fedora ~]# userdel -r testuser2

The next set of query results shows that these rows have been removed since the last run of the query, reflecting the deleted users:

[root@fedora ~]# tail -n 2 /var/log/osquery/osqueryd.results.log | jq
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:47:10 2022 UTC",
  "unixTime": 1664902030,
  "epoch": 0,
  "counter": 3,
  "numerics": false,
  "columns": {
    "shell": "/bin/bash",
    "uid": "1000",
    "username": "testuser"
  },
  "action": "removed"
}
{
  "name": "users",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 16:47:15 2022 UTC",
  "unixTime": 1664902035,
  "epoch": 0,
  "counter": 4,
  "numerics": false,
  "columns": {
    "shell": "/bin/bash",
    "uid": "1001",
    "username": "testuser2"
  },
  "action": "removed"
}

This differential approach allows you to build tooling that monitors Osquery logs and reports on system state changes. This is very useful for observability. Being able to answer questions about system changes is important for security, incident response, troubleshooting, and outage response situations.
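As a rough illustration of such tooling, the Python sketch below replays "added" and "removed" events in order to reconstruct the current set of rows for one scheduled query. It is a sketch under assumptions, not an official Osquery client: the replay and event helpers are hypothetical, the log is assumed to hold one JSON object per line in the per-row format shown above, and log rotation is ignored.

```python
import json


def event(action, uid, username, shell):
    """Build a sample log line shaped like the entries shown in this article."""
    return json.dumps({
        "name": "users",
        "action": action,
        "columns": {"uid": uid, "username": username, "shell": shell},
    })


def replay(lines, query_name="users"):
    """Apply added/removed events in order and return the surviving rows.

    Each row is stored as a sorted tuple of (column, value) pairs so that it
    can live in a set regardless of key order in the log.
    """
    current = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("name") != query_name:
            continue  # entry belongs to a different scheduled query
        row = tuple(sorted(entry["columns"].items()))
        if entry["action"] == "added":
            current.add(row)
        elif entry["action"] == "removed":
            current.discard(row)
    return current


# The same sequence of changes walked through above.
EVENTS = [
    event("added", "65534", "nobody", "/sbin/nologin"),
    event("added", "1000", "testuser", "/bin/bash"),
    event("added", "1001", "testuser2", "/bin/bash"),
    event("removed", "1000", "testuser", "/bin/bash"),
    event("removed", "1001", "testuser2", "/bin/bash"),
]

if __name__ == "__main__":
    for row in sorted(replay(EVENTS)):
        print(dict(row))
```

Because replay accepts any iterable of lines, an open file handle for /var/log/osquery/osqueryd.results.log can be passed in place of the sample list.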

Scheduled snapshots

A differential query doesn't always make sense for a particular dataset. Sometimes, you need a complete point-in-time view of the data that a query returns. For example, a query to monitor memory utilization should return a complete picture each time it runs. Osquery supports this type of scheduled query with the "snapshot" parameter in the query's configuration.

The configuration below schedules an additional query to run every 15 seconds and collect data about memory utilization on the system:

[root@fedora ~]# cat /etc/osquery/osquery.conf
{
  "options": {
    "host_identifier": "hostname"
  },
  "schedule": {
    "users": {
      "query": "SELECT uid,username,shell FROM users WHERE uid >= 1000;",
      "interval": 5
    },
    "memory_info": {
      "query": "SELECT memory_total,memory_free,memory_available,buffers FROM memory_info;",
      "interval": 15,
      "snapshot": true
    }
  }
}

Once you modify the configuration, you must restart Osquery for the changes to take effect:

[root@fedora ~]# osqueryctl restart

Snapshot results are saved as JSON to a different log file located at /var/log/osquery/osqueryd.snapshots.log. Each JSON object in this log file contains a complete picture of the query results at the time the query was executed. The full result set is stored in a "snapshot" array, and the "action" field is always "snapshot" rather than "added" or "removed":

[root@fedora ~]# cat /var/log/osquery/osqueryd.snapshots.log | jq
{
  "snapshot": [
    {
      "buffers": "1708032",
      "memory_available": "1471823872",
      "memory_free": "855764992",
      "memory_total": "2066640896"
    }
  ],
  "action": "snapshot",
  "name": "memory_info",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 17:12:28 2022 UTC",
  "unixTime": 1664903548,
  "epoch": 0,
  "counter": 0,
  "numerics": false
}
{
  "snapshot": [
    {
      "buffers": "1708032",
      "memory_available": "1471827968",
      "memory_free": "855764992",
      "memory_total": "2066640896"
    }
  ],
  "action": "snapshot",
  "name": "memory_info",
  "hostIdentifier": "fedora",
  "calendarTime": "Tue Oct  4 17:12:42 2022 UTC",
  "unixTime": 1664903562,
  "epoch": 0,
  "counter": 0,
  "numerics": false
}

Snapshots allow you to collect a complete view of data at a specific point in time, and they are excellent for queries where a differential view of data doesn't make sense.
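A snapshot entry can be consumed on its own, without reference to earlier entries. The Python sketch below derives a memory-usage percentage from each row of a snapshot entry. It is illustrative only: memory_used_percent is a hypothetical helper, "used" is taken to mean memory_total minus memory_available (one common definition, not the only one), and the values are converted from the strings seen in the output above.

```python
import json

# The first snapshot entry shown above, reduced to the fields used here.
SAMPLE_SNAPSHOT = json.dumps({
    "snapshot": [
        {
            "buffers": "1708032",
            "memory_available": "1471823872",
            "memory_free": "855764992",
            "memory_total": "2066640896",
        }
    ],
    "action": "snapshot",
    "name": "memory_info",
})


def memory_used_percent(line):
    """Return one usage percentage per row of a memory_info snapshot entry.

    Returns an empty list if the line is not a snapshot of memory_info.
    """
    entry = json.loads(line)
    if entry.get("action") != "snapshot" or entry.get("name") != "memory_info":
        return []
    percentages = []
    for row in entry["snapshot"]:
        total = int(row["memory_total"])
        available = int(row["memory_available"])
        percentages.append(round(100 * (total - available) / total, 1))
    return percentages


if __name__ == "__main__":
    for percent in memory_used_percent(SAMPLE_SNAPSHOT):
        print(f"memory in use: {percent}%")
```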

Wrap up

In this article, you extended your Osquery knowledge to build scheduled queries that can regularly collect data about a system. You learned how to run Osquery as a daemon and saw how queries could provide a differential view or a point-in-time snapshot of the system state.

Osquery is a very powerful tool, and this two-part series has only scratched the surface of its capabilities. If you are interested in learning more about Osquery, check out the official documentation for a deeper dive into Osquery's underlying architecture and features.


About the author

Anthony Critelli is a Linux systems engineer with interests in automation, containerization, tracing, and performance. He started his professional career as a network engineer and eventually made the switch to the Linux systems side of IT. He holds a B.S. and an M.S. from the Rochester Institute of Technology.
