Gathering data about a system is one of a system administrator's most basic tasks. Sysadmins query systems for performance, capacity planning, and general inventory purposes. The need to learn about a system is so common that job interview questions regularly assess a candidate's ability to identify interesting information about an unfamiliar system quickly.
Most sysadmins have a collection of scripts, one-liners, and other approaches for collecting essential data about a system. However, these approaches are often brittle and difficult to maintain, and they require deep knowledge of the proper commands to run or files to examine.
Osquery is an open source project that allows you to obtain information about your system using a SQL query language. Osquery is cross-platform and can run both scheduled and ad-hoc queries. This article walks you through installing and using Osquery on Linux. My next article will explain how to schedule queries to collect and process data on a regular cadence and respond to changes in the state of your systems.
Osquery provides official packages for various operating systems on its downloads page. Red Hat systems can install the RPM using DNF:
[root@fedora ~]# dnf install -y https://pkg.osquery.io/rpm/osquery-5.5.1-1.linux.x86_64.rpm
Last metadata expiration check: 0:06:34 ago on Mon 03 Oct 2022 04:42:51 PM EDT.
osquery-5.5.1-1.linux.x86_64.rpm                    7.1 MB/s |  17 MB     00:02
Dependencies resolved.
================================================================================
 Package        Architecture    Version            Repository            Size
================================================================================
Installing:
 osquery        x86_64          5.5.1-1.linux      @commandline          17 M

Transaction Summary
================================================================================
Install  1 Package

Total size: 17 M
Installed size: 50 M
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                    1/1
  Installing       : osquery-5.5.1-1.linux.x86_64                       1/1
  Running scriptlet: osquery-5.5.1-1.linux.x86_64                       1/1
  Verifying        : osquery-5.5.1-1.linux.x86_64                       1/1

Installed:
  osquery-5.5.1-1.linux.x86_64

Complete!
Use Osquery interactively
Osquery provides an interactive query environment similar to a MySQL shell, which is an excellent place to start learning about Osquery's capabilities. You can launch the shell using the osqueryi command:
[root@fedora ~]# osqueryi
Using a virtual database. Need help, type '.help'
osquery> .quit
The Osquery schema contains information about hundreds of aspects of a system. It allows you to query any of this information using a SQL-based syntax. For example, you can query the block devices table to obtain information about the block devices on a Linux system:
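A minimal sketch of such a query follows. It assumes the block_devices table exposes name, size, and type columns, so check the schema for your Osquery version. The same SELECT statement can be typed at the osquery> prompt; here it is passed to osqueryi as an argument so that it runs from a regular shell.

```shell
# Sketch: list block devices. The column names are assumptions
# based on the Osquery schema and may differ between versions.
query='SELECT name, size, type FROM block_devices;'

if command -v osqueryi >/dev/null 2>&1; then
    osqueryi "$query"
else
    echo "osqueryi is not installed; skipping query" >&2
fi
```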
You will often need to filter the data returned by your queries. Osquery supports SQL clauses, such as the WHERE clause, to narrow query results. For example, the query below will select the root user's user ID (UID), group ID (GID), username, and shell. The WHERE clause matches a value of 0 for the UID.
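A query along these lines might look like the sketch below. It assumes the users table names its columns uid, gid, username, and shell, which matches the fields described above. At the osquery> prompt, you would enter only the SELECT statement.

```shell
# Sketch: select the root user's UID, GID, username, and shell.
# The WHERE clause keeps only the row whose uid is 0.
query='SELECT uid, gid, username, shell FROM users WHERE uid = 0;'

if command -v osqueryi >/dev/null 2>&1; then
    osqueryi "$query"
else
    echo "osqueryi is not installed; skipping query" >&2
fi
```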
Use SQL joins
Querying individual tables is a great way to return structured data about your system. However, you frequently want to combine the information from multiple tables to obtain a more complete picture of your system. The example below queries the processes table to display the UID and name of processes running on the system. It filters the results to processes not being run with UID 0 and limits the number of returned rows to five.
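As a sketch, assuming the processes table has uid and name columns, the query could be written as follows. The rows returned depend entirely on what is running on your system.

```shell
# Sketch: show the UID and name of up to five processes
# that are not running as UID 0.
query='SELECT uid, name FROM processes WHERE uid != 0 LIMIT 5;'

if command -v osqueryi >/dev/null 2>&1; then
    osqueryi "$query"
else
    echo "osqueryi is not installed; skipping query" >&2
fi
```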
It would be useful to join this table with the information in the users table displayed previously. For example, you may want the username associated with the running process instead of just the UID. You can accomplish this with a join.
The example below combines the data from both tables. The selection now specifies the table for each column of data in the form table.column. The JOIN clause tells Osquery to combine the data from both tables by matching on the UID column. The result returns data from both tables:
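One way to write this join is sketched below. It keeps the UID filter and the five-row limit from the earlier processes query, which is an assumption made for continuity, and it relies on both tables exposing a uid column.

```shell
# Sketch: pair each non-root process with the username that owns it.
# Columns are qualified as table.column because uid exists in both tables.
query='SELECT processes.uid, users.username, processes.name
FROM processes
JOIN users ON processes.uid = users.uid
WHERE processes.uid != 0
LIMIT 5;'

if command -v osqueryi >/dev/null 2>&1; then
    osqueryi "$query"
else
    echo "osqueryi is not installed; skipping query" >&2
fi
```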
Osquery supports the different types of SQL joins, allowing for complex queries that can assimilate data from multiple tables.
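As one illustration of another join type, the sketch below uses a LEFT JOIN so that users with no running processes still appear in the results with a count of zero. The query is hypothetical and not part of the walkthrough above. It assumes the processes table also has a pid column.

```shell
# Sketch: count running processes per user. The LEFT JOIN keeps every
# row from users, even when no row in processes matches its uid.
query='SELECT users.username, COUNT(processes.pid) AS process_count
FROM users
LEFT JOIN processes ON users.uid = processes.uid
GROUP BY users.username
ORDER BY process_count DESC
LIMIT 5;'

if command -v osqueryi >/dev/null 2>&1; then
    osqueryi "$query"
else
    echo "osqueryi is not installed; skipping query" >&2
fi
```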
Control output formats
Using the Osquery shell interactively is perfect for exploring the data Osquery exposes. However, you will likely want to leverage this data in your scripts, which requires a way to run queries non-interactively and produce machine-readable output.
Osquery provides the ability to run a query directly from a single command, and it can also output the results in a script-friendly format, such as JSON. For example, the previous command to obtain process data can be run directly from the command line and printed as either JSON or a list of vertical-bar separated fields:
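The sketch below runs the simple processes query both ways, using the --json and --list flags of osqueryi. The query text is an assumption based on the earlier description, and the output will vary from system to system.

```shell
# Sketch: run one query non-interactively in two output formats.
query='SELECT uid, name FROM processes WHERE uid != 0 LIMIT 5;'

if command -v osqueryi >/dev/null 2>&1; then
    # JSON: an array with one object per row
    osqueryi --json "$query"
    # List: one row per line, with fields separated by a vertical bar
    osqueryi --list "$query"
else
    echo "osqueryi is not installed; skipping query" >&2
fi
```

The JSON form is usually the easier one to consume from scripts, since any JSON parser can read it without custom field splitting.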
Osquery supports additional output formats. You can learn about them using the -h flag to display the help page.
Osquery provides a powerful way to query thousands of data points about a system and return data in a structured format. This tool enables you to explore and understand a system more effectively. The ability to return data in machine-readable formats, such as JSON, makes Osquery valuable to existing scripts and tools. My next article will explain how to schedule Osquery to run queries for you.
If you are interested in learning more about Osquery, check out the official documentation for a deeper dive into Osquery’s underlying architecture and features.