Gathering data about a system is one of a system administrator's most basic tasks. Sysadmins query systems for performance, capacity planning, and general inventory purposes. The need to learn about a system is so common that job interview questions regularly assess a candidate's ability to identify interesting information about an unfamiliar system quickly.
Most sysadmins have a collection of scripts, one-liners, and other approaches for collecting essential data about a system. However, these approaches are often brittle and difficult to maintain, and they require deep knowledge of the proper commands to run or files to examine.
Osquery is an open source project that allows you to obtain information about your system using a SQL query language. Osquery is cross-platform and can run both scheduled and ad-hoc queries. This article walks you through installing and using Osquery on Linux. My next article will explain how to schedule queries to collect and process data on a regular cadence and respond to changes in the state of your systems.
Install Osquery
Osquery provides official packages for various operating systems on its downloads page. Red Hat systems can install the RPM using DNF:
[root@fedora ~]# dnf install -y https://pkg.osquery.io/rpm/osquery-5.5.1-1.linux.x86_64.rpm
Last metadata expiration check: 0:06:34 ago on Mon 03 Oct 2022 04:42:51 PM EDT.
osquery-5.5.1-1.linux.x86_64.rpm 7.1 MB/s | 17 MB 00:02
Dependencies resolved.
=====================================================================================================================================================================================
Package Architecture Version Repository Size
=====================================================================================================================================================================================
Installing:
osquery x86_64 5.5.1-1.linux @commandline 17 M
Transaction Summary
=====================================================================================================================================================================================
Install 1 Package
Total size: 17 M
Installed size: 50 M
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : osquery-5.5.1-1.linux.x86_64 1/1
Running scriptlet: osquery-5.5.1-1.linux.x86_64 1/1
Verifying : osquery-5.5.1-1.linux.x86_64 1/1
Installed:
osquery-5.5.1-1.linux.x86_64
Complete!
Use Osquery interactively
Osquery provides an interactive query environment similar to a MySQL shell, which is an excellent place to start learning about Osquery's capabilities. You can launch the shell using the osqueryi command:
[root@fedora ~]# osqueryi
Using a virtual database. Need help, type '.help'
osquery> .quit
The Osquery schema contains information about hundreds of aspects of a system. It allows you to query any of this information using a SQL-based syntax. For example, you can query the block devices table to obtain information about the block devices on a Linux system:
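As a rough sketch of such a query: Osquery's SQL engine is built on SQLite, so the statement can be rehearsed in Python against an in-memory stand-in table. The column names are assumed from the published Osquery schema (running .schema block_devices in the shell would confirm them), and the two rows are invented for illustration.

```python
import sqlite3

# In-memory stand-in for osquery's block_devices table.
# Column names are assumed from the osquery schema; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE block_devices (name TEXT, parent TEXT, size INTEGER, type TEXT)"
)
conn.executemany(
    "INSERT INTO block_devices VALUES (?, ?, ?, ?)",
    [
        ("/dev/sda", "", 41943040, ""),
        ("/dev/sda1", "/dev/sda", 2097152, "xfs"),
    ],
)

# The SELECT statement itself is what you would type at the osquery> prompt.
QUERY = "SELECT name, parent, size, type FROM block_devices;"
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On a real host, only the SELECT statement is needed; the table setup exists purely so the sketch runs without Osquery installed.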
You will often need to filter the data returned by your queries. Osquery supports SQL clauses, such as the WHERE clause, to narrow query results. For example, the query below will select the root user's user ID (UID), group ID (GID), username, and shell. The WHERE clause matches a value of 0 for the UID.
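A minimal sketch of that filter, assuming the users table exposes uid, gid, username, and shell columns. It runs against an in-memory SQLite stand-in with invented rows (the alice account is hypothetical) rather than a live Osquery shell:

```python
import sqlite3

# In-memory stand-in for osquery's users table. The rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (uid INTEGER, gid INTEGER, username TEXT, shell TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (0, 0, "root", "/bin/bash"),
        (1000, 1000, "alice", "/bin/bash"),
    ],
)

# The WHERE clause keeps only the row whose uid is 0.
QUERY = "SELECT uid, gid, username, shell FROM users WHERE uid = 0;"
rows = conn.execute(QUERY).fetchall()
print(rows)
```

The same SELECT statement can be typed unchanged at the osquery> prompt, where it reads the host's real accounts.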
Use SQL joins
Querying individual tables is a great way to return structured data about your system. However, you frequently want to combine the information from multiple tables to obtain a more complete picture of your system. The example below queries the processes table to display the UID and name of processes running on the system. It filters the results to processes not being run with UID 0 and limits the number of returned rows to five.
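One way to sketch this query, again against an in-memory SQLite stand-in instead of a live system. The pid, name, and uid columns are assumed from the Osquery schema, and every process row is invented:

```python
import sqlite3

# In-memory stand-in for osquery's processes table. The rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processes (pid INTEGER, name TEXT, uid INTEGER)")
conn.executemany(
    "INSERT INTO processes VALUES (?, ?, ?)",
    [
        (1, "systemd", 0),
        (612, "dbus-broker", 81),
        (640, "chronyd", 992),
        (655, "sshd", 0),
        (701, "polkitd", 991),
        (1200, "bash", 1000),
        (1201, "vim", 1000),
        (1300, "python3", 1000),
    ],
)

# Exclude root-owned processes and cap the result at five rows.
QUERY = "SELECT uid, name FROM processes WHERE uid != 0 LIMIT 5;"
rows = conn.execute(QUERY).fetchall()
for uid, name in rows:
    print(uid, name)
```

Because the query has no ORDER BY clause, which five rows come back is not guaranteed; add one if the selection matters.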
It would be useful to join this table with the information in the users table displayed previously. For example, you may want the username associated with the running process instead of just the UID. You can accomplish this with a join.
The example below combines the data from both tables. The selection now specifies the table for each column of data in the form $tableName.$columnName. The JOIN clause tells Osquery to combine the data from both tables by matching on the UID column. The result returns data from both tables:
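A sketch of such a join, assuming both tables expose a uid column as described above. The tables are in-memory SQLite stand-ins, and all rows (including the alice account) are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-ins for osquery's users and processes tables. The rows are invented.
conn.execute("CREATE TABLE users (uid INTEGER, gid INTEGER, username TEXT)")
conn.execute("CREATE TABLE processes (pid INTEGER, name TEXT, uid INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(0, 0, "root"), (81, 81, "dbus"), (992, 990, "chrony"), (1000, 1000, "alice")],
)
conn.executemany(
    "INSERT INTO processes VALUES (?, ?, ?)",
    [(1, "systemd", 0), (612, "dbus-broker", 81), (640, "chronyd", 992), (1200, "bash", 1000)],
)

# Each column is qualified with its table name; the JOIN matches rows on uid.
QUERY = """
SELECT processes.uid, processes.name, users.username
FROM processes
JOIN users ON processes.uid = users.uid
WHERE processes.uid != 0
LIMIT 5;
"""
rows = conn.execute(QUERY).fetchall()
for uid, name, username in rows:
    print(uid, name, username)
```

Qualifying every column with its table name avoids ambiguity, since uid appears in both tables.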
Osquery supports several types of SQL joins, allowing for complex queries that combine data from multiple tables.
Control output formats
Using the Osquery shell interactively is perfect for exploring the data Osquery exposes. However, you will likely want to use this data in your scripts, which requires a non-interactive way to run queries and support for machine-readable output.
Osquery can run a query directly from a single command and can output the results in a script-friendly format, such as JSON. For example, the previous query for process data can be run directly from the command line and printed as either JSON or a list of vertical-bar-separated fields:
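As a sketch of how a script might consume that output: the helper below (run_query is a hypothetical name) passes the query to osqueryi with the --json flag and decodes the result. The flag is assumed to be the one listed in osqueryi's help output, and SAMPLE_OUTPUT is an invented string shaped like the JSON array Osquery prints, included so the parsing step can be tried without Osquery installed.

```python
import json
import subprocess

QUERY = "SELECT uid, name FROM processes WHERE uid != 0 LIMIT 5;"


def run_query(query):
    """Run one query through osqueryi and return the rows as a list of dicts."""
    result = subprocess.run(
        ["osqueryi", "--json", query],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# Invented sample shaped like osqueryi's JSON output: an array of objects
# whose values are all strings, including numeric columns such as uid.
SAMPLE_OUTPUT = (
    '[{"name": "dbus-broker", "uid": "81"}, {"name": "chronyd", "uid": "992"}]'
)

rows = json.loads(SAMPLE_OUTPUT)
# Convert uid to an integer before comparing or sorting numerically.
uids = sorted({int(row["uid"]) for row in rows})
print(uids)
```

On a host with Osquery installed, rows = run_query(QUERY) would replace the sample string.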
Osquery supports additional output formats. You can learn about them using the -h flag to display the help page.
Wrap up
Osquery provides a powerful way to query thousands of data points about a system and return data in a structured format. This tool enables you to explore and understand a system more effectively. The ability to return data in machine-readable formats, such as JSON, makes Osquery valuable to existing scripts and tools. My next article will explain how to schedule Osquery to run queries for you.
If you are interested in learning more about Osquery, check out the official documentation for a deeper dive into Osquery’s underlying architecture and features.
About the author
Anthony Critelli is a Linux systems engineer with interests in automation, containerization, tracing, and performance. He started his professional career as a network engineer and eventually made the switch to the Linux systems side of IT. He holds a B.S. and an M.S. from the Rochester Institute of Technology.