
A long time ago in UNIX history, users on a server were actual UNIX users with entries in /etc/shadow and an interactive login shell and a home directory. There were tools for admins to communicate with users, and to monitor their activity to avoid stupid or malicious mistakes that would cause server resources to be unfairly allocated.

These days, your userbase is less likely to have entries in /etc/shadow, instead being managed by a layer of abstraction, whether it’s LDAP or Drupal or OpenShift. Then again, there are a lot more servers now, which means there are a lot more sysadmins logging in and out to perform maintenance. Where there’s activity, there’s opportunity for mistakes and confusion, so it’s time to dust off those old monitoring tools and put them to good use.

Here are some of the monitoring commands you may have forgotten about (or never knew about) to help you track what’s been happening on your server.

who

First, the basics.

The who command is provided by the GNU coreutils package, and its primary job is to parse the /var/run/utmp file and report its findings.

The utmp file logs the current users on the system. It doesn’t necessarily show every process, because not all programs initiate utmp logging. In fact, your system may not even have a utmp file by default. In that case, you can point who at /var/log/wtmp, which records all logins and logouts, by passing the file name as an argument.

The wtmp file format is exactly the same as utmp, except that a null user name indicates a logout and the ~ character in the terminal field indicates a system shutdown or reboot. The wtmp file is maintained by login(1), init(1), and some versions of getty(8). None of these applications creates the file, however, so if you remove wtmp, record-keeping is deactivated. That alone is good to know: if wtmp is missing, you should find out why!

The output of who --heading looks something like this:

NAME     LINE     TIME               COMMENT 
seth     tty2     2020-01-26 18:19   (tty2)
larry    pts/2    2020-01-28 13:02   (10.1.1.8)
curly    pts/3    2020-01-28 14:42   (10.1.1.5)

This shows you the username of each person logged in, the terminal line they are on, the time their login was recorded, and the host or IP address they connected from.
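Because the columns are regular, the output is easy to script against. As a minimal sketch (the function name count_sessions is hypothetical, and the sample lines are copied from the output above rather than read from a live system), this tallies login sessions per user:

```shell
# count_sessions: hypothetical helper that tallies login sessions per user.
# It reads who-style lines (user name in the first column) on standard input.
count_sessions() {
    awk '{ sessions[$1]++ }
         END { for (user in sessions) print user, sessions[user] }' | sort
}

# Sample data copied from the who output above.
count_sessions <<'EOF'
seth     tty2     2020-01-26 18:19   (tty2)
larry    pts/2    2020-01-28 13:02   (10.1.1.8)
curly    pts/3    2020-01-28 14:42   (10.1.1.5)
EOF
```

On a real server you would run who | count_sessions instead. Leave off --heading in that case so that the header row is not counted as a user.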

The who command also humbly provides the official POSIX way of discovering which user you are logged in as, but only if utmp exists:

$ who -m
curly   pts/3   2020-01-28 14:44 (10.1.1.8)

It also provides a mechanism to display the current runlevel:

$ who -r 
     run-level 5   2020-01-26 23:58

w

For a little more context about users, the simple w command provides a list of who’s logged in and what they’re doing. This information is displayed in a format similar to the output of who, but it adds the time the user has been idle, the CPU time used by all processes attached to the login TTY, and the CPU time used by just the current process. The user’s current process is listed in the final field.

Sample output:

$ w
 13:45:48 up 29 days, 19:24,  2 users,  load average: 0.53, 0.52, 0.54
USER     TTY     LOGIN@  IDLE    JCPU   PCPU WHAT
seth     tty2    Sun18   43:22m  0.01s  0.01s /usr/libexec/gnome-session-binary
curly    pts/2   13:02   35:12   0.03s  0.03s -bash

Alternatively, you can view the user’s IP address with the -i or --ip-addr option.

You can narrow the output down to a single user name by specifying which user you want information about:

$ w seth
 13:45:48 up 29 days, 19:27,  2 users,  load average: 0.53, 0.52, 0.54
USER     TTY     LOGIN@  IDLE    JCPU   PCPU WHAT
seth     tty2    Sun18   43:25m  0.01s  0.01s /usr/libexec/gnome-session-binary

utmpdump

The utmpdump utility does (almost) exactly what its name suggests: it dumps the contents of the /var/run/utmp file to your screen. Actually, it dumps either the utmp or the wtmp file, depending on which you specify. Of course, the file you specify doesn’t have to be located in /var/run or /var/log or even named utmp or wtmp, and it doesn’t even have to be in the right format. If you feed utmpdump a text file, it dumps the contents to your screen (or a file, with the --output option) in a format that’s predictable and easy to parse.

Normally, of course, you would just use who or w to parse login records, but utmpdump is useful in many instances.

  • Files can get corrupted. While who and w are often able to detect corruption themselves, utmpdump is even more tolerant because it does no parsing on its own. It renders the raw data for you to deal with.
  • Once you’ve repaired a corrupted file, utmpdump can patch your changes back in.
  • Sometimes you just want to parse data yourself. Maybe you’re looking for something that who and w aren’t programmed to look for, or maybe you’re trying to make correlations all your own.

Whatever the reason, utmpdump is a useful tool to extract raw data from the login records.
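As a rough sketch of parsing that raw data yourself, the hypothetical helper below (logins_by_host is not part of utmpdump) prints the user and remote host for every login record. It assumes the bracketed field order [type] [pid] [id] [user] [line] [host] [addr] [time], which you should check against the output of your own utmpdump version, and the sample records are illustrative rather than taken from a real log:

```shell
# logins_by_host: hypothetical helper. Reads utmpdump-style text on stdin and
# prints "user host" for every USER_PROCESS record (type 7 in utmp(5)).
# Assumed field order: [type] [pid] [id] [user] [line] [host] [addr] [time]
logins_by_host() {
    awk '{
        n = split($0, field, /\] \[/)      # fields are separated by "] ["
        if (n < 6) next                     # skip lines that do not fit
        type = substr(field[1], 2)          # drop the leading "["
        if (type != "7") next               # keep login records only
        user = field[4]; host = field[6]
        gsub(/^ +| +$/, "", user)           # strip the column padding
        gsub(/^ +| +$/, "", host)
        print user, (host == "" ? "-" : host)
    }'
}

# Illustrative records shaped like utmpdump output (not from a real log).
logins_by_host <<'EOF'
[7] [39707] [ts/4] [larry   ] [pts/4       ] [10.1.1.8            ] [10.1.1.8       ] [2020-01-28T13:02:00,000000+0000]
[8] [39707] [ts/4] [        ] [pts/4       ] [                    ] [0.0.0.0        ] [2020-01-28T15:10:00,000000+0000]
[7] [40112] [ts/3] [curly   ] [pts/3       ] [10.1.1.5            ] [10.1.1.5       ] [2020-01-28T14:42:00,000000+0000]
EOF
```

Against a live log, the equivalent would be utmpdump /var/log/wtmp | logins_by_host.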

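If you have repaired a corrupted login log, you can use utmpdump to write your changes back to the master log. The redirection has to happen in a root shell, because sudo on its own does not elevate the calling shell’s > redirect:

$ sudo sh -c 'utmpdump -r < wtmp.fix > /var/log/wtmp'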

ps

Once you know who’s logged in on your system, you can use ps to get a snapshot of current processes. This isn’t to be confused with top, which displays a running report on current processes; ps takes a snapshot at the moment it is issued and prints it to your screen. There are advantages and disadvantages to both, so you can choose which to use based on your requirements. Because of its static nature, ps is particularly useful for later analysis, or just as a nice manageable summary.

The ps command is old and well-known, and it seems many admins have learned the old UNIX command rather than the latest implementation. The modern ps (from the procps-ng package) offers many helpful mnemonics, and it’s what ships on RHEL, CentOS, Fedora, and many other distributions, so it’s what this article uses.

You can get all processes being run by a single user with the --user (or -u) option, along with the user name of who you want a report on. To give the output the added context of which process is the parent of a child process, use the --forest option for a “tree” view:

$ ps --forest --user larry
  PID TTY        TIME     CMD
  39707 ?        00:00:00 sshd
  39713 pts/4    00:00:00  \_ bash
  39684 ?        00:00:00 systemd
  39691 ?        00:00:00  \_ (sd-pam)

For every process on the system:

$ ps --forest -e
[...]
  29284 ?        00:00:48  \_ gnome-terminal-
  29423 pts/0    00:00:00  |   \_ bash
  42767 pts/0    00:00:00  |   |   \_ ps
  39631 pts/1    00:00:00  |   \_ bash
  39671 pts/1    00:00:00  |       \_ ssh
  32604 ?        00:00:00  \_ bwrap
  32612 ?        00:00:00  |   \_ bwrap
  32613 ?        00:09:05  |       \_ dring
  32609 ?        00:00:00  \_ bwrap
  32610 ?        00:00:15      \_ xdg-dbus-proxy
   1870 ?        00:00:05 gnome-keyring-d
   4809 ?        00:00:00  \_ ssh-agent
[...]

The default columns are useful, but you can change them to better suit what you’re researching. The -o option gives you full control over which columns you see. For a full list of possible columns, refer to the Standard Format Specifiers section of the ps(1) man page.

$ ps -eo pid,user,pcpu,args --sort user
  42799 root      0.0 [kworker/u16:7-flush-253:1]
  42829 root      0.0 [kworker/0:2-events]
  42985 root      0.0 [kworker/3:0-events_freezable_power_]
   1181 rtkit     0.0 /usr/libexec/rtkit-daemon
   1849 seth      0.0 /usr/lib/systemd/systemd --user
   1857 seth      0.0 (sd-pam)
   1870 seth      0.0 /usr/bin/gnome-keyring-daemon --daemonize --login
   1879 seth      0.0 /usr/libexec/gdm-wayland-session /usr/bin/gnome-session

The ps command is very flexible. You can modify its output natively so you don’t have to rely on grep and awk to find what you care about. Craft a good ps command, alias it to something memorable, and run it often. It’s one of the top ways to stay informed about what’s happening on your server.
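One possible way to package such a command is a small shell function in your shell startup file. This is only a sketch: the name psu and the choice of columns are assumptions, and the --forest and --user options require the procps-ng version of ps discussed above.

```shell
# psu: hypothetical helper name. Shows one user's processes as a tree with
# CPU, memory, and elapsed-time columns. Defaults to the current user.
psu() {
    ps --forest -o pid,pcpu,pmem,etime,args --user "${1:-$(id -un)}"
}
```

After sourcing it, psu larry reports on larry, and psu with no argument reports on whoever you are logged in as.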

pgrep

Sometimes, you may have some idea of a problematic process and need to investigate it instead of your users or system. To do that, there’s the pgrep command from the procps-ng package.

At its most basic, pgrep works like a grep on the output of ps:

$ pgrep bash
29423
39631
39713

Instead of listing the PIDs, you can just get a count of how many PIDs would be returned:

$ pgrep --count bash
3

To refine the results, you can filter your search through processes by user name (-u), terminal (--terminal), age (--newest and --oldest), and more. To find a process belonging to a specific user, for example:

$ pgrep bash -u moe --list-name
39631 bash

You can even get inverse matches with the --inverse option.
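Because pgrep prints bare PIDs, its output can be handed straight to ps when you want more detail about the matches. A minimal sketch, assuming the procps-ng versions of both tools (pdetail is a hypothetical name, and -d sets the delimiter pgrep places between PIDs):

```shell
# pdetail: hypothetical helper. Accepts the same arguments as pgrep and shows
# the matching processes with owner, elapsed time, and full command line.
pdetail() {
    # -d, joins the matching PIDs with commas, the list form ps -p accepts.
    pids=$(pgrep -d, "$@") || return 1   # pgrep exits non-zero on no match
    ps -o pid,user,etime,args -p "$pids"
}
```

For example, pdetail -u moe bash would show every bash process owned by moe along with how long each has been running.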

pkill

Related to pgrep is the pkill command. It’s a lot like the kill command, except that it uses the same options as pgrep so you can send signals to a troublesome process using whatever information is easiest for you.

For example, if you have discovered that a process initiated by user larry is monopolizing resources, and you know from w that larry is located on terminal pts/2, then you can kill the login session and all of its children with just the terminal name:

$ sudo pkill -9 --terminal pts/2

Or you can use just the user name to end all processes matching it:

$ sudo pkill -u larry

Used judiciously, pkill is a good “panic” button or sledgehammer-style solution when a problem has gotten out of hand.
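If you would rather reach for the sledgehammer only as a last resort, one option is to ask politely first. The sketch below is an assumption about how you might wrap pkill, not a feature of the tool: the function name graceful_pkill and the GRACE variable are hypothetical. It sends SIGTERM, waits a few seconds, and sends SIGKILL only to whatever still matches.

```shell
# graceful_pkill: hypothetical helper. Takes the same selection options as
# pgrep and pkill (for example: -u larry, or --terminal pts/2).
graceful_pkill() {
    # Ask the matching processes to exit cleanly.
    pkill -TERM "$@"
    status=$?
    # pkill exits 1 when nothing matched and higher on errors; stop either way.
    [ "$status" -eq 0 ] || return "$status"

    # Give them GRACE seconds (default 5) to shut down.
    sleep "${GRACE:-5}"

    # Anything that still matches gets the uncatchable SIGKILL.
    if pgrep "$@" >/dev/null; then
        pkill -KILL "$@"
    fi
}
```

Run it with the same privileges you would give pkill itself, for instance from a root shell: graceful_pkill -u larry.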

Terminal monitoring

Just because a series of commands exists in a terminal doesn’t mean they’re necessarily better than other solutions. Take stock of your requirements and choose the best tool for what you need. Sometimes a graphical monitoring and reporting system is exactly what you need, and other times terminal commands that are easily scripted and parsed are the right answer. Choose wisely, learn your tools, and you’ll never be in the dark about what’s happening within your bare metal.



About the author

Seth Kenlon is a Linux geek, open source enthusiast, free culture advocate, and tabletop gamer. Between gigs in the film industry and the tech industry (not necessarily exclusive of one another), he likes to design games and hack on code (also not necessarily exclusive of one another).
