Learn Gawk by playing a fun word game

Practice your command-line skills while seeing how many words you can make from 9 random letters.

I like to play puzzle and word games. It's a fun way to pass the time and exercise my brain. About 2 years ago, I wrote about how to play a math game with some Linux commands. That game presented a selection of random numbers and an arbitrary target; you had to use arithmetic with the random numbers to get the target number.

This word game is similar: try to spell the longest word you can from a selection of 9 random vowels and consonants. I find that a combination of 4 vowels and 5 consonants gives me the best opportunity to spell a longer word.

You can play this game at home using the tiles from your Scrabble board game and randomly selecting letters. But sometimes, I let my computer pick the random letters for me. Here's how to use the gawk and shuf commands to play this fun word game.

Print letters by frequency

I found a list online of the letter frequency in Scrabble and used that as my starting point to play this word game on Linux:

A-9, B-2, C-2, D-4, E-12, F-2, G-3, H-2, I-9, J-1, K-1, L-4, M-2, N-6, O-8, P-2, Q-1, R-6, S-4, T-6, U-4, V-2, W-2, X-1, Y-2, Z-1

Save that to a file called letters. The list is all on one line, but Gawk can still split it into pieces you can use for the word game.

Gawk is the GNU version of the classic text processing system Awk and provides great flexibility in working with text files. You can use Gawk to separate the letters in this string. While Gawk is designed to work on one line at a time, you can instruct it to consider this single line as 26 individual records by changing the record separator (RS) value. By default, Gawk uses a newline as the record separator.

Use the --assign option with gawk to set a new RS value. This option assigns a value to a variable before the program starts running, and it works for built-in variables such as RS. For example, --assign RS=, changes the record separator from a newline to a comma for that one command. To see this in action, run this command to list only the A, E, I, O, and U records from the letters file:

$ gawk --assign RS=, '/[AEIOU]-/ {print $0}' letters
A-9
 E-12
 I-9
 O-8
 U-4

Gawk instructions come in pattern-action pairs. In this case, /[AEIOU]-/ is a regular expression that provides the pattern. This effectively means "any record that contains an uppercase A, E, I, O, or U followed by a hyphen." And {print $0} is a Gawk action that says to print the whole record ($0 means "the entire record," which is normally a line but here is one comma-separated entry).

Print that many of each letter

So the previous command neatly lists the frequency of the vowels. You can expand the Gawk command to loop through the frequency of each letter to print 9 A's, 12 E's, 9 I's, 8 O's, and 4 U's, each on a separate line.

In order to do that, however, Gawk needs to know how to separate each line into letter and frequency fields. Use the --field-separator option to set the field separator value. For instance, use --field-separator=- to set the separator to a hyphen:

$ gawk --assign RS=, --field-separator=- '/[AEIOU]-/ { for (n=0;n<$2;n++) {print $1} }' letters | head
A
A
A
A
A
A
A
A
A
 E

Don't worry about the A appearing immediately at the start of the line while other letters start with a space. That doesn't really impact playing the word game.

This version of the Gawk command also includes a for loop as the action to print multiple copies of the first field ($1). The second field ($2) contains the frequency, so for (n=0;n<$2;n++) {print $1} starts the loop with n set to zero and continues as long as n is less than the letter frequency ($2). Each iteration prints the first field with print $1, and then the n++ instruction increments n.

Pick a few random letters

To select only 4 random entries from this list, use the shuf command to randomize (or shuffle) the results. The -n option makes shuf print only a few lines, such as shuf -n 4 to print only 4 lines of output:

$ gawk --assign RS=, --field-separator=- '/[AEIOU]-/ { for (n=0;n<$2;n++) {print $1} }' letters | shuf -n 4
 E
 I
A
 U
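shuf draws lines without replacement, which is why printing one line per tile matters: a letter can never come up more often than it appears in the pool. Here is a small sketch with a made-up three-line pool (two A's and one B):

```shell
# Ask for all three lines: the order is random, but the contents are always A, A, B
drawn=$(printf 'A\nA\nB\n' | shuf -n 3 | sort | tr -d '\n')
echo "$drawn"
```

Sorting the result hides the random order, so this prints AAB every time.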

Picking 5 consonants is similar but requires a small change to Gawk's regular expression. Instead of using /[AEIOU]-/ to act only on the vowels, use /[^AEIOU]-/ to match any record in which a character other than an uppercase A, E, I, O, or U comes directly before the hyphen. You also need to update the -n option to shuf, to print 5 lines:

$ gawk --assign RS=, --field-separator=- '/[^AEIOU]-/ { for (n=0;n<$2;n++) {print $1} }' letters | shuf -n 5
 Z
 G
 D
 R
 M

Put it all together

I like to put the commands into a script to play the word game from the Linux command line. I'll call it letters.sh. Then I can run one instruction to generate a list of 4 random vowels and 5 random consonants. Placing the commands in a script also makes things a bit tidier because I can keep the letter-frequency list in a variable and pass that to Gawk to generate a list of random vowels and consonants. The script looks like this:

#!/bin/sh

letters='A-9, B-2, C-2, D-4, E-12, F-2, G-3, H-2, I-9, J-1, K-1, L-4, M-2, N-6, O-8, P-2, Q-1, R-6, S-4, T-6, U-4, V-2, W-2, X-1, Y-2, Z-1'

echo "$letters" | gawk --assign RS=, --field-separator=- '/[AEIOU]-/ { for (n=0;n<$2;n++) {print $1} }' | shuf -n 4

echo "$letters" | gawk --assign RS=, --field-separator=- '/[^AEIOU]-/ { for (n=0;n<$2;n++) {print $1} }' | shuf -n 5

Try playing the game and see what words you can come up with. The rules for making words are simple: Use each letter only once, and stick to regular words, with no abbreviations or proper nouns. If you're not sure whether a word is a proper noun, use the Linux dict command to look it up.

Here's one example:

$ chmod 750 letters.sh
$ ./letters.sh
 E
 U
 I
 O
 V
 R
 C
 D
 H

With those random letters, I can spell several 5-letter words: HIRED, CHIVE, HOVER, and CHORD. I can also spell a few 6-letter words: CURVED, RUCHED, and VOICED. I can spell VOUCHER and VOUCHED, both 7-letter words. In this word game, the longest word wins, so VOUCHED would win over VOICED.

Run the script to play again and see who can spell the longest word with the next set of random letters:

$ ./letters.sh
 E
 U
 E
 I
 N
 G
 R
 C
 L

With these random letters, I can spell CRUEL for 5 letters, which is pretty good. But CURLING and REELING, both 7 letters long, would likely win this round.

Your turn

What's the longest word you can spell with a random selection of vowels and consonants? If you're lucky, you might be able to spell a 9-letter word.

Jim Hall

Jim Hall is an open source software advocate and developer, best known for usability testing in GNOME and as the founder and project coordinator of FreeDOS.
