Sysadmins use an untold number of command-line tools, and you probably regularly use the three discussed in this article: grep, sed, and awk. But do you know all the ways you can use them to manipulate text? If not (or you're not sure), continue reading.
Before I get started, here are the origins of the commands' names:
- grep: According to Wikipedia, the name "comes from the ed command g/re/p (globally search for a regular expression and print matching lines), which has the same effect." ed is a "line-oriented text editor." Even for someone who likes the command line, editing files line by line seems too old-fashioned, but people had to start with something in ancient times.
- sed: The name comes from its main use, as a stream editor.
- awk: Its name comes from its authors' initials (Aho, Weinberger, and Kernighan). If the name Kernighan rings any bells (pun intended) for you, it is because this Canadian computer scientist contributed to the creation of Unix and co-authored the first book about the C language.
It's excellent to trace the commands' genealogical tree, but what really matters is that these commands are pretty helpful for text manipulation.
In the following examples, I will use a file named quotes.txt to illustrate how to use the commands. Here are the contents of this file:
$ cat quotes.txt
"God does not play dice with the universe."
- Albert Einstein, The Born-Einstein Letters 1916-55

"Not only does God play dice but... he sometimes throws them where they cannot be seen."
- Stephen Hawking

"I regard consciousness as fundamental..."
- Max Planck

"The cosmos is within us. We are made of star-stuff. We are a way for the universe to know itself."
- Carl Sagan

"[T]he atoms or elementary particles themselves are not real; they form a world of potentialities or possibilities rather than one of things or facts."
- Werner Heisenberg
grep
The simplest way to use grep is:
$ grep universe quotes.txt
"God does not play dice with the universe."
"The cosmos is within us. We are made of star-stuff. We are a way for the universe to know itself."
This example provides the string to search for (universe) and the place to look for it (quotes.txt).
If the string you want to search for contains spaces, you must put quotes around it:
$ grep "the universe" quotes.txt
"God does not play dice with the universe."
"The cosmos is within us. We are made of star-stuff. We are a way for the universe to know itself."
Some common variations when using grep are:
- Ignore case:
grep -i string-to-search filename
- Search in multiple files:
grep -i string-to-search *.txt
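A minimal sketch of both variations combined. The scratch directory, file names, and file contents are made up for illustration, so nothing here touches a real quotes.txt:

```shell
# Scratch directory with two small, made-up files
workdir=$(mktemp -d)
printf 'The Universe is vast.\nNothing to see here.\n' > "$workdir/a.txt"
printf 'Our universe keeps expanding.\n' > "$workdir/b.txt"

# -i matches "Universe" and "universe" alike; when more than one file is
# searched, grep prefixes each match with the name of the file it came from
(cd "$workdir" && grep -i universe *.txt)
```

With these two files, the output is a.txt:The Universe is vast. followed by b.txt:Our universe keeps expanding., which makes it easy to see where each match lives.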
You can search for a regular expression:
$ grep "191[0-9]" quotes.txt
- Albert Einstein, The Born-Einstein Letters 1916-55
If you want to enable extended regexp patterns to use symbols like +, ?, or |, you can use the egrep command, which is a shortcut for adding the -E flag to grep. Recent GNU grep releases warn that egrep is obsolescent, so grep -E is the more future-proof spelling. This also enables you to search for multiple strings:
$ egrep -i "albe|hawk" quotes.txt
- Albert Einstein, The Born-Einstein Letters 1916-55
- Stephen Hawking
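The same search can be written with grep -E. Here is a small sketch that also uses the + operator; the authors.txt file is a hypothetical scratch file holding three author lines copied from quotes.txt:

```shell
# Scratch file with three author lines (hypothetical file name)
workdir=$(mktemp -d)
cat > "$workdir/authors.txt" <<'EOF'
- Albert Einstein, The Born-Einstein Letters 1916-55
- Stephen Hawking
- Max Planck
EOF

# Alternation: lines containing "albe" or "hawk", ignoring case
grep -E -i 'albe|hawk' "$workdir/authors.txt"

# "+" means "one or more": digits, a hyphen, then more digits (1916-55)
grep -E '[0-9]+-[0-9]+' "$workdir/authors.txt"
```

The first command prints the Einstein and Hawking lines; the second prints only the Einstein line, because it is the only one containing a year range.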
To show lines that include the word "universe" plus the next line (in order to include the author's name):
$ grep -i universe -A 1 quotes.txt
"God does not play dice with the universe."
- Albert Einstein, The Born-Einstein Letters 1916-55
--
"The cosmos is within us. We are made of star-stuff. We are a way for the universe to know itself."
- Carl Sagan
As you can probably guess, you could display more lines by passing a different number. Or you could show the lines before each match by using the -B flag.
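As a quick sketch of -B, searching for an author and printing the quote above the name. The sample file is a trimmed, made-up stand-in for quotes.txt:

```shell
# Scratch file with two quote/author pairs (hypothetical file name)
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
"I regard consciousness as fundamental..."
- Max Planck

"The cosmos is within us."
- Carl Sagan
EOF

# -B 1 prints one line of context before each matching line
grep -B 1 "Carl Sagan" "$workdir/sample.txt"
```

This prints the Sagan quote followed by the author line. The -A and -B flags can be combined when you need context on both sides of a match.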
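So far, I've shown grep running alone, but it is very common to have it in a chain of commands:
$ echo "Authors who mentioned 'universe'"; cat quotes.txt | grep -i universe -A 1 | grep "^- "
Authors who mentioned 'universe'
- Albert Einstein, The Born-Einstein Letters 1916-55
- Carl Sagan
The last pattern includes the space after the hyphen so that it does not also match the "--" separator that grep prints between groups of context lines.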
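A sketch of the same chain without the extra cat, since grep can read the file itself. The sample file is a trimmed, made-up stand-in for quotes.txt:

```shell
# Scratch file with three quote/author pairs (hypothetical file name)
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
"God does not play dice with the universe."
- Albert Einstein, The Born-Einstein Letters 1916-55

"I regard consciousness as fundamental..."
- Max Planck

"We are a way for the universe to know itself."
- Carl Sagan
EOF

# First grep: each matching quote plus the line after it (-A 1).
# Second grep: keep only lines starting with "- ", that is, the authors.
# The trailing space in '^- ' skips the "--" line that grep prints
# between non-adjacent groups of context lines.
grep -i -A 1 universe "$workdir/sample.txt" | grep '^- '
```

With this sample, the result is the Einstein and Sagan author lines; Planck is left out because his quote does not mention the universe.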
sed
My favorite use for sed is to replace strings in files. For example:
$ cat quotes.txt | sed 's/universe/Universe/g'
This will replace universe with Universe and send the result to stdout. The g flag means "replace all occurrences of the string in each line."
Some variations for this are:
- Replace the string only if it's found in the first three lines:
sed '1,3 s/universe/Universe/g' quotes.txt
- Replace the n-th occurrence of a pattern in a line (for example, the second occurrence):
sed 's/universe/Universe/2' quotes.txt
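To see how the two variations differ, here is a small sketch on a four-line scratch file whose name and contents are invented for illustration:

```shell
# Scratch file: "universe" appears twice on line 1 and once on line 4
workdir=$(mktemp -d)
printf 'universe universe\nline two\nline three\nuniverse again\n' > "$workdir/sample.txt"

# Address range 1,3: every occurrence is replaced, but only on lines 1 to 3,
# so line 4 is printed unchanged
sed '1,3 s/universe/Universe/g' "$workdir/sample.txt"

# Numeric flag 2: only the second occurrence on each line is replaced,
# so line 4 (which has a single occurrence) is also printed unchanged
sed 's/universe/Universe/2' "$workdir/sample.txt"
```

The first command turns line 1 into "Universe Universe"; the second turns it into "universe Universe". In both cases line 4 still reads "universe again".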
These examples don't change the original file. If you want sed to change the file in place, use -i:
$ sed -i 's/universe/Universe/g' quotes.txt
If you use the -i flag, make sure that you know exactly what and how many occurrences will be affected, as it will modify the original file. To find out, you can run grep and search for the pattern first.
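One possible way to do that check, sketched on a made-up scratch file. The -i.bak form, which keeps a backup copy of the original, is my own suggestion rather than something the steps above require:

```shell
# Scratch file so the example never touches a real quotes.txt
workdir=$(mktemp -d)
printf 'the universe\nno match here\nuniverse, universe\n' > "$workdir/sample.txt"

# -c counts the lines that contain the pattern (lines, not occurrences)
grep -c universe "$workdir/sample.txt"

# -i.bak edits in place and saves the original as sample.txt.bak
sed -i.bak 's/universe/Universe/g' "$workdir/sample.txt"

# Compare the backup with the edited file; diff exits non-zero when the
# files differ, so "|| true" keeps a strict script from stopping here
diff "$workdir/sample.txt.bak" "$workdir/sample.txt" || true
```

Here grep reports 2 matching lines, and the diff shows exactly those two lines changed. Once you are happy with the result, the .bak file can be deleted.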
awk
The awk utility is very powerful, offering many options for processing text files.
Most of the situations where I use awk involve processing files with a structure (columns) that is reasonably predictable, including the character used as a column separator.
When awk processes a file, it splits each line using the "field separator" (the internal variable FS, which by default is a single space, meaning that any run of spaces or tabs separates fields). Each field is assigned to a positional variable: $1 contains the first field, $2 contains the second, and so forth. $0 represents the full line.
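A short sketch of the positional variables with the default separator. The input line is made up and deliberately mixes several spaces and a tab:

```shell
# Three fields separated by a run of spaces and by a tab
printf 'alpha   beta\tgamma\n' | awk '{ print $2 " / " $3 " / " NF }'

# $0 still holds the line exactly as it was read, original spacing included
printf 'alpha   beta\tgamma\n' | awk '{ print $0 }'
```

The first command prints beta / gamma / 3, which shows that the run of spaces and the tab each count as a single separator.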
You can also apply filters to each line. For example:
$ cat quotes.txt | awk '/universe/ { print NR " - " $0 }'
1 - "God does not play dice with the universe."
10 - "The cosmos is within us. We are made of star-stuff. We are a way for the universe to know itself."
The commands passed to awk use single quotes (it is like passing a mini-program to be interpreted):
- The /universe/ part tells awk to select only the lines that match this pattern.
- The "main" program goes between the curly brackets.
- NR is the internal variable that contains the number of the current record, which here is the current line number.
- I added the " - " string for aesthetics.
Some of the internal variables in awk are:
- NR: The total number of input records seen so far by the command
- NF: The number of fields in the current input record
- FS: The input field separator (a space by default)
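A minimal sketch that uses NR and NF together on a made-up three-line file, one line of which is empty:

```shell
# Scratch file: two lines with words and an empty line between them
workdir=$(mktemp -d)
printf 'one two\n\nthree four five\n' > "$workdir/sample.txt"

# The condition NF > 0 skips the empty line (it has zero fields);
# NR keeps counting every record, including the one that was skipped
awk 'NF > 0 { print NR ": " NF " fields" }' "$workdir/sample.txt"
```

This prints "1: 2 fields" and "3: 3 fields". The jump from 1 to 3 shows that NR counts records read, not lines printed.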
Here is an example using a more "predictable" file format:
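$ cat /etc/passwd | awk -F: '/nologin/ { print $1 }'
(output omitted)
...
redis
akmods
cjdns
haproxy
systemd-oom
In this last example:
- -F: sets the field separator to : instead of the default (space). It is set on the command line because awk splits each line before running the action, so assigning FS inside the curly brackets would only take effect from the following line.
- /nologin/ selects only the lines that contain this pattern.
- print $1 prints the first field in each line (considering that the separator is :).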
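Where the separator is set matters. The sketch below compares three spellings on a made-up passwd-style file (the account lines are invented); the comments describe the behavior POSIX specifies and gawk follows, and some older awk implementations may differ on the last command:

```shell
# Scratch file with three invented passwd-style entries
workdir=$(mktemp -d)
cat > "$workdir/passwd.sample" <<'EOF'
daemon:x:2:2:daemon:/sbin:/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash
redis:x:990:990:Redis:/var/lib/redis:/sbin/nologin
EOF

# Separator set before any input is read: every record is split on ":"
awk -F: '/nologin/ { print $1 }' "$workdir/passwd.sample"

# Equivalent form using a BEGIN block
awk 'BEGIN { FS = ":" } /nologin/ { print $1 }' "$workdir/passwd.sample"

# Separator set inside the action: the first matching record has already
# been split on whitespace, so its $1 is expected to be the whole line
awk '/nologin/ { FS = ":"; print $1 }' "$workdir/passwd.sample"
```

The first two commands print daemon and redis. Setting FS up front, with -F or in BEGIN, avoids the surprise on the first matching line.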
Learn more
Those were some simple examples for using grep, sed, and awk.
If you read the man pages for each, you will notice plenty of additional parameters and uses for these handy commands.
For simple use cases and things you do only once in a while, it is always good to have tools like these in your toolbox.
If the required action is more complex, it is worth considering whether these tools still make sense for you to use. For a corporate use case or managing "everything-as-code," I recommend using Ansible. Ansible modules have similar features that let you emulate the operations described above, with the advantage that Ansible modules are usually idempotent and that the full process will be documented somewhere (such as in your internal Git repo).
About the author
Roberto Nozaki (RHCSA/RHCE/RHCA) is an Automation Principal Consultant at Red Hat Canada where he specializes in IT automation with Ansible. He has experience in the financial, retail, and telecommunications sectors, having performed different roles in his career, from programming in mainframe environments to delivering IBM/Tivoli and Netcool products as a pre-sales and post-sales consultant.
Roberto has been a computer and software programming enthusiast for over 35 years. He is currently interested in hacking what he considers to be the ultimate hardware and software: our bodies and our minds.
Roberto lives in Toronto, and when he is not studying and working with Linux and Ansible, he likes to meditate, play the electric guitar, and research neuroscience, altered states of consciousness, biohacking, and spirituality.