A previous article covered the sort command. Sorting a file often leaves many duplicate lines adjacent to each other, which makes the output hard to read.
In this scenario, the uniq command helps by collapsing each run of adjacent identical lines into a single line. It keeps the first line of each run and discards the repeats, which makes the output much easier to view.
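Some implementations of uniq document limits on their input: lines may not exceed 2048 bytes in length (including the newline character) or contain null characters. GNU uniq, which is used for the examples in this article, does not impose a fixed line-length limit.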
Syntax
uniq [OPTION]... [INPUT [OUTPUT]]
Examples
Below are a series of examples, beginning with no options. We'll walk through several use cases. Some involve only uniq, and others rely on additional commands.
Without any option
Below is a file named file2, which contains some data. Note that this file is not sorted, so the duplicate lines are not adjacent to each other. Before using the uniq command with this file, we should sort it. In the example, I first run uniq on the original file, but it prints the file unchanged, much like cat, because no two identical lines are next to each other. I then pipe the output of sort into uniq, which shows how uniq behaves once the duplicates are adjacent:
$ cat file2
ChhatrapatiShahuMaharaj
Dr.B.R.Ambedkar
Budhha
Dr.B.R.Ambedkar
Budhha
Dr.B.R.Ambedkar
Budhha
$ uniq file2
ChhatrapatiShahuMaharaj
Dr.B.R.Ambedkar
Budhha
Dr.B.R.Ambedkar
Budhha
Dr.B.R.Ambedkar
Budhha
$ sort file2
Budhha
Budhha
Budhha
ChhatrapatiShahuMaharaj
Dr.B.R.Ambedkar
Dr.B.R.Ambedkar
Dr.B.R.Ambedkar
$ sort file2 | uniq
Budhha
ChhatrapatiShahuMaharaj
Dr.B.R.Ambedkar
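For plain whole-line comparisons like the one above, sort's own -u option should give the same result as piping sort into uniq. The following is a minimal sketch rather than a definitive recipe. It recreates the file2 contents through a helper function (sample_data is a name made up for this illustration) so that it can be run without creating any files:

```shell
# sample_data is a hypothetical helper that reproduces the file2 listing
sample_data() {
    printf '%s\n' ChhatrapatiShahuMaharaj Dr.B.R.Ambedkar Budhha \
        Dr.B.R.Ambedkar Budhha Dr.B.R.Ambedkar Budhha
}

# uniq alone: no identical lines are adjacent, so nothing is removed
uniq_only=$(sample_data | uniq)

# sort first so duplicates become adjacent, then let uniq collapse them
sorted_uniq=$(sample_data | sort | uniq)

# sort -u sorts and drops duplicates in a single step
sort_u=$(sample_data | sort -u)

echo "uniq alone keeps $(printf '%s\n' "$uniq_only" | wc -l) lines"
echo "sort | uniq keeps $(printf '%s\n' "$sorted_uniq" | wc -l) lines"
```

The pipe form is still the one to use when you need uniq options such as -c or -d, which sort -u does not offer.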
With -c, --count option
The -c option counts how many times each line occurs. The uniq command prints that count as a prefix on each line. The example below tells us that the first line occurs three times, the second line once, and the third line three times:
$ sort file2 | uniq -c
3 Budhha
1 ChhatrapatiShahuMaharaj
3 Dr.B.R.Ambedkar
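A common use of -c is ranking lines by how often they occur, by sorting the counted output numerically. Here is a small sketch of that pipeline. The input words and the words helper are invented for illustration and are not part of the article's sample files:

```shell
# Hypothetical input: three distinct words with different frequencies
words() {
    printf '%s\n' cherry apple banana apple cherry apple
}

# 1. sort groups identical lines together
# 2. uniq -c collapses each group and prefixes its count
# 3. sort -rn orders the counted lines by that number, highest first
ranked=$(words | sort | uniq -c | sort -rn)
printf '%s\n' "$ranked"

# The most frequent line is the first one; awk keeps only the word column
top=$(printf '%s\n' "$ranked" | head -n 1 | awk '{print $2}')
echo "most frequent: $top"
```

The counts are padded with leading spaces in the real output, which is why the sketch uses awk to pick out a column instead of cutting at a fixed character position.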
With -d, --repeated option
The -d option prints only lines that are repeated. It discards non-duplicate lines. Therefore, line ChhatrapatiShahuMaharaj has been discarded in the below example:
$ sort file2 | uniq -d
Budhha
Dr.B.R.Ambedkar
In the below example, I’ve used the -c option to cross-check whether the -d option is only printing the repeated lines or not:
$ sort file2 | uniq -cd
3 Budhha
3 Dr.B.R.Ambedkar
With -D, --all-repeated option
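The -D option prints every occurrence of each repeated line, whereas -d prints only one line per group of duplicates. Non-duplicate lines are still discarded. In the below example, the uniq command prints all three copies of each duplicate line and drops ChhatrapatiShahuMaharaj: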
$ sort file2 | uniq -D
Budhha
Budhha
Budhha
Dr.B.R.Ambedkar
Dr.B.R.Ambedkar
Dr.B.R.Ambedkar
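With every copy printed, it can be hard to see where one group ends and the next begins. GNU uniq accepts a long form of -D, --all-repeated=separate, that puts a blank line between groups. This is a sketch that assumes GNU coreutils (other implementations may not support it), and sample_data is again a made-up helper that recreates file2:

```shell
# sample_data is a hypothetical helper that reproduces the file2 listing
sample_data() {
    printf '%s\n' ChhatrapatiShahuMaharaj Dr.B.R.Ambedkar Budhha \
        Dr.B.R.Ambedkar Budhha Dr.B.R.Ambedkar Budhha
}

# Print all duplicate lines, with an empty line between each group
grouped=$(sample_data | sort | uniq --all-repeated=separate)
printf '%s\n' "$grouped"
```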
With -u, --unique option
The -u option does the opposite: it prints only the unique lines, that is, the lines that are not repeated. Therefore, in the below example, it prints only ChhatrapatiShahuMaharaj:
$ sort file2 | uniq -u
ChhatrapatiShahuMaharaj
With -i, --ignore-case option
Using the -i option, we can ignore differences in case when comparing lines. Below I have given the output of the uniq command with and without the -i option to compare:
$ cat file3
aaaa
aaaa
AAAA
AAAA
bbbb
BBBB
$ uniq file3
aaaa
AAAA
bbbb
BBBB
$ uniq -i file3
aaaa
bbbb
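The file3 example works because the case variants already sit next to each other. For unsorted input, one approach is to sort with -f (fold case) first so that the variants become adjacent. The sketch below uses invented sample data and a hypothetical names helper:

```shell
# Hypothetical unsorted input with mixed-case duplicates
names() {
    printf '%s\n' Apple banana apple BANANA APPLE
}

# uniq -i alone: case variants are not adjacent, so all 5 lines remain
unsorted_count=$(names | uniq -i | wc -l)

# sort -f ignores case while sorting, which makes the variants adjacent
distinct_count=$(names | sort -f | uniq -i | wc -l)

echo "without sorting: $unsorted_count lines"
echo "with sort -f:    $distinct_count lines"

# Adding -c shows how many variants fell into each group
names | sort -f | uniq -ic
```

Which spelling of each group gets printed depends on how sort orders the variants in your locale, so it is safer not to rely on it.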
With -f, --skip-fields=N option
Sometimes we need to skip some fields to filter duplicate lines. This is possible using the -f option. In the following example, we’re skipping the first field (first column) to compare the duplicate lines from the second field. I’ve given both examples, with and without the -f option, for a better understanding of the option’s behavior:
$ cat file5
Amit aaaa
Ajit aaaa
Advi bbbb
Kaju bbbb
$ uniq file5
Amit aaaa
Ajit aaaa
Advi bbbb
Kaju bbbb
$ uniq -f 1 file5
Amit aaaa
Advi bbbb
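A typical case for -f is log output where the first field is a timestamp that differs on every line. The log lines and the log_lines helper below are made up for illustration. The sketch also shows that adjacency still matters when fields are skipped:

```shell
# Hypothetical log: field 1 is a timestamp, the rest is the message
log_lines() {
    printf '%s\n' '10:01 disk full' '10:02 disk full' \
        '10:03 link down' '10:04 disk full'
}

# Skip the timestamp field so only the message text is compared
collapsed=$(log_lines | uniq -f 1)
printf '%s\n' "$collapsed"

# -c adds how many consecutive lines each message covered
log_lines | uniq -c -f 1
```

The last "disk full" line is kept because it is not adjacent to the first two.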
With -s, --skip-chars=N option
Just as -f skips fields, the -s option skips a given number of characters at the start of each line before comparing. Please keep in mind that the uniq command prints only the first line of each group of duplicates and discards the rest. With the first two characters skipped, 33aa matches 22aa and 55bb matches 44bb, so 33aa and 55bb have been discarded. Here is the example:
$ cat file4
22aa
33aa
44bb
55bb
$ uniq file4
22aa
33aa
44bb
55bb
$ uniq -s 2 file4
22aa
44bb
With -w, --check-chars=N option
Where -s skips characters, the -w option limits the comparison to the first N characters of each line. In the example, only the first two characters are compared:
$ cat file6
aa12
aa34
bb56
bb78
$ uniq file6
aa12
aa34
bb56
bb78
$ uniq -w 2 file6
aa12
bb56
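The -s and -w options can also be combined to compare only a slice in the middle of each line: characters are skipped first, and then at most N of the remaining characters are compared. The record layout and the records helper below are assumptions made for this sketch:

```shell
# Hypothetical records: 2-digit id, 2-letter code, 1-letter suffix
records() {
    printf '%s\n' 01aaX 02aaY 03bbZ
}

# Skip the id (-s 2), then compare only the 2-letter code (-w 2)
by_code=$(records | uniq -s 2 -w 2)
printf '%s\n' "$by_code"
```

Here 02aaY is dropped because its code matches the line before it, even though its id and suffix differ.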
With --version option
Use the --version option to check the version of the uniq command.
$ uniq --version
uniq (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Richard M. Stallman and David MacKenzie.
Wrap up
uniq does not detect repeated lines unless they are adjacent, so the input usually needs to be sorted first. The uniq command can count and print the number of times each line occurs. It can print only the duplicate lines or only the unique (non-duplicate) lines, and it can ignore case when comparing. We can also skip fields and characters before comparing lines, or limit the comparison to a set number of characters.
About the author
I'm a techie guy with lots of love for Linux. I started my career on a US-based project as a Linux Administrator. Later, I got an opportunity to work with HPC clusters, where I learned several other products. I love to teach, write blogs, troubleshoot complex issues, and write scripts to automate tasks. I also love to read books and watch movies/web series.