Our previous article covered the sort command. Sorting a file often leaves many duplicate lines adjacent to each other, which makes the output hard to read.

In this scenario, the uniq command helps. It collapses each run of adjacent identical lines into a single line: the first line of the run is printed and the repeats are discarded, which makes the output much easier to view.

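Some uniq implementations document limits on the input: lines may not exceed 2048 bytes in length (including the newline character) and may not contain null characters. GNU uniq, which is used in the examples below, is not documented as having a fixed line-length limit.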

Syntax

uniq [OPTION]... [INPUT [OUTPUT]]

Examples

Below are a series of examples, beginning with no options. We'll walk through several use cases. Some involve only uniq, and others rely on additional commands.

Without any option

Below is a file named file2, which contains some data. Note that this file is not sorted, so the duplicate lines are not adjacent to each other. Because uniq compares only adjacent lines, running it on the original file prints the contents unchanged, much like cat. Once the file is sorted, the duplicates become adjacent, and piping the sort output into uniq removes them. Comparing the four commands below shows this behavior:

$ cat file2
ChhatrapatiShahuMaharaj
Dr.B.R.Ambedkar
Budhha
Dr.B.R.Ambedkar
Budhha
Dr.B.R.Ambedkar
Budhha

$ uniq file2
ChhatrapatiShahuMaharaj
Dr.B.R.Ambedkar
Budhha
Dr.B.R.Ambedkar
Budhha
Dr.B.R.Ambedkar
Budhha

$ sort file2
Budhha
Budhha
Budhha
ChhatrapatiShahuMaharaj
Dr.B.R.Ambedkar
Dr.B.R.Ambedkar
Dr.B.R.Ambedkar

$ sort file2 | uniq
Budhha
ChhatrapatiShahuMaharaj
Dr.B.R.Ambedkar
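
As a quick check of the adjacency rule, here is a minimal sketch that recreates the sample data in a shell variable (the variable names are only illustrative, they are not part of the original example) and compares sort piped into uniq with sort -u, which should give the same result when whole lines are compared:

```shell
# Recreate the contents of file2 in a variable (illustrative stand-in for the file)
data=$(printf '%s\n' ChhatrapatiShahuMaharaj Dr.B.R.Ambedkar Budhha \
    Dr.B.R.Ambedkar Budhha Dr.B.R.Ambedkar Budhha)

# Unsorted input: no duplicates are adjacent, so uniq removes nothing
unsorted_count=$(printf '%s\n' "$data" | uniq | wc -l | tr -d ' ')

# Sorted input: duplicates are adjacent, so uniq collapses them
via_uniq=$(printf '%s\n' "$data" | sort | uniq)

# sort -u sorts and removes duplicates in one step
via_sort_u=$(printf '%s\n' "$data" | sort -u)

echo "lines left by uniq alone: $unsorted_count"
echo "$via_uniq"
```

The pipeline form is still worth knowing, because options such as -c, -d, and -u are only available through uniq.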

With -c, --count option

The -c option prefixes each output line with the number of times that line occurred in the input. In the example below, the first line occurs three times, the second line once, and the third line three times:

$ sort file2 | uniq -c
    3 Budhha
    1 ChhatrapatiShahuMaharaj
    3 Dr.B.R.Ambedkar
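
A common follow-up is to rank lines by how often they occur. The sketch below uses a made-up word list (not one of the article's files) and pipes the counts through a second, numeric sort; the awk step that strips the count assumes the lines contain no spaces:

```shell
# Illustrative input: one word per line, unsorted
words=$(printf '%s\n' pear apple pear banana apple pear)

# 1) sort groups identical lines together
# 2) uniq -c prefixes each group with its count
# 3) sort -rn orders the groups by count, highest first
ranked=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn)
echo "$ranked"

# Take the top entry and drop the count (assumes single-word lines)
top=$(printf '%s\n' "$ranked" | head -n 1 | awk '{print $2}')
echo "most frequent: $top"
```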

With -d, --repeated option

The -d option prints only lines that are repeated, one line per group of duplicates, and discards lines that occur only once. Therefore, the line ChhatrapatiShahuMaharaj is discarded in the example below:

$ sort file2 | uniq -d
Budhha
Dr.B.R.Ambedkar

In the example below, I’ve combined the -c and -d options to confirm that -d prints only the repeated lines:

$ sort file2 | uniq -cd
    3 Budhha
    3 Dr.B.R.Ambedkar

With -D, --all-repeated option

The -D option also discards the non-duplicate lines, but unlike -d, which prints one line per group of duplicates, it prints every copy of each repeated line:

$ sort file2 | uniq -D
Budhha
Budhha
Budhha
Dr.B.R.Ambedkar
Dr.B.R.Ambedkar
Dr.B.R.Ambedkar
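
To make the difference between -d and -D easier to see, here is a small sketch on the same (already sorted) data. It assumes GNU uniq, since the --all-repeated=separate form used in the last step is a GNU coreutils option and may not exist in other implementations:

```shell
# Sorted copy of the sample data (illustrative variable, not a file)
data=$(printf '%s\n' Budhha Budhha Budhha ChhatrapatiShahuMaharaj \
    Dr.B.R.Ambedkar Dr.B.R.Ambedkar Dr.B.R.Ambedkar)

# -d: one line per group of duplicates
one_per_group=$(printf '%s\n' "$data" | uniq -d)

# -D: every line that belongs to a group of duplicates
every_copy=$(printf '%s\n' "$data" | uniq -D)

# GNU-only assumption: separate the groups with a blank line
grouped=$(printf '%s\n' "$data" | uniq --all-repeated=separate)

echo "$grouped"
```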

With -u, --unique option

The -u option does the opposite: it prints only unique lines, i.e., lines that are not repeated. Therefore, in the example below, the only output is ChhatrapatiShahuMaharaj:

$ sort file2 | uniq -u
ChhatrapatiShahuMaharaj
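
One practical use of -u and -d together is comparing two lists. The sketch below uses two made-up lists and assumes that neither list contains duplicates of its own; under that assumption, a line that appears twice after sorting must be in both lists:

```shell
# Two illustrative lists, each free of internal duplicates (an assumption)
list_a=$(printf '%s\n' alice bob carol)
list_b=$(printf '%s\n' bob carol dave)

# Lines that appear in exactly one of the two lists
only_in_one=$(printf '%s\n%s\n' "$list_a" "$list_b" | sort | uniq -u)

# Lines that appear in both lists
in_both=$(printf '%s\n%s\n' "$list_a" "$list_b" | sort | uniq -d)

echo "only in one list:"; echo "$only_in_one"
echo "in both lists:";    echo "$in_both"
```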

With -i, --ignore-case option

With the -i option, uniq ignores differences in case when comparing lines. Below is the output of the uniq command with and without the -i option for comparison:

$ cat file3
aaaa
aaaa
AAAA
AAAA
bbbb
BBBB

$ uniq file3
aaaa
AAAA
bbbb
BBBB

$ uniq -i file3
aaaa
bbbb
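
When the input is not already grouped, -i still needs lines that differ only in case to be adjacent. Whether a plain sort places them together depends on the locale, so the sketch below pairs uniq -i with sort -f, which folds case while sorting. The input is made up for illustration:

```shell
# Illustrative unsorted input with mixed case
mixed=$(printf '%s\n' aaaa bbbb AAAA BBBB aaaa)

# Case-sensitive: aaaa, AAAA, bbbb, and BBBB all count as different lines
plain_count=$(printf '%s\n' "$mixed" | sort | uniq | wc -l | tr -d ' ')

# Case-insensitive sort followed by case-insensitive uniq
folded_count=$(printf '%s\n' "$mixed" | sort -f | uniq -i | wc -l | tr -d ' ')

echo "distinct lines, case-sensitive:   $plain_count"
echo "distinct lines, case-insensitive: $folded_count"
```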

With -f, --skip-fields=N

Sometimes we need to ignore leading fields when filtering duplicate lines. The -f N option skips the first N fields, which are separated by blanks, before comparing. In the following example, we’re skipping the first field (first column), so lines are compared from the second field onward. I’ve given both examples, with and without the -f option, for a better understanding of the option’s behavior:

$ cat file5
Amit aaaa
Ajit aaaa
Advi bbbb
Kaju bbbb

$ uniq file5
Amit aaaa
Ajit aaaa
Advi bbbb
Kaju bbbb

$ uniq -f 1 file5
Amit aaaa
Advi bbbb
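
Skipping fields is handy when each line starts with something that always differs, such as a timestamp. The sketch below uses a made-up log excerpt; note that the last "disk full" line survives because it is not adjacent to the first two:

```shell
# Illustrative log lines: a timestamp field followed by a message
log=$(printf '%s\n' '10:01 disk full' '10:02 disk full' \
    '10:03 backup ok' '10:04 disk full')

# Skip the first field so only the message is compared
collapsed=$(printf '%s\n' "$log" | uniq -f 1)

echo "$collapsed"
```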


With -s, --skip-chars=N option

Just as with fields, we can skip leading characters by using the -s option. In the example below, the first two characters of each line are ignored during comparison. Keep in mind that the uniq command prints only the first line of each group of duplicates and discards the rest. Therefore, 33aa and 55bb have been discarded. Here is the example:

$ cat file4
22aa
33aa
44bb
55bb

$ uniq file4
22aa
33aa
44bb
55bb

$ uniq -s 2 file4
22aa
44bb

With -w, --check-chars=N option

Instead of skipping characters, the -w option limits the comparison to the first N characters of each line. In the example below, only the first two characters are compared:

$ cat file6
aa12
aa34
bb56
bb78

$ uniq file6
aa12
aa34
bb56
bb78

$ uniq -w 2 file6
aa12
bb56
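
The -s and -w options can also be combined to compare only a window in the middle of each line. This is a minimal sketch with made-up records, assuming GNU uniq (the -w option is not available in every implementation):

```shell
# Illustrative records: 2-digit id, 2-letter code, 1-letter suffix
records=$(printf '%s\n' 01aaX 02aaY 03bbZ)

# Skip the 2-character id, then compare only the next 2 characters (the code)
windowed=$(printf '%s\n' "$records" | uniq -s 2 -w 2)

echo "$windowed"
```

Here 02aaY is dropped because its code matches the line before it, even though the id and suffix differ.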

With --version option

Use the --version option to check the version of the uniq command.

$ uniq --version
uniq (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Richard M. Stallman and David MacKenzie.


Wrap up

The uniq command does not detect repeated lines unless they are adjacent, which is why it is usually paired with sort. It can count how many times each line occurs (-c), print only the repeated lines (-d, -D) or only the unique ones (-u), and ignore case when comparing (-i). It can also skip leading fields (-f) or characters (-s) before comparing, and limit the comparison to the first N characters (-w).

Having reviewed the uniq command's options, I would like to share a small image that you can keep for reference.

The uniq command

About the author

I'm a techie guy with lots of love for Linux. I started my career on a US-based project as a Linux Administrator. Later, I got an opportunity to work with HPC clusters, where I learned several other products. I love to teach, write blogs, troubleshoot complex issues, and write scripts to automate tasks. I also love to read books and watch movies/web series.
