Syntax
uniq [-c | -d | -u]
Description
The uniq command filters out adjacent duplicate lines from input. It’s most commonly used after sort to remove all duplicates from a dataset, or to count how many times each unique line appears.
Important: uniq only removes adjacent duplicates. To remove all duplicates regardless of position, pipe through sort first.
Nash’s uniq operates in-memory and supports counting, showing only duplicates, or showing only unique lines.
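A quick illustration of the adjacent-only behavior (a minimal sketch using printf so it runs anywhere):

```shell
# "apple" repeats, but not on adjacent lines, so uniq keeps both copies
printf 'apple\nbanana\napple\n' | uniq
# prints: apple, banana, apple

# sorting first makes the duplicates adjacent, so they collapse
printf 'apple\nbanana\napple\n' | sort | uniq
# prints: apple, banana
```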
Options
-c  Count mode. Prefix each output line with the number of times it appeared consecutively. Output format: count right-aligned in a 7-character field, followed by a space and the line.
-d  Duplicates only. Output only lines that appear more than once (consecutively).
-u  Unique only. Output only lines that appear exactly once (no consecutive duplicates).
Flags -d and -u are mutually exclusive. If both are specified, -d takes precedence.
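The three flags applied to the same input (a sketch; the counts follow the 7-character right-aligned format described above):

```shell
# input: a a b c c c (one item per line)
printf 'a\na\nb\nc\nc\nc\n' | uniq -c   # counts: 2 a, 1 b, 3 c
printf 'a\na\nb\nc\nc\nc\n' | uniq -d   # duplicated lines: a, c
printf 'a\na\nb\nc\nc\nc\n' | uniq -u   # singletons: b
```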
Examples
Basic duplicate removal
echo -e "apple\napple\nbanana\nbanana\nbanana\ncherry" | uniq
Count consecutive occurrences
echo -e "apple\napple\nbanana\nbanana\nbanana\ncherry" | uniq -c
      2 apple
      3 banana
      1 cherry
Show only duplicates
echo -e "apple\napple\nbanana\ncherry" | uniq -d
Only “apple” appears more than once consecutively.
Show only unique lines
echo -e "apple\napple\nbanana\ncherry" | uniq -u
Lines that appear exactly once.
Non-adjacent duplicates
echo -e "apple\nbanana\napple" | uniq
Notice “apple” appears twice in output because the duplicates are not adjacent. Use sort | uniq to remove all duplicates.
Complete deduplication
echo -e "apple\nbanana\napple\ncherry\nbanana" | sort | uniq
Pipeline Examples
Count unique IP addresses
cat access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn
- Extract IP addresses
- Sort them
- Count each unique IP
- Sort by count (most frequent first)
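The same pipeline with a small inline sample (hypothetical log lines, with the IP as the first space-separated field):

```shell
# three requests from two IPs; count and rank them
printf '%s\n' \
  '10.0.0.1 GET /index' \
  '10.0.0.2 GET /index' \
  '10.0.0.1 GET /about' \
  | cut -d' ' -f1 | sort | uniq -c | sort -rn
# 10.0.0.1 (count 2) is listed first, then 10.0.0.2 (count 1)
```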
Find duplicate entries
cat emails.txt | sort | uniq -d
Shows emails that appear more than once.
Find unique entries
cat transactions.txt | sort | uniq -u
Shows transactions that appear exactly once.
Most frequent errors
grep ERROR app.log | sort | uniq -c | sort -rn | head -10
Shows top 10 most common error messages.
Remove duplicates from sorted data
sort data.txt | uniq > unique-data.txt
Or use sort -u for the same effect:
sort -u data.txt > unique-data.txt
Practical Use Cases
Deduplicate email list
cat emails.txt | sort | uniq > unique-emails.txt
Find repeated log entries
cat system.log | sort | uniq -d
Identifies repeated messages (potential issues).
Count word frequency
cat document.txt | tr ' ' '\n' | sort | uniq -c | sort -rn
- Split into words (one per line)
- Sort alphabetically
- Count each word
- Sort by frequency
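The steps above, run on a one-line sample document (assuming words are separated by single spaces):

```shell
# "the" appears twice; every other word appears once
printf 'the cat sat on the mat\n' | tr ' ' '\n' | sort | uniq -c | sort -rn
# "the" (count 2) is ranked first
```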
Verify data consistency
cat data.csv | cut -d, -f1 | sort | uniq -d
Finds duplicate IDs in first column.
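A self-contained version with an inline sample (the CSV layout with a header row, skipped by tail, is an assumption for illustration):

```shell
# ID 1 appears on two rows; tail -n +2 skips the header line
printf 'id,name\n1,alice\n2,bob\n1,carol\n' \
  | tail -n +2 \
  | cut -d, -f1 | sort | uniq -d
# prints: 1
```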
Create unique tag list
cat articles.txt | grep tags: | sed 's/tags: //' | tr ',' '\n' | sort | uniq
Extracts and deduplicates all tags.
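With inline sample lines (the "tags: " prefix and comma-separated tag list are assumptions about the file format):

```shell
# two articles share the "unix" tag; each tag should appear once
printf 'tags: shell,unix\ntags: unix,tips\n' \
  | grep 'tags:' | sed 's/tags: //' | tr ',' '\n' | sort | uniq
# prints: shell, tips, unix
```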
Compare file contents
cat file1.txt file2.txt | sort | uniq -d
Shows lines that appear in both files (assuming neither file contains internal duplicates).
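The naive concatenation also flags lines duplicated inside a single file; deduplicating each file first avoids that (a sketch using throwaway sample files in a temp directory):

```shell
tmp=$(mktemp -d)
printf 'a\na\nb\n' > "$tmp/f1"   # "a" is duplicated within f1 only
printf 'b\nc\n'    > "$tmp/f2"   # "b" appears in both files

# naive: flags "a" too, even though it never appears in f2
cat "$tmp/f1" "$tmp/f2" | sort | uniq -d     # prints: a, b

# dedupe each file first so only cross-file duplicates remain
sort -u "$tmp/f1" > "$tmp/f1u"
sort -u "$tmp/f2" > "$tmp/f2u"
cat "$tmp/f1u" "$tmp/f2u" | sort | uniq -d   # prints: b
```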
Advanced Examples
Top 10 most common items
cat data.txt | sort | uniq -c | sort -rn | head -10
Bottom 10 least common items
cat data.txt | sort | uniq -c | sort -n | head -10
Items appearing exactly N times
cat data.txt | sort | uniq -c | grep "^ *3 "
Finds items appearing exactly 3 times. The count is right-aligned, so the pattern must allow leading spaces.
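An awk alternative that does not depend on the count column's exact spacing, shown with inline sample data (awk compares the count field numerically):

```shell
# items: x x x y -- awk keeps rows whose count field equals 3
# ($2 assumes single-word items; multi-word lines would be truncated)
printf 'x\nx\nx\ny\n' | sort | uniq -c | awk '$1 == 3 { print $2 }'
# prints: x
```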
Frequency distribution
cat data.txt | sort | uniq -c | sort -rn
Complete frequency table, most to least common.
Find singletons and duplicates
echo "Unique entries:"
cat data.txt | sort | uniq -u
echo -e "\nDuplicate entries:"
cat data.txt | sort | uniq -d
The -c flag outputs counts right-aligned in a 7-character field:
echo -e "a\na\na" | uniq -c
      3 a
Format: [count, right-aligned in 7 characters] [space] [line content]
Comparison with sort -u
echo -e "b\na\nb" | uniq
Only removes adjacent duplicates; output: b, a, b.
echo -e "b\na\nb" | sort | uniq
Removes all duplicates; output: a, b.
echo -e "b\na\nb" | sort -u
Same output as sort | uniq, but more efficient (a single command).
Tips
Always sort first
For complete deduplication:
# Wrong - non-adjacent duplicates survive
cat file.txt | uniq
# Right - removes all duplicates
cat file.txt | sort | uniq
Counting technique
To count all occurrences of each unique item:
sort file.txt | uniq -c
This is a fundamental pattern for frequency analysis.
Filter by frequency
Show only items appearing more than once:
sort data.txt | uniq -c | grep -v "^ *1 "
Case sensitivity
uniq is case-sensitive. “Apple” and “apple” are different:
echo -e "Apple\napple" | uniq # Both appear
To ignore case, convert first:
cat file.txt | tr '[:upper:]' '[:lower:]' | sort | uniq
Empty lines
Empty lines are treated as distinct values:
echo -e "\n\na" | uniq -c
The output shows a count of 2 for the empty line, then 1 for "a".
Common Patterns
Frequency histogram
cat data.txt | sort | uniq -c | sort -rn
Sorted by frequency, descending.
Find unique values
sort -u data.txt
Equivalent to sort | uniq, but more efficient.
Duplicates only with count
cat file.txt | sort | uniq -c | grep -v "^ *1 "
Shows items appearing multiple times with counts.
Intersection of two files
cat file1.txt file2.txt | sort | uniq -d
Finds lines common to both files (lines duplicated within a single file will also match).
Related Commands
- sort - Sort lines (required for complete deduplication)
- grep - Filter by pattern
- wc - Count lines/words
- cut - Extract fields before deduplication
- sed - Transform before deduplication