When working with files on the command line, you may need to compare two files and display only the lines that differ between them. The diff command, available in virtually every Bash environment, is a powerful tool for this purpose. In this article, we will explore how to use diff to show only the rows that differ between two files, along with examples and explanations.
Understanding the Problem
The challenge is to compare two text files and output only the lines that are not identical between them. The diff command is commonly used for this task, but its default output wraps the changed lines in extra markup: change commands such as 3c3, < and > prefixes, and --- separators. Therefore, we need to refine the command to extract only the differing rows themselves.
The Basic diff Command
Before we dive into the solution, let’s look at a basic example of the diff command:
diff file1.txt file2.txt
This command will output the differences between file1.txt and file2.txt, showing lines that have been added, deleted, or changed.
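To make the default format concrete, here is a minimal sketch that builds two small sample files in a temporary directory and captures the plain diff output. The file contents and the variable names (workdir, default_output) are illustrative assumptions, not part of the original command.

```shell
# Create two small sample files in a temporary directory (contents are illustrative).
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$workdir/file1.txt"
printf '%s\n' apple banana citrus date elderberry > "$workdir/file2.txt"

# diff exits with status 1 when the files differ, so "|| true" keeps
# the script going even if it runs under "set -e".
default_output=$(diff "$workdir/file1.txt" "$workdir/file2.txt" || true)
printf '%s\n' "$default_output"
# Prints:
# 3c3
# < cherry
# ---
# > citrus
# 4a5
# > elderberry

rm -rf "$workdir"
```

Lines prefixed with < come from the first file and lines prefixed with > come from the second. The 3c3 and 4a5 lines are change commands that say where each difference sits.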
Displaying Only Different Rows
To display only the differing rows, we can combine diff and its -u flag with standard text filters such as grep and sed.
Solution Using diff
Here’s how you can do this:
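diff -u file1.txt file2.txt | tail -n +3 | grep '^+' | sed 's/^+//'
Breakdown of the Command
- diff -u: This option generates unified diff output, which begins with two header lines (--- file1.txt and +++ file2.txt) and then lists each changed region along with a few unchanged context lines. Lines that appear only in file1.txt are prefixed with -, lines that appear only in file2.txt are prefixed with +, and context lines are prefixed with a space.
- tail -n +3: This command skips the two header lines. Without it, the +++ header would match the next filter and show up in the results as if it were a line of content.
- grep '^+': This command filters the output from diff, keeping only lines that start with a +, which indicates lines present in file2.txt but not in file1.txt.
- sed 's/^+//': This command removes the + symbol from the beginning of each line, leaving just the differing text.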
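The stages of such a pipeline are easier to follow when each one is captured separately. The sketch below is one way to do that, under a few assumptions: the sample file contents and the variable names (workdir, stage1, stage2, stage3) are made up for illustration, and the two unified-diff header lines (--- and +++) are dropped with tail -n +3 so that the +++ header is not mistaken for an added line.

```shell
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$workdir/file1.txt"
printf '%s\n' apple banana citrus date elderberry > "$workdir/file2.txt"

# Stage 1: unified diff without its two header lines.
stage1=$(diff -u "$workdir/file1.txt" "$workdir/file2.txt" | tail -n +3)
printf '%s\n' "$stage1"
# @@ -1,4 +1,5 @@
#  apple
#  banana
# -cherry
# +citrus
#  date
# +elderberry

# Stage 2: keep only the lines marked with a leading '+'.
stage2=$(printf '%s\n' "$stage1" | grep '^+')
printf '%s\n' "$stage2"
# +citrus
# +elderberry

# Stage 3: strip the '+' marker, leaving the original text.
stage3=$(printf '%s\n' "$stage2" | sed 's/^+//')
printf '%s\n' "$stage3"
# citrus
# elderberry

rm -rf "$workdir"
```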
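Handling Lines Present Only in file1.txt
To see lines that appear in file1.txt but not in file2.txt, you can modify the command slightly:
diff -u file1.txt file2.txt | tail -n +3 | grep '^-' | sed 's/^-//'
Here, tail -n +3 skips the two header lines of the unified output so that the --- header is not mistaken for content. grep '^-' then filters for lines that start with a -, indicating lines that are present in file1.txt but not in file2.txt, and sed strips that leading - from each match.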
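Both directions can be wrapped in small shell functions. This is a sketch, not a standard tool: the function names only_in_first and only_in_second and the sample files are hypothetical. The sample data deliberately contains lines that themselves begin with - or + to show that only the single marker character added by diff is stripped, and that the header lines are skipped by position (tail -n +3) instead of by pattern.

```shell
# Hypothetical helpers; neither function is part of diff itself.
only_in_first() {
    diff -u "$1" "$2" | tail -n +3 | grep '^-' | sed 's/^-//'
}

only_in_second() {
    diff -u "$1" "$2" | tail -n +3 | grep '^+' | sed 's/^+//'
}

workdir=$(mktemp -d)
printf '%s\n' '--verbose' shared > "$workdir/old.txt"
printf '%s\n' shared '+1 bonus' > "$workdir/new.txt"

removed=$(only_in_first "$workdir/old.txt" "$workdir/new.txt")
added=$(only_in_second "$workdir/old.txt" "$workdir/new.txt")

printf 'only in old.txt: %s\n' "$removed"   # only in old.txt: --verbose
printf 'only in new.txt: %s\n' "$added"     # only in new.txt: +1 bonus

rm -rf "$workdir"
```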
Example Scenario
Let’s say we have the following content in two files:
file1.txt
apple
banana
cherry
date
file2.txt
apple
banana
citrus
date
elderberry
When we run the commands:
# tail -n +3 skips the two unified-diff header lines (--- and +++)
# To find lines in file2.txt not in file1.txt
diff -u file1.txt file2.txt | tail -n +3 | grep '^+' | sed 's/^+//'
Output will be:
citrus
elderberry
# To find lines in file1.txt not in file2.txt
diff -u file1.txt file2.txt | tail -n +3 | grep '^-' | sed 's/^-//'
Output will be:
cherry
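In scripts it often helps to know whether the files differed at all. diff reports this through its exit status: 0 when the files are identical, 1 when they differ, and a value greater than 1 when something went wrong, such as a missing file. The sketch below captures that status before filtering. The helper name added_lines and the variable names are assumptions made for illustration.

```shell
# added_lines is a hypothetical helper name, not a standard command.
# It prints lines found only in the second file and returns diff's exit status.
# The body runs in a subshell ( ... ) so its variables do not leak out.
added_lines() (
    raw=$(diff -u "$1" "$2") && status=0 || status=$?
    if [ "$status" -eq 1 ]; then
        # Skip the two header lines, keep '+' lines, strip the marker.
        printf '%s\n' "$raw" | tail -n +3 | grep '^+' | sed 's/^+//'
    fi
    exit "$status"
)

workdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$workdir/file1.txt"
printf '%s\n' apple banana citrus date elderberry > "$workdir/file2.txt"

result=$(added_lines "$workdir/file1.txt" "$workdir/file2.txt") && rc_diff=0 || rc_diff=$?
added_lines "$workdir/file1.txt" "$workdir/file1.txt" && rc_same=0 || rc_same=$?
added_lines "$workdir/file1.txt" "$workdir/missing.txt" 2>/dev/null && rc_err=0 || rc_err=$?

printf '%s\n' "$result"
# citrus
# elderberry
printf 'differ=%s same=%s error=%s\n' "$rc_diff" "$rc_same" "$rc_err"
# differ=1 same=0, and error is greater than 1

rm -rf "$workdir"
```

Capturing the diff output first, then filtering it, keeps the exit status available. In a plain pipeline the status reported would be that of the last command (sed), not diff.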
Conclusion
The diff command is an essential tool for comparing files in Bash. By combining diff with grep and sed, you can extract only the differing rows from two files. This saves time and simplifies the process of identifying differences in large text files.
By mastering these commands, you can improve your productivity while working with text files in Bash. Happy scripting!