How to display only different rows using diff (bash)

2 min read 08-10-2024

When working with files on the command line, you may need to compare two files and display only the lines that differ between them. The diff command, a standard utility on Unix-like systems, is a powerful tool for this purpose. In this article, we will explore how to use diff from Bash to show only the rows that differ between two files, along with examples and explanations.

Understanding the Problem

The challenge is to compare two text files and output only the lines that are not identical between them. The diff command is commonly used for this task, but its output contains more than the differing lines: the default format adds change commands (such as 3c3) and < / > markers, and the unified format (-u) adds header lines and unchanged context lines. Therefore, we need to filter the output to extract only the differing rows.

The Basic diff Command

Before we dive into the solution, let’s look at a basic example of the diff command:

diff file1.txt file2.txt

This command will output the differences between file1.txt and file2.txt, showing lines that have been added, deleted, or changed.
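For reference, here is a minimal sketch of what the default output looks like and how it can be reduced to the content lines alone. The sample files are made up for illustration (they reuse the fruit lists from the example scenario later in this article) and are written to a temporary directory.

```shell
#!/usr/bin/env bash
# Sample files for illustration only.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$tmpdir/file1.txt"
printf '%s\n' apple banana citrus date elderberry > "$tmpdir/file2.txt"

# Default ("normal") format: change commands such as 3c3, followed by the
# affected lines prefixed with "< " (file1.txt) or "> " (file2.txt).
default_output=$(diff "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$default_output"
# 3c3
# < cherry
# ---
# > citrus
# 4a5
# > elderberry

# Keep only the content lines; drop change commands and the --- separator.
content_only=$(printf '%s\n' "$default_output" | grep '^[<>]')
printf '%s\n' "$content_only"
# < cherry
# > citrus
# > elderberry

rm -r "$tmpdir"
```

Filtering on the < and > markers is often enough when you only want to eyeball the differing rows and do not mind the two-character prefix.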

Displaying Only Different Rows

To display only the differing rows, we can combine the diff command with a few standard text filters.

Solution Using diff

Here’s how you can do this:

diff -u file1.txt file2.txt | tail -n +3 | grep '^+' | sed 's/^+//'

Breakdown of the Command

  1. diff -u: This option generates unified diff format. The output begins with two header lines (--- file1.txt and +++ file2.txt), followed by hunks in which lines only in file1.txt start with -, lines only in file2.txt start with +, and unchanged context lines start with a space.

  2. tail -n +3: This command skips the two header lines. Without it, the +++ header would pass the next filter and show up in the result.

  3. grep '^+': This command filters the remaining output, keeping only lines that start with a +, which indicates lines present in file2.txt but not in file1.txt.

  4. sed 's/^+//': This command removes the leading + from each line, leaving just the differing text.
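It can help to inspect the raw unified output before filtering it. The sketch below uses made-up sample files in a temporary directory (an assumption for illustration). Note that the --- and +++ header lines also begin with - and +, so a filter that looks only at the first character matches them as well.

```shell
#!/usr/bin/env bash
# Sample files for illustration only.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$tmpdir/file1.txt"
printf '%s\n' apple banana citrus date elderberry > "$tmpdir/file2.txt"

unified=$(diff -u "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$unified"
# Roughly (paths and timestamps in the first two lines will differ):
# --- <path>/file1.txt  <timestamp>
# +++ <path>/file2.txt  <timestamp>
# @@ -1,4 +1,5 @@
#  apple
#  banana
# -cherry
# +citrus
#  date
# +elderberry

# Count the lines that start with "+". The "+++" header is included in the
# count, alongside the two genuinely added lines.
plus_lines=$(printf '%s\n' "$unified" | grep -c '^+')
echo "$plus_lines"

rm -r "$tmpdir"
```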

Handling Lines Present Only in file1.txt

To see lines that are in file1.txt but missing from file2.txt, you can modify the command slightly:

diff -u file1.txt file2.txt | tail -n +3 | grep '^-' | sed 's/^-//'

Here, grep '^-' filters for lines that start with a -, indicating lines that are present in file1.txt but not in file2.txt. The tail -n +3 step skips the two header lines of the unified diff, so the --- header is not mistaken for a removed line.
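If GNU diff is available, the same result can be obtained without grep and sed. The following is a sketch that assumes GNU diffutils: the --old-line-format, --new-line-format, and --unchanged-line-format options are GNU extensions and may not exist in other diff implementations. The sample files are again made up for illustration.

```shell
#!/usr/bin/env bash
# Assumes GNU diff; the *-line-format options are GNU extensions.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$tmpdir/file1.txt"
printf '%s\n' apple banana citrus date elderberry > "$tmpdir/file2.txt"

# Lines only in file2.txt: print new lines verbatim (%L), suppress the rest.
only_new=$(diff --old-line-format='' --unchanged-line-format='' \
                --new-line-format='%L' "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$only_new"

# Lines only in file1.txt: print old lines verbatim, suppress the rest.
only_old=$(diff --new-line-format='' --unchanged-line-format='' \
                --old-line-format='%L' "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$only_old"

rm -r "$tmpdir"
```

Because diff itself emits only the wanted lines, there are no headers or prefixes to strip, at the cost of portability.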

Example Scenario

Let’s say we have the following content in two files:

file1.txt

apple
banana
cherry
date

file2.txt

apple
banana
citrus
date
elderberry

When we run the commands:

# To find lines in file2.txt not in file1.txt (tail -n +3 skips the two diff header lines)
diff -u file1.txt file2.txt | tail -n +3 | grep '^+' | sed 's/^+//'

Output will be:

citrus
elderberry

# To find lines in file1.txt not in file2.txt
diff -u file1.txt file2.txt | tail -n +3 | grep '^-' | sed 's/^-//'

Output will be:

cherry
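For repeated use, the pipelines can be wrapped in small shell functions. This is only a sketch: the function names only_in_second, only_in_first, and files_differ are hypothetical helpers, not part of diff. It also relies on diff's exit status, which is 0 when the files are identical, 1 when they differ, and 2 when an error occurs.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; these are not part of diff itself.

# Print lines that appear only in the second file.
only_in_second() {
    diff -u "$1" "$2" | tail -n +3 | grep '^+' | sed 's/^+//'
}

# Print lines that appear only in the first file.
only_in_first() {
    diff -u "$1" "$2" | tail -n +3 | grep '^-' | sed 's/^-//'
}

# Succeed only when diff reports a difference (exit status 1),
# not when the files are identical (0) or an error occurred (2).
files_differ() {
    diff "$1" "$2" > /dev/null
    [ "$?" -eq 1 ]
}

# Usage with sample files (made up for illustration).
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$tmpdir/file1.txt"
printf '%s\n' apple banana citrus date elderberry > "$tmpdir/file2.txt"

if files_differ "$tmpdir/file1.txt" "$tmpdir/file2.txt"; then
    echo "Only in file2.txt:"
    only_in_second "$tmpdir/file1.txt" "$tmpdir/file2.txt"
    echo "Only in file1.txt:"
    only_in_first "$tmpdir/file1.txt" "$tmpdir/file2.txt"
fi

rm -r "$tmpdir"
```

Keep the exit status in mind when a script runs with set -e: a diff that finds differences returns 1, which would otherwise stop the script.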

Conclusion

The diff command is an essential tool for comparing files on the command line. By combining diff with grep and sed, you can extract only the differing rows from two files, which makes changes much easier to spot in large text files.

By mastering these commands, you can improve your productivity while working with text files in Bash. Happy scripting!