Sponsored Content
Top Forums Shell Programming and Scripting find duplicate string in many different files Post 302713401 by xshang on Wednesday 10th of October 2012 03:19:43 PM
Old 10-10-2012
I use the following code when dealing with two files.

Code:
awk 'NR==FNR{A[$1]=$0; next} $1=A[$1]' file1 FS=, OFS='\t' file2

The file like this is what I want: (in case the strings shown here are duplicate between these files)

HTML Code:
string         file1       file2     file3    file4    file5   ...............
SVERTGTGYT	6	      4           5
SVGLADYNEQF	21	      3           7
SVGQGYEQY	32	      5           6
SVKTVLGYEQF	2	      4          9
SVNNEQF	       12	      4           6
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

how to find duplicate files with find ?

hello all I like to make search on files , and the result need to be the files that are duplicated? (8 Replies)
Discussion started by: umen
8 Replies

2. Shell Programming and Scripting

Find Duplicate files, not by name

I have a directory with images: -rw-r--r-- 1 root root 26216 Mar 19 21:00 020109.210001.jpg -rw-r--r-- 1 root root 21760 Mar 19 21:15 020109.211502.jpg -rw-r--r-- 1 root root 23144 Mar 19 21:30 020109.213002.jpg -rw-r--r-- 1 root root 31350 Mar 20 00:45 020109.004501.jpg -rw-r--r-- 1 root... (2 Replies)
Discussion started by: Ikon
2 Replies

3. Shell Programming and Scripting

Remove duplicate files based on text string?

Hi I have been struggling with a script for removing duplicate messages from a shared mailbox. I would like to search for duplicate messages based on the “Message-ID” string within the messages files. I have managed to find the duplicate “Message-ID” strings and (if I would like) delete... (1 Reply)
Discussion started by: spangberg
1 Replies

4. Shell Programming and Scripting

Find duplicate files

What utility do you recommend for simply finding all duplicate files among all files? (4 Replies)
Discussion started by: kiasas
4 Replies

5. Shell Programming and Scripting

Find duplicate files by file size

Hi! I want to find duplicate files (criteria: file size) in my download folder. I try it like this: find /Users/frodo/Downloads \! -type d -exec du {} \; | sort > /Users/frodo/Desktop/duplicates_1.txt; cut -f 1 /Users/frodo/Desktop/duplicates_1.txt | uniq -d | grep -hif -... (9 Replies)
Discussion started by: Dirk Einecke
9 Replies

6. Shell Programming and Scripting

Find duplicate files but with different extensions

Hi ! I wonder if anyone can help on this : I have a directory: /xyz that has the following files: chsLog.107.20130603.gz chsLog.115.20130603 chsLog.111.20130603.gz chsLog.107.20130603 chsLog.115.20130603.gz As you ca see there are two files that are the same but only with a minor... (10 Replies)
Discussion started by: fretagi
10 Replies

7. Shell Programming and Scripting

Find duplicate rows between files

Hi champs, I have one of the requirement, where I need to compare two files line by line and ignore duplicates. Note, I hav files in sorted order. I have tried using the comm command, but its not working for my scenario. Input file1 srv1..development..employee..empname,empid,empdesg... (1 Reply)
Discussion started by: Selva_2507
1 Replies

8. Shell Programming and Scripting

Find help in shell - that clears away duplicate files

I am so frustrated!!! I want a nice command that clears away duplicate files: find . -type f -regex '.*{1,3}\..*' | xargs -I## rm -v '##' should work in my opinion. But it finds nothing even though I have files that have the file name: Scooby-Doo-1.txt Himalaya-2.jpg Camping... (8 Replies)
Discussion started by: Mr.Glaurung
8 Replies

9. Shell Programming and Scripting

To Find Duplicate files using latest in Linux

I have tried the following code and with that i couldnt achieve what i want. #!/usr/bin/bash find ./ -type f \( -iname "*.xml" \) | sort -n > fileList sed -i '/\.\/fileList/d' fileList NAMEOFTHISFILE=$(echo $0|sed -e 's/\/()$*.^|/\\&/g') sed -i "/$NAMEOFTHISFILE/d"... (2 Replies)
Discussion started by: gold2k8
2 Replies

10. UNIX for Advanced & Expert Users

AIX find duplicate backup files

I would like find and delete old backup files in aix. How would I go about doing this? For example: server1_1-20-2020 server1_1-21-2020 server1_1-22-2020 server1_1-23-2020 server2_1-20-2020 server2_1-21-2020 server2_1-22-2020 server2_1-23-2020 How would I go about finding and... (3 Replies)
Discussion started by: cokedude
3 Replies
MERGE(1)						      General Commands Manual							  MERGE(1)

NAME
merge - three-way file merge SYNOPSIS
merge [ options ] file1 file2 file3 DESCRIPTION
merge incorporates all changes that lead from file2 to file3 into file1. The result ordinarily goes into file1. merge is useful for com- bining separate changes to an original. Suppose file2 is the original, and both file1 and file3 are modifications of file2. Then merge combines both changes. A conflict occurs if both file1 and file3 have changes in a common segment of lines. If a conflict is found, merge normally outputs a warning and brackets the conflict with <<<<<<< and >>>>>>> lines. A typical conflict will look like this: <<<<<<< file A lines in file A ======= lines in file B >>>>>>> file B If there are conflicts, the user should edit the result and delete one of the alternatives. OPTIONS
-A Output conflicts using the -A style of diff3(1), if supported by diff3. This merges all changes leading from file2 to file3 into file1, and generates the most verbose output. -E, -e These options specify conflict styles that generate less information than -A. See diff3(1) for details. The default is -E. With -e, merge does not warn about conflicts. -L label This option may be given up to three times, and specifies labels to be used in place of the corresponding file names in conflict reports. That is, merge -L x -L y -L z a b c generates output that looks like it came from files x, y and z instead of from files a, b and c. -p Send results to standard output instead of overwriting file1. -q Quiet; do not warn about conflicts. -V Print 's version number. DIAGNOSTICS
Exit status is 0 for no conflicts, 1 for some conflicts, 2 for trouble. IDENTIFICATION
Author: Walter F. Tichy. Manual Page Revision: 5.7; Release Date: 1995/06/01. Copyright (C) 1982, 1988, 1989 Walter F. Tichy. Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995 Paul Eggert. SEE ALSO
diff3(1), diff(1), rcsmerge(1), co(1). BUGS
It normally does not make sense to merge binary files as if they were text, but merge tries to do it anyway. GNU
1995/06/01 MERGE(1)
All times are GMT -4. The time now is 05:23 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy