If string matches within 2 files, delete one file.


 
# 1  
08-31-2009

I have a directory with a large number of files. I want to compare a string in each file with the string in the next n file(s). If a string in one file matches a string in any of the next n file(s), the later duplicate file(s) should be deleted. Here is sample input:
Code:
blake [ ~/scratch ]$ ls -l ???.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:23 aaa.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:23 bbb.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:23 ccc.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:23 ddd.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:24 eee.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:24 fff.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:24 ggg.txt

blake [ ~/scratch ]$ cat ???.txt
aabbcc
aabbcc
aabbcc
abcabc
abcabc
abc123
aabbcc

The desired output is as follows (assuming that I set n to look at 7 files or more):

Code:
blake [ ~/scratch ]$ ls -l ???.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:23 aaa.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:23 ddd.txt
-rw-r--r--  1 blake  staff  7 Aug 31 16:24 fff.txt

Many thanks.
# 2  
08-31-2009
What constitutes "subsequent"? Is it the next file in "ls -l" order? Is a "subsequent" list terminated by a file that does not contain the match characters, or can it extend to any file containing the match characters?

Use of the word "subsequent" implies a sequence.

Last edited by methyl; 08-31-2009 at 07:18 PM. Reason: grammar, punctuation, usage and spelling
# 3  
09-01-2009
Sitney,

Here is something to try...
Assumptions are:
- The first line in each file contains the comparison string.
- Only one instance of a specific string is allowed, regardless of the number of files.

Test files are:
Code:
# ls -l ???.txt
-rw-r--r-- 1 root root 7 2009-08-31 22:54 aaa.txt
-rw-r--r-- 1 root root 7 2009-08-31 22:54 bbb.txt
-rw-r--r-- 1 root root 7 2009-08-31 22:54 ccc.txt
-rw-r--r-- 1 root root 7 2009-08-31 22:55 ddd.txt
-rw-r--r-- 1 root root 7 2009-08-31 22:55 eee.txt
-rw-r--r-- 1 root root 7 2009-08-31 22:55 fff.txt
-rw-r--r-- 1 root root 7 2009-08-31 22:56 ggg.txt

Contents of files:
Code:
# cat ???.txt
aabbcc
aabbcc
aabbcc
abcabc
abcabc
abc123
aabbcc
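
To reproduce this test setup, the seven files can be recreated roughly as follows. This is only a sketch: it works in a new scratch directory from mktemp (my choice, not part of the original setup) so that no existing ???.txt files are overwritten.

```shell
# Recreate the seven sample files in a fresh scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Each input line is "<basename> <comparison string>".
while read -r name text
do
   printf '%s\n' "$text" > "$name.txt"
done <<'EOF'
aaa aabbcc
bbb aabbcc
ccc aabbcc
ddd abcabc
eee abcabc
fff abc123
ggg aabbcc
EOF

# Show the contents, one line per file, in glob order.
cat ???.txt
```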

Script to run:
Code:
for i in ???.txt
do
   # The first line of each file is the comparison string
   c=$(head -n 1 "$i")
   echo "$c|$i"
done | perl -ne 'chomp; ($st, $fn) = split(/\|/, $_, 2); if (!defined $s{$st}) { $s{$st} = $fn; print "$fn\n"; }' | xargs ls -l

Description:
- For each file, echo the comparison string (the first line of the file), followed by a pipe symbol, followed by the filename.
- At the end of the for loop, pass this output to the perl script on standard input.
- The perl script splits each line on the pipe symbol and checks whether the string is already defined as a key in the hash. If not, it stores the filename in the hash with the string as the key, then prints the filename.
- The printed filenames are sent as standard input to xargs, which passes them to the "ls -l" command.

Note that this only lists the files that would be kept. It does not delete anything.

Output is:
Code:
-rw-r--r-- 1 root root 7 2009-08-31 22:54 aaa.txt
-rw-r--r-- 1 root root 7 2009-08-31 22:55 ddd.txt
-rw-r--r-- 1 root root 7 2009-08-31 22:55 fff.txt
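
The script above lists the files to keep. To go the other way and list the duplicates, so that they can be reviewed and then removed, something like the following might work. It is only a sketch: list_dupes is a name I made up, and the handling of n is my own reading of the question (a file counts as a duplicate if a kept file with the same first line sits at most n positions before it in the file list). It also assumes filenames contain no "|" or newline characters.

```shell
# list_dupes: hypothetical helper, not a standard command.
# Usage: list_dupes N file...
# Prints every file whose first line equals the first line of a kept file
# at most N positions earlier in the argument list. N=0 means "no limit".
list_dupes() {
   n=$1
   shift
   for f in "$@"
   do
      [ -f "$f" ] || continue
      # Emit "filename|first line". The filename goes first so that a "|"
      # inside the comparison string cannot confuse the split in awk.
      printf '%s|%s\n' "$f" "$(head -n 1 < "$f")"
   done | awk -v n="$n" '{
      i = index($0, "|")
      fn = substr($0, 1, i - 1)
      key = substr($0, i + 1)
      if ((key in kept) && (n == 0 || NR - kept[key] <= n))
         print fn            # duplicate of a kept file inside the window
      else
         kept[key] = NR      # first sighting, or outside the window: keep it
   }'
}

# Review the list first:
#   list_dupes 7 ???.txt
# Then delete once the list looks right:
#   list_dupes 7 ???.txt | while IFS= read -r f; do rm -- "$f"; done
```

With the test files above and n set to 7, this should list bbb.txt, ccc.txt, eee.txt and ggg.txt, leaving aaa.txt, ddd.txt and fff.txt as in the desired output. Printing the list before deleting is deliberate, since rm cannot be undone.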
