Sponsored Content
Top Forums Shell Programming and Scripting Complex comparison of 3 files Post 302850549 by MR.bean on Thursday 5th of September 2013 03:02:20 AM
Old 09-05-2013
Another one

Code:
bash-3.2$ cat f1
547680
575210
804270
123989
623989
221209
bash-3.2$ cat f2
1|4501892|547680|1|2|30|73491|12|34|1
2|4788930|575210|1|2|30|73472|12|34|1
3|6793773|804270|1|2|30|73420
4|6673724|123989|1|2|30|73001|12|34|1
5|8099821|333722|1|30|73473|10|34|1
6|7889200|623989|1|2|30|73001|12|45|1
7|8882662|221209|1|2|30|83002|12|34|1
bash-3.2$ 
bash-3.2$ cat f3
1|2|30|73472|12|34|1
1|2|30|73001|12|34|1
1|2|30|83002|12|34|1
bash-3.2$ 
bash-3.2$ awk 'NR==FNR {  x[$0]++; next; } { m=0; for(i in x) { regexp=gensub(/\|/, "\\\\\\|", "g", i); if(match($0, regexp)) { m++; }} printf "%s", $0; print m ? ",C" : ",D" } ' f3 <(join -t'|' -2 3 f1 f2 | cut -d'|' -f 1,4-)
547680|1|2|30|73491|12|34|1,D
575210|1|2|30|73472|12|34|1,C
804270|1|2|30|73420,D
123989|1|2|30|73001|12|34|1,C
623989|1|2|30|73001|12|45|1,D
221209|1|2|30|83002|12|34|1,C


Last edited by MR.bean; 09-05-2013 at 04:11 AM..
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Help with complex merg of files with common field

Please help, I am new to shell Programming. I have three files each containg a unique text (key) field (e.g. ABCDEF, XCDUD as shown below), line return followed by some data of which there can be more then one instance. In addition, in some cases there may be no data but only a key field. Please... (18 Replies)
Discussion started by: gugs
18 Replies

2. Shell Programming and Scripting

comparison of 2 files

Kindly help on follows. I have 2 files. One file contains only one column of mobile numbers. And total records in a file 12 million. Second file contains 2 columns mobile numbers and balance. and total records 30 million. I want to find out balance of each data in file 1 corresponding to file 2.... (2 Replies)
Discussion started by: kamal_418
2 Replies

3. Shell Programming and Scripting

sed: How to modify files in a complex way

Hello, I am new to sed and hope that someone can help me with the following task. I need to modify a txt file which has format like this: xy=CreateDB|head.queue|head.source|head.definition|rtf.edit|rtf.task|rft.cut abc|source|divine|line4|5|true into something like: head.queue=abc... (19 Replies)
Discussion started by: pinkypunky
19 Replies

4. Shell Programming and Scripting

Joining files in a complex way

if input1 1st row labels (S1or S2 or S3 or any (actually so many in original text file)) are similar to 1st column of input2 i.e "ID" merge them together based on input1 1st row labels. for example take S1..... input1 "aphab" "S1" "S2" "S3" "a" "A/A" "A/A" "A/A" "b" ... (19 Replies)
Discussion started by: stateperl
19 Replies

5. Shell Programming and Scripting

Comparison of two files (sh)

Hi, I have a problem with comparison of two files file1 20100101 20090101 20080101 20071001 20121229 file2 19990112 12 456 7 20011131 19 20100101 2 567 1 987 17890709 123 555 and, sh script needs to compare of these two files and give out to me result: 20100101 2 567 1 987 it... (5 Replies)
Discussion started by: shizik
5 Replies

6. Shell Programming and Scripting

mass renaming files with complex filenames

Hi, I've got files with names like this : _Some_Name_178_HD_.mp4 _Some_Name_-_496_Vost_SD_(720x400_XviD_MP3).avi Goffytofansub_Some name 483_HD.avi And iam trying to rename it with a regular pattern. My gola is this : Ep 178.mp4 Ep 496.avi Ep 483.avi I've tried using sed with... (8 Replies)
Discussion started by: VLaw
8 Replies

7. Shell Programming and Scripting

Complex renaming then moving files

I am a biologist and using an program on a computer cluster that generates a lot of data. The program creates a directory named ExperimentX (where X is a number) that contains files "out.pdb" and "log.txt". I would like to create a script that renames the out.pdb file to out_ExperimentX.pdb (or... (1 Reply)
Discussion started by: yaledocker
1 Replies

8. Shell Programming and Scripting

Complex data sorting in excel files or text files

Dear all, I have a complex data file shown below,,,,, A_ABCD_13208 0 0 4.16735 141044 902449 1293900 168919 C_ABCD_13208 0 0 4.16735 141044 902449 1293900 168919 A_ABCDEF715 52410.9 18598.2 10611 10754.7 122535 252426 36631.4 C_DBCDI_1353 0... (19 Replies)
Discussion started by: AAWT
19 Replies

9. Shell Programming and Scripting

Comparison of two files

Hi all I have two files which I have to compare that whetehr there is soemthing common or not body, div, table, thead, tbody, tfoot, tr, th, td, p { font-family: "Liberation Sans"; font-size: x-small; } body, div, table, thead, tbody, tfoot,... (2 Replies)
Discussion started by: manigrover
2 Replies

10. Shell Programming and Scripting

Comparison of files

I have the requirement I have two files cat fileA something anythg nothing everythg cat fileB everythg anythg Now i shld use fileB and compare every line at fileA and get the output as something nothing (3 Replies)
Discussion started by: Priya Amaresh
3 Replies
DIFF(1) 						      General Commands Manual							   DIFF(1)

NAME
diff - differential file comparator SYNOPSIS
diff [ -efbh ] file1 file2 DESCRIPTION
Diff tells what lines must be changed in two files to bring them into agreement. If file1 (file2) is `-', the standard input is used. If file1 (file2) is a directory, then a file in that directory whose file-name is the same as the file-name of file2 (file1) is used. The normal output contains lines of these forms: n1 a n3,n4 n1,n2 d n3 n1,n2 c n3,n4 These lines resemble ed commands to convert file1 into file2. The numbers after the letters pertain to file2. In fact, by exchanging `a' for `d' and reading backward one may ascertain equally how to convert file2 into file1. As in ed, identical pairs where n1 = n2 or n3 = n4 are abbreviated as a single number. Following each of these lines come all the lines that are affected in the first file flagged by `<', then all the lines that are affected in the second file flagged by `>'. The -b option causes trailing blanks (spaces and tabs) to be ignored and other strings of blanks to compare equal. The -e option produces a script of a, c and d commands for the editor ed, which will recreate file2 from file1. The -f option produces a similar script, not useful with ed, in the opposite order. In connection with -e, the following shell program may help maintain multiple versions of a file. Only an ancestral file ($1) and a chain of version-to-version ed scripts ($2,$3,...) made by diff need be on hand. A `latest version' appears on the standard output. (shift; cat $*; echo '1,$p') | ed - $1 Except in rare circumstances, diff finds a smallest sufficient set of file differences. Option -h does a fast, half-hearted job. It works only when changed stretches are short and well separated, but does work on files of unlimited length. Options -e and -f are unavailable with -h. FILES
/tmp/d????? /usr/lib/diffh for -h SEE ALSO
cmp(1), comm(1), ed(1) DIAGNOSTICS
Exit status is 0 for no differences, 1 for some, 2 for trouble. BUGS
Editing scripts produced under the -e or -f option are naive about creating lines consisting of a single `.'. DIFF(1)
All times are GMT -4. The time now is 08:55 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy