diff 2 files > file3, but records in various order


 
# 1  
Old 04-14-2010
diff 2 files > file3, but records in various order

What I really need is a script that compares two (.csv) text files line by line, each with a single entry per line, and then outputs the non-duplicate lines to a third (.csv) text file. The problem is that the lines may be exactly the same but in a different order in the two text files, so:

sourcefile1 contains
bob
jane
sally

sourcefile2 contains
sally
bob

output to > file3 containing
jane

I've tried using:
Code:
grep -vxf sourcefile1 sourcefile2 > file3

but I get no output. Is that because the lines are in a different order? If I do:
Code:
comm -13 sourcefile1 sourcefile2 > file3

I get the error "file 1 is not in sorted order", but sorting the files doesn't seem to help. I was thinking about writing a loop along these lines:
Code:
cat sourcefile1 | while read LINE
do
      cat sourcefile2 | while read LINE2
      do
             if [ "$LINE2" = "$LINE" ]
             then
                   exit
             else
                   echo $LINE > file3
             fi
      done
done

but I don't know if my logic is right (I'm sure my syntax is wrong), and it seems like an inefficient way to do it. Are there better, more elegant ways to do this?
# 2  
Old 04-14-2010
Swap the order of the files:
Code:
grep -vxf sourcefile2 sourcefile1 > file3
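
Why the swap matters: with -v, -x and -f, grep prints the lines of the last file that match none of the whole-line patterns read from the -f file, so line order plays no role, only which file is which. A minimal sketch using the sample data from post #1 (the scratch directory and the only_in_2 file name are just assumptions for the demo):

```shell
#!/bin/sh
# Recreate the sample files from post #1 in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' bob jane sally > sourcefile1
printf '%s\n' sally bob > sourcefile2

# Original order: lines of sourcefile2 that are not in sourcefile1.
# Both "sally" and "bob" exist in sourcefile1, so this file ends up empty.
grep -vxf sourcefile1 sourcefile2 > only_in_2

# Swapped order: lines of sourcefile1 that are not in sourcefile2.
grep -vxf sourcefile2 sourcefile1 > file3

cat file3   # prints: jane
```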

# 3  
Old 04-14-2010
I think the best approach would be this:

Code:
grep -vxf sourcefile2 sourcefile1 > file3
grep -vxf sourcefile1 sourcefile2 >> file3

We need to do it both ways since there can be unique entries in each file.
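
A quick sketch of the two-way comparison, with a hypothetical extra name "tom" added to sourcefile2 so that each file has one unique entry. The last command is an alternative rather than part of the suggestion above: it assumes no line is repeated inside a single file, in which case any line that occurs exactly once in the sorted, combined input must be unique to one file.

```shell
#!/bin/sh
# Sample data in a scratch directory; "tom" is made up for the demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' bob jane sally > sourcefile1
printf '%s\n' sally bob tom > sourcefile2

# Both directions with grep, appending the second result.
grep -vxf sourcefile2 sourcefile1 > file3
grep -vxf sourcefile1 sourcefile2 >> file3
cat file3

# Alternative, assuming no duplicates within either file:
# keep only the lines that occur once in the combined, sorted input.
sort sourcefile1 sourcefile2 | uniq -u > file3.alt
cat file3.alt
```

Both file3 and file3.alt should contain jane and tom here. The grep version keeps each file's original order, while the sort version outputs the lines sorted.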
# 4  
Old 04-14-2010
If a name might contain a regular expression metacharacter (such as a dot following an initial), then to match an entire line literally with grep you'll want to use -xF along with whatever other options the logic of the solution demands.
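
To illustrate with made-up names (not from the original files): as a regular expression, the pattern "J. Smith" also matches the line "JR Smith", because the dot stands for any single character. A small sketch:

```shell
#!/bin/sh
# Hypothetical sample data, chosen so that a dot in one name matters.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'J. Smith' 'JR Smith' > sourcefile1
printf '%s\n' 'J. Smith' > sourcefile2

# Without -F the pattern "J. Smith" is a regex and also matches "JR Smith",
# so "JR Smith" is wrongly treated as present in sourcefile2.
grep -vxf sourcefile2 sourcefile1 > regex_result

# With -F the patterns are fixed strings, so only the identical line is excluded.
grep -vxFf sourcefile2 sourcefile1 > literal_result

cat literal_result   # prints: JR Smith
```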

Regards,
Alister
# 5  
Old 04-14-2010
Code:
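#!/bin/ksh
# Print each line of file $1 that does not appear as a whole line in file $2.
while IFS= read -r name
do
  # -F: treat the name as a fixed string, -x: match the entire line only
  grep -Fx -- "$name" "$2" >/dev/null
  if [ "$?" != "0" ]
  then
     printf '%s\n' "$name"
  fi
done <"$1"

Usage: ./script source1 source2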

Last edited by jgt; 04-14-2010 at 06:49 PM.. Reason: Don't need to sort files
# 6  
Old 04-14-2010
What about lines in source2 that are not in source1?
# 7  
Old 04-14-2010
Quote:
Originally Posted by soleil4716
What about lines in source2 that are not in source1?
Run the script above as ./script source2 source1
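
If you want both directions in a single file3, one option is to wrap the loop in a function and call it twice. This is only a sketch: not_in is a hypothetical helper name, the extra name "tom" is made up for the demo, and it uses grep -Fx so that only exact whole-line matches count.

```shell
#!/bin/sh
# not_in (hypothetical helper): print the lines of file $1
# that have no exact whole-line match in file $2.
not_in() {
    while IFS= read -r name
    do
        grep -Fxq -- "$name" "$2" || printf '%s\n' "$name"
    done < "$1"
}

# Sample data in a scratch directory; "tom" is made up for the demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' bob jane sally > source1
printf '%s\n' sally bob tom > source2

# Run the comparison both ways and collect everything in file3.
{
    not_in source1 source2
    not_in source2 source1
} > file3

cat file3
```

With this data file3 should hold jane and tom. The loop starts one grep per input line, so for large files the plain grep -vxFf commands from the earlier replies are likely to be much faster.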