Script to find NOT common strings in two files


 
# 15  
Old 03-22-2011
Quote:
Originally Posted by DGPickett
Code:
comm -13 <(sort -u file1) <(sort -u file2)

Of course, the <(...) only works on Solaris

It works on any system with bash installed. It is a feature of bash, not the system.
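
For anyone trying this out, here is a minimal sketch of the quoted command run on two small sample files. The file names and contents are made up for illustration, and it assumes bash (or another shell that supports process substitution).

```bash
#!/usr/bin/env bash
# Sample data (hypothetical): every line of file1 also appears in file2.
workdir=$(mktemp -d)
printf '%s\n' esdp iprice irm > "$workdir/file1"
printf '%s\n' irm esdp ipvpn iprice errormsgadmin > "$workdir/file2"

# comm needs sorted input. -1 drops lines found only in file1 and
# -3 drops lines found in both, so only the lines unique to file2 remain.
only_in_file2=$(comm -13 <(sort -u "$workdir/file1") <(sort -u "$workdir/file2"))
printf '%s\n' "$only_in_file2"

rm -rf "$workdir"
```

With this sample data the two lines missing from file1 (errormsgadmin and ipvpn) are printed.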
# 16  
Old 03-24-2011
I wonder how bash does that! For ksh users, it comes and goes, and if it was easy to have all the time, I'd think David K would go that way. Using truss/tusc/strace, I see bash is managing named pipes for this (and too many /, no pipe cleanup? -- I just emailed the bash devs and DGK.):

Code:
$ truss -faelpo /tmp/bash.tr bash -c 'comm -13 <(sort -u .profile) <(sort -u .profile)'
$ view /tmp/bash.tr
 .
 .
 .
[12884]{17507} lstat64("/var/tmp//sh-np-1300956853", 0x7b0f6418) ERR#2 ENOENT
[12884]{17507} mknod("/var/tmp//sh-np-1300956853", S_IFIFO|0600, 0 0x000000) = 0
 .
 .
 .
[12884]{17507} execve(0x4003efc8, 0x4003ec68, 0x4003e008)  [entry]
                              argv[0] @ 0x4003dda8: "comm"
                              argv[1] @ 0x400212d8: "-13"
                              argv[2] @ 0x4003ef88: "/var/tmp//sh-np-1300956853"
                              argv[3] @ 0x4003edc8: "/var/tmp//sh-np-3600645176"
 .
 .
 .
[12884]{17507} open("/var/tmp//sh-np-1300956853", O_RDONLY|O_LARGEFILE, 0666) = 7
[12885]{17510} open("/var/tmp//sh-np-1300956853", O_WRONLY|O_LARGEFILE, 0166600) = 6
 .
 .
 .
[12885]{17510} dup2(6, 1) ................................ = 1
 
and no cleanup:
 
$ ls -l /var/tmp/sh-np-*
prw-------   1 nbkodln    develop          0 Mar 24 09:52 /var/tmp/sh-np-1300956853
prw-------   1 nbkodln    develop          0 Mar 24 09:47 /var/tmp/sh-np-1300959986
prw-------   1 nbkodln    develop          0 Mar 24 09:46 /var/tmp/sh-np-1300964951
prw-------   1 nbkodln    develop          0 Mar 24 09:45 /var/tmp/sh-np-1300966577
prw-------   1 nbkodln    develop          0 Mar 24 09:45 /var/tmp/sh-np-1300973486
prw-------   1 nbkodln    develop          0 Mar 24 09:48 /var/tmp/sh-np-1300985557
prw-------   1 nbkodln    develop          0 Mar 24 09:45 /var/tmp/sh-np-3600617851
prw-------   1 nbkodln    develop          0 Mar 24 09:46 /var/tmp/sh-np-3600620375
prw-------   1 nbkodln    develop          0 Mar 24 09:47 /var/tmp/sh-np-3600639559
prw-------   1 nbkodln    develop          0 Mar 24 09:52 /var/tmp/sh-np-3600645176
prw-------   1 nbkodln    develop          0 Mar 24 09:45 /var/tmp/sh-np-3600657288
prw-------   1 nbkodln    develop          0 Mar 24 09:48 /var/tmp/sh-np-3600666071
$

I prefer <() to >(), as the latter spawns a background job, with job ID display and such. I am a big fan of pipeline parallelism: low latency through pipes, and no scripted temp files with their name collisions and cleanup.

The named pipe sits in the middle: nicer than temp files, but it demands pre-creation and, hopefully, cleanup. Also, a named pipe can persist with a leftover process waiting in vain for a partner. Named pipes are more appropriate in a service paradigm.
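
As a rough illustration of the pre-creation and cleanup a hand-rolled named pipe needs, here is a minimal sketch. The scratch directory and FIFO name are hypothetical, and the trap only covers shell exit, so treat it as a starting point rather than a complete recipe.

```bash
#!/usr/bin/env bash
# Hypothetical scratch directory and FIFO name, for illustration only.
workdir=$(mktemp -d)
fifo="$workdir/sorted.fifo"

# Cleanup: remove the FIFO and its directory when the shell exits.
trap 'rm -rf "$workdir"' EXIT

# Pre-creation: the FIFO must exist before either side opens it.
mkfifo "$fifo"

# Writer runs in the background; its open() blocks until a reader shows up.
printf '%s\n' pear apple pear fig | sort -u > "$fifo" &

# Reader: any command that insists on a file name can be given the FIFO path.
result=$(cat "$fifo")
wait    # reap the background writer
printf '%s\n' "$result"
```

If the reader never starts, the background writer stays blocked on the open, which is the leftover-process problem described above.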

Last edited by DGPickett; 03-24-2011 at 11:53 AM..
# 17  
Old 03-24-2011
Quote:
Originally Posted by DGPickett
I wonder how bash does that! For ksh users, it comes and goes, and if it was easy to have all the time, I'd think David K would go that way.

Sorry, my mistake:

"Process substitution is supported on systems that support named pipes (FIFOs) or the /dev/fd method of naming open files."
# 18  
Old 03-24-2011
Quote:
Originally Posted by DGPickett
(and too many /, no pipe cleanup? -- I just emailed the bash devs and DGK.):

I can confirm, my /tmp directory here has dozens of sh-np-* files (I'm on version 3.2.16(1)).

edit:

Yep, found it. You can tell the bash devs in your email if you like:
There is a whole heap of calls to unlink_fifo_list(); in execute_cmd.c where the test defined(HAVE_DEV_FD) needs to be changed to !defined(HAVE_DEV_FD).
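
Until that is fixed, the leftovers can at least be found by hand. This is a minimal sketch, assuming the sh-np-* naming seen in the listings in this thread; the function name list_stale_fifos is made up, and other bash builds may put their pipes elsewhere.

```bash
#!/usr/bin/env bash
# Print leftover process-substitution FIFOs in one directory.
# list_stale_fifos is a hypothetical helper; the sh-np-* pattern is taken
# from the listings earlier in this thread.
list_stale_fifos() {
    local dir=$1 f
    for f in "$dir"/sh-np-*; do
        # -p: is a named pipe; -O: owned by the current user
        if [ -p "$f" ] && [ -O "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

list_stale_fifos /var/tmp
list_stale_fifos /tmp
```

Once the list looks right and nothing still has the pipes open, the printf could be swapped for rm -f "$f".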

Last edited by Chubler_XL; 03-24-2011 at 06:34 PM..
# 19  
Old 03-25-2011
I referred them here -- why copy when you can point, eh?

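Really, they also need to kill whatever still has the pipe open, like:
Code:
# pipe_path stands for the leftover FIFO; fuser prints PIDs on stdout and its other output on stderr
kill -9 $(fuser pipe_path 2>/dev/null) 2>/dev/null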

That might be more cleanup than the /dev/fd/# versions do, but cleanup is good.

Still, it'd be nice if every UNIX had file descriptors in the file tree, as I also get lots of mileage out of /dev/stdin, /dev/stderr and /dev/stdout, to get things back onto pipes or into one log with commands that have their heads in the file-no-pipe sand. I guess you can use <(cat) or >(cat), but that is a waste and a delay.
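
To show what that looks like in practice, here is a minimal sketch that feeds one side of comm from a pipe through /dev/stdin. The sample file and its contents are hypothetical, and it assumes the system provides /dev/stdin.

```bash
#!/usr/bin/env bash
# Hypothetical sample file, already sorted; the other input arrives on a pipe.
workdir=$(mktemp -d)
printf '%s\n' errormsgadmin esdp iprice ipvpn irm > "$workdir/file2"

# comm wants two file names; /dev/stdin lets the first one be the pipe,
# with no extra process and no temp file.
only_in_file2=$(printf '%s\n' irm esdp iprice | sort -u |
    comm -13 /dev/stdin "$workdir/file2")
printf '%s\n' "$only_in_file2"

rm -rf "$workdir"
```

The same trick applies to any command that accepts only a file name for input, as long as it does not need to seek.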