12-22-2008
Here are the timings from my small test comparing the grep method with my method:
shell> wc -l file*
28684 file1
4993 file2
shell> time grep -f file1 file2 |wc -l
4953
real 5m34.005s
user 5m26.267s
sys 0m0.785s
shell> time sort file1 file2 | uniq -d | wc -l
4953
real 0m0.224s
user 0m0.218s
sys 0m0.006s
shell> time comm -12 file1 file2 | wc -l
4953
real 0m0.005s
user 0m0.004s
sys 0m0.001s
All three give the same output.
Method 1: comm is the most efficient, but it only works if the input files are already sorted. If they are not, you need to create temporary sorted copies of both file1 and file2. If you create temporary sorted files file1.sort and file2.sort, it takes about the same time as the sort & uniq method.
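For unsorted input, the temporary-file approach described above could be sketched as follows. This is only a sketch: the function name common_lines is made up for illustration, and it assumes mktemp is available on your system.

```shell
# Hypothetical helper: print the lines common to two unsorted files.
common_lines() {
    # comm expects sorted input, so sort each file into a temporary copy.
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 1; }
    sort "$1" > "$tmp1"
    sort "$2" > "$tmp2"
    # -1 and -2 suppress the lines unique to each file, leaving only the common ones.
    comm -12 "$tmp1" "$tmp2"
    # Remove the temporary sorted copies.
    rm -f "$tmp1" "$tmp2"
}
```

Usage would be something like common_lines file1 file2 | wc -l. Note that, unlike sort file1 file2 | uniq -d, this does not report a line that is merely duplicated inside one of the files.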
Method 2: the grep method is not efficient; it takes 5 min 34 sec. Use this method if file1 contains two or more fields and file2 contains one field: grep -f file2 file1
I think grep processes it roughly like this: it reads one line from file2 and greps for that line in file1:
# Read each line of file2 and search for it in file1
while IFS= read -r line
do
    grep -- "$line" file1
done < file2
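If the lines in file2 are plain strings rather than regular expressions, one thing worth trying is grep with -F (fixed strings) and -x (whole-line match), which is usually much faster than matching thousands of regex patterns. I have not timed this on the files above, so treat it as a sketch; the function name fixed_common is made up for illustration, and whole-line matching is an assumption about what is wanted.

```shell
# Hypothetical helper: print lines of the first file that exactly equal
# some line of the second file.
fixed_common() {
    # -F: treat each pattern as a literal string, not a regular expression
    # -x: a pattern must match the whole line
    # -f: read the patterns from the second file, one per line
    grep -F -x -f "$2" "$1"
}
```

Usage would be something like fixed_common file1 file2 | wc -l. Drop -x if file1 has several fields and the string from file2 may appear anywhere in the line.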
Last edited by reddyrajal; 12-23-2008 at 02:44 AM..