09-14-2009
one change to it pls....

This also works fine.
Sorry, can I ask for one small change in how the output is written?
Data missing from the first file should go into one file, and data missing from the second file should go into another.
09-14-2009
swame_sp
Should be something like this:

awk -F"|" 'NR==FNR{a[$1];next}
$1 in a{a[$1]++;next}
{print > "NotInFirstfile"}
END{for(i in a){if(!a[i]){print i}}}
' Firstfile Secondfile > NotInSecondfile

09-15-2009

I would really appreciate it if you could explain how it works.
I'm trying to learn....

I have not specified any file name inside the script, but it still works fine.... how is that possible?
How does it identify the source files when creating the two new files above?


09-15-2009
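awk -F"|"

Set the field separator to "|".

NR==FNR{a[$1];next}

While we read the first file (NR equals FNR only there), define an element in array a with the first field as index, then read the next line.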
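To see why NR==FNR is only true for the first file, a small sketch may help. The two sample files and their contents below are made up for illustration; NR counts lines across all input files, while FNR restarts at 1 for each file.

```shell
# Hypothetical sample files, created only for this demonstration.
printf 'a|1\nb|2\n' > Firstfile
printf 'b|9\nc|3\n' > Secondfile

# Print the file name, NR, FNR and the first field of every line.
# NR and FNR are equal only while the first file is being read.
awk -F"|" '{print FILENAME, NR, FNR, $1}' Firstfile Secondfile
```

With these files, NR keeps counting up when awk moves on to Secondfile, while FNR starts again at 1, so the NR==FNR block no longer runs.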

The next lines are for processing the second file:

$1 in a{a[$1]++;next}

If the first field exists as an index in array a, increment that element by 1 (the line is present in both files) and read the next line.

{print > "NotInFirstfile"}

If the first field is NOT an index in array a, print the whole line to the file "NotInFirstfile".

END{for(i in a){if(!a[i]){print i}}}

Finally, we print the indexes of array a whose value is still empty (never incremented while we read the second file). Note that this prints only the first field, not the whole line.

' Firstfile Secondfile > NotInSecondfile

Firstfile and Secondfile are the input files; everything printed in the END section is redirected to the file NotInSecondfile.
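The walkthrough above can be tried on a small test case. The sample data here is made up for illustration; only the awk command itself comes from the thread.

```shell
# Hypothetical sample data: key 1 exists only in Firstfile,
# key 4 exists only in Secondfile.
printf '1|alpha\n2|beta\n3|gamma\n' > Firstfile
printf '2|beta\n3|gamma\n4|delta\n' > Secondfile

awk -F"|" 'NR==FNR{a[$1];next}
$1 in a{a[$1]++;next}
{print > "NotInFirstfile"}
END{for(i in a){if(!a[i]){print i}}}
' Firstfile Secondfile > NotInSecondfile

# Lines of Secondfile whose key is not in Firstfile (whole line).
cat NotInFirstfile
# Keys of Firstfile that never showed up in Secondfile (first field only).
cat NotInSecondfile
```

With this data, NotInFirstfile should contain the line 4|delta and NotInSecondfile should contain the key 1. If several keys are missing from the second file, the order in which the END loop prints them is not guaranteed, so pipe the result through sort if the order matters.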

09-15-2009
Franklin52
But the use of 4 external programs is not the most efficient way, try this:

awk is the least efficient program to use. If you look at the awk binary, it is half as big as ksh. That means you are loading it on top of the shell you are already using. On top of that, it makes code hard to debug and read, and it prevents programmers from gaining in-depth experience with UNIX. It was developed at a time when only sh was available and sh could not process and format character strings. This need vanished with the advent of ksh and bash. Awk right now is a crutch for people who never really learned UNIX commands.

This is the byte count of the binaries:

# cd /usr/bin
# wc -c ksh
  171412 ksh
# wc -c awk
   80184 awk
# wc -c sort
    5816 sort
# wc -c uniq
   10036 uniq
# wc -c cut
    9928 cut

Which of these takes more computer resources?
09-15-2009
I am an awk *and* ksh user and I tend to pick the right tool for the job. And for the question raised in the OP, awk *is* the right tool. Try to achieve the same result with ksh, or any other shell, with just one line of code. Oh, and awk will be _much_ faster too.
09-15-2009
Except that the syntax is simpler without awk. Mine was also one line of code, and faster. You can test that with the "time" command. As you noticed, I did not have to explain the syntax to the user.
