If I needed to get this done quickly, I would make use of the usual *nix commands. I would add the file name to each line, then manipulate the results so that I had a single file, sort it, collect the lines on which the data items were the same, and then filter for lines which had exactly 2 fields. For example:
producing for your data:
The awk script is specific to this instance. If this were going to be an ongoing task, I would write a more general multi-file join, with a self-join mode for when only one file is specified. In fact, all the operations could probably be placed into the Perl code, so that the data need be touched a minimum number of times.
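For anyone who wants to try the steps described above, here is a minimal sketch. It is not the script from this post: the function name `only_in_one`, the sample files and their contents are all made up for illustration, and it assumes one data item per line, with no whitespace in the items or the file names.

```shell
#!/bin/sh
# only_in_one FILE...: print each data item that occurs in exactly one of
# the named files, followed by the name of that file.
# Hypothetical helper; assumes items and file names contain no whitespace.
only_in_one() {
    # Step 1: tag every data item with the name of the file it came from.
    for f in "$@"; do
        awk -v name="$f" 'NF { print $1, name }' "$f"
    done |
    # Step 2: one sorted stream; -u drops repeats within the same file.
    LC_ALL=C sort -u |
    # Step 3: collect all file names for the same item onto one line.
    awk '
        $1 != key { if (NR > 1) print line; key = $1; line = $1 }
        { line = line " " $2 }
        END { if (NR > 0) print line }
    ' |
    # Step 4: exactly 2 fields means the item was seen in one file only.
    awk 'NF == 2'
}

# Made-up sample data in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$dir/file1"
printf '%s\n' banana cherry date > "$dir/file2"

(cd "$dir" && only_in_one file1 file2)
# prints:
# apple file1
# date file2
```

Changing the last filter to `NF > 2` would instead list the items that occur in more than one file.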
join
JOIN(1) General Commands Manual JOIN(1)
NAME
join - relational database operator
SYNOPSIS
join [ options ] file1 file2
DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If file1 is `-', the standard
input is used.
File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in
each line.
There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally
consists of the common field, then the rest of the line from file1, then the rest of the line from file2.
Fields are normally separated by blank, tab or newline. In this case, multiple separators count as one, and leading separators are
discarded.
These options are recognized:
-an In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.
-e s Replace empty output fields by string s.
-jn m Join on the mth field of file n. If n is missing, use the mth field in each file.
-o list
Each output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a
field number.
-tc Use character c as a separator (tab character). Every appearance of c in a line is significant.
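As an illustration of the default behaviour and of the option for unpairable lines, here is a small sketch. The sample relations `names` and `cities` are invented, and the option is written in the POSIX form `-a 1` accepted by current implementations rather than the `-an` form shown above.

```shell
#!/bin/sh
# Hypothetical sample relations, already sorted on the first (join) field.
dir=$(mktemp -d)
printf '%s\n' '1 alice' '2 bob' '3 carol' > "$dir/names"
printf '%s\n' '1 london' '3 paris' '4 rome' > "$dir/cities"

# Default: one output line per pair of lines with identical join fields.
join "$dir/names" "$dir/cities"
# prints:
# 1 alice london
# 3 carol paris

# Additionally produce a line for each unpairable line in the first file.
join -a 1 "$dir/names" "$dir/cities"
# prints:
# 1 alice london
# 2 bob
# 3 carol paris
```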
SEE ALSO
sort(1), comm(1), awk(1)
BUGS
With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort.
The conventions of join, sort, comm, uniq, look and awk(1) are wildly incongruous.