10-08-2008
The files look like this:
File1
12345
56789
23456
File2
12345 fsfsdf 76775
23456 ytyy 090890
66444 rytry 878878
The output should be:
12345 fsfsdf 76775
23456 ytyy 090890
File1 contains around 1 million lines and File2 has 2.5 million lines.
Please help.
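One common way to do this kind of key lookup is a two-pass awk command, sketched below. It assumes the key is the first whitespace-separated field of each line and that File1 is not empty; the temporary directory and the variable names (workdir, matched) are only for illustration and are not from the post. The sample data is the data shown above.

```shell
#!/bin/sh
# Sample data copied from the post; the real files are far larger.
workdir=$(mktemp -d)
printf '%s\n' 12345 56789 23456 > "$workdir/File1"
printf '%s\n' '12345 fsfsdf 76775' '23456 ytyy 090890' '66444 rytry 878878' > "$workdir/File2"

# While reading the first file (NR==FNR), remember every key it contains.
# While reading the second file, print only lines whose first field is a remembered key.
matched=$(awk 'NR==FNR { keys[$1]; next } $1 in keys' "$workdir/File1" "$workdir/File2")

printf '%s\n' "$matched"
rm -rf "$workdir"
```

This keeps all File1 keys in memory, which is usually acceptable for about a million short keys, and it needs no sorting, so the File2 lines come out in their original order.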
JOIN(1) General Commands Manual JOIN(1)
NAME
join - relational database operator
SYNOPSIS
join [ options ] file1 file2
DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If one of the file names is -,
the standard input is used.
File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in
each line.
There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally
consists of the common field, then the rest of the line from file1, then the rest of the line from file2.
Input fields are normally separated by spaces or tabs; output fields by a space. In this case, multiple separators count as one, and leading
separators are discarded.
The following options are recognized, with POSIX syntax.
-a n In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.
-v n Like -a, omitting output for paired lines.
-e s Replace empty output fields by string s.
-1 m
-2 m Join on the mth field of file1 or file2.
-jn m Archaic equivalent for -n m.
-ofields
Each output line comprises the designated fields. The comma-separated field designators are either 0, meaning the join field, or
have the form n.m, where n is a file number and m is a field number. Archaic usage allows separate arguments for field designators.
-tc Use character c as the only separator (tab character) on input and output. Every appearance of c in a line is significant.
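A short sketch of two of the options above, -v and -o, may help. It assumes a POSIX-style join (the same option letters as listed here) and uses the sample data from the question, already sorted on the first field; the file names keys.sorted and data.sorted and the variable names are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Both inputs are written already sorted on field 1, as join requires.
printf '%s\n' 12345 23456 56789 > "$workdir/keys.sorted"
printf '%s\n' '12345 fsfsdf 76775' '23456 ytyy 090890' '66444 rytry 878878' > "$workdir/data.sorted"

# -v 2: print only the lines of the second file that have no partner in the first.
unpaired=$(LC_ALL=C join -v 2 "$workdir/keys.sorted" "$workdir/data.sorted")

# -o 0,2.3: for paired lines, print the join field and field 3 of the second file.
selected=$(LC_ALL=C join -o 0,2.3 "$workdir/keys.sorted" "$workdir/data.sorted")

printf 'unpaired:\n%s\nselected:\n%s\n' "$unpaired" "$selected"
rm -rf "$workdir"
```

Setting LC_ALL=C is a precaution so that join compares keys in plain byte order, matching how the sample files were sorted.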
EXAMPLES
sort /adm/users | join -t: -a 1 -e "" - bdays
Add birthdays to password information, leaving unknown birthdays empty. The layout of the users file is given in users(6); bdays contains sorted
lines like
tr : ' ' </adm/users | sort -k 3 3 >temp
join -1 3 -2 3 -o 1.1,2.1 temp temp | awk '$1 < $2'
Print all pairs of users with identical userids.
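Applied to the two files from the question at the top of this thread, join could be used roughly as follows. This is a sketch that assumes a POSIX-style sort and join and that the key is the first field in both files; the .sorted file names and the variable names are made up for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' 12345 56789 23456 > "$workdir/File1"
printf '%s\n' '12345 fsfsdf 76775' '23456 ytyy 090890' '66444 rytry 878878' > "$workdir/File2"

# join needs both inputs sorted on the join field; using LC_ALL=C for
# sort and join keeps them in agreement about the collating sequence.
LC_ALL=C sort "$workdir/File1" > "$workdir/File1.sorted"
LC_ALL=C sort -k 1,1 "$workdir/File2" > "$workdir/File2.sorted"

# File1 holds only the key, so each output line is the key followed by
# the remaining fields of the matching File2 line.
joined=$(LC_ALL=C join "$workdir/File1.sorted" "$workdir/File2.sorted")

printf '%s\n' "$joined"
rm -rf "$workdir"
```

Unlike a hash lookup in awk, this approach sorts both files first, so the output follows key order rather than the original order of File2, and runs of spaces or tabs between fields are collapsed to a single space.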
SOURCE
/sys/src/cmd/join.c
SEE ALSO
sort(1), comm(1), awk(1)
BUGS
With default field separation, the collating sequence is that of sort -b -ky,y; with -t, the sequence is that of sort -tx -ky,y.
One of the files must be randomly accessible.