08-09-2007
Grep Line with Matching Fields
Below is the scenario. Help is appreciated.
File1 (500,000 lines): three comma-delimited fields, not sorted
1234FAA,435612,88975
1224FAB,12345,212356
File2 (4,000,000 lines): six comma-delimited fields (the last 3 fields should match the 3 fields of File1), not sorted
0123456abcd,12345,abcdef,1234FAA,435612,88975
0123456wxyz,11234,lmnopq,1224FAB,12345,212356
I need to grab all six fields of each file2 line whose last 3 fields match the 3 fields of some line in file1.
I wrote a small script, but it seems like it might take days to complete :-)
#!/bin/ksh

# Runs one full scan of file2 for every line of file1.
while read record
do
    grep "$record" file2 >> final.list
done < file1
Can someone help me with a faster solution?
Thanks in advance.
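One possible faster approach, offered as a sketch rather than a tested answer for the real data: load file1 into an awk array once, then make a single pass over file2 and print each line whose last three fields form a key seen in file1. The sample rows below are the ones from the post plus one made-up non-matching row, and the temporary directory is only there to keep the demo self-contained. On older Solaris systems the default awk may not support the `in` operator, so nawk or /usr/xpg4/bin/awk may be needed; that is an assumption about the platform.

```shell
#!/bin/sh
# Demo in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file1: three fields per line (rows taken from the post).
printf '%s\n' '1234FAA,435612,88975' '1224FAB,12345,212356' > file1

# file2: six fields per line; the third row is a made-up non-match.
printf '%s\n' \
    '0123456abcd,12345,abcdef,1234FAA,435612,88975' \
    '0123456wxyz,11234,lmnopq,1224FAB,12345,212356' \
    '0123456zzzz,99999,nomatch,9999ZZZ,1,2' > file2

# While reading file1 (NR == FNR), store each whole line as an array key.
# While reading file2, print the line if fields 4-6 joined by commas are a stored key.
awk -F, '
    NR == FNR { keys[$0]; next }
    ($4 "," $5 "," $6) in keys
' file1 file2 > final.list

cat final.list
```

This reads each file once instead of scanning file2 500,000 times, at the cost of holding all file1 lines in memory. `grep -F -f file1 file2` is a shorter alternative, but it matches a file1 line anywhere in a file2 line rather than only in the last three fields.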
LEARN ABOUT OPENSOLARIS
comm
comm(1) User Commands comm(1)
NAME
comm - select or reject lines common to two files
SYNOPSIS
comm [-123] file1 file2
DESCRIPTION
The comm utility reads file1 and file2, which must be ordered in the current collating sequence, and produces three text columns as output:
lines only in file1; lines only in file2; and lines in both files.
If the input files were ordered according to the collating sequence of the current locale, the lines written will be in the collating
sequence of the original lines. If not, the results are unspecified.
OPTIONS
The following options are supported:
-1 Suppresses the output column of lines unique to file1.
-2 Suppresses the output column of lines unique to file2.
-3 Suppresses the output column of lines duplicated in file1 and file2.
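As a small illustration of how the options combine, the sketch below runs comm on two already-sorted files. The file names and contents are made up for the demo, and LC_ALL=C is set only so the collating order is predictable.

```shell
#!/bin/sh
# Demo in a scratch directory; file names and contents are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
LC_ALL=C
export LC_ALL

# Both inputs are already in sorted order, as comm requires.
printf '%s\n' apple banana cherry > left.txt
printf '%s\n' banana cherry date > right.txt

# No options: three tab-separated columns (only left, only right, both).
comm left.txt right.txt

# -12 suppresses columns 1 and 2, leaving the lines common to both files.
common_only=$(comm -12 left.txt right.txt)

# -13 suppresses columns 1 and 3, leaving the lines unique to the second file.
right_only=$(comm -13 left.txt right.txt)

printf 'common: %s\n' $common_only
printf 'right only: %s\n' $right_only
```

Suppressing two columns at a time is the usual way to get a plain list with no leading tabs.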
OPERANDS
The following operands are supported:
file1 A path name of the first file to be compared. If file1 is -, the standard input is used.
file2 A path name of the second file to be compared. If file2 is -, the standard input is used.
USAGE
See largefile(5) for the description of the behavior of comm when encountering files greater than or equal to 2 Gbyte ( 2^31 bytes).
EXAMPLES
Example 1 Printing a list of utilities specified by files
If file1, file2, and file3 each contain a sorted list of utilities, the command
example% comm -23 file1 file2 | comm -23 - file3
prints a list of utilities in file1 not specified by either of the other files. The entry:
example% comm -12 file1 file2 | comm -12 - file3
prints a list of utilities specified by all three files. And the entry:
example% comm -12 file2 file3 | comm -23 - file1
prints a list of utilities specified by both file2 and file3, but not specified in file1.
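The three pipelines above can be tried on small made-up lists. In this sketch the utility names in each file are invented for the demo, the files are passed through sort first because comm expects ordered input, and LC_ALL=C is an assumption made to keep the ordering predictable.

```shell
#!/bin/sh
# Demo in a scratch directory; list contents are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
LC_ALL=C
export LC_ALL

# Three sorted lists of utility names.
printf '%s\n' awk comm grep sort | sort > file1
printf '%s\n' comm diff grep uniq | sort > file2
printf '%s\n' comm sed sort uniq | sort > file3

# In file1 but in neither file2 nor file3.
only_in_file1=$(comm -23 file1 file2 | comm -23 - file3)

# In all three files.
in_all_three=$(comm -12 file1 file2 | comm -12 - file3)

# In both file2 and file3, but not in file1.
in_2_and_3_not_1=$(comm -12 file2 file3 | comm -23 - file1)

echo "only in file1: $only_in_file1"
echo "in all three: $in_all_three"
echo "in file2 and file3 but not file1: $in_2_and_3_not_1"
```

Note the space between `-` and the file name in each second comm call: the lone `-` stands for standard input, so the output of the first comm becomes the first operand of the second.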
ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of comm: LANG, LC_ALL, LC_COLLATE,
LC_CTYPE, LC_MESSAGES, and NLSPATH.
EXIT STATUS
The following exit values are returned:
0 All input files were successfully output as specified.
>0 An error occurred.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWesu |
+-----------------------------+-----------------------------+
|CSI |enabled |
+-----------------------------+-----------------------------+
|Interface Stability |Standard |
+-----------------------------+-----------------------------+
SEE ALSO
cmp(1), diff(1), sort(1), uniq(1), attributes(5), environ(5), largefile(5), standards(5)
SunOS 5.11 3 Mar 2004 comm(1)