03-03-2011
awk column comparison big file
Hi all,
I would like to compare a column in one file against a column in another file and, wherever there is a match, print the first column together with its corresponding second column. Example:
File 1
ABA
ABC
ABE
ABF
File 2
ABA 123
ABB 124
ABD 125
ABC 126
So what I would like printed to a file is:
ABA 123
ABC 126
The only problem is that file 1 has 8,000 rows while file 2 has 140,000 rows to search through. I have tried awk and grep -f, but either it doesn't work or it is very slow.
Any quick solutions?
Thanks
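A common way to handle this kind of lookup in awk is a two-pass hash join: read the short key file into an array first, then stream the big file once and print the lines whose first field is a stored key. The sketch below is only an illustration under some assumptions: the file names file1.txt, file2.txt and matches.txt are made up, fields are separated by whitespace, and the sample data is copied from the post. Because each line of the big file costs a single array lookup, a run over 8,000 keys and 140,000 lines should normally be quick.

```shell
# Hypothetical file names; the sample data is the example from the post.
printf '%s\n' ABA ABC ABE ABF > file1.txt
printf '%s\n' 'ABA 123' 'ABB 124' 'ABD 125' 'ABC 126' > file2.txt

# While reading the first file (NR == FNR), store each key in an array.
# While reading the second file, print lines whose first field is a stored key.
awk 'NR == FNR { keys[$1]; next } $1 in keys' file1.txt file2.txt > matches.txt

cat matches.txt
```

With the sample data, matches.txt should contain the lines "ABA 123" and "ABC 126", in the order they appear in the second file. If a command like this prints nothing on real data, one possible cause worth checking is Windows-style line endings in the key file, since a trailing carriage return becomes part of the key.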