06-30-2009
filtering out data using grep
Neither file has a header row.
Input file1.txt (one column, as shown below):
MXY2344
MXY2455
.
.
.
.
.
.
.
MXY9150 <--- row #364
Input file2.ped (column 1 holds the id, as shown below; it is followed by more than 2 million columns of single-digit numbers, and columns are separated by a space):
MXY2344
MXY2455
.
.
.
.
.
.
.
MXY9150 <--- row #364
.
.
.
.
.
.
.
.
.
.
.
MXY9423 <--- row #1411
Desired output file3 (only the 364 rows whose ids match between file1 and file2, each with 2,498,588 columns):
MXY2344
MXY2455
.
.
.
.
.
.
.
MXY9150 <--- row #364
Thank you for any help!
---------- Post updated at 11:10 PM ---------- Previous update was at 11:03 PM ----------
I used grep -A1 -A1 -f file1.txt file2 > file3 but that did not work.
I only got one reply for this thread yesterday saying to use grep, so that's why I'm posting this again in hopes somebody would help.
Thank you!
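One possible approach, offered as a sketch rather than a tested answer for the real data: `-A1` makes grep also print the line after every match, which would pull in rows that are not in file1.txt. Matching on the first column with awk avoids that. The tiny sample files below are hypothetical stand-ins for file1.txt and file2.ped, and the sketch assumes the id is always the first space-separated field.

```shell
# Work in a scratch directory with tiny stand-in files (hypothetical sample data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'MXY2344\nMXY2455\n' > file1.txt
printf 'MXY2344 1 2 2 1\nMXY2455 2 2 1 1\nMXY9423 1 1 2 2\n' > file2.ped

# First pass (NR == FNR, i.e. while reading file1.txt): remember every id.
# Second pass (file2.ped): print a row only if its first column is a remembered id.
awk 'NR == FNR { ids[$1]; next } $1 in ids' file1.txt file2.ped > file3

cat file3
```

Rows are written in the order they appear in file2.ped, and each kept row is copied whole, so all of its columns are preserved. With lines this wide, some awk implementations may be slow or hit field limits, so it is worth trying this on a small slice of the real file first.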
comm(1) General Commands Manual comm(1)
NAME
comm - select or reject lines common to two sorted files
SYNOPSIS
comm [-[123]] file1 file2
DESCRIPTION
comm reads file1 and file2, which should be ordered in increasing collating sequence (see sort(1) and Environment Variables below), and
produces a three-column output:
Column 1: Lines that appear only in file1,
Column 2: Lines that appear only in file2,
Column 3: Lines that appear in both files.
If - is used for file1 or file2, the standard input is used.
Options 1, 2, or 3 suppress printing of the corresponding column. Thus comm -12 prints only the lines common to the two files; comm -23 prints only lines in
the first file but not in the second; comm -123 does nothing useful.
EXTERNAL INFLUENCES
Environment Variables
LC_COLLATE determines the collating sequence comm expects from the input files.
LC_MESSAGES determines the language in which messages are displayed.
If LC_MESSAGES is not specified in the environment or is set to the empty string, the value of LANG determines the language in which messages are displayed.
If LC_COLLATE is not specified in the environment or is set to the empty string, the value of LANG is used as a default. If LANG is not specified or is set to
the empty string, a default of ``C'' (see lang(5)) is used instead of LANG. If any internationalization variable contains an invalid setting,
comm behaves as if all internationalization variables are set to ``C''. See environ(5).
International Code Set Support
Single- and multi-byte character code sets are supported.
EXAMPLES
The following examples assume that file1 and file2 have been ordered in the collating sequence defined by the LC_COLLATE or LANG environment variable.
Print all lines common to file1 and file2 (in other words, print column 3):
    comm -12 file1 file2
Print all lines that appear in file1 but not in file2 (in other words, print column 1):
    comm -23 file1 file2
Print all lines that appear in file2 but not in file1 (in other words, print column 2):
    comm -13 file1 file2
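As a rough illustration of the three cases above, the sketch below applies comm to id lists like the ones in the thread. The sample data and the output file names (ids1.sorted, ids2.sorted, common.txt, and so on) are hypothetical. It assumes the id is the first space-separated column of file2.ped, and it sets LC_ALL=C so that sort and comm agree on the collating sequence.

```shell
# Scratch directory with small, deliberately unsorted stand-in files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'MXY2455\nMXY2344\nMXY9150\n' > file1.txt
printf 'MXY2344 1 2\nMXY9423 2 2\nMXY2455 1 1\n' > file2.ped

# Use one collating sequence for both sort and comm.
export LC_ALL=C

# comm expects sorted input, so sort both id lists first.
sort file1.txt > ids1.sorted
cut -d ' ' -f 1 file2.ped | sort > ids2.sorted

comm -12 ids1.sorted ids2.sorted > common.txt         # ids in both files
comm -23 ids1.sorted ids2.sorted > only_in_file1.txt  # ids only in file1.txt
comm -13 ids1.sorted ids2.sorted > only_in_file2.txt  # ids only in file2.ped
```

comm compares whole lines, which is why only the id column is extracted from file2.ped here. It can report which ids overlap, but it does not return the full data rows.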
SEE ALSO
cmp(1), diff(1), sdiff(1), sort(1), uniq(1).
STANDARDS CONFORMANCE
comm(1)