Full Discussion: grep -f file1 file2
Post 302491126 by geparada88 on Wednesday 26th of January 2011 05:59:56 PM

Hi
I started learning bash a week ago. I need to filter the lines of "file2" whose last column matches an entry in another file, "file1".

file1:

chr10100036394-100038350AK077761
chr10100041065-100046547AK032226
chr10100041065-100046547AK016270
chr10100041065-100046547AK078231
...


file2:

chr10100036394-100038350 BC082587 100.00 30 0 0 1 30 868 897 5e-09 60.0 chr10100036394-100038350BC082587
chr10100036394-100038350 AK160693 100.00 30 0 0 1 30 901 930 5e-09 60.0 chr10100036394-100038350AK160693
chr10100036394-100038350 AK082894 100.00 30 0 0 1 30 871 900 5e-09 60.0 chr10100036394-100038350AK082894
chr10100036394-100038350 AK077761 100.00 30 0 0 1 30 913 942 5e-09 60.0 chr10100036394-100038350AK077761
chr10100039992-100040948 AK078231 100.00 30 0 0 1 30 551 580 5e-09 60.0 chr10100039992-100040948AK078231
chr10100039992-100040948 AK077761 100.00 30 0 0 1 30 545 574 5e-09 60.0 chr10100039992-100040948AK077761
chr10100039992-100040948 AK036647 100.00 30 0 0 1 30 533 562 5e-09 60.0 chr10100039992-100040948AK036647
chr10100039992-100040948 AK032226 100.00 30 0 0 1 30 506 535 5e-09 60.0 chr10100039992-100040948AK032226
chr10100039992-100040948 AK016270 100.00 30 0 0 1 30 382 411 5e-09 60.0 chr10100039992-100040948AK016270
chr10100039992-100040948 AK015251 100.00 30 0 0 1 30 499 528 5e-09 60.0 chr10100039992-100040948AK015251
chr10100041065-100044896 AK043358 100.00 30 0 0 1 30 3118 3147 5e-09 60.0 chr10100041065-100044896AK043358
chr10100041065-100046547 BC082587 100.00 30 0 0 1 30 383 412 5e-09 60.0 chr10100041065-100046547BC082587
chr10100041065-100046547 AK160693 100.00 30 0 0 1 30 416 445 5e-09 60.0 chr10100041065-100046547AK160693
chr10100041065-100046547 AK082894 100.00 30 0 0 1 30 386 415 5e-09 60.0 chr10100041065-100046547AK082894
chr10100041065-100046547 AK078231 100.00 30 0 0 1 30 434 463 5e-09 60.0 chr10100041065-100046547AK078231
chr10100041065-100046547 AK077761 100.00 30 0 0 1 30 428 457 5e-09 60.0 chr10100041065-100046547AK077761
chr10100041065-100046547 AK036647 100.00 30 0 0 1 30 416 445 5e-09 60.0 chr10100041065-100046547AK036647
chr10100041065-100046547 AK032226 100.00 30 0 0 1 30 389 418 5e-09 60.0 chr10100041065-100046547AK032226
chr10100041065-100046547 AK016270 100.00 30 0 0 1 30 265 294 5e-09 60.0 chr10100041065-100046547AK016270
chr10100041065-100046547 AK015251 100.00 30 0 0 1 30 382 411 5e-09 60.0 chr10100041065-100046547AK015251
...

I tried to use
Code:
grep -f file1 file2

but the process is killed.
I think it's because both files are too large (file1 has more than 10e6 lines).

I tried the same command with a smaller file1 (10 lines) and it works fine.

Can anybody tell me another way to do this, or tell me what I'm doing wrong?

Thanks!
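
One possible approach, sketched here rather than taken from the thread: instead of handing grep millions of patterns, load the file1 entries into an awk associative array and test the last field of each file2 line against it. The temporary directory and the four-line sample of file2 are stand-ins built from the data above, and the variable names (workdir, keys, matches) are illustrative only. This assumes the entries in file1 contain no whitespace and that the machine has enough memory to hold all of them in the array.

```shell
#!/bin/sh
# Hypothetical demo files: small stand-ins for the real file1 and file2.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/file1" <<'EOF'
chr10100036394-100038350AK077761
chr10100041065-100046547AK032226
EOF

cat > "$workdir/file2" <<'EOF'
chr10100036394-100038350 BC082587 100.00 30 0 0 1 30 868 897 5e-09 60.0 chr10100036394-100038350BC082587
chr10100036394-100038350 AK077761 100.00 30 0 0 1 30 913 942 5e-09 60.0 chr10100036394-100038350AK077761
chr10100041065-100046547 AK032226 100.00 30 0 0 1 30 389 418 5e-09 60.0 chr10100041065-100046547AK032226
chr10100041065-100046547 AK015251 100.00 30 0 0 1 30 382 411 5e-09 60.0 chr10100041065-100046547AK015251
EOF

# While reading the first file (NR == FNR), store each entry as an array key.
# While reading the second file, print the line if its last field ($NF)
# is one of the stored keys. Each lookup is a hash test, not a regex match.
matches=$(awk 'NR == FNR { keys[$1]; next } $NF in keys' \
    "$workdir/file1" "$workdir/file2")

# Prints the two file2 lines ending in ...AK077761 and ...AK032226.
printf '%s\n' "$matches"
```

On the real data the core of this would be awk 'NR == FNR { keys[$1]; next } $NF in keys' file1 file2. If staying with grep is preferred, adding -F (grep -F -f file1 file2) makes grep treat every line of file1 as a fixed string instead of a regular expression, which may reduce memory use; note that it matches the string anywhere on the line, not only in the last column.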
 
