Sponsored Content
Top Forums Shell Programming and Scripting Matching the entries and printing data Post 302702777 by manigrover on Tuesday 18th of September 2012 09:35:58 PM
Old 09-18-2012
Matching the entries and printing data

Hi all,

I have a file with Id which I want to compare it with other file to get the sequence of a particular id.

File 1 with ID
Code:
Q7L8J4
Q676U5
Q8NAA4
Q5TYW2
Q5SQ80
Q5VUR7
Q4UJ75
Q96IX9
Q7Z4T9
Q6NTF7
Q8IZP0
Q9NYB9
Q9P2A4
O14639
Q9ULW3
Q969K4
Q15057
Q5T8D3
Q8N7X0
Q9Y2D8
Q8TED9
Q8N4X5

File 2 with sequence information
Code:
>sp|Q7L8J4|3BP5L_HUMAN SH3 domain-binding protein 5-like OS=Homo sapiens GN=SH3BP5L PE=1 SV=1
MAELRQVPGGRETPQGELRPEVVEDEVPRSPVAEEPGGGGSSSSEAKLSPREEEELDPRI
QEELEHLNQASEEINQVELQLDEARTTYRRILQESARKLNTQGSHLGSCIEKARPYYEAR
RLAKEAQQETQKAALRYERAVSMHNAAREMVFVAEQGVMADKNRLDPTWQEMLNHATCKV
NEAEEERLRGEREHQRVTRLCQQAEARVQALQKTLRRAIGKSRPYFELKAQFSQILEEHK
AKVTELEQQVAQAKTRYSVALRNLEQISEQIHARRRGGLPPHPLGPRRSSPVGAEAGPED
MEDGDSGIEGAEGAGLEEGSSLGPGPAPDTDTLSLLSLRTVASDLQKCDSVEHLRGLSDH
VSLDGQELGTRSGGRRGSDGGARGGRHQRSVSL
>sp|Q676U5|A16L1_HUMAN Autophagy-related protein 16-1 OS=Homo sapiens GN=ATG16L1 PE=1 SV=2
MSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIILQYNKLLEKSDLHSVLAQKLQAEK
HDVPNRHEISPGHDGTWNDNQLQEMAQLRIKHQEELTELHKKRGELAQLVIDLNNQMQRK
DREMQMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDALQITFTALEGKL
RKTTEENQELVTRWMAEKAQEANRLNAENEKDSRRRQARLQKELAEAAKEPLPVEQDDDI
EVIVDETSDHTEETSPVRAISRAATKRLSQPAGGLLDSITNIFGRRSVSSFPVPQDNVDT
HPGSGKEVRVPATALCVFDAHDGEVNAVQFSPGSRLLATGGMDRRVKLWEVFGEKCEFKG
SLSGSNAGITSIEFDSAGSYLLAASNDFASRIWTVDDYRLRHTLTGHSGKVLSAKFLLDN
ARIVSGSHDRTLKLWDLRSKVCIKTVFAGSSCNDIVCTEQCVMSGHFDKKIRFWDIRSES
IVREMELLGKITALDLNPERTELLSCSRDDLLKVIDLRTNAIKQTFSAPGFKCGSDWTRV
VFSPDGSYVAAGSAEGSLYIWSVLTGKVEKVLSKQHSSSINAVAWSPSGSHVVSVDKGCK
AVLWAQY
>sp|Q8NAA4|A16L2_HUMAN Autophagy-related protein 16-2 OS=Homo sapiens GN=ATG16L2 PE=2 SV=2
MAGPGVPGAPAARWKRHIVRQLRLRDRTQKALFLELVPAYNHLLEKAELLDKFSKKLQPE
PNSVTPTTHQGPWEESELDSDQVPSLVALRVKWQEEEEGLRLVCGEMAYQVVEKGAALGT
LESELQQRQSRLAALEARVAQLREARAQQAQQVEEWRAQNAVQRAAYEALRAHVGLREAA
LRRLQEEARDLLERLVQRKARAAAERNLRNERRERAKQARVSQELKKAAKRTVSISEGPD
TLGDGMRERRETLALAPEPEPLEKEACEKWKRPFRSASATSLTLSHCVDVVKGLLDFKKR
RGHSIGGAPEQRYQIIPVCVAARLPTRAQDVLDAHLSEVNAVRFGPNSSLLATGGADRLI
HLWNVVGSRLEANQTLEGAGGSITSVDFDPSGYQVLAATYNQAAQLWKVGEAQSKETLSG
HKDKVTAAKFKLTRHQAVTGSRDRTVKEWDLGRAYCSRTINVLSYCNDVVCGDHIIISGH
NDQKIRFWDSRGPHCTQVIPVQGRVTSLSLSHDQLHLLSCSRDNTLKVIDLRVSNIRQVF
RADGFKCGSDWTKAVFSPDRSYALAGS

I want these two files to be compared by comapring ID in the first file with the ID encoded between the pipeline "||" in the second file. If it is same then print the complete sequence.

For example, Q676U5 is found in first file and in the second file. So in the output I should have something like this given below
Expected output
Code:
>sp|Q676U5|A16L1_HUMAN Autophagy-related protein 16-1 OS=Homo sapiens GN=ATG16L1 PE=1 SV=2
MSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIILQYNKLLEKSDLHSVLAQKLQAEK
HDVPNRHEISPGHDGTWNDNQLQEMAQLRIKHQEELTELHKKRGELAQLVIDLNNQMQRK
DREMQMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDALQITFTALEGKL
RKTTEENQELVTRWMAEKAQEANRLNAENEKDSRRRQARLQKELAEAAKEPLPVEQDDDI
EVIVDETSDHTEETSPVRAISRAATKRLSQPAGGLLDSITNIFGRRSVSSFPVPQDNVDT
HPGSGKEVRVPATALCVFDAHDGEVNAVQFSPGSRLLATGGMDRRVKLWEVFGEKCEFKG
SLSGSNAGITSIEFDSAGSYLLAASNDFASRIWTVDDYRLRHTLTGHSGKVLSAKFLLDN
ARIVSGSHDRTLKLWDLRSKVCIKTVFAGSSCNDIVCTEQCVMSGHFDKKIRFWDIRSES
IVREMELLGKITALDLNPERTELLSCSRDDLLKVIDLRTNAIKQTFSAPGFKCGSDWTRV
VFSPDGSYVAAGSAEGSLYIWSVLTGKVEKVLSKQHSSSINAVAWSPSGSHVVSVDKGCK
AVLWAQY

Thanks in advance
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Pattern matching and Printing Filename

Hi, My requirement is to search for a paritcular string from a group of .gz files and to print the lines containing that string and the name of the files in which that string is present. Daily 500 odd .gz files will be generated in a directory(directory name will be in the form of... (4 Replies)
Discussion started by: krao
4 Replies

2. Shell Programming and Scripting

Pattern Matching and printing

Dear All, I have a log file like below 13:26:31 |152.22 13:27:31 |154.25 13:28:31 |154.78 13:29:31 |151.23 13:30:31 |145.63 13:31:31 |142.10 13:32:31 |145.45 where values will be there from 00:00 hrs to 23:59 hrs. I'm matching for last occurance of 23:59 and printing 1440 lines (grep... (4 Replies)
Discussion started by: Naga06
4 Replies

3. Shell Programming and Scripting

Printing entire field, if at least one row is matching by AWK

Dear all, I have been trying to print an entire field, if the first line of the field is matching. For example, my input looks something like this. aaa ddd zzz 123 987 126 24 0.650 985 354 9864 0.32 0.333 4324 000 I am looking for a pattern,... (5 Replies)
Discussion started by: Chulamakuri
5 Replies

4. Shell Programming and Scripting

fgrep not printing non matching lines

I'm using this: fgrep -f file1.txt file2.txt To find lines in file1 that match patterns found in file2. When I add -v: egrep -v -f file1.txt file2.txt It won't return non matching lines, I just get a blank. Can anyone help? PS. file1.txt contains 3 million lines...each string... (2 Replies)
Discussion started by: Nonito84
2 Replies

5. Shell Programming and Scripting

Help With AWK Matching and Re-printing Lines

Hi All, I'm looking to use AWK to pattern match lines in XML file - Example patten for below sample would be /^<apple>/ The sample I wrote out is very basic compared to what I am actually working with but it will get me started I would like to keep the matched line(s) unchanged but have them... (4 Replies)
Discussion started by: rhoderidge
4 Replies

6. Shell Programming and Scripting

Matching and printing line with awk

Hi there, I'm trying to use awk to print out the entire line that contains a match to a certain regex and then append some text,plus the match to the end of the line. So far I have: awk -F: '{print "RG:Z:" $2}' file Which prints out the match I want plus the additional text, but I'm stuck... (3 Replies)
Discussion started by: jim_lad
3 Replies

7. Shell Programming and Scripting

UNIX awk pattern matching and printing lines

I have the below plain text file where i have some result, in order to mail that result in html table format I have written the below script and its working well. cat result.txt Page 2015-01-01 2000 Colors 2015-02-01 3000 Landing 2015-03-02 4000 #!/bin/sh LOG=/tmp/maillog.txt... (1 Reply)
Discussion started by: close2jay
1 Replies

8. Shell Programming and Scripting

Comparing same column from two files, printing whole row with matching values

First I'd like to apologize if I opened a thread which is already open somewhere. I did a bit of searching but could quite find what I was looking for, so I will try to explaing what I need. I'm writing a script on our server, got to a point where I have two files with results. Example: File1... (6 Replies)
Discussion started by: mitabrev83
6 Replies

9. Shell Programming and Scripting

Need help on pattern matching and printing the same

Hi, I need to match for the pattern '.py' in my file and print the word which contains. For example: cat testfile a b 3 4.py 5 6 a b.py c.py 4 5 6 7 8 1.py 2.py 3 4 5 6 Expected output: 4.py b.py c.py 1.py 2.py (3 Replies)
Discussion started by: Sumanthsv
3 Replies

10. UNIX for Beginners Questions & Answers

Continued trouble matching fields in different files and selective field printing ([g]awk)

I apologize in advance, but I continue to have trouble searching for matches between two files and then printing portions of each to output in awk and would very much appreciate some help. I have data as follows: File1 PS012,002 PRQ 0 1 1 17 1 0 -1 3 2 1 2 -1 ... (7 Replies)
Discussion started by: jvoot
7 Replies
All times are GMT -4. The time now is 11:45 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy