Need help searching for values in file then adding to line


 
# 1  
Old 12-07-2012

Hello!

I'm currently trying to organize data for some bio research, but I'm not sure how to compare a value against the values in a file. I have 2 arrays: one contains NM numbers and can be referenced as NM[#]; the other contains symbols, SYM[#]. I also have a file with an NM number on every other line and, between the NM lines, information that is irrelevant to the matching (but it needs to stay in the file). What I need to do is match every NM[#] in my array to the NM number in the file and add :SYM[#] to the end of that line. The catch is that each NM number in the file has a > symbol in front of it (which needs to stay there). So, for example, I have an array NM that looks like:

Code:
{NM_23948375 NM_03948274 NM_39482746 NM_20475839}

#except there are about 2 thousand values

and SYM:

Code:
{fj48g9sk 2idjf8a0s ajsie9rt skdjie8t}

#same amount of values as NM

and the file looks like:

Code:
>NM_########
AUGCGCUAGCUGAUGCUGAGCACGAUCGAUCGAAA
>NM_########
AUGUCGUAGCUAGCGUAGCUGUAUCGUGAC

I need to take the first NM number in my NM array and compare it to every other line in the file, ignoring the > in front. When the matching line is found, I need to append :SYM to it, where SYM is the symbol at the same position in the SYM array as the NM number has in the NM array. So: take the first NM number, find its line, add the first symbol; then take the second NM number, match it, add the second symbol; and so on, for a final product that looks like:

Code:
>NM_########:SYM
AUGCAGUCGAUCGAUGCUAGUCUACAGCUAUCGGAAA
>NM_########:SYM
AUGCCGUAGCUAGCUACGUACGUGUAGCUGAC

I feel like the process should be relatively simple; I'm just completely new at this and am looking for any help. I'm not even sure how to start.

Here's what I have so far (forgive any syntax errors; everything I want to do is in there, I just need help getting it into working code). The file to be edited is called file.fa; I can also take it as an argument and refer to it as $1 if that's easier:

Code:
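#!/bin/bash
# NM and SYM are assumed to be filled in above this point, in matching order.
mapfile -t fileline < file.fa     # bash 4+; fileline[0] is line 1 of the file

for ((i = 0; i < ${#NM[@]}; i++)); do
  for ((j = 0; j < ${#fileline[@]} / 2; j++)); do
    # Header lines sit at even array indexes; compare without the >
    if [[ ${fileline[2*j]#>} == "${NM[i]}" ]]; then
      # Rewrite that header in place (-i is the GNU sed in-place option)
      sed -i "$((2*j + 1))s/.*/>${NM[i]}:${SYM[i]}/" file.fa
    fi
  done
done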

I also have access to perl if that makes things easier. Also, if this is all possible by just using the command line, that'd be simpler for me.
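
For the command-line route mentioned here, one possible sketch is a single awk pass instead of one search per NM number. It assumes the two arrays have first been written out as a two-column file (one "NM_number symbol" pair per line); the function name annotate_fasta and the file name nm_sym.map are made up for illustration and do not come from the thread.

```shell
# Sketch: annotate_fasta MAP FASTA
# MAP (hypothetical file) holds one "NM_number symbol" pair per line.
# Prints FASTA to stdout with ":symbol" appended to each matching header.
annotate_fasta() {
  awk '
    NR == FNR { sym[">" $1] = $2; next }   # first file: build the lookup table
    ($0 in sym) { $0 = $0 ":" sym[$0] }    # header line with a known NM number
    { print }                              # print every line, changed or not
  ' "$1" "$2"
}
```

With the arrays from the post, the map could be built in bash with `paste -d' ' <(printf '%s\n' "${NM[@]}") <(printf '%s\n' "${SYM[@]}") > nm_sym.map`, followed by `annotate_fasta nm_sym.map file.fa > file.annotated.fa`. Headers with no matching NM number are left unchanged, and the comparison is exact, so trailing spaces or carriage returns on a header line would prevent a match. The NR == FNR test also assumes the map file is not empty.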

Sorry for the long post and any help is appreciated!

Last edited by jim mcnamara; 12-07-2012 at 11:49 AM..
# 2  
Old 12-07-2012
So, any line beginning with '>NM_[0-9]\{8\}' needs an NM lookup, a SYM lookup and a :SYM insertion. What should happen if the NM or SYM lookup fails? sed could help with the parsing, but you are writing the output file in shell, and for non-trivial amounts of data speed usually rules out a command fork/exec per line in a shell data-processing loop (you must use shell built-in commands, or one command per file). You can use one sed for all files to break the NM number out into a separate field for a "while read prefix nm rest ; do ... ; done" loop; if prefix is '>NM_', look up the sym and write a new line, otherwise preserve the old line with 'echo "$prefix$nm$rest"'.
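
A minimal sketch of such a loop, using only shell built-ins per line, might look like the following. It is not the exact recipe above: it assumes bash 4 or later (for associative arrays), swaps the sed pre-split for parameter expansion, and the function name add_symbols is hypothetical. The NM and SYM arrays are expected to exist already, in matching order.

```shell
# Sketch only; needs bash 4+ for associative arrays.
# add_symbols FASTA: reads the global arrays NM and SYM (same order) and
# prints FASTA with ":SYM" appended to every header whose NM number is known.
add_symbols() {
  local -A lookup=()
  local i line nm
  for i in "${!NM[@]}"; do
    lookup[${NM[i]}]=${SYM[i]}            # NM number -> symbol
  done
  while IFS= read -r line || [[ -n $line ]]; do
    nm=${line#>}                          # header text without the leading >
    if [[ $line == '>'* && -n $nm && -n ${lookup[$nm]+set} ]]; then
      line+=":${lookup[$nm]}"             # lookup succeeded: append the symbol
    fi
    printf '%s\n' "$line"                 # print every line, annotated or not
  done < "$1"
}
```

Called as `add_symbols file.fa > file.annotated.fa`, this leaves a header untouched when the lookup fails, which is one possible answer to the question of what to do in that case; printing a warning to stderr instead would be a small change inside the if.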
 