Extract particular lines and make them into a spreadsheet


 
# 1  
Old 10-04-2005

Hi Masters,

I know this isn't a new issue, but I couldn't find any similar threads, so I have to bother you. Here is my input file (genomic data). The file has many sections, each section separated by //. Within each section there is only one ID line and one GN line.

ID 3HAO_HUMAN STANDARD; PRT; 286 AA.
AC P46952; Q8N6N9;
DT 01-NOV-1995 (Rel. 32, Created)
DT 01-NOV-1995 (Rel. 32, Last sequence update)
DT 10-MAY-2005 (Rel. 47, Last annotation update)
DE 3-hydroxyanthranilate 3,4-dioxygenase (EC 1.13.11.6) (3-HAO) (3-
DE hydroxyanthranilic acid dioxygenase) (3-hydroxyanthranilate
DE oxygenase).
GN Name=HAAO;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;
OC Homo.
OX NCBI_TaxID=9606;
//
ID A4GCT_HUMAN STANDARD; PRT; 340 AA.
AC Q9UNA3;
DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 13-SEP-2005 (Rel. 48, Last annotation update)
DE Alpha-1,4-N-acetylglucosaminyltransferase (EC 2.4.1.-) (Alpha4GnT).
GN Name=A4GNT;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;
OC Homo.
OX NCBI_TaxID=9606;
//
................

What I need to do is extract part of the GN and ID lines and put them into this format. Thanks in advance.

GN ID
HAAO 3HAO_HUMAN
A4GNT A4GCT_HUMAN
.... ....
# 2  
Old 10-04-2005
There is definitely a better way to do this, but right now, all I could think of was this: (test.tmp holds all your records)
Code:
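#!/usr/bin/ksh
# Second field of each ID line is the entry name, e.g. 3HAO_HUMAN
awk '/^ID/ {print $2}' test.tmp > ID.tmp
# Text after '=' on each GN line, trailing ';' retained, e.g. HAAO;
awk -F'=' '/^GN/ {print $2}' test.tmp > GN.tmp
# '\0' means "no delimiter": lines are joined directly, e.g. HAAO;3HAO_HUMAN
paste -d '\0' GN.tmp ID.tmp > final.output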

The output will be a ';' separated file that you could open in any spreadsheet program.

The above code is inefficient and will be very slow if you have a very large number of records - but for a reasonable number of records, it will be just fine.
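The two-step approach above reads the input twice and needs temporary files. A minimal single-pass sketch follows. It assumes every record has exactly one ID line and one GN line of the form "GN Name=...;", as in the sample data. The function name gn_id_pairs is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "GENE;ENTRY" pairs from a flat file in one pass.
# Assumption: each record has one ID line followed by one "GN Name=...;" line.
gn_id_pairs() {
    awk '
        /^ID/ { id = $2 }                 # remember the entry name, e.g. 3HAO_HUMAN
        /^GN/ {
            gn = $2                       # e.g. Name=HAAO;
            sub(/^Name=/, "", gn)         # drop the Name= prefix
            sub(/;.*$/, "", gn)           # drop the trailing semicolon
            print gn ";" id               # same ";" separated layout as above
        }
    ' "$1"
}
```

Called as gn_id_pairs test.tmp > final.output, this should give the same ';' separated lines without ID.tmp and GN.tmp. Records with no GN line are silently skipped here.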

Last edited by blowtorch; 10-04-2005 at 04:38 PM.. Reason: to modify code to retain the ';'
# 3  
Old 10-04-2005
Thanks, but it doesn't work

Hi,
It didn't work for some reason. The error is:
awk: syntax error at source line 1
context is
/^ID/ {print $2} test.tmp > >>> ID <<< .tmp
awk: bailing out at source line 2
paste: ID.tmp: No such file or directory

By the way, I am using Mac OS X.
# 4  
Old 10-04-2005
Sorry, it worked

My bad! I misspelled a word. Thanks.
# 5  
Old 10-04-2005
Works, but not giving the right output

In the output file, the GN and ID on each line must come from the same record, even though some sections don't have a GN line. In that case, the ID has to be paired with an empty GN field. Thanks.
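One way to keep the pairs aligned is to collect the ID and GN while reading a record and print only when the // terminator is reached, so a missing GN simply leaves its column empty. This is a sketch under assumptions, not a tested solution for the full data set: it uses only the first GN line of a record, expects that line to look like "GN Name=...;", and writes tab-separated output with a header row. The function name gn_id_table is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: one output line per record, GN column empty if absent.
# Assumption: records end with a line starting with "//".
gn_id_table() {
    awk '
        BEGIN   { OFS = "\t"; print "GN", "ID" }     # header row
        /^ID/   { id = $2 }                          # entry name of this record
        /^GN/ && gn == "" {                          # first GN line only
            gn = $2
            sub(/^Name=/, "", gn)                    # drop the Name= prefix
            sub(/;.*$/, "", gn)                      # drop the trailing semicolon
        }
        /^\/\// {                                    # end of record: emit and reset
            if (id != "") print gn, id
            id = ""; gn = ""
        }
        END     { if (id != "") print gn, id }       # last record without a closing //
    ' "$1"
}
```

Called as gn_id_table test.tmp > final.output, a record without a GN line produces a line that starts with a tab followed by the ID, which a spreadsheet program should read as an empty first cell.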