Blocks into table


 
# 1  
Old 12-19-2013

Please help. I have a huge file with blocks of data that I need to convert to a tabular format.

Sample input
Code:
[Term]
id: GO:0000017
name: alpha-glucoside transport
namespace: biological_process
def: "The directed movement of alpha-glucosides into, out of or within a cell, or between cells, by means of some agent such as a transporter or pore. Alpha-glucosides are glycosides in which the sugar group is a glucose residue, and the anomeric carbon of the bond is in an alpha configuration." [GOC:jl, http://www.biochem.purdue.edu/, ISBN:0198506732]
is_a: GO:0042946 ! glucoside transport

[Term]
id: GO:0000018
name: regulation of DNA recombination
namespace: biological_process
def: "Any process that modulates the frequency, rate or extent of DNA recombination, a DNA metabolic process in which a new genotype is formed by reassortment of genes resulting in gene combinations different from those that were present in the parents." [GOC:go_curators, ISBN:0198506732]
subset: gosubset_prok
is_a: GO:0051052 ! regulation of DNA metabolic process
relationship: regulates GO:0006310 ! DNA recombination

Each block starts with [Term] and has the fields id, name, namespace, etc.
Note that the def sometimes carries over to the next line in the input, I would like that to be in a column in the output


I want a table with these four columns:

Code:
id name namespace def

Desired output (tab delimited possibly)
Code:
id name namespace def
GO:0000017 alpha-glucoside transport biological_process "The directed movement of alpha-glucosides into, out of or within a cell, or between cells, by means of some agent such as a transporter or pore. Alpha-glucosides are glycosides in which the sugar group is a glucose residue, and the anomeric carbon of the bond is in an alpha configuration." [GOC:jl, http://www.biochem.purdue.edu/, ISBN:0198506732]
GO:0000018 regulation of DNA recombination biological_process "Any process that modulates the frequency, rate or extent of DNA recombination, a DNA metabolic process in which a new genotype is formed by reassortment of genes resulting in gene combinations different from those that were present in the parents." [GOC:go_curators, ISBN:0198506732]

# 2  
Old 12-19-2013
With this specification, I'm not sure how we can help you. How are we supposed to determine if a def: field carries over to the next line? The def contents seem to contain quoted and unquoted colons, so how can we guess whether or not the line following a def: line is the start of a new field or a continuation of the previous line?

What does:
Quote:
Note that the def sometimes carries over to the next line in the input, I would like that to be in a column in the output
mean? Do you want a fifth column in your output file that indicates that your input file had a multi-line definition for "def"?

You said your desired output should be
Quote:
tab delimited possibly
but your sample output is space delimited (and has spaces within fields); not tab delimited.
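
The delimiter matters here because the name and def values contain spaces themselves. As a rough illustration (the row below is a made-up, shortened version of the first sample record, and the variable names are only for this sketch), counting fields with and without a tab separator shows the difference:

```shell
# Hypothetical, shortened row joined with real tab characters.
row=$(printf '%s\t%s\t%s\t%s' \
    'GO:0000017' \
    'alpha-glucoside transport' \
    'biological_process' \
    '"The directed movement of alpha-glucosides." [GOC:jl]')

# With tab as the field separator the row has one field per column.
tab_fields=$(printf '%s\n' "$row" | awk -F'\t' '{ print NF }')

# With the default separator awk splits on every run of blanks,
# so the spaces inside name and def create extra fields.
space_fields=$(printf '%s\n' "$row" | awk '{ print NF }')

echo "fields when split on tabs:       $tab_fields"
echo "fields when split on whitespace: $space_fields"
```

Only the tab-separated reading gives the four columns asked for; a space-delimited table could not be split back into id, name, namespace and def.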
# 3  
Old 12-19-2013
Code:
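awk '
     # Strip the leading "tag: " prefix and return the rest of the line.
     # The pattern is anchored, so only the tag is removed even if the
     # value itself contains ": ".
     function p() {
                sub(/^[^:]*: /, "")
                return $0
     }

     # Header row, printed once.
     NR == 1       { print "id", "name", "namespace", "def" }

     /^id:/        { id = p() }
     /^name:/      { name = p() }
     /^namespace:/ { namespace = p() }

     # def is the last wanted field of a block, so print the row here.
     # A def that wraps onto a following line is not joined.
     /^def:/       { def = p(); print id, name, namespace, def }
   ' OFS="\t" file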

This User Gave Thanks to Akshay Hegde For This Post:
# 4  
Old 12-19-2013
Appreciating Don Cragun's comments, especially on columns spilling into the next line, this will do what you asked for (except for fields spilling). You can choose any fields and their order by modifying the Head parameter:
Code:
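awk     '# Print one row (the stored value of each field named in Head), then
         # clear Z so a field missing from the next block is not carried over.
         function prline ()     {for (i=1; i<=n; i++) printf "%s\t", Z[S[i]":"]; printf "\n"; split ("", Z)}

         # First line: turn Head into the tab-separated header and index its fields.
         NR==1          {gsub (FS, OFS, Head)
                         print Head
                         i=n=split (Head,S)
                         while (i>0) SRCH[S[i]":"]=i--
                         next}

         # Store the value of each requested field, minus its "tag: " prefix.
         $1 in SRCH     {Z[$1]=$0; sub (/^[^:]*: /,"", Z[$1])}

         /^\[Term\]/    {prline()}
         END            {prline()}
        ' OFS="\t" Head="name id is_a" file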
name    id    is_a
alpha-glucoside transport    GO:0000017    GO:0042946 ! glucoside transport    
regulation of DNA recombination    GO:0000018    GO:0051052 ! regulation of DNA metabolic process
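
Neither script above joins a def that wraps onto the next line. One possible way to cover that case is sketched below, under an assumption the thread never confirms: every real field starts with a lowercase tag followed by ": " at the beginning of the line, and any other non-blank line inside a [Term] block continues the previous field. The function name blocks_to_table, the variable names, and the shortened sample with an artificially wrapped def are all made up for this illustration.

```shell
# Hypothetical sample: shortened defs, the second one wrapped by hand.
sample='[Term]
id: GO:0000017
name: alpha-glucoside transport
namespace: biological_process
def: "The directed movement of alpha-glucosides." [GOC:jl]
is_a: GO:0042946 ! glucoside transport

[Term]
id: GO:0000018
name: regulation of DNA recombination
namespace: biological_process
def: "Any process that modulates the frequency,
rate or extent of DNA recombination." [GOC:go_curators]
subset: gosubset_prok'

# Reads blocks from the named files (or stdin) and prints a
# tab-delimited table with the columns id, name, namespace, def.
blocks_to_table() {
    awk -v OFS='\t' '
        # Print the collected block, if any, then reset the state.
        function emit_row() {
            if (in_term) print f["id"], f["name"], f["namespace"], f["def"]
            split("", f); in_term = 0; tag = ""
        }
        BEGIN       { print "id", "name", "namespace", "def" }
        /^\[Term\]/ { emit_row(); in_term = 1; next }
        /^\[/       { emit_row(); next }      # any other stanza header
        !in_term    { next }                  # ignore lines outside [Term] blocks
        !NF         { tag = ""; next }        # a blank line ends the current field
        /^[a-z_]+: / {                        # assumed shape of a field line
            tag = $0; sub(/:.*/, "", tag)     # text before the first colon
            val = $0; sub(/^[^:]*: /, "", val)
            f[tag] = val                      # a repeated tag keeps the last value
            next
        }
        tag != ""   { f[tag] = f[tag] " " $0 } # continuation of the previous field
        END         { emit_row() }
    ' "$@"
}

printf '%s\n' "$sample" | blocks_to_table
```

With this approach the wrapped def ends up in a single column, joined by one space. A continuation line that itself happens to begin with something like "word: " would be mistaken for a new field, so the heuristic should be checked against the real file before relying on it.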
