Help with allocated text content based on specific rules...


 
# 1  
Old 05-19-2011

Input file format:
Code:
/tag="ABL"
/note="abl homolog
2
/tag="ABLIM1"
/note="actin binding LIM 1
/tag="ABP1"
/note="amiloride binding protein 1 (amine oxidase (copper-
containing))
/tag="ABR"
/note="active BCR-related
/tag="AC003042.1"
/note="SDR family member 11
precursor
.
.
.

Desired output file:
Code:
/tag="ABL"
/note="abl homolog 2
/tag="ABLIM1"
/note="actin binding LIM 1
/tag="ABP1"
/note="amiloride binding protein 1 (amine oxidase (copper-containing))
/tag="ABR"
/note="active BCR-related
/tag="AC003042.1"
/note="SDR family member 11 precursor
.
.
.

If a line does not start with "/tag" or "/note", I would like its content appended to the end of the preceding "/note" line, according to the following rules:
1. If the "/note" line ends with "-", the following line (one that does not start with "/tag" or "/note") should be appended directly to it.
eg.
Code:
Input:
/note="amiloride binding protein 1 (amine oxidase (copper-
containing))

Desired output:
/note="amiloride binding protein 1 (amine oxidase (copper-containing))

2. If the "/note" line does not end with "-", a space " " should be added before the following line (one that does not start with "/tag" or "/note") is appended to it.
eg.
Code:
Input:
/note="SDR family member 11
precursor

Output:
/note="SDR family member 11 precursor

A solution in any language (awk, sed, perl, etc.) is appreciated.
Thanks in advance for any advice.
# 2  
Old 05-19-2011
Lazy way ...
Code:
tr '\n' '#' <infile | sed 's/#\([^/]\)/\1/g' | tr '#' '\n'

---------- Post updated at 09:33 AM ---------- Previous update was at 09:29 AM ----------

To handle the space (add one, except when the line ends with '-'):

Code:
tr '\n' '#' <tst | sed 's/-#\([^/]\)/-\1/g;s/#\([^/]\)/ \1/g' | tr '#' '\n'

---------- Post updated at 09:34 AM ---------- Previous update was at 09:33 AM ----------

Code:
$ cat tst
/tag="ABL"
/note="abl homolog
2
/tag="ABLIM1"
/note="actin binding LIM 1
/tag="ABP1"
/note="amiloride binding protein 1 (amine oxidase (copper-
containing))
/tag="ABR"
/note="active BCR-related
/tag="AC003042.1"
/note="SDR family member 11
precursor

Code:
$ tr '\n' '#' <tst | sed 's/-#\([^/]\)/-\1/g;s/#\([^/]\)/ \1/g' | tr '#' '\n'
/tag="ABL"
/note="abl homolog 2
/tag="ABLIM1"
/note="actin binding LIM 1
/tag="ABP1"
/note="amiloride binding protein 1 (amine oxidase (copper-containing))
/tag="ABR"
/note="active BCR-related
/tag="AC003042.1"
/note="SDR family member 11 precursor

$

---------- Post updated at 09:37 AM ---------- Previous update was at 09:34 AM ----------

This can maybe be shortened a bit (assuming a line ending in "-" is never directly followed by a line starting with "/"):

Code:
tr '\n' '#' <inputfile | sed 's/-#/-/g;s/#\([^/]\)/ \1/g' | tr '#' '\n'
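
For comparison, here is a minimal awk sketch of the same two rules from post #1. It does not use a temporary `#` placeholder, so input that already contains `#` is not affected. The function name join_notes is only an illustrative wrapper (it is not from this thread), and the sketch assumes that every line starting with "/" begins a new entry and that everything else is a continuation.

```shell
# Hypothetical wrapper: join continuation lines onto the preceding "/..." line.
join_notes() {
  awk '
    # A line starting with "/" begins a new entry: flush the buffered one.
    /^\// { if (have) print buf; buf = $0; have = 1; next }
    # A continuation line before any "/" line: just start the buffer with it.
    !have { buf = $0; have = 1; next }
    # Rule 1: buffer ends with "-" -> append directly.
    # Rule 2: otherwise -> append with a single space.
    { buf = buf (buf ~ /-$/ ? "" : " ") $0 }
    # Print whatever is still buffered at end of input.
    END { if (have) print buf }
  ' "$@"
}
```

It can be called as `join_notes infile` or at the end of a pipeline. Unlike the tr/sed pipeline, it works line by line, so it should also handle several continuation lines in a row.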

# 3  
Old 05-19-2011
Hi ctsgnb,

Thanks for your reply.
Your "lazy way" worked, but it doesn't follow rule 2.
It gives the following output:
Code:
cat infile:
/note="SDR family member 11
precursor

tr '\n' '#' < infile | sed 's/#\([^/]\)/\1/g' | tr '#' '\n'
/note="SDR family member 11precursor

My desired output is:
Code:
/note="SDR family member 11 precursor

Thanks again.
# 4  
Old 05-19-2011
I have updated my previous post in the meantime; did you try the last suggestion?

---------- Post updated at 10:49 AM ---------- Previous update was at 10:40 AM ----------

Also try:
Code:
sed -e ':a' -e 'N;/^\/.*\n\/.*/{P;D;};s/\(.*\)-\n/\1-/;/^\/.*\n[^/].*/s/\n\([^/]\)/ \1/;p;d' -e 'ta' infile
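
The same N/P/D idea can be spelled out one command per -e, which may be easier to follow. This is a sketch of a variant, not a line-for-line expansion of the command above: it loops back after each join, so several continuation lines in a row are also joined. The function name join_notes_sed is hypothetical, and it assumes continuation lines never start with "/".

```shell
# Hypothetical wrapper around a sed N/P/D loop.
join_notes_sed() {
  # :a                 loop label
  # $!N                append the next input line, unless already on the last one
  # /-\n[^/]/{...}     rule 1: "-" before the newline, next line is not a "/" line:
  #                    delete the newline and loop again
  # /\n[^/]/{...}      rule 2: otherwise, next line is not a "/" line:
  #                    turn the newline into a space and loop again
  # P; D               print the first line, drop it, restart with the remainder
  sed -e ':a' \
      -e '$!N' \
      -e '/-\n[^/]/{' -e 's/\n//' -e 'ba' -e '}' \
      -e '/\n[^/]/{' -e 's/\n/ /' -e 'ba' -e '}' \
      -e 'P' -e 'D' "$@"
}
```

Usage is `join_notes_sed infile`, or reading from standard input when no file is given.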

# 5  
Old 05-19-2011
Hi ctsgnb,

Thanks a lot.
It worked.