Add markup tag and sequential number after specific line


 
# 1  
Old 12-29-2012

Hello,

This one has me a bit stumped. I have data that looks like,
Code:
M  END
>  <PREDICTION_ACCURACY>
PROBABLE

>  <NO_OF_PARENTS>
3

>  <CLOGP>
-13.373

>  <SMILES>
OCC(O)C(OC1OC(CO)C(OC2OC(CO)C

>  <MIMW>
1006.322419888

>  <FORMULA>
C36H62O32

$$$$

...
...
other lines
...
...

M  END
>  <PREDICTION_ACCURACY>
PROBABLE

>  <NO_OF_PARENTS>
3

>  <CLOGP>
-12.969

>  <SMILES>
OCC(O)C(O)C(OC1OC(CO)C(OC2OC(CO)C(OC3OC(CO)C

>  <MIMW>
992.34315533

>  <FORMULA>
C36H64O31

$$$$

etc

What I need to do is add a new markup tag after each "M  END" line. The tag I need to add is,

> <id>

followed by a sequential number (starting with 1), followed by a blank line. The resulting text should look like,
Code:
M  END
>  <id>
1

>  <PREDICTION_ACCURACY>
PROBABLE

>  <NO_OF_PARENTS>
3

>  <CLOGP>
-13.373

>  <SMILES>
OCC(O)C(OC1OC(CO)C(OC2OC(CO)C

>  <MIMW>
1006.322419888

>  <FORMULA>
C36H62O32

$$$$

...
...
other lines
...
...

M  END
>  <id>
2

>  <PREDICTION_ACCURACY>
PROBABLE

>  <NO_OF_PARENTS>
3

>  <CLOGP>
-12.969

>  <SMILES>
OCC(O)C(O)C(OC1OC(CO)C(OC2OC(CO)C(OC3OC(CO)C

>  <MIMW>
992.34315533

>  <FORMULA>
C36H64O31

$$$$

etc...

This seems straightforward enough, but there is a bit too much multi-line work for what I know, and I'm not sure how to create the sequential number, so suggestions would be greatly appreciated. This is a large file (~1GB), so I don't know if that makes a difference.

LMHmedchem
# 2  
Old 12-29-2012
It's still a single-line problem; all that's multi-line is the output. And awk lets you print as many lines as you want.

Code:
awk '/M  END/ { print $0"\n> <id>\n"++N"\n"; next } 1' inputfile > outputfile

# 3  
Old 12-29-2012
Thanks, that worked like a charm. I was thinking of doing this in sed, but I couldn't find any examples.

If I read this right,
awk '/M  END/ { print $0"\n> <id>\n"++N"\n"; next } 1' inputfile > outputfile

it looks for M  END and then prints the M  END line ($0), followed by newline, > <id>, newline, an incremented integer, and newline. The next tells it to keep looking for M  END and the 1 is the start value for N. Is that right?

Is N an implicit integer for awk?

LMHmedchem
# 4  
Old 12-29-2012
N is an ordinary variable name. X, Q, or SLARTIBARTFAST would also work.

++N is a pre-increment operator -- add one to the variable, then get its value. An unused variable in awk is blank, but blank plus one does the right thing, so it works. If I'd used N++ instead, the first match would be numbered 0 instead of 1, since it gets the value and adds after.
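
A quick way to see the difference for yourself, as a small sketch (A and B are arbitrary variable names, and the three "x" lines are just dummy input):

```shell
# Pre-increment vs. post-increment on awk variables that were never set.
# Each awk call reads three dummy lines; paste joins the results on one line.
pre=$(printf 'x\nx\nx\n' | awk '{ print ++A }' | paste -s -d ' ' -)
post=$(printf 'x\nx\nx\n' | awk '{ print B++ }' | paste -s -d ' ' -)

echo "++A gives: $pre"    # add first, then use the value: 1 2 3
echo "B++ gives: $post"   # use the value first, then add: 0 1 2
```

So with ++N the numbering starts at 1, which is what was asked for.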

In detail:

Code:
# /M  END/ { ... }  Run the code block for every line matching the regex.
# print ...         Print the whole line ($0), the line "> <id>", the variable N after adding one, and a newline.
#                   An extra newline will be added by the print command itself, which gives the blank line.
# next              Skip to the next line, since we don't need the current line printed again.
# 1                 An always-true pattern with no code block, so all other lines are printed unchanged.
#                   It is not a start value for N.
awk '/M  END/ { print $0"\n> <id>\n"++N"\n"; next } 1' inputfile > outputfile
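
If you want to check the behaviour on something smaller than a 1GB file first, here is a rough sketch. The file names sample.sdf and tagged.sdf are made up for the test, and the sample is a cut-down version of the data from the first post:

```shell
# Build a tiny two-record sample (hypothetical file name, trimmed-down data).
printf '%s\n' 'M  END' '>  <CLOGP>' '-13.373' '' '$$$$' \
              'M  END' '>  <CLOGP>' '-12.969' '' '$$$$' > sample.sdf

# The command from this thread: tag each "M  END" line, pass everything else through.
awk '/M  END/ { print $0"\n> <id>\n"++N"\n"; next } 1' sample.sdf > tagged.sdf

# Show the result.
cat tagged.sdf
```

Two things to keep in mind. The command writes the tag as "> <id>" with one space, while the existing tags in the sample data have two spaces after the ">", so if whatever reads the file cares about that, change the string to ">  <id>". As for the file size, awk reads and writes one line at a time, so memory use should not grow with the size of the input.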
