Help with data re-arrangement problem facing


 
# 1  
Old 12-13-2011

Input file:
Code:
<symbol>Q9Y8G1</symbol>
<name>Q9Y8G1_EMENI</name>

<symbol>Q6V953</symbol>
<symbol>Q5B8K1</symbol>
<name>Q6V953_EMENI</name>

<symbol>G1A416</symbol>
<name>G1A416_9FUNG</name>

<symbol>D4N894</symbol>
<name>D4N894_PLEER</name>

<symbol>B0FFU4</symbol>
<symbol>B0LF02</symbol>
<symbol>B0LF04</symbol>
<symbol>B0LF05</symbol>
<symbol>B0LF07</symbol>
<symbol>B0LF08</symbol>
<name>B0FFU4_9HYPO</name>
.
.

Desired output file:
Code:
Q9Y8G1    Q9Y8G1_EMENI
Q6V953/Q5B8K1    Q6V953_EMENI
G1A416    G1A416_9FUNG
D4N894    D4N894_PLEER
B0FFU4/B0LF02/B0LF04/B0LF05/B0LF07/B0LF08    B0FFU4_9HYPO
.
.

Conditions for generating the desired output file:
1. The content between "<symbol>" and "</symbol>" goes in column 1 of the output file;
2. If more than one symbol shares one "<name>", join the symbols with "/" to show that they share it;
3. The content between "<name>" and "</name>" goes in column 2 of the output file.

A blank line ("\n") can be treated as the separator between groups of data.

Thanks for any advice.
# 2  
Old 12-13-2011
Try this...
Code:
awk -F"<symbol>|</symbol>|<name>|</name>" '/symbol/{x=x?x"/"$2:$2}/name/{print x"\t"$2;x=""}' input_file

--ahamed
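
For readers following along, here is the same idea as the one-liner above, spread over several lines with comments. Treat it as a sketch only: the function name join_symbols is made up for illustration, and the if/else is a spelled-out stand-in for the x=x?x"/"$2:$2 ternary. The -F option makes any of the four tags a field separator, so the text between an opening and closing tag ends up in $2.

```shell
# join_symbols is a hypothetical wrapper name, not part of the original post.
join_symbols() {
    awk -F'<symbol>|</symbol>|<name>|</name>' '
        # On a symbol line, append the value to x, with "/" between values.
        /<symbol>/ {
            if (x != "")
                x = x "/" $2
            else
                x = $2
        }
        # On a name line, print the collected symbols, a tab, and the name,
        # then clear x so the next group starts empty.
        /<name>/ {
            print x "\t" $2
            x = ""
        }
    ' "$@"
}
```

It can be called as join_symbols input_file, or fed from a pipe.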
This User Gave Thanks to ahamed101 For This Post:
# 3  
Old 12-13-2011
here is your code dude...

Code:
cat share.txt
<symbol>Q9Y8G1</symbol>
<name>Q9Y8G1_EMENI</name>

<symbol>Q6V953</symbol>
<symbol>Q5B8K1</symbol>
<name>Q6V953_EMENI</name>

<symbol>G1A416</symbol>
<name>G1A416_9FUNG</name>

<symbol>D4N894</symbol>
<name>D4N894_PLEER</name>

<symbol>B0FFU4</symbol>
<symbol>B0LF02</symbol>
<symbol>B0LF04</symbol>
<symbol>B0LF05</symbol>
<symbol>B0LF07</symbol>
<symbol>B0LF08</symbol>
<name>B0FFU4_9HYPO</name>

Code:
#!/bin/bash
# Needs bash: [[ ]] is not available in plain sh.

sep=""
while read -r line
do
        if [[ $line == *symbol* ]]
        then
                # Print "/" before every symbol except the first of a group.
                printf '%s' "$sep"
                printf '%s' "$line" | awk -F '[<>]' '{printf "%s", $3}'
                sep="/"
        fi

        if [[ $line == *name* ]]
        then
                printf '         '
                printf '%s\n' "$line" | awk -F '[<>]' '{print $3}'
                sep=""
        fi
done < share.txt

# 4  
Old 12-13-2011
Code:
perl -ne 'if(/<symbol>/../<\/name>/){chomp;if(/symbol/){print "/" if($cnt!=0);s/<\/?symbol>//g;print;$cnt++}if(/name/){s/<\/?name>//g;print "\t$_\n";$cnt=0}}' inputfile

This User Gave Thanks to balajesuri For This Post:
# 5  
Old 12-13-2011
@ahamed..: you are good at this.. :-)... yours is better again...
# 6  
Old 12-13-2011
Alternate awk solution..
Code:
awk -F'[><]' '/symbol/{x=x"/"$3;++n;next}n{print substr(x,2),$3;x=n=""}' inputfile

This User Gave Thanks to michaelrozar17 For This Post:
# 7  
Old 12-13-2011
Thanks, ahamed.
Would you mind explaining your code in a bit more detail?
I'm quite confused by the following part:
Code:
'/symbol/{x=x?x"/"$2:$2}/name/{print x"\t"$2;x=""}'

Apart from that, what if my input file looks like this:
Code:
<name>Q9Y8G1_EMENI</name>
<symbol>Q9Y8G1</symbol>

<name>Q6V953_EMENI</name>
<symbol>Q6V953</symbol>
<symbol>Q5B8K1</symbol>

<name>G1A416_9FUNG</name>
<symbol>G1A416</symbol>

<name>D4N894_PLEER</name>
<symbol>D4N894</symbol>

<name>B0FFU4_9HYPO</name>
<symbol>B0FFU4</symbol>
<symbol>B0LF02</symbol>
<symbol>B0LF04</symbol>
<symbol>B0LF05</symbol>
<symbol>B0LF07</symbol>
<symbol>B0LF08</symbol>

How should I edit your code to generate the following output?
Code:
Q9Y8G1_EMENI   Q9Y8G1
Q6V953_EMENI   Q6V953/Q5B8K1
G1A416_9FUNG   G1A416
D4N894_PLEER   D4N894
B0FFU4_9HYPO   B0FFU4/B0LF02/B0LF04/B0LF05/B0LF07/B0LF08

Thanks for any advice.
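
One possible adaptation for the name-first layout is sketched below. It is not ahamed's own answer, just the same tag-splitting idea rearranged: because the symbols now come after the name, a group can only be printed once the next <name> line (or the end of the file) is reached. The function name name_first is hypothetical, and the tab between the two columns is an assumption carried over from the earlier one-liner.

```shell
# name_first is a hypothetical wrapper name chosen for this sketch.
name_first() {
    awk -F'<symbol>|</symbol>|<name>|</name>' '
        # A name line starts a new group, so flush the previous group first.
        /<name>/ {
            if (name != "")
                print name "\t" syms
            name = $2
            syms = ""
            next
        }
        # Collect the symbols of the current group, joined with "/".
        /<symbol>/ {
            syms = (syms == "" ? $2 : syms "/" $2)
        }
        # The last group has no following name line, so print it at the end.
        END {
            if (name != "")
                print name "\t" syms
        }
    ' "$@"
}
```

It can be called as name_first input_file.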