Read tags in text file


 
# 1  
Old 07-20-2010

Hello Team,

I am writing a script that reads a text file (say 1.txt) whose columns are Number, State and Label, e.g. 2 s2 a+bb. It has data such as:
Code:
2 s2 a+bb
3 s3 a+bb
4 s4 a+bb

There is another text file (say 2.txt) with sample data such as:
Code:
~x "a+bb"
<BEGIN>
<TOTAL> 3
<STATE> 1
~y "S_2"
<STATE> 2
~y "S_6"
<STATE> 3
~y "S_4"
~z "Z_t"
<END>
~y "S_2"
<FIRST> 4
 5.66 5.66 6.66 7.33
<SECOND> 4
 1.23 4.55 4.55 4.55
~y "S_6"
<FIRST> 4
 5.66 5.66 6.66 7.33
<SECOND> 4
 1.23 4.55 4.55 4.55
~y "S_4"
<FIRST> 4
 5.66 5.66 6.66 7.33
<SECOND> 4
 1.23 4.55 4.55 4.55

For each line of 1.txt, my script should search 2.txt for the label "a+bb" (an exact match, not a pattern that would also match labels like a+bb+c), go to <STATE> 2 (the s2 in column 2) and read that state's <FIRST> and <SECOND> tags 2 times (the count in column 1), giving <FIRST><SECOND><FIRST><SECOND>, i.e. 5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55,5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55. The values should all be comma separated, like an array, which I will use later in my program.
After this, it reads the second line of 1.txt (3 s3 a+bb), searches for the label "a+bb" again, reads <STATE> 3 (from the s3 in 1.txt) and appends <FIRST><SECOND> 3 times (from the 3 in column 1 of 1.txt) to the previous array. This repeats until every line of 1.txt has been processed.

I am very much stuck on this part of my program. If anyone can help me out, I shall be very thankful.

Thanks.

Last edited by radoulov; 07-20-2010 at 06:47 AM.. Reason: more info; code tags, please!
# 2  
Old 07-20-2010
Could you post an example of the desired output?
# 3  
Old 07-20-2010
For 1.txt my output should be:
Code:
5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55,5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55,5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55,5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55,5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55,5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55,5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55,5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55,5.66,5.66,6.66,7.33,1.23,4.55,4.55,4.55

That can be interpreted as 2 times state s2 of a+bb in 2.txt, followed by 3 times state s3 of a+bb and 4 times state s4 of a+bb. I hope I have expressed this clearly.
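
The way each line of 1.txt is read can be sketched as below. This is only an illustration, assuming the three columns are separated by whitespace as in the sample; the variable names lines and plan are made up for the example.

```shell
# Sample 1.txt lines from the first post.
lines='2 s2 a+bb
3 s3 a+bb
4 s4 a+bb'

# Column 1 is the repeat count, column 2 holds the state index after the "s",
# and column 3 is the label.
plan=$(printf '%s\n' "$lines" |
  awk '{ printf "%dx state %d of %s\n", $1, substr($2, 2), $3 }')
printf '%s\n' "$plan"
```

Each printed line describes one block of values to append: how many times, which state, and for which label.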
# 4  
Old 07-20-2010
Your sample data includes a single label. If the label is important, you should post a bigger sample of your data, so that we can understand how the different labels should be treated.
# 5  
Old 07-20-2010
I am pasting a bigger sample of my data file (2.txt); the original file is about 5 MB.
Code:
~b "ST_ah_2_2"
<FIRST> 
 6.42 2.53
<SECOND> 2
 1.8 6.29
~b "ST_ah_3_6"
<FIRST> 2
 6.61 1.02
<SECOND> 2
 1.51 6.33
~b "ST_ah_4_9"
<FIRST> 2
 6.33 1.02
<SECOND> 2
 2.61 2.42
~b "ST_ih_2_2"
<FIRST> 2
 6.66 1.01
<SECOND> 2
 2.01 1.08
~b "ST_ih_3_3"
<FIRST> 2
 6.63 1.20
<SECOND> 2
 2.29 1.02
 ~b "ST_ih_4_4"
<FIRST> 2
 6.87 9.01
<SECOND> 2
 3.45 4.06
~b "ST_er_2_5"
<FIRST> 2
 6.89 1.20
<SECOND> 2
 2.16 4.22
~b "ST_er_3_5"
<FIRST> 2
 6.01 9.20
<SECOND> 2
 6.16 4.22
 ~b "ST_er_4_5"
<FIRST> 2
 6.89 1.20
<SECOND> 2
 6.36 2.42
~a "aa-ah"
<BEGIN>
<STATES> 3
<STATE> 1
~b "ST_ah_2_2"
<STATE> 2
~b "ST_ah_3_6"
<STATE> 3
~b "ST_ah_4_9"
~z "Z_ah"
<END>
~a "iy-ih"
<BEGIN>
<STATES> 3
<STATE> 1
~b "ST_ih_2_2"
<STATE> 2
~b "ST_ih_3_3"
<STATE> 3
~b "ST_ih_4_4"
~z "Z_ih"
<END>
~a "ey+er"
<BEGIN>
<STATES> 3
<STATE> 1
~b "ST_er_2_5"
<STATE> 2
~b "ST_er_3_5"
<STATE> 3
~b "ST_er_4_5"
~z "Z_er"
<END>

If my 1.txt happened to be like this:
Code:
2 s1 ey+er
1 s2 ey+er
1 s3 ey+er
1 s1 iy-ih
1 s2 iy-ih
2 s3 iy-ih

I am expecting output as:
Code:
6.89 1.20 2.16 4.22 6.89 1.20 2.16 4.22 6.01 9.20 6.16 4.22 6.89 1.20 6.36 2.42 6.66 1.01 2.01 1.08 6.63 1.20 2.29 1.02 6.87 9.01 3.45 4.06 6.87 9.01 3.45 4.06

The output can be understood as follows:
1. Read 1.txt. Line 1 is 2 s1 ey+er, i.e. 2 times <STATE> 1 of label ey+er in 2.txt.
2. Search for "ey+er" in 2.txt (an exact match, not a pattern, as there might be labels like "ey+er+t" etc.). Go to tag <STATE> 1. Then combine the elements of the <FIRST> and <SECOND> tags, i.e. 6.89 1.20 2.16 4.22. As column 1 in 1.txt is 2, write this 2 times to a file, i.e.
Code:
6.89 1.20 2.16 4.22 6.89 1.20 2.16 4.22

3. Read the 2nd line of 1.txt, i.e. 1 s2 ey+er.
4. Search for <STATE> 2 (i.e. s2) of label ey+er in 2.txt. Combine the elements of the <FIRST> and <SECOND> tags, i.e.
Code:
6.01 9.20 6.16 4.22

As column 1 of 1.txt is 1, I append this only once to the new text file.
Now the combined output is:
Code:
6.89 1.20 2.16 4.22 6.89 1.20 2.16 4.22 6.01 9.20 6.16 4.22

I need to repeat this process until 1.txt is finished.
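
The exact-match requirement in step 2 can be sketched as below. In a regular expression "+" is a repetition operator, so a pattern such as /ey+er/ would match "eyyer" but not the literal label; comparing the field as a plain string avoids that. The sample data and the names data, want and match are assumptions made for the illustration.

```shell
# Hypothetical stand-in for 2.txt, including two look-alike labels.
data='~a "ey+er+t"
~a "ey+er"
~a "eyyer"'

# Compare the second field as a string, so "+" is taken literally.
match=$(printf '%s\n' "$data" |
  awk -v want='"ey+er"' '$1 == "~a" && $2 == want { print NR ": " $2 }')
printf '%s\n' "$match"
```

Only the second line is reported, because neither "ey+er+t" nor "eyyer" is equal to the wanted label.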

Last edited by AKD; 07-20-2010 at 10:10 AM.. Reason: output
# 6  
Old 07-20-2010
Something like this:

Code:
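awk '
NR == FNR {                          # first file (2.txt)
  if ($1 == "~a") label = $2         # a label block begins
  else if ($1 == "<END>") label = "" # ... and ends
  else if ($1 == "~b") {
    state_name = $2
    if (label != "")                 # inside a label block: keep state order
      state_names[label] = (label in state_names) ? \
        state_names[label] SUBSEP $2 : $2
  }
  else if (/^ *[0-9.]/) {            # a line of values for the current state
    sub(/^  */, "")
    state_values[state_name] = (state_name in state_values) ? \
      state_values[state_name] FS $0 : $0
  }
  next
}
{                                    # second file (1.txt): count, sN, label
  split(state_names[qq $3 qq], t, SUBSEP)
  for (i = 1; i <= $1; i++)
    printf "%s ", state_values[t[substr($2, 2)]]
}
END { print "" }' qq='"' 2.txt 1.txt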
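
The first post asked for comma-separated values, while the output here is separated by spaces. One possible way to convert it is sketched below; the sample string stands in for the real output and the names values and csv are only illustrative.

```shell
# Hypothetical sample of the space-separated output (note the trailing space).
values='6.89 1.20 2.16 4.22 6.89 1.20 2.16 4.22 '

# Drop trailing spaces, then turn each run of spaces into a single comma.
csv=$(printf '%s\n' "$values" | sed 's/ *$//; s/  */,/g')
printf '%s\n' "$csv"
```

The same sed command could be placed after a pipe at the end of the awk command.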

# 7  
Old 07-20-2010
After seeing radoulov's code, mine is useless. :)

removed

Last edited by rdcwayx; 07-20-2010 at 11:01 AM..