File parsing in Unix


 
# 1  
Old 01-11-2011
File parsing in Unix

I have a large file with content like the following.

LOCATION A B C 1

Line 3
Line 4



*Line 8
**Line 9


TAG END


LOCATION A B" C 3

Line 33
Line 34



Line 38

*Line 40
**Line 41


TAG END



LOCATION A' B C 4


Line 51
Line 52

*Line 54
**Line 55


TAG END



LOCATION A B C 5


Line 51
Line 52

*Line 54
**Line 55


TAG END

Now I need to convert this file to the format below:

LOC A B C 1

Line 3
Line 4



*Line 8
**Line 9


TAG END


LOC A B`` C 3

Line 33
Line 34



Line 38

*Line 40
**Line 41


TAG END



LOC A` B C 4


Line 51
Line 52

*Line 54
**Line 55


TAG END



LOC A B C 5


Line 51
Line 52

*Line 54
**Line 55


TAG END

In words: my large file contains several tags, each of which starts with a location name (e.g., LOC A B) and ends with TAG END (the string "TAG END" is fixed). The location name may contain single quotes (') or double quotes ("). All I have to do is find the single and double quotes in every location name and replace them with backquotes (one backquote for a single quote, two backquotes for a double quote).
Note 1: One tag (from the location name through TAG END) may contain any number of lines.
Note 2: There may be one, two, three or more blank lines between TAG END and the next location name.
Note 3: I must not alter any single or double quote inside the tag, only those in the location name.

One idea is to look for the string "TAG END" and take the next non-blank line, which must be a location name. The only exception is the first non-blank line of the file, which is itself a location name. To make this uniform, we can add a "TAG END" line at the top of the file and remove it from the output at the end.

Any other suggestion is welcome. Please help me in achieving this.

Thanks in advance..
Rinku
# 2  
Old 01-11-2011
Code:
sed "/LOC/{s:':\`:g;s:\":\`\`:g;}" infile

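should work. If the single quote gives you trouble, put it in a bracket expression rather than escaping it with a backslash (in GNU sed, \' is an anchor for the end of the pattern space, not a literal quote):
Code:
sed "/LOC/{s:[']:\`:g;s:[\"]:\`\`:g;}" infile

or
Code:
sed "/LOC/s:[']:\`:g;/LOC/s:[\"]:\`\`:g" infile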


Last edited by ctsgnb; 01-11-2011 at 12:27 PM..
# 3  
Old 01-12-2011
Hi

Thanks for the response, but it does not solve the problem. As I said, the location name is not fixed; LOC A B is just an example of a location name and does not mean that every name starts with LOC. Sorry for the confusion.

Thanks,
Rinku
# 4  
Old 01-12-2011
Copy and paste a representative sample of your file's content, then.
We cannot help parse a file if we don't know its structure.
# 5  
Old 01-12-2011
Code:
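awk 'BEGIN{RS=ORS="TAG END";FS=OFS="\n"} /[^[:space:]]/{for(i=1;i<=NF;i++)if($i~/[^[:space:]]/){gsub(/\047/,"`",$i);gsub(/\042/,"``",$i);break};print}' file

(This needs an awk that accepts a multi-character RS, such as gawk or mawk. The first non-blank line of each record is treated as the location name, so it does not have to start with any particular word.)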
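
The idea from the first post (the first non-blank line after each TAG END is a location name) can also be sketched line by line in plain awk, without a multi-character RS and without assuming the location starts with a fixed keyword. This is only a sketch under the assumptions stated in the thread: every tag is closed by a line beginning with "TAG END", and the first non-blank line of the file is a location name. The function name fix_location_quotes and the variable in_tag are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rewrite quotes only on location-name lines.
# Assumes every tag ends with a line that begins with "TAG END".
fix_location_quotes() {
    awk '
        # Outside a tag, the first non-blank line is the location name.
        !in_tag && /[^[:space:]]/ {
            gsub(/\047/, "`")     # single quote -> one backquote
            gsub(/\042/, "``")    # double quote -> two backquotes
            in_tag = 1
        }
        # "TAG END" closes the current tag; the next non-blank line
        # will be treated as a location name again.
        /^TAG END/ { in_tag = 0 }
        # Print every line, changed or not.
        { print }
    ' "$@"
}
```

It could be used as `fix_location_quotes infile > outfile`. Because in_tag starts out unset, the first non-blank line of the file is handled like any other location name, so there is no need to add a dummy "TAG END" line at the top.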
