awk and file parsing
Hi, I have an input file like this:

    TH2TH2867Y NOW33332106Yo You Baby
    TH2TH3867Y NOW33332106No Way Out
    TH2TH9867Y NOW33332106Can't find it
    TJ2TJ2872N WOW33332017sure thing alas
    TJ2TJ3872N WOW33332017the sky rocks
    TJ2TJ4872N WOW33332017nothing else matters
    TJ2TJ5872N WOW33332017you know about it
    TJ2TJ6872N WOW33331999nothing else matters
    TJ2TJ7872N WOW33332017nothing else matters
    TJ2TJ8872N WOW33332017No Way Out
    TJ2TAW872N WOW33331999No Way Out
    TJAPXC050Y NOW33331999No Way Out.
    TJAT1N999Y NOW33331999still loving you.
    TJBJOG575Y NOW33331999Jacka nd jill.
    TJBJXG575Y NOW33331999Julie and friend

I am trying to get output something like this:

    Yo You Baby|TH2
    No Way Out|TJ2
    still loving you.|TJA
    You got it|TJB

Here TH2, TJ2, TJA and TJB are the distinct first 3 characters from the input.

In the input, let's say fr=substr($0,1,3) and nx=substr($0,4,3). Basically, I want to check each line: if the first 3 characters (fr) equal the next 3 characters (nx), then print substr($0,23,20) and substr($0,1,3). If they don't match, then print the first occurrence of the fr with its substr($0,23,20).

Help!

Regards,
Big Gun
I did try this:

    awk 'BEGIN{OFS="|"}{fr=substr($0,1,3);nx=substr($0,4,3); if (fr == nx) print substr($0,23,20),fr}' inputfile | nawk 'BEGIN{FS="|";OFS="|"}{ sub(/[ \t]*$/, "",$1);print $1,$2}'

But this fails to print anything for the lines where fr and nx don't match. In my example above:

    TJAPXC050Y NOW33331999still loving you.
    TJAT1N999Y NOW33331999still loving you.

I should be getting:

    NOW33331999still loving you.|TJA
How can I also print the output below

    NOW33331999still loving you.|TJA

from the input

    TJAPXC050Y NOW33331999still loving you.
    TJAT1N999Y NOW33331999still loving you.

In this case fr is not equal to nx, so I would like to print the first occurrence of the line.
Posting with rephrased problem statement.
Hi, I have an input file like this:

    TH2TH2867Y NOW33332106Yo You Baby
    TH2TH3867Y NOW33332106No Way Out
    TH2TH9867Y NOW33332106Can't find it
    TJ2TJ2872N WOW33332017sure thing alas
    TJ2TJ3872N WOW33332017the sky rocks
    TJ2TJ4872N WOW33332017nothing else matters
    TJ2TJ5872N WOW33332017you know about it
    TJ2TJ6872N WOW33331999nothing else matters
    TJ2TJ7872N WOW33332017nothing else matters
    TJ2TJ8872N WOW33332017No Way Out
    TJ2TAW872N WOW33331999No Way Out
    TJAPXC050Y NOW33331999No Way Out.
    TJAT1N999Y NOW33331999still loving you.
    TJBJOG575Y NOW33331999Jacka nd jill.
    TJBJXG575Y NOW33331999Julie and friend

I am trying to get output something like this:

    Yo You Baby|TH2
    sure thing alas|TJ2
    No Way Out.|TJA
    Jacka nd jill|TJB

Here TH2, TJ2, TJA and TJB are the distinct first 3 characters from the input.

In the input, let's say fr=substr($0,1,3) and nx=substr($0,4,3). Basically, I want to check each line: if the first 3 characters (fr) equal the next 3 characters (nx), then print substr($0,23,20) and substr($0,1,3). If they don't match, then print the first occurrence of the fr with its associated substr($0,23,20).

I started doing something like this:

    awk 'BEGIN{OFS="|"}{fr=substr($0,1,3);nx=substr($0,4,3); if (fr == nx) print substr($0,23,20),fr}' inputfile | nawk 'BEGIN{FS="|";OFS="|"}{ sub(/[ \t]*$/, "",$1);print $1,$2}'

But this fails to print anything for the groups where fr never matches nx. In my example above, these are:

    TJAPXC050Y NOW33331999No Way Out.
    TJAT1N999Y NOW33331999still loving you.
    TJBJOG575Y NOW33331999Jacka nd jill.
    TJBJXG575Y NOW33331999Julie and friend

But I would like to get the result below too (the first occurrence of the fr and its substr):

    No Way Out.|TJA
    Jacka nd jill|TJB

Help!

Regards,
Big Gun
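One possible way to get both cases in a single awk pass, offered as a sketch rather than a definitive answer: remember the first text seen for each fr, separately remember the text of the first line where fr equals nx, and print one line per fr at the end, preferring the matching line. The wrapper name parse_groups and the array names first, matched and order are made up for illustration. It assumes the fixed column layout from the post (text starting at column 23), and the trailing-blank trim mirrors the sub() call in the attempt above.

```shell
# parse_groups: hypothetical helper name, not from the original thread.
# Reads the named files (or stdin) and prints "text|fr" once per distinct fr.
parse_groups() {
  awk '
    {
      fr  = substr($0, 1, 3)      # first 3 characters
      nx  = substr($0, 4, 3)      # next 3 characters
      txt = substr($0, 23, 20)    # text field, assumed to start at column 23
      sub(/[ \t]+$/, "", txt)     # trim trailing blanks

      # Remember the first line seen for each fr, and the order fr values appear in.
      if (!(fr in first)) {
        first[fr] = txt
        order[++n] = fr
      }
      # Remember the first line where fr equals nx, if there is one.
      if (fr == nx && !(fr in matched))
        matched[fr] = txt
    }
    END {
      # One output line per fr: the matching line if any, else the first occurrence.
      for (i = 1; i <= n; i++) {
        fr  = order[i]
        out = (fr in matched) ? matched[fr] : first[fr]
        print out "|" fr
      }
    }
  ' "$@"
}
```

Usage would be something like `parse_groups inputfile`. Because the output is only printed in the END block, a matching line is still picked up even if it appears after non-matching lines for the same fr. Note that with the sample input the TJB line would keep its trailing period ("Jacka nd jill.|TJB"), since that is what is in columns 23 onward.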