The UNIX and Linux Forums  

#1  06-14-2006  Registered User
Help with Data Positioning from Columns in a flat file.

Hi All,
I have used this forum many times to solve my scripting problems. This time, I would like some help with a problem that I've been scratching my head over quite a bit.

My Example:
I am converting a 2000-byte file into a 300-byte file.
This file has no delimiters and hardly any spaces in between the cols.
I have successfully converted most of the 2000-byte cols into the 300-byte file format using awk's substr function, but I am having trouble getting one col of the 2000-byte file to space properly in the 300-byte file.
substr would work perfectly if the data were positioned the same way in every record.

My dilemma:
The 2000-byte column starts at position 150 and ends at 180 (30 bytes).
This col contains city, state, and zip code info,
and the positioning of these items varies per record:
example record#1 anytown us 11111
example record#2 any town us 11111
example record#3 anytown us11111
etc,

The requirements:
I have to place the city info into positions 159-181 of the 300-byte file,
the state info into positions 182-183,
and the zip info into positions 184-188.

I'm not sure what would work best, since I have to keep the integrity of each record intact while manipulating this col, without trashing anything else. I would appreciate any ideas.
Thanks.

#2  06-14-2006  vgersh99 (Moderator)
1. Do all of your 'columns' start at the same position?
2. Are all the columns of the form
Code:
<town name> US 5numberZIP
3. Is the string US always the same, or is it the 2-letter state abbreviation?
4. A couple of sample records/lines would help to see the 'pattern'.

#3  06-14-2006  Registered User
Hi vgersh99,

1) Yes, it all starts at the first position, but the positions are not fixed after that, though the data always follows this pattern: city, then state, then zip.

2) Yes, it's that format, but the items are positioned differently in each record.

3) US was just an example; it can be any state abbreviation (TX, CA, etc.).

4) Examples:

BIG COVE TANNRY PA 17212
WEST CHESTER PA 19382
PEARL RIVER NY10965
JAMAICA NY 11432


I need it to look like this:
Pos 159-181            Pos 182-188

BIG COVE TANNRY        PA17212
WEST CHESTER           PA19382
PEARL RIVER            NY10965
JAMAICA                NY11432
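
The target layout above (city in 159-181, state in 182-183, zip in 184-188) works out to field widths of 23, 2 and 5 bytes. Once the three pieces have been separated, awk's printf can left-justify and pad each one to its width. A minimal sketch follows; the fmt_csz name and the pipe-delimited input are assumptions made for illustration, not something from the thread:

```shell
# Hypothetical helper: reads "city|state|zip" lines on stdin and writes each
# one as a 30-byte fixed-width field: city (23), state (2), zip (5).
fmt_csz() {
  awk -F'|' '{
    # %-23s left-justifies the city and pads it with spaces to 23 bytes
    printf("%-23s%-2s%-5s\n", $1, $2, $3)
  }'
}

# Example call
printf '%s\n' 'JAMAICA|NY|11432' | fmt_csz
```

A city longer than 23 bytes would not be truncated by this format string and would push the state and zip to the right, so a real conversion would need to decide how to handle that case.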

#4  06-14-2006  vgersh99 (Moderator)
Here's an idea to start with.

It assumes your 'column' starts at position 10; change FLRstart in the code to reflect your real data.

Run it with:
nawk -f oo.awk oo.txt

oo.txt:
Code:
a;skldfj BIG COVE TANNRY PA17212 sakdlfjaslkdf
zxm,cvn  WEST CHESTER PA19382 a;sldkfja;skldj asdlk
a;skldfj PEARL RIVER NY10965 127 asdkfj asdlfkj
wut sdf  JAMAICA NY11432  sd sdlkfj 0938
oo.awk:
Code:
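BEGIN {
  FLRstart=10    # 1-based position where the city/state/zip 'column' starts
  # whole column: anything, a 2-letter state, anything, a 5-digit zip
  PAT=".*[A-Z][A-Z].*[0-9][0-9][0-9][0-9][0-9]"
  # tail of the column: state, optional spaces, zip
  PATstate="[A-Z][A-Z][ ]*[0-9][0-9][0-9][0-9][0-9]$"
}

{
   _sub=substr($0, FLRstart)
   #printf("_sub->[%s]\n", _sub)

   # skip records that do not contain a state + zip at all
   if( !match(_sub, PAT) ) {
     printf("no match->[%s]\n", _sub)
     next
   }

   # PAT is greedy, so the match runs through the LAST 5-digit run in _sub
   _column=substr(_sub, 1, RSTART+RLENGTH-1)
   printf("_column->[%s]\n", _column)

   # the zip is the last 5 characters of the column
   zip=substr(_column, length(_column)-4)

   # reset per record so a failed match cannot reuse the previous state
   state=""
   town=_column
   if (match(_column, PATstate) ) {
     state=substr(_column, RSTART, 2)
     town=substr(_column, 1, RSTART-1)
     sub(/[ ]+$/, "", town)    # drop the space(s) between town and state
   }

   printf("town->[%s]\n", town)
   printf("state->[%s]\n", state)
   printf("zip->[%s]\n", zip)

}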
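
Since the first post describes the column as a fixed 30-byte slice, another option is to cut exactly that slice first and then split it from the right-hand end, so that digits elsewhere in the record cannot interfere. The sketch below assumes the slice is space-padded and holds only uppercase city, state and zip data, unlike the oo.txt samples, which have free text right after the zip. The split_csz name, its two arguments and the pipe-delimited output are all hypothetical:

```shell
# Hypothetical helper: split_csz START WIDTH
# Reads records on stdin, cuts the fixed-width column, and prints
# "city|state|zip" for each record (or "NOMATCH|<column>" if it cannot).
split_csz() {
  awk -v start="$1" -v width="$2" '
  {
    col = substr($0, start, width)
    sub(/[ ]+$/, "", col)                 # drop trailing padding
    # state, optional spaces, 5-digit zip, anchored at the end of the column
    if (match(col, /[A-Z][A-Z][ ]*[0-9][0-9][0-9][0-9][0-9]$/)) {
      state = substr(col, RSTART, 2)
      zip   = substr(col, length(col) - 4)
      city  = substr(col, 1, RSTART - 1)
      sub(/[ ]+$/, "", city)              # drop spaces before the state
      print city "|" state "|" zip
    } else {
      print "NOMATCH|" col
    }
  }'
}

# Example: a record whose column starts at position 10 and is 30 bytes wide
printf '%-9s%-30s%s\n' 'a;skldfj' 'PEARL RIVER NY 10965' 'tail' | split_csz 10 30
```

Anchoring the pattern at the end of the slice handles both "NY 10965" and "NY10965", and also city names that contain spaces, because only the last two letters before the zip are taken as the state.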