The UNIX and Linux Forums  


#1  RacerX  10-11-2007
AWK Multi-Line Records Processing

I am an awk newbie and cannot wrap my brain around my problem.

Given multi-line records of varying lengths, separated by blank lines, I need to skip the first two lines of every record and extract every other line of the rest. The exception is a record whose first line contains "(CONT)": there I skip the second line as well, and append that record's every-other lines to the previous record's every-other lines.

I hope that makes some sort of sense. I tried the following awk command to get every other line, but it doesn't come out right, so I haven't even begun to figure out the rest of my problem:

awk '(((NR % 2) == 0) && ( NR > 2 )) {print}' ~/Desktop/datafile

Any programming help would be appreciated! Here is example input data with four records:

CHARGER R M 1972 9 3 3 1 $7,060 1570 INDY 13 $27,717 MICKEY E.& OLGA B.SMITH,VIENNA,NC.
LA72TAUR FORD CHEVY GMC 1.57.00Q DAVID R.MILLER,ALDEN,NY.
MD TEST 0321 1371 OFF OFF OFF SONNY SMITH, KASHMAN, DAVE
FEB 20-98 VLY 1041 2094 8 8 8 8 8 8 NB SMITH GW : - : - :LAPS : 8 : 40 : - : - : 1
GD TEST 0311 1354 3H 4H 2H 0304 VIC, YO HERSHEY, CHARGER
JAN 7-98 VLY 1030 2064 6 3 3 4 4 3 2071 NB SMITH GW : - : - :LAPS : 6 : 40 : - : - : 1
$2,000 MD NW2L5CD 0303 1343 1Q 2 2T 0314 WILD MICK, CHARGER, TEKLA
MAR 9-98 VLY $500 1024 2060 5 2 3 4 2 2 2063 1900 SMITH GW : - : - :LAPS : 8 : 36 : - : - : 1
$2,700 GD OPENRUN 0292 1312 2 H 1 0303 CHARGER, HAL THE BARBER, WOLFMAN
MAR 13-98 VLY $1,350 1004 2022 2 3 2 2 1 1 2022 1130 SARAMA GW : - : - :LAPS : 7 : 31 : - : - : 2
$2,700 FT NW2L5CD 0294 1320 Q Q H 0300 WHEELS WIN, CHARGER, ROCK
MAR 27-98 VLY $675 1013 2020 4 1 1 1 1 2 2020 *1750 SMITH GW : - : - :LAPS : 8 : 60 : - : - : 2
$3,000 FT OPEN 0291 1301 1 1 3Q 0293 CHARGER, OVERRUN, ROCK
MAY 1-98 VLY $1,500 0594 1594 2 1 1 1 1 1 1594 *9500 SMITH GW : - : - :LAPS : 7 : 70 : - : - : 9
$4,000 FT OPEN 0263 1280 2Q 1Q T 0283 CHARGER, GUARDIAN, TORRE
MAY 9-98 HILL $2,000 0581 1570 1 6 5 4 2 1 1570 *2200 SMITH GW : - : - :LAPS : 7 : 60 : - : - : 8
$4,400 FT WO4000LT 0292 1320 OFF OFF OFF TORRE, ROCK, TY ZOLAK
MAY 15-98 HILL 1003 2011 7 8 8 8 8 8 *265 SMITH GW : - : - :LAPS : 8 : 75 : - : - : 9
$8,000 FT TM1500CND 0290 1294 OFF OFF BOSTON BEEMER, THE CANNON, ZURICH TOYOTA
MAY 21-98 HILL 0593 2010 7 8 8 8 8 8 9550 SMITH GW : - : - :LAPS : 9 : 46 : - : - : 2

SPARKPLUG BLK M 1964 2 5 5 4 $10,534 2001 HILL 5 $14,926 JOHN DOE,TARPORT,DE.
N764CHVY FORD CHEVY GMC 2.00.10F ELMER SMITH,NY,NY.
$2,700 FT NW4L5CD 0294 1320 Q Q H 0300 WHEELS WIN, CHARGER, ROCK
FEB 22-98 VLY $675 1013 2020 4 1 1 1 1 2 2020 *1750 SMITH GW : - : - :LAPS : 8 : 60 : - : - : 3
$2,700 FT NW4L5CD 0291 1311 1H T LT 0294 HAL THE BARBER, CHARGER, MAC
APR 3-98 VLY $675 1001 2011 3 2 2 3 3 2 2011 *1550 SMITH GW : - : - :LAPS : 6 : 45 : - : - : 5

SPARKPLUG (CONT)
N764CHVY
$2,000 MD NW4L5CD 0303 1343 1Q 2 2T 0314 WILD MICK, CHARGER, TEKLA
MAR 8-99 VLY $500 1024 2060 5 2 3 4 2 2 2063 1900 SMITH GW : - : - :LAPS : 8 : 36 : - : - : 10
$2,700 GD OPENRUN 0292 1312 2 H 1 0303 CHARGER, HAL THE BARBER, WOLFMAN
MAR 13-99 VLY $1,350 1004 2022 2 3 2 2 1 1 2022 1130 SMITH GW : - : - :LAPS : 7 : 31 : - : - : 7
$2,700 FT NW4L5CD 0294 1320 Q Q H 0300 WHEELS WIN, CHARGER, ROCK

DUTCHESS W F 82 21 3 2 4 $10,834 2003 VLY 3 $10,858 TARP INC,VALLEY CITY,CA.
PN82TRCK FORD CHEVY GMC 2.00.30M RICK SMITH,RED CEDAR,ND.
$2,800 MD F-NW2CND 0284 1311 7 8Q CARD SHARK, PHP GIRL, BREEZY BREE
AUG 25-98 RIDC 1011 2011 9 4 3< 3< 7 7 2024 820 MILLER TF : - : - :MILE : 9 : 69 : - : - : 6
#2  awk  10-11-2007
I'm not sure which lines you want to print, but there is a trick with awk: you can reset the value of NR any time you want.

So a program like this (textfile stands for your data file, and /^$/ matches the blank line):

awk 'NR > 2 && (NR % 2 == 0) { print }
/^$/ { NR = 0 }' textfile

got me output that looked like this:

FEB 20-98 VLY 1041 2094 8 8 8 8 8 8 NB SMITH GW : - : - :LAPS : 8 : 40 : - : - : 1
JAN 7-98 VLY 1030 2064 6 3 3 4 4 3 2071 NB SMITH GW : - : - :LAPS : 6 : 40 : - : - : 1
MAR 9-98 VLY $500 1024 2060 5 2 3 4 2 2 2063 1900 SMITH GW : - : - :LAPS : 8 : 36 : - : - : 1
MAR 13-98 VLY $1,350 1004 2022 2 3 2 2 1 1 2022 1130 SARAMA GW : - : - :LAPS : 7 : 31 : - : - : 2
MAR 27-98 VLY $675 1013 2020 4 1 1 1 1 2 2020 *1750 SMITH GW : - : - :LAPS : 8 : 60 : - : - : 2
MAY 1-98 VLY $1,500 0594 1594 2 1 1 1 1 1 1594 *9500 SMITH GW : - : - :LAPS : 7 : 70 : - : - : 9
MAY 9-98 HILL $2,000 0581 1570 1 6 5 4 2 1 1570 *2200 SMITH GW : - : - :LAPS : 7 : 60 : - : - : 8
MAY 15-98 HILL 1003 2011 7 8 8 8 8 8 *265 SMITH GW : - : - :LAPS : 8 : 75 : - : - : 9
MAY 21-98 HILL 0593 2010 7 8 8 8 8 8 9550 SMITH GW : - : - :LAPS : 9 : 46 : - : - : 2
FEB 22-98 VLY $675 1013 2020 4 1 1 1 1 2 2020 *1750 SMITH GW : - : - :LAPS : 8 : 60 : - : - : 3
APR 3-98 VLY $675 1001 2011 3 2 2 3 3 2 2011 *1550 SMITH GW : - : - :LAPS : 6 : 45 : - : - : 5
MAR 8-99 VLY $500 1024 2060 5 2 3 4 2 2 2063 1900 SMITH GW : - : - :LAPS : 8 : 36 : - : - : 10
MAR 13-99 VLY $1,350 1004 2022 2 3 2 2 1 1 2022 1130 SMITH GW : - : - :LAPS : 7 : 31 : - : - : 7

AUG 25-98 RIDC 1011 2011 9 4 3< 3< 7 7 2024 820 MILLER TF : - : - :MILE : 9 : 69 : - : - : 6
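
A note on the output above: the empty line before the AUG 25-98 row is the record separator itself. The SPARKPLUG (CONT) record has seven lines, so the blank line after it arrives as line 8, passes the NR > 2 && NR % 2 == 0 test, and is printed before NR is reset. Below is a minimal sketch of a variant that handles the blank line first and keeps its own per-record counter instead of assigning to NR. The function name every_other, the counter n, and the sample lines are made up for illustration.

```shell
# every_other: print lines 4, 6, 8, ... of each blank-line-separated record.
# The function name is hypothetical; n is a per-record line counter.
every_other() {
    awk '
        /^$/ { n = 0; next }            # separator: restart the count, print nothing
        { n++ }                         # count this line within its record
        n > 2 && n % 2 == 0 { print }   # skip two header lines, then every other line
    ' "$@"
}

# Two made-up records; the first has seven lines, like the (CONT) record.
printf '%s\n' h1 h2 a1 A1 a2 A2 a3 '' h1 h2 b1 B1 | every_other
```

On this sample it prints A1, A2 and B1, with no empty line in between. Called as every_other datafile it reads a file instead of standard input.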
#3  RacerX  10-11-2007
Thanks for the reply! Your solution got exactly the lines I wanted to pick out!

I believe I should be able to solve the rest of my problem on my own, using some type of regex for "(CONT)" on the first line and an if-else statement.

If I can't figure it out, I'll be back with another question.

Thanks again for your help. That trick of resetting NR is a good one to know; I was clueless and kept fiddling with the settings for FS and RS, which was getting me nowhere fast. Your solution is simply elegant.
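
For the "(CONT)" part, here is one possible sketch, not necessarily what was used in the end. It assumes "append" means that the selected lines of a (CONT) record should directly follow those of the previous record, with one empty line separating distinct records in the output. The function name merge_cont and the variables n and groups are hypothetical. It uses index() for a plain substring test; a regex such as /\(CONT\)/ would work the same way.

```shell
# merge_cont: select every other line after the two header lines, and treat a
# record whose first line contains "(CONT)" as part of the previous group.
merge_cont() {
    awk '
        /^$/ { n = 0; next }            # blank line ends the current record
        { n++ }                         # line number within the record
        n == 1 {
            # A first line without "(CONT)" starts a new output group;
            # print a separator before every group except the first one.
            if (index($0, "(CONT)") == 0 && groups++ > 0)
                print ""
        }
        n > 2 && n % 2 == 0 { print }   # same selection rule as before
    ' "$@"
}

# Made-up sample: a record, its continuation, then an unrelated record.
printf '%s\n' 'SPARK hdr' h2 a1 A1 '' 'SPARK (CONT)' h2 a2 A2 '' \
              'DUTCH hdr' h2 b1 B1 | merge_cont
```

On the sample this prints A1 and A2 together, then an empty line, then B1.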
#4  RacerX  10-18-2007
Is there any way to get the info lined up in columns using printf? I've tried a few things, but it never seems to come out right; maybe the data is just too funky to line up?

So, given INPUT like:
FEB 20-98 VLY 1041 2094 8 8 8 8 8 8 NB SMITH GW : - : - :LAPS : 8 : 40 : - : - : 1
JAN 7-98 VLY 1030 2064 6 3 3 4 4 3 2071 NB SMITH GW : - : - :LAPS : 6 : 40 : - : - : 1
MAR 9-98 VLY $500 1024 2060 5 2 3 4 2 2 2063 1900 SMITH GW : - : - :LAPS : 8 : 36 : - : - : 1
MAR 13-98 VLY $1,350 1004 2022 2 3 2 2 1 1 2022 1130 SARAMA GW : - : - :LAPS : 7 : 31 : - : - : 2

can I get OUTPUT like:
Code:
FEB 20-98	VLY			1041	2094	8  8  8  8  8  8		NB    SMITH GW	: - : - :LAPS : 8 : 40 : - : - : 1
JAN 7-98	VLY			1030	2064	6  3  3  4  4  3	2071	NB    SMITH GW	: - : - :LAPS : 6 : 40 : - : - : 1
MAR 9-98	VLY		$500	1024	2060	5  2  3  4  2  2	2063	1900  SMITH GW	: - : - :LAPS : 8 : 36 : - : - : 1
MAR 13-98	VLY		$1,350	1004	2022	2  3  2  2  1  1	2022	1130  SARAMA GW	: - : - :LAPS : 7 : 31 : - : - : 2
MAR 27-98	VLY		$675	1013	2020	4  1  1  1  1  2	2020	*1750 SMITH GW	: - : - :LAPS : 8 : 60 : - : - : 2
#5  awk  10-18-2007
Instead of print, use printf. It lets you specify a format (a mask), followed by the data to print. For example,

printf("%-30s", "MY NAME");

will left-justify the value in a 30-character column; the minus sign means left, and without it the value is right-justified. If you are a C programmer, it follows C's printf conventions. I suggest looking up the online version (free, and in PDF) of "Effective AWK Programming" by Arnold Robbins for more information.
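
A quick way to see what the minus sign in a printf width does is to print the same string both ways. This is a small sketch; the wrapper name justify_demo is made up, the width is 10 rather than 30 to keep the lines short, and the square brackets are only there to make the padding visible.

```shell
# justify_demo: show left- versus right-justification in a 10-character column.
justify_demo() {
    awk 'BEGIN {
        printf("[%-10s]\n", "MY NAME")   # minus flag: left-justified, padded on the right
        printf("[%10s]\n",  "MY NAME")   # no flag: right-justified, padded on the left
    }'
}

justify_demo
```

The first line comes out as [MY NAME   ] and the second as [   MY NAME].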
#6  RacerX  10-18-2007
I've been reading about and testing printf options for a while now, and I am stuck on how to handle the situation above, where one of the fields in a column is blank (whitespace). I tried printf in the code above, using only the first six fields (I want to format the rest too, but six are enough to show the problem):
Code:
NR > 2 && (NR % 2 == 0 ) {printf "%-5s%-8s: %-10s: %-15s: %-10s: %-5s:\n",$1,$2,$3,$4,$5,$6} /^$/{NR=0}

I GET RETURNED:
FEB  20-98   : VLY       : 1041           : 2094      : 8    :
JAN  7-98    : VLY       : 1030           : 2064      : 6    :
MAR  9-98    : VLY       : $500           : 1024      : 2060 :
MAR  13-98   : VLY       : $1,350         : 1004      : 2022 :
MAR  27-98   : VLY       : $675           : 1013      : 2020 :
MAY  1-98    : VLY       : $1,500         : 0594      : 1594 :
MAY  9-98    : HILL      : $2,000         : 0581      : 1570 :
MAY  15-98   : HILL      : 1003           : 2011      : 7    :
MAY  21-98   : HILL      : 0593           : 2010      : 7    :
FEB  22-98   : VLY       : $675           : 1013      : 2020 :
APR  3-98    : VLY       : $675           : 1001      : 2011 :
MAR  8-99    : VLY       : $500           : 1024      : 2060 :
MAR  13-99   : VLY       : $1,350         : 1004      : 2022 :
             :           :                :           :      :
AUG  25-98   : RIDC      : 1011           : 2011      : 9    :
This puts values in the wrong columns. So how can I handle formatting a field that is whitespace?

It should be:
Code:
FEB  20-98   : VLY       :                : 1041      : 2094      : 8    :
JAN  7-98    : VLY       :                : 1030      : 2064      : 6    :
MAR  9-98    : VLY       : $500           : 1024      : 2060 :
MAR  13-98   : VLY       : $1,350         : 1004      : 2022 :
MAR  27-98   : VLY       : $675           : 1013      : 2020 :
MAY  1-98    : VLY       : $1,500         : 0594      : 1594 :
MAY  9-98    : HILL      : $2,000         : 0581      : 1570 :
MAY  15-98   : HILL      :                : 1003      : 2011      : 7    :
MAY  21-98   : HILL      :                : 0593      : 2010      : 7    :
FEB  22-98   : VLY       : $675           : 1013      : 2020 :
APR  3-98    : VLY       : $675           : 1001      : 2011 :
MAR  8-99    : VLY       : $500           : 1024      : 2060 :
MAR  13-99   : VLY       : $1,350         : 1004      : 2022 :

AUG  25-98   : RIDC      :                : 1011      : 2011      : 9    :
#7  awk  10-18-2007
Yeah, but it is going to get complicated (believe it or not, this has been pretty straightforward so far).

The problem is that awk cannot recognize a whitespace-only column: with the default field separator, any run of spaces counts as a single separator, so an empty column simply disappears and the later fields shift left. If the file used tab separators, you could use the -F option to split on tabs, but if it is simply spaces, you have to make a programmatic decision.

For instance, it looks like the column after the track name is a $ amount (that is field 4 with default splitting, since the month and the day-year are separate fields). If it always is a $ amount whenever it is present, you can check whether the field contains a $ and print it in the right column; if it does not, you know everything has slid down one.

So you could have two printf statements:
if ($4 ~ /\$/)
{ print style 1 }
else
{ print style 2 }

As much as I hate to admit it, I would have to do some trial and error to make sure the search for the $ works, since an unescaped $ is the end-of-line anchor in a regular expression; I was thinking that escaping it was the right idea.
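
Here is a hedged sketch of the two-style idea. It assumes default whitespace splitting, under which the dollar amount in the sample lines lands in field 4 (month, day-year, track, amount), and it assumes the amount always starts with a $ when present. Escaping the dollar sign as \$ makes the regex match a literal $ rather than end of line. The function name align_purse is made up, and only the first few columns are formatted, as in the test from post #6.

```shell
# align_purse: print the leading columns with the optional dollar amount
# always in the same slot. Assumes the amount, when present, is field 4.
align_purse() {
    awk '{
        fmt = "%-5s%-8s: %-10s: %-15s: %-10s: %-5s:\n"
        if ($4 ~ /^\$/) {                 # \$ is a literal dollar sign
            printf fmt, $1, $2, $3, $4, $5, $6
        } else {                          # no amount: leave that slot empty
            printf fmt, $1, $2, $3, "", $4, $5
        }
    }' "$@"
}

# Two lines taken from the sample data, one without and one with an amount.
printf '%s\n' \
    'FEB 20-98 VLY 1041 2094 8 8 8 8 8 8 NB SMITH GW' \
    'MAR 9-98 VLY $500 1024 2060 5 2 3 4 2 2 2063 1900 SMITH GW' | align_purse
```

Using one format string with an empty string in the missing slot keeps both styles the same width, so 1041 and 1024 end up in the same column. The remaining fields would need the same kind of decision wherever another column can be blank.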