How to load an array with desired lines with awk


 
# 1  
Old 03-14-2011

Hi everyone,
I need some help, please.

I've written the script below using ranges (/Initial_pattern/,/Final_Pattern/) to extract only the lines that interest me:
Code:
awk  '
/^Category/,/^$/{print $1};
/Titles/,/^$/{print $1};
/Authors/,/^$/{print $1}' inputfile

To process this input:
Code:
_________________________________________________

Category
--------------------------------------------------
Adventure

______________________________________________________________________________

                                                                 
XXXX        XXXXX XXXXXXXXX
Y-W-K        YYYYYYYYYY  YYYY
Z-R-T        ZZZZZZZZ ZZZZ
--------------------------------------------------

______________________________________________________________________________

                                                                 
Titles
--------------------------------------------------
Robinson-Crusoe    XXXXXXXXXXXX XXXX
Saturday     XXXXXXXX XX XXXXXX

______________________________________________________________________________

                                                                 
XXXX    X
M-U-J    M

______________________________________________________________________________

                                                                 
Authors
--------------------------------------------------
Daniel-Defoe    XXXXXXXXXXXXXXXX
Ian-McEwan    XXXXXXXXXXXX

______________________________________________________________________________

and I obtain this:
Code:
Category
--------------------------------------------------
Adventure

Titles
--------------------------------------------------
Robinson-Crusoe
Saturday

Authors
--------------------------------------------------
Daniel-Defoe
Ian-McEwan

I'm interested only in the first column, which is why I use /Initial_pattern/,/Final_Pattern/{print $1} for every search. I'm also trying to exclude the blank lines and the lines containing "----". Instead of printing the wanted lines, though, I'd like to send them to an array, and I simply don't know the correct way to store (rather than print) the lines found, or to load the array after every range of lines I get.
The idea is to have an array like this:
Code:
Category
Adventure
Titles
Robinson-Crusoe
Saturday
Authors
Daniel-Defoe
Ian-McEwan

I'm having trouble handling arrays so far.
Could somebody show me how to do this?

Many thanks in advance
# 2  
Old 03-14-2011
You don't really need arrays for this.

Just use an A (for Active) flag and only print if A is set and the line does not start with --.
A blank line, or a line starting with __, turns the A (Active) flag off:

Code:
awk '/^$/||/^__/{A=0} A&&!/^--/{print $1} /^(Category|Titles|Authors)/{print;A++}' infile
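
The flag logic may be easier to follow spelled out rule by rule. The sketch below is the same idea with a longer variable name and comments; `sample_input`, `first_columns` and `active` are illustrative names, not from the thread, and the sample is a trimmed copy of the input in post #1.

```shell
# Trimmed copy of the input from post #1 (hypothetical helper name).
sample_input() {
    cat <<'EOF'
Category
--------------------------------------------------
Adventure

______________________________________________________________________________

XXXX        XXXXX XXXXXXXXX
Y-W-K        YYYYYYYYYY  YYYY
--------------------------------------------------

Titles
--------------------------------------------------
Robinson-Crusoe    XXXXXXXXXXXX XXXX
Saturday     XXXXXXXX XX XXXXXX

EOF
}

# Reads the report on stdin, prints headings plus the first column of each section.
first_columns() {
    awk '
        /^$/ || /^__/    { active = 0 }     # blank or underscore line: leave the section
        active && !/^--/ { print $1 }       # inside a section: first column, dashes skipped
        /^(Category|Titles|Authors)/ { print; active = 1 }   # heading: print it, then enter the section
    '
}

sample_input | first_columns
```

Rule order matters here: the heading rule comes last, so the heading line itself is printed once by that rule and never by the `active` rule.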


For reference this is how it would be done using arrays:

Code:
awk '/^$/||/^_/{for(i=0;i<A;i++) print L[i]; A=0} A&&!/^--/{L[A++]=$1} /^(Category|Titles|Authors)/{L[A++]=$0}' infile


Last edited by Chubler_XL; 03-14-2011 at 09:02 PM..
# 3  
Old 03-15-2011
Hi Chubler_XL,

Thanks for your reply. I was thinking of having an array so I can manipulate it later and print
its elements on a single line separated by commas, as follows.
Code:
Category,Adventure,Titles,Robinson-Crusoe,Saturday,Authors,Daniel-Defoe,Ian-McEwan

I tried modifying your first option by adding printf(), but it doesn't join all the lines
into a single one as shown above; it only merges each heading with the line that follows it:
Code:
awk '/^$/||/^__/{A=0} A&&!/^--/{print $1} /^(Category|Titles|Authors)/{printf("%s,",$0);A++}' inputfile
Category,Adventure
Titles,Robinson-Crusoe
Saturday
Authors,Daniel-Defoe
Ian-McEwan

In addition, how can I apply the "flag logic" when I use ranges, as in my original code?
It doesn't work when I do this:
Code:
awk  '/^$/||/^__/{A=0} 
A&&!/^--/{print $1}
/^Category/,/^$/{print $1;A++};
/Titles/,/^$/{print $1;A++};
/Authors/,/^$/{print $1;A++}' inputfile

Thanks in advance.

Last edited by cgkmal; 03-15-2011 at 01:29 AM..
# 4  
Old 03-15-2011
For commas all on one line try this:
Code:
awk '/^$/||/^__/{A=0} A&&!/^--/{R=R","$1}
/^(Category|Titles|Authors)/{R=(R?R",":"")$0;A++}
END{print R}' inputfile

If doing it with ranges, I would try to avoid having 3 separate ranges; otherwise you end up with 3 identical blocks to process the lines. This is fine if you feel that the processing for each block is likely to change at a later date, but otherwise stick with the shorter solution:

Code:
awk '/^(Category|Titles|Authors)/,/^$/ { 
if ($0&&$0!~"^---") R=(R?R",":"")$1}
END {print R}' inputfile
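
If an actual array is wanted, as the thread title asks, the same range can load one and join it in END. This is only a sketch: `items`, `n`, `join_sections` and `sample_input` are names chosen here for illustration, and the sample is a shortened copy of the posted input.

```shell
# Shortened copy of the posted input (hypothetical helper name).
sample_input() {
    cat <<'EOF'
Category
-----
Adventure

Titles
-----
Robinson-Crusoe    XXXX
Saturday     XXXX

Authors
-----
Daniel-Defoe    XXXX
Ian-McEwan    XXXX

EOF
}

# Reads the report on stdin, prints the collected first columns on one line.
join_sections() {
    awk '
        /^(Category|Titles|Authors)/, /^$/ {
            # keep headings and data, drop blank lines and dashed rules
            if ($0 != "" && $0 !~ /^--/) items[++n] = $1
        }
        END {
            # the array is still available here for any later processing
            for (i = 1; i <= n; i++) line = line (i > 1 ? "," : "") items[i]
            print line
        }
    '
}

sample_input | join_sections
```

Indexing from 1 with `items[++n]` keeps `n` equal to the element count, so the END loop needs no separate length.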

# 5  
Old 03-15-2011
Hi again Chubler_XL,

And thanks again for your help.

I've tried both scripts, and the second one in particular is something I can adapt to reach my goal. With my input, your code gives me:
Code:
Category,Adventure,Titles,Robinson-Crusoe,Saturday,Authors,Daniel-Defoe,Ian-McEwan,Category,Fantasy,Category,Literature,Titles,Ulysses



So I've added some lines of code (the BEGIN block and the gsub() calls) to modify the output before printing it, separating records with "\n" and fields with "|", as follows:

Code:
awk 'BEGIN{print "Category|Titles|Authors"}
/^(Category|Titles|Authors)/,/^$/ { 
if ($0&&$0!~"^---") R=(R?R",":"")$1}
{gsub(/^Category,/,"",R);gsub(/,Category,/,"\n",R);
gsub(/,Titles,/,"|",R);gsub(/,Authors,/,"|",R)}
END {print R}' inputfile

With an input file containing more data (more "books"), the output is:
Code:
Category|Titles|Authors
Adventure|Robinson-Crusoe,Saturday|Daniel-Defoe,Ian-McEwan
Fantasy
Literature|Ulysses

With this, the output I'm looking for is 95% done. The last bit of help I need is:
How can I modify the variable "R" with gsub(), as I've done so far, to add "|" to every line where one or two fields are empty, and get this final output?
Code:
Category|Titles|Authors
Adventure|Robinson-Crusoe,Saturday|Daniel-Defoe,Ian-McEwan
Fantasy||
Literature|Ulysses|

I'd like to add one or two more gsub(/regexp/,"|",R) calls to do this, but I can't find the right regexp for lines without any "|" (to add two "|") and lines with only one "|" (to add one "|"). The Category field will always have a value.
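
One possible way to do the padding without hunting for a regexp is a second awk pass that splits on "|" and extends short records to three fields. This is an alternative to the gsub() approach rather than a version of it; `pad_fields` is a hypothetical name and the field count of 3 is taken from the header line above.

```shell
# Reads "|"-separated lines on stdin and pads each one to three fields.
pad_fields() {
    awk -F'|' -v OFS='|' '{
        # assigning to a field beyond NF makes awk rebuild the record with OFS
        for (i = NF + 1; i <= 3; i++) $i = ""
        print
    }'
}

printf '%s\n' 'Fantasy' 'Literature|Ulysses' | pad_fields
```

Lines that already have three fields pass through unchanged, since the loop body never runs for them.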

* For reference, I've uploaded the input file with more "books".


Any advice would be much appreciated.


Thanks in advance,

Regards.
# 6  
Old 03-15-2011
Back to using arrays again! As you can see the output format tends to dictate the solution used.

Code:
awk 'BEGIN { print "Category|Titles|Authors" }
/^(Category|Titles|Authors)/,/^$/ {
if ($1=="Category") {L++;T=1;next}
if ($1 ~"(Titles|Authors)") T++;
else if ($0&&$0!~"^---") F[L,T]=(F[L,T]?F[L,T]",":"")$1}
END {for(i=1;i<=L;i++) print F[i,1]"|"F[i,2]"|"F[i,3]}' inputfile
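
For readers following along, here is a commented sketch of the same technique with longer names: `row`, `col` and `cell` stand in for L, T and F, and `sample_input` and `build_table` are illustrative names. It is a restatement under those assumptions, not the posted script; the one deliberate difference is that the heading test matches the whole first field.

```shell
# Shortened input with one complete record and one that has only a category.
sample_input() {
    cat <<'EOF'
Category
-----
Adventure

Titles
-----
Robinson-Crusoe    XXXX
Saturday     XXXX

Authors
-----
Daniel-Defoe    XXXX

Category
-----
Fantasy

EOF
}

# Reads the report on stdin, prints one "|"-separated row per Category.
build_table() {
    awk '
        BEGIN { print "Category|Titles|Authors" }
        /^(Category|Titles|Authors)/, /^$/ {
            if ($1 == "Category") { row++; col = 1; next }   # new record, back to column 1
            if ($1 ~ /^(Titles|Authors)$/) { col++; next }   # next column of the same record
            if ($0 != "" && $0 !~ /^--/)                     # data line: append to the current cell
                cell[row, col] = (cell[row, col] != "" ? cell[row, col] "," : "") $1
        }
        END {
            # cells never assigned read as empty strings, which yields the trailing "|"
            for (i = 1; i <= row; i++)
                print cell[i, 1] "|" cell[i, 2] "|" cell[i, 3]
        }
    '
}

sample_input | build_table
```

Because the column counter simply increments at each Titles or Authors heading, a record that has Authors but no Titles would put the authors in the second column; that holds for this sketch and, as far as I can tell, for the posted script too.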


Last edited by Chubler_XL; 03-15-2011 at 06:21 PM..
# 7  
Old 03-15-2011
Quote:
Originally Posted by Chubler_XL
Back to using arrays again! As you can see the output format tends to dictate the solution used.

Code:
awk 'BEGIN { print "Category|Titles|Authors" }
/^(Category|Titles|Authors)/,/^$/ {
if ($1=="Category") {L++;T=1;next}
if ($1 ~"(Titles|Authors)") T++;
else if ($0&&$0!~"^---") F[L,T]=(F[L,T]?F[L,T]",":"")$1}
END {for(i=1;i<=L;i++) print F[i,1]"|"F[i,2]"|"F[i,3]}' inputfile

Simply great, Chubler_XL!!!

I understand how you access the array and then print its elements in the order you choose, but could you please explain how the part that loads the array works (marked in GREEN), and the details within it highlighted in RED?

Code:
{
 if ($1=="Category") {L++;T=1;next}
   if ($1 ~"(Titles|Authors)") T++;
   else if ($0&&$0!~"^---") F[L,T]=(F[L,T]?F[L,T]",":"")$1}

Besides this, are L and T arrays, and does F contain both?

Many thanks for your great help.
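
On the last question, a small experiment may help. In standard awk, L and T in that script are ordinary numeric variables, and F is the only array; a subscript written as `F[L,T]` is a single string built by joining the two values with the built-in SUBSEP character. The sketch below (the function name `subsep_demo` is hypothetical) checks this directly.

```shell
subsep_demo() {
    awk 'BEGIN {
        L = 2; T = 3                     # plain counters, not arrays
        F[L, T] = "Ian-McEwan"           # F is the only array; its key is L SUBSEP T
        key = L SUBSEP T
        same = (key in F) ? "yes" : "no"
        print "F[L,T] and F[L SUBSEP T] are the same element: " same
        for (k in F) {
            split(k, part, SUBSEP)       # recover the two halves of the key
            print "row=" part[1] " col=" part[2] " value=" F[k]
        }
    }'
}

subsep_demo
```

So F acts like a two-dimensional table: the first subscript selects the record (one per Category) and the second selects the column.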

Best regards