Awk,nawk Help please


 
# 1  
Old 10-29-2011

Hi Guys,



I am in need of some help; I have an xml message file which contains personal details as shown below:

Code:
[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><NAME=”John Smith”><Age=”23”><D.O.B=”11-10-1988”> <Gender=”Male”>”


[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><NAME=”Emy Williams”><Age=”23”><D.O.B=”01-05-1988”> <Gender=”Female”>”


[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><NAME=”Jack Adam”><Age=”66”><D.O.B=”24-07-1945”> <Gender=”Male”>”


[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><NAME=”Charlie Daniel”><Age=”38”><D.O.B=”15-08-1973”> <Gender=”Male”>”


[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><NAME=”Ruby James”><Age=”38”><D.O.B=”11-03-1973”> <Gender=”Female”>”


[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><NAME=”Sophie Thomas”><Age=”20”><D.O.B=”12-09-1991”><Gender=”Female”>”

I want to use nawk to parse these xml messages but I am new to awk and nawk.



I want the output for these messages to look like this:



Output


Code:
FullName Age D.O.B Gender
John Smith 23 11-10-1988 Male
Emy Williams 23 01-05-1988 Female
Jack Adam 66 24-07-1945 Male
Charlie Daniel 38 15-08-1973 Male
Ruby James 38 11-03-1973 Female
Sophie Thomas 20 12-09-1991 Female

I would also like two counts:


Code:
Age count

20 1
23 2
38 2
66 1

Gender count

Male 3
Female 3


Moderator's Comments:
Mod Comment: Please use code tags and avoid excessive font formatting, thanks.

Thanks for any help.

Last edited by zaxxon; 11-10-2011 at 04:31 AM.. Reason: code tags, see PM
# 2  
Old 10-29-2011
This works under awk, and should work with nawk though I don't have a sun box to test it on.

Code:
awk '
    {
        gsub( ">", "" );        # strip unneeded junk and make "foo bar" easy to capture
        gsub( " ", "~" );
        gsub( "<", " " );

        for( i = 1; i <= NF; i++ )          # snarf up each name=value pair
        {
            if( split( $(i), a, "=" ) == 2 ) 
            {
                gsub(  "\"", "", a[2] );
                gsub(  "~", " ", a[2] );
                values[a[1]] = a[2];
            }
        }

        gcount[values["Gender"]]++;         # collect counts
        acount[values["Age"]]++;

        printf( "%s %s %s %s\n", values["NAME"], values["Age"], values["D.O.B"], values["Gender"] );
    }

    END {
        printf( "\nAge Count:\n" );
        for( x in acount )
            printf( "%s %d\n", x, acount[x] );

        printf( "\nGender Count:\n" );
        for( x in gcount )
            printf( "%s %d\n", x, gcount[x] );
    }
' input_file

Assumptions:
All lines are in the format you gave, with no blank lines.
The tilde (~) character doesn't appear in the data.
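
To see why the three gsub() calls make the name=value pairs easy to capture, here is a minimal sketch that prints the fields awk sees after the substitutions. It assumes the real file uses plain ASCII double quotes (the sample as posted shows typographic quotes), and show_fields is only an illustrative name, not part of the script above.

```shell
# Hypothetical helper: apply the same three substitutions as the script
# above, then list the resulting whitespace-separated fields.
show_fields() {
    awk '{
        gsub(">", "")      # drop every closing bracket
        gsub(" ", "~")     # protect spaces inside values such as "John Smith"
        gsub("<", " ")     # each opening bracket now starts a new field
        for (i = 1; i <= NF; i++)
            print i ": " $i
    }' "$@"
}

# One sample line, with ASCII quotes assumed.
printf '%s\n' '[date+time], message=[DATA= "<?xml version="1.0?"><data changeMsg><NAME="John Smith"><Age="23"><D.O.B="11-10-1988"> <Gender="Male">"' | show_fields
```

Field 4 comes out as NAME="John~Smith", which split() then cuts at the "=" before the quotes and tildes are cleaned up. The D.O.B field keeps a trailing "~" from the space before <Gender, so that value ends up with a trailing blank in the printed output.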

Last edited by agama; 10-29-2011 at 11:07 PM.. Reason: small clarification
# 3  
Old 10-29-2011
Try this. Note that the three-argument form of match() is a gawk extension, so it needs gawk rather than nawk.
Code:
awk -F"<|>" '
/^$/{next}
{
        for(i=1;i<=NF;i++)
        {
                if($i~"NAME") { match($i,"(NAME=.)(.*).",name) }
                if($i~"D.O.B") { match($i,"(D.O.B=.)(.*).",dob) }
                if($i~"Gender") { match($i,"(Gender=.)(.*).",gender) }
                if($i~"Age") { match($i,"(Age=.)(.*).",age) }
        }
        printf("%s %s %s %s\n",name[2],age[2],dob[2],gender[2]); 
        agec[age[2]]++;genderc[gender[2]]++;
}
END{
        print "\nAge Count"
        for(i in agec)
                printf("%s %s\n", i, agec[i])

        print "\nGender Count"
        for(i in genderc)
                printf("%s %s\n", i, genderc[i])
}' input_file

--ahamed
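
If only nawk is available, the same values can probably be pulled out with the two-argument match() plus RSTART and RLENGTH, which POSIX awk provides. This is a sketch rather than a tested replacement: it assumes plain ASCII double quotes around each value, and extract_people and attr are made-up names for illustration.

```shell
# Hypothetical helper: print "name age dob gender" for each message line.
# usage: extract_people input_file
extract_people() {
    awk '
    # Return the quoted value that follows name= in the current record,
    # or an empty string when the attribute is missing.
    # The name is used as a regex, so the dots in D.O.B match any character.
    function attr(name,    re, s) {
        re = name "=\"[^\"]*\""
        if (!match($0, re))
            return ""
        s = substr($0, RSTART, RLENGTH)   # for example: Age="23"
        sub(/^[^"]*"/, "", s)             # remove everything up to the opening quote
        sub(/"$/, "", s)                  # remove the closing quote
        return s
    }
    /NAME=/ { print attr("NAME"), attr("Age"), attr("D.O.B"), attr("Gender") }
    ' "$@"
}
```

Looking each attribute up by name also means a missing value cannot be filled in from the previous line, since attr() returns an empty string when there is no match.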
# 4  
Old 10-30-2011
Hi James,

Another option would be:

Code:
awk -F[”] 'BEGIN{print "FullName Age D.O.B Gender"}
    /NAME/{A[$6]++; G[$10]++; print $4, $6, $8, $10}
    
END{ print "\nAge count\n" 
     for (i in A) print i,A[i]

     print "\nGender count\n"
     for (i in G) print i,G[i]
    }' file

Hope this helps.

Regards
# 5  
Old 11-09-2011

Hey guys,

Is there a way to get only the gender, age and age count?

I don't need two counts; all I need is the details, then the gender, age and count of the age.

For example
Code:
Male,23,1
Female,23,1
Female,30,

I am trying to use the code below and have made a lot of changes, but I am not getting the above output.
Can someone please help?
Code:
awk '
    {
        gsub( ">", "" );        # strip unneeded junk and make "foo bar" easy to capture
        gsub( " ", "~" );
        gsub( "<", " " );

        for( i = 1; i <= NF; i++ )          # snarf up each name=value pair
        {
            if( split( $(i), a, "=" ) == 2 ) 
            {
                gsub(  "\"", "", a[2] );
                gsub(  "~", " ", a[2] );
                values[a[1]] = a[2];
            }
        }

        gcount[values["Gender"]]++;         # collect counts
        acount[values["Age"]]++;

        printf( "%s %s %s %s\n", values["NAME"], values["Age"], values["D.O.B"], values["Gender"] );
    }

    END {
        printf( "\nAge Count:\n" );
        for( x in acount )
            printf( "%s %d\n", x, acount[x] );

        printf( "\nGender Count:\n" );
        for( x in gcount )
            printf( "%s %d\n", x, gcount[x] );
    }
' input_file

---------- Post updated at 09:11 PM ---------- Previous update was at 01:56 PM ----------

Hey guys, can I please get help with this?

Thank you all for helping.

Last edited by Scott; 11-09-2011 at 05:57 PM.. Reason: Added code tags
# 6  
Old 11-09-2011
Please edit your post to add code tags.

And, you agreed not to "bump" posts when you registered.

edit: No worries. I got it. Please read the PM you just got.
# 7  
Old 11-09-2011
This produces the output you've indicated is desired.

Code:
awk '
    {
        gsub( ">", "" );        # strip unneeded junk and make "foo bar" easy to capture
        gsub( " ", "~" );
        gsub( "<", " " );

        for( i = 1; i <= NF; i++ )          # snarf up each name=value pair
        {
            if( split( $(i), a, "=" ) == 2 )
            {
                gsub(  "\"", "", a[2] );
                gsub(  "~", " ", a[2] );
                values[a[1]] = a[2];
            }
        }

        #gcount[values["Gender"]]++;         # collect counts
        #acount[values["Age"]]++;
        agcount[values["Gender"]","values["Age"]]++;

        printf( "%s %s %s %s\n", values["NAME"], values["Age"], values["D.O.B"], values["Gender"] );
    }

    END {
        printf( "\nSummary\n" );
        for( x in agcount )
            printf( "%s,%d\n", x, agcount[x] ) | "sort";
    }
' input-file

I ditched the extra stuff from END to make it obvious; put back what you need.
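
A variation on the combined count, as a sketch only: awk's comma subscript joins the two keys with SUBSEP, so they can be split apart again in END, and sort -k2,2n orders the ages numerically within each gender. It assumes ASCII double quotes around the values, and gender_age_counts is a hypothetical name, not something from the script above.

```shell
# Hypothetical helper: print gender,age,count for each gender/age pair.
# usage: gender_age_counts input_file
gender_age_counts() {
    awk -F'"' '
    /NAME=/ {
        # With the double quote as separator, each value is the field
        # right after the one that ends in its attribute name and "=".
        for (i = 1; i < NF; i++) {
            if ($i ~ /Age=$/)    age = $(i + 1)
            if ($i ~ /Gender=$/) gender = $(i + 1)
        }
        count[gender, age]++          # key is gender SUBSEP age
    }
    END {
        for (key in count) {
            split(key, part, SUBSEP)  # recover the two halves of the key
            printf "%s,%s,%d\n", part[1], part[2], count[key]
        }
    }' "$@" | sort -t, -k1,1 -k2,2n
}
```

Like the script above, this keeps the previous line's age or gender if a line lacks one, so it relies on every line having the format shown in the first post.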

Last edited by agama; 11-09-2011 at 08:07 PM.. Reason: Sort the summary