The UNIX and Linux Forums  

  #8 (permalink)  
Old 06-22-2005
vgersh99
Moderator
 

Join Date: Feb 2005
Location: Boston, MA
Posts: 3,002
A bit convoluted, but unfortunately nawk doesn't support arrays of arrays, so I had to jump through hoops: each array cell holds a SUBSEP-joined list of values instead.
Assumption: the key is the FIRST field.
Code:
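BEGIN { FS=OFS="^" ; FSlist=","}   # "^"-separated fields; merged values are joined with ","

# For each field i >= 2, arr[$1 SUBSEP i] holds a SUBSEP-joined list of the
# distinct values seen for key $1. All parameters are local scratch variables.
function add2list(   i,idx,n,cell,j,found,list)
{
   for(i=2; i<= NF; i++) {
      idx=$1 SUBSEP i
      n=split(arr[idx], cell, SUBSEP)      # unpack the stored list
      for(j=1; j<=n; j++)
         if ( $i == cell[j] )  { found=1; break }
      if (!found) cell[++n] = $i           # append only values not seen yet
      for(j=1; j<=n; j++)                  # re-pack the list into one scalar
         list=(j>1) ? list SUBSEP cell[j] : cell[j]
      arr[idx]=list
      found=0
   }
}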

#-----------------------------------------------------------------------
# Main rule: merge every record and remember the widest record seen.
{
    add2list()
    if (NF > nf) nf=NF
}

#-----------------------------------------------------------------------
END {
  for (i in arr) {
    # Some awks walk a snapshot of the keys, so skip keys deleted below.
    if (!(i in arr)) continue
    split(i, idxA, SUBSEP)
    printf("%s", idxA[1])
    for(j=2; j<=nf; j++) {
       idx=idxA[1] SUBSEP j
       printf("%s", OFS)
       if ( idx in arr ) {
          n=split(arr[idx], vals, SUBSEP)
          for(k=1; k <= n; k++)
             printf("%s%s", vals[k], (k<n) ? FSlist : "")
          delete arr[idx]
       }
    }
    printf("\n")
  }
}
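The "hoops" above boil down to one trick: since a cell can only hold a scalar, a list is packed into a single string with SUBSEP and unpacked again with split(). A minimal sketch of just that trick follows; the function names list_in_cell and add and the key "k1" are made up for illustration and are not part of the script above.

```shell
# Hypothetical demo: keep a list of distinct values in one scalar array cell.
list_in_cell() {
  awk '
    # add(key, val): append val to the list stored in arr[key] unless present
    function add(key, val,    n, items, j) {
      n = split(arr[key], items, SUBSEP)       # unpack the current list
      for (j = 1; j <= n; j++)
        if (items[j] == val) return            # already stored, nothing to do
      arr[key] = (n > 0) ? arr[key] SUBSEP val : val
    }
    BEGIN {
      add("k1", "a"); add("k1", "b"); add("k1", "a")
      n = split(arr["k1"], out, SUBSEP)
      for (j = 1; j <= n; j++)
        printf("%s%s", out[j], (j < n) ? "," : "\n")
    }
  '
}
list_in_cell
```

The sketch assumes the values themselves never contain SUBSEP and are never empty; the same caveat applies to any list packed this way.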
  #9 (permalink)  
Old 06-22-2005
Registered User
 

Join Date: Oct 2003
Location: United Kingdom
Posts: 37
"arrays of arrays" can be "faked" by using multi-dimensional subscripts.

I'm feeling lazy so that is the extent of my response - sorry!
  #10 (permalink)  
Old 06-22-2005
vgersh99
Moderator
 

Join Date: Feb 2005
Location: Boston, MA
Posts: 3,002
Quote:
Originally Posted by Simerian
"arrays of arrays" can be "faked" by using multi-dimensional subscripts.

I'm feeling lazy so that is the extent of my response - sorry!
Not really.
An array INDEX can be multi-dimensional, e.g. array[index1, index2,...,indexN].
But an array CELL can only hold a scalar (not an array), e.g. one cannot do array[index1,index2] = anotherArray - at least in older gawk and Solaris' stock nawk.
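To make the index/cell distinction concrete, here is a small sketch (the function name subscript_demo and the sample subscripts are made up): a multi-dimensional index is really a single string built by joining the subscripts with SUBSEP, and the cell behind it holds one scalar.

```shell
# Hypothetical demo: a "two-dimensional" subscript is one SUBSEP-joined string key.
subscript_demo() {
  awk '
    BEGIN {
      arr["x", 2] = "scalar value"        # stored under the key "x" SUBSEP 2
      key = "x" SUBSEP 2
      if (key in arr) print "same key"    # the hand-built key finds the same cell
      if (("x", 2) in arr) print "found"  # membership test with a subscript list
      for (k in arr) {
        n = split(k, part, SUBSEP)        # recover the individual subscripts
        print n, part[1], part[2]
      }
    }
  '
}
subscript_demo
```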
  #11 (permalink)  
Old 06-22-2005
Registered User
 

Join Date: Oct 2003
Location: United Kingdom
Posts: 37
You are quite correct, hence my earlier statement that "arrays within arrays" can be "faked" - I knew that being lazy would get me into trouble!

The point is that the extra dimensions act as array sets themselves, so although a given dimension is not strictly a distinct (sub)array instance, it can emulate the construct.

So...

Elements of List #1 have child items in List #2, whose child items are in turn in List #3.

Rather than each element holding a pointer that identifies its child array, a single array is used with multiple dimensions, one per tier (i.e. 3 in this case):

List[List#1-Index,List#2-Index,List#3-Index]
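As a rough sketch of that three-tier layout (the data, the function name three_tier_demo and the helper arrays kids, branches and leaves are all invented for illustration), the tiers can be recovered by splitting each key on SUBSEP:

```shell
# Hypothetical demo: three tiers emulated with one array and three subscripts.
three_tier_demo() {
  awk '
    BEGIN {
      # Made-up data: tier 1, tier 2 and tier 3 subscripts in a single array
      List["fruit", "apple", "red"]   = 1
      List["fruit", "apple", "green"] = 1
      List["fruit", "pear",  "green"] = 1
      List["veg",   "leek",  "white"] = 1

      # Walk every stored combination; count tier-3 items per tier-1 item
      for (key in List) {
        split(key, t, SUBSEP)
        leaves[t[1]]++
        kids[t[1], t[2]] = 1              # remember each distinct tier-2 child
      }
      # Count distinct tier-2 children per tier-1 item
      for (pair in kids) {
        split(pair, t, SUBSEP)
        branches[t[1]]++
      }
      print "fruit", branches["fruit"], leaves["fruit"]
      print "veg", branches["veg"], leaves["veg"]
    }
  '
}
three_tier_demo
```

There is no way to ask for "all children of fruit" directly; every such query is a scan over the keys, which is one of the places where this can go screwy.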

There are plenty of opportunities for this to go screwy but it can be used to good effect if handled carefully enough.

As I recall (i.e. I may be making this up as I go along... ), awk arrays are associative, so large subscript ranges across several dimensions are not resource hungry: the array is only as big as the number of unique subscript combinations actually used.

Pros: It doesn't allocate resource at declaration time
Cons: It doesn't allocate resource at declaration time
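
A small sketch of both sides of that coin (the names sparse_demo, count and grid are made up): only the combinations actually used exist, but merely referencing a missing element creates it, so existence checks are better done with "in".

```shell
# Hypothetical demo: sparse storage, and the reference-creates-element gotcha.
sparse_demo() {
  awk '
    # count(a): number of elements currently stored in array a
    function count(a,    k, n) { n = 0; for (k in a) n++; return n }
    BEGIN {
      grid[1, 1] = "a"
      grid[1000000, 1000000] = "b"
      print count(grid)                    # only the two used combinations exist
      if ((5, 5) in grid) print "present"  # "in" tests without creating anything
      print count(grid)
      x = grid[5, 5]                       # a plain reference creates an empty element
      print count(grid)
    }
  '
}
sparse_demo
```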

When people call me a "complete nawker" I've always taken it as a compliment

Last edited by Simerian; 06-22-2005 at 09:04 AM.
All times are GMT -7.