group by in files :-)


 
# 8  
Old 06-22-2005
A bit convoluted, but unfortunately nawk doesn't deal too well with arrays of arrays - had to jump through hoops.
Assumption: the key is the FIRST field.
Code:
BEGIN { FS=OFS="^" ; FSlist=","}

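# For every field after the key, add its value to the list kept for
# (key, column) unless it is already there. Each list is stored as one
# SUBSEP-joined string, because an array cell can only hold a scalar.
function add2list(   i,idx,n,cell,j,found,list)
{
   for(i=2; i<= NF; i++) {
      idx=$1 SUBSEP i
      found=0
      n=split(arr[idx], cell, SUBSEP)      # unpack the stored list
      for(j=1; j<=n; j++)
         if ( $i == cell[j] )  { found=1; break }
      if (!found) cell[++n] = $i
      list=cell[1]                         # repack it into one string
      for(j=2; j<=n; j++)
         list=list SUBSEP cell[j]
      arr[idx]=list
   }
}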

#-----------------------------------------------------------------------
{
    add2list()
    if (NF > nf) nf=NF    # remember the widest record, not just the last one
}

#-----------------------------------------------------------------------
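END {
  for (i in arr) {
    split(i, idxA, SUBSEP)
    # arr has one entry per (key, column); print each key only once.
    # Tracking printed keys avoids deleting from arr while looping over it.
    if (idxA[1] in done) continue
    done[idxA[1]]=1
    printf("%s", idxA[1])
    for(j=2; j<=nf; j++) {
       idx=idxA[1] SUBSEP j
       printf("%s", OFS)
       if ( idx in arr ) {
          n=split(arr[idx], vals, SUBSEP)
          for(k=1; k <= n; k++)
             printf("%s%s", vals[k], (k<n) ? FSlist : "")
       }
    }
    printf("\n")
  }
}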

# 9  
Old 06-22-2005
"arrays of arrays" can be "faked" by using multi-dimensional subscripts.

I'm feeling lazy so that is the extent of my response - sorry!
# 10  
Old 06-22-2005
Quote:
Originally Posted by Simerian
"arrays of arrays" can be "faked" by using multi-dimensional subscripts.

I'm feeling lazy so that is the extent of my response - sorry!
Not really.
An array INDEX can be multi-dimensional, e.g. array[index1, index2,...,indexN].
But an array CELL can only hold a scalar [not an array], e.g. one cannot do array[index1,index2] = anotherArray - at least in older gawk and Solaris' stock nawk.
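
To illustrate the difference, a small sketch (the array name arr, the keys and the function name subsep_demo are just examples): awk builds the subscript arr["fruit","apple"] by joining the two values with SUBSEP, so the index is really one string and the cell behind it holds one scalar.

```shell
subsep_demo() {
    awk 'BEGIN {
        arr["fruit", "apple"] = 3            # one cell, holding a scalar

        # The "two-dimensional" index is the single string "fruit" SUBSEP "apple"
        key = "fruit" SUBSEP "apple"
        msg = (key in arr) ? "same key" : "different key"
        print msg

        # The pieces can be recovered by splitting the key on SUBSEP
        n = split(key, part, SUBSEP)
        print n, part[1], part[2]
    }'
}

subsep_demo
# prints:
# same key
# 2 fruit apple
```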
# 11  
Old 06-22-2005
You are quite correct, hence my earlier statement that "arrays within arrays" can be "faked" - I knew that being lazy would get me into trouble!

The point is that the extra dimensions act as array sets in their own right, so although there is no distinct (sub)array instance, the construct can be emulated.

So...

Elements of List #1 have child items in List #2, whose child items are in turn in List #3.

Rather than each element holding a pointer that identifies its child array, a single array is used with multi-dimensional subscripts, the number of dimensions reflecting the total number of tiers (i.e. 3 in this case):

List[List#1-Index,List#2-Index,List#3-Index]

There are plenty of opportunities for this to go screwy but it can be used to good effect if handled carefully enough.
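
A minimal sketch of that three-tier layout, with made-up data (the entries and the function name tiers_demo are only for illustration). Each element carries its full path of tier indices, and the children of a given parent are found by scanning the keys and comparing the leading parts.

```shell
tiers_demo() {
    awk 'BEGIN {
        # List[tier1, tier2, tier3]: hypothetical sample entries
        List["fruit", "citrus", "lemon"]  = 1
        List["fruit", "citrus", "lime"]   = 1
        List["fruit", "berry",  "cherry"] = 1
        List["veg",   "root",   "carrot"] = 1

        # Count the tier-3 items whose parents are fruit, then citrus
        count = 0
        for (key in List) {
            split(key, tier, SUBSEP)
            if (tier[1] == "fruit" && tier[2] == "citrus")
                count++
        }
        print "fruit/citrus children:", count
    }'
}

tiers_demo
# prints:
# fruit/citrus children: 2
```

One of the ways this can go screwy: finding the children means walking every key in the array, and any index value that itself contains SUBSEP would be split in the wrong place.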

As I recall (i.e. I may be making this up as I go along...), awk uses associative arrays, so large subscript ranges across multiple dimensions are not resource hungry. That is, the array is only as big as the number of unique subscript combinations actually used.

Pros: It doesn't allocate resource at declaration time
Cons: It doesn't allocate resource at declaration time
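
A quick sketch of both sides of that, assuming nothing beyond standard awk behaviour (the array name grid and the function name sparse_demo are just examples): widely spaced subscripts cost nothing for the gaps between them, but simply reading a cell that was never assigned brings it into existence.

```shell
sparse_demo() {
    awk 'BEGIN {
        grid[1, 1] = "a"
        grid[1000000, 1000000] = "b"      # nothing is stored for the cells in between
        n = 0
        for (k in grid) n++
        print "stored elements:", n

        x = grid[5, 5]                    # merely reading a cell creates it
        n = 0
        for (k in grid) n++
        print "after reading grid[5,5]:", n

        if (!((7, 7) in grid))            # the in test does not create anything
            print "grid[7,7] still absent"
    }'
}

sparse_demo
# prints:
# stored elements: 2
# after reading grid[5,5]: 3
# grid[7,7] still absent
```

Testing with (i, j) in grid before reading a cell keeps the array from growing by accident.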

When people call me a "complete nawker" I've always taken it as a compliment

Last edited by Simerian; 06-22-2005 at 01:04 PM..
 