The builtin split function in AWK is too slow
Post 302423798 by alister on Saturday 22nd of May 2010 12:56:43 PM
It seems to me that both files contain the same information, though in different formats. A simpler solution would be to use a different algorithm, which builds an internal list of book-pairs in one pass using one data file:
Code:
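#!/bin/sh

# Count how often each ordered pair of books shares a list, rank the
# partners of each book by that count, then print one line per book.
awk -F'[:,]' '
    # $1 is the list name; fields 2..NF are the books on that list.
    { for(i=2;i<=NF;i++) for(j=2;j<=NF;j++) if (i!=j) a[$i" "$j]++}
    END { for (k in a) print k" "a[k] }' "$1" \
| sort -k1,1 -k3,3nr -k2,2 \
| awk '{b=$1; if (b!=ob) {if (NR>1) print s; s=$1":"$2; ob=b; next}; s=s","$2} END {print s}'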

Test run:
Code:
$ cat data
list1:A,B,C
list2:A,B,C,F,H
list3:A,B,D
list4:A,B,F
list5:H,F
list6:C
list7:G
$ ./books.sh data
A:B,C,F,D,H
B:A,C,F,D,H
C:A,B,F,H
D:A,B
F:A,B,H,C
H:F,A,B,C
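
To see what the final awk stage receives, it can help to run only the first two stages on a small input. The following is a minimal sketch, not part of the original script: the function name ranked_pairs is made up for illustration, and it reads standard input instead of "$1".

```shell
#!/bin/sh
# ranked_pairs is a hypothetical helper name: it runs only the first two
# stages of the pipeline (pair counting and sorting) on standard input.
ranked_pairs() {
    awk -F'[:,]' '
        # Count every ordered pair of books that appears on the same list.
        { for (i = 2; i <= NF; i++) for (j = 2; j <= NF; j++) if (i != j) a[$i" "$j]++ }
        END { for (k in a) print k" "a[k] }' |
    # Group by book, then highest count first, then partner name.
    sort -k1,1 -k3,3nr -k2,2
}

# Two sample lists taken from the test data above.
printf '%s\n' 'list1:A,B,C' 'list3:A,B,D' | ranked_pairs
```

Each output line has the form "book partner count". For these two lists the first line is "A B 2", since A and B share both lists. The last awk stage then only has to join the partners of each book in the order they arrive.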



A Perl solution, which is probably faster:
Code:
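# Run as: perl -lan -F'[:,]' books.pl data
# -a/-F split each line into @F; $F[0] is the list name, the rest are books.
for ($i=1; $i<=$#F; $i++) {
    for ($j=1; $j<=$#F; $j++) {
        if ($i!=$j) {
            $books{$F[$i]}{$F[$j]}++
        }
    }
}

END {
    for $k ( sort keys %books ) {
        # Most frequent partner first; ties broken alphabetically.
        @v = sort { $books{$k}{$b} <=> $books{$k}{$a}
                    || $a cmp $b
                  } keys %{ $books{$k} };
        print "$k:" . join (",", @v);
    }
}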

Test run, using the same data file as with the sh/awk/sort solution:
Code:
$ perl -lan -F'[:,]' books.pl data
A:B,C,F,D,H
B:A,C,F,D,H
C:A,B,F,H
D:A,B
F:A,B,H,C
H:F,A,B,C

Note: It's been about 10 years since I've written anything more than a one-liner in Perl, so perhaps a Perl guru can slash that to a couple of lines.

Regards,
Alister