The UNIX and Linux Forums  


Go Back   The UNIX and Linux Forums > Top Forums > Shell Programming and Scripting
.
google unix.com



Shell Programming and Scripting Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here.

More UNIX and Linux Forum Topics You Might Find Helpful
Thread Thread Starter Forum Replies Last Post
Making Connection nodes for Graph anjas Shell Programming and Scripting 4 06-18-2009 05:42 AM
FTP large files - Getting "Connection Refused" bullz26 HP-UX 4 10-25-2008 07:52 AM
problem while making ftp of a large file rprajendran UNIX for Dummies Questions & Answers 1 05-28-2008 02:19 AM
nodes kamisi UNIX for Dummies Questions & Answers 3 05-30-2002 04:47 PM
i-nodes djatwork UNIX for Dummies Questions & Answers 4 09-25-2001 01:29 PM

 
English Japanese Spanish French German Portuguese Italian Dutch Swedish Russian Norwegian Hungarian Hebrew Danish Bulgarian Greek Powered by Powered by Google
 
LinkBack Thread Tools Search this Thread Rate Thread Display Modes
Prev Previous Post   Next Post Next
  #1 (permalink)  
Old 07-04-2009
anjas anjas is offline
Registered User
  
 

Join Date: Mar 2009
Location: Bali, Indonesia
Posts: 17
Making Large Connection nodes for Graph

Hi power user,

Basically, this thread is a continuation of the previous one :

Making Connection nodes for Graph

However, I'm going to explain it again.

I have this following data:

file1
aa A
aa B
aa C
bb X
bb Y
bb Z
cc O
cc P
cc Q
. .
. .
. .
. .

and I want to turn them into a connection nodes like this:
file2

A aa A
A aa B
A aa C
B aa C
B aa B
C aa C
X bb X
X bb Y
X bb Z
Y bb Z
Y bb Y
Z bb Z
. . .
. . .
. . .
. . .

I made this relation, to create a graph. The file have more than 6.000.000 lines.
For smaller files (100.000 lines), I have used this following script in the previous thread:

join -o 1.2 0 2.2 -1 1 -2 1 file1 file1 | nawk '!a[$3$2$1];{a[$1$2$3]++}'
join -o 1.2 0 2.2 -1 1 -2 1 file1 file1 | nawk '$1<$3{print;next}{print$3,$2,$1}' | sort -u
nawk '
NR==FNR { c = a[$1]; a[$1] = c?c" "$2:$2; next }
{ c = a[$1]
if (c) {
split(c,b)
for (k in b) {
p = $2<b[k]?$2" "$1" "b[k]:b[k]" "$1" "$2
if (!d[p]++) print p
}
}
}
' file1 file1
For small file, those three kind of scripts could create the network only in less than 10 minutes. However, for files with more than 6.000.000 lines, even after one days, there was no results at all . Is there any faster way to do it?


Any suggestion, how to create file2 by using perl or awk? Tx
 

Bookmarks

Tags
graph, nodes

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On




All times are GMT -4. The time now is 01:54 PM.


Powered by: vBulletin, Copyright ©2000 - 2006, Jelsoft Enterprises Limited. Language Translations Powered by .
vBCredits v1.4 Copyright ©2007 - 2008, PixelFX Studios
The UNIX and Linux Forums Content Copyright ©1993-2009. All Rights Reserved.Ad Management by RedTyger

Content Relevant URLs by vBSEO 3.2.0