Match child with parents and form matrix


 
# 8  
Old 02-03-2015
A data file with 800 million records is too big to hold in memory.
How many parents are in the data file?
Do you have an idea of the maximum size of a matrix?

Regards.
# 9  
Old 02-03-2015
The number of parents is 234 now, but it will grow. The data file has a lot of redundant data: parents and children have many extra variables that will not appear in the output. The maximum size of a matrix will be 8,000 x 100. I was monitoring a run with data for 40 parents; it was taking 400 MB of memory in top, which is not much. I have access to a university cluster with 300 GB of memory, so I can run the data there if needed.
# 10  
Old 02-03-2015
Here is another awk program that loads everything into memory:
Code:
$ cat newmatrx.awk
BEGIN{FS=","}
FNR == NR {
	if ( $2 !~ /\// ) {
		P[$3]=$2
		Q[$2]=$3
	}
	else {
		C[$3]=$2
	}
	next
}
{
	if ( P[$1] ) {
		V[P[$1]$1$2]=$3
	}
	else {
		split(C[$1],A,"/")
		if ( V[P[Q[A[1]]]Q[A[1]]$2] && V[P[Q[A[2]]]Q[A[2]]$2] && V[P[Q[A[1]]]Q[A[1]]$2] != V[P[Q[A[2]]]Q[A[2]]$2] ) {
			W[C[$1]]++
			X = (V[P[Q[A[1]]]Q[A[1]]$2] == $3) ? "a" : ((V[P[Q[A[2]]]Q[A[2]]$2] == $3) ? "b" : "m")
			E[W[C[$1]]C[$1]]=$2":"$1":"X
		}
	}
}
END{
	for (i in W){
		print i":"
		J=1
		while (J <= W[i]){
			split(E[J""i],li,":")
			L[li[1]li[2]]=li[3]
			if(V1[li[1]]!=1){
				V1[li[1]]=1
				V2[G++]=li[1]
			}
			if(M[li[2]]!=1){
				M[li[2]]=1
				B[D++]=li[2]
			}
			J++
		}
		for(Y=0;Y<D;Y++){
			K=K"\t"B[Y]
		}
		print K
		for(Z=0;Z<G;Z++){
			K=V2[Z]
			for(Y=0;Y<D;Y++){
				if(L[V2[Z]B[Y]]) {
					K=K"\t"L[V2[Z]B[Y]]
				}
				else{
					K=K"\t-"
				}
			}
			print K
		}
		K=""
		split("",L)
		split("",B)
		split("",V2)
		split("",V1)
		split("",M)
		G=D=0
	}
}
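The a/b/m labelling in the main block can be hard to see through the nested array lookups, so here is a minimal sketch of just that rule. It assumes a simplified, hypothetical input where each row already holds the variable name, the two parents' values and the child's value; the real program looks these up in V[] instead. Only variables where both parents have a value and the values differ are kept.

```shell
# Hypothetical rows: variable, parent-1 value, parent-2 value, child value
classified=$(printf '%s\n' 'v1,A,B,A' 'v2,A,B,B' 'v3,A,B,C' 'v4,A,A,A' |
awk -F, '
# keep only variables where both parents have a value and the values differ
$2 != "" && $3 != "" && $2 != $3 {
	# a = child matches parent 1, b = child matches parent 2, m = matches neither
	x = ($4 == $2) ? "a" : (($4 == $3) ? "b" : "m")
	print $1 "\t" x
}')
printf '%s\n' "$classified"
```

With this sample, v4 is dropped because both parents carry the same value, so it cannot tell them apart.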

Run it the same way as the previous version:
Code:
awk -F,  'FNR == NR && !/\// {T[$3]=1;next};T[$1] {print}' file1 file2 | awk -f newmatrx.awk file1 - file2
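To see what the first awk in that pipeline passes on, here is a small sketch that runs only the prefilter stage on made-up data. The file layout is an assumption inferred from the field references in the script: file1 has the name in field 2 (a child written as parentA/parentB) and the sample ID in field 3; file2 has sample ID, variable, value. The sample IDs and values are hypothetical.

```shell
tmp=$(mktemp -d)
# Hypothetical file1: record number, name (a child is written parentA/parentB), sample ID
printf '%s\n' '1,P1,s1' '2,P2,s2' '3,P1/P2,s3' > "$tmp/file1"
# Hypothetical file2: sample ID, variable, value
printf '%s\n' 's1,v1,A' 's2,v1,B' 's3,v1,A' 's9,v1,C' > "$tmp/file2"

# First file: remember the sample IDs of lines without "/" (the parents).
# Second file: print only the rows whose sample ID was remembered.
parent_rows=$(awk -F, 'FNR == NR && !/\// {T[$3]=1;next};T[$1] {print}' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$parent_rows"
rm -rf "$tmp"
```

Only the parent rows (s1 and s2 here) come out of the prefilter. In the full command they arrive on stdin (the "-") before file2 is read, so the parents' values are already stored when the children's rows are compared.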

Regards.
# 11  
Old 02-04-2015
Thank you! This is running now. I will get back to you if I find any issues. Thank you again.