Post 302978103 by senhia83 on Tuesday 26th of July 2016 09:53:10 AM
How many studies have unequal values for each pair?

I have several studies (s), each of which has points (p) with values (v).
My goal is to determine, for each pair of points, how many studies have different values for the two points (counting only studies in which both points are present).

Code:
Study	Point	Value
1	p1	value1
1	p2	value2
1	p3	value1
1	p4	value3
1	p5	value3
2	p2	value1
2	p4	value1
3	p1	value1
3	p5	value5
3	p3	value1
4	p2	value1
4	p4	value5


For example, the pair (p1,p5) appears in 2 studies, study 1 (value1, value3) and study 3 (value1, value5), and the two values differ in both. So the count for this pair is 2. The pair (p1,p3) is present in studies 1 and 3 with the same value in each, so its count is 0.
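The counting rule for a single pair can be sketched as below. This is only an illustration, assuming the table above is saved as a whitespace-separated file named `dataset` with one header line; the function name `pair_count` and the awk variables `a`, `b`, `va` and `vb` are hypothetical names, not part of my actual script.

```shell
# Sample data from the table above (header line included).
cat > dataset <<'EOF'
Study Point Value
1 p1 value1
1 p2 value2
1 p3 value1
1 p4 value3
1 p5 value3
2 p2 value1
2 p4 value1
3 p1 value1
3 p5 value5
3 p3 value1
4 p2 value1
4 p4 value5
EOF

# Count the studies in which both points are present and their values differ.
# pair_count, a, b, va and vb are illustrative names only.
pair_count() {
  awk -v a="$1" -v b="$2" '
    NR > 1 && $2 == a { va[$1] = $3 }   # value of point a, keyed by study
    NR > 1 && $2 == b { vb[$1] = $3 }   # value of point b, keyed by study
    END {
      n = 0
      for (s in va)
        if ((s in vb) && va[s] != vb[s]) n++
      print n
    }' dataset
}

pair_count p1 p5   # studies 1 and 3 both differ
pair_count p1 p3   # same value in studies 1 and 3
```

Studies that contain only one of the two points are ignored, which is what "if available" means here.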


So my desired output is (pairs with a count of 0, such as (p1,p3), are left out):


Code:
Point1	Point2	#StudiesWhereValuesAreDifferentForThisPair
p1	p2	1
p1	p4	1
p1	p5	2
p2	p3	1
p2	p4	2
p2	p5	1
p3	p4	1
p3	p5	2

I do have a working solution, which works for a small dataset but runs forever on the actual dataset, which has several thousand distinct values in each column.

Here is my solution

Code:
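awk '{ sp[$1 FS $2] = $3; s[$1]; p[$2] }   # remember each value, study and point
END {
  # print study, point1, point2, value1, value2 for every ordered pair of points
  for (ss in s) for (p1s in p) for (p2s in p)
    if (p1s != p2s) print ss, p1s, p2s, sp[ss FS p1s], sp[ss FS p2s]
}' dataset | awk 'NF==5' | awk '$4!=$5'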

Please help me achieve the same in a smarter way.

Last edited by senhia83; 07-26-2016 at 11:34 AM.
 
