Getting the minimum of each column in a file


 
# 1  
Old 07-05-2011

Hi,

I have a file like:

Code:
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 33.097845 33.363764
0.000000 266.483441 262.519130 266.380993 274.989622 289.594799 309.523518 336.124848 372.386124 413.522043 429.984825 421.621810
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 33.401882
0.000000 97.477355 94.933661 97.928952 105.448316 117.024660 132.644757 153.381627 180.224859 215.463003 253.248762 271.624651
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 33.483917
0.000000 42.681075 39.326458 38.411189 39.743588 43.541023 49.794621 59.562272 73.726915 93.300969 119.534458 148.535368
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 33.557573
0.000000 21.870860 18.959399 17.609238 17.067321 17.430512 18.823996 21.660205 26.600318 34.705616 47.154923 64.389543
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 33.613809
0.000000 11.201860 9.529166 8.455750 7.880009 7.603165 7.643028 8.174741 9.391286 11.784547 16.131785 23.447829
0.000000 4.810876 4.115395 3.646008 3.345025 3.152457 3.040187 3.064264 3.305893 3.859102 5.033395 7.330489
0.000000 0.969459 1.345371 1.216664 1.142066 1.098538 1.056879 1.027665 1.060867 1.176333 1.434613 2.004218
0.000000 0.428585 0.430715 0.432730 0.433734 0.437866 0.436239 0.431684 0.433489 0.446501 0.483750 0.577336
0.000000 1.436791 1.321300 1.231155 0.742121 0.715842 0.691830 0.674073 0.661660 0.643040 0.617369 0.583815
0.000000 124.085533 3.050521 2.820307 2.597373 1.682511 1.566116 1.486880 1.420974 1.364140 1.305646 1.224344
0.000000 124.085533 5.293862 4.880496 4.508001 4.146966 2.936976 2.724846 2.580173 2.451738 2.339468 2.215425
0.000000 124.085533 7.673906 7.218768 6.715450 6.234255 5.768312 4.333641 4.033744 3.821608 3.626226 3.441136
0.000000 124.085533 9.954696 9.562780 9.063915 8.522168 7.919997 7.334804 5.793549 5.426340 5.129007 4.853716
0.000000 124.085533 11.957430 11.747540 11.365440 10.854155 10.188824 9.489209 8.816863 7.223774 6.776017 6.411012

I want to find the minimum value of each column. Is there a way to do it in one step? (Column 1 should be skipped.)
The way I can think of is:
Code:
# -n sorts numerically; -kN,N restricts the sort key to column N only
sort -n -k2,2 filename > output1
head -n 1 output1 > minimum_column_2

sort -n -k3,3 filename > output2
head -n 1 output2 > minimum_column_3

sort -n -k4,4 filename > output3
head -n 1 output3 > minimum_column_4

....etc
But I am wondering if there is a way to get the minimum of each column all in one file. (Note that for this file all the minima happen to be on the same line, but that is not the general case.)
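
For what it's worth, the per-column sort idea above could be wrapped in a loop instead of being written out once per column. This is only a sketch under some assumptions: col_min_by_sort is a made-up name, the columns are space-separated and numeric, and the column count is taken from the first line.

```shell
# Hypothetical helper: print "column minimum" for columns 2..N of a file.
col_min_by_sort() {
  file=$1
  # Count the fields on the first line to learn how many columns there are.
  ncols=$(head -n 1 "$file" | awk '{ print NF }')
  c=2                                   # start at 2 to skip column 1
  while [ "$c" -le "$ncols" ]; do
    # Sort numerically on column c only, keep the top line, print field c.
    min=$(LC_ALL=C sort -n -k"$c,$c" "$file" | head -n 1 | awk -v c="$c" '{ print $c }')
    echo "$c $min"
    c=$((c + 1))
  done
}
```

It would be called as: col_min_by_sort filename. It still sorts the file once per column, so it is more a tidy-up of the approach than a real one-step solution.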
# 2  
Old 07-05-2011
Code:
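#!/usr/bin/perl
# Print the minimum of every column except the first.
open I, "<", $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
while (<I>){
  $n++;                      # number of lines read
  chomp;
  @F=split ' ';              # split on any run of whitespace
  for ($i=0;$i<=$#F;$i++){
    push @{$c[$i]},$F[$i];   # $c[$i] collects the values of column $i+1
  }
}
close I;
for ($i=1;$i<=$#c;$i++){     # start at 1 to skip the first column
  $min[$i]=$c[$i][0];
  for ($j=1;$j<=$n-1;$j++){
    $min[$i]=$c[$i][$j] if $c[$i][$j]<$min[$i];
  }
}
for ($i=1;$i<=$#c;$i++){
  print "Minimum for column ", $i+1,": $min[$i]\n";
}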

Run it as: ./script.pl filename
# 3  
Old 07-05-2011
Code:
awk '{ for (i = 1; i <= NF; i++) if (!(i in A) || $i < A[i]) A[i] = $i }   # running minimum per column
END { for (i = 1; i in A; i++) printf "%s%s", A[i], FS; print "" }' input >output
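
Since the original question asked for column 1 to be skipped, here is a possible variant that starts at column 2. It is a sketch rather than a tested drop-in: col_min is a hypothetical wrapper name, and it assumes whitespace-separated numeric fields.

```shell
# Hypothetical wrapper: print the minimum of every column except the first.
col_min() {
  awk '
    {
      # Adding 0 forces a numeric comparison for every field.
      for (i = 2; i <= NF; i++)
        if (!(i in min) || $i + 0 < min[i] + 0) min[i] = $i
      if (NF > n) n = NF            # remember the widest row seen
    }
    END {
      # Print the minima on one line, separated by single spaces.
      for (i = 2; i <= n; i++) printf "%s%s", min[i], (i < n ? " " : "\n")
    }
  ' "$@"
}
```

It would be called as: col_min input >output. The values are printed exactly as they appear in the file, because min[i] stores the original field text.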


Last edited by ctsgnb; 07-05-2011 at 12:29 PM..
# 4  
Old 07-05-2011
Quote:
Originally Posted by bartus11
Run it as: ./script.pl filename

Is there a way to modify this script so that it also reports the line on which each minimum occurs?
# 5  
Old 07-05-2011
Hi.

Using one of the programs in "|stat":
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate minimum among other characteristics of columns of data.
# "|stat" statistics package:
# http://oldwww.acm.org/perlman/stat/

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C specimen

FILE=${1-data1}

pl " Input data file $FILE:"
cut -c1-78 "$FILE" |
specimen

pl " Results:"
validata < "$FILE"

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
GNU bash 3.2.39
specimen (local) 1.17

-----
 Input data file data1:
Edges: 5:0:5 of 19 lines in file "-"
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124
0.000000 266.483441 262.519130 266.380993 274.989622 289.594799 309.523518 336
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124
0.000000 97.477355 94.933661 97.928952 105.448316 117.024660 132.644757 153.38
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124
   ---
0.000000 124.085533 3.050521 2.820307 2.597373 1.682511 1.566116 1.486880 1.42
0.000000 124.085533 5.293862 4.880496 4.508001 4.146966 2.936976 2.724846 2.58
0.000000 124.085533 7.673906 7.218768 6.715450 6.234255 5.768312 4.333641 4.03
0.000000 124.085533 9.954696 9.562780 9.063915 8.522168 7.919997 7.334804 5.79
0.000000 124.085533 11.957430 11.747540 11.365440 10.854155 10.188824 9.489209

-----
 Results:
validata: 19 lines read
Col   N  NA alnum alpha   int float other  type   min   max 
  1  19   0     0     0    19    19     0   int     0     0 
  2  19   0     0     0     0    19     0 float 0.428585 266.483 
  3  19   0     0     0     0    19     0 float 0.430715 262.519 
  4  19   0     0     0     0    19     0 float 0.43273 266.381 
  5  19   0     0     0     0    19     0 float 0.433734 274.99 
  6  19   0     0     0     0    19     0 float 0.437866 289.595 
  7  19   0     0     0     0    19     0 float 0.436239 309.524 
  8  19   0     0     0     0    19     0 float 0.431684 336.125 
  9  19   0     0     0     0    19     0 float 0.433489 372.386 
 10  19   0     0     0     0    19     0 float 0.446501 413.522 
 11  19   0     0     0     0    19     0 float 0.48375 429.985 
 12  19   0     0     0     0    19     0 float 0.577336 421.622

See URL as noted in script comments for details on "|stat" ... cheers, drl
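
If "|stat" is not installed, the N, min and max columns of that report can be roughly approximated with plain awk. This sketch is not part of "|stat"; col_range is a made-up name, and it assumes every field is numeric.

```shell
# Hypothetical helper (not part of |stat): per-column count, min and max.
col_range() {
  awk '
    {
      for (i = 1; i <= NF; i++) {
        v = $i + 0                                   # force a numeric value
        if (!(i in cnt) || v < min[i]) min[i] = v
        if (!(i in cnt) || v > max[i]) max[i] = v
        cnt[i]++
      }
      if (NF > n) n = NF                             # widest row seen
    }
    END {
      print "Col N min max"
      for (i = 1; i <= n; i++) print i, cnt[i], min[i], max[i]
    }
  ' "$@"
}
```

It would be called as: col_range < data1. Because the values are stored as numbers, awk prints them with its default output format (about six significant digits), so long values come out rounded.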
# 6  
Old 07-05-2011
Quote:
Originally Posted by cosmologist
Is there a way to modify this script so that it also reports the line on which each minimum occurs?
Sure:
Code:
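#!/usr/bin/perl
# Print the minimum of every column except the first, with its line number.
open I, "<", $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
while (<I>){
  $n++;                      # number of lines read
  chomp;
  @F=split ' ';              # split on any run of whitespace
  for ($i=0;$i<=$#F;$i++){
    push @{$c[$i]},$F[$i];   # $c[$i] collects the values of column $i+1
  }
}
close I;
for ($i=0;$i<=$#c;$i++){
  $min[$i]=$c[$i][0];
  $line[$i]=1;               # the first line is the initial candidate
  for ($j=1;$j<=$n-1;$j++){
    if ($c[$i][$j]<$min[$i]){
      $min[$i]=$c[$i][$j];
      $line[$i]=$j+1;        # line numbers are 1-based
    }
  }
}
for ($i=1;$i<=$#c;$i++){
  print "Minimum for column ", $i+1,": $min[$i], line: $line[$i]\n";
}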

# 7  
Old 07-05-2011
Code:
[ctsgnb@shell ~/sand]$ cat tst
0.000000 124.085533 124.085533 124.085533 124.085533 0.085533 124.085533 124.085533 124.085533 124.085533 33.097845 33.363764
0.000000 266.483441 262.519130 266.380993 274.989622 289.594799 309.523518 336.124848 372.386124 413.522043 429.984825 421.621810
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 0.005533 124.085533 124.085533 124.085533 33.401882
0.000000 97.477355 94.933661 97.928952 105.448316 117.024660 132.644757 153.381627 180.224859 215.463003 253.248762 271.624651
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 33.483917
0.000000 42.681075 39.326458 38.411189 39.743588 43.541023 49.794621 59.562272 73.726915 93.300969 119.534458 148.535368
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 33.557573
0.000000 21.870860 18.959399 17.609238 17.067321 17.430512 18.823996 21.660205 26.600318 34.705616 47.154923 64.389543
0.000000 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 124.085533 33.613809
0.000000 11.201860 9.529166 8.455750 7.880009 7.603165 7.643028 8.174741 9.391286 11.784547 16.131785 23.447829
0.000000 4.810876 4.115395 3.646008 3.345025 3.152457 3.040187 3.064264 3.305893 3.859102 5.033395 7.330489
0.000000 0.969459 1.345371 1.216664 1.142066 1.098538 1.056879 1.027665 1.060867 1.176333 1.434613 0.004218
0.000000 0.428585 0.430715 0.432730 0.433734 0.437866 0.436239 0.431684 0.433489 0.446501 0.483750 0.577336
0.000000 1.436791 1.321300 1.231155 0.742121 0.715842 0.691830 0.674073 0.661660 0.643040 0.617369 0.583815
0.000000 124.085533 3.050521 2.820307 2.597373 1.682511 1.566116 1.486880 1.420974 1.364140 1.305646 1.224344
0.000000 124.085533 5.293862 4.880496 4.508001 4.146966 2.936976 2.724846 2.580173 2.451738 2.339468 2.215425
0.000000 124.085533 7.673906 7.218768 6.715450 6.234255 5.768312 0.333641 4.033744 3.821608 3.626226 3.441136
0.000000 124.085533 9.954696 9.562780 0.063915 8.522168 7.919997 7.334804 5.793549 5.426340 5.129007 4.853716
0.000000 124.085533 11.957430 11.747540 11.365440 10.854155 10.188824 9.489209 8.816863 7.223774 6.776017 6.411012
[ctsgnb@shell ~/sand]$ awk '{for(i=0;++i<=NF;) A[i]=(!(i in A))?$i":"NR:($i<A[i])?$i":"NR:A[i]}END{i=1;do {printf A[i] FS} while (++i in A);print z}' tst
0.000000:19 0.428585:13 0.430715:13 0.432730:13 0.063915:18 0.085533:1 0.436239:13 0.005533:3 0.433489:13 0.446501:13 0.483750:13 0.004218:12
[ctsgnb@shell ~/sand]$

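The result shows <minimum value>:<line number at which it appeared>. Note that A[i] now holds a string such as "0.428585:13", so awk compares $i<A[i] as strings rather than as numbers; that is why column 1 reports line 19 instead of line 1.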

---------- Post updated at 04:36 PM ---------- Previous update was at 04:35 PM ----------

Code:
# value and line number are kept in separate arrays so the comparison stays numeric
awk '{for(i=0;++i<=NF;) if(!(i in A)||$i+0<A[i]+0){A[i]=$i;L[i]=NR}}END{for(i=1;i in A;i++) printf "%s:%s%s",A[i],L[i],FS;print ""}' tst


Last edited by ctsgnb; 07-05-2011 at 12:30 PM..
 