Average columns based on header name
Post 302969067 by Don Cragun on Thursday 17th of March 2016 03:08:07 PM
I do not understand what output you are trying to produce.

For each input gene line do you want:
  1. one output line with the gene number and 10 averages from all 13 samples,
  2. thirteen output lines with the gene number and the 10 averages from one sample on each line, or
  3. one output line with the gene number and 130 averages where each set of 10 averages comes from one sample?
Can you show us the exact output you're hoping to produce from the data provided in your sample input for genes 1 and 2?

Was the data for gene 3 in your sample truncated, or will some inputs have missing fields that should be treated as zero values when computing the averages?
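
The answer to that last question changes the numbers, so here is a minimal Python sketch of the difference. It assumes a hypothetical layout (one header row whose names repeat across columns, and data rows that may be missing trailing fields). The names `average_by_header` and `missing_as_zero` are made up for illustration and do not come from the sample input.

```python
from collections import defaultdict


def average_by_header(header, row, missing_as_zero=False):
    """Average the values in `row` that share a header name.

    `header` and `row` are lists of strings; `row` may be shorter than
    `header` when trailing fields are missing (hypothetical layout).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for i, name in enumerate(header):
        if i < len(row):
            totals[name] += float(row[i])
            counts[name] += 1
        elif missing_as_zero:
            # Count the missing field as a 0 so it pulls the average down.
            counts[name] += 1
    # Names that never received a value are left out of the result.
    return {name: totals[name] / counts[name] for name in counts if counts[name]}


header = ["s1", "s1", "s2", "s2"]
full_row = ["2", "4", "6", "8"]
short_row = ["2", "4", "6"]  # last field missing

print(average_by_header(header, full_row))
print(average_by_header(header, short_row))
print(average_by_header(header, short_row, missing_as_zero=True))
```

With the short row, the `s2` average is 6.0 when the missing field is skipped and 3.0 when it is counted as zero, which is why the intended treatment needs to be stated.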

SWISS::GNs(3pm) 					User Contributed Perl Documentation					   SWISS::GNs(3pm)

Name
       SWISS::GNs.pm

Description
       SWISS::GNs represents the GN lines within a SWISS-PROT + TrEMBL entry as specified in the user manual
	http://www.expasy.org/sprot/userman.html . The GNs object is a container object which holds a list of SWISS::GeneGroup objects.

Inherits from
       SWISS::ListBase.pm

Attributes
       "list"
	     Each list element is a SWISS::GeneGroup object.

       "and" (deprecated, for old format only)
	     Delimiter used between genes. Defaults to " AND ".

       "or" (deprecated, for old format only)
	     Delimiter used between gene names. Defaults to " OR ".

Methods
   Standard methods
       new
       fromText
       toText

   Reading/Writing methods
       text [($newText)]
	   Sets the text of the GN line to the parameter if it is present, and returns the (unwrapped) text of the line.  Also sets the 'and' and
	   'or' delimiters to the first occurrences of the words "AND" and "OR" in the line, conserving the case.

       lowercase (deprecated, for old format only)
	   Sets the GNs::and and GNs::or delimiters to their lower case values.

       uppercase (deprecated, for old format only)
	   Sets the GNs::and and GNs::or delimiters to their upper case values.

       getFirst()
	   Returns first gene name in gene line

       getTags($target)
	   Returns evidence tags associated with $target

	   $target is a string

       isPresent($target)
	   Returns 1 if $target is present in the GN line

	   $target is a string

       needsReCasing($target)
	   If $target is present in the GN line, but wrongly cased, method returns the matching name in its current case

	   $target is a string

       replace($newName, $target, $evidenceTag)
	   Replaces the first GN object in the GN line whose text attribute is $target with a new GN object whose text attribute is set to
	   $newName and whose evidenceTags attribute is set using values obtained by splitting $evidenceTag on /, / (if the name is not being
	   changed, programs should keep the old tag and add the new tag).  Does nothing if $target is not found.

       delete($target)
	   Removes synonym/single-member gene group matching $target. Note that if a "Name" is deleted, the first "Synonym" will be promoted to
	   "Name"

       addAsNewSynonym($newName, $target, $evidenceTag, $location)
	   Adds a new GN object (with text attribute set to $newName, and evidenceTags attribute set to ($evidenceTag)), as a synonym to the
	   first gene group in which $target is a gene name.  Does nothing if $target is not found.  Will not add a duplicate gene name.
	   $location determines where in the gene group the new object is added: if $location == 1, 2, 3, ..., it is added in the 1st, 2nd,
	   3rd, ... position; if $location == 0, it is added before $target; if $location == -1, it is added after $target (default); if
	   $location == -2, it is added at the end of the gene group.  Note that if the new synonym is inserted in the first position, it will
	   become the "Name" and the previous "Name" will be downgraded to first "Synonym"
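
The $location rules above can be sketched as follows. This is a Python model under stated assumptions, not the module's Perl code: a gene group is represented as a plain list of names whose first element plays the role of "Name", the function name `add_as_new_synonym` is hypothetical, duplicate checking is assumed to be case-sensitive, and raising an error for other negative values is an assumption because the text above does not say what happens in that case.

```python
def add_as_new_synonym(group, new_name, target, location=-1):
    """Return a copy of `group` with `new_name` inserted per `location`.

    `group` is a plain list of gene names standing in for a
    SWISS::GeneGroup; this is an illustration, not the module's API.
    """
    if target not in group or new_name in group:
        return list(group)  # target missing or duplicate name: do nothing
    result = list(group)
    at = result.index(target)
    if location >= 1:
        result.insert(location - 1, new_name)  # 1-based absolute position
    elif location == 0:
        result.insert(at, new_name)            # just before target
    elif location == -1:
        result.insert(at + 1, new_name)        # just after target (default)
    elif location == -2:
        result.append(new_name)                # end of the gene group
    else:
        # Assumption: behaviour for other values is not documented.
        raise ValueError(f"unsupported location: {location}")
    return result


group = ["CysA1", "CysA", "Rv3117"]
print(add_as_new_synonym(group, "NewSyn", "CysA"))      # after target
print(add_as_new_synonym(group, "NewSyn", "CysA", 1))   # becomes the "Name"
```

Inserting at position 1 puts the new name first, which corresponds to the promotion to "Name" described above.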

       addAsNewGeneGroup($newName, $target, $evidenceTag, $location)
	   Adds a new GeneGroup object, comprising 1 GN object (with text attribute set to $newName, and evidenceTags attribute set to
	   ($evidenceTag)). Will not add a duplicate gene name.  $location and $target determine where in the GNs line the new group is added:
	   if $location == 1, 2, 3, ..., it is added in the 1st, 2nd, 3rd, ... position; if $location == 0, it is added before $target; if
	   $location == -1, it is added after $target (default); if $location == -2, it is added at the end of the GNs line.  Does nothing if
	   $target is not found, and $location == 0 or -1; otherwise $target does not need to be set.

       replaceGeneGroup($newGeneGroup, $target)
	   Replaces the first gene group containing $target with $newGeneGroup.  Creating the $newGeneGroup correctly is the user's responsibility

       getGeneGroup($target)
	   Returns the first gene group that contains $target

       setToOr()
	   Returns a new GNs object, but with all GNs objects in a single gene group.  Needed when adding 'C' to 'A and B', when the relationship
	   of 'C' to 'A' and 'B' is unknown: ' or ' is the default delimiter used universally for TrEMBL entries
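
The effect of setToOr() can be pictured with a small Python model. It assumes a GNs object is represented as a list of gene groups, each a list of names. The function name `set_to_or` is hypothetical and this is not the module's implementation.

```python
def set_to_or(gene_groups):
    """Return a new structure with every name merged into one gene group.

    `gene_groups` is a list of lists of names; the input is left unchanged.
    """
    merged = [name for group in gene_groups for name in group]
    return [merged]


groups = [["A"], ["B"]]       # models 'A and B'
flat = set_to_or(groups)      # models 'A or B'
flat[0].append("C")           # 'C' joins the single group: 'A or B or C'
print(flat)
print(groups)                 # the original grouping is untouched
```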

TRANSITION
       The format of the GN line will change in 2004 from:

	GN   (CYSA1 OR CYSA OR RV3117 OR MT3199 OR MTCY164.27) AND (CYSA2 OR
	GN   RV0815C OR MT0837 OR MTV043.07C).

       to:

	GN   Name=CysA1; Synonyms=CysA; OrderedLocusNames=Rv3117, MT3199;
	GN   ORFNames=MtCY164.27;
	GN   and
	GN   Name=CysA2; OrderedLocusNames=Rv0815c, MT0837; ORFNames=MTV043.07c;

       This module supports both formats. To convert an entry from the old to the new format, do:

	$entry->GNs->is_old_format(0);

perl v5.10.1 2006-01-26 SWISS::GNs(3pm)
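
To make the structure of the old format concrete, here is a rough Python sketch that splits an old-format GN value into gene groups. It is an illustration only and not how SWISS::GNs parses entries. The function name `parse_old_gn` is hypothetical, and the sketch assumes that names never contain the words AND or OR as separate tokens and never begin or end with parentheses.

```python
import re


def parse_old_gn(text):
    """Split an old-format GN value into gene groups (lists of names)."""
    # Drop the leading "GN" line codes and join wrapped lines into one.
    body = " ".join(
        re.sub(r"^GN\s+", "", line.strip()) for line in text.splitlines()
    )
    # Remove the terminating period of the GN value.
    body = body.strip().rstrip(".")
    groups = []
    for chunk in re.split(r"\s+AND\s+", body, flags=re.IGNORECASE):
        chunk = chunk.strip().strip("()")
        groups.append(re.split(r"\s+OR\s+", chunk, flags=re.IGNORECASE))
    return groups


old = (
    "GN   (CYSA1 OR CYSA OR RV3117 OR MT3199 OR MTCY164.27) AND (CYSA2 OR\n"
    "GN   RV0815C OR MT0837 OR MTV043.07C)."
)
for group in parse_old_gn(old):
    print(group)
```

Each inner list corresponds to one parenthesised gene group in the old format, and each such group becomes one `Name=...;` block in the new format.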