Filling positions based on consensus character
Post 302434192 by bartus11 on Thursday 1st of July 2010 06:32:54 PM
For now, AWK. If it turns out to be too slow, I can come up with some Perl, but not until tomorrow. Note that this is one huge command; you might consider breaking it into separate awk scripts:
Code:
awk '/>/{fr=$3;getline;n=split ($0,a,""); for (i=1;i<=n;i++) b[i"-"a[i]]+=fr}\
END{for (i in b) {split (i,c,"-"); if (d[c[1]]<=b[i]){e[c[1]]=c[2];d[c[1]]=b[i]}}\
for (i in e) print i" "e[i]}' file | awk 'NR==FNR{a[$1]=$2;next}\
{n=split($0,b,"");for (i=1;i<=n;i++) if (b[i]=="-") b[i]=a[i]; for (i=1;i<=n;i++) printf "%s", b[i];\
printf "\n"}' - file > outfile
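
If the one-liner is hard to follow, here is a rough sketch of the same idea as a single awk program that reads the file twice. It is not a drop-in replacement for the command above. It assumes two-line records (a header like ">name Freq N" followed by one sequence line), it ignores dashes when tallying votes, it leaves header lines untouched, and it uses substr() instead of split() with an empty separator so it does not depend on that extension. The file names sample.fa and filled.fa and the sample data are made up for illustration.

```shell
# Hypothetical sample input: header ">name Freq N", then one sequence line.
cat > sample.fa <<'EOF'
>s1 Freq 4
TAGA
>s2 Freq 15
T-GC
>s3 Freq 2
-ACA
EOF

awk '
NR == FNR {                          # first read of the file: tally weighted votes
    if (/^>/) { fr = $3; next }      # header line: remember its frequency
    n = length($0)
    for (i = 1; i <= n; i++) {
        ch = substr($0, i, 1)
        if (ch != "-") {             # assumption: a dash never counts as a vote
            votes[i, ch] += fr
            # totals only grow, so the running maximum ends up as the final maximum
            if (votes[i, ch] > best[i]) { best[i] = votes[i, ch]; cons[i] = ch }
        }
    }
    next
}
/^>/ { print; next }                 # second read: headers pass through unchanged
{
    out = ""
    n = length($0)
    for (i = 1; i <= n; i++) {
        ch = substr($0, i, 1)
        # fill a dash only if some sequence voted at this position
        out = out ((ch == "-" && (i in cons)) ? cons[i] : ch)
    }
    print out
}' sample.fa sample.fa > filled.fa
```

With the sample data, the sequence lines in filled.fa come out as TAGA, TAGC and TACA. Naming the input file twice avoids the pipe and the intermediate "position character" list, at the cost of reading the file two times, which the original command does as well.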

 

IGAWK(1)                         Utility Commands                         IGAWK(1)

NAME
       igawk - gawk with include files

SYNOPSIS
       igawk [ all gawk options ] -f program-file [ -- ] file ...
       igawk [ all gawk options ] [ -- ] program-text file ...

DESCRIPTION
       Igawk is a simple shell script that adds the ability to have
       ``include files'' to gawk(1).

       AWK programs for igawk are the same as for gawk, except that, in
       addition, you may have lines like

              @include getopt.awk

       in your program to include the file getopt.awk from either the
       current directory or one of the other directories in the search path.

OPTIONS
       See gawk(1) for a full description of the AWK language and the
       options that gawk supports.

EXAMPLES
       cat << EOF > test.awk
       @include getopt.awk

       BEGIN {
            while (getopt(ARGC, ARGV, "am:q") != -1)
                 ...
       }
       EOF

       igawk -f test.awk

SEE ALSO
       gawk(1)

       Effective AWK Programming, Edition 1.0, published by the Free
       Software Foundation, 1995.

AUTHOR
       Arnold Robbins (arnold@skeeve.com).

ATTRIBUTES
       See attributes(5) for descriptions of the following attributes:

       +--------------------+-----------------+
       | ATTRIBUTE TYPE     | ATTRIBUTE VALUE |
       +--------------------+-----------------+
       |Availability        | SUNWgawk        |
       +--------------------+-----------------+
       |Interface Stability | Volatile        |
       +--------------------+-----------------+

NOTES
       Source for gawk is available on http://opensolaris.org.

Free Software Foundation            Nov 3 1999                            IGAWK(1)