I'm better with awk than perl. This seems to do what you want:
Code:
awk '
FNR == NR {
    # First pass: add up frequency-weighted gap counts for each column.
    if($0 ~ /^>/)
        t += f = $NF    # f = last field of the header; t = running total.
    else for(i = length($0); i > 0; i--)
        if(substr($0, i, 1) == "-")
            cc[i] += f
    next
}
/^>/ {  # Second pass: print header lines unchanged.
    print
    next
}
FNR == 2 {
    # Forget columns with gap frequency <= 50%; cc[] now lists columns to drop.
    for(i in cc)
        if((cc[i] / t) <= .5)
            delete cc[i]
    # Create arrays for low end and counts for substrings to be printed...
    for(i = 1; i <= length($0); i++) {
        if(low == 0) {
            # Find low end of range to keep.
            if(!(i in cc)) {
                low = i
                count = 1
            }
        } else {    # Look for end of range to keep.
            if(!(i in cc)) {
                # Keep this column.
                count++
            } else {    # Save range and set up to look for next range.
                sf[++subc] = low
                sl[subc] = count
                low = count = 0
            }
        }
    }
    if(low) {
        # Set up entry to print last substring.
        sf[++subc] = low
        sl[subc] = count
    }
}
{   # Print selected substrings for non-header lines.
    out=""
    for(i = 1; i <= subc; i++)
        out = out substr($0, sf[i], sl[i])
    print out
}' infile infile
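To see what the first pass computes, here is a minimal sketch that only reports the weighted gap percentage per column. The sample file and its contents are made up for illustration; I'm assuming, as the script above does, that each header starts with ">" and ends with a frequency, and that each sequence sits on a single line.

```shell
# Hypothetical sample alignment: the last header field is the frequency.
sample=$(mktemp)
cat > "$sample" << 'EOF'
>seq1 3
AC-GT
>seq2 1
A--GT
>seq3 1
ACTG-
EOF

# First pass only: print column number, weighted gap percentage, and verdict.
gap_report=$(awk '
/^>/ {                      # Header: remember the frequency, add to the total.
    f = $NF
    t += f
    next
}
{
    if (length($0) > width)
        width = length($0)
    for (i = 1; i <= length($0); i++)
        if (substr($0, i, 1) == "-")
            cc[i] += f      # Weight each gap by the frequency of its record.
}
END {
    for (i = 1; i <= width; i++) {
        pct = cc[i] * 100 / t
        verdict = (pct > 50) ? "drop" : "keep"
        printf "%d %d%% %s\n", i, pct, verdict
    }
}' "$sample")
printf '%s\n' "$gap_report"
rm -f "$sample"
```

With this sample only column 3 is above 50% (4 of a total weight of 5), so the full script above should print the sequence lines as ACGT, A-GT and ACG-.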
IGAWK(1) Utility Commands IGAWK(1)
NAME
igawk - gawk with include files
SYNOPSIS
igawk [ all gawk options ] -f program-file [ -- ] file ...
igawk [ all gawk options ] [ -- ] program-text file ...
DESCRIPTION
Igawk is a simple shell script that adds the ability to have ``include files'' to gawk(1).
AWK programs for igawk are the same as for gawk, except that, in addition, you may have lines like
@include getopt.awk
in your program to include the file getopt.awk from either the current directory or one of the other directories in the search path.
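The idea of include expansion can be illustrated with plain awk. This is only a rough sketch of the concept, not igawk's actual implementation: the file names and the twice() helper are hypothetical, only one directory is searched, and nested includes are not followed.

```shell
workdir=$(mktemp -d)

# Hypothetical library file providing one helper function.
cat > "$workdir/twice.awk" << 'EOF'
function twice(x) { return 2 * x }
EOF

# Hypothetical main program that pulls the helper in with @include.
cat > "$workdir/main.awk" << 'EOF'
@include twice.awk
BEGIN { print twice(21) }
EOF

# Simplified expansion: replace each @include line with the text of the
# named file, then run the expanded program with ordinary awk.
awk -v dir="$workdir" '
$1 == "@include" {
    file = dir "/" $2
    while ((getline line < file) > 0)
        print line
    close(file)
    next
}
{ print }' "$workdir/main.awk" > "$workdir/expanded.awk"

include_result=$(awk -f "$workdir/expanded.awk")
printf '%s\n' "$include_result"
rm -rf "$workdir"
```

The expanded program is an ordinary AWK program, so any awk can run it once the include lines have been replaced.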
OPTIONS
See gawk(1) for a full description of the AWK language and the options that gawk supports.
EXAMPLES
cat << EOF > test.awk
@include getopt.awk
BEGIN {
while (getopt(ARGC, ARGV, "am:q") != -1)
...
}
EOF
igawk -f test.awk
SEE ALSO
gawk(1)
Effective AWK Programming, Edition 1.0, published by the Free Software Foundation, 1995.
AUTHOR
Arnold Robbins (arnold@skeeve.com).
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+--------------------+-----------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+--------------------+-----------------+
|Availability | SUNWgawk |
+--------------------+-----------------+
|Interface Stability | Volatile |
+--------------------+-----------------+
NOTES
Source for gawk is available on http://opensolaris.org.
Free Software Foundation Nov 3 1999 IGAWK(1)