Full Discussion: Gaps and frequencies
Post 302949767 by Don Cragun on Wednesday 15th of July 2015 11:57:34 PM
I'm better with awk than perl. This seems to do what you want:
Code:
awk '
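FNR == NR {
	# First pass: sum the frequencies (last field of each header line) in t
	# and, for each column, sum in cc[] the frequencies of the sequences
	# that have a gap ("-") in that column.
	if($0 ~ /^>/)
		t += f = $NF
	else	for(i = length($0); i > 0; i--)
			if(substr($0, i, 1) == "-")
				cc[i] += f
	next
}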
/^>/ {	# Print header lines unchanged.
	print
	next
}
FNR == 2 {	# First sequence line of the second pass: choose columns to keep.
	# Filter out column counts with frequency <= 50%...
	for(i in cc)
		if((cc[i] / t) <= .5)
			delete cc[i]
	# Create arrays for low end and counts for substrings to be printed...
	for(i = 1; i <= length($0); i++) {
		if(low == 0) {
			# Find low end of range to keep.
			if(!(i in cc)) {
				low = i
				count = 1
			}
		} else {# Look for end of range to keep.
			if(!(i in cc)) {
				# Keep this column.
				count++
			} else {# Save range and set up to look for next range.
				sf[++subc] = low
				sl[subc] = count
				low = count = 0
			}
		}
	}
	if(low) {
		# Set up entry to print last substring.
		sf[++subc] = low
		sl[subc] = count
	}
}
{	# Print selected substrings for non-header lines.
	out=""
	for(i = 1; i <= subc; i++)
		out = out substr($0, sf[i], sl[i])
	print out
}' infile infile
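
If you want to check which columns the script will drop before running it, the first pass can be run on its own. This is only a rough sketch: the sample data is made up, and it assumes (as the script above does) that the last whitespace-separated field on each ">" header line is that sequence's frequency. The temporary file and the gap_report variable are just names picked for this example.

```shell
# Hypothetical sample alignment; the last header field is assumed to be
# the frequency of the sequence that follows it.
sample=$(mktemp)
cat > "$sample" <<'EOF'
>seq1 60
AC-GT
>seq2 30
A--GT
>seq3 10
ACGGT
EOF

# Report the frequency-weighted gap fraction of every column that has a gap.
gap_report=$(awk '
/^>/ {	# Header line: remember this frequency and add it to the total.
	t += f = $NF
	next
}
{	# Sequence line: weight each gap by the frequency of its sequence.
	for(i = 1; i <= length($0); i++)
		if(substr($0, i, 1) == "-")
			cc[i] += f
	if(length($0) > w)
		w = length($0)
}
END {	# Print column number and gap fraction, in column order.
	for(i = 1; i <= w; i++)
		if(i in cc)
			printf "%d %.2f\n", i, cc[i] / t
}' "$sample")

echo "$gap_report"
rm -f "$sample"
```

With this sample, only column 3 has a gap fraction above the 50% cutoff, so that is the only column the full script would remove.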

