Post 302835661 by pathunkathunk on Monday 22nd of July 2013 11:20:54 PM
Filter file by length, looking only at lines that don't begin with ">"

I have a file that stores data in pairs of lines, following this format:
line 1: header (preceded by ">")
line 2: sequence

Example.txt:
Code:
>seq1 name
GATTGATGTTTGAGTTTTGGTTTTT
>seq2 name
TTTTCTTC

I want to keep only the sequences that are shorter than 11 characters, together with their corresponding headers. Desired output:
Code:
>seq2 name
TTTTCTTC

I can test the length of each line and, when a line is shorter than 11 characters, print it along with the line before it (its header). The problem I'm having is ignoring the headers (i.e. lines beginning with ">") when I do the length test, since a header can itself be shorter than 11 characters.

For example:
Code:
awk '{lines[NR] = $0} length($0) < 11 {print lines [NR-1]; print lines [NR]} ' example.txt

This gives me:
Code:
>seq1 name
GATTGATGTTTGAGTTTTGGTTTTT
>seq2 name
>seq2 name
TTTTCTTC

How do I tell awk to ignore lines beginning with ">" when it checks the length?
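
One possible approach, offered as a minimal sketch rather than the definitive answer: in the sample, both header lines (">seq1 name", ">seq2 name") are 10 characters long, so they pass the `length($0) < 11` test themselves. Handling header lines in their own rule and ending it with `next` keeps them away from the length test. The variable names `header`, `result` and `tmpdir` are my own choices for illustration, and the sample file is recreated in a temporary directory so the snippet runs on its own. It assumes the strict two-line header/sequence layout described above.

```shell
# Recreate the sample data from the post in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/example.txt" <<'EOF'
>seq1 name
GATTGATGTTTGAGTTTTGGTTTTT
>seq2 name
TTTTCTTC
EOF

# Rule 1: a header line is stored and skipped, so it never reaches rule 2.
# Rule 2: a sequence line shorter than 11 characters is printed with its header.
result=$(awk '
    /^>/            { header = $0; next }
    length($0) < 11 { print header; print $0 }
' "$tmpdir/example.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Against your own file the awk part reduces to a one-liner: `awk '/^>/ {header = $0; next} length($0) < 11 {print header; print}' example.txt`. Because only the most recent header is remembered, there is no need to store every line in an array.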

Last edited by Scrutinizer; 07-23-2013 at 02:55 AM.. Reason: code tags also for data samples. Do not use quote tags
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.