03-14-2016
Select lines based on character length
Hi,
I've got a file like this:
22 22:35645163:T:<CN0>:0 0 35645163 T <CN0>
22 rs140738445:20902439:TTTTTTTG:T 0 20902439 T TTTTTTTG
22 rs149602065:40537763:TTTTTTG:T 0 40537763 T TTTTTTG
22 rs71670155:50538408:TTTTTTG:T 0 50538408 T TTTTTTG
22 rs147956527:27899116:TTTTTG:T 0 27899116 T TTTTTG
22 rs112169882:26309326:T:TTTTTC 0 26309326 T TTTTTC
22 rs112170669:29942398:T:TTTTTC 0 29942398 T TTTTTC
22 rs148467612:32268721:TTTTTA:T 0 32268721 T TTTTTA
22 rs71806779:32681699:TTTTTA:T 0 32681699 T TTTTTA
22 rs7291429 0 17294251 G T
22 rs2192431:17303596:T:G 0 17303596 G T
22 rs175140 0 17306104 G T
22 rs175147:17309362:G:T 0 17309362 G T
22 rs12628206:17316990:T:G 0 17316990 G T
22 rs7510758:17432482:T:G 0 17432482 G T
22 rs4819923:17433210:T:G 0 17433210 G T
I need to print only the lines in which columns 5 and 6 each contain a single character, so the output should look like this:
22 rs7291429 0 17294251 G T
22 rs2192431:17303596:T:G 0 17303596 G T
22 rs175140 0 17306104 G T
22 rs175147:17309362:G:T 0 17309362 G T
22 rs12628206:17316990:T:G 0 17316990 G T
22 rs7510758:17432482:T:G 0 17432482 G T
22 rs4819923:17433210:T:G 0 17433210 G T
So far, I've tried to use awk:
awk '{print $5,$6}' in.file | awk 'length <3' > out.file
Unfortunately, (1) I do not get the whole line in the end, and (2) it does not select the lines that I need.
Any help would be greatly appreciated!
Many thanks!
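One possible approach, offered as a sketch rather than a definitive answer: test the length of each field directly inside a single awk call, so the whole line is printed whenever both conditions hold. The attempt above fails because `print $5,$6` emits the two fields joined by a space, so even "G T" is three characters long and `length <3` never matches. The sketch assumes the fields are separated by whitespace; the sample written to in.file here is just a subset of the posted data, used for illustration.

```shell
# Build a small sample from the posted data (whitespace-separated fields).
cat > in.file <<'EOF'
22 22:35645163:T:<CN0>:0 0 35645163 T <CN0>
22 rs140738445:20902439:TTTTTTTG:T 0 20902439 T TTTTTTTG
22 rs7291429 0 17294251 G T
22 rs2192431:17303596:T:G 0 17303596 G T
EOF

# A pattern with no action prints the whole line when the pattern is true:
# keep lines whose 5th and 6th fields are each exactly one character long.
awk 'length($5) == 1 && length($6) == 1' in.file > out.file

cat out.file
```

If the real file is tab-delimited and may contain empty fields, adding `-F'\t'` to the awk call is probably the safer choice, though that is an assumption about the input.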
HTML::Filter(3pm) User Contributed Perl Documentation HTML::Filter(3pm)
NAME
HTML::Filter - Filter HTML text through the parser
NOTE
This module is deprecated. The "HTML::Parser" now provides the functionality of "HTML::Filter" much more efficiently with the "default"
handler.
SYNOPSIS
require HTML::Filter;
$p = HTML::Filter->new->parse_file("index.html");
DESCRIPTION
"HTML::Filter" is an HTML parser that by default prints the original text of each HTML element (a slow version of cat(1) basically). The
callback methods may be overridden to modify the filtering for some HTML elements, and you can override the output() method, which is called to
print the HTML text.
"HTML::Filter" is a subclass of "HTML::Parser". This means that the document should be given to the parser by calling the $p->parse() or
$p->parse_file() methods.
EXAMPLES
The first example is a filter that will remove all comments from an HTML file. This is achieved by simply overriding the comment method to
do nothing.
    package CommentStripper;
    require HTML::Filter;
    @ISA=qw(HTML::Filter);
    sub comment { }   # ignore comments
The second example shows a filter that will remove any <TABLE>s found in the HTML file. We specialize the start() and end() methods to
count table tags and then make output not happen when inside a table.
    package TableStripper;
    require HTML::Filter;
    @ISA=qw(HTML::Filter);
    sub start
    {
        my $self = shift;
        $self->{table_seen}++ if $_[0] eq "table";
        $self->SUPER::start(@_);
    }
    sub end
    {
        my $self = shift;
        $self->SUPER::end(@_);
        $self->{table_seen}-- if $_[0] eq "table";
    }
    sub output
    {
        my $self = shift;
        unless ($self->{table_seen}) {
            $self->SUPER::output(@_);
        }
    }
If you want to collect the parsed text internally you might want to do something like this:
    package FilterIntoString;
    require HTML::Filter;
    @ISA=qw(HTML::Filter);
    sub output { push(@{$_[0]->{fhtml}}, $_[1]) }
    sub filtered_html { join("", @{$_[0]->{fhtml}}) }
SEE ALSO
HTML::Parser
COPYRIGHT
Copyright 1997-1999 Gisle Aas.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.14.2 2008-04-04 HTML::Filter(3pm)