03-14-2016
Select lines based on character length
Hi,
I've got a file like this:
22 22:35645163:T:<CN0>:0 0 35645163 T <CN0>
22 rs140738445:20902439:TTTTTTTG:T 0 20902439 T TTTTTTTG
22 rs149602065:40537763:TTTTTTG:T 0 40537763 T TTTTTTG
22 rs71670155:50538408:TTTTTTG:T 0 50538408 T TTTTTTG
22 rs147956527:27899116:TTTTTG:T 0 27899116 T TTTTTG
22 rs112169882:26309326:T:TTTTTC 0 26309326 T TTTTTC
22 rs112170669:29942398:T:TTTTTC 0 29942398 T TTTTTC
22 rs148467612:32268721:TTTTTA:T 0 32268721 T TTTTTA
22 rs71806779:32681699:TTTTTA:T 0 32681699 T TTTTTA
22 rs7291429 0 17294251 G T
22 rs2192431:17303596:T:G 0 17303596 G T
22 rs175140 0 17306104 G T
22 rs175147:17309362:G:T 0 17309362 G T
22 rs12628206:17316990:T:G 0 17316990 G T
22 rs7510758:17432482:T:G 0 17432482 G T
22 rs4819923:17433210:T:G 0 17433210 G T
I need to print only the lines where columns 5 and 6 each contain exactly one character. So, the output should look like this:
22 rs7291429 0 17294251 G T
22 rs2192431:17303596:T:G 0 17303596 G T
22 rs175140 0 17306104 G T
22 rs175147:17309362:G:T 0 17309362 G T
22 rs12628206:17316990:T:G 0 17316990 G T
22 rs7510758:17432482:T:G 0 17432482 G T
22 rs4819923:17433210:T:G 0 17433210 G T
So far, I've tried to use awk:
awk '{print $5,$6}' in.file | awk 'length <3' > out.file
Unfortunately, this (1) does not give me the whole line in the end and (2) does not select the lines that I need.
Any help would be greatly appreciated!
Many thanks!
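One possible approach, offered as a sketch rather than a definitive answer: the attempt above loses the rest of the line in the first awk, and the second awk only ever sees strings like "G T", which are three characters long including the space, so `length <3` never matches. Testing the length of each field directly in a single awk call avoids both problems. The example below assumes whitespace-separated fields and reuses the `in.file` and `out.file` names from the post; the sample rows are a small subset of the data shown above.

```shell
# Build a small sample input from rows in the post (file name taken from the post).
cat > in.file <<'EOF'
22 22:35645163:T:<CN0>:0 0 35645163 T <CN0>
22 rs140738445:20902439:TTTTTTTG:T 0 20902439 T TTTTTTTG
22 rs112169882:26309326:T:TTTTTC 0 26309326 T TTTTTC
22 rs7291429 0 17294251 G T
22 rs2192431:17303596:T:G 0 17303596 G T
EOF

# A pattern with no action prints the whole line when the pattern is true.
# Keep lines where field 5 and field 6 are each exactly one character long.
awk 'length($5) == 1 && length($6) == 1' in.file > out.file

cat out.file
```

Run against the full file from the post, the same awk line should keep the seven rows listed in the desired output, since those are the only rows where both columns hold a single character.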