Filter records in a huge text file from a filter text file
Hi Folks,
I have a text file with lots of rows with duplicates in the first column, i want to filter out records based on filter columns in a different filter text file.
bash scripting is what i need.
Data.txt
Filter file contains:
Filter.txt
ouput file should be:
Out.txt
Appreciate your help guys.
Hi All
We have got a text file, which has data dumped from 60 tables.
From these 60 tables of data we need data from 4 tables only.
I tried assigning line numbers to filter out data, but it is not working as intended.
below is the sample file
----Table1-----
3,dfs,43,df
4,sd,5,edd... (18 Replies)
I have a file which is having fileds separtaed by delimiter.
Ex:
C;4498;qwa;cghy;;;;40;;222122
C;4498;sample;city;;;;34 2;;222123
C;4498;qwe;xcbv;;;;34-2;;222124
C;4498;jj;sffz;;;;41;;222120
C;4498;eert;qwq;;;;34 A;;222125
C;4498;jj;szxzzd;;;;34;;222127
out of these records I... (3 Replies)
hi all,
I have this file with some user data.
example:
$cat myfile.txt
FName|LName|Gender|Company|Branch|Bday|Salary|Age
aaaa|bbbb|male|cccc|dddd|19900814|15000|20|
eeee|asdg|male|gggg|ksgu|19911216|||
aara|bdbm|male|kkkk|acke|19931018||23|
asad|kfjg|male|kkkc|gkgg|19921213|14000|24|... (4 Replies)
I want to filter records in one of my file using AWK command (or anyother command). I am using the below code
awk -F@ '$1=="0003"&&"$2==20100402" print {$0}' $INPUT > $OUTPUT
I want to pass the 0003 and 20100402 values through a variable. How can I do this?
Any help is much... (1 Reply)
I have a .kml file. So I want filter the .kml to get only the tags that have this numeric codes that they are in a text file
11951
11952
74014
11964
11965
11969
11970
11971
11972
60149
74018
74023
86378
11976
11980
11983
11984
11987 (5 Replies)
Hello,
I am very now to this, hope you can help,
I am looking into editing a file in Solaris, with dinamic collums (lenght varies) and I need 2 things to be made, the fist is to filter the first column and third column from the file bellow file.txt, and create a new file with the 2 filtered... (8 Replies)
Hello,
I want to filter records of a file if they fall in range associated with a second file. First the chr number (2nd col of 1st file and 1st col of 2nd file) needs to be matched. Then if the 3rd col of the first file falls within any of the ranges specified by the 2nd and 3rd cols , then... (4 Replies)
Hi
Can someone tell me which script will work best (in terms of speed and simplicity to write and run) for a large text file to filter all the lines with a minimum specified length of words ? A sample script with be definitely of great help !!! Thanks in advance. :) (4 Replies)
Not sure if this is the correct forum for this question. I have two files. file1.zip, file2
Input:
file1.zip
col1, col2 , col3
a , b , 0:0:0:0:0:c436:9346:d40b
x, y, 0:0:0:0:0:880:39f9:c9a7
m, n , 0:0:0:0:0:80c7:9161:fe00
file2.txt
col1
c4:36:93:46:d4:0b... (1 Reply)
Dear Experts,
I have a log file that contains a timestamp, I would like to filter record from that file based on timestamp. For example refer below file -
cat sample.txt
Jan 19 20:51:48 mukul-Vostro-14-3468 systemd: pam_unix(systemd-user:session): session opened for user root by (uid=0)... (6 Replies)
Discussion started by: mukulverma2408
6 Replies
LEARN ABOUT DEBIAN
xml::sax::byrecord
XML::SAX::ByRecord(3pm) User Contributed Perl Documentation XML::SAX::ByRecord(3pm)NAME
XML::SAX::ByRecord - Record oriented processing of (data) documents
SYNOPSIS
use XML::SAX::Machines qw( ByRecord ) ;
my $m = ByRecord(
"My::RecordFilter1",
"My::RecordFilter2",
...
{
Handler => $h, ## optional
}
);
$m->parse_uri( "foo.xml" );
DESCRIPTION
XML::SAX::ByRecord is a SAX machine that treats a document as a series of records. Everything before and after the records is emitted as-
is while the records are excerpted in to little mini-documents and run one at a time through the filter pipeline contained in ByRecord.
The output is a document that has the same exact things before, after, and between the records that the input document did, but which has
run each record through a filter. So if a document has 10 records in it, the per-record filter pipeline will see 10 sets of (
start_document, body of record, end_document ) events. An example is below.
This has several use cases:
o Big, record oriented documents
Big documents can be treated a record at a time with various DOM oriented processors like XML::Filter::XSLT.
o Streaming XML
Small sections of an XML stream can be run through a document processor without holding up the stream.
o Record oriented style sheets / processors
Sometimes it's just plain easier to write a style sheet or SAX filter that applies to a single record at at time, rather than having to
run through a series of records.
Topology
Here's how the innards look:
+-----------------------------------------------------------+
| An XML:SAX::ByRecord |
| Intake |
| +----------+ +---------+ +--------+ Exhaust |
--+-->| Splitter |--->| Stage_1 |-->...-->| Merger |----------+----->
| +----------+ +---------+ +--------+ |
| ^ |
| | |
| +---------->---------------+ |
| Events not in any records |
| |
+-----------------------------------------------------------+
The "Splitter" is an XML::Filter::DocSplitter by default, and the "Merger" is an XML::Filter::Merger by default. The line that bypasses
the "Stage_1 ..." filter pipeline is used for all events that do not occur in a record. All events that occur in a record pass through the
filter pipeline.
Example
Here's a quick little filter to uppercase text content:
package My::Filter::Uc;
use vars qw( @ISA );
@ISA = qw( XML::SAX::Base );
use XML::SAX::Base;
sub characters {
my $self = shift;
my ( $data ) = @_;
$data->{Data} = uc $data->{Data};
$self->SUPER::characters( @_ );
}
And here's a little machine that uses it:
$m = Pipeline(
ByRecord( "My::Filter::Uc" ),
$out,
);
When fed a document like:
<root> a
<rec>b</rec> c
<rec>d</rec> e
<rec>f</rec> g
</root>
the output looks like:
<root> a
<rec>B</rec> c
<rec>C</rec> e
<rec>D</rec> g
</root>
and the My::Filter::Uc got three sets of events like:
start_document
start_element: <rec>
characters: 'b'
end_element: </rec>
end_document
start_document
start_element: <rec>
characters: 'd'
end_element: </rec>
end_document
start_document
start_element: <rec>
characters: 'f'
end_element: </rec>
end_document
METHODS
new
my $d = XML::SAX::ByRecord->new( @channels, \%options );
Longhand for calling the ByRecord function exported by XML::SAX::Machines.
CREDIT
Proposed by Matt Sergeant, with advise by Kip Hampton and Robin Berjon.
Writing an aggregator.
To be written. Pretty much just that "start_manifold_processing" and "end_manifold_processing" need to be provided. See
XML::Filter::Merger and it's source code for a starter.
perl v5.10.0 2009-06-11 XML::SAX::ByRecord(3pm)