Hi
I am writing a script which should read a file and search for certain strings 'approved' or 'removed' and retain only those lines that contain the above strings.
Ex: file name 'test'
test:
approved package
waiting for approval package
disapproved package
removed package
approved... (14 Replies)
Hey all, a relative bash/script newbie trying solve a problem.
I've got a text file with lots of lines that I've been able to clean up and format with awk/sed/cut, but now I'd like to remove the lines with duplicate usernames based on time stamp. Here's what the data looks like
2007-11-03... (3 Replies)
Hello guys,
should be a very easy questn for you:
I need to delete strings in file1 based on the list of strings in file2.
like file2:
word1_word2_
word3_word5_
word3_word4_
word6_word7_
file1:
word1_word2_otherwords..,word3_word5_others... (7 Replies)
Platform : RHEL 5.8
I have text file called myapplication.log . In this file, I have around 800 lines which start with the followng three strings
PWRBRKER-3493
PWRBRKER-7834
SCHEDULER-ERROR
How can I delete these lines in one go ? (13 Replies)
if I have the following lines in a file app.log
some lines here
<AAAA>
abc
<id>123456789</id>
ddd
</AAAA>some lines here too
<BBBB>
abc
<id>123456789</id>
ddd
</BBBB>some lines here too
<AAAA>
xyz
<id>987654321</id>
ssss
</AAAA>some lines here again...
How do I get the... (5 Replies)
Hi,
i need help to remove duplicates in my file. The problem is i need to delete one duplicate for each line only. the input file as follows and it is not tab delimited:-
The output need to remove 2nd word (in red) that duplicate with 1st word (in blue). Other duplicates should remained... (12 Replies)
I want to replace strings in test2 according to test1 table. In doing so, I`m losing records that I dont need to replace, please suggest modifications.
what i have
$ cat > test1
a b
c d
$ cat > test2
a
a
a
d
d
what i tried
$ awk ' BEGIN {FS=OFS=" "} FNR==NR{a=$2;next}... (2 Replies)
Within my text file i have several thousand lines of text with some lines containing duplicate strings/words. I would like to entirely remove those lines which contain the duplicate strings.
Eg;
One and a Two
Unix.com is the Best
This as a Line Line
Example duplicate sentence with the word... (22 Replies)
Hello Everyone ,
Iam a newbie to shell programming and iam reaching out if anyone can help in this :-
I have two files
1) Insert.txt
2) partition_list.txt
insert.txt looks like this :-
insert into emp1 partition (partition_name)
(a1,
b2,
c4,
s6,
d8)
select
a1,
b2,
c4, (2 Replies)
I cannot seem to get what should be a simple awk one-liner to work correctly and cannot figure out why. I would like to use patterns from a specific field in one file as regex to search for matching strings in the entire line ($0) of another file.
I would like to output the lines of File2 which... (1 Reply)
Discussion started by: jvoot
1 Replies
LEARN ABOUT DEBIAN
mkdoc::xml::tokenizer
MKDoc::XML::Tokenizer(3pm) User Contributed Perl Documentation MKDoc::XML::Tokenizer(3pm)NAME
MKDoc::XML::Tokenizer - Tokenize XML the REX way
SYNOPSIS
my $tokens = MKDoc::XML::Tokenizer->process_data ($some_xml);
foreach my $token (@{$tokens})
{
print "'" . $token->as_string() . "' is text
" if (defined $token->text());
print "'" . $token->as_string() . "' is a self closing tag
" if (defined $token->tag_self_close());
print "'" . $token->as_string() . "' is an opening tag
" if (defined $token->tag_open());
print "'" . $token->as_string() . "' is a closing tag
" if (defined $token->tag_close());
print "'" . $token->as_string() . "' is a processing instruction
" if (defined $token->pi());
print "'" . $token->as_string() . "' is a declaration
" if (defined $token->declaration());
print "'" . $token->as_string() . "' is a comment
" if (defined $token->comment());
print "'" . $token->as_string() . "' is a tag
" if (defined $token->tag());
print "'" . $token->as_string() . "' is a pseudo-tag (NOT text and NOT tag)
" if (defined $token->pseudotag());
print "'" . $token->as_string() . "' is a leaf token (NOT opening tag)
" if (defined $token->leaf());
}
SUMMARY
MKDoc::XML::Tokenizer is a module which uses Robert D. Cameron REX technique to parse XML (ignore the carriage returns):
[^<]+|<(?:!(?:--(?:[^-]*-(?:[^-][^-]*-)*->?)?|[CDATA[(?:[^]]*](?:[^]]+])
*]+(?:[^]>][^]]*](?:[^]]+])*]+)*>)?|DOCTYPE(?:[
]+(?:[A-Za-z_:]|[^
x00-x7F])(?:[A-Za-z0-9_:.-]|[^x00-x7F])*(?:[
]+(?:(?:[A-Za-z_:]|[^
x00-x7F])(?:[A-Za-z0-9_:.-]|[^x00-x7F])*|"[^"]*"|'[^']*'))*(?:[
]+)
?(?:[(?:<(?:!(?:--[^-]*-(?:[^-][^-]*-)*->|[^-](?:[^]"'><]+|"[^"]*"|'[^']*'
)*>)|?(?:[A-Za-z_:]|[^x00-x7F])(?:[A-Za-z0-9_:.-]|[^x00-x7F])*(?:?>|[
n
][^?]*?+(?:[^>?][^?]*?+)*>))|%(?:[A-Za-z_:]|[^x00-x7F])(?:[A-Za-z0
-9_:.-]|[^x00-x7F])*;|[
]+)*](?:[
]+)?)?>?)?)?|?(?:(?:[A-Za-z
_:]|[^x00-x7F])(?:[A-Za-z0-9_:.-]|[^x00-x7F])*(?:?>|[
][^?]*?+(?
:[^>?][^?]*?+)*>)?)?|/(?:(?:[A-Za-z_:]|[^x00-x7F])(?:[A-Za-z0-9_:.-]|[^x
00-x7F])*(?:[
]+)?>?)?|(?:(?:[A-Za-z_:]|[^x00-x7F])(?:[A-Za-z0-9_:.
-]|[^x00-x7F])*(?:[
]+(?:[A-Za-z_:]|[^x00-x7F])(?:[A-Za-z0-9_:.-]|
[^x00-x7F])*(?:[
]+)?=(?:[
]+)?(?:"[^<"]*"|'[^<']*'))*(?:[
t
]+)?/?>?)?)
That's right. One big regex, and it works rather well.
DISCLAIMER
This module does low level XML manipulation. It will somehow parse even broken XML and try to do something with it. Do not use it unless
you know what you're doing.
API
my $tokens = MKDoc::XML::Tokenizer->process_data ($some_xml);
Splits $some_xml into a list of MKDoc::XML::Token objects and returns an array reference to the list of tokens.
my $tokens = MKDoc::XML::Tokenizer->process_file ('/some/file.xml');
Same as MKDoc::XML::Tokenizer->process_data ($some_xml), except that it reads $some_xml from '/some/file.xml'.
NOTES
MKDoc::XML::Tokenizer works with MKDoc::XML::Token, which can be used when building a full tree is not necessary. If you need to build a
tree, look at MKDoc::XML::TreeBuilder.
AUTHOR
Copyright 2003 - MKDoc Holdings Ltd.
Author: Jean-Michel Hiver
This module is free software and is distributed under the same license as Perl itself. Use it at your own risk.
SEE ALSO
MKDoc::XML::Token MKDoc::XML::TreeBuilder
perl v5.10.1 2004-10-06 MKDoc::XML::Tokenizer(3pm)