Remove lines that are subsets of other lines in File
Hello everyone,
Although it seems easy, I've been stuck on this problem for a while now and I can't figure out a way to get it done.
My problem is the following:
I have a file where each line is a sequence of IP addresses, for example:
Code:
10.0.0.1 10.0.0.2
10.0.0.5 10.0.0.1 10.0.0.2
...
What I'd like to do is remove lines that are completely contained in other lines. In the previous example, "Line 1" would be deleted because it is contained in "Line 2".
So far, I've used Python and set() objects to get the job done, but I have more than 100K lines and the set lookups become increasingly time-consuming as the program runs :/
Thanks for your help
Moderator's Comments:
Use code tags, thanks.
Last edited by zaxxon; 04-22-2015 at 06:31 AM..
Reason: code tags and missing a dot
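One possible way to avoid comparing every line against every other line is an inverted index: for each address, remember which kept lines contain it, then intersect those small sets. The sketch below is only an illustration under a few assumptions: the order of addresses within a line does not matter, lines are whitespace-separated, and of several identical lines only the first should survive. The name remove_subset_lines is made up for this example.

```python
from collections import defaultdict

def remove_subset_lines(lines):
    """Return the lines whose address set is not contained in another line's set."""
    # Parse each non-blank line into a frozenset of tokens, remembering its position.
    parsed = [(i, frozenset(line.split())) for i, line in enumerate(lines) if line.strip()]
    # Visit larger sets first so any possible superset is indexed before its subsets.
    parsed.sort(key=lambda item: len(item[1]), reverse=True)

    index = defaultdict(set)  # token -> ids of kept lines that contain it
    kept = []
    for line_id, tokens in parsed:
        # Intersect the per-token id sets, smallest first, stopping once empty.
        postings = sorted((index[t] for t in tokens), key=len)
        candidates = set(postings[0])
        for p in postings[1:]:
            candidates &= p
            if not candidates:
                break
        if candidates:
            continue  # some kept line contains every token of this line
        kept.append(line_id)
        for t in tokens:
            index[t].add(line_id)
    # Restore the original file order for the surviving lines.
    return [lines[i] for i in sorted(kept)]

sample = ["10.0.0.1 10.0.0.2", "10.0.0.5 10.0.0.1 10.0.0.2"]
print(remove_subset_lines(sample))
```

Whether this is faster in practice depends on the data: it helps most when each address appears in relatively few lines, so the sets being intersected stay small.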
All,
I have a text file with several entries like below:
personname
personname.domain.com
I know there is a way to use vi to remove only the personname.domain.com line. Can someone help? I believe that it involves /s/g/ something...I just can't remember the exact syntax.
Thanks (2 Replies)
Hi gurus,
I'm trying to remove a number of lines from a large file using the following command:
sed '1,5000d' oldfile > newfile
Somehow the lines in the old file are not deleted...
Am I doing this wrongly? Any suggestions? :confused:
Thanks! :)
wee (10 Replies)
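Note that this sed command, as written, is not expected to change oldfile: without an in-place option, sed writes its result to standard output, which here is redirected to newfile. The same copy-and-skip idea as a minimal Python sketch (the function name drop_first_lines is hypothetical):

```python
from itertools import islice

def drop_first_lines(src_path, dst_path, count):
    """Copy src_path to dst_path, skipping the first `count` lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        # islice skips `count` lines lazily, so the file is never fully in memory.
        dst.writelines(islice(src, count, None))
```

Calling drop_first_lines("oldfile", "newfile", 5000) would leave oldfile untouched and put the remaining lines in newfile.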
A small question
I have a test.txt file with these contents:
a:google
b:yahoo
:
c:facebook
:
d:hotmail
How do I remove the lines that contain only ":"?
My output should be:
a:google
b:yahoo
c:facebook
d:hotmail (5 Replies)
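A minimal Python sketch of one way to do this, assuming the unwanted lines contain nothing but a single ":" (possibly with surrounding whitespace); the function name is made up for illustration:

```python
def drop_bare_colon_lines(lines):
    """Keep every line except those that consist of a lone ':'."""
    return [line for line in lines if line.strip() != ":"]

sample = ["a:google", "b:yahoo", ":", "c:facebook", ":", "d:hotmail"]
print("\n".join(drop_bare_colon_lines(sample)))
```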
Hi,
I'm not an expert in shell programming, so I've come here to ask you gurus for help.
I'm trying to tailor a CSV file that I received so that it works with the LOAD FROM command.
I have a data table CSV in the format below -
--in file format
xx,xx,xx ,xx , , , , ,,xx,
xxxx,, ,, xxx,... (11 Replies)
Hey Gang-
I have a list of servers. I want to exclude servers that begin with and end with certain characters. Is there an easy command to do this?
Example
wvm1234dev
wvm1234pro
uvm1122dev
uvm1122bku
uvm1344dev
I want to exclude any lines that start with "wvm" OR "uvm" AND end... (7 Replies)
Hi,
I have a huge file with lakhs (hundreds of thousands) of lines, and the file system is full.
I'd like your help with a solution that removes all lines from that file except the last 50,000. It needs to remove the lines from the existing file itself so that I get some space back. (28 Replies)
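One possible approach that needs no second file, sketched in Python. It assumes the last 50,000 lines fit in memory and that no other process is writing to the file while it runs; it is also not crash-safe, since the file is truncated before the tail is written back. The function name keep_last_lines is hypothetical.

```python
from collections import deque

def keep_last_lines(path, count):
    """Rewrite `path` in place so that only its last `count` lines remain."""
    with open(path, "rb") as f:
        # A bounded deque discards older lines as newer ones arrive.
        tail = deque(f, maxlen=count)
    # Reopening for writing truncates the file before the tail is written back.
    with open(path, "wb") as f:
        f.writelines(tail)
```

For the case described above, the call would be keep_last_lines(path, 50000) with path pointing at the large file.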
I have a file that contains the following:
Party_Id1;Party_id2;Party_id3;
1;2;3;
0
0
4;5;6;
0
7;8;9;
How can I adjust the file so it looks like this:
Party_Id1;Party_id2;Party_id3;
1;2;3;
4;5;6;
7;8;9;
I think the '0' is something like a carriage return, I don't know. But how... (2 Replies)
I have two files, a keepout.txt and a database.csv. They're unsorted, but could be sorted.
keepout:
user1
buser3
anuser19
notheruser27
database:
user1,2343,"information about",field,blah,34
user2,4231,"mo info",etc,stuff,43
notheruser27,4344,"hiya",thing,more thing,423... (4 Replies)
I have been searching and trying to come up with an awk script that will perform the following on a converted text file (the original is a PDF).
1. Since the first two lines begin with text, they are removed
2. If $1 is a number then all text is merged (combined) into one line until the next... (3 Replies)
Discussion started by: cmccabe
Arch::DiffParser(3pm) User Contributed Perl Documentation Arch::DiffParser(3pm)
NAME
Arch::DiffParser - parse file's diff and perform some manipulations
SYNOPSIS
use Arch::DiffParser;
my $dp = Arch::DiffParser->new;
# usable for "annotate" functionality
my $changes = $dp->parse_file("f.diff")->changes;
$dp->parse($diff_content);
$dp->parse("--- f1.c 2005-02-26
+++ f2.c 2005-02-28
...");
# prints "f1.c, f2.c"
printf "%s, %s\n", $dp->filename1, $dp->filename2;
# enclose lines in <span class="patch_{mod,orig,line,add,del}">
my $html = $dp->markup_content;
DESCRIPTION
This class provides a limited functionality to parse a single file diff in unified format. Multiple diffs may be parsed sequentially. The
parsed data is stored for the last diff, and is replaced on the following parse.
METHODS
The following class methods are available:
new, parse, parse_file, content, lines, filename1, filename2, mtime1, mtime2, hunks, changes.
new Construct the "Arch::DiffParser" instance.
parse diff_content
Parse the diff_content and store its parsed data.
parse_file diff_filename
Like parse, but read the diff_content from diff_filename.
diff_data
Return hashref containing certain parsed data. Die if called before any parse methods. The keys are: "lines", "filename1", "filename2",
"mtime1", "mtime2", "hunks", "changes".
The value of "hunks" and "changes" is arrayref of arrayrefs with 5 elements: [ line-number-1, num-lines-1, line-number-2, num-lines-2,
"lines"-index ].
A "hunk" describes a set of lines containing some combination of unmodified, deleted and added lines, a "change" describes an
inter-hunk atom that only contains zero or more deleted lines and zero or more added lines.
lines
filename1
filename2
mtime1
mtime2
hunks
changes
These methods are just shortcuts for diff_data->{method}.
content [%args]
Return content of the last diff.
%args keys are "fileroot1" and "fileroot2"; if given, these will replace the subdirs "orig" and "mod" that arch usually uses in the
filepaths.
markup_content [%args]
Like content, but every non-context line is enclosed in the markup <span class="patch_name">line</span>, where name is one of "orig"
(filename1), "mod" (filename2), "line" (hunk linenums), "add" (added), "del" (deleted).
Not implemented yet.
BUGS
No support for newlines in source file names yet.
AUTHORS
Mikhael Goikhman (migo@homemail.com--Perl-GPL/arch-perl--devel).
SEE ALSO
For more information, see Text::Diff::Unified, Algorithm::Diff.
perl v5.10.1 2005-03-09 Arch::DiffParser(3pm)