Remove lines that are subsets of other lines in File
Hello everyone,
Although it seems easy, I've been stuck on this problem for a while now and I can't figure out how to get it done.
My problem is the following:
I have a file where each line is a sequence of IP addresses, for example:
Code:
10.0.0.1 10.0.0.2
10.0.0.5 10.0.0.1 10.0.0.2
...
What I'd like to do is remove lines that are completely contained in other lines. In the previous example, "Line 1" would be deleted because it is contained in "Line 2".
So far, I've worked with Python and set() objects to get the job done, but I have more than 100K lines and the set lookups become more and more time-consuming as the program runs :/
Thanks for your help
Moderator's Comments:
Use code tags, thanks.
Last edited by zaxxon; 04-22-2015 at 06:31 AM..
Reason: code tags and missing a dot
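One way to avoid comparing every line against every other line is an inverted index from address to the lines that contain it. The sketch below is only an illustration under stated assumptions: addresses are separated by whitespace, order within a line does not matter, and of several identical lines only the first is kept. The name remove_subset_lines is made up for this example.

```python
def remove_subset_lines(lines):
    """Return the lines whose token set is not contained in another line's set."""
    # Parse each non-empty line into a set of whitespace-separated tokens.
    parsed = [(i, frozenset(line.split())) for i, line in enumerate(lines) if line.strip()]
    # Visit larger sets first: a line can only be a subset of an equal or larger one.
    # The sort is stable, so among identical lines the first one is visited first.
    parsed.sort(key=lambda item: len(item[1]), reverse=True)

    index = {}  # token -> positions of kept lines that contain it
    kept = []
    for pos, tokens in parsed:
        # Kept lines that contain every token of this line are its supersets.
        # Starting from the shortest posting list keeps the intersection cheap.
        postings = sorted((index.get(t, ()) for t in tokens), key=len)
        supersets = set(postings[0]).intersection(*postings[1:])
        if not supersets:
            kept.append(pos)
            for t in tokens:
                index.setdefault(t, set()).add(pos)
    # Restore the original line order.
    return [lines[i] for i in sorted(kept)]


if __name__ == "__main__":
    sample = ["10.0.0.1 10.0.0.2", "10.0.0.5 10.0.0.1 10.0.0.2"]
    print(remove_subset_lines(sample))
```

Each line is only compared against kept lines that share its rarest address, which is usually far fewer than all lines, although the worst case (every line sharing the same addresses) is still expensive.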
All,
I have a text file with several entries like below:
personname
personname.domain.com
I know there is a way to use vi to remove only the personname.domain.com line. Can someone help? I believe that it involves /s/g/ something...I just can't remember the exact syntax.
Thanks (2 Replies)
Hi gurus,
I'm trying to remove a number of lines from a large file using the following command:
sed '1,5000d' oldfile > newfile
Somehow the lines in the old file are not deleted...
Am I doing this wrong? Any suggestions?
Thanks!
wee (10 Replies)
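For what it's worth, the sed command above writes its result to newfile and leaves oldfile untouched, so the old file is expected to keep all of its lines. A rough Python equivalent, as a sketch only (the function name drop_first_lines is hypothetical), behaves the same way:

```python
from itertools import islice


def drop_first_lines(src_path, dst_path, count=5000):
    """Copy src_path to dst_path, skipping the first `count` lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        # islice streams the file, so the whole file is never held in memory.
        dst.writelines(islice(src, count, None))


if __name__ == "__main__":
    import os
    import tempfile

    workdir = tempfile.mkdtemp()
    old = os.path.join(workdir, "oldfile")
    new = os.path.join(workdir, "newfile")
    with open(old, "w") as f:
        f.writelines(f"line{i}\n" for i in range(1, 8))
    drop_first_lines(old, new, count=5)
    print(open(new).read())
```

The source file is only read; the shortened copy ends up in the destination file, just as with the shell redirection.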
A small question
I have a test.txt file
I have contents as:
a:google
b:yahoo
:
c:facebook
:
d:hotmail
How do I remove the line with :
my output should be
a:google
b:yahoo
c:facebook
d:hotmail (5 Replies)
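A minimal sketch of that filter in Python, assuming the unwanted lines contain nothing but the colon (plus possible surrounding whitespace). The function name drop_lines_equal_to is made up for this example.

```python
def drop_lines_equal_to(lines, unwanted=":"):
    """Return the lines that are not exactly `unwanted` once whitespace is stripped."""
    return [line for line in lines if line.strip() != unwanted]


if __name__ == "__main__":
    content = ["a:google", "b:yahoo", ":", "c:facebook", ":", "d:hotmail"]
    for line in drop_lines_equal_to(content):
        print(line)
```

Lines such as a:google are kept because the comparison is against the whole line, not a substring.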
Hi,
I'm not an expert in shell programming, so I've come here for help from you gurus.
I'm trying to tailor a CSV file that I got so that it works with the LOAD FROM command.
I have a data table CSV in the format below -
--in file format
xx,xx,xx ,xx , , , , ,,xx,
xxxx,, ,, xxx,... (11 Replies)
Hey Gang-
I have a list of servers. I want to exclude servers that begin with and end with certain characters. Is there an easy command to do this?
Example
wvm1234dev
wvm1234pro
uvm1122dev
uvm1122bku
uvm1344dev
I want to exclude any lines that start with "wvm" OR "uvm" AND end... (7 Replies)
Hi,
I have a huge file with lakhs of lines, and the file system is full.
Please suggest a solution that removes all lines from that file except the last 50,000. I need a solution that removes the lines from the existing file itself, so that some space is freed. (28 Replies)
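Since the file system is full, writing a second copy is not an option, so one possible approach is to rewrite the file in place. The following is a sketch under assumptions: the last 50,000 lines fit in memory, and no other process is writing to the file while it runs. The function name keep_last_lines is hypothetical.

```python
from collections import deque


def keep_last_lines(path, count=50_000):
    """Rewrite the file at `path` in place so only its last `count` lines remain."""
    with open(path, "rb+") as f:
        # Stream the file; the deque holds at most `count` lines at any time.
        tail = deque(f, maxlen=count)
        f.seek(0)
        f.writelines(tail)
        f.truncate()  # cut the file off at the current write position


if __name__ == "__main__":
    import os
    import tempfile

    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        f.writelines(f"{i}\n" for i in range(10))
    keep_last_lines(path, count=3)
    print(open(path).read())
    os.remove(path)
```

No temporary file is created, but an interruption during the rewrite would leave the file damaged, so this trades safety for disk space.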
I have a file that contains the following:
Party_Id1;Party_id2;Party_id3;
1;2;3;
0
0
4;5;6;
0
7;8;9;
How can I adjust the file so it looks like this:
Party_Id1;Party_id2;Party_id3;
1;2;3;
4;5;6;
7;8;9;
I think the '0' is something like a carriage return, I don't know. But how... (2 Replies)
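Without knowing what the '0' lines really are, one hedged way to get the desired output is to keep only the lines that look like records. The sketch below assumes that every real record contains the ';' separator and that stray carriage returns should be treated as line breaks; both are guesses based on the sample, and clean_records is a made-up name.

```python
def clean_records(raw, separator=b";"):
    """Return the decoded lines of `raw` (bytes) that contain the field separator."""
    # Treat \r\n and bare \r as ordinary line breaks before splitting.
    lines = raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n").split(b"\n")
    return [line.decode() for line in lines if separator in line]


if __name__ == "__main__":
    raw = b"Party_Id1;Party_id2;Party_id3;\n1;2;3;\n0\n0\n4;5;6;\n0\n7;8;9;\n"
    for record in clean_records(raw):
        print(record)
```

Reading the file in binary mode (open(path, "rb").read()) and printing repr() of a few lines is also a quick way to see what the '0' lines actually contain.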
I have two files, a keepout.txt and a database.csv. They're unsorted, but could be sorted.
keepout:
user1
buser3
anuser19
notheruser27
database:
user1,2343,"information about",field,blah,34
user2,4231,"mo info",etc,stuff,43
notheruser27,4344,"hiya",thing,more thing,423... (4 Replies)
I have been searching and trying to come up with an awk that will perform the following on a converted text file (original is a pdf).
1. Since the first two lines are (begin with) text they are removed
2. if $1 is a number then all text is merged (combined) into one line until the next... (3 Replies)
Discussion started by: cmccabe
LEARN ABOUT PLAN9
grep
GREP(1) General Commands Manual GREP(1)
NAME
grep - search a file for a pattern
SYNOPSIS
grep [ option ... ] pattern [ file ... ]
DESCRIPTION
Grep searches the input files (standard input default) for lines (with newlines excluded) that match the pattern, a regular expression as
defined in regexp(6). Normally, each line matching the pattern is `selected', and each selected line is copied to the standard output.
The options are
-c Print only a count of matching lines.
-h Do not print file name tags (headers) with output lines.
-i Ignore alphabetic case distinctions. The implementation folds into lower case all letters in the pattern and input before
interpretation. Matched lines are printed in their original form.
-l (ell) Print the names of files with selected lines; don't print the lines.
-L Print the names of files with no selected lines; the converse of -l.
-n Mark each printed line with its line number counted in its file.
-s Produce no output, but return status.
-v Reverse: print lines that do not match the pattern.
Output lines are tagged by file name when there is more than one input file. (To force this tagging, include /dev/null as a file name
argument.)
Care should be taken when using the shell metacharacters $*[^|()= and newline in pattern; it is safest to enclose the entire expression in
single quotes '...'.
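To make the selection rule concrete, here is a loose Python sketch of the behaviour described above for -i, -v and -n. It is an illustration only, not the Plan 9 implementation: it uses Python's re syntax rather than regexp(6), and the function name grep_lines and its parameters are invented for this example.

```python
import re


def grep_lines(pattern, lines, ignore_case=False, invert=False):
    """Return (line_number, text) pairs for the selected lines."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    selected = []
    for lineno, line in enumerate(lines, start=1):
        text = line.rstrip("\n")  # match with the newline excluded
        matched = regex.search(text) is not None
        # With invert=True (like -v), select the lines that do not match.
        if matched != invert:
            selected.append((lineno, text))
    return selected


if __name__ == "__main__":
    entries = ["personname\n", "personname.domain.com\n"]
    print(grep_lines(r"\.domain\.com$", entries, invert=True))
    print(len(grep_lines("PERSON", entries, ignore_case=True)))  # a count, like -c
```

Returning the line numbers alongside the text covers what -n reports, and taking the length of the result gives the count that -c prints.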
SOURCE
/sys/src/cmd/grep.c
SEE ALSO
ed(1), awk(1), sed(1), sam(1), regexp(6)
DIAGNOSTICS
Exit status is null if any lines are selected, or non-null when no lines are selected or an error occurs.