Sponsored Content
Top Forums Shell Programming and Scripting Remove lines that are subsets of other lines in File Post 302941908 by Chubler_XL on Wednesday 22nd of April 2015 04:47:24 PM
Old 04-22-2015
This solution keeps the lines with most IPs first although not optimal it should give fairly more concise answers than just processing file top to bottom.

I tried it on a data files with 170K lines and it took approx 3 seconds to process.

Code:
awk '
function have(ln,ip,num) {
  ret=1
  num=split(ln,ips)
  for(ip=1;ip<=num;ip++)
     if(!(ips[ip] in havelist)) {
         havelist[ips[ip]]
         unique++
         ret=0
      }
  return ret
}
{ L[NR]=$0;D[NF]=D[NF] " " NR; max=NF>max?NF:max }
END {
   for(c=max;c;c--)
      if(D[c]) {
         lns=split(D[c],v)
         for(i=1;i<=lns;i++)
           if(!have(L[v[i]])) print L[v[i]]
      }
   printf "Data file contains %'\''d unique IPs\n", unique > "/dev/stderr"
}' infile


For anyone who wants to test possible solutions I used the script to make my test file:

Code:
for ((i=0;i<170000;i++))
do
    printf "10.0.%d.%d" $((RANDOM%256)) $((RANDOM%256))
    ips=$((RANDOM%8))
    while ((--ips > 0))
    do
       printf " 10.0.%d.%d" $((RANDOM%256)) $((RANDOM%256))
    done
    printf "\n"
done


Last edited by Chubler_XL; 04-22-2015 at 05:53 PM..
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

remove lines from file

file: 1 xxxxxxx 2 xxx xxx 5 xxx xxx ... 180 xxxxxx 200 xxx how to remove any lines with the first number range 1-180 (9 Replies)
Discussion started by: bluemoon1
9 Replies

2. UNIX for Dummies Questions & Answers

vi to remove lines in file

All, I have a text file with several entries like below: personname personname.domain.com I know there is a way to use vi to remove only the personname.domain.com line. Can someone help? I believe that it involves /s/g/ something...I just can't remember the exact syntax. Thanks (2 Replies)
Discussion started by: kjbaumann
2 Replies

3. Shell Programming and Scripting

remove lines from file

Hi gurus, i'm trying to remove a number of lines from a large file using the following command: sed '1,5000d' oldfile > newfile Somehow the lines in the old file are not deleted... Am I doing this wrongly? Any suggestions? :confused: Thanks! :) wee (10 Replies)
Discussion started by: lweegp
10 Replies

4. Shell Programming and Scripting

remove : lines from file

A small question I have a test.txt file I have contents as: a:google b:yahoo : c:facebook : d:hotmail How do I remove the line with : my output should be a:google b:yahoo c:facebook d:hotmail (5 Replies)
Discussion started by: aronmelon
5 Replies

5. Shell Programming and Scripting

remove blank lines and merge lines in shell

Hi, I'm not a expert in shell programming, so i've come here to take help from u gurus. I'm trying to tailor a csv file that i got to make it work for the LOAD FROM command. I've a datatable csv of the below format - --in file format xx,xx,xx ,xx , , , , ,,xx, xxxx,, ,, xxx,... (11 Replies)
Discussion started by: dvah
11 Replies

6. Shell Programming and Scripting

Remove lines from file

Hey Gang- I have a list of servers. I want to exclude servers that begin with and end with certain characters. Is there an easy command to do this? Example wvm1234dev wvm1234pro uvm1122dev uvm1122bku uvm1344dev I want to exclude any lines that start with "wvm" OR "uvm" AND end... (7 Replies)
Discussion started by: idiotboy
7 Replies

7. UNIX for Dummies Questions & Answers

Want to remove all lines but not latest 50 lines from a file

Hi, I have a huge file which has Lacs of lines. File system got full. I want your guys help to suggest me a solution so that I can remove all lines from that file but not last 50,000 lines. I want solution which can remove lines from existing file so that I can have some space left with. (28 Replies)
Discussion started by: prashant2507198
28 Replies

8. Shell Programming and Scripting

Remove lines in file

I have a file that contains the following: Party_Id1;Party_id2;Party_id3; 1;2;3; 0 0 4;5;6; 0 7;8;9; How can I adjust the file so it looks like this: Party_Id1;Party_id2;Party_id3; 1;2;3; 4;5;6; 7;8;9; I Think the '0' is something like a carriage return, I don't know. But how... (2 Replies)
Discussion started by: katled
2 Replies

9. Shell Programming and Scripting

Two files, remove lines from second based on lines in first

I have two files, a keepout.txt and a database.csv. They're unsorted, but could be sorted. keepout: user1 buser3 anuser19 notheruser27 database: user1,2343,"information about",field,blah,34 user2,4231,"mo info",etc,stuff,43 notheruser27,4344,"hiya",thing,more thing,423... (4 Replies)
Discussion started by: esoffron
4 Replies

10. Shell Programming and Scripting

awk to remove lines that do not start with digit and combine line or lines

I have been searching and trying to come up with an awk that will perform the following on a converted text file (original is a pdf). 1. Since the first two lines are (begin with) text they are removed 2. if $1 is a number then all text is merged (combined) into one line until the next... (3 Replies)
Discussion started by: cmccabe
3 Replies
All times are GMT -4. The time now is 10:17 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy