Delete block of text in one file based on list in another file
Post 302350452 by Festus Hagen on Thursday 3rd of September 2009 11:07:19 PM
Hi all,

Thanks for the responses ...

I have accomplished this with the two methods below, and then went a step further with a third.

Hopefully they help the next one in need!
The third one is pretty specific to my needs.

Method 1, based on post #4 by Azhrei in the thread "Removing Lines if value exist in first file". Thanks, Azhrei!
Code:
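perl -i~ -e '  # -i~ for in-place editing with tilde backup file
  use strict;
  use warnings;
  my %excludehash;
  my $file = shift;  # first argument is the exclude list; the rest is edited in place
  open(my $excludelist, "<", $file) or die "Cannot open $file: $!";
  chomp( my @a = <$excludelist> );
  close($excludelist);
  @excludehash{@a} = @a;  # hash slice: one key per exclude-list line
  {
    local($/) = "";  # paragraph mode: each blank-line-separated block is one record
    while (<>) {
      # Test the match itself so a stale or undefined $1 is never used
      next if m/YOURKEY:\s+(.*)$/m && exists $excludehash{ $1 };
      print;
    }
  }' "excludefile" "datafile"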

Method 2, my own brew.
Code:
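perl -i~ -e '  # -i~ for in-place editing with tilde backup file
  use strict;
  use warnings;
  my %excludehash;
  my $file = shift;  # first argument is the exclude list; the rest is edited in place
  open(my $excludelist, "<", $file) or die "Cannot open $file: $!";
  while(<$excludelist>) {
    chomp;
    next if /^$/;  # skip empty lines in the exclude list
    $excludehash{ $_ } = $_;
  }
  close($excludelist);
  {
    local($/) = "";  # paragraph mode: each blank-line-separated block is one record
    while (<>) {
      # exists, not truth, so a key such as "0" is still excluded
      next if ( m/^YOURKEY:\s+(.*)$/m && exists $excludehash{ $1 } );
      print;
    }
  }' "excludefile" "datafile"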

Just for giggles, I created a dummy datafile of ~26M with 80,743 records, each record consisting of at least 7 and up to 30 lines of text. After generating an excludefile of 20,271 records to be removed, I ran both methods across it.

The speed is freaking incredible!
I didn't time them accurately, but it was done in less than 15 seconds! I was/am blown away by that!
Especially on my FreeBSD 7.1-RELEASE PIII-866!
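
If someone wants to repeat that kind of test with real timings, something along these lines should do. It is only a sketch under assumptions: the record count (2000), the key names (key1, key2, ...), the fixed 7 filler lines and the every-4th-key exclude list are made up and much smaller than my test set, and the wall-clock time comes from date +%s, so it is only good to the second.

```shell
# Hypothetical benchmark data: 2000 records, every 4th key goes on the exclude list
workdir=$(mktemp -d)
perl -e '
  for my $n (1 .. 2000) {
    print "YOURKEY: key$n\n";
    print "filler line $_ for key$n\n" for 1 .. 7;
    print "\n";
  }' > "$workdir/datafile"
perl -e 'print "key$_\n" for grep { $_ % 4 == 0 } 1 .. 2000' > "$workdir/excludefile"

start=$(date +%s)
# Compact form of the exclude-by-hash approach in paragraph mode
perl -i~ -e '
  use strict;
  use warnings;
  my $file = shift;
  open(my $excludelist, "<", $file) or die "Cannot open $file: $!";
  my %excludehash;
  while (<$excludelist>) {
    chomp;
    $excludehash{ $_ } = $_ unless /^$/;
  }
  close($excludelist);
  local($/) = "";
  while (<>) {
    next if m/^YOURKEY:\s+(.*)$/m && exists $excludehash{ $1 };
    print;
  }' "$workdir/excludefile" "$workdir/datafile"
end=$(date +%s)

kept=$(grep -c '^YOURKEY:' "$workdir/datafile")
echo "kept $kept of 2000 records in $((end - start)) second(s)"
```

Raise the two 2000s to get closer to the record counts I mentioned.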

With what I learned from that, I did the following.

This is what is in production for my needs.
Take a look at the sample data above and you will notice that Archive Maintenance removes old archives and logs them as "Removed archive: ... ...". There is the EXCLUDELIST!!

The following code reads the log file in the first while loop, adding all the "Removed archive:" elements to a hash (removehash).
It then moves the file pointer back to the beginning of the log, and in the second while loop it steps through the records, matching the "Using archive:" element of each record against the removehash elements. If there is a match, the record is skipped!

It even copes with multiple 'Removed archive:' elements per record.
And it is incredibly fast.
Code:
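perl -i~ -e '  # -i~ for in-place editing with tilde backup file
    use strict;
    use warnings;
    my %removehash;
    {
      local($/) = "";  # paragraph mode: each blank-line-separated block is one record
      # Pass 1: collect every removed archive name; nothing is printed yet
      while(<>) {
        while (m/^ Removed archive:\s+(.*)$/mg) {
          $removehash{ $1 } = $1;
        }
        last if (eof)  # leave before <> closes the file, so it can be rewound
      }
      seek(ARGV, 0, 0);  # back to the beginning of the log
      # Pass 2: print only the records whose archive was not removed
      while (<>) {
        # Test the match itself so a stale $1 from pass 1 is never used
        next if m/^ Using archive:\s+(.*)$/m && exists $removehash{ $1 };
        print;
      }
    }' "logfile"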
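
Here is a small sketch of the two-pass idea on a made-up log. The log layout (three "Backup run" records and the names arch-001 to arch-003) is hypothetical and only shaped to fit the two patterns used above. This sketch leaves out -i~, so the result goes to standard output and the log itself is not modified.

```shell
# Hypothetical log: each record names the archive it uses; run 3 also removes two old ones
workdir=$(mktemp -d)
cat > "$workdir/logfile" <<'EOF'
Backup run 1
 Using archive: arch-001

Backup run 2
 Using archive: arch-002

Backup run 3
 Using archive: arch-003
 Removed archive: arch-001
 Removed archive: arch-002
EOF

result=$(perl -e '
  use strict;
  use warnings;
  my %removehash;
  {
    local($/) = "";
    # Pass 1: collect the removed archive names
    while (<>) {
      while (m/^ Removed archive:\s+(.*)$/mg) {
        $removehash{ $1 } = $1;
      }
      last if (eof);
    }
    seek(ARGV, 0, 0);
    # Pass 2: print the records whose archive is still around
    while (<>) {
      next if m/^ Using archive:\s+(.*)$/m && exists $removehash{ $1 };
      print;
    }
  }' "$workdir/logfile")

echo "$result"
```

Only the run 3 record is printed, since arch-001 and arch-002 both show up in a "Removed archive:" line.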

-Enjoy
fh : )_~

Last edited by Festus Hagen; 09-04-2009 at 09:14 AM..
 
