Please Help. Strings in file 1 need to be searched and removed from file 2
Post 302085647 by tmarikle on Friday 18th of August 2006 02:12:04 PM
awk is amazingly well suited for this kind of operation. I've seen a method similar to yours take several days on flat files containing ~20 million records (I can't recall the exact count), while something similar to the following brought that down to less than 3 minutes.
Code:
nawk '
    # While processing records from file a (9000 lines)
    FILENAME=="file_a.txt" {
        # Record key value that should be excluded from file b
        Keys[$1]++
    }
    
    # While processing records from file b (50000)
    FILENAME=="file_b.txt" {
        # Look up key value in keys collected from file a;
        # "in" tests membership without creating an empty array entry
        if (!($1 in Keys)) {
            # If the key is not found in the key array, save in the delta file
            #print > "deltas.txt"
            print $0
        }
    }
' file_a.txt file_b.txt

I created a 9000-record test file (file_a.txt) and a 50000-record test file (file_b.txt), each consisting of a single key field, and the process took 1/5 second.
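For anyone who wants to try the idea on a small scale first, here is a minimal sketch of the same lookup technique. The sample keys, the temporary directory, and the `result` variable are made up for illustration, and it uses plain `awk` with the common `NR == FNR` idiom instead of the `FILENAME` tests, on the assumption that the first file is not empty.

```shell
#!/bin/sh
# Hypothetical sample data: file names and keys are invented for this demo.
workdir=$(mktemp -d)
printf '%s\n' k2 k4 > "$workdir/file_a.txt"
printf '%s\n' 'k1 alpha' 'k2 beta' 'k3 gamma' 'k4 delta' > "$workdir/file_b.txt"

# NR == FNR holds only while the first file is being read, so the keys to
# exclude are collected there. For the second file, a record is printed
# only when its first field is not one of the stored keys.
result=$(awk '
    NR == FNR { Keys[$1]; next }
    !($1 in Keys)
' "$workdir/file_a.txt" "$workdir/file_b.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

With this sample data the script prints the `k1 alpha` and `k3 gamma` records, since k2 and k4 are listed in the first file. Each record of the second file costs a single array lookup, which is why this approach scales so much better than rescanning one file for every line of the other.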
 
