02-10-2012
Thanks @ahamed101.
I did try grep -f, and there are two problems. First, a pattern file with duplicate entries returns each matching line only once, so the order of File1.csv, which I am trying to preserve, is lost. Second, grep is notoriously inefficient for this task. For 173k patterns I would need to split File1.csv into chunks in a loop and search File2.csv with each chunk. Even then, using grep with more than 10k patterns begins to take several seconds. Other posts have profiled similar performance. This second consideration is not a total deal-breaker, but (a) I am going to have to perform a large number of these kinds of matches, and (b) the pattern files will get bigger. awk is *fast*, so it would be great if I could find a more efficient solution.
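
For reference, the kind of awk approach I have in mind is roughly the following. It is only a sketch, and it makes assumptions I have not confirmed for my data: the key is the first comma-separated field in both files, a match means exact equality (not a regex or substring match), and each key occurs at most once in File2.csv (if it repeats, the last row wins). The function name `lookup_in_order` is made up for illustration.

```shell
# Hypothetical helper: print the File2 row for each File1 key, in File1 order.
# Assumes the key is field 1 of both comma-separated files and that matching
# is exact string equality.
lookup_in_order() {
    keys_file=$1    # e.g. File1.csv: its order and duplicates are preserved
    data_file=$2    # e.g. File2.csv: read once and held in memory
    awk -F, '
        NR == FNR   { row[$1] = $0; next }   # first file: index rows by key
        ($1 in row) { print row[$1] }        # second file: emit in key order
    ' "$data_file" "$keys_file"
}

# Usage: lookup_in_order File1.csv File2.csv > matched.csv
```

Because File2.csv is loaded into an array once and File1.csv is then streamed line by line, duplicate keys in File1.csv each produce their own output line, and no chunking loop is needed. Keys with no match in File2.csv are silently skipped in this sketch.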