02-10-2012
Thanks @ahamed101.
I did try grep -f, and there are two problems. First, when the pattern file contains duplicate entries, grep prints each matching line of File2.csv only once, in File2.csv's order, which destroys the order of File1.csv that I am trying to preserve. Second, grep is notoriously inefficient for this task. For 173k patterns I would need to split File1.csv into chunks in a loop and search File2.csv with each chunk. Even then, a grep search with more than 10k patterns begins to take several seconds, and other posts have profiled similar performance. This second consideration is not a total deal-breaker, but (a) I am going to have to perform a large number of these matches, and (b) some of them will use bigger pattern files. awk is *fast*, so it would be great if I could find a more efficient solution.
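For reference, a minimal sketch of the kind of awk lookup that would cover both points. It assumes the match key is the first comma-separated field in both files and that File2.csv fits in memory, neither of which is stated above. The function name lookup_in_order is hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes the key is field 1 of both CSV files.
# Prints the File2.csv row for every File1.csv line whose key is found,
# in File1.csv order, repeating the row when File1.csv repeats a key.
lookup_in_order() {
    # $1 = pattern file (e.g. File1.csv), $2 = file to search (e.g. File2.csv)
    awk -F',' '
        # First file on the command line (the searched file): index rows by key.
        # If a key occurs more than once there, the last row wins.
        # The NR == FNR idiom assumes this first file is not empty.
        NR == FNR { rows[$1] = $0; next }
        # Second file (the pattern file): one hash lookup per line.
        $1 in rows { print rows[$1] }
    ' "$2" "$1"
}
```

Called as `lookup_in_order File1.csv File2.csv`, this reads each file once, so the 173k patterns would not need to be split into chunks.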
LEARN ABOUT DEBIAN
dpkg-awk
DPKG-AWK(1) General Commands Manual DPKG-AWK(1)
NAME
dpkg-awk - Utility to read a dpkg style db file
SYNOPSIS
dpkg-awk [(-f|--file) filename] [(-d|--debug) ##] [(-s|--sort) list] [(-rs|--rec_sep) ??] '<fieldname>:<regex>' ... -- <out_fieldname> ..
DESCRIPTION
dpkg-awk parses a dpkg status file (or other similarly formatted file) and outputs the resulting records. It can use a regex on the field
values to limit the returned records, can be told which fields to output, and can sort the matched fields.
OPTIONS
-f filename
--file filename
The file to parse. The default is /var/lib/dpkg/status.
-d [#]
--debug [#]
Each time this is specified, it increases the debug level.
-s field(s)
--sort field(s)
A space or comma separated list of fields to sort on.
-n field(s)
--numeric field(s)
A space or comma separated list of fields that should be interpreted as numeric in value.
-rs ??
--rec_sep ??
Output this string at the end of each output paragraph.
-h
--help Display some help.
fieldname
The fields from the file that are matched against the given regex. The fieldnames are case-insensitive.
out_fieldname
The fields from the file, that are output for each record. If the first field listed begins with ^, then the list of fields that
follows will NOT be output.
BUGS
Be warned that the author has only a shallow understanding of the dpkg packaging system, so there are probably tons of bugs in this
program.
This program comes with no warranties. If running this program causes fire and brimstone to rain down upon the earth, you will be on your
own.
This program accesses the dpkg database directly in places, querying for data that cannot be gotten via dpkg.
AUTHOR
Adam Heath <doogie@debian.org>
DEBIAN
Debian Utilities DPKG-AWK(1)