Performance issue in Grepping large files
Post 302820483 by MadeInGermany on Wednesday 12th of June 2013 06:14:01 PM
I suggest running grep once with -f $keywordfile over all the input files, instead of once per keyword.
That starts grep far less often and opens each input file only once.
The post-processing is a bit more awkward:
Code:
# Solaris: put the POSIX (XPG4) versions of grep and awk first in PATH
PATH=/usr/xpg4/bin:${PATH}
export PATH
find "$tmpdir" -type f \( -name "*.rdf" -o -name "*.fmb" -o -name "*.pll" -o -name "*.ctl" -o -name "*.sh" \
 -o -name "*.sql" -o -name "*.prog" \) -exec grep -F -i -x -f "$keywordfile" /dev/null {} + |
 # the /dev/null guarantees >=2 file arguments so grep always prints filename:matchword
 # fold matched keywords to lowercase, remove duplicates and add the match count
 # (k2 keeps its leading ":", so each output line is matchCount:filename:keyword;
 #  this assumes file names contain no ":")
 awk -F":" '{k2=tolower(substr($0,length($1)+1))} {c[$1 k2]++} END {for (i in c) print c[i] FS i}' |
 while IFS=":" read matchCount filename keyword
 do
  # file extension, e.g. "sql"
  out3=`echo "$filename"|awk -F\. '{print $NF}'`
  bfilename=`basename "$filename"`
  case $out3 in
   'rdf')   category="REPORT";;
   'fmb')   category="FORM";;
   'sql')   category="SQL FILE";;
   'pll')   category="Library File";;
   'ctl')   category="Control File";;
   'sh')    category="Shell script";;
   *)       category="OTHER";;
  esac
  echo "bfilename,keyword,matchCount,out3,category are:- $bfilename,$keyword,$matchCount,$out3,$category"
  # SQL stuff follows
 done
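
To see what the grep and awk stages hand to the while loop, here is a small self-contained sketch. The demo directory, file names and keywords are made up for illustration, and it assumes mktemp -d is available; the grep options and the awk program are the same as above.

```shell
# Hypothetical demo data: one keyword file and two small input files.
demo=$(mktemp -d)
printf '%s\n' 'select' 'commit' > "$demo/keywords"
printf '%s\n' 'SELECT' 'select' 'selected' > "$demo/a.sql"
printf '%s\n' 'Commit' 'echo done' > "$demo/b.sh"

# One grep run covers all keywords; /dev/null forces the "filename:" prefix.
# -x only accepts whole-line matches, so "selected" is not counted.
# $1 = keyword file, $2 and $3 = input files
count_matches() {
  grep -F -i -x -f "$1" /dev/null "$2" "$3" |
  awk -F":" '{k2=tolower(substr($0,length($1)+1))} {c[$1 k2]++} END {for (i in c) print c[i] FS i}' |
  sort
}

# Run inside the demo directory so the file names stay short.
result=$(cd "$demo" && count_matches keywords a.sql b.sh)
printf '%s\n' "$result"
rm -rf "$demo"
```

Each output line has the form matchCount:filename:keyword, which is exactly what the read with IFS=":" splits apart. The sort is only there to make the order predictable, since awk's for (i in c) returns keys in no particular order.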


Last edited by MadeInGermany; 06-12-2013 at 07:44 PM.. Reason: added matchCount
 
