Shell Programming and Scripting forum, post 302785631 by danish0909 on Tuesday 26th of March 2013 04:52:31 AM
Search for patterns in thousands of files

Hi All,


I want to search for certain strings in thousands of files that are spread across directories created daily. I wrote a small bash script to do this, but when I run it I get the error below:

/ms.sh: xrealloc: subst.c:5173: cannot allocate 268435456 bytes (536977408 bytes allocated)

Pasting the code that I wrote:

Code:
#!/usr/local/bin/bash

for i in `cat msisdn_u.txt`
do

cd /comptel4/elink/backup1/output/vas/NG0/20130301
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130302
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130303
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130304
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130305
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130306
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130307
find ./*GPX.Z|xargs zcat|grep $i; cd ..
..
..
..
done
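
A possible restructuring of the loop above, offered as a sketch rather than a confirmed fix: the error is raised from bash's substitution code (subst.c), so one plausible way to sidestep it is to keep large expansions out of the shell and let find hand the file names to zcat. The function name search_pattern and its argument order are made up for illustration, and the sketch assumes a grep that accepts -F (some older greps lack it; fgrep is the traditional equivalent).

```shell
#!/usr/bin/env bash
# Sketch only: search_pattern is an illustrative name, not from the post.
# Usage: search_pattern PATTERN DIR [DIR...]
search_pattern() {
    local pattern=$1 dir
    shift
    for dir in "$@"; do
        # find passes the file names to zcat in batches, so the shell
        # never has to expand one huge *GPX.Z glob itself.
        # grep -F treats the pattern as a fixed string, not a regex.
        find "$dir" -type f -name '*GPX.Z' -exec zcat {} + | grep -F -- "$pattern"
    done
}

# Example call (directory paths taken from the post):
#   while IFS= read -r msisdn; do
#       search_pattern "$msisdn" \
#           /comptel4/elink/backup1/output/vas/NG0/20130301 \
#           /comptel4/elink/backup1/output/vas/NG0/20130302
#   done < msisdn_u.txt
```

Reading the patterns with `while read` instead of `for i in` with backticks avoids loading the whole patterns file into a single command substitution, and passing the directories as arguments removes the repeated cd lines.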

This is in the patterns file:

Code:
more msisdn_u.txt
0564891888
0500555401
0563433343
0561132174
0562714661
0543210172
0503588147
0541400224
0564445889
0544998887
0564543055
0544095240
0563211334
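
Because the patterns file holds one fixed string per line, it can be given to grep directly, so each archive is decompressed once no matter how many numbers are listed. The following is a minimal sketch under that assumption; search_all and its arguments are hypothetical names, and it assumes a grep that supports both -F and -f (on some systems that means fgrep or an XPG4 grep).

```shell
#!/usr/bin/env bash
# Sketch only: search_all is an illustrative name, not from the post.
# Usage: search_all PATTERN_FILE DIR [DIR...]
search_all() {
    local patterns=$1
    shift
    # -F: match fixed strings; -f: read one pattern per line from the file.
    # Every archive under the given directories is decompressed exactly once.
    find "$@" -type f -name '*GPX.Z' -exec zcat {} + | grep -F -f "$patterns"
}

# Example call (paths taken from the post):
#   search_all msisdn_u.txt /comptel4/elink/backup1/output/vas/NG0/20130301
```

One caveat with this approach: an empty line in the patterns file matches every input line, so the file should be checked for blank lines first.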

Please advise, as I need to find the matching records and report them to management.

Thanks

Danish

Last edited by radoulov; 03-26-2013 at 08:21 AM..
 
