Post 302539431 by OldGaf on Sunday 17th of July 2011 12:40:27 PM
Perl - work with open files or write to @lists first?

I am dealing with many thousands of fairly small files.
I need to search them for various matches and depending on what I find, may need to search some files again for additional matches.

Generally speaking, is it better to read a text file into an @array/@list and then work with that (multiple searches within it, creating $vars etc.), or to open the file and go through it a few times using while?

For example, say I am searching through a file for A, B and C.
Depending on what I find, I may then need to look in the same file for 1, 2, 3. Also, I will not always find A, B and C in that order; it could be B, A, C etc.

I could just open the file and use while to look line by line for matches, and depending on what I find, I could use another while (or more) to go through the file again, closing it when done.
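A rough sketch of that first approach, for illustration only. The literal strings A, B, C and the digits 1, 2, 3 stand in for the real patterns, the names scan_file and handle_matches are made up here, and the rule "rescan only when all three were found" is an assumption. The point is that one while loop can test several patterns regardless of their order, and seek rewinds the handle so the file does not have to be reopened for another pass.

```perl
use strict;
use warnings;

# Hypothetical helper: rewind an already-open handle and report whether
# any line matches the given regex.
sub handle_matches {
    my ($fh, $re) = @_;
    seek($fh, 0, 0) or die "seek failed: $!";    # back to the start of the file
    while (my $line = <$fh>) {
        return 1 if $line =~ $re;
    }
    return 0;
}

# Hypothetical scanner for one file.
sub scan_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";

    # First pass: test every pattern on every line, so the order in which
    # A, B and C appear in the file does not matter.
    my %found;
    while (my $line = <$fh>) {
        for my $key (qw(A B C)) {
            $found{$key} = 1 if index($line, $key) >= 0;
        }
    }

    # Second pass only when the first one calls for it (assumed rule:
    # all three were found).
    my $second = 0;
    if ($found{A} && $found{B} && $found{C}) {
        $second = handle_matches($fh, qr/[123]/);
    }

    close($fh);
    return { found => \%found, second => $second };
}
```

Calling scan_file($path) for each file keeps only one line in memory at a time, at the cost of re-reading the file for every extra pass.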

OR, I could read the file into @list and search it for what I want.
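A rough sketch of the in-memory approach, again only for illustration. The names slurp_lines and search_lines are made up, the literal strings A, B, C and the digits 1, 2, 3 stand in for the real patterns, and the rule "search again only when all three were found" is an assumption. The file is read from disk once, and every later search is a grep over the array.

```perl
use strict;
use warnings;

# Hypothetical helper: read the whole file into an array of lines.
sub slurp_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @lines = <$fh>;    # list context reads every line at once
    close($fh);
    return \@lines;
}

# Hypothetical search over the in-memory lines.
sub search_lines {
    my ($lines) = @_;

    # Count how many lines contain each placeholder string;
    # grep in scalar context returns the number of matches.
    my %count;
    for my $key (qw(A B C)) {
        $count{$key} = grep { index($_, $key) >= 0 } @$lines;
    }

    # Follow-up search, done only when needed (assumed rule: all three found).
    my @extra;
    if ($count{A} && $count{B} && $count{C}) {
        @extra = grep { /[123]/ } @$lines;
    }

    return (\%count, \@extra);
}
```

With small files the memory cost of holding one file at a time is modest, and repeated searches never touch the disk again. Which approach is faster in practice depends on the files and patterns, so timing both on a sample of the real data is the safer guide.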

Depending on what I find, some files will need a lot of searching, some very little.

Is there a rule of thumb for which approach is faster / easier on resources when dealing with thousands of files?

Thanks,
-OG-
 
