Using awk on multiple files in a directory (Post 302838009 by SkySmart, 28 July 2013)
Quote:
Originally Posted by RudiC
Why don't you gunzip all files upfront and then apply the awk script to the entire directory?
Actually, that's the least of my problems now; I believe I'll be able to figure that out at the end. The only other question I have is this: let's say the first time I run this command, I get output similar to this:

Code:
first run:
/data/projects/file01,300lines,130lines matching 'Customer.*Processed'

(Note: this is just one file out of many that would appear in the output.)

Now, the above output is saved to a file called /tmp/results.txt.
The second time I run this command, say 5 minutes later, there'd be a line in the output similar to:

Code:
second run:
/data/projects/file01,410lines,139lines matching 'Customer.*Processed'

Now, I don't want to search through each file again; I want to begin from the point where the last scan left off.

In the first run, there were 300 lines in the file named /data/projects/file01. I want it so that, the next time I run the script, awk begins from line 301 and reads to the end of the file, and I want this to happen for every file it finds in the directory. That way, only the first run will be slow; all runs after that will be fast.
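
For a single file, the resume step itself is just an FNR guard. A minimal sketch (the 300 stands in for whatever count the previous run saved; the path and pattern are from the examples above):

Code:
# Count fresh matches in one file, skipping the 300 lines already scanned.
awk -v last=300 'FNR > last && /Customer.*Processed/ {n++} END {print n+0}' /data/projects/file01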

Here's my attempt to modify your code:
Code:
# Pull the saved line count from the previous results file (second
# comma-separated field, "lines" suffix stripped). Caveat: if
# /tmp/results.txt lists several files, this grabs every count;
# see the per-file array sketch at the end of the post.
lastlinenumber=$(awk -F',' '{sub(/lines/, "", $2); print $2}' /tmp/results.txt)

awk -v LLNUM="${lastlinenumber}" '
        FNR == 1      {if (NR > 1) print fn, "text1", fnr, "text2", nl
                       fn = FILENAME; fnr = 1; nl = 0}
                      {fnr = FNR}
        # FNR restarts in every file (NR does not), so the offset is per file;
        # the pattern case now matches the saved "Customer.*Processed" text.
        FNR > LLNUM && /Customer.*Processed/ {nl++}
        END           {print fn, "text1", fnr, "text2", nl}
' file?

Also, while comparing the most recent list of files against the previous scan, if it finds a file that didn't exist in the previous scan, it should scan that file in its entirety, because that file would be considered new.
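
Putting both pieces together, here is one way it could hang together: a single awk pass that first loads the previous per-file counts into an array, then scans the data files, skipping each file up to its saved count. Files absent from the array get an offset of zero, so a new file is scanned in full. This is only a sketch built on the formats shown above; the start[] array name and the simplified "matching pattern" output label are mine, and the output is written in the same comma-separated format so the next run can read it straight back:

Code:
#!/bin/sh
# Sketch: resume each file from its saved line count; new files start at 0.
touch /tmp/results.txt    # make sure the results file exists on the very first run

awk '
    # While reading the results file, remember each file name with its old count.
    # FILENAME == ARGV[1] stays correct even when results.txt is empty
    # (the usual NR == FNR test would not).
    FILENAME == ARGV[1]  {split($0, f, ","); n = f[2]; sub(/lines/, "", n)
                          start[f[1]] = n + 0; next}
    FNR == 1             {if (fn != "") printf("%s,%dlines,%dlines matching pattern\n", fn, fnr, nl)
                          fn = FILENAME; nl = 0}
                         {fnr = FNR}
    # Files with no start[] entry compare against 0, so they are scanned fully.
    FNR > start[FILENAME] && /Customer.*Processed/ {nl++}
    END                  {if (fn != "") printf("%s,%dlines,%dlines matching pattern\n", fn, fnr, nl)}
' /tmp/results.txt /data/projects/file* > /tmp/results.new &&
mv /tmp/results.new /tmp/results.txt

Writing the new counts to /tmp/results.new and renaming at the end keeps the old results readable while the scan runs, and the next run picks up exactly where this one stopped.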
 
