Finding Files with Perl or awk?


 
# 8  
Old 05-19-2015
That it's 75 times slower via NFS than on local disk is a pretty big hint.

Yes, NFS versus disk matters quite a lot. Its asynchronous nature means that an operation you'd consider atomic on a local disk means a lot of "Who has this file? Is this file busy? Has anyone updated it since I saw it last? I'm taking this file, LAST WARNING, FILE TAKEN. Everyone who might be reading this file, I have updated it, flush your buffers! I have now CLOSED THE FILE, free for all takers" etc., etc. messaging back and forth. Imagine the overhead this creates for programs using huge clutters of tiny files! All that asking and waiting. If any of it can be done in parallel, that can help: having 3-4 requests in flight at once instead of 1. But a shared filesystem is by definition a lot more back-and-forth than a dedicated one.
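
To make the "3-4 requests at once" idea concrete, here is a minimal sketch, assuming GNU find and xargs. The function name parallel_find_aps and the default of 4 jobs are made up for illustration, and whether this actually helps depends on the NFS server and the shape of the tree. It starts one find per top-level subdirectory so that several lookups are waiting on the network at the same time.

```shell
# Hypothetical helper: search each top-level subdirectory of "$1" in its own
# find process, running up to "$2" (default 4) of them at once.
parallel_find_aps() {
    dataDir=$1
    jobs=${2:-4}
    # List the top-level subdirectories NUL-separated, then hand each one
    # to a separate find. -r skips the run entirely if there are none.
    find "$dataDir" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -r -P "$jobs" -n 1 \
            sh -c 'find "$1" -type d -name "*.aps" -print' sh
}
```

The results arrive in whatever order the workers finish, so sort them afterwards if order matters, e.g. parallel_find_aps "$dataDir" 4 | sort.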

The traditional answer to "why is my file tree walker slow" is "Is it NFS?" and the traditional reply to that is something along the lines of "Why -- does that matter?" followed by "I configured it to skip the NFS, much better now".
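
For the "configured it to skip the NFS" case, a rough sketch with GNU find's -fstype test might look like the following. The function name find_skip_nfs is hypothetical, and the filesystem type names nfs and nfs4 are what Linux usually reports, so check your own mount table first. It only helps when the files you want are not themselves on the NFS mount.

```shell
# Hypothetical helper: walk "$1" but never descend into NFS-mounted
# directories, printing the *.aps folders found everywhere else.
find_skip_nfs() {
    find "$1" \( -fstype nfs -o -fstype nfs4 \) -prune \
        -o -type d -name '*.aps' -print
}
```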
# 9  
Old 05-19-2015
Corona is spot on. Nothing you can do code-wise on the UNIX box will make NFS faster.
You need to short circuit the find and read NFS operations somehow.

If you absolutely have to speed things up no matter what: reverse roles. Locate the files on your UNIX box and then Samba/CIFS/NFS share them to the Windows box.
# 10  
Old 05-19-2015
I found something interesting. *.aps folders are rarely inside other *.aps folders (although *.aps files are . . . just not folders).
folderList=$( find "$dataDir" -type d -name '*.aps' -printf '%T@|%p\n' -prune | sort -nr | cut -d"|" -f 2 )
is not a perfect solution (I miss a low single-digit percentage of matches), but it gives me an idea of how fast things can be (about 40x faster). Yes, and this proves that NFS is the bottleneck . . .

I found out that if a *.aps folder contains a file called Autorange.lay, you definitely do not need to descend any further to find more *.aps folders.
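
A sketch of how that rule could be folded into the find itself, not tested against the real data: every *.aps folder is printed, but find only stops descending when the folder contains Autorange.lay. The function name find_aps_folders is just for illustration, and the extra test costs one more lookup per *.aps folder.

```shell
# Hypothetical helper: list *.aps folders under "$1", descending into one
# only when it lacks the Autorange.lay marker file.
find_aps_folders() {
    find "$1" -type d -name '*.aps' -print \
        -exec sh -c 'test -e "$1/Autorange.lay"' sh {} \; -prune
}
```

Unlike an unconditional -prune, this keeps searching inside *.aps folders that have no Autorange.lay, so nested *.aps folders there are still found.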

Mike
# 11  
Old 05-20-2015
If you require 100% accuracy, and can't find a simplification which works 100% of the time, a full search may be unavoidable.