Efficient shell script code


 
# 1  
Old 07-09-2013

Hi all,
I am working with an extremely large collection of text data (about 2 million XML files) in a single directory. I have changed the extension from .xml to .dat. Right now I am using this code to remove the XML tags, but it is far too slow; it seems to take forever:
Code:
#ls -1 *.dat | while read page
find . -name "*.dat" -print | while IFS= read -r page
do
    # quote the name so spaces or glob characters in it do not break the command
    links -dump "$page" > "$page.txt"
done

Note that the commented-out ls line does not work at all: it fails with an "Argument list too long" error.
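For context, a minimal sketch of why the ls line fails and find does not. The shell expands *.dat into every matching name and has to pass them all to ls at once, which exceeds the kernel's limit on argument size; find is given only the pattern and does the matching itself. The scratch directory and sample file names below are made up for illustration, and the loop body is a placeholder rather than the real links call.

```shell
# Scratch directory with two sample files (hypothetical names).
workdir=$(mktemp -d)
printf '<p>one</p>\n' > "$workdir/1.dat"
printf '<p>two</p>\n' > "$workdir/my page.dat"

# Upper bound, in bytes, on the arguments plus environment of a new program.
getconf ARG_MAX

# find receives only the pattern, not every file name, and "{} +" hands
# the matches to the inner sh in batches that stay under that limit.
find "$workdir" -name '*.dat' -exec sh -c '
    for page in "$@"; do
        : > "$page.txt"    # placeholder for: links -dump "$page" > "$page.txt"
    done
' sh {} +

converted=$(find "$workdir" -name '*.txt' | wc -l | tr -d ' ')
echo "converted: $converted"
```

A plain for page in *.dat loop avoids the error for a related reason: the names are expanded inside the shell and never passed to another program in one go.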

Then I modified the code, and came up with this:
Code:
#ls -1 *.dat | while read page
#find . -name "*.dat" -print | while read page
num=1
for page in *.dat
do
    links -dump "$page" > "$page.txt"
    num=$((num + 1))
done

I would like to know whether this will speed up my task. Instead of running ls or find, I want my code to generate each filename and then process the file with that name. The trick I have used is to rename all 2 million files with contiguous numbers (1.dat, 2.dat, 3.dat, 4.dat and so on, with no gaps), then generate those numbers with a counter and read the files.
Or is there a better way to speed up my task? I am using Linux with Bash.
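The counter idea described above could be sketched roughly as follows. This is only an illustration under assumptions: the scratch directory, the three sample files and the dump_page helper are hypothetical, and dump_page falls back to cat so the sketch still runs on a machine without links installed.

```shell
# Scratch directory with contiguously numbered sample files (hypothetical data).
workdir=$(mktemp -d)
total=3
num=1
while [ "$num" -le "$total" ]; do
    printf '<p>page %s</p>\n' "$num" > "$workdir/$num.dat"
    num=$((num + 1))
done

# Hypothetical helper: use links when available, otherwise a cat stand-in.
dump_page() {
    if command -v links >/dev/null 2>&1; then
        links -dump "$1"
    else
        cat "$1"
    fi
}

# Build each file name from the counter; stop at the first missing number.
num=1
while [ -e "$workdir/$num.dat" ]; do
    dump_page "$workdir/$num.dat" > "$workdir/$num.dat.txt"
    num=$((num + 1))
done
echo "processed $((num - 1)) files"
```

This removes the directory listing and the glob expansion, but it still starts one converter process per file.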
# 2  
Old 07-09-2013
No matter which way you do it, you're still running links 2 million times. Which do you think is the holdup -- the tiny shell loop, or the part that does all the actual work?
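If the per-file conversion really is the bottleneck, one option (not suggested in the thread, so treat it as an assumption) is to overlap the conversions rather than speed up the loop. The sketch below uses the -P option of GNU xargs to run several converters at once. The worker count of 4, the scratch directory and the cat fallback are all illustrative, and it assumes the machine has spare cores and that the disk is not already the limiting factor.

```shell
# Scratch directory with four sample files (hypothetical data).
workdir=$(mktemp -d)
for n in 1 2 3 4; do
    printf '<p>page %s</p>\n' "$n" > "$workdir/$n.dat"
done

workers=4    # assumed number of parallel converters; tune to the machine

# -print0 / -0 keep odd file names intact; -n 1 gives each inner sh one
# file, and -P runs up to $workers of them at the same time.
find "$workdir" -name '*.dat' -print0 |
    xargs -0 -n 1 -P "$workers" sh -c '
        if command -v links >/dev/null 2>&1; then
            links -dump "$1" > "$1.txt"
        else
            cat "$1" > "$1.txt"    # stand-in when links is not installed
        fi
    ' sh

done_count=$(find "$workdir" -name '*.txt' | wc -l | tr -d ' ')
echo "converted $done_count files"
```

The total number of converter runs is unchanged at one per file; only the wall-clock time can drop, and by at most roughly the number of workers.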

Last edited by Corona688; 07-09-2013 at 12:46 PM..