Shell script reading file slow


 
# 1  
Old 06-17-2014
Shell script reading file slow (RedHat)

I have a shell script, shown below:

Code:
#!/bin/bash
echo ======= LogManageri start ==========

# Raw data from the remote server arrives in this directory
Raw_data=/opt/ftplogs

# Files are copied to this directory for processing
Processing_dir=/opt/processing_dir

# Processed files are moved to this directory as a backup
Processed_dir=/opt/processed_dir

# Lines are split into access and error logs under this directory
split_dir=/opt/split_dir

# Copy the raw data to the processing directory
echo starting copying files from $Raw_data to $Processing_dir
cp -p $Raw_data/*.gz $Processing_dir/
echo done copying raw files.

# Decompress the .gz files in $Processing_dir
echo starting unzipping files in $Processing_dir
gunzip $Processing_dir/*.gz
echo done unzipping files in $Processing_dir

# This for loop takes the year, month and day from each file name

for file in $Processing_dir/*.log
do
        year=${file:29:4}
        month=${file:33:2}
        day=${file:35:2}

# This while loop reads each file line by line and extracts $cname and $scode
echo start reading files
while read -r line
do
        # cname is the third "/"-separated part of field 11
        cname=$(echo ${line} | awk '{split($11,c,"/"); print c[3]}')
        # scode is field 9
        scode=$(echo ${line} | awk -F"[ ]" '{print $9}')
        # lines with scode between 200 and 399 go to access.log
        [[ ( ${scode} -ge 200 ) && ( ${scode} -le 399 ) ]] && {
        # create the directory if it does not exist
        [[ ! -d "$split_dir/$cname/$year/$month/$day" ]] && mkdir -p "$split_dir/$cname/$year/$month/$day"
        # append the line to access.log
        echo ${line} >> $split_dir/$cname/$year/$month/$day/access.log
        }
        # lines with scode between 400 and 599 go to error.log
        [[ ( ${scode} -ge 400 ) && ( ${scode} -le 599 ) ]] && {
        # create the directory if it does not exist
        [[ ! -d "$split_dir/$cname/$year/$month/$day" ]] && mkdir -p "$split_dir/$cname/$year/$month/$day"
        # append the line to error.log
        echo ${line} >> $split_dir/$cname/$year/$month/$day/error.log
        }
done < $file
done
echo files reading done
# after successfully splitting the logs, gzip the .log files
echo starting zip logs
gzip $Processing_dir/*.log
echo compression done
# move the gzip files to $Processed_dir
mv $Processing_dir/*.gz  $Processed_dir
echo moved file to $Processed_dir

The while loop above that reads the files takes 2 minutes for 20 files of 350KB.

Please suggest where I can tune my script.

Last edited by Don Cragun; 06-17-2014 at 03:08 AM.. Reason: Add CODE tags.
# 2  
Old 06-17-2014
Are you sure that the file processing is taking up that time, and not the gunzip/gzip decompression and compression?

Please use code tags.
# 3  
Old 06-17-2014
Thanks for your reply.
I am sure the time is spent reading the files, not in compression and decompression.
My application generates 200 or more files per hour, every day.
Please suggest an alternative or a way to tune the script.
# 4  
Old 06-17-2014
Instead of reading each line into the line variable and then splitting it with awk, you could have read split it into individual variables directly, saving several process creations per loop iteration.
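
A minimal sketch of that idea, assuming bash and assuming the fields are separated by spaces in the way the two awk calls in post #1 imply (field 9 is scode, field 11 is a URL whose third "/"-separated part is cname). No sample data has been posted, so the layout is a guess. The function name split_fields and the placeholder variable names are made up for illustration:

```shell
#!/bin/bash
# Sketch: let read do the splitting, so no echo | awk runs per line.
# split_fields is a hypothetical name; it prints "scode cname" for each
# line on stdin. Assumes scode is field 9 and the URL is field 11.
split_fields() {
    local f1 f2 f3 f4 f5 f6 f7 f8 scode f10 url rest
    local scheme empty cname path
    while read -r f1 f2 f3 f4 f5 f6 f7 f8 scode f10 url rest
    do
        # Same split as split($11, c, "/") in awk: cname is the third piece.
        IFS=/ read -r scheme empty cname path <<< "$url"
        printf '%s %s\n' "$scode" "$cname"
    done
}
```

In the real loop the printf would be replaced by the existing [[ ]] tests and appends, with the input redirected as before (done < "$file"). Both read calls are shell builtins, so nothing is forked per line.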
# 5  
Old 06-17-2014
It might be worth putting the time stamp in your messages so you can be sure where the time is spent:-
Code:
echo "`date` starting unzipping files in $Processing_dir"
gunzip $Processing_dir/*.gz
echo "`date` done unzipping files in $Processing_dir"
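
If you would rather see elapsed times than compare time stamps by eye, something like the following might do. It is only a sketch, assuming bash: timed is a hypothetical helper name, and SECONDS is the bash built-in counter of whole seconds, so anything quicker than a second shows as 0s.

```shell
#!/bin/bash
# Sketch: run a command and report how long it took.
# "timed" is a hypothetical helper name; SECONDS is bash's built-in
# counter of whole seconds since the shell started.
timed() {
    local label=$1 start=$SECONDS status
    shift
    "$@"
    status=$?
    echo "$(date '+%H:%M:%S') $label took $((SECONDS - start))s"
    return "$status"
}
```

For example, timed "gunzip" gunzip $Processing_dir/*.gz would print one line when the decompression finishes. The exit status of the wrapped command is passed through unchanged.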

For every record you read in, you have the following :-
Code:
cname=$(echo ${line} | awk '{split($11,c,"/"); print c[3]}')

This means that for every single record, you will start several processes (sub-shell, awk and possibly an echo) and this takes time. These processes are not re-used next time round your loop so it's very expensive. You then do pretty much the same with this for every record too:-
Code:
scode=$(echo ${line} | awk -F"[ ]" '{print $9}')

Assuming that what you really want from these two lines is a section of the record, can you share a sample of the data? I'm sure we can then give you a much more efficient process. Highlight what you need to use from each record.

Can you explain your logic so we can be sure of what you are trying to achieve please.

I can give you ksh commands if that's what is more useful, but someone may be able to code a single awk command to remove your loop entirely which would be better still.
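
As a rough sketch of what such a single awk command might look like, under the same guesses about the data as the original script makes (scode in field 9, cname taken from field 11). The function name split_logs and its argument order are hypothetical. The check on c[3] is an added assumption that is not in the original script: because the name is passed to a shell command, lines whose cname is not a plain host-like string are skipped.

```shell
#!/bin/bash
# Sketch: one awk process per file instead of two per line.
# split_logs is a hypothetical name.
# Usage: split_logs FILE YEAR MONTH DAY OUTDIR
split_logs() {
    awk -v out="$5" -v ymd="$2/$3/$4" '
    $9 !~ /^[0-9]+$/ { next }                 # skip lines without a numeric code
    {
        if ($9 >= 200 && $9 <= 399)      name = "access.log"
        else if ($9 >= 400 && $9 <= 599) name = "error.log"
        else next

        split($11, c, "/")                    # c[3] is what the script calls cname
        # Assumption: only plain host-like names are accepted, because the
        # name ends up in a shell command below.
        if (c[3] !~ /^[A-Za-z0-9][A-Za-z0-9._:-]*$/) next

        dir = out "/" c[3] "/" ymd
        if (!(dir in made)) {                 # mkdir once per directory, not per line
            system("mkdir -p \"" dir "\"")
            made[dir] = 1
        }

        target = dir "/" name
        if (target != last) {                 # keep only one output file open
            if (last != "") close(last)
            last = target
        }
        print >> target
    }' "$1"
}
```

Inside the existing for loop this would be called as split_logs "$file" "$year" "$month" "$day" "$split_dir", replacing the whole while loop. Please test it against real data before relying on it, since the field positions are guesses.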



Robin