unix script optimization


 
# 1  
Old 04-05-2009

I have a file that contains 9,200,000 records and 125 columns. I have to rearrange some of the columns and exclude others. I wrote the following script to do this. It works, but it takes more than 4 hours to run. Can it be optimized?
Here is the script:

Code:
LOC="/sourcefile/"
LOC_FINAL="finalfile/"
Path="base path"
line_num=`wc -l < $LOC"abcd.txt"`
 
#making directory for temp files
mkdir tempfiles2
chmod 777 tempfiles2

#getting the column names
head -1 $infile > headers3

#removing the newline character and carriage return from headers
#cat $LOC"temp5" | tr -d '\n \r' > $LOC"temp6"
cat headers3 | tr -d '\n \r' > headers1
cat headers1

#now we have the headers so we will create the temp files for all the headers
file_nam1=`cat headers1`
file_nam=`echo "$file_nam1" | tr "\"" " "`
field=1
for i in $file_nam
do
        echo "$i created"
        awk -F"\"" '{print $'$field'}' $LOC"abcd.txt" > $Path"tempfiles2/$i"
        field=`expr $field + 1`
        chmod 777 $Path"tempfiles2/$i"
done
 
#now checking for the missing column names
find_file=`cat abcd_ref`
 
line_no=`echo $line_num |sed 's/ //g'`
echo "line no=$line_no"

for i in $find_file
do
missing_col=`grep $i headers1`
if test "$missing_col" = ""
then
        echo "$i not found"
        #putting tabs in the missing file
        count=0
        while test $count -lt $line_no
        do
                #append (>>) one blank line per input record
                echo "  ">>$Path"tempfiles2/$i"
                count=`expr $count + 1`
                line_number=`wc -l < $Path"tempfiles2/$i"`
                echo "line number of the missing file:$line_number"
        done
fi
done

#now temp files are created, we have to append them in an order

cd $Path"tempfiles2"
paste file1 file2 ........ file 100 > final
 
cp final $LOC_FINAL"abcd.txt"
 
#removing temp files
cd ..
rm -r tempfiles2
rm headers1
exit 0


Please help.

Last edited by Franklin52; 04-05-2009 at 10:00 AM.. Reason: adding code tags
# 2  
Old 04-06-2009
It would be easier for us to help if you could provide some sample input data and the desired output.
# 3  
Old 04-06-2009
What does $infile contain? I don't see an infile variable defined anywhere in your script.
If you are selecting only the first line with head -1, I don't think you need to remove the newline character, since it is a single line anyway, so I don't understand what
cat headers3 | tr -d '\n \r' does.
Where did the abcd_ref file come from?
There are many more unclear points in your script; it is hard to follow.
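For what it is worth, the output of head -1 still ends with a newline, and a file created on Windows would also carry a carriage return, so the tr call is not a no-op. Note that the set '\n \r' also contains a space, so any spaces inside header names are deleted too. A small check, using a made-up header line (the real header is not shown in the thread):

```shell
#!/bin/sh
# Hypothetical header line with a space inside a name and a DOS line ending.
sample='id"first name"city\r\n1"ann"oslo\r\n'

# Bytes in the first line as head returns it (head -n 1 is the same as head -1);
# the newline and the carriage return are still present.
raw=$(printf "$sample" | head -n 1 | wc -c | tr -d ' ')

# Bytes left after the tr from the script: \n, \r and spaces are all deleted.
clean=$(printf "$sample" | head -n 1 | tr -d '\n \r' | wc -c | tr -d ' ')

echo "raw=$raw clean=$clean"
```

If the header names can legitimately contain spaces, tr -d '\n\r' (without the space) would be the safer set.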
# 4  
Old 04-06-2009
Basically, this part of the script is causing all the problems.
abcd.txt has some 9,200,000 records.
$file_nam contains the list of headers for which temp files will be created.

So, any ideas on how to make this loop faster?

Code:
for i in $file_nam
do
        echo "$i created"
        awk -F"\"" '{print $'$field'}' $LOC"abcd.txt" > $Path"tempfiles2/$i"
        field=`expr $field + 1`
        chmod 777 $Path"tempfiles2/$i"
done
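
One possible direction, sketched under assumptions: the loop starts a separate awk for every header, so the 9,200,000-record file is read once per column (about 125 times), and paste then reads all the temp files again. A single awk pass could build a name-to-field map from the header row and print the wanted columns directly in the required order, leaving an empty field for any name that is missing. The sample data, the temp directory, and the assumption that abcd_ref holds one column name per line in the desired output order are all illustrative; the thread does not show the real files.

```shell
#!/bin/sh
# Hypothetical sample data; the real abcd.txt and abcd_ref are not shown in the thread.
tmp=$(mktemp -d)
printf 'id"name"city\n1"ann"oslo\n2"bob"rome\n' > "$tmp/abcd.txt"
# Assumed layout: wanted column names, one per line, in output order.
# "zip" does not exist in the input and stands in for a missing column.
printf 'city\nzip\nid\n' > "$tmp/abcd_ref"

# awk reads the reference list first, then streams the data file exactly once.
awk -F'"' -v OFS='\t' '
    NR == FNR { want[++n] = $1; next }           # first file: wanted names, in order
    { sub(/\r$/, "") }                           # drop a trailing carriage return, if any
    FNR == 1 {                                   # header row of the data file
        for (f = 1; f <= NF; f++) pos[$f] = f    # map column name -> field number
        line = ""
        for (k = 1; k <= n; k++) line = line (k > 1 ? OFS : "") want[k]
        print line                               # header in the new order
        next
    }
    {
        line = ""
        for (k = 1; k <= n; k++) {
            # a wanted column that is missing from the header becomes an empty field
            val = (want[k] in pos) ? $(pos[want[k]]) : ""
            line = line (k > 1 ? OFS : "") val
        }
        print line
    }
' "$tmp/abcd_ref" "$tmp/abcd.txt" > "$tmp/final.txt"

cat "$tmp/final.txt"
```

The output is tab-separated, which matches what paste produces by default. This avoids the temp files, the per-column chmod calls, and the line-counting loop for missing columns, but it has not been timed against the original data.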
 