Help with File Slow Processing


 
# 8  
Old 06-28-2011
Try and adapt this new version (not tested):
Code:
#!/bin/sh
#e.g. 20110627 (june 27 2011)
currdate=$1
#e.g. 20100310 (march 10 2010)
enddate=$2

# directory where the 1365 files get generated
localcurves="/home/sratta/feds/localCurves/curves"
outputdir="/home/sratta/curves"
# output file to be generated
OUTFILE="/home/sratta/ZeroCurves/BulkLoad.csv"
touch "$OUTFILE"

# List of 133 curve file names
ZEROCURVEFILES="saud1-monthlinmid
saud6-monthlinmid
.....
suvruvr_usdlinmid
szarzar_usdlinmid"

echo "$ZEROCURVEFILES" > /tmp/zerocurvefiles.tmp

# Loop backwards while currdate has not reached enddate
while [ "$currdate" -ne "$enddate" ]
do

  #Call K2test.sh which generates 1365 files for a given date in $localcurves directory
 ./K2test.sh $currdate
 filesfound=0

#Loop through the 1365 files generated by K2test.sh in $localcurves directory
 for FILE in `cd "$localcurves"; ls | /usr/xpg4/bin/grep -iwf /tmp/zerocurvefiles.tmp 2>/dev/null`
 do
  filesfound=1
  echo "Processing $LOWERCASEFILE.$currdate file"

  nawk '
    FNR==1 {
        numheadrecords = $1;
        rowstoprocess  = numheadrecords + 2;
        printf "Total Number of Rows in header for %s.%s is %s\n", LowFile, Date, numheadrecords;
        next;
    }
    FNR<rowstoprocess {
        juliandate = $1;
        rate       = $2;
        mdate      = $4;
        printf "%s,%s,%s,%s,%s\n", LowFile, Date, juliandate, rate, mdate;
    }
  ' LowFile=$LOWERCASEFILE Date=$currdate $FILE
    
 done
 
#Subtract 1 day from currdate (reverse loop)
 currdate=`./shift_date $currdate -1`
done
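
For example, to process dates backwards from 27 June 2011 to 10 March 2010 (assuming you save the script as bulkload.sh; the name is just an example):
Code:
./bulkload.sh 20110627 20100310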

Jean-Pierre.
# 9  
Old 06-28-2011
(Late post - lost connection, may be out of context)
Quote:
What Operating System and version are you running?
It is Sun Solaris.
The version is in the output from the "uname -a" command. It should then be possible to look up whether your Solaris is an old one which has the old Bourne Shell for /bin/sh, or a new one with the more modern Posix Shell.
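For example:
Code:
uname -a
The release field (e.g. "SunOS ... 5.10") gives the Solaris version: SunOS 5.10 is Solaris 10. Note that on Solaris /bin/sh is traditionally the old Bourne shell; a Posix shell is usually available as /usr/xpg4/bin/sh.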

1) The big inefficiency is using a Shell "read" to read records line-by-line from a data file, then running multiple "awk" processes to separate the fields.
I see now why you reassigned the channels: you are already using the Shell input channel to read a list of files.

I agree with the ideas behind "agiles" modifications.

2) As you have a list of required files, use that list.
I'd add a test to the script to check whether the file exists.
I see that "agiles" modification is ingenious because it allows for this by sending errors to /dev/null:
Quote:
for FILE in `cd localcurves; ls $ZEROCURVEFILES 2>/dev/null`
3) Invoke awk only once and use it to read the data from the files.
A lot of the inefficiency comes from the number of times the original script starts "awk" to process the same $line.

4) Hold the list of 133 files in a real file, not an environment variable, and use "while" rather than "for". Some Bourne shells will not let you have an environment variable that big. (A sketch combining points 2 to 4 follows after this list.)

5) Consider making a version of K2test.sh which only generates the relevant 133 files in /home/sratta/feds/localCurves/curves .

6) Noticed that the variable $LOWERCASEFILE is not set anywhere.

7) If you have a journalling filesystem it is inefficient to repeatedly create a batch of files then overwrite them with Shell. Depends whether K2test.sh removes old files before generating new files.
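
Putting points 2) to 4) together, here is an untested sketch of the inner part of the script, assuming the list file holds the correct (exact) names, one per line, and that this still runs inside the existing date loop (paths and the list file name are only examples):
Code:
LISTFILE=/tmp/zerocurvefiles.tmp   # real file holding the 133 required names
localcurves="/home/sratta/feds/localCurves/curves"
OUTFILE="/home/sratta/ZeroCurves/BulkLoad.csv"

while read FILE
do
  # point 2: skip names that were not generated for this date
  [ -f "$localcurves/$FILE" ] || continue

  # lower-case version of the name for the output fields
  LOWERCASEFILE=`echo "$FILE" | tr '[A-Z]' '[a-z]'`

  # points 1 and 3: one awk run per file, no Shell "read" loop
  nawk '
    FNR==1 { rowstoprocess = $1 + 2; next }
    FNR<rowstoprocess { printf "%s,%s,%s,%s,%s\n", LowFile, Date, $1, $2, $4 }
  ' LowFile="$LOWERCASEFILE" Date="$currdate" "$localcurves/$FILE" >> "$OUTFILE"
done < "$LISTFILE"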
# 10  
Old 06-28-2011
Hi Jean Pierre/Methyl,

Jean Pierre, I see that in your awk code you are using printf. I want the data elements to be written to a file rather than printed to the screen. How do I make the data elements go to $OUTFILE, where OUTFILE is a variable holding the name of the file?

I would appreciate your help.

Thanks
Regards
# 11  
Old 06-28-2011
Check "agiles" next post, but I think this is enough for the redirect:
' LowFile=$LOWERCASEFILE Date=$currdate $FILE >> ${OUTFILE}
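
An alternative (untested) is to put the redirection on the "done" of the inner loop instead, so that $OUTFILE is opened once rather than once per file:
Code:
 for FILE in ...
 do
   ...
   nawk '
     ...
   ' LowFile=$LOWERCASEFILE Date=$currdate $FILE
 done >> ${OUTFILE}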

However there are other problems:
e.g. there is no value in $LOWERCASEFILE.

It would be so much easier if $ZEROCURVEFILES was the name of a file containing a list of the required files with their correct names. This could be created from an "ls -1" report by deleting the ones you don't want. It could equally be created using a "here document" within the script.
Translating the mixed upper-and-lower filename to lower case is a trivial task for the unix "tr" command; working from a lower-case list is proving to be not trivial.
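
For example (untested; the names in the here document are just the first and last entries from post # 8):
Code:
# create the list file from a "here document" within the script
cat <<EOF > /tmp/zerocurvefiles.tmp
saud1-monthlinmid
saud6-monthlinmid
szarzar_usdlinmid
EOF

# translate a mixed-case file name to lower case
LOWERCASEFILE=`echo "$FILE" | tr '[A-Z]' '[a-z]'`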

Any chance you can let us know your version of Solaris?
# 12  
Old 06-29-2011
Thanks Jean Pierre and Methyl,

The awk command you sent will only run for the number of records given in the header, correct? And for the lines following the first line, it uses whitespace as the delimiter to extract the fields? I ask because I don't see the separator being set to space anywhere.

I will modify $ZEROCURVEFILES so that it has the exact names. After doing that, how do I loop through only those files? Basically, I have to somehow give that list to the ls command, and then inside the loop change each file name to lower case using tr, like you mentioned.

I will let you know the version of Solaris when I reach work. I appreciate your help.

Thanks
Regards

Last edited by srattani; 06-29-2011 at 07:20 AM..