concurrent processes


 
# 1  
Old 04-10-2007
concurrent processes

We have a very large text file containing almost 100K lines.
We want to process this file to generate another text file per our data requirements.
As of now, parsing the 100K lines takes 20-25 minutes.

The current script uses:

Code:
while read line ; do
    # parsing ...
done

We want to cut the runtime down from 20-25 minutes to around 5-10 minutes.
Hence it was decided that we should split the large input file into 10 files of 10K lines each and run the parsing script on all 10 files at the same time, as a concurrent run.

How do I achieve this? I am new to the concept of concurrent runs.
Please guide.
# 2  
Old 04-10-2007
Hi,
Since the file is large, it is advisable to use the C language. You can use pthreads to read the files, and you can manage the threads easily.

Thanks
Raghuram
# 3  
Old 04-10-2007
Do consider Perl for this job. Perl is well suited to handling large files and should improve the runtime if implemented properly. Read this if you are interested.
# 4  
Old 04-10-2007
By concurrency, I assume you are referring to the simultaneous execution of processes. Unless you are on a multi-processor system, truly simultaneous execution of processes is not possible; on a single processor the processes are time-sliced.

In your case, you could try something like this:

split the file so that there are 10 files of 10K lines each
run the script on each file individually as a background process:
<script> firstfile &    (the trailing & runs it in the background)

loop through all the files the same way
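The steps above can be sketched in a small script; the file name, chunk prefix, and the uppercasing "parser" here are made-up stand-ins for the real 100K-line file and the real parsing logic:

```shell
#!/bin/ksh
# Demo in a scratch directory; in practice point these at the real file.
cd "$(mktemp -d)" || exit 1
seq 1 100 > bigfile.txt           # stand-in for the 100K-line file

# 1) split the input into fixed-size chunks (10 lines here, 10000 in practice)
split -l 10 bigfile.txt chunk.

# 2) run the "parser" over every chunk in the background
for f in chunk.*; do
    ( tr 'a-z' 'A-Z' < "$f" > "$f.out" ) &   # stand-in for the real parsing
done

# 3) wait blocks until all background jobs have exited
wait
ls chunk.*.out | wc -l            # should report 10 output files
```

The key pieces are the trailing `&`, which starts each parser without waiting for it, and the single `wait` at the end, which holds the script until every child is done.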
# 5  
Old 04-10-2007
Simultaneous execution of processes

Hi MatrixMadhan,

I am not sure how to code this exactly.
Also, from what I read, do I have to stop the background processes explicitly?
What if there are problems with the data parsing? How will the error log be created?
Can you explain more and point me to some sample scripts?
# 6  
Old 04-12-2007
Quote:
Originally Posted by Amruta Pitkar
Hi MatrixMadhan,

I am not sure how to code this exactly.
Also, from what I read, do I have to stop the background processes explicitly?
What if there are problems with the data parsing? How will the error log be created?
Can you explain more and point me to some sample scripts?

There is no need to stop your background processes unless there is a situation that requires it!

Would something like this be of help?

split the file: 100K lines into 10 chunks of 10K each
then run the same process that was used on the 100K sample against each of the smaller chunks

i=1
while [ $i -le 10 ]
do
    /somedir/process chunk$i &    # make it a background process
    i=$(($i + 1))
done

With the above loop, the smaller chunks are fed to individual processes, which all start processing.

In some shells (for example ksh with the bgnice option set), background processes run at a lower priority than foreground processes.

You need to determine a threshold value (more of a benchmarking exercise) beyond which running several processes on smaller chunks stops being faster than running a single process on a single chunk.

Error logs are created the same way as you had been doing for the foreground process; just redirect each background process's stderr to its own file.
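To make the error-log answer concrete, here is a minimal sketch; `parse_chunk` is a made-up stand-in for the real parsing command, and each background job gets its own stderr file so failures in one chunk do not get mixed into another's:

```shell
#!/bin/ksh
# Stand-in for the real parsing logic: replace the body of parse_chunk.
parse_chunk() {
    wc -l < "$1"
}

cd "$(mktemp -d)" || exit 1
i=1
while [ $i -le 10 ]
do
    seq 1 5 > chunk$i                               # tiny stand-in chunk
    parse_chunk chunk$i > chunk$i.log 2> chunk$i.err &
    i=$(($i + 1))
done

# wait blocks until every background job has exited
wait

# any non-empty .err file flags a chunk whose parsing hit problems
for e in chunk*.err; do
    [ -s "$e" ] && echo "errors in $e"
done
echo "done: $(ls chunk*.log | wc -l) chunks logged"
```

Each child inherits the redirections it was launched with, so the per-chunk log and error files are created exactly as they would be for a foreground run.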

# 7  
Old 04-15-2007
I tried to develop a script based on matrixmadhan's comments.

Assume the file is Server.log and it is available in the current working directory.

Code:
#!/bin/ksh
# split Server.log into roughly 10 pieces named /tmp/LogFile.aa, /tmp/LogFile.ab, ...
# (any remainder lines land in an extra eleventh file)
split -l $(($(wc -l < Server.log)/10)) Server.log /tmp/LogFile.

function ProcessFile {
  echo "Processing File: $1"
  while read line ; do
    # note: every child appends to the same file, so lines from
    # different chunks can interleave in /tmp/Server.Processed
    echo "$line" >> /tmp/Server.Processed
  done < $1
}

for F in /tmp/LogFile.* ; do
  echo "Processing $F file"
  ProcessFile $F &              # run each chunk in the background
  echo "The last child PID is $!"
done

echo "Waiting for children"
wait                            # block until every background job exits
echo "All child processes are done"
exit

Experts, please comment on this approach. Is this true concurrent processing, and will it improve performance?
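One refinement worth sketching (the tiny input below is a stand-in for the real Server.log): if each worker writes its own private output file and the pieces are concatenated only after wait, the risk of interleaved lines from concurrent appends to one shared file goes away:

```shell
#!/bin/ksh
# Variation on the script above: one output file per chunk, merged at the end.
cd "$(mktemp -d)" || exit 1
seq 1 40 > Server.log                     # stand-in for the real log

split -l 10 Server.log LogFile.           # 4 chunks of 10 lines here

for F in LogFile.*; do
    ( while read line; do
          echo "$line"
      done < "$F" > "$F.out" ) &          # each chunk gets a private output
done
wait

cat LogFile.*.out > Server.Processed      # merge once everything is done
wc -l < Server.Processed                  # 40 lines, same as the input
```

Because split names the pieces in lexicographic order (aa, ab, ac, ...), the final cat also restores the original line order.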

Thanks
Nagarajan Ganesan