Inconsistency with parallel run


 
# 1  
Old 10-30-2017

Hi All,

I am running a parallel job that aggregates a file. I split the work into 7 separate parallel processes; each one reads the same input file and performs the same operation. The issue is that, for some reason, the processes finish in order: the 1st completes first in the shortest time, the 2nd completes second, and so on, with a significant difference in completion time between each one.

I looked at CPU usage with the top command while the job was running. Every process is using about 97% CPU, so I am not sure why the run times differ between the parallel processes.

Is there a way to trace the processes and find out whether the problem is with I/O, memory, or CPU?

Note: Each process reads the same file from a NAS mount and does the aggregation. I am using Red Hat.

Thanks
Arun
# 2  
Old 10-30-2017
More details, please.
That file almost certainly will be buffered locally when being accessed from NAS, so the first process should take longest. Will the file be updated / written back? Per process? Are the processes doing identical operations on the file? Do these influence each other? How do user and system times compare between processes? Do you have lock information available?
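To compare user and system time against elapsed time for one worker, something like the following could help. It is only a minimal sketch: `time_command` is a name made up here, and the command it runs is a stand-in, since we don't know what the real aggregation program is.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv as a child process and return (elapsed, user, system) in seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=True)  # waits for the child, so its usage is accounted
    elapsed = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return elapsed, user, system


if __name__ == "__main__":
    # Hypothetical stand-in for one aggregation worker; replace with the real command.
    worker = [sys.executable, "-c", "sum(range(10**6))"]
    elapsed, user, system = time_command(worker)
    print(f"elapsed={elapsed:.3f}s user={user:.3f}s system={system:.3f}s")
```

If user plus system time is close to elapsed time, the worker spent most of its life on the CPU. If it is much smaller, the worker was mostly waiting on something (I/O, a lock, or simply a free CPU), and that is where to look next.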
# 3  
Old 10-30-2017
Why would they be identical? Especially if they're I/O bound. More details needed.
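One rough way to see how much I/O each worker is doing on Linux is to sample `/proc/<pid>/io` twice and look at the difference. The sketch below assumes the kernel exposes that file (it needs task I/O accounting enabled); the function names are made up for illustration. Note that the `read_bytes` counter is documented as accurate for block-backed filesystems, so treat it with caution for a file on a NAS mount.

```python
import os
import sys


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def read_proc_io(pid):
    """Read the current I/O counters of the given process."""
    with open(f"/proc/{pid}/io") as handle:
        return parse_proc_io(handle.read())


def io_delta(before, after):
    """Return the change in each counter present in both samples."""
    return {key: after[key] - before[key] for key in after if key in before}


if __name__ == "__main__":
    # Pass a worker's PID as the first argument; defaults to this process.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    if os.path.exists(f"/proc/{pid}/io"):
        for name, value in read_proc_io(pid).items():
            print(f"{name}: {value}")
    else:
        print(f"/proc/{pid}/io is not available on this system")
```

Here `rchar` counts bytes requested through read calls, while `read_bytes` tries to count what actually had to be fetched from storage, so a large `rchar` with a small `read_bytes` suggests the data is coming from cache.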
# 4  
Old 10-30-2017
No, we are not writing the file. Basically, we read the file from NAS, compare it with the qualified records, and do the aggregation.

The file on NAS is the full set, with details about the customers. The file it is compared against is on SAN and holds the customers' transaction records. The NAS file is compared with the transaction records and then the aggregation happens. We split this into 7 parallel processes to improve performance.
# 5  
Old 10-30-2017
Are the seven splits identical? Sounds like you are breaking a transaction file into seven parts to look up against a master file.
Because I cannot fathom any reason to do the same thing seven times - my only guess is that you are doing it seven times BUT with different data elements.
Please provide more details.
# 6  
Old 10-30-2017
Below is what I traced back. Basically, there is a huge file that we process in parallel. The file transaction_data.dat is compared with spend.dat, which is a small file. We match the transactions between these files and do the aggregation.

I moved transaction_data.dat to SAN. Even with that, the first parallel process takes the least time, and the process time increases with each subsequent split.

Below is the log from the run. The file is split into almost equal parts, but I am not sure why the process time differs between the parallel runs.
Quote:
Process 1:
(21899) Total process time = 102.550
(21899) Final Elapsed time = 103.000
(21899) Position Start 0Position End 4700904

Process 2:
(21900) Total process time = 193.660
(21900) Final Elapsed time = 195.000
(21900) Position Start 4700904Position End 9401808

Process 3:
(21901) Total process time = 300.220
(21901) Final Elapsed time = 303.000
(21901) Position Start 9401808Position End 14102218

Process 4:
(21902) Total process time = 333.180
(21902) Final Elapsed time = 337.000
(21902) Position Start 14102218Position End 18802628

Process 5:
(21903) Total process time = 379.340
(21903) Final Elapsed time = 383.000
(21903) Position Start 18802628Position End 23504026

Process 6:
(21904) Total process time = 423.610
(21904) Final Elapsed time = 428.000
(21904) Position Start 23504026Position End 28204436

Process 7:
(21905) Total process time = 411.130
(21905) Final Elapsed time = 415.000
(21905) Position Start 28204436Position End 32905093

Process 8:
(21906) Total process time = 532.900
(21906) Final Elapsed time = 538.000

Last edited by arunkumar_mca; 10-31-2017 at 12:03 PM..
# 7  
Old 11-02-2017
Quote:
Originally Posted by arunkumar_mca
Even with that I am seeing the first parallel process is taking less time and the process time increase with the split going on
Once you've maxed out your I/O bandwidth, adding more processes will just make a task slower. How many processes it takes to max out your I/O bandwidth could well be "one". Spinning disks especially lose a lot of bandwidth when split between competing tasks.
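The log in post #6 gives enough to put rough numbers on this. The sketch below divides each slice size by its process time. It assumes the "Position" values are byte offsets, which the log does not actually say, and it leaves out process 8 because no positions were logged for it.

```python
# (label, total process time in seconds, start position, end position),
# copied from the log in post #6. Positions are ASSUMED to be byte offsets.
RUNS = [
    ("Process 1", 102.550, 0, 4700904),
    ("Process 2", 193.660, 4700904, 9401808),
    ("Process 3", 300.220, 9401808, 14102218),
    ("Process 4", 333.180, 14102218, 18802628),
    ("Process 5", 379.340, 18802628, 23504026),
    ("Process 6", 423.610, 23504026, 28204436),
    ("Process 7", 411.130, 28204436, 32905093),
]


def rates(runs):
    """Return (label, slice size, size per second) for each run."""
    result = []
    for label, seconds, start, end in runs:
        size = end - start
        result.append((label, size, size / seconds))
    return result


if __name__ == "__main__":
    for label, size, rate in rates(RUNS):
        print(f"{label}: {size} bytes at {rate:,.0f} bytes/s")
```

The slices are all within about a thousand bytes of each other, yet process 1 gets through its slice roughly four times faster than process 6. That is the pattern you would expect when the processes compete for a shared resource, though the numbers alone don't say which one.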

Beyond that, it's difficult to say what's happening. We still don't know what you're doing. "Processing" is a fine word but tells us little.