Post 302890776 by gina.lizar on Friday 28th of February 2014 01:20:06 PM
Break up file into n number of subsets and run in parallel

Hi Guys,

I want to split one of my input files into, say, 25 parts, run the same script on each part in parallel, and then merge the outputs into a single file.
I have access to computing resources that can handle 25 files at once. Running the script on the original file takes about 15 days each time.

Is this possible? For example, with an awk script gina.awk, the steps would be:

1. Split Input.file into Input1.file, Input2.file, ..., Input25.file

2.
Code:
# Input[0-9]* matches Input1.file ... Input25.file but not Input.file itself
for file in Input[0-9]*
do
    ./gina.awk "$file" > "out_$file"
done

3.
Code:
cat out* > Output.file

Will this work, and will it speed things up? I have access to 25 CPU cores.
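
The three steps above can be sketched as one shell function. Note that the loop in step 2, as written, runs the chunks one after another; starting each job with `&` and then calling `wait` is what makes them run at the same time. This is only a sketch under assumptions that are not in the post: the function name `split_run_merge` and the `chunk_` prefix are made up for illustration, the input is non-empty, each input line can be processed independently of the others, and the worker is an executable that takes the file name as its only argument (as `./gina.awk $file` implies).

```shell
#!/bin/sh
# Sketch: split an input file into line-based chunks, run one worker per
# chunk in the background, wait for all of them, then merge in chunk order.
# Assumes a non-empty input whose lines can be processed independently.

split_run_merge() {
    input=$1      # e.g. Input.file
    worker=$2     # e.g. ./gina.awk (must be executable)
    nparts=$3     # e.g. 25, one chunk per CPU core
    output=$4     # e.g. Output.file

    workdir=$(mktemp -d) || return 1

    # Lines per chunk, rounded up, so about $nparts chunks are produced.
    total=$(wc -l < "$input")
    per_chunk=$(( (total + nparts - 1) / nparts ))
    [ "$per_chunk" -gt 0 ] || per_chunk=1

    # split names the chunks chunk_aa, chunk_ab, ... which sort in input order.
    split -l "$per_chunk" "$input" "$workdir/chunk_"

    # Start one background job per chunk; '&' is what makes the loop parallel.
    for chunk in "$workdir"/chunk_*
    do
        "$worker" "$chunk" > "$chunk.out" &
    done

    # Block until every background job has finished.
    wait

    # The glob expands in sorted order, so the merged output keeps input order.
    cat "$workdir"/chunk_*.out > "$output"
    rm -rf "$workdir"
}
```

With the names from the post, the call would be `split_run_merge Input.file ./gina.awk 25 Output.file`. If the work divides evenly and the disk is not the bottleneck, the best case is roughly 15 days / 25, or about 14.4 hours; in practice it will be somewhat slower. If gina.awk keeps state across lines or has BEGIN/END blocks that print headers or totals, each chunk will produce its own copy of them, so the merged output would need extra handling.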

Last edited by bartus11; 02-28-2014 at 02:44 PM.. Reason: Please use [code][/code] tags.
 
