Parallelize bash commands/jobs
Post 302974562 by yifangt on Tuesday 31st of May 2016 06:51:40 PM
Thanks!
Quote:
If you're on a multi-user system it may not be such a good idea to parallelize I/O-heavy processes, because you may completely eat up the available I/O of your machine and cause extreme load on your server. This could slow down the whole server considerably.
This is exactly what happened when I ran everything in the background at once with bash only: I received a warning from the admin for causing trouble on the server! So I want to keep the number of jobs below the number of cores on the server.
At this moment I do not care too much about efficiency, although cat/cp do eat up a lot of the I/O capacity.
Yes, the ln -s part is no big deal. The real challenge is the cat and cp parts, where the big files involved make the processes slow; that's why I need parallel.
My real code is pretty much the same as the example, and here are the first several rows of the portion with cat:
Code:
# cd /storage/scottJ/data/raw_reads/resequenced
cat EMS01a_Early_Rice_*_R1.fq.gz EMS01b_Early_Rice_*_R1.fq.gz > ../Early_Rice_1_R1.fq.gz
cat EMS01a_Early_Rice_*_R2.fq.gz EMS01b_Early_Rice_*_R2.fq.gz > ../Early_Rice_1_R2.fq.gz

cat EMS02a_Early_Flax_*_R1.fq.gz EMS02b_Early_Flax_*_R1.fq.gz > ../Early_Flax_1_R1.fq.gz
cat EMS02a_Early_Flax_*_R2.fq.gz EMS02b_Early_Flax_*_R2.fq.gz > ../Early_Flax_1_R2.fq.gz
......

I was thinking the option would be straightforward, but my impression is that parallel is meant for jobs that share a similar pattern of scripts/command options. I went through the GNU website and other parallel tutorials, but could not spot the part that covers this case.
I also find that this type of work is quite common for me: processing hundreds of samples takes at least a couple of hours when I do it one by one, so I used to let the run go overnight. That is not good if I want the results right away, which might be achievable with 16~20 cores using parallel, if the speed-up scales proportionally (16~20x).
Thanks again if there is an option for my situation that I may have missed.
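
Since the cat lines above all follow one pattern, the command list itself could be generated instead of written by hand. This is only a sketch: the sample table holds just the two samples shown above, the "_1_" in the output name is assumed to be constant, and gen_cat_commands is a made-up helper name.

```shell
# Hypothetical sample table: "<prefix> <name>" per line, taken from the two
# samples shown above; a real run would list every sample.
samples='EMS01 Early_Rice
EMS02 Early_Flax'

# Print one cat command per sample and per read file (R1, R2).
# The "_1_" in the output name is copied from the examples and assumed constant.
gen_cat_commands() {
    printf '%s\n' "$samples" | while read -r prefix name; do
        # Skip empty lines in the table.
        [ -n "$prefix" ] || continue
        for r in R1 R2; do
            printf 'cat %sa_%s_*_%s.fq.gz %sb_%s_*_%s.fq.gz > ../%s_1_%s.fq.gz\n' \
                "$prefix" "$name" "$r" "$prefix" "$name" "$r" "$name" "$r"
        done
    done
}

# Only prints the commands; nothing is executed here.
gen_cat_commands
```

The printed lines can be redirected to a file and reviewed before anything is run, so a wrong pattern shows up before any big file is touched.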

---------- Post updated at 06:51 PM ---------- Previous update was at 05:47 PM ----------

Did an experiment and found out the simple answer for my example is
Code:
cat commands_script.sh | parallel -j 16
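
On a machine without GNU parallel, a similar cap on simultaneous jobs can probably be had from xargs -P. To be clear, this is a different tool, not what I tested, and run_capped is just a name I made up for the sketch. It treats every non-empty line of the file as one job, the same way the command above does.

```shell
# Run every non-empty line of a command file as its own `sh -c` job,
# with at most max_jobs running at the same time.
# run_capped is a hypothetical helper name, not part of any tool.
run_capped() {
    max_jobs=$1
    cmd_file=$2
    # Drop blank lines, then NUL-separate the lines so xargs does not
    # interpret quotes or backslashes inside the commands.
    grep -v '^[[:space:]]*$' "$cmd_file" |
        tr '\n' '\0' |
        xargs -0 -n 1 -P "$max_jobs" sh -c
}

# Example (hypothetical): run_capped 16 commands_script.sh
```

Unlike parallel, xargs does not collect the output per job, so lines from jobs that run at the same time may interleave. Everything from here on refers to GNU parallel again.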

Here is my test.
Code:
commands_script.sh:
echo test1; sleep 15s
echo test2; sleep 15s 
echo test3; sleep 15s 
echo test4; sleep 15s 
echo test5; sleep 115s  

echo test6; sleep 15s
echo test7; sleep 15s 
echo test8; sleep 15s 
echo test9; sleep 15s 
echo test10; sleep 115s  

echo test11; sleep 15s
echo test12; sleep 15s 
echo test13; sleep 15s 
echo test14; sleep 15s 
echo test15; sleep 115s

Code:
$ time cat commands_script.sh | parallel -j 16
test1
test2
test3
test4
test6
test7
test8
test9
test11
test12
test13
test14
test5
test10
test15

real    1m56.053s
user    0m0.160s
sys    0m0.132s

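The order of the echoed strings is what I expected: parallel prints each job's output when that job finishes, so the twelve 15 s jobs report before the three 115 s jobs.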
Code:
$ time bash commands_script.sh
test1
test2
test3
test4
test5
test6
test7
test8
test9
test10
test11
test12
test13
test14
test15

real    8m45.042s
user    0m0.012s
sys    0m0.004s

Using parallel took 1m56.053s, whereas bash alone took 8m45.042s, because it runs the commands sequentially and the total is the sum of all the individual run times.
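
Both timings can be sanity-checked from the sleep durations alone. The sketch below hard-codes the 15 durations from commands_script.sh and assumes the only cost is the sleeps themselves; the extra second or so in the measured times would be start-up overhead.

```shell
# Sleep durations (seconds) of the 15 test commands, in file order.
durations="15 15 15 15 115 15 15 15 15 115 15 15 15 15 115"

total=0     # sequential run: the sum of all durations
longest=0   # parallel run with >= 15 job slots: the longest single job

for d in $durations; do
    total=$((total + d))
    if [ "$d" -gt "$longest" ]; then
        longest=$d
    fi
done

printf 'sequential: %dm%02ds\n' $((total / 60)) $((total % 60))
printf 'parallel:   %dm%02ds\n' $((longest / 60)) $((longest % 60))
```

With -j 16 there are more job slots than the 15 commands, so all of them start at once and the run ends when the slowest one does. With fewer slots than jobs the wall time would fall somewhere between the two figures.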
And I appreciate any insight/comment if I missed anything!

Last edited by RudiC; 06-01-2016 at 01:08 PM..
 
