Sponsored Content
Top Forums Shell Programming and Scripting Parallelize bash commands/jobs Post 302974567 by stomp on Tuesday 31st of May 2016 08:17:15 PM
Old 05-31-2016
Quote:
So I want to restrict the jobs less than the maximum cores of the server.
You have a basic misunderstanding here. The number of cores is mostly irrelevant here. The remaining relevance of the number of cores is the huge amount of processes you want (not) to spawn here. Here indeed the number of cores does help. And yes spawning some 100 processes at once puts a considerable load on the server.

What is far more of importance, is the fact, that not cpu cores matters most, but I/O capacity does. And if you spawn multiple excessive I/O hungry tasks, that will hurt server performance much more(It may even kill the server) than the use of cpu-power, because cpu power is what this machine have more than 50 times of what is needed.

Furthermore an issue of I/O is the matter of fact, that especially hard drives are not faster if you spawn multiple I/O stressing tasks. On the contrary: Parallelizing this slows things further down. If you let one process get the full I/O power the hard disk(array), it may read/write continously with maximum speed(You get maximum Throughput if you either only read or write sequentially in one process from/to one disk(array). "copy" or "cat" reads and writes simultaneously). If you spawn e. g. 12 parallel I/O stressing tasks, the head of the hard disk has to do a lot of seeking: Continuosly jumping between the different locations of all 12 different processes data files. That's VERY costly regarding performance and greatly INCREASES the runtime of your script.

----

Your testcase script does not create any suprising nor relevant result because "sleep" is not creating any I/O which is what your real script does.

---

Thanks for posting the real names in your script, because this sheds a little more light on your case.

The question arising is: What do you want to achieve by concatenating .gz files? In my experience, you render the data in the resulting files unusable, because no tool known to me can separate such files into single .gz-archives again.

---

It seems to be the best, if you go back another couple of steps and tell us what you are trying to achieve in this whole process. What do you want to do and what's your current plan to achieve your goal?

I have a hunch that the current method is more like carrying a bicycle and that you can get a lot faster towards your goal if you start riding it.

Last edited by stomp; 06-01-2016 at 07:19 AM..
This User Gave Thanks to stomp For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

background jobs exit status and limit the number of jobs to run

i need to execute 5 jobs at a time in background and need to get the exit status of all the jobs i wrote small script below , i'm not sure this is right way to do it.any ideas please help. $cat run_job.ksh #!/usr/bin/ksh #################################### typeset -u SCHEMA_NAME=$1 ... (1 Reply)
Discussion started by: GrepMe
1 Replies

2. Shell Programming and Scripting

Can BASH execute commands on a remote server when the commands are embedded in shell

I want to log into a remote server transfer over a new config and then backup the existing config, replace with the new config. I am not sure if I can do this with BASH scripting. I have set up password less login by adding my public key to authorized_keys file, it works. I am a little... (1 Reply)
Discussion started by: bash_in_my_head
1 Replies

3. Shell Programming and Scripting

commands to be executed in order for a batch jobs!

Hi All, I am trying to run this script. I have a small problem: each "./goada.sh" command when done produces three files (file1, file2, file3) then they are moved to their respective directory as can be seem from this script snippet here. The script goada.sh sends some commands for some... (1 Reply)
Discussion started by: faizlo
1 Replies

4. Shell Programming and Scripting

General Q: how to run/schedule a php script from cron jobs maybe via bash from shell?

Status quo is, within a web application, which is coded completely in php (not by me, I dont know php), I have to fill out several fields, and execute it manually by clicking the "go" button in my browser, several times a day. Thats because: The script itself pulls data (textfiles) from a... (3 Replies)
Discussion started by: lowmaster
3 Replies

5. Shell Programming and Scripting

help to parallelize work on thousands of files

I need to find a smarter way to process about 60,000 files in a single directory. Every night a script runs on each file generating a output on another directory; this used to take 5 hours, but as the data grows it is taking 7 hours. The files are of different sizes, but there are 16 cores... (10 Replies)
Discussion started by: vhope07
10 Replies

6. Shell Programming and Scripting

waiting on jobs in bash, allowing limited parallel jobs at one time, and then for all to finish

Hello, I am running GNU bash, version 3.2.39(1)-release (x86_64-pc-linux-gnu). I have a specific question pertaining to waiting on jobs run in sub-shells, based on the max number of parallel processes I want to allow, and then wait... (1 Reply)
Discussion started by: srao
1 Replies

7. Shell Programming and Scripting

Parallelize a task that have for

Dear all, I'm a newbie in programming and I would like to know if it is possible to parallelize the script: for l in {1..1000} do cut -f$l quase2 |tr "\n" "," |sed 's/$/\ /g' |sed '/^$/d' >a_$l.t done I tried: for l in {1..1000} do cut -f$l quase2 |tr "\n" "," |sed 's/$/\ /g' |sed... (7 Replies)
Discussion started by: valente
7 Replies

8. Shell Programming and Scripting

Bash scripts as commands

Hello, the bulk of my work is run by scripts. An example is as such: #!/bin/bash awk '{print first line}' Input.in > Intermediate.ter awk '{print second line}' Input.in > Intermediate_2.ter command Intermediate.ter Intermediate_2.ter > Output.out It works the way I want it to, but it's not... (1 Reply)
Discussion started by: Leo_Boon
1 Replies

9. Shell Programming and Scripting

Shell script to run multiple jobs and it's dependent jobs

I have multiple jobs and each job dependent on other job. Each Job generates a log and If job completed successfully log file end's with JOB ENDED SUCCESSFULLY message and if it failed then it will end with JOB ENDED with FAILURE. I need an help how to start. Attaching the JOB dependency... (3 Replies)
Discussion started by: santoshkumarkal
3 Replies

10. Shell Programming and Scripting

How to run several bash commands put in bash command line?

How to run several bash commands put in bash command line without needing and requiring a script file. Because I'm actually a windows guy and new here so for illustration is sort of : $ bash "echo ${PATH} & echo have a nice day!" will do output, for example:... (4 Replies)
Discussion started by: abdulbadii
4 Replies
LIBBASH(7)							  libbash Manual							LIBBASH(7)

NAME
libbash -- A bash shared libraries package. DESCRIPTION
libbash is a package that enables bash dynamic-like shared libraries. Actually its a tool for managing bash scripts whose functions you may want to load and use in scripts of your own. It contains a 'dynamic loader' for the shared libraries ( ldbash(1)), a configuration tool (ldbashconfig(8)), and some libraries. Using ldbash(1) you are able to load loadable bash libraries, such as getopts(1) and hashstash(1). A bash shared library that can be loaded using ldbash(1) must answer 4 requirments: 1. It must be installed in $LIBBASH_PREFIX/lib/bash (default is /usr/lib/bash). 2. It must contain a line that begins with '#EXPORT='. That line will contain (after the '=') a list of functions that the library exports. I.e. all the function that will be usable after loading that library will be listed in that line. 3. It must contain a line that begins with '#REQUIRE='. That line will contain (after the '=') a list of bash libraries that are required for our library. I.e. every bash library that is in use in our bash library must be listed there. 4. The library must be listed (For more information, see ldbashconfig(8)). Basic guidelines for writing library of your own: 1. Be aware, that your library will be actually sourced. So, basically, it should contain (i.e define) only functions. 2. Try to declare all variables intended for internal use as local. 3. Global variables and functions that are intended for internal use (i.e are not defined in '#EXPORT=') should begin with: __<library_name>_ For example, internal function myfoosort of hashstash library should be named as __hashstash_myfoosort This helps to avoid conflicts in global name space when using libraries that come from different vendors. 4. See html manual for full version of this guide. AUTHORS
Hai Zaar <haizaar@haizaar.com> Gil Ran <ril@ran4.net> SEE ALSO
ldbash(1), ldbashconfig(8), getopts(1), hashstash(1) colors(1) messages(1) urlcoding(1) locks(1) Linux Epoch Linux
All times are GMT -4. The time now is 07:36 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy