Massively parallel on single core? Post 302395115 by Andre_Merzky on Monday 15th of February 2010 04:23:34 AM
Hi Neo,

thanks for your reply!

I agree with your remark about distributed architectures. This is my day job, and I like it a lot :-)

I think I did not make the problem clear enough: the workload I am talking about consists of mostly idle jobs, so the CPU and memory load for each job is *very* low. Yes, I can beat the problem with more cores or nodes, but that seems very much like a waste, as those would all be idling most of the time.

Assume you plan for 1000 threads per core and use quad-core nodes: 100,000 jobs would then require 25 nodes, which all idle all day long :-(

Some more detail, if that helps: the idle processes/threads are basically watchers. Each one represents a CPU/memory-heavy remote job it spawned and watches that job's state. Only when that state changes does the watcher become active and kick off data movements or spawn new jobs.

We can't control the design of the remote job startup API very well (third party, synchronous API only), so our technical options for obtaining state information about those jobs are limited and boil down to:
Code:
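#include <pthread.h>

#define NJOBS 100000

// runs one remote job and stores its output once the job is done
void * run_job (void * data)
{
   // this call runs a remote job, and blocks for hours
   remote_api_call (data);
   store_output_data (data);
   return NULL;
}

int main ()
{
  pthread_t threads[NJOBS];

  // start one watcher thread per remote job
  for ( int i = 0; i < NJOBS; i++ )
  {
     pthread_create (&threads[i], NULL, run_job, ...);
  }

  // wait until all remote jobs have finished
  for ( int i = 0; i < NJOBS; i++ )
  {
     pthread_join (threads[i], NULL);
  }

  return 0;
}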

So, I can throw 25 nodes at that large for loop, and that is basically what we do - but what a waste...

The *real* workload is 100,000 CPU/memory-heavy remote jobs, which have sufficient resources to run concurrently. What I am talking about here is the management side (our workflow engine).

Thanks, Andre.
 
