Massively parallel on single core? Post 302395115 by Andre_Merzky on Monday 15th of February 2010 04:23:34 AM
Hi Neo,

thanks for your reply!

I agree with your remark about distributed architectures. This is my day job, and I like it a lot :-)

I think I did not make the problem clear enough: the workload I am talking about consists mostly of idle jobs, so the CPU and memory load for each job is *very* low. Yes, I can beat the problem with more cores or nodes, but that seems very much like a waste, as those would all be idling most of the time.

Assume you plan for 1000 threads per core and use quad-core nodes: that would require 25 nodes, which would all idle all day long :-(
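To spell out where the 25 comes from, here is the arithmetic as a small C helper. This is only a sketch: the function name `nodes_needed` is made up for illustration, and the 100,000 jobs are the ones from the workload described in this post.

```c
#include <assert.h>

/* Number of nodes needed to host `jobs` watcher threads, rounded up.
   The name and signature are hypothetical, for illustration only. */
long nodes_needed(long jobs, long threads_per_core, long cores_per_node)
{
    assert(jobs >= 0 && threads_per_core > 0 && cores_per_node > 0);

    long threads_per_node = threads_per_core * cores_per_node;

    /* integer ceiling division */
    return (jobs + threads_per_node - 1) / threads_per_node;
}
```

With 100,000 jobs, 1000 threads per core and 4 cores per node, `nodes_needed(100000, 1000, 4)` gives 25.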

Some more detail, if that helps: the idle processes/threads are basically watchers. Each one represents a CPU/memory-heavy remote job that it spawned and whose state it is watching. Only when that state changes do they become active, and kick off data movements or spawn new jobs.

We can't control the design of the remote job startup API very well (third party, synchronous API only), thus our technical options for obtaining state information about those jobs are limited, and boil down to
Code:
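#include <pthread.h>

#define NJOBS 100000

// third-party synchronous call and our own output handler, defined elsewhere
void remote_api_call (void * data);
void store_output_data (void * data);

void * run_job (void * data)
{
   // this call runs a remote job, and blocks for hours
   remote_api_call (data);
   store_output_data (data);
   return NULL;
}

int main ()
{
  pthread_t threads[NJOBS];

  // one watcher thread per remote job
  for ( int i = 0; i < NJOBS; i++ )
  {
     // the last argument would point to the per-job description
     pthread_create (&threads[i], NULL, run_job, NULL);
  }

  // wait until every watcher has stored its output
  for ( int i = 0; i < NJOBS; i++ )
  {
     pthread_join (threads[i], NULL);
  }

  return 0;
}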

So, I can throw 25 nodes at that large for loop, and that is basically what we do - but what a waste...
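For anyone who wants to try the pattern on a single box, here is a scaled-down sketch of such idle watchers with an explicit per-thread stack size, since the stack reservation is one of the things that limits how many mostly-sleeping threads fit on one node. Everything in it is an assumption for illustration: `fake_remote_api_call` is a stand-in that sleeps for 10 ms instead of blocking for hours, `run_watchers` is a made-up helper, and whether a small stack is really sufficient for our watchers has not been measured.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for the blocking third-party call:
   sleeps for 10 ms instead of blocking for hours. */
static void fake_remote_api_call(void *data)
{
    (void)data;
    struct timespec ts = { 0, 10 * 1000 * 1000 };
    nanosleep(&ts, NULL);
}

/* One watcher: blocks in the (fake) remote call, then exits. */
static void *watch_job(void *data)
{
    fake_remote_api_call(data);
    return NULL;
}

/* Start up to n watcher threads with the given stack size, join them,
   and return how many were actually started (-1 on setup failure). */
int run_watchers(int n, size_t stack_bytes)
{
    assert(n >= 0);
    if (n == 0)
        return 0;

    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    if (tids == NULL)
        return -1;

    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) {
        free(tids);
        return -1;
    }
    /* fails if stack_bytes is below the platform minimum */
    if (pthread_attr_setstacksize(&attr, stack_bytes) != 0) {
        pthread_attr_destroy(&attr);
        free(tids);
        return -1;
    }

    /* stop at the first thread the system refuses to create */
    int started = 0;
    while (started < n &&
           pthread_create(&tids[started], &attr, watch_job, NULL) == 0)
        started++;
    pthread_attr_destroy(&attr);

    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    return started;
}
```

Calling `run_watchers(100, 256 * 1024)` starts 100 watchers with 256 KiB of stack each. Raising the count until the return value falls short of `n` gives a rough idea of how many idle watchers one node can hold.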

The *real* workload is 100,000 CPU/memory-heavy remote jobs, which have sufficient resources to run concurrently. What I am talking about here is only the management side (our workflow engine).

Thanks, Andre.
 
