parallel processing - What are the common causes of non-scalability in shared memory programs?


Whenever one parallelizes an application, the expected outcome is a decent speedup, but that is not always the case.

It is usual that a program which runs in x seconds, when parallelized to use 8 cores, does not achieve x/8 seconds (the optimal speedup). In extreme cases, it even takes more time than the original sequential program.

Why does this happen and, most importantly, how can scalability be improved?

There are a few common causes of non-scalability:

  1. Too much synchronization: some problems (and some overly conservative programmers) require lots of synchronization between the parallel tasks, which eliminates most of the parallelism in the algorithm and makes it slower.

1.1. Make sure you use the minimum synchronization possible for your algorithm. With OpenMP, for instance, a simple change from a `critical` section to an `atomic` update can make a relevant difference (see the sketch after point 1.2).

1.2. A worse sequential algorithm might offer better parallelism opportunities, so if you have the chance to try something else, it might be worth a shot.
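Here is a minimal sketch of point 1.1 in C/OpenMP (the loop and its size are invented for illustration): the same sum computed through a `critical` section, through an `atomic` update, and through a `reduction` clause, which removes the per-iteration synchronization entirely. On most machines the `critical` version is by far the slowest, because every iteration serializes the threads.

```c
// A sketch of point 1.1: the same sum written with a critical section,
// an atomic update, and a reduction. Compile with: gcc -O2 -fopenmp
#include <stdio.h>

#define N 10000000LL

int main(void) {
    long long sum = 0;

    // Heavyweight: every iteration serializes through a critical section.
    #pragma omp parallel for
    for (long long i = 0; i < N; i++) {
        #pragma omp critical
        sum += i;
    }
    printf("critical:  %lld\n", sum);

    sum = 0;
    // Lighter: an atomic update usually maps to one hardware instruction.
    #pragma omp parallel for
    for (long long i = 0; i < N; i++) {
        #pragma omp atomic
        sum += i;
    }
    printf("atomic:    %lld\n", sum);

    sum = 0;
    // Lightest: each thread accumulates a private copy, merged once at
    // the end, so the loop body has no synchronization at all.
    #pragma omp parallel for reduction(+:sum)
    for (long long i = 0; i < N; i++)
        sum += i;
    printf("reduction: %lld\n", sum);

    return 0;
}
```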

  2. Memory bandwidth limitation: it is common that a "trivial" implementation of an algorithm is not optimized for locality, which implies heavy communication costs between the processors and main memory.

2.1. Optimize for locality: this means knowing where the application will run, what the available cache memories are, and how to change your data structures to maximize cache usage (see the sketch below).
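As a minimal sketch of what locality means in practice (the matrix size is arbitrary), compare the two traversal orders below over the same row-major matrix: the first uses every fetched cache line fully, while the second misses the cache on nearly every access, so adding cores barely helps because the loop is bound by memory bandwidth rather than by computation.

```c
// A sketch of point 2.1 (matrix size is arbitrary): the same parallel sum
// over a row-major matrix, traversed two ways. Compile with: gcc -O2 -fopenmp
#include <stdio.h>
#include <stdlib.h>

#define N 4096

int main(void) {
    // N x N matrix stored in row-major order: (i, j) lives at m[i * N + j].
    double *m = calloc((size_t)N * N, sizeof *m);
    if (!m) return 1;
    double good = 0.0, bad = 0.0;

    // Cache-friendly: each thread walks consecutive memory, so every
    // cache line brought from main memory is fully used.
    #pragma omp parallel for reduction(+:good)
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            good += m[i * N + j];

    // Cache-hostile: consecutive accesses are N * sizeof(double) bytes
    // apart, so nearly every access misses the cache and the program
    // saturates memory bandwidth instead of using the cores.
    #pragma omp parallel for reduction(+:bad)
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            bad += m[i * N + j];

    printf("%f %f\n", good, bad);
    free(m);
    return 0;
}
```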

  3. Too much parallelization overhead: when the parallel task is too "small", the overhead of thread/process creation is big compared to the total time of the parallel region, which causes poor speedup or even speed-down, as sketched below.
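A minimal sketch of point 3, with a hypothetical `scale()` helper and an invented `THRESHOLD` cutoff: OpenMP's `if` clause keeps the region sequential when the work is too small to pay for the fork/join overhead.

```c
// A sketch of point 3, with a hypothetical scale() helper and an invented
// THRESHOLD: OpenMP's if() clause keeps the region sequential when the
// work is too small to amortize the fork/join overhead.
#include <stdio.h>

#define THRESHOLD 10000  // invented cutoff; the right value is machine-specific

void scale(double *a, long n, double k) {
    // Runs in parallel only when n is large enough that the per-element
    // work amortizes the cost of waking up the thread team.
    #pragma omp parallel for if(n > THRESHOLD)
    for (long i = 0; i < n; i++)
        a[i] *= k;
}

int main(void) {
    double small[100] = {0};
    scale(small, 100, 2.0);  // stays sequential: overhead would dominate
    printf("%f\n", small[0]);
    return 0;
}
```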
