Tuesday, March 20, 2007

Net seminar: Parallel computing -- Are you ready?

Intel is running a series of free web seminars, and this was the first. You can sign up at https://event.on24.com/event/36/88/3/rt/1/index.html?&eventid=36883

The next event is April 3, a gentle introduction to parallel software.

Go to go-parallel.com
Or to Intel Software College


Here's my summary of the first one (a PDF of the slides is available):

Tim Mattson.

Good: Moore's Law expected to continue.
Bad: Single thread performance is falling off
Worse: Growth in power usage is exponential

1) Transition to many core architectures --
driven by the fundamental physics
multiple cores per silicon
one large core = 4 power -> 2 performance
4 small cores = 4 power -> 4 performance

the trend continues - 4, 8, 16, 32, ...
general purpose and special purpose cores
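
A quick sketch of the arithmetic behind the "4 power" lines above. It assumes a rule of thumb that is not spelled out in these notes: one core's performance grows roughly with the square root of the power spent on it, and work divides perfectly across cores. The function names are made up for illustration.

```python
import math

def single_core_performance(power_units):
    # Assumption: one core's performance scales with sqrt(power).
    return math.sqrt(power_units)

def multi_core_performance(total_power, cores):
    # Assumption: the power budget is split evenly and the work
    # divides perfectly across the cores (the ideal case).
    return cores * single_core_performance(total_power / cores)

one_large = multi_core_performance(4, 1)    # one core using 4 units of power
four_small = multi_core_performance(4, 4)   # four cores using 1 unit each
print(one_large, four_small)                # prints: 2.0 4.0
```

Under those assumptions the same power budget buys twice the throughput when it is spread over four small cores, but only if the software can actually keep all four busy.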

2) Parallel computing is ubiquitous
Software cannot count on hardware frequency improvements
Time to think about parallel applications

Look at high-performance computing for examples of standard parallel algorithms. 6 buckets. Not trivial, but lots of information exists.

Look for tools to help!

-- Patterns for Parallel Programming (book)


James Reinders
3) Software must change its mindset to parallel
serial algorithms don't necessarily work well in parallel
and for some time, we'll still have some single-core systems

Tools from Intel (covered in lots of later talks)
optimizing and debugging tools need improvement

4) Software challenges and mindset changes

A) Scalability: should be faster on 2 processors than 1 and faster on 4 than 2. Dividing the work into 2 is not enough.
Best solution is to use abstractions that have scalability worked out already: OpenMP, threaded libraries, Thread Profiler, Intel Threading Building Blocks
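
A minimal sketch of what "dividing into 2 is not enough" means in code: split the work by whatever worker count is available instead of hard-coding two halves. This uses Python's standard `concurrent.futures` pool as a stand-in for the Intel tools named above. `split_into_chunks` and `parallel_sum_of_squares` are hypothetical helper names. Note that in standard CPython, threads will not speed up CPU-bound code like this, so read it for the structure, not for timing.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(items, n_chunks):
    """Split items into n_chunks nearly equal contiguous slices."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one leftover item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def chunk_sum_of_squares(chunk):
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(items, workers):
    # The worker count is a parameter, so the same code runs
    # unchanged on 1, 2, 4, 8, ... workers.
    chunks = split_into_chunks(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(chunk_sum_of_squares, chunks))

data = list(range(1000))
results = {w: parallel_sum_of_squares(data, w) for w in (1, 2, 4, 8)}
print(results)
```

Every worker count gives the same answer; only the way the work is divided changes.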

being able to visualize the cores is essential

Threading Building Blocks - (a seminar just on them soon)
has existing parallel algorithms
highly concurrent containers

Hand coding may not be as efficient as using tools such as Threading Building Blocks and OpenMP as the number of cores approaches 8 and beyond

B) Correctness: race conditions and deadlocks
deadlock - a is waiting for b but b is waiting for a
livelock - more than one thread keeps running but none makes progress, each waiting for something that won't happen
race condition - a thread should be waiting but doesn't, so the result depends on timing
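
A minimal sketch of the usual fixes for two of the problems defined above, using Python's standard `threading` module. The `Account` class, `transfer`, and `run_all` are hypothetical names for illustration, not anything shown in the seminar. A lock around the shared counter removes the race, and always taking locks in the same order removes the "a waits for b, b waits for a" deadlock.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        # Without the lock, this read-modify-write could interleave
        # with another thread and lose updates (a race condition).
        with counter_lock:
            counter += 1

class Account:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Assumes src and dst are different accounts with distinct names.
    # Locks are always taken in name order, so two opposite transfers
    # can never each hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.name)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount

def run_all(jobs):
    threads = [threading.Thread(target=fn, args=args) for fn, args in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

run_all([(add_many, (10000,))] * 4)
a, b = Account("a", 100), Account("b", 100)
run_all([(transfer, (a, b, 1)), (transfer, (b, a, 1))] * 50)
print(counter, a.balance, b.balance)
```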

Debugging tools haven't dealt with this because these conditions don't happen in serial applications
Intel Thread Checker - dynamically analyzes code to find deadlocks and race conditions

Automated detection is essential because the risk is high.

C) Maintainability

Calling threads (pthreads/Windows threads) yourself is like using assembly. It increases maintenance issues as more cores become available.

Use OpenMP, MPI (lots of cores), language extensions
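
To illustrate the maintainability point, here is a rough sketch of the same small job done twice: once with hand-managed threads, once with a higher-level pool. It is written in Python as a stand-in, since the seminar's own examples (pthreads vs OpenMP) are not reproduced in these notes. `word_count` and the sample lines are made up for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def word_count(line):
    return len(line.split())

lines = ["think about parallel", "or perish", "learn about scalability"]

# Hand-managed: you create, start, join, and collect results yourself.
manual_results = [None] * len(lines)

def worker(index):
    # Each thread writes only to its own slot, so no lock is needed here.
    manual_results[index] = word_count(lines[index])

threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(lines))]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Higher-level: the pool owns thread creation, scheduling, and result order.
with ThreadPoolExecutor() as pool:
    pooled_results = list(pool.map(word_count, lines))

print(manual_results, pooled_results)
```

Both versions produce the same counts, but the pooled one has no thread bookkeeping to keep correct as the core count changes.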


5) Programmer challenge
- think about parallel (or perish)
- learn about scalability, correctness, maintainability


6) Questions and Answers

OpenMP vs Intel Threading Building Blocks
- both have their place. OpenMP is almost everywhere.
- Threading Building Blocks can be used with OpenMP
- TBB is like a standard template library for parallel computing
The Intel compiler has a switch to have it look for parallel opportunities


7) Demo code
Destroy the Castle game:
http://www3.intel.com/cd/ids/developer/asmo-na/eng/337398.htm



-- The bias of the presentation is to use Intel tools. But overall the concepts are important. Many of the later topics go into specifics on the tools.
