
Conclusions on Parallel Computing

By Asaf Shelly (21 posts) on April 9, 2010 at 11:10 am

We have been dealing with parallel computing for a while now. Some of the ideas we had at the start proved to be wrong, while others are only now becoming relevant. No doubt about it, parallel computing was pushed and forced into the mainstream of computing just as Object Oriented was in the previous millennium.

Some History: Hardware

The first to deal with parallel computing were hardware developers, because hardware supports multiple devices working at the same time, with different operation rates and response times. Hardware design is also Event Driven, because devices work independently and issue an Interrupt event when required. The computer hardware we know today is fully parallel; however, it is centralized, with a single CPU (Central Processing Unit) and multiple peripheral devices.

Some History: Kernel

The next to support parallel computing was the software infrastructure, which in modern operating systems is the Kernel. The Kernel must support multiple events, coming in the form of Hardware Interrupts and propagated upwards as Software Events. Kernels are commonly distributed in design, as several Drivers can communicate with each other. The centralized object in the system allows communication between the drivers and supports synchronization, but it is not supposed to contribute to the application's business logic in any form or way.

Some History: Network

UNIX is based on services. A Service is a way to call a function over a network. Network technologies required a distributed design in which every element is completely parallel to the next and there is no single 'processor unit' acting as the system's master. UNIX took this to the next level with technologies such as services, pipes, sockets, mailslots, Fork, and more. At a time when programming was tedious work, developing an operating system to support Fork meant extensive effort. Still, UNIX had built-in support for that mechanism, which solves so many problems... Only we forgot how to use it, and I don't remember seeing a new system design that had Fork in it.
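As a reminder of how little ceremony that mechanism needs, here is a minimal POSIX sketch (my illustration, not code from any of those system designs): the parent delegates work to a child process and keeps running in parallel.

    /* Minimal sketch of the UNIX fork mechanism: the parent delegates
     * work to a child process and continues in parallel with it. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();            /* duplicate the current process */
        if (pid < 0) {
            perror("fork");
            return EXIT_FAILURE;
        }
        if (pid == 0) {                /* child: runs in parallel with the parent */
            printf("child %d: doing the background work\n", (int)getpid());
            _exit(EXIT_SUCCESS);
        }
        printf("parent %d: continuing immediately\n", (int)getpid());
        waitpid(pid, NULL, 0);         /* join: wait for the child to finish */
        return EXIT_SUCCESS;
    }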

Some History: Applications

When I had just started with C programming and had just found out about threads, I tried doing things in parallel just to see how it works. The result was, as you can imagine, far worse. The application ran much slower, there were "Random Bugs", and the code looked terrible. The explanation I got was that there is only one CPU and the different threads compete over it. No Multi-Core CPU means there is no ROI (return on investment) for using multiple threads and the large effort required for a parallel design. The only reason to use a thread is when you really have to, for example when there is a need to wait for hardware or a network buffer.
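To illustrate that one legitimate case, here is a minimal pthreads sketch (my example; stdin stands in for the hardware or network buffer): a dedicated thread blocks on the slow source while the main thread stays free to do useful work.

    /* Sketch: a worker thread blocks on slow input (standing in for a
     * hardware or network buffer) so the main thread stays responsive. */
    #include <pthread.h>
    #include <stdio.h>

    static void *reader(void *arg)
    {
        char line[256];
        (void)arg;
        if (fgets(line, sizeof line, stdin) != NULL)   /* blocks here */
            printf("reader: got %s", line);
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, reader, NULL);
        printf("main: free to do other work while the reader waits\n");
        pthread_join(t, NULL);                          /* wait before exit */
        return 0;
    }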

Parallel Computing Today

A few years ago CPUs hit a hardware limitation that would have required special cooling. At that point the race to reduce silicon size and increase clock frequency ended. Instead of spending massive amounts of silicon on advanced algorithms to improve instruction pre-fetch, smaller and simpler CPUs are used, leaving room for more CPUs on the same die. We got the Multi-Core CPU, which practically means several CPUs in the same computer.

At first the cores of a Multi-Core CPU were simpler than the single-core CPUs they replaced. These cores also operated at a much lower frequency, which meant that, for the first time ever, an application designed for single-task operation took a massive performance hit when moving to a new computer.

Parallel Computing has become mainstream. We started with a long series of lectures about parallel computing. It seemed that people wanted to know about this subject, but there was so much overhead that Parallel Computing simply scared people away. There is a huge ramp before you can be a good parallel programmer, just as there is for object oriented programming. This meant that team leaders and architects were at the same level as beginner programmers, or perhaps had some slight advantage. Add to this the fact that there are massive amounts of code already written for a single core CPU, and real gains can only be achieved after at least a partial rewrite. The last, and most important, reason to reject parallel computing was that it is easier and cheaper to buy another machine than to make the best of the CPU cores. This was actually a boost for Cloud Computing.

Who is doing Parallel Computing

There are several types of parallel computing. The hardware is parallel, so the Kernel is parallel. With this type of parallelism every worker is doing something different, and workers own their resources instead of sharing them. For a long while now DSP (Digital Signal Processing) chips have been Multi-Core CPUs, so that the algorithms executed on these chips can run faster. Algorithms and DSP chips are evaluated in MIPS, the number of instructions executed per unit of time. Gaining performance with an algorithm means either using fewer instructions or adding more worker CPU cores. PCs also run algorithms such as face recognition, image detection, image filtering, motion detection, and more. For these, the transition from a single core CPU to a Multi-Core CPU was fast and simple.

An algorithm's performance gain is relative to the amount of computation per data item: the more computation, the more cores can be used. Image Blending (fade) is an example of an algorithm which cannot benefit from more than a single core. Take an image and blend each pixel with the corresponding pixel of another image. Each pixel is read from RAM, then a simple addition and shift right are performed, and the result is written back to RAM. The CPU can operate at a rate of 3GHz while the RAM runs at 1GHz. For each pixel in the image we: read pixel A, read pixel B, add, shift, write the result pixel. Add another core and the CPU cores will mutually block on access to the memory. This is also true for Databases and database algorithms such as sort algorithms, linked lists, etc. For this reason the new Multi-Core CPUs have extensive support for parallel access to memory.
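For concreteness, here is the blend loop just described as a C sketch; the function name and the 8-bit grayscale pixel format are my assumptions for simplicity. There is almost no computation per byte moved, which is why extra cores mostly end up contending for the memory bus instead of speeding anything up.

    /* Blend (fade) two images: average each pair of pixels.
     * Per pixel: read A, read B, add, shift right by one, write result.
     * Almost no computation per byte moved, so the loop is bounded by
     * memory bandwidth rather than by the number of cores. */
    #include <stddef.h>
    #include <stdint.h>

    void blend(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            out[i] = (uint8_t)(((unsigned)a[i] + b[i]) >> 1);
    }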

Parallel Computing ROI

Parallel Computing is the new future for computers. Object Oriented is no longer the new buzzword. I keep telling people that before they make an Object Oriented Design for their systems they should make flow charts. Good OOD is based on good system flow charts, whether you write them down or do it in your head as an art.

We all used to think that User Interface is the product and OOD is the way to build it. It now looks like we were wrong:

User Experience is the product and Parallel Design is the way to build it. User Experience (UX) is not User Interface (UI). User Interface defines what the product looks like, or in other words UI defines what the product is. Object Oriented Design defines what the code looks like, or in other words OOD defines what the code is. Parallel Computing defines how the code works, or in other words Parallel Computing defines what the code does. User Experience defines how the application behaves, or in other words User Experience defines what the application does.

I am not using a C++ library because it uses linked lists. I am using that library because it can sort.

I am not buying a product because it looks the way I want it to look; for that I could buy a framed picture instead. I am buying a product because it does something I need and does not do what I do not need.

Parallel Computing is the basis for User Experience. Even if you have a single core it is better to have a good parallel design. As customers you know this: you don't want to accidentally hit "Print" instead of "Save" and then wait five seconds as punishment for the dialog to open just so you can close it. (see minute 43 of the demo video)
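One illustrative way to get that behavior (my sketch, not a recipe from the post): run the slow preparation on a worker thread and let the interface cancel it through a shared flag, so the mis-click costs nothing and the application never freezes.

    /* Sketch: slow work runs on a worker thread; the "UI" can cancel it
     * by setting a flag, so a mis-click never blocks the application. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <unistd.h>

    static atomic_bool cancelled = false;

    static void *prepare_print_dialog(void *arg)
    {
        (void)arg;
        for (int step = 0; step < 50; step++) {   /* pretend: 5s of setup */
            if (atomic_load(&cancelled)) {
                printf("worker: cancelled, stopping early\n");
                return NULL;
            }
            usleep(100 * 1000);                   /* 100 ms of "work" */
        }
        printf("worker: dialog ready\n");
        return NULL;
    }

    int main(void)
    {
        pthread_t worker;
        pthread_create(&worker, NULL, prepare_print_dialog, NULL);
        usleep(300 * 1000);               /* user notices the mis-click */
        atomic_store(&cancelled, true);   /* UI stays live; just cancel */
        pthread_join(worker, NULL);
        return 0;
    }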

Today we have so many good resources and tools. Now is the time to learn how to work in parallel and produce good products with good UX.


Comments (7)
April 14, 2010 6:55 AM PDT


Peter da Silva I was doing parallel computing on single-CPU systems back in the late '70s and early '80s, without even thinking about it. It was mainstream. It was called the "UNIX command line". The UNIX pipes and filters model took advantage of parallelism on a single computer by letting you exploit the parallelism inherent in the division of work between I/O and computation. A UNIX pipeline allowed programs to accumulate and buffer data as fast as the disks could provide it, so that data was available for computation as soon as the CPU-intensive components of the pipeline were ready for it. When multiple CPUs became available, this just happened automatically.

For slow and latency-sensitive devices, such as tape drives, one of the earliest tools for buffering I/O was simply to run the "dd" command with a large buffer multiple times in a pipeline: "tar cvf - . | dd bs=16k | dd bs=16k | dd bs=16k > /dev/rmt0h" (this was on a PDP-11, where 16k was a large buffer). The output of "tar" was uneven and bursty, because it was seeking all over the disk to collect the files for the archive, but the output of the final "dd" was smooth and the tape was able to stream for many megabytes at a time.

This had nothing to do with your proposed redefinition of parallel computing as a user experience design tool; it was a more or less automatic byproduct of good factoring of the problem. It was coarse-grained and could be bottlenecked by non-streaming operations (e.g., sorts), but it was an early and effective tool. There have been similar tools created for specialized problem areas in GUI applications, such as MIDI apps that let you lay out multiple MIDI processing steps in two dimensions and hook them together by "wires", but the same kind of factoring of the problem space for GUI applications hasn't really been found.
April 14, 2010 8:34 AM PDT


Richard H. The image blending example only highlights the inherent non-parallel nature of memory-CPU bus contention. Current PCs with multiple cores aren't 100% parallel at the hardware level, i.e. the Von Neumann bottleneck is still present.
Lower your expectations, or get a system that really is parallel at the bus level.
April 14, 2010 8:35 AM PDT


Yves Daoust I don't quite share the comparison of parallel computing with object oriented design. I see the latter as a small step in the art of programming, as opposed to a giant leap for the former.

Anyone can write sequential programs after a few minutes of training on any procedural language. Most people end up writing well structured programs after a few years of practice and find no difficulty switching to Object Oriented Programming.

Writing concurrent programs is of another nature. It is reserved for true experts, with a truly scientific understanding of the issues. Just think of the Dining Philosophers problem: even though the problem statement looks easy, I doubt that ordinary people can solve it correctly.

In fact, I consider that parallel programming is not within reach of ... the human brain, except in simple or symmetrical cases. As soon as there are two or three asynchronous agents, you lose control :)
April 14, 2010 1:44 PM PDT


Thierry Joubert It is true that we see nowadays about as many conferences on Parallel Programming as we saw on OOP during the early 90's. From time to time, big actors have to convince the masses. Today, with Java and .NET, OOP has become the standard (try to give a C/C++ course to students if you are in any doubt about this). The OOP "push" came from the software industry, whose motivation was to provide efficient programming interfaces for programmable products like GUIs, Databases, system services, etc. OOP was a movement towards progress.

Parallelism is one of the oldest things in computer science, as stated in the article and several comments, but the Parallel Programming "push" we see nowadays is organized by silicon vendors who failed to keep up on the Moore's Law slope. OOP was not motivated by any limitation, and I see a noticeable difference here.
April 14, 2010 4:47 PM PDT


paul clayden Parallel is a fad and won't last. It's an interim measure on the way to something much, much bigger. Pretty soon we'll have analogue computing/quantum computing, which is going to rock all our worlds.
April 14, 2010 8:11 PM PDT


Lava Kafle Superb clarification. We have been using parallelism in Java, Oracle, .NET, C#, whatever, since the very beginning of the x64 architectures supported by Intel.
April 18, 2010 3:00 AM PDT

Asaf Shelly
Hi All,

I will start by thanking Peter for the extensive information. Truly something to respect.

This shows us that the basic ideas were already there and were somehow lost in time. It makes me wonder what else we forgot.

Back in the old days, applications and drivers usually had only a few components. These were separated by using different source files. Later we had a massive upgrade to using classes and objects as part of Object Oriented programming and design. C programmers did not have to write down the Object Design, whereas C++ programmers found it almost intuitive and mandatory. C programming also defines procedures. Notice the name "Procedure": it means that the function is not a 3-line variable-modification snippet, rather it is a whole procedure within the main process. The flow chart was also too often not written down, but as we can see by the names, the application was a 'Process' to perform, which had a 'main procedure' and several other 'procedures'. Old school programming defined Procedures and Structures; we now go back to Tasks and Objects. This is why my website (where the video is found) says "Welcome to the Renaissance"...

I was slowly getting to my reply to Yves Daoust's: "In fact, I consider that parallel programming is not within reach of ... the human brain". See minute 12:30 in the video mentioned at the end of the post. Everything we do is parallel. If you work as part of a big organization then you probably do Object Oriented Design and manage the programming tasks using the SCRUM methodology. Take a look at SCRUM, copy the principles to your code, and you have a good parallel application. I quote Wikipedia (http://en.wikipedia.org/wiki/Scrum_(development)): "...the 'ScrumMaster', who maintains the processes..." There are also the sprint, the backlog, priority, and the daily sync meeting, which is used to profile the operation and keep track of progress. There are interesting things to learn from it; for example, the daily sync meeting is where you report all problems. This means that we don't raise an exception for every problem; instead we collect all the errors and report them when the time is right. This might solve a few problems that parallel loops are struggling with.
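As a sketch of that "report at the sync meeting" idea (my illustration, with hypothetical worker failures): each worker records problems into its own slot instead of aborting, and the joining thread reports them all together.

    /* Sketch of "don't raise an exception per problem": each worker logs
     * failures into its own slot; errors are reported together at the
     * join point, like problems raised at a daily sync meeting. */
    #include <pthread.h>
    #include <stdio.h>

    #define WORKERS 4

    static int error_log[WORKERS];   /* one private slot per worker */

    static void *work(void *arg)
    {
        int id = (int)(long)arg;
        /* pretend the odd-numbered workers hit a problem */
        error_log[id] = (id % 2) ? -1 : 0;
        return NULL;
    }

    int main(void)
    {
        pthread_t t[WORKERS];
        for (long i = 0; i < WORKERS; i++)
            pthread_create(&t[i], NULL, work, (void *)i);
        for (int i = 0; i < WORKERS; i++)
            pthread_join(t[i], NULL);
        for (int i = 0; i < WORKERS; i++)   /* the "sync meeting" */
            if (error_log[i] != 0)
                printf("worker %d reported error %d\n", i, error_log[i]);
        return 0;
    }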
The " Dining Philosophers problem" is a way to manage a proposed solution – Locks, it is not a way to solve the problem. If instead of using a set of locks you use a service for each resource the problem is completely different.

Is the image here http://www.9to5mac.com/intel-core-i7-mac-pro-xserve the answer to Richard's question?

Hi Thierry, I could respectfully argue that OOP was motivated by the limitation of managing large-scale projects, just as parallel programming is motivated by managing large-scale systems. OOP is for design time and parallel programming is for run time. Not that I don't agree with you. It is possible that OOP was focused on so much in the past few years that programmers today think only in objects and find it very difficult to think in tasks.

I guess I have to say to Paul that parallel programming is agnostic to the engine. I am suggesting you use a word processor instead of a typewriter. It does not matter whether you are using MS-Office for Mac, Open-Office, or something new that will be invented 5 years from now. Quantum computing or not, my application should still know how to cancel an operation when it is no longer required.

Thanks for the comment, Lava.

Regards,
Asaf
