Posts Tagged ‘parallel’

Deriving natural laws from raw data

Saturday, April 4th, 2009

So I read this article today. Cornell researchers have successfully built a parallel computer system that can essentially derive physical laws from observational data. This is extremely interesting. Before deciding on computer science, I was split between majoring in physics or computer science (and, to a much lesser extent, mathematics), so seeing interdisciplinary research like this made me very happy and impressed. In fact, I do believe that much of the more interesting research (current and future) will involve multi-disciplinary twists. Just recently I read a paper describing the translation of polynomial (or restricted-polynomial), completely-partitionable, first-order differential equations into distributed protocols (that satisfy the same mathematical properties as the derivatives)1.

So this deduction system basically figures out derivatives (changes in values w.r.t. some other changes in the system, e.g. time, position, etc.) that describe the observables, and uses these derivatives to derive system invariants. Technically, invariants are not the actual physical equations. However, they encode all the information needed to derive the physical equations for the system. With this technique, the Cornell researchers were able to derive conservation of energy and momentum, and also Newton's second law, from observations taken from simple experiments (including spring-loaded acceleration, and simple and double pendulums).
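To get a feel for what an invariant means here, consider a toy sketch of my own (not the researchers' actual algorithm, which as I understand it searches over candidate expressions with symbolic regression): simulate a simple pendulum as stand-in "observation data" and check that a candidate expression stays constant along the trajectory. In Java:

    public class InvariantCheck {
        static final double G = 9.81;

        // Candidate invariant for a unit-mass, unit-length pendulum:
        // H(theta, omega) = 0.5*omega^2 - G*cos(theta)  (total energy)
        static double candidate(double theta, double omega) {
            return 0.5 * omega * omega - G * Math.cos(theta);
        }

        public static void main(String[] args) {
            // Stand-in "observation data": integrate theta'' = -G*sin(theta).
            double theta = 1.0, omega = 0.0, dt = 1e-4;
            double h0 = candidate(theta, omega);
            double maxDrift = 0.0;
            for (int i = 0; i < 100_000; i++) {
                omega += -G * Math.sin(theta) * dt; // update velocity first,
                theta += omega * dt;                // then position
                maxDrift = Math.max(maxDrift,
                        Math.abs(candidate(theta, omega) - h0));
            }
            // A true invariant drifts only by numerical error (near zero);
            // a wrong candidate would wander off immediately.
            System.out.printf("max drift of candidate invariant: %.6f%n", maxDrift);
        }
    }

A true invariant keeps its time derivative at zero along every observed trajectory, which is exactly the property the search can test against the data.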

Read the article for more details. Also, I'm definitely waiting to read the Science article (Vol. 323, No. 5924) that was supposed to come out on April 3 (the digital version doesn't seem to have come out yet, though).

  1. Gupta, I. On the Design of Distributed Protocols from Differential Equations. PODC '04.

Parallel programming is a double-edged sword

Tuesday, November 11th, 2008

Today I read an interesting article at DDJ on understanding parallel performance. It outlines several truths about multithreaded code. One is that it is not always faster than non-threaded code.

Yes, it's surprising. Several years back, when I was young(er) and single-core was the only choice for an affordable PC (there were those servers with multiprocessors, but those weren't affordable at all), I was naive enough to think that if I made my code multithreaded, I would gain some speed. In general that's probably true, up to a point. Too many threads imply expensive context-switching, especially on a single core (heck, it's still a problem with multi-cores as long as there are more threads than cores, as the article clearly points out). Over the years, I've realized several considerations to ponder when writing multithreaded code.

The first is how CPU-bound your threads are. If your process is highly CPU-bound, it is not worth having more threads than cores. Remember, there are always other processes running as well, so the optimal number is the number of cores, or slightly fewer as the number of cores grows. For CPU-bound processing, it is better to minimize context-switching.
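A minimal sketch of that in Java, sizing a fixed pool to the core count (heavyComputation is a made-up stand-in for real number crunching):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class CpuBoundPool {
        public static void main(String[] args) {
            // One thread per core: more would only add context-switching.
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);
            for (int i = 0; i < 64; i++) {
                final int seed = i;
                pool.submit(() -> heavyComputation(seed)); // purely CPU-bound
            }
            pool.shutdown(); // already-queued tasks still run to completion
        }

        // Hypothetical stand-in for real computation.
        static long heavyComputation(int seed) {
            long acc = seed;
            for (int i = 0; i < 50_000_000; i++) acc = acc * 31 + i;
            return acc;
        }
    }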

Blocking is another consideration. Any task that may block for I/O or to obtain a lock should be run on a thread pool. In this case, I would say that having two to three times as many threads as cores is optimal for most cases. Note that for highly I/O-bound processes, even more threads may be needed to fully squeeze every ounce of performance out of those cores.
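For example, a hedged sketch of an oversized pool for blocking file reads (the file names are hypothetical, and the 3x factor is just the rule of thumb above, not a universal constant):

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class IoBoundPool {
        public static void main(String[] args) {
            // Oversize the pool: while some threads block on I/O,
            // others keep the cores busy.
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService ioPool = Executors.newFixedThreadPool(cores * 3);
            for (int i = 0; i < 100; i++) {
                final Path file = Path.of("data-" + i + ".txt"); // hypothetical files
                ioPool.submit(() -> {
                    try {
                        return Files.readString(file); // blocks on disk
                    } catch (Exception e) {
                        return null; // ignore missing files in this sketch
                    }
                });
            }
            ioPool.shutdown();
        }
    }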

Locks. When your threads require access to common resources, reduce the number of threads. Too many threads will cause lock contention everywhere and slow down processing, as CPU cycles are spent acquiring locks. Either minimize contention, which may not be possible in many cases, or segregate resources appropriately. For example, you may want just one thread writing to a common file, with any thread that wants to write to that file queueing its task to that one thread (see the sketch below). It's not as easy in other cases, although there is usually a way out that reduces lock contention. For example, when writing to an object, try to have different threads write to different, hopefully unrelated, parts of the object.
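Here's a minimal sketch of that single-writer idea, assuming a shared log file: producer threads enqueue lines, and one dedicated thread does all the writing, so the file itself never needs a contended lock:

    import java.io.FileWriter;
    import java.io.PrintWriter;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class SingleWriter {
        static final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        public static void main(String[] args) throws Exception {
            Thread writer = new Thread(() -> {
                try (PrintWriter out = new PrintWriter(new FileWriter("common.log"))) {
                    while (true) {
                        String line = queue.take();       // blocks until work arrives
                        if (line.equals("POISON")) break; // shutdown sentinel
                        out.println(line);
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
            writer.start();

            // Any number of producers just enqueue; no file locking anywhere.
            for (int i = 0; i < 4; i++) {
                final int id = i;
                new Thread(() -> queue.add("hello from thread " + id)).start();
            }

            Thread.sleep(100);   // crude: give the producers time to finish
            queue.add("POISON"); // tell the writer to stop
            writer.join();
        }
    }

The contention moves from the file to the queue, which is exactly the kind of structure (a concurrent queue) built to absorb it.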

In some cases, you can make a single thread perform much better than multiple threads (well, I said single thread, but I'm lying). The idea is to saturate that thread as much as you can with CPU-bound tasks. You may have some other threads (see, I lied) that collect resources and write out data as the single thread completes its tasks. In a web server, for example, you can have one master thread. When an incoming HTTP connection arrives, another thread processes it into a task (perhaps a Request object?) and queues it at the master thread. The master thread may process one request, realize that it needs to read a file, queue that work to a file-reader thread, and at the same time return the partially processed request to the queue and start processing another request. The main bottleneck here will come from the queue implementation, but this is a baby example, so I'm going to stop there. Another bottleneck appears if incoming requests arrive faster than the master thread can process them, in which case we arrive at the classic scenario of dropping requests (à la network routers and switches). At that point, it may be time to consider having two master threads.
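A toy sketch of that master-thread design (the Request type and its fields are made up for illustration; a real server would use non-blocking I/O, and this sketch runs until killed):

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    public class MasterThreadServer {
        record Request(int id, boolean needsFile) {} // hypothetical request type

        static final BlockingQueue<Request> masterQueue = new LinkedBlockingQueue<>(1024);
        static final ExecutorService fileReaders = Executors.newFixedThreadPool(2);

        public static void main(String[] args) {
            Thread master = new Thread(() -> {
                try {
                    while (true) {
                        Request req = masterQueue.take(); // CPU-bound work only
                        if (req.needsFile()) {
                            // Hand blocking file I/O to another thread, then
                            // requeue the request for further processing.
                            fileReaders.submit(() -> {
                                /* read file here */
                                masterQueue.offer(new Request(req.id(), false));
                            });
                        } else {
                            System.out.println("finished request " + req.id());
                        }
                    }
                } catch (InterruptedException ignored) {}
            });
            master.start();

            // Acceptor threads turn connections into Request objects; offer()
            // fails fast when the queue is full -- the "dropped request" case.
            for (int i = 0; i < 10; i++) {
                boolean accepted = masterQueue.offer(new Request(i, i % 2 == 0));
                if (!accepted) System.out.println("dropped request " + i);
            }
        }
    }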

From the above examples, you can probably tell that this method of self-managing your "threads" (in this case, instead of threads, we have requests) allows for much faster processing, as it reduces context-switching.

What about LWPs (light-weight processes)? Indeed… what about LWPs? :) LWPs are fast; I love them.

P.S. Do read the DDJ’s article above. It’s very thoughtful.