Rant #2: Backward-compatibility and API-breaking changes

November 26th, 2008

So, recently, I’ve switched some of the code I maintain to use this great new version of a common utility (I shall not bore you with the details). The new version has tons of great new stuff. It was exciting. I managed to clean up my config files a lot just by using the new version.

Of course, the new version specifically promised that, while it breaks the API, all the functionality of the previous version also exists in the new one, and that users (i.e. me) can use adapter classes to convert between the two versions.

All was well, until today. Today, I had to extend part of my code to work together with older code that uses the previous API. Guess what? While the old build rule has been updated so that it can depend on the build rule for the new version, the reverse is not true! Hold on! Why!? Yeah, that surprised me a lot! So I had two options: extract the part I need, update it to use the new API, and make the two systems depend on it, or update the older code to use the new version.

I chose the latter because it wasn’t a huge amount of work and the design is cleaner if I keep the old design (breaking up that part of the code seems rather hackish to me). After half an hour of updating the old code, I got both systems working together well (I wrote some scripts to automate some of the tasks, if you’re wondering why it took only half an hour). The code links fine, though since a lot of other code depends on the code I just modified, I had to run the API in compatibility mode, but that’s no big deal.

Only minutes later, when I started writing unit tests (before I even started writing my production code), I realized that, umm, once the new API runs in compatibility mode, you have to use the adapter class to get the new functionality I need. There is no automatic detection, everything is manual! No convenience methods to convert one to the other easily (I mean it’s C++, they could write a conversion operator or a converting constructor to go from one to the other pretty darn easily!). Crap! I realized that I had to modify the code in far too many places to make this worth it. So I just decided to ditch the feature I needed.

There are so many lessons I learned from just today’s experience. If I were to write the new API, I would provide automatic conversion from one to the other, I would backport some of the most important new features (mind you, these new features are very doable with the old version), and I would try my best not to make any API-breaking changes. You know, having the old and new data structures extend the same base class would have been much better.
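
To show what I mean by automatic conversion, here is a minimal sketch. OldSettings, NewSettings, TimeoutMs and TimeoutSecs are names I made up for illustration; they have nothing to do with the real utility. The idea is just that a non-explicit converting constructor plus a conversion operator let old and new call sites accept either type.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for the old and new API data structures.
struct OldSettings {
  std::string name;
  int timeout_secs;
};

class NewSettings {
 public:
  NewSettings(std::string name, long timeout_ms)
      : name_(std::move(name)), timeout_ms_(timeout_ms) {}

  // Non-explicit converting constructor: old -> new happens implicitly.
  NewSettings(const OldSettings& old)
      : name_(old.name), timeout_ms_(old.timeout_secs * 1000L) {
    assert(old.timeout_secs >= 0);
  }

  // Conversion operator: new -> old happens implicitly too.
  operator OldSettings() const {
    return OldSettings{name_, static_cast<int>(timeout_ms_ / 1000)};
  }

  const std::string& name() const { return name_; }
  long timeout_ms() const { return timeout_ms_; }

 private:
  std::string name_;
  long timeout_ms_;
};

// Code written against the new API...
long TimeoutMs(const NewSettings& s) { return s.timeout_ms(); }
// ...and code still written against the old one.
int TimeoutSecs(const OldSettings& s) { return s.timeout_secs; }
```

With something like this, calling TimeoutMs with an OldSettings (or TimeoutSecs with a NewSettings) just works, with no adapter at the call site. Implicit conversions have their own pitfalls, of course, so treat this as an illustration rather than a recommendation for every API.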

Lastly, seriously, I know there were some serious problems with the old code, but I’d rather have that code deprecated slowly over time while the new code is being phased in. During this transition, you’d better not introduce new features in just the new code (or worse, just the old code). You need to support both of them or, otherwise, help people who use the old version migrate. This is not open source code, it’s privately owned code. While the code base is arguably huge, there is evidence of successful deprecations in the past, so why can’t this one do that?

Child sending naked picture of herself prosecuted…

November 23rd, 2008

… Really now? Seriously?

This has got to be the stupidest news I’ve seen in a while. Not to mention the stupidest public prosecution in quite a while too.

Oh and it doesn’t stop there…

A prosecutor says Licking County authorities also considering charges for students who received the photos.

Okay… (Probably no wonder, given that the county and the school have, well, you know, the word licking in their names.) I’m still half-hoping that this is “onion” news[1].


On an unrelated note, Stoyan Stefanov from the YUI blog has just posted a pretty neat article on pngcrush (among other things, like a JPEG metadata stripper). This should really be common practice nowadays. A friend of mine has even written a short script recently to automate CSS spriting followed by pngcrush. He told me it reduces the file size of the sprited PNG by twenty-something percent. Not to mention far fewer HTTP requests.

C++’s scoped_ptr and unique_ptr smart pointers

November 23rd, 2008

I got bitten again just the other day when I was modifying old code. It crashed. Yes, I added an extra delete where none was required, resulting in a double deletion. It reinforced my belief that most pointers in C++ should be smart pointers. New C++ programmers often get caught by memory leaks from forgetting to delete a pointer. As you get more experienced, forgetting to delete becomes far less of a problem, but double deletion becomes more rampant (Hey! It’s hard to keep track of object ownership you know? Especially when you’re rushing to modify other people’s code before a deadline…)

scoped_ptr

To the rescue is Boost’s scoped_ptr smart pointer. There are two main reasons, in my opinion, to use scoped pointers.

Scoped pointers ease manual memory management. A scoped_ptr holds a pointer to the object that it manages and automatically deletes that object when the scoped_ptr itself is destroyed. The scoped_ptr is a class template and is normally used as an automatic (stack) variable or a class member, which means that it is destroyed automatically when it goes out of scope. Here is a simple, incomplete implementation of scoped_ptr:

#include <cstddef>  // NULL
#include <boost/noncopyable.hpp>

template <typename T>
class scoped_ptr : boost::noncopyable {
 public:
  explicit scoped_ptr(T* p = NULL) : p_(p) {}
  // Deleting a NULL pointer is a no-op, so no check is needed.
  ~scoped_ptr() { delete p_; }
  void reset(T* p = NULL) {
    if (p == p_) return;  // resetting to the held pointer must not delete it
    delete p_;
    p_ = p;
  }

  // Some implementations may choose
  // to crash if p_ is NULL for
  // operator* and operator->.
  T& operator*() const { return *p_; }
  T* operator->() const { return p_; }
  T* get() const { return p_; }
 private:
  T* p_;
};

Let’s analyze the class. It extends Boost’s noncopyable class (some would prefer to use a macro for this), which means that a scoped_ptr object may not be copied or assigned to another scoped_ptr. This enforces strict ownership of the held pointer. As you can see above, the destructor of scoped_ptr simply deletes the held object, and so does the reset(T*) method.

This brings us to a second, more important point. scoped_ptr enforces strict ownership of an object. In other words, it forces us programmers to think and then rethink about the ownership of an object. Many times, the problem C++ developers have is not forgetting to delete. It is not knowing who exactly owns an object. For example, let’s check out the following really simple class definition:

class ConfusedClass {
 public:
  ConfusedClass() {}
  ~ConfusedClass() {}
  void DoSomething() {
     a_ = b_.PerformSomething();
  }
  AnotherClass* GetA() { return a_; }
 private:
  AnotherClass* a_;
  YetAnotherClass b_;
};

In a world without scoped_ptr, this class can be really confusing. Who owns the object held by a_? Is it b_? Is it ConfusedClass? Or is it whoever calls GetA? The last option looks unlikely here. But it’s pretty hard to differentiate between the first two cases! A subsequent reader of the class definition would probably need to dig into YetAnotherClass to determine that. (Note also that the destructor is empty: it can be that b_ owns the object held by a_, or… it can be a bug, namely forgetting to delete a_!)

With scoped_ptr, when we write ConfusedClass, we should be thinking about the ownership of the object held by a_. And if we think this class should own it, we should use scoped_ptr<AnotherClass> a_ instead! That way, subsequent readers of the class definition know for sure that the object is owned by ConfusedClass (or shall we call it SmartClass now).
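
Here is roughly what SmartClass could look like. To keep the sketch compilable without Boost, I use a tiny ScopedPtr stand-in that mirrors the implementation above; the live_count counter and the LiveCountAfterScope function are my own additions for illustration, and I am assuming that PerformSomething hands ownership of a freshly allocated object to its caller.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-in for boost::scoped_ptr so the sketch compiles on its own.
template <typename T>
class ScopedPtr {
 public:
  explicit ScopedPtr(T* p = NULL) : p_(p) {}
  ~ScopedPtr() { delete p_; }
  void reset(T* p = NULL) {
    if (p != p_) { delete p_; p_ = p; }
  }
  T* get() const { return p_; }
 private:
  ScopedPtr(const ScopedPtr&);             // not copyable
  ScopedPtr& operator=(const ScopedPtr&);  // not assignable
  T* p_;
};

// Hypothetical collaborator; live_count tracks objects not yet deleted.
struct AnotherClass {
  static int live_count;
  AnotherClass() { ++live_count; }
  ~AnotherClass() { --live_count; }
};
int AnotherClass::live_count = 0;

struct YetAnotherClass {
  // Assumption: returns a new object and hands ownership to the caller.
  AnotherClass* PerformSomething() { return new AnotherClass; }
};

class SmartClass {
 public:
  void DoSomething() { a_.reset(b_.PerformSomething()); }
  // Returns a borrowed pointer; SmartClass keeps ownership.
  AnotherClass* GetA() { return a_.get(); }
 private:
  ScopedPtr<AnotherClass> a_;  // the type itself says SmartClass owns this
  YetAnotherClass b_;
};

// Returns the number of live AnotherClass objects after a SmartClass is gone.
int LiveCountAfterScope() {
  {
    SmartClass s;
    s.DoSomething();
    s.DoSomething();  // reset() deletes the first object
    assert(AnotherClass::live_count == 1);
  }  // s goes out of scope here; a_ deletes the remaining object
  return AnotherClass::live_count;
}
```

Notice that SmartClass needs no hand-written destructor at all, and there is no delete anywhere in it to get wrong.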

As a bonus, code with multiple exit paths is easily managed with scoped_ptr (instead of hunting down each exit path and making sure all the pointers that should be deleted are deleted). Imagine how troublesome it is to manage a method with a throw, for example. (I remember writing Java code where I always had to have 3 if blocks in the finally part of a try-catch-finally, to check for null and close a ResultSet, a PreparedStatement, and an SqlConnection. In C++, I’d simply write a wrapper similar to scoped_ptr to perform the closing.)
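
A wrapper like that could be sketched as follows. Statement, ScopedCloser, Run and ClosedAfterRun are all hypothetical names I picked for this example; the point is only that the destructor runs on every exit path, including when an exception unwinds the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical resource with an explicit Close(), standing in for
// something like a database statement.
struct Statement {
  bool closed;
  Statement() : closed(false) {}
  void Close() { closed = true; }
};

// Closes the wrapped resource on destruction, the way scoped_ptr deletes.
template <typename T>
class ScopedCloser {
 public:
  explicit ScopedCloser(T* resource) : resource_(resource) {}
  ~ScopedCloser() {
    if (resource_) resource_->Close();
  }
 private:
  ScopedCloser(const ScopedCloser&);             // not copyable
  ScopedCloser& operator=(const ScopedCloser&);  // not assignable
  T* resource_;
};

// Three exit paths (early return, throw, normal return) and not a single
// explicit Close() call.
int Run(Statement* stmt, int mode) {
  assert(stmt != NULL);
  ScopedCloser<Statement> closer(stmt);
  if (mode == 0) return -1;
  if (mode == 1) throw std::runtime_error("query failed");
  return 42;
}

// Reports whether the statement ended up closed after Run() exits.
bool ClosedAfterRun(int mode) {
  Statement stmt;
  try {
    Run(&stmt, mode);
  } catch (const std::runtime_error&) {
    // The closer's destructor already ran during stack unwinding.
  }
  return stmt.closed;
}
```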

unique_ptr

C++0x expands the smart pointer repertoire even more with unique_ptr (Committee Draft §20.8.12). This smart pointer has strict ownership, like the scoped pointer. However, it is MoveConstructible and MoveAssignable (as described by the CD). What that jargon means is that a unique_ptr s can be constructed from another unique_ptr u, with ownership of the held object transferred from u to s (MoveConstructible), and a unique_ptr u can be assigned to another unique_ptr s, again with ownership of the held object transferred from u to s (MoveAssignable).

This pointer adds a little extra value over the scoped pointer: you can transfer ownership (there is no release method in the scoped pointer, while unique_ptr has not just a move constructor and move assignment, but also an explicit release method). It is basically a better version of std::auto_ptr (I’ve heard talk of deprecating the auto pointer).
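
Here is a small sketch of those three ways of transferring ownership, assuming a compiler with C++0x support for std::unique_ptr and std::move. Widget, MakeWidget, Consume and TransferDemo are names I made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Widget {  // hypothetical payload type
  int id;
  explicit Widget(int i) : id(i) {}
};

// A factory hands ownership to the caller through the return value.
std::unique_ptr<Widget> MakeWidget(int id) {
  return std::unique_ptr<Widget>(new Widget(id));
}

// Taking a unique_ptr by value makes the transfer explicit at the call site.
int Consume(std::unique_ptr<Widget> w) {
  assert(w);
  return w->id;
}  // w is destroyed here and deletes the Widget

// Walks through move construction, move assignment and release().
bool TransferDemo() {
  std::unique_ptr<Widget> u = MakeWidget(7);
  std::unique_ptr<Widget> s(std::move(u));  // MoveConstructible: u -> s
  bool moved_from_is_empty = (u == nullptr);

  std::unique_ptr<Widget> t;
  t = std::move(s);                          // MoveAssignable: s -> t
  bool assigned = (s == nullptr) && t->id == 7;

  Widget* raw = t.release();                 // t gives up ownership
  bool released = (t == nullptr) && raw->id == 7;
  delete raw;                                // deleting is our job again

  return moved_from_is_empty && assigned && released;
}
```

Every transfer is visible in the code as a std::move or a release(), which is exactly the kind of ownership documentation a raw pointer never gives you.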

To use smart pointers effectively, use the correct smart pointer for each of your needs. If you need strict ownership semantics without any transfer, use scoped_ptr. If you need the ability to transfer ownership in addition to that, use unique_ptr (or std::auto_ptr). Even better, make such rules part of your software/company’s style guide. Future maintainers will thank you when they can easily see orderly semantics in the chaos.

shared_ptr

There is another smart pointer introduced in C++0x. Its name is shared_ptr (CD §20.8.13.2). This pointer is basically a reference-counting smart pointer that implements shared ownership of the held object. When the last shared_ptr holding a particular object is destroyed, the object is deleted too. Now, I won’t delve too much into this smart pointer because I believe in strict ownership as opposed to shared ownership. Situations that demand shared ownership of an object should be very, very rare. In a good software design, only one object should own a particular object.

Now there is one place where shared_ptr is very, very useful: the STL containers. When you insert an element into or retrieve one from an STL container, a copy of the object is made. For performance reasons, keeping pointers in the container makes lots of sense (especially when copying is expensive). As with any pointer usage, it becomes very hard to keep track of these objects. Hence the usage of shared_ptr. Copying a shared pointer is cheap. Additionally, there’s no risk of forgetting to delete a pointer held in an STL container. (Keep in mind that a shared_ptr is usually twice as big as a normal pointer as it needs to keep another pointer to the reference counter.)

Example usage:

vector<shared_ptr<ClassA> > vector_of_a;
hash_map<int, shared_ptr<ClassA> > map_of_int_to_a;
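
A slightly fuller sketch of the same idea follows. Since hash_map is a non-standard extension, I am substituting std::unordered_map here, and the live_count counter and OwnersWhileStored function are my own illustrative additions.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

struct ClassA {  // hypothetical expensive-to-copy type
  static int live_count;
  int value;
  explicit ClassA(int v) : value(v) { ++live_count; }
  ~ClassA() { --live_count; }
};
int ClassA::live_count = 0;

// Returns how many owners the object has while both containers hold it.
long OwnersWhileStored() {
  std::vector<std::shared_ptr<ClassA> > vector_of_a;
  std::unordered_map<int, std::shared_ptr<ClassA> > map_of_int_to_a;

  std::shared_ptr<ClassA> a(new ClassA(5));
  vector_of_a.push_back(a);  // copies the pointer, not the ClassA
  map_of_int_to_a[5] = a;    // same object, one more owner
  assert(ClassA::live_count == 1);
  return a.use_count();
}  // all owners are destroyed here, so the ClassA is deleted exactly once
```

Only one ClassA ever exists, no matter how many containers refer to it, and nobody has to remember to delete it when the containers go away.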

GMail Themes (and its launch faux pas)

November 20th, 2008

Oh my!! Did you see the new GMail themes feature?? It’s freakin’ awesome! No, seriously. Within minutes, I’ve found my favorite among those themes: the pebbles theme! The color scheme is absolutely pleasant, complete with transparent bars.

This, together with the label colors lab feature, has turned my boring gmail account into a very pleasant color explosion. Soothing and relaxing. :) Try it out if you have an account (or create an account). You know, sometimes I think that everyone has a gmail account. When I see an e-mail that is not gmail based, I can’t help feeling delighted (at seeing something not ‘at’ gmail ‘dot’ com).

GMail made a faux pas during the launch of themes. Usually new features only appear when GMail is refreshed. It seems that this one wasn’t launched that way. The new theme appeared halfway through my GMail session. It was shocking. I thought something was really wrong with my GMail since the colors looked really horrible. Fortunately, the first thing I did was refresh the page. After refreshing, it turned out to be quite pleasant (the “default” theme). I thought it might just be my browser. An hour later, a friend pointed out to me that the web snippets (the text at the top of the inbox that comes from other sites) were virtually unreadable due to the color scheme. My conversation with him was quite funny. We thought that somebody at GMail was so gonna get his arse burned. ;) We were kidding, of course (in case some Googlers read this). I told him to refresh, and he loves it.

(You might also want to try the time-based themes; those are pretty cool as they change with time. I tried the Beach theme, but after a while, it still doesn’t look as appealing as pebbles.)

John Resig’s degrading script tags

November 13th, 2008

I just re-discovered this interesting Javascript trick from John Resig’s blog (link here). Under usual circumstances, I would frown when I see eval in Javascript code, the one construct I try to avoid like the plague. But I can’t resist an interesting usage of eval. This is one of them. Basically, the trick allows you to have a script tag with a src attribute and embedded Javascript, like this:

<script src="some_js_file.js">
  // Do something with the loaded file.
  callSomething(); ...
</script>

Interesting huh? To make this work, the loaded script file itself ends with a two-liner that basically finds the script element and evaluates the innerHTML of the element:

var scripts =
    document.getElementsByTagName("script");
eval(scripts[scripts.length - 1].innerHTML);

Ingenious. As an added bonus, the content of the script element will not be executed if the Javascript fails to download (since, obviously, the last two lines that we just added won’t be executed).

Also interesting is how this hack utilizes the synchronous nature of Javascript loading. In most browsers except the most recent ones (most of them still in beta/nightly), whenever a script tag is encountered, rendering stops until the script is downloaded and executed. That means by the time the script runs, it knows for sure that the script tag it came from is the last one in the DOM, since nothing after it has been parsed yet. Thus, you can access the element using the above method (see the scripts[scripts.length - 1] part of the code).

I’m a little bit behind on how newer browsers download their Javascript. I suspect that the method above may not work. I guess it’ll depend on the heuristics the browsers use to make Javascript downloads non-blocking. I heard that Firefox will actually assume that the script does not do anything to the DOM and continue rendering, in which case the above technique may not work. (I’m sure I’m missing something; there are probably some heuristics FF uses that I’m not aware of.)

Well, still, it is an ingenious way of utilizing eval.

Parallel programming is a double-edged sword

November 11th, 2008

Today I read an interesting article at DDJ on understanding parallel performance. It outlines several truths about multithreaded code. One is that it is not always faster than non-threaded code.

Yes. It’s surprising. Several years back, when I was young(er) and single-core was the only choice for an affordable PC (there were those multiprocessor servers, but those weren’t affordable at all), I was naive enough to think that if I made my code multithreaded, I would gain some speed. In general that’s probably true, up to a point. Too many threads imply expensive context-switching, especially on a single core (heck, it is still a problem with multi-cores, as long as there are more threads than cores, as the article clearly points out). Over the years, I have come to realize there are several things to consider when writing multithreaded code.

The first is how CPU-bound your threads are. If your process is highly CPU-bound, it is not worth having more threads than cores. Remember, there are always other processes running as well, so the optimum number would be the same as the number of cores, or fewer as the number of cores explodes. For CPU-bound processing, it is better to minimize context-switching.

Blocking is another consideration. Any task that may block for I/O or to obtain a lock should be run on a thread pool. In this case, I would say that having 2-3 times as many threads as cores is optimal for most cases. Note that for highly I/O-bound processes, more threads may be needed to squeeze every ounce of performance out of those cores.
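
These two rules of thumb can be written down as a couple of tiny helpers. This is only a sketch of my own heuristics: CpuBoundWorkers, BlockingWorkers, DetectedCores and the reserved and multiplier parameters are names I invented for this example, and the right numbers for your workload should come from measurement.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Picks a worker count for CPU-bound work: at most one thread per core.
// `reserved` is a hypothetical knob for cores left to other processes.
unsigned CpuBoundWorkers(unsigned cores, unsigned reserved) {
  if (cores == 0) cores = 1;  // hardware_concurrency() may return 0
  if (reserved >= cores) return 1;
  return cores - reserved;
}

// Picks a worker count for blocking (I/O or lock-heavy) work, using a
// small multiple of the core count, e.g. 2 or 3.
unsigned BlockingWorkers(unsigned cores, unsigned multiplier) {
  assert(multiplier >= 1);
  return std::max(1u, cores) * multiplier;
}

// Number of hardware threads reported by the standard library.
unsigned DetectedCores() { return std::thread::hardware_concurrency(); }
```

For example, on a machine reporting 8 cores, CpuBoundWorkers(8, 0) gives 8 workers while BlockingWorkers(8, 3) gives 24.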

Locks. When your threads require access to common resources, reduce the number of threads. Too many threads will cause lock contention everywhere and slow down processing as CPU cycles are spent acquiring locks. Either minimize contention, which may not be possible in most cases, or segregate resources appropriately. For example, you may want just 1 thread to write to a common file, with any thread wanting to write to that file queueing its task to that 1 thread. It’s not as easy in other cases, although there is usually a way to minimize lock contention. For example, when writing to an object, try to have different threads write to different, hopefully unrelated, parts of the object.
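
The single-writer idea can be sketched with a mutex, a condition variable and a queue. SingleWriter and CountLinesWritten are hypothetical names for this example, and I write to an in-memory stream instead of a real file to keep it self-contained. Producers only hold the lock long enough to push a line; the slow write happens on the one writer thread without the lock held.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Funnels all writes through one thread; producers only touch the queue.
class SingleWriter {
 public:
  explicit SingleWriter(std::ostream* out)
      : out_(out), done_(false), worker_(&SingleWriter::Loop, this) {}

  ~SingleWriter() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();  // the writer drains whatever is still queued first
  }

  void Write(const std::string& line) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      pending_.push(line);
    }
    cv_.notify_one();
  }

 private:
  void Loop() {
    std::unique_lock<std::mutex> lock(mu_);
    for (;;) {
      cv_.wait(lock, [this] { return done_ || !pending_.empty(); });
      if (pending_.empty()) return;  // only reached once done_ is set
      std::string line = pending_.front();
      pending_.pop();
      lock.unlock();  // do the slow write without holding the lock
      *out_ << line << '\n';
      lock.lock();
    }
  }

  std::ostream* out_;
  bool done_;
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::string> pending_;
  std::thread worker_;  // declared last so it starts after the other members
};

// Has `producers` threads each queue `per_thread` lines, then returns the
// number of lines the writer thread emitted.
int CountLinesWritten(int producers, int per_thread) {
  assert(producers >= 0 && per_thread >= 0);
  std::ostringstream sink;
  {
    SingleWriter writer(&sink);
    std::vector<std::thread> threads;
    for (int p = 0; p < producers; ++p) {
      threads.emplace_back([&writer, per_thread] {
        for (int i = 0; i < per_thread; ++i) writer.Write("line");
      });
    }
    for (std::thread& t : threads) t.join();
  }  // the destructor joins the writer after the queue is drained
  int lines = 0;
  const std::string text = sink.str();
  for (char c : text) {
    if (c == '\n') ++lines;
  }
  return lines;
}
```

The queue itself is still a point of contention, but the critical section is a single push or pop rather than a whole file write.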

In some cases, you can make a single thread perform much better than multiple threads (well, I said single thread, but I’m lying). The idea is to saturate the thread as much as you can with CPU-bound tasks. You may have some other threads (see, I lied) that collect resources and write out data as the single thread completes its tasks. In a web server example, you can have one master thread. When an incoming HTTP connection arrives, another thread will turn the request into a task (perhaps a Request object?) and queue it at the master thread. The master thread may process one request, realize that it needs to read a file, and queue that task to a file reader thread, while at the same time returning the partially processed request to the queue and starting to process another request. Most bottlenecks here will come from the queue implementation, but this is a baby example so I’m going to stop at that. Another bottleneck appears if requests arrive faster than the master thread can process them, in which case we arrive at the classic example of dropped requests (a la network routers and switches). In that case, it may be time to consider having 2 master threads.

From the above examples, you can probably tell that this method of self-managing your “threads” (in this case, instead of threads, we have requests) allows for much faster processing as it reduces context-switching.

What about LWPs (light-weight processes)? Indeed… what about LWPs? :) LWPs are fast, I love them.

P.S. Do read the DDJ’s article above. It’s very thoughtful.