Archive for March, 2009

Copy and move semantics

Sunday, March 29th, 2009

Recently, I was asked some questions about smart pointers by an academic junior (by a few years). In the discussions, we came across terms like copy-constructible and move semantics. While these concepts may be straightforward for more experienced C++ developers, they can seem rather foreign to programmers who were brought up in the era of garbage-collected languages.

Let’s go through the easy part first: copy semantics. When we talk about copy semantics, we mean that calls like these:

vector<int> list;
<..>
vector<int> list2(list);  // Copy constructor

vector<int> list;

/* Thanks to Richie for providing a
   correct version of assignment. */
vector<int> list2;
list2 = list;  // Copy-on-assign

will result in list2 copying the contents of list. The first example shows a copy constructor: a constructor that takes as a parameter an object of the same type as the constructor’s class (roughly speaking) and copies the contents of the passed object. It generally means that changing the contents of list will not change the contents of list2, and vice versa. A class whose copy constructor fulfills copy semantics is sometimes called copy-constructible.

The second example also performs a copy operation. The operation is, however, implemented as an assignment operator. This is sometimes called copy-on-assign.

We should perhaps note that the default copy constructor and assignment operator generally do not fulfill copy semantics. In fact, they follow neither copy nor move semantics; they are neither here nor there. The defaults perform (more or less) a shallow copy, a mixed semantics if you like: they directly copy each field’s contents. If a field is of primitive type, this correctly performs a copy. If a field is of a type that correctly implements copy semantics, the default also correctly performs a copy; otherwise it will not adhere to proper copy semantics. Furthermore, if a field is of pointer type, the default will copy the address stored in the pointer. Hence, both the new object and the old one will hold the same address in that pointer field. Most of the time, this is not what you want.
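To make that concrete, here is a minimal sketch of a class holding a single pointer field that implements proper copy semantics by deep-copying. The class and member names (Buffer, data_) are made up for illustration; with the compiler-generated defaults, both objects would end up pointing at the same buffer.

#include <cstring>

// Hypothetical example: a tiny character-buffer holder with one pointer field.
class Buffer {
 public:
  explicit Buffer(const char* s) : data_(new char[std::strlen(s) + 1]) {
    std::strcpy(data_, s);
  }
  ~Buffer() { delete[] data_; }

  // Copy constructor: allocate fresh storage and copy the contents,
  // rather than the default behaviour of copying the pointer itself.
  Buffer(const Buffer& other)
      : data_(new char[std::strlen(other.data_) + 1]) {
    std::strcpy(data_, other.data_);
  }

  // Copy assignment: same idea, with a guard against self-assignment.
  Buffer& operator=(const Buffer& other) {
    if (this != &other) {
      char* copy = new char[std::strlen(other.data_) + 1];
      std::strcpy(copy, other.data_);
      delete[] data_;
      data_ = copy;
    }
    return *this;
  }

 private:
  char* data_;
};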

Remember not to rely on the defaults: either implement your own copy constructor and copy assignment, or make them private so that they will not be automatically generated. Boost has a base class called noncopyable that you may want to use (sketched after the example below). I prefer using a macro (the macro is bare-bones but shows the structure that we want):

// Declare (but do not define) the copy constructor and assignment
// operator; placed in a private section, any accidental use fails
// to compile from outside the class, and fails to link from inside it.
#define DISABLE_COPY_AND_ASSIGNMENT(classname) \
  classname(const classname&); \
  classname& operator=(const classname&)

// In some_class.h:
class SomeClass {
 public:
  <..>
 private:
  <..>
  DISABLE_COPY_AND_ASSIGNMENT(SomeClass);
};
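For completeness, the Boost alternative mentioned above would look roughly like this sketch (same hypothetical SomeClass; only the relevant parts are shown):

#include <boost/noncopyable.hpp>

// Privately inheriting from boost::noncopyable gives the class a base
// whose copy constructor and assignment operator are inaccessible, so
// the compiler cannot generate copying operations for SomeClass either.
class SomeClass : private boost::noncopyable {
 public:
  // ...
};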

Moving on, move semantics arise (more or less) because copy semantics impose a performance overhead due to memory allocation and initialization. In many cases you may simply want to hand an object over to another piece of code without performing an expensive copy operation. In the olden days, you would allocate the object you wished to pass on and hand over a raw pointer to it. However, this risks two major issues: (1) as long as you hold on to the pointer, you can still modify the object (a big issue when multi-threading is involved); (2) you have to perform manual memory management. Hence, we arrive at modern move semantics. :)

Today, most C++ developers are well-acquainted with smart pointers (if you do not know what they are, please Google them; they are important). One of the basic smart pointers in C++0x is called unique_ptr. A unique_ptr maintains (as its name implies) a unique pointer: while one instance of unique_ptr holds a pointer to an object, no other instance may hold a pointer to the same object. Furthermore, people have developed techniques to ensure that you never need to hold a raw pointer at all. All seems good. Well, not really. Smart pointers are usually allocated on the stack, which means that they die (along with the objects they hold) when they go out of scope. Here is where move semantics become useful:

unique_ptr<SomeClass> createSomeClass() {
  unique_ptr<SomeClass> ptr(new SomeClass(<..>));
  <..>
  return ptr;
}

// Somewhere else:
unique_ptr<SomeClass> a_ptr = createSomeClass();

Note that, technically, two hand-offs occur here (though many modern compilers may elide them as an optimization). First, when you return ptr, a unique_ptr constructor is called with ptr as its argument; the resulting temporary is then used to construct a_ptr. While it is tempting to call these copy constructions, the apt term is move construction. Roughly, a move constructor pilfers the pointer from the unique_ptr passed to it; the original unique_ptr no longer holds the pointer. Hence the term move semantics. Similarly, on assignment, the pointer is moved from one unique_ptr to another.
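To make the pilfering concrete, here is a toy sketch of what a move constructor and move assignment might look like for a simplified unique-pointer-like class. SimplePtr is a made-up name for illustration; the real unique_ptr (with its deleters and safety guarantees) is considerably more involved.

template <typename T>
class SimplePtr {
 public:
  explicit SimplePtr(T* p = 0) : ptr_(p) {}
  ~SimplePtr() { delete ptr_; }

  // Move constructor: steal the raw pointer from the source and leave
  // the source empty, so only one SimplePtr owns the object at a time.
  SimplePtr(SimplePtr&& other) : ptr_(other.ptr_) { other.ptr_ = 0; }

  // Move assignment: release what we currently own, then steal.
  SimplePtr& operator=(SimplePtr&& other) {
    if (this != &other) {
      delete ptr_;
      ptr_ = other.ptr_;
      other.ptr_ = 0;
    }
    return *this;
  }

  T* get() const { return ptr_; }

 private:
  T* ptr_;

  // Copying is disabled: a SimplePtr is move-only.
  SimplePtr(const SimplePtr&);
  SimplePtr& operator=(const SimplePtr&);
};

Setting other.ptr_ to 0 is what keeps ownership unique: after the move, only the destination's destructor will ever delete the object.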

Half the time, you would adhere closely to either copy or move semantics. However, sometimes you may want to consider partial copy semantics, e.g. a shallow copy. This is generally acceptable when the cost of following full copy semantics would be prohibitive. However, we usually do not mix copy semantics and move semantics in the same class: they generally don’t play well together and will confuse other developers. (There is no such thing as mixed copy/move semantics.)

Microsoft takes on browser benchmark

Friday, March 27th, 2009

Recently, a friend sent me a link to the video and methodology of Microsoft’s browser benchmark. I was a little bored, so I finally took the time to read the long PDF on the methodology. Here is a summary of my thoughts:

  • The issue highlighted is very valid: micro-benchmarks do not test end-user scenarios well; a more macro view is needed. Still, completely discounting micro-benchmarks is not exactly the right thing either. Micro-benchmarks do provide a good, practically unbiased (due to the lack of dependency on the network stack, web servers, load balancing, etc.) measure of a browser’s individual parts. A combination of micro- and macro-benchmarks would actually be best.
  • Moving on, the actual tests that Microsoft ran were very much geared towards measuring load time. Honestly, though, load time is getting less and less important today. We’re moving more and more towards heavy AJAX pages, where performance while the user is navigating within the page matters more and more. This includes (but is not restricted to) re-rendering speed as the user scrolls horizontally and vertically (or diagonally, for OS X users), Javascript processing followed by or combined with DOM manipulation along with the actual re-rendering of the changed DOM, canvas and animation (i.e. HTML5, CSS), etc.
  • Testing with pre-caching is reasonable, but it should not be treated as the only approach. If you’re doing lab tests, you could easily arrange not to cache content. Without caching, browser performance is affected for the worse, and such a test shows how good the browser is at exploiting parallelism (the context here involves things like parallel Javascript downloading, js-to-css blocking downloads, etc.). Lack of parallelism has been causing huge slow-downs in less modern browsers, though Chrome 1/2, FF3.1, Safari 4, and IE8 are all trying to fix this (see: http://stevesouders.com/ua/).
  • W.r.t. measurement overhead, there is actually one approach that completely avoids it: measurement using video recording. We can record the visual cues that the browsers produce and perform a manual/human comparison. This is very easy to do for web browsers, since the results of the computation are directly displayed on the screen. However, manual comparison may introduce inaccuracy, especially since the paper seems very intent on measuring down to tens of milliseconds. Hence, automating this and improving the accuracy of the timing would be awesome.
  • While we are talking about milliseconds, I disagree with this penchant for measuring browser load time down to tens of milliseconds (2 decimal places for timings in seconds). Users are not gonna care if the page loaded 50ms faster; just measure to the nearest hundred milliseconds (1 decimal place for timings in seconds).
  • The issue of extensibility completely contradicts the point highlighted in the first section: that a benchmark should test what users will experience. Right now, more and more users (especially Firefox users) rely on add-ons to improve their browsing experience. I guess extensibility should generally be addressed separately; however, it should not be discounted so completely. I know that Microsoft is trying to sell this thing (IE8), all right, so I guess it’s acceptable.
  • Inconsistent definitions: I really like this part; it is exactly as I imagine it should be. (The video-recording idea I suggested above is also based on my dislike of using browsers’ own onload mechanisms to determine that a page has fully loaded.)

[On inconsistent definitions:] Another factor that impacts benchmarking is having a consistent baseline of what it means to complete a task. In car and horse racing, the end is clear—there is a finish line that all contestants must cross to finish the race. Browser benchmarking needs a similar way to define when a webpage is done loading. The problem is that there is no standard which dictates when a browser can or should report that it is “done” loading a page—each browser operates in a different way, so relying on this marker could produce highly variable results.

Quoted from: Measuring Browser Performance (Microsoft)

Honestly, right now, I don’t really care which one is the fastest browser around. As long as they are in the same ballpark (not orders of magnitude slower), I care more about customization features. If you take a look at my two instances of Firefox (one FF3.1b3, the other FF3.0.8), both are heavily customized with heavy theming (yeah, one of them looks like Chrome) and tonnes of add-ons.

Oh, and being a Mac user, whatever IE develops does not directly affect me that much. ;)