Real Computer Science begins where we almost stop reading ...: Initialization issues in Programming

Saturday, 28 December 2013

Initialization issues in Programming

The Problem

Uninitialized variables are a perennial problem in C. Writing through uninitialized (but not necessarilly NULL) pointers is particularly nasty, as it can overwrite memory anywhere (notwithstanding hardware memory protection). Of course, any variable can be uninitialized so we want to ensure that all variables are set to some (correct) value before they are actually used.

Years of bad experiences with this problem means that experienced C programmers often use a form of defensive programming where steps are taken to avoid the possibility of using memory that has not been set to some value. For example, I have seen malloc() #defined so that the real malloc() is bypassed in favour of calloc(). The standard library function calloc() is useful because it initializes the allocated memory with zero, but it is not always appropriate.

Defensive Programming

Defensive programming is part of the culture of C. For example, it is common to see code like this:

for (i = 0; i < 10; ++i) // defensive
a[i] = f(i);

rather than:

for (i = 0; i != 10; ++i) // non-defensive
a[i] = f(i);

Now these should do the same thing unless by some strange action the variable 'i'

Note that I use the literal 10 for simplicity. In real code you would not use a magic number but something like sizeof(a).

somehow skips having the value 10, in which case the non-defensive code will go into an infinite loop corrupting memory until the program terminates due to a memory exception or something else nasty happens. How could 'i' possibly do this? Well stranger things have happened. It may be due to errant behavior of the function f().

So which is better? Well, at least when debugging, the non-defensive code is definitely better as the defensive code will mask the symptoms of the bug in f() and you may never be aware of it.

For a release build the defensive code is arguably better. But surely it is better to find and fix the bug (a nasty one probably lurking in f()) than try to recover (perhaps unsuccesfully) from it?

Don't get me wrong. I think defensive programming is useful. It is preferable to use software that might behave a little strangely once in a while, as long as it doesn't crash and lose all your data. However, as we saw, defensive programming can hide bugs. So you might find that you were using a word processor which did some odd things but seemed to recover and allowed you to save your document - but you might not discover till much later that the document you saved was actually corrupted.

The best strategy is to be defensive and (at least in debug builds) also try to detect the bugs you are defending against. I use code like this:

for (i = 0; i < 10; ++i)
a[i] = f(i);
assert(i == 10);

Uninitialized Variables

To get back to the point of this article consider a function like this:

void g()
{
int i;
....
i = f();
...
use(i);
...

The worry is that the initialization of 'i' is bypassed somehow and then 'i' is used with an invalid value. Personally, I think, as far as possible, variables should be initialized with their proper value when they are declared.

int i = f();
...
use(i);
...

but this is not always possible in conventional C (at least in the more commonly used C-89) which requires you to declare variables at the start of a compound statement.

You can still have the problem in C++ if the initial value of all members is not known at the time of construction.

Luckily, this is not required in C++ and is even actively discouraged. A greater part of the problem is that convention (and many C coding standards) require all local variables to be declared at the start of the function.

Debugability

The result is that many C programmers take the shotgun approach:

void g()
{
int i = 0; // shotgun initialization
....
i = f();
...
use(i);
...

The only thing is that zero is not the correct initial value for 'i'. The only good to come from this is that this makes the value of 'i' deterministic, since otherwise it could contain any random value that happened to be left over in the memory that is used for the local variables.

Making the code deterministic makes it more debugable (see my previous blog post on software quality attributes). So it can save time in tracking down a bug, because it is easy to reproduce. However, it can also make the code less verifiable since it can hide the symptoms of the bug. It may be better to initialize the variable to some strange value so that it is obvious when it has not been initialized properly.

This is the reason that the Microsoft VC++ debug code initializes all local variables with the hex value 0xCC (uninitialized heap memory is filled with 0xCD). This obviates the need to do shotgun initialization in that environment since the run-time behavior cannot randomly change.

Disadvantages

We saw that shotgun initialization has the advantage in increasing code debugability, but there are many disadvantages?

First, it can make the bug harder to detect (the software is less verifiable). If you are debugging and 'i' has the (incorrect) value of zero you may think that it has been set correctly. But if it contained 0xCCCCCCCC your interest should be immediately aroused, and it's more likely to trigger an assertion or a run-time error.

Another problem is when you initialize to some arbitrary value it is harder for the compiler to tell you that something is wrong. Many compilers will attempt to warn if you use a variable that has not been initialized.

Further, if the code that uses 'i' is removed at a later date it will prevent many compilers from warning of an unused local variable. Because 'i' is initialized it is not flagged as being unused.

However, my biggest concern is that the code becomes confusing. The initialization of 'i' to zero serves no purpose and may confuse someone reading the code. One of the biggest code smells is finding redundant code. I have often found that such redundancy inhibits changes to the code and can itself lead to bugs.

Last (and least), it is less efficient to write the same memory twice - once with zero and then again with the real initial value. One of the few things I dislike about C# is that you cannot avoid having an array initialized to zeros, even if you are going to set the elements to some other value. For a large array this can take an appreciable amount of time.

The Bug is The Problem

The bottom line is that rather than trying to handle bugs gracefully, the first priority is not to have bugs. Shotgun initialization can even hide bugs and makes the code harder to understand (and hence modify). There are few, if any, advantages to using it and these are far outweighed by the disadvantages.

No comments:

Inspiring Quotes

An inspiring quote may be just what you need to turn your day around. Here are some of the most inspiring quotes ever spoken or written.

I hated every minute of training, but I said, “Don’t quit. Suffer now and live the rest of your life as a champion.”

–Muhammad Ali

“You can have anything you want if you are willing to give up the belief that you can’t have it.”
–Robert Anthony

“There is no man living that can not do more than he thinks he can.”

–Henry Ford

“The best way to predict the future is to create it.”

–Dr. Forrest C. Shaklee

“It’s not about time, it’s about choices. How are you spending your choices?”

–Beverly Adamo

“Success…seems to be connected with action. Successful people keep moving. They make mistakes, but they don’t quit.”
–Conrad Hilton

“Destiny is not a matter of chance; it’s a matter of choice.”

–Anonymous

“The future belongs to those who believe in the beauty of their dreams.”
–Eleanor Roosevelt

“The quality of a person’s life is in direct proportion to their commitment to excellence, regardless of their chosen field of endeavor.”
–Vince Lombardi

“It is never too late to be what you might have been.”
–George Eliot

“Do not let what you can not do; interfere with what you can do.”
–John Wooden

“One man with courage makes a majority.”
–Andrew Jackson

“Failure is the opportunity to begin again more intelligently.”
–Henry Ford

“Try not to become a man of success but rather try to become a man of value.”
–Albert Einstein

“The mind is its own place, and in itself can make a heaven of Hell, a hell of Heaven.”

–John Milton

"If u are student, working and preparing give a little extra effort after regular work. A small sacrifice of TV time, fun time, or facebook time can bring a lot of better things to life than you ever imagined."

-- Naam likhna jaroori nai samajhta.

Thank you for reading, be sure to pass this along!

Saturday, 28 December 2013

Initialization issues in Programming

No comments:

Post a Comment