Generally, trying to recover from division by zero is a mistake. In almost all cases it is better for the run-time to terminate the program (perhaps after first triggering an assertion).
Note that I am not saying that you should be oblivious to the
possibility of division by zero. I am simply saying that trying to
recover from it at the point of the error is usually a mistake.
Conventional Wisdom
When I first began learning about programming, the first thing that was
drummed into us was that it is not enough to get your program working:
you also need to cater for error conditions.
The first example my lecturer gave of error handling was division by zero.
Floating Point Division By Zero
Throughout this post I am talking about integer division by zero, not
floating point, unless otherwise stated. However, the same arguments
generally apply to floating point calculations.
The major difference is that floating point division by zero will
usually not terminate the program but instead generate an infinite
result. (Most implementations nowadays provide floating point numbers
that include positive and negative infinity.)
The guidance was simply to make sure it cannot occur. A few years later I
became an aficionado of defensive programming, and my policy became to
add extra code before every division to check for the possibility of the
divisor being zero and somehow recover from the situation - usually by
simply setting the result of the operation to zero.
Most experienced C programmers take this approach, but I have found that
it is almost always wrong. Let's first look at the situations where
division by zero may occur, then consider each one in more detail.
- a bug causes the divisor to take a zero value when it should never be zero
- a zero divisor resulting from user input
- incorrect data from an external source
- in rare cases division by zero may be mathematically valid and handled specially
Bugs
Most of the time the problem occurs due to bugs in other parts of the
code that have slipped through. It is often argued that the problem
should be detected and handled something like this:
if (numRequests > 0)
    aveTime = totalTime / numRequests;
else
    aveTime = 0;        // Bad idea
The problem with this is that the bug is now silently hidden. Perhaps
that is acceptable for the final release of the software, but it is
certainly not good when debugging and testing. It's better to find and
fix the bug than to cover it up. This is the problem with defensive
programming, which I talked about in a previous post.
Personally, I would just leave the test out altogether and let the run-time system terminate the program. This is the fail fast approach.
But I would also add an assertion, especially if using floating point
values, since some implementations may not terminate but instead
generate infinity, which is probably not what was desired.
assert(numRequests > 0);
aveTime = totalTime / numRequests;
This should be adequate with good design, thorough testing and software that is adequately verifiable (see my post on verifiability).
But with badly written software you may not be certain that a bug has
not slipped through, so the only alternative is to try to recover. The
trouble is that continuing with a strange value may cause subsequent
problems or even data corruption. In the above case it may be that aveTime should always be greater than zero.
Having studied many of these situations, I have found that it can be
very difficult to decide on a value that makes sense for the continued
safe and sensible operation of the software. For example, it may make
most sense to set aveTime to some very large value (since, mathematically speaking, dividing by zero produces an infinite result); but if totalTime and numRequests are both zero then aveTime probably should also be zero.
if (numRequests > 0)
    aveTime = totalTime / numRequests;
else
{
    assert(0);              // Don't hide this bug in debug/test builds
    // Try to recover sensibly in case a bug was never found
    if (totalTime == 0)
        aveTime = 0;
    else
        aveTime = INT_MAX;  // from <limits.h>
}
The other alternative, in C++, is to throw an exception and let the software recover at a higher level.
User Input
Sometimes the divisor in a calculation comes from user input. It is
important to validate user input when it is entered; not validating can
cause inconsistencies in the data which later lead to problems like
division by zero.
int numberOfItems;
for (;;)
{
    numberOfItems = GetNumberOfItemsFromUser();
    if (numberOfItems > 0)
        break;
    DisplayErrorMessage("You must have at least one item");
}
...
aveCost = totalCost / numberOfItems; // No divide by zero here
Of course, a lot can happen between the input validation and the use of
the value. If it is at all possible that the value could be corrupted,
or the validation bypassed, then the previous section (Bugs) again applies.
Bad Data
A lot of programs carefully validate user input but assume that data
from other sources is valid. Unless you are sure that the data is valid
- for example, because it is protected by a CRC - you should validate
data when you receive it.
Data can be corrupted by many things, such as hardware or software
failure or human error. In software with security implications,
deliberate tampering may also be an issue, and a CRC is not sufficient -
use a cryptographic hash such as SHA1.
Expected
In very rare cases division by zero may not actually be an error
condition, in which case you may need to handle it specially. This is
the reason that IEEE floating point numbers allow for infinities.
Generally, this sort of code would be for a specialized scientific or
mathematical purpose and would use floating point numbers anyway.
Otherwise, it could be handled like this:
if (elapsedTime == 0)
{
    isInfiniteSpeed = true;
    speed = -1;
}
else
{
    speed = distance / elapsedTime;
    isInfiniteSpeed = false;
}
Of course, any later code that uses speed would also need to check isInfiniteSpeed first.
Conclusion
Generally, it is a mistake to detect and try to recover from a divide by
zero error. Except in the very unusual situation where it is not an
error condition (see the Expected section above), it indicates there was a problem earlier, such as a bug, corrupt data, or user input that was not validated.
With verifiable and thoroughly tested software the problem should not
happen. Trying to recover is actually detrimental as it may hide a bug
which would normally be found in testing.
For poorly written software (not well written and not easily verifiable)
it might be worthwhile trying to recover from the problem. Recovery
would also be necessary in fail-safe systems where software termination
could have dire consequences.
The problem with trying to recover is that there may not be a reasonable
value to use in the circumstances. It is conventional to use a value of
zero, but often this is the worst possible value to use.
The most important point is that, if you do recover from a divide by zero error, you do not hide the fact that there is a defect
in the code. During debugging and testing, the software should generate
an exception or otherwise make it immediately obvious that there is a
problem with the code. For released software the problem should be
detected and reported - for example, to a monitored error log file.