Bharat Banate's Work Profile

Thursday, September 6, 2007

News: The World Will End on January 19, 2038

On January 19, 2038, that is precisely what's going to happen.

For the uninitiated, time_t is a data type used by C and C++ programs to represent dates and times internally. (You Windows programmers out there might also recognize it as the basis for the CTime and CTimeSpan classes in MFC.) time_t is actually just an integer, a whole number, that counts the number of seconds since January 1, 1970 at 12:00 AM Greenwich Mean Time. A time_t value of 0 is 12:00:00 AM (exactly midnight) 1-Jan-1970, a time_t value of 1 is 12:00:01 AM (one second after midnight) 1-Jan-1970, and so on. Since one year lasts a little over 31 000 000 seconds, the time_t representation of January 1, 1971 is about 31 000 000, the time_t representation of January 1, 1972 is about 63 000 000, and so on.


If you're confused, here are some example times and their exact time_t representations:

Date & time                   time_t representation
01-Jan-1970, 12:00:00 AM GMT                      0
01-Jan-1970, 12:00:01 AM GMT                      1
01-Jan-1970, 12:01:00 AM GMT                     60
01-Jan-1970, 01:00:00 AM GMT                   3600
02-Jan-1970, 12:00:00 AM GMT                  86400
03-Jan-1970, 12:00:00 AM GMT                 172800
01-Feb-1970, 12:00:00 AM GMT                2678400
01-Mar-1970, 12:00:00 AM GMT                5097600
01-Jan-1971, 12:00:00 AM GMT               31536000
01-Jan-1972, 12:00:00 AM GMT               63072000
01-Jan-2003, 12:00:00 AM GMT             1041379200
01-Jan-2038, 12:00:00 AM GMT             2145916800
19-Jan-2038, 03:14:07 AM GMT             2147483647
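
If you'd rather check those figures than take them on faith, here is a minimal sketch in Python (my choice of language for illustration; the article itself is about C and C++). It uses the standard library's calendar.timegm, which converts GMT date fields to seconds since the epoch. The names TABLE and to_time_t are made up for this example.

```python
import calendar

# GMT date fields (year, month, day, hour, minute, second) paired with
# the time_t value listed for them in the table above.
TABLE = [
    ((1970, 1, 1, 0, 0, 0), 0),
    ((1970, 1, 1, 0, 0, 1), 1),
    ((1970, 1, 1, 0, 1, 0), 60),
    ((1970, 1, 1, 1, 0, 0), 3600),
    ((1970, 1, 2, 0, 0, 0), 86400),
    ((1970, 1, 3, 0, 0, 0), 172800),
    ((1970, 2, 1, 0, 0, 0), 2678400),
    ((1970, 3, 1, 0, 0, 0), 5097600),
    ((1971, 1, 1, 0, 0, 0), 31536000),
    ((1972, 1, 1, 0, 0, 0), 63072000),
    ((2003, 1, 1, 0, 0, 0), 1041379200),
    ((2038, 1, 1, 0, 0, 0), 2145916800),
    ((2038, 1, 19, 3, 14, 7), 2147483647),
]


def to_time_t(fields):
    """Return seconds since 1-Jan-1970 00:00:00 GMT for a GMT date tuple."""
    return calendar.timegm(fields)


# Compare every row of the table against the computed value.
for fields, expected in TABLE:
    assert to_time_t(fields) == expected, (fields, expected)
print("all table rows match")
```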

By the year 2038, the time_t representation for the current time will be over 2 140 000 000. And that's the problem. A modern 32-bit computer stores a "signed integer" data type, such as time_t, in 32 bits. The first of these bits is used for the positive/negative sign of the integer, while the remaining 31 bits are used to store the number itself. The highest number these 31 data bits can store works out to exactly 2 147 483 647. A time_t value of this exact number, 2 147 483 647, represents January 19, 2038, at 7 seconds past 3:14 AM Greenwich Mean Time. So, at 3:14:07 AM GMT on that fateful day, every time_t used in a 32-bit C or C++ program will reach its upper limit.
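
To make the arithmetic concrete, here is a small Python sketch (illustrative only; as_bits and EPOCH are names invented for this example). It computes the largest value 31 data bits can hold, shows the bit pattern, and converts that count of seconds back into a GMT date.

```python
from datetime import datetime, timedelta, timezone

# The time_t epoch: midnight GMT on 1-Jan-1970.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest value a signed 32-bit integer can hold: all 31 data bits set.
INT32_MAX = 2**31 - 1


def as_bits(n):
    """Render a non-negative integer as a 32-character binary string."""
    return format(n, "032b")


print(INT32_MAX)                             # 2147483647
print(as_bits(INT32_MAX))                    # a 0 sign bit, then thirty-one 1s
print(EPOCH + timedelta(seconds=INT32_MAX))  # 2038-01-19 03:14:07+00:00
```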

One second later, on 19-January-2038 at 3:14:08 AM GMT, disaster strikes.

What will the time_t's do when this happens?
When a signed integer reaches its maximum value and then gets incremented, it wraps around, in practice, to its lowest possible negative value. (The reasons for this have to do with a binary notation called "two's complement"; I won't bore you with the details here.) This means a 32-bit signed integer, such as a time_t, set to its maximum value of 2 147 483 647 and then incremented by 1, will become -2 147 483 648. Note the "-" sign at the beginning of this large number. A time_t value of -2 147 483 648 would represent December 13, 1901 at 8:45:52 PM GMT.
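
The wraparound can be simulated without a 32-bit machine. The Python sketch below is only a model of two's-complement behavior, not what any particular C library does; wrap_int32 and gmt are hypothetical helper names. It counts across the critical second and shows what a 32-bit signed counter would hold at each tick.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wrap_int32(n):
    """Reduce n to the value a two's-complement 32-bit signed integer would hold."""
    return (n + 2**31) % 2**32 - 2**31


def gmt(seconds):
    """Convert a (possibly negative) count of seconds since the epoch to a GMT datetime."""
    return EPOCH + timedelta(seconds=seconds)


# Tick through the last two good seconds and the first two bad ones.
for tick in range(2147483646, 2147483650):
    stored = wrap_int32(tick)
    print(tick, stored, gmt(stored))
```

The third line of output is the interesting one: the true count is 2147483648, but the stored value is -2147483648, which lands on 13-Dec-1901 at 20:45:52 GMT.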

So, if all goes normally, 19-January-2038 will suddenly become 13-December-1901 in every time_t across the globe, and every date calculation based on this figure will go haywire. And it gets worse. Most of the support functions that use the time_t data type cannot handle negative time_t values at all. They simply fail and return an error code. Now, most "good" C and C++ programmers know that they are supposed to write their programs in such a way that each function call is checked for an error return, so that the program will still behave nicely even when things don't go as planned. But all too often, the simple, basic, everyday functions they call will "almost never" return an error code, so an error condition simply isn't checked for. It would be too tedious to check everywhere; and besides, the extremely rare conditions that result in the function's failure would "hardly ever" happen in the real world. (Programmers: when was the last time you checked the return value from printf() or malloc()?) When one of the time_t support functions fails, the failure might not even be detected by the program calling it, and more often than not this means the calling program will crash. Spectacularly.

What about making time_t unsigned in 32-bit software?
One of the quick-fixes that has been suggested for existing 32-bit software is to re-define time_t as an unsigned integer instead of a signed integer. An unsigned integer doesn't have to waste one of its bits to store the plus/minus sign for the number it represents. This doubles the range of numbers it can store. Whereas a signed 32-bit integer can only go up to 2 147 483 647, an unsigned 32-bit integer can go all the way up to 4 294 967 295. A time_t of this magnitude could represent any date and time from 12:00:00 AM 1-Jan-1970 all the way out to 6:28:15 AM 7-Feb-2106, surely giving us more than enough years for 64-bit software to dominate the planet. It sounds like a good idea at first. We already know that most of the standard time_t handling functions don't accept negative time_t values anyway, so why not just make time_t into a data type that only represents positive numbers?
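
The 2106 figure is easy to confirm. A quick Python sketch (illustrative; UINT32_MAX and EPOCH are just names chosen here) converts the largest unsigned 32-bit value into a GMT date:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest value an unsigned 32-bit integer can hold: all 32 bits set.
UINT32_MAX = 2**32 - 1

last_unsigned_moment = EPOCH + timedelta(seconds=UINT32_MAX)

print(UINT32_MAX)            # 4294967295
print(last_unsigned_moment)  # 2106-02-07 06:28:15+00:00
```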

Well, there's a problem. time_t isn't just used to store absolute dates and times. It's also used, in many applications, to store differences between two date/time values, i.e. to answer the question of "how much time is there between date A and date B?". (MFC's CTimeSpan class is one notorious example.) In these cases, we do need time_t to allow negative values. It is entirely possible that date B comes before date A. Blindly changing time_t to an unsigned integer will, in these parts of a program, make the code unusable. You'd fix one set of bugs (the Year 2038 Problem) only to introduce a whole new set (time differences not being computed properly).
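
Here is roughly what that new set of bugs would look like. This Python sketch models 32-bit arithmetic by hand; diff_signed32 and diff_unsigned32 are hypothetical helpers, not functions from any real library. Date B is one minute before date A, so the honest answer to "how much time from A to B?" is -60 seconds.

```python
def diff_signed32(later, earlier):
    """later - earlier, as a signed 32-bit time_t would store it."""
    return (later - earlier + 2**31) % 2**32 - 2**31


def diff_unsigned32(later, earlier):
    """later - earlier, as an unsigned 32-bit time_t would store it."""
    return (later - earlier) % 2**32


date_a = 1041379200    # 01-Jan-2003, 12:00:00 AM GMT
date_b = date_a - 60   # one minute before date A

print(diff_signed32(date_b, date_a))    # -60
print(diff_unsigned32(date_b, date_a))  # 4294967236
```

With an unsigned type, "one minute earlier" turns into "about 136 years later", which is exactly the kind of quiet miscalculation the quick fix would introduce.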

Not very obvious, is it?
The greatest danger with the Year 2038 Problem is its invisibility. The more-famous Year 2000 is a big, round number; it only takes a few seconds of thought, even for a computer-illiterate person, to imagine what might happen when 1999 turns into 2000. But January 19, 2038 is not nearly as obvious. Software companies will probably not think of trying out a Year 2038 scenario before doomsday strikes. Of course, there will be some warning ahead of time. Scheduling software, billing programs, personal reminder calendars, and other such pieces of code that set dates in the near future will fail as soon as one of their target dates exceeds 19-Jan-2038, assuming a time_t is used to store them. But the healthy paranoia that surrounded the search for Year 2000 bugs will be absent. Most software development departments are managed by people with little or no programming experience. It's the managers and their V.P.s that have to think up long-term plans and worst-case scenarios, and insist that their products be tested for them. Testing for dates beyond January 19, 2038 simply might not occur to them. And, perhaps worse, the parts of their software they had to fix for Year 2000 Compliance will be completely different from the parts of their programs that will fail on 19-Jan-2038, so fixing one problem will not fix the other.
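
Testing for this doesn't have to be elaborate. As a rough sketch, assuming the dates in question end up in a signed 32-bit time_t, a check like the following Python function (fits_in_int32_time_t is a made-up name for illustration) could flag any scheduled date that falls outside the representable range:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fits_in_int32_time_t(when):
    """Return True if a timezone-aware datetime fits in a signed 32-bit time_t."""
    seconds = (when - EPOCH) // timedelta(seconds=1)
    return INT32_MIN <= seconds <= INT32_MAX


# A reminder set for the last safe second, and one set a second later.
last_safe = datetime(2038, 1, 19, 3, 14, 7, tzinfo=timezone.utc)
first_bad = datetime(2038, 1, 19, 3, 14, 8, tzinfo=timezone.utc)

print(fits_in_int32_time_t(last_safe))  # True
print(fits_in_int32_time_t(first_bad))  # False
```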

2 comments:

Anonymous said...

If you'd like to see this happen for yourself, I have a way for Unix/Linux users.
Below is a Perl program. Type it in, run it, and see the output!

#!/usr/bin/perl

use strict;
use warnings;
use POSIX;

# Set the time zone to GMT (Greenwich Mean Time)
# for date calculations.
$ENV{'TZ'} = "GMT";

# Count up in seconds of Epoch time just before and
# after the critical event, printing the corresponding
# Gregorian calendar date for each value.
# Are the date and time outputs correct after the
# critical second?
for (my $clock = 2147483641; $clock < 2147483651; $clock++)
{
    print ctime($clock);
}

Anonymous said...

I tried the above program on RH Linux 9 as well as HP-UX.

On RH Linux 9, after the critical second, the date wraps back to Dec 13, 1901.

On HP-UX, however, the date and time simply stay at Jan 19, 2038 3:14:07. Even after the critical second, the same time is displayed.