If you work in IT today, the chances are that you joined the field within the past 25 years, after the flip from the year MDCCCCLXXXXVIIII (or the shorter medieval form MCMXCIX) to the millennial year MM.
In today’s notation, that was when the year 1999 rolled over into the year 2000, or Y2K for short.
The very second at which Y2K commenced is perhaps best remembered as the danger moment for the infamous millennium bug or Y2K bug, still commonly referenced in cyber-safety and cybersecurity discussions.
In truth, it wasn’t so much a millennium bug as a century bug, or, in a mixture of computer science and mathematical parlance, an integer overflow problem caused by calculations done modulo 100.
The words “modulo 100” refer to the sort of arithmetic you get if, after every calculation in a sequence, you retain just the last two digits of the intermediate result, which is equivalent to dividing by 100 and keeping only the remainder.
This is also known as clock arithmetic, for obvious reasons, except that with clocks we divide by 24 and keep the remainder, so that two minutes after 23:59 on New Year’s Eve, the time won’t be 24:01, but will instead be 00:01 on New Year’s Day.
The clock starts again from the top (quite literally in the case of clocks with hands) at midnight each morning, and we keep track of the wraparound by bumping the calendar along by one day at the same instant.
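Here’s a minimal sketch in Python (the snippet and its variable names are ours, purely for illustration) of both kinds of remainder-keeping: two-digit years done modulo 100, and a clock done modulo one day’s worth of minutes:

```python
# Modulo-100 arithmetic: keep only the last two digits of each year.
for year in (1899, 1999, 2000, 2025):
    print(year, "->", year % 100)        # 1899 -> 99, 1999 -> 99, 2000 -> 0, 2025 -> 25

# Clock arithmetic: minutes since midnight, wrapped modulo one full day.
MINUTES_PER_DAY = 24 * 60
now = 23 * 60 + 59                             # 23:59 on New Year's Eve
later = (now + 2) % MINUTES_PER_DAY            # two minutes later, wrapped at midnight
days_to_carry = (now + 2) // MINUTES_PER_DAY   # the wraparound that bumps the calendar
print(f"{later // 60:02d}:{later % 60:02d} plus {days_to_carry} day")   # 00:01 plus 1 day
```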
As fussy as this sounds, you’d probably think that it’s exactly the sort of algorithm to which computers would be ideally suited.
Unfortunately, simple calendars with long-term precision are difficult to devise.
For example, our months aren’t all the same length, not least because neither lunar months nor solar years, which individually or in combination make a useful basis for long-term timekeeping, last an exact number of days.
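To put rough numbers on that awkwardness (using widely quoted approximate mean values that we’ve assumed here, rather than figures from this article):

```python
# Widely quoted approximate mean values (assumed here for illustration only).
TROPICAL_YEAR_DAYS = 365.2422    # mean solar ("tropical") year
SYNODIC_MONTH_DAYS = 29.5306     # mean lunar ("synodic") month

print(TROPICAL_YEAR_DAYS % 1)                          # ~0.24 of a day left over per year
print(12 * SYNODIC_MONTH_DAYS)                         # ~354.4 days in 12 lunar months
print(TROPICAL_YEAR_DAYS - 12 * SYNODIC_MONTH_DAYS)    # ~11 days of drift every year
```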
Even more confusingly, there are two different but useful ways to track what we know as a day: the sidereal way, and the solar way.
A sidereal day is the time it takes for the Earth to rotate once and thus to bring the stars (excluding the sun) back to the same place in the night sky, a reassuringly constant sort of thing to keep track of.
The second-closest star to Earth is more than 250,000 times further away than the sun, so the movements of the solar system in our galaxy, and of our galaxy in the universe, don’t make a noticeable difference in day-to-day sidereal observations.
A solar day, in contrast, is determined by the sun, and lasts approximately four minutes longer than a sidereal day.
It is the solar day that we split by convention into 24 hours of equal length.
Each solar day corresponds to a sidereal day plus the additional time needed for the Earth, which itself goes about 1/365th of the way around the sun in that period, to rotate a little bit extra to make up for its orbital advance, and thus to bring the sun back to its highest point in the sky.
This isn’t the same point every day, because the sun rises higher in summer than in winter, but it is still something that can be tracked, albeit by using the sun’s shadow rather than by looking directly at it. (Never do that!)
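As a back-of-envelope sketch of that “approximately four minutes” figure (assuming the commonly quoted value of about 365.2422 solar days per year):

```python
SOLAR_DAY_S = 24 * 60 * 60       # 86400 seconds, the length of our civil day
DAYS_PER_YEAR = 365.2422         # approximate number of solar days in a year

# Over a year the Earth spins once more relative to the stars than relative to
# the sun, so a year contains (DAYS_PER_YEAR + 1) sidereal days.
sidereal_day_s = SOLAR_DAY_S * DAYS_PER_YEAR / (DAYS_PER_YEAR + 1)

print(round(sidereal_day_s))                          # ~86164 s, about 23 h 56 min 4 s
print(round((SOLAR_DAY_S - sidereal_day_s) / 60, 2))  # ~3.93 min: the "four minutes"
```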
Just to make things even more confusing, solar days vary in length slightly throughout the year.
Firstly, the Earth doesn’t go round the sun in a perfect circle, because our orbit is very slightly elliptical.
The Earth speeds up a bit as it gets closer to the sun, and slows down as it moves further away, with its fastest orbital speed, in January, being about 3.5% faster than its slowest, in July.
The faster the Earth goes, the further it moves along its orbit in a day, and therefore the longer it takes to rotate enough to bring the sun back to its highest point in the sky to complete the solar day.
The slower the progress of the Earth, the shorter the solar day.
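Here’s a rough sketch of how big that effect is, assuming an orbital eccentricity of about 0.0167 and the simple first-order rule that the Earth’s daily progress around its orbit varies by about twice the eccentricity either side of the mean (the second effect, described next, is ignored):

```python
ECCENTRICITY = 0.0167            # approximate eccentricity of the Earth's orbit
DAYS_PER_YEAR = 365.2422
SIDEREAL_DAY_S = 86164.1         # from the sketch above
SECS_PER_DEGREE = SIDEREAL_DAY_S / 360    # time to rotate through one degree

mean_motion = 360 / DAYS_PER_YEAR         # ~0.986 degrees of orbit covered per day

# First-order approximation: daily orbital progress is about 2e faster than the
# mean near perihelion (January) and about 2e slower near aphelion (July).
for label, factor in (("January", 1 + 2 * ECCENTRICITY), ("July", 1 - 2 * ECCENTRICITY)):
    extra_spin = mean_motion * factor                  # extra rotation needed, in degrees
    solar_day_s = SIDEREAL_DAY_S + extra_spin * SECS_PER_DEGREE
    print(label, round(solar_day_s - 86400, 1), "seconds versus a mean 24-hour day")
```

In other words, eccentricity alone stretches or shrinks the solar day by only a handful of seconds, but those seconds accumulate over the weeks into swings of many minutes.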
Secondly, the Earth’s axis is tilted at about 23.4 degrees, which is why globes are generally mounted at that angle, to remind us why there is less daylight in winter than in summer, and why the seasons are reversed in the southern hemisphere.
The sun takes a varying amount of time to get to its highest point that marks the middle of each day, as the tilt of the Earth relative to the sun changes throughout the year.
The Earth is tilted most directly away from or towards the sun at the solstices, and is exactly side-on to the sun at both equinoxes, when night and day are the same length all over the planet.
As an aside, in case you’ve ever wondered where the Mean in the name Greenwich Mean Time (GMT) comes from, it’s because GMT sticks to the mean length of the solar day, averaged over the whole year.
This avoids the hassle of adjusting our clocks back and forth every day, leaving sundials to read variously fast or slow as the year progresses.
Many sundials include a handy chart known as the equation of time to tell you just how much they will be ahead or behind your watch or your mobile phone through the year.
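If you’d like to compute it rather than read it off a chart, here is one commonly quoted textbook-style approximation (the formula and its constants are an assumption on our part, not something taken from this article), giving the offset in minutes for day N of the year:

```python
import math

def equation_of_time_minutes(day_of_year: int) -> float:
    """Roughly how many minutes a sundial reads ahead of (positive) or behind
    (negative) mean clock time on the given day of the year."""
    b = math.radians(360 * (day_of_year - 81) / 364)
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)

print(round(equation_of_time_minutes(45), 1))    # mid-February: sundial roughly 15 minutes slow
print(round(equation_of_time_minutes(307), 1))   # early November: sundial roughly 16 minutes fast
```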
With all the complexity involved in measuring times and dates, and in inventing and maintaining a civil calendar in which those dates maintain a consistent relationship with the seasons of the year, which is handy for cultures that practice settled agriculture…
…you’d be forgiven for assuming that the best-known date-related bug in the computer era would relate to some weird and unforgiving corner case, as programmers call them, involving some abstruse combination of the variations described above.
Instead, the infamous Y2K bug was simply a side-effect of trying to save precious bytes of memory and disk storage by “compressing” dates to the form YYMMDD, and assuming that the true date could be reconstructed simply by inserting the two digits 19 at the start.
Saving two bytes for each date sounds like a pointless exercise today, but in the 1960s and 1970s, even top-end mainframe computers typically had just tens of kilobytes of RAM to play with, and just tens of megabytes of space on each hard disk, about a million times less than the storage available on a typical consumer laptop today.
Dates were therefore routinely handled “modulo 100”, by storing the last two digits only.
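Here’s a minimal sketch of the trap (the helper functions below and their names are hypothetical, just for illustration):

```python
def compress(year: int, month: int, day: int) -> str:
    """Store a date in six characters, keeping only the last two digits of the year."""
    return f"{year % 100:02d}{month:02d}{day:02d}"

def expand(yymmdd: str) -> str:
    """Reconstruct the 'full' date by blindly bolting '19' onto the front."""
    return "19" + yymmdd[0:2] + "-" + yymmdd[2:4] + "-" + yymmdd[4:6]

print(expand(compress(1999, 12, 31)))   # 1999-12-31: looks fine
print(expand(compress(2000, 1, 1)))     # 1900-01-01: a full century out
# Worse still, the compressed strings now sort in the wrong order:
print(compress(2000, 1, 1) < compress(1999, 12, 31))   # True, as if 2000 came before 1999
```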
To be fair, few if any of the programmers writing code in 1969 thought that their software – or, for that matter, code derived directly from it – would still be in use in 1979, let alone in 1999.
So the theoretical time-warp when computer clocks ticked over from 99-12-31T23:59:59Z to 00-01-01T00:00:00Z, one hundred years in the past, was ignored for many years.
Even when it became obvious that plenty of old-school code (and, more importantly, lots of old-school YYMMDD data) was still going to be in play when AD 2000 rolled around, the “fix” was often just to move the problem to a different range of dates.
For example, you could adjust your century window such that the years 20-99 denoted 1920-1999, while 00-19 were assumed to refer to 2000-2019, thus sweeping the bug to one side instead of fixing it properly.
Indeed, many software tools still allow each user to set their own century-long window into which two-digit years should be mapped, so that ambiguous data won’t be flagged as “untrustworthy”, but will instead be rewritten to give it what can only be described as unwarranted accuracy.
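A rough illustration of that sort of sliding century window (the function and the 2020 pivot below are our own hypothetical choices, not taken from any particular product):

```python
def expand_two_digit_year(yy: int, pivot: int = 2020) -> int:
    """Map a two-digit year into the 100-year window ending just before 'pivot':
    with the default pivot, 20-99 become 1920-1999 and 00-19 become 2000-2019."""
    century_start = pivot - pivot % 100    # 2000 for a pivot of 2020
    year = century_start + yy
    return year if year < pivot else year - 100

print(expand_two_digit_year(99))   # 1999
print(expand_two_digit_year(5))    # 2005
print(expand_two_digit_year(25))   # 1925 -- even though the data may well have meant 2025
```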
As you can imagine, when software bugs of this sort are treated as if they have been “solved”, rather than merely hidden from view temporarily, the natural human reaction is to stop worrying about them, and to stop taking them into account when faced with potentially invalid data.
Fortunately, as anyone who was on duty on Millennium Eve will remember, the IT meltdown at the very stroke of midnight that had been predicted by doomsayers didn’t happen – not at the local time in each country, and not globally at midnight in Coordinated Universal Time (UTC).
The underlying issue, of course, was not really the one-shot possibility of software misbehaving dramatically at midnight.
That was mitigated by the fear of bad publicity, which drove companies to patch their obviously buggy code and to have humans on stand-by overnight, just in case.
The deeper issue was the amount of already-collected data that was as good as uncorrectable, but that continued being used and trusted anyway, in many cases to this day.
Sadly, that sort of danger is still an issue: we continue to use data as though it were unimpeachably correct and useful, long after its “use by” date has passed.
Vendors of all stripes, from social networks to cybersecurity services, are collecting what are known metaphorically as data lakes – data points of all sorts, grabbed today just in case someone figures out how to do something useful with them in the future.
So, even though Y2K ticked over without the doom and gloom that many had feared, it should nevertheless remind us all that collecting data today that may long outlive its correctness, precision, accuracy, context, and thus its validity…
…is a risk best avoided.
If you are a software designer, a product manager, an IT entrepreneur, or a programmer, why not make the following New Year’s Resolutions to mark the 25th anniversary of Y2K?
Oh, and have a wonderful 2025!
Why not ask how SolCyber can help you do cybersecurity in the most human-friendly way? Don’t get stuck behind an ever-expanding convoy of security tools that leave you at the whim of policies and procedures that are dictated by the tools, even though they don’t suit your IT team, your colleagues, or your customers!
Paul Ducklin is a respected expert with more than 30 years of experience as a programmer, reverser, researcher and educator in the cybersecurity industry. Duck, as he is known, is also a globally respected writer, presenter and podcaster with an unmatched knack for explaining even the most complex technical issues in plain English. Read, learn, enjoy!
Featured image of alarm clock by Anne Nygård via Unsplash.