Start the countdown right now! In a little under 29 years from now in the year 2038, Web Analytics engineers at Google, Yahoo, Omniture, Coremetrics, and WebTrends will have some very tough choices to make – and it’s never too early to start thinking about them!
This isn’t a trivial issue like Y2K or something like the digital TV transition day on June 12th of this year – no, no, no! This has the potential to seriously compromise cookie integrity, and potentially “break” visitor tracking, industry-wide!
What is happening in 2038?
On Tuesday, January 19th, 2038, one second after 03:14:07 UTC, all computer software programs (including Web Analytics Cookies) that store system time as a signed 32-bit integer (like a traditional Unix timestamp) will “wrap around” and start storing time as a negative number, causing every such system to interpret the date as 1901, and not 2038.
Whoa, Whoa! Back Up – I have no clue what you’re talking about.
Okay, let me try to break this down for you. A great many computers and devices built since the late 20th century (your computer, servers, ATMs, iPods and iPhones, and so on) keep track of system time in a signed 32-bit integer. The value stored in that integer is known as Unix Time (or “POSIX” time): the number of seconds since midnight UTC on January 1, 1970.
If you take a look at your browser’s cookies, you’ll see endless strings of numbers and dots, like this:
The cookie selected here in this image is the __utma cookie from Google Analytics, and the 10-digit number that I have highlighted represents the first time I visited the Google.com website. This number – 1239628694 – is a Unix Timestamp, and when you do the math (or use a conversion tool somewhere online), this number translates to Mon, 13 Apr 2009 13:18:14 GMT (of course, that wasn’t really my first visit; I most likely cleared my cookies – yes World, I clear cookies from my computer, too!)
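If you would rather not trust an online conversion tool, here is a minimal Python sketch of the same math. The helper name `unix_to_utc` is my own invention for illustration; the only input is the timestamp quoted above.

```python
from datetime import datetime, timezone


def unix_to_utc(seconds):
    """Convert a Unix timestamp (seconds since 1970-01-01 00:00:00 UTC) to a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The 10-digit number highlighted in the __utma cookie.
first_visit = unix_to_utc(1239628694)
print(first_visit.isoformat())  # 2009-04-13T13:18:14+00:00
```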
So what’s the problem again?
Okay – the problem comes from the way most computer programs store this 10-digit number. That’s what you need to know (Warning: This next part is very geeky). They almost all use a very standard 4-byte (32-bit) signed integer to count up the seconds. 31 of those bits hold the value, which gives a maximum of 2 to the power of 31, minus 1. The 32nd bit is the sign, which for now is positive (+). When you do the math, the maximum number that computer software programs can reach and stay positive is 2147483647, which is Tuesday, 03:14:07 UTC on January 19, 2038. When you add one more second to it, the value wraps around to -2147483648: the positive sign becomes a negative sign, and instead of 03:14:08 UTC on January 19, 2038, computers everywhere will display the time as Friday, 20:45:52 UTC on December 13, 1901.
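To see the wrap-around for yourself, here is a small Python sketch. Python integers never overflow on their own, so the hypothetical helper `wrap_int32` imitates what a signed 32-bit counter does; the names `wrap_int32` and `to_utc` are mine, not anything from a real analytics library.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit value


def wrap_int32(value):
    """Imitate signed 32-bit storage: values past the maximum wrap to negative."""
    return (value + 2**31) % 2**32 - 2**31


def to_utc(seconds):
    """Turn a (possibly negative) Unix timestamp into a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_good_second = INT32_MAX
one_second_later = wrap_int32(INT32_MAX + 1)

print(last_good_second, to_utc(last_good_second).isoformat())
# 2147483647 2038-01-19T03:14:07+00:00
print(one_second_later, to_utc(one_second_later).isoformat())
# -2147483648 1901-12-13T20:45:52+00:00
```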
Can’t this be fixed? Can’t we just ignore the date and move on?
Unfortunately, it’s not that simple. Many operating systems and programs still store system time as a signed 32-bit integer, and system time is a very big component of a functioning software program (they absolutely need to be able to come up with a positive time stamp). So, it’s not an easy fix – most likely, entire software programs will need to be re-written to avoid Y2038K.
This includes personal computer operating systems, ATM software, other electronic devices with computer-like components, and, yes, Web Analytics cookies.
Okay – Y2038K? Give me a break – this is almost TWENTY-NINE years away! I think you’re jumping the gun here.
You’ll be surprised how fast 29 years go by in computer programming. Think of this – we’re in the year 2009, and we’re using a timestamp that starts counting seconds from 1970 (39 Years Ago) and was first standardized by POSIX in 1988 (21 Years Ago). Most of us are still using Office 2003 (6 Years Ago).
29 Years is right around the corner – so I hope that we can come up with some kind of conversion tool, some type of new timestamp calculation, some new 64-bit integer system that can seamlessly transition all software programs and Web Analytics Cookie Timestamps for the next generation!
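For a rough sense of how much breathing room a 64-bit timestamp would buy, here is a back-of-the-envelope Python sketch. It assumes a signed 64-bit count of seconds and an average year of 365.25 days; both are simplifications of mine for illustration.

```python
INT64_MAX = 2**63 - 1                    # largest signed 64-bit value
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # average year, ignoring leap seconds

years_of_headroom = INT64_MAX / SECONDS_PER_YEAR
print(f"{years_of_headroom:.2e} years")  # roughly 2.92e+11, about 292 billion years
```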
*Note: Some of this blog post is obviously “tongue in cheek”. I am not really sounding the general alarm about what will happen in 2038 – but hey, it’s never too early to start planning for the future! :)
Every Wednesday, I sit down and interview different metrics or report sections from Google Analytics. I ask the tough questions – and I expect straight answers! (This, obviously, is a fictional interview. However, if metrics or reports could talk and be interviewed, this is how I imagine their personalities being and how they would answer my questions. Hopefully this will be a fresh, interesting way to learn about the wonderful world of Google Analytics).
Joe Teixeira: “Mr. Average Time on Site…how are things?”
Average Time on Site: “…Average…”
JT: “What’s with the sunglasses?”
ATOS: “…It’s bright in here…”
JT: “Well those are just the studio lights…I can have them turned down if you…”
ATOS: “No…it’s cool.”
JT: “Ummm…OK. Well let me ask you my first question. Can you explain to everyone exactly how you are calculated?”
ATOS: [Turns Away in Disgust and Rolls Eyes] “Man…come on, man. Why you gotta play me like that? Everybody knows it’s up to __utmb and __utmc to calculate the difference between the time stamps of each page. I ain’t got nuthin’ to do with any of that.”
JT: “So, two cookies – __utmb and __utmc – they calculate you…”
ATOS: “Yeah, man…”
JT: “…and the difference between each time stamp on each page is the time a user spent on that page…”
ATOS: “Yeah…”
JT: “…and then the Average Time on Site is the sum of all of the time a user – or groups of users – spent on the pages of a site, divided by the number of visits.”
ATOS: “…something like that. If you know all this, how come you’re asking me, man?”
JT: “Because I wanted to hear what you’d have to say about it…”
ATOS: [Becoming more frustrated] “Look, man, this is how it goes down, a’ight? If somebody bounces from a landing page, guess what happens? I become an average of 0:00:00, because there ain’t no second timestamp to go by, so [pointing to the ceiling] the big man upstairs [GA] can’t give me credit for my time. It ain’t my fault, I’m just doing my job around here.”
JT: “So you really have a problem with this. What about people that leave their computers on and go to lunch, or go to a meeting?”
ATOS: “It’s the same thing, except backwards. Let’s say somebody goes to lunch for an hour and they leave they browser on…after 30 minutes of what they like to call “inactivity”, I stop counting. This happens ALL THE TIME, man. It just ain’t right! If they time me out, no second timestamp happens, which again means the average time for that page becomes 0:00:00.”
JT: “What I’m gathering is that your message for people who look at you, and use you in their reports and presentations, is to take you with a grain of salt…to use your number precariously.”
ATOS: “Well I don’t know what “precariously” means…but yeah, don’t do that.”
JT: “Last week, I talked briefly to Bounce Rate about setVar, and how his change in classification has impacted him. How has the update to setVar affected you?”
ATOS: “Man, it’s about time they did somethin’ about that. setVar ain’t nothing but a greedy metric, man. I’ve been tryin’ to tell people about setVar, and how it was being counted as an interaction hit, but they weren’t listening to me…but finally they took care of some business and straightened things out.”
JT: “Well, thanks a lot for your time…”
ATOS: “Oh, shoot – we done already?”
JT: “Yeah, I’m sorry…”
ATOS: “C’mon, man…I get paid by the second…”
JT: “Sorry, ATOS…maybe some other time.”
ATOS: “…whatever, man. That’s what everyone always says: “Time”. More time, less time, average time…everyone always wants to know about time. People need to just chill for a second and look at everything else, not just me…”
JT: “Well…thanks again [I start getting up].”
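For anyone who wants to see the arithmetic behind ATOS’s complaints, here is a minimal Python sketch. It is not Google Analytics’ actual code. It assumes that pageview timestamps arrive as Unix seconds, that a gap of 30 minutes or more ends a visit, and that the average is taken per visit; the function names are hypothetical ones I made up for illustration.

```python
SESSION_TIMEOUT_SECONDS = 30 * 60  # assumed inactivity timeout


def split_into_visits(hit_times, timeout=SESSION_TIMEOUT_SECONDS):
    """Group pageview timestamps into visits, starting a new visit after a long gap."""
    visits = []
    for t in sorted(hit_times):
        if visits and t - visits[-1][-1] < timeout:
            visits[-1].append(t)
        else:
            visits.append([t])
    return visits


def time_on_site(visit):
    """Last timestamp minus first; a one-page visit (a bounce) has no second timestamp, so 0."""
    return visit[-1] - visit[0]


def average_time_on_site(visits):
    """Total time across all visits divided by the number of visits."""
    if not visits:
        return 0.0
    return sum(time_on_site(v) for v in visits) / len(visits)


# One page, a second page a minute later, then a long lunch before the third page.
visits = split_into_visits([0, 60, 3700])
print(visits)                        # [[0, 60], [3700]]
print(average_time_on_site(visits))  # 30.0
```

The lunch break splits the activity into two visits. The second visit has only one pageview, so it contributes 0 seconds and drags the average down, which is exactly why ATOS asks to be taken with a grain of salt.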
Wednesday Interview Series:
February 11, 2009: Bounce Rate