I'm integrating Omniture with a (very poorly done) web chat system. The solution involves their iframe pointing at their domain sitting on our site which creates a popup for the chat session. The popup contains an iframe pointing at our site with parameters passed into the querystring. My code reads the querystring parameters and inserts the analytics beacon. Iframeception. Apparently it's not possible to do this in any remotely sane way, according to the guy I'm dealing with anyway.

Amazingly it actually works. Even in IE. Without any frigging around with P3P headers. It seems IE sends the cookies in the header for third-party cookies if they were set earlier in the session, so it's all okay. I was surprised.

Optimizely throws down the gauntlet to Adobe

This piece on TechCrunch talks about how Optimizely is rapidly catching up with industry leader Test&Target.


In the comments, an Adobe product mangler points to a blog where they discuss unnamed competitors. Optimizely follow up by throwing down a challenge: try Optimizely and Test&Target side by side and decide.

If you haven't tried the Optimizely demo, it's well worth a go. It's awesome. Love the product. Be warned though, the pricing isn't the Enterprise Grade you're used to ;)

Real time: you probably don't need it, but you're going to have it anyway

A big buzz in web analytics for the last while has been real-time. It's one of those really cool features that everyone gets all tingly about. But you don't need it. Seriously, you don't.

Real-time is popular because managers have a Napoleon complex. We all see ourselves as generals sitting on top of the hill, directing our minions into battle. "Send reinforcements down the left flank." War happens in real-time and important decisions need to be made based on real-time outcomes. Marketing and business don't work this way.

The geeks among us all want to feel like we're in the control room of a nuclear power plant, or the War Room from Doctor Strangelove. Those real-time graphs tickle something deep in our consciousness, make us feel like we're alive with fresh data, completely up-to-date. But it's an illusion.

If you're making decisions based on real-time data, you're probably doing it wrong and spending a lot of time spinning your wheels. You really shouldn't be wasting time implementing the kinds of changes you'd make on real-time results. It's just a waste of time. Spend time doing quality analysis of a decent length of time's data, then make a decision.

That's not to say there's absolutely no valid use cases for this stuff. I'm sure in a digital newsroom it'd be great to see the results of a breaking story in real-time, tweaking headlines and allocating resources to popular stories. Tools like Chartbeat pretty much have that market nailed.

Google Analytics has a pretty half-arsed attempt at real-time. It's pretty crappy though. No conversions makes it a complete non-starter for any ecommerce business, and the design is completely wrong for the standard use case, sticking it up on a big screen hanging on the wall.

So there's plenty of reasons not to use real-time. You'll probably end up using it anyway. The pull of a real-time dashboard is just too great. Your boss's Napoleon complex will be stroked by a Big Board dashboard. It'll raise your team's profile. There'll be people standing in front of it, mesmerized, watching the data come in. Particularly when there's a big launch.

The challenge for us analysts is to ensure we do the minimum possible, and keep the real-time data as unactionable as possible. Reducing to a bare minimum the number of "why'd that go down in the last hour" questions we waste time answering.

Automated web analytics testing with Selenium WebDriver and Sauce Labs

I've been vaguely aware of Selenium and WebDriver for years through my mate Simon, but hadn't dipped my toe yet. Last night I had a little time to check it out.

For those who don't know, Selenium is a framework for automated testing of web applications. WebDriver is the component that allows you to drive individual browsers. There's now hosted services that will spin up browsers for your tests at your command, one of which is Sauce Labs, so you don't have to go through the pain of setting up and maintaining all the browsers and platforms you want to test.

There's two ways you can test analytics tagging in this environment. The ideal is to set up a proxy (see Steve Stojanovski's approach) and then inspect the beacons as they pass through. That would definitely be an ideal situation, but it introduces another element to deal with. My approach is a bit simpler and involves getting WebDriver to run a bit of JavaScript that finds and outputs the beacon URLs for us to then handle.

To run this script you'll need Ruby, a Sauce Labs account and the WebDriver Ruby bindings. Follow Sauce Labs' instructions to get set up. The Sauce Labs free account gives you plenty of minutes to get started. You'll need to replace the username and API key with your details. Then run the script.

This script is incredibly simplistic. It just loads the page, runs the JavaScript, and checks to see if there was anything in the output. I'm having trouble with using it in IE6, so it'll need some improving. At the moment this is just for Omniture beacons, but it wouldn't be difficult to get it looking into any other types of beacons.
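The JavaScript half of this approach can be sketched roughly as below. This isn't the script itself, just a minimal illustration under a couple of assumptions: that Omniture beacon URLs contain `/b/ss/`, and that s_code parks its Image objects on `window` under names starting with `s_i_`. The function name `findBeaconUrls` is made up for the example.

```javascript
// Collect URLs that look like Omniture beacons from a window-like object.
// Assumptions: beacon URLs contain "/b/ss/", and s_code keeps its Image
// objects on window under property names starting with "s_i_".
function findBeaconUrls(win) {
  var urls = [];

  function consider(src) {
    var isBeacon = typeof src === 'string' && src.indexOf('/b/ss/') !== -1;
    if (isBeacon && urls.indexOf(src) === -1) {
      urls.push(src); // keep each beacon URL once
    }
  }

  // Image objects stashed on the window object
  for (var name in win) {
    if (name.indexOf('s_i_') === 0 && win[name]) {
      consider(win[name].src);
    }
  }

  // Any <img> elements in the document itself
  var images = (win.document && win.document.images) || [];
  for (var i = 0; i < images.length; i++) {
    consider(images[i].src);
  }

  return urls;
}

// In the browser, the snippet handed to WebDriver would end with something like:
//   return findBeaconUrls(window);
```

The test side then only has to check that the returned array isn't empty, which is about as far as a simple sanity check goes.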

So this allows a simple sanity check that your analytics beacons are at least firing. A good start. I'll start doing more soon, though I suppose I should learn some Ruby first.

Omniture is so 1996

Training a new junior analyst with a web dev background I found myself cursing, yet again, how old fashioned Omniture implementation still is. JavaScript is a wonderfully expressive language with a beautiful object syntax for representing real-world behaviour. Omniture doesn't use any of this. Everything is strings with cryptic, archaic delimiters. Their JavaScript is Stringly Typed.

In case you don't know, the syntax of Omniture's product string looks like this:

    category;product name;quantity;totalprice[,category;product name;quantity;totalprice]

For example:
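One hypothetical instance of that syntax is a two-product purchase with the category left empty. The product names and prices here are made up for illustration, and the bare `s` object is a stand-in for the real Omniture one:

```javascript
// Hypothetical stand-in for the Omniture `s` object.
var s = {};

// Two products, no category: note the leading semicolon on each entry.
s.products = ';lemonade (cans);3;3.30,;cola (bottles);2;2.46';

// Splitting it back apart shows how purely positional the fields are.
var entries = s.products.split(',').map(function (entry) {
  var fields = entry.split(';');
  return {
    category: fields[0],
    productname: fields[1],
    quantity: fields[2],
    totalprice: fields[3]
  };
});
console.log(entries);
```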
The quantity and price are optional for any event except "purchase". Of course. Category is deprecated, so instead you always have to add a leading semicolon. Though the examples in the knowledgebase still helpfully include the deprecated category. Nice work.

Now try explaining that to a developer who won't read anything longer than a sentence, and expecting him to get it right. The number of times I've seen "REPLACE THIS WITH THE VALUE" showing up in my Omniture reports is insane.

How about this as a better idea?

    s.products = [];
    s.products.push({
        productname: 'lemonade (cans)',
        quantity: 3,
        totalprice: 3.30
    });
    s.products.push({
        productname: 'cola (bottles)',
        quantity: 2,
        unitprice: 1.23
    });

Any developer will understand this syntax immediately, no documentation required.

Now to convert that into Omniture's preferred Stringly Typed variable. Put this inside the s_doPlugins() block of your s_code file:

    if (typeof(s.products) === 'object') {
        var outputString = '';
        for (var i = 0; i < s.products.length; i++) {
            // Work out the total from the unit price when only that was given
            if (s.products[i].totalprice === undefined) {
                s.products[i].totalprice = s.products[i].unitprice * s.products[i].quantity;
            }
            // Comma-separate every product after the first
            if (i > 0) {
                outputString += ",";
            }
            // The leading semicolon stands in for the deprecated category
            outputString += ";" + s.products[i].productname;
            if (s.products[i].quantity && s.products[i].totalprice) {
                outputString += ";" + s.products[i].quantity + ";" + s.products[i].totalprice;
            }
        }
        s.products = outputString;
    }

Note, this isn't really tested and won't handle some of the more esoteric uses of s.products.

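One way to give the conversion a quick check outside of s_code is to pull the same logic into a standalone function and feed it the two products from earlier. This is only a sketch: `toProductString` is a hypothetical name, not part of Omniture's API, and it returns the string rather than overwriting `s.products`.

```javascript
// Standalone version of the conversion, for poking at outside s_code.
// `toProductString` is a hypothetical helper, not an Omniture function.
function toProductString(products) {
  var parts = [];
  for (var i = 0; i < products.length; i++) {
    var p = products[i];
    // Fall back to unit price times quantity when no total was given
    var total = p.totalprice;
    if (total === undefined) {
      total = p.unitprice * p.quantity;
    }
    // Leading semicolon stands in for the deprecated category
    var entry = ';' + p.productname;
    if (p.quantity && total) {
      entry += ';' + p.quantity + ';' + total;
    }
    parts.push(entry);
  }
  return parts.join(',');
}

var result = toProductString([
  { productname: 'lemonade (cans)', quantity: 3, totalprice: 3.30 },
  { productname: 'cola (bottles)', quantity: 2, unitprice: 1.23 }
]);
console.log(result); // ;lemonade (cans);3;3.3,;cola (bottles);2;2.46
```

Note the price comes out as 3.3 rather than 3.30, because JavaScript drops the trailing zero when it turns the number into a string. If two decimal places matter, something like `total.toFixed(2)` would be one way around that.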
So really Omniture, this shouldn't be so hard. Why can't you update your code mechanisms to the 21st Century and start using the full expressiveness of the language? It will make implementation easier, because it will make more sense to developers!

SnowPlow web analytics

I'm really excited about this new open-source web analytics system. The loosely-coupled architecture is brilliant, meaning you can swap out components as needed.

One example I've already found is the limitation of using CloudFront as the data collector. It's very easy to set up, but limits you to GA-style first-party cookies and an inability to reliably track across domains. I'm already planning to write something in NodeJS to replace this component.

There's no GUI, but it's incredibly easy to set up data collection and just start collecting. Unlike ordinary web analytics tools, you're collecting everything. Tools like GA and Omniture throw away things like full URLs, full User-Agent strings and full referrer URLs, which means there's some types of questions you can't answer unless you were capturing the data the right way from the beginning.

As an example, I recently got asked to analyse our on-site search. Some of the searches are recorded in SiteCatalyst, but others aren't. Even though the keywords are included in the request URL every time, we're bang out of luck doing any keyword-level analysis on historical searches. If we had something like this recording data, the analysis would be tricky but possible using Hive.
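To show why having the full URL matters, here's a small sketch of the kind of keyword count that becomes possible once every request URL is kept. It's plain JavaScript rather than Hive, purely for illustration, and it assumes the search term travels in a query parameter (`q` in the usage below is a guess, not our real parameter name). `countSearchKeywords` is a made-up helper.

```javascript
// Count on-site search keywords from a list of logged page URLs.
// Assumption: the search term is carried in a query parameter whose
// name is passed in as paramName.
function countSearchKeywords(loggedUrls, paramName) {
  var counts = {};
  loggedUrls.forEach(function (raw) {
    var keyword;
    try {
      keyword = new URL(raw).searchParams.get(paramName);
    } catch (e) {
      return; // skip anything that isn't a parseable absolute URL
    }
    // Normalise case and whitespace so "Red Shoes" and "red shoes" match
    keyword = (keyword || '').trim().toLowerCase();
    if (keyword) {
      counts[keyword] = (counts[keyword] || 0) + 1;
    }
  });
  return counts;
}

var counts = countSearchKeywords([
  'http://www.example.com/search?q=Red+Shoes',
  'http://www.example.com/search?q=red%20shoes',
  'http://www.example.com/about'
], 'q');
console.log(counts);
```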

Very cool stuff. If you're reading this on my blog site, you're already being recorded to it.

Introduction to web analytics reading list

I've got a new person starting who's a bit of an information sponge. In the lead up to him starting, I thought I'd put together a bit of a reading list to get him up to speed on web analytics, digital marketing and all the other things we look at.

Since this is probably generally applicable, I'll share it here. I'll continue adding to this as I think of new things.

  • Omniture SiteCatalyst: the gold standard of enterprise analytics, though gold isn't supposed to tarnish right?
  • Mixpanel: a new tool that captures every data point, and does amazing things with it.
  • KISSmetrics: next generation web analytics tool.
  • QuBit OpenTag: container tag management platform, to make all the tags manageable. This one has a free offering so you can stick it on your own site and play with it.
  • Ghostery: (all browsers) find out what beacons are on any page
  • Collusion: (Firefox) who's tracking you online, and how do they overlap?
  • Omniture SiteCat debugger: (Chrome) what's being sent to SiteCat on the current page
  • Occam's Razor: Avinash Kaushik's always writing great stuff, great insight.
  • ClickZ: a broad look at the whole online marketing space.
  • Adam Greco: one of the gurus of SiteCatalyst implementation.
  • Andrew Chen: exciting stuff at the boundary of behavioural psychology and analytics. The new field of "Growth Hackers".
  • Which Test Won?: how good is your gut over an A/B test?
  • Nir and Far: another Growth Hacker with interesting things to say.


Quora topics

Only 52% of web analytics spend goes on staff

On average, only 52% of web analytics expenditure is spent on internal staff, a figure which has not changed since 2011. This is despite 40% of companies in 2011 having planned to increase their budget on staff to analyse web data, which highlights that finding the right people is proving a difficult challenge

This craziness has to stop. For every $1 you spend on analytics tools, you should be spending $2 on staff. Minimum. Otherwise you might as well use free tools, despite their limitations.

What this stat shows is the effectiveness of analytics vendors' sales organisations. I've seen them in action and they're very impressive! One company I worked for spent 7 figures annually on its high-end analytics tool and had 1.5 analysts looking at the results and managing the implementation. Insanity!

The report brings up the skills shortage in our field. Something I plan to blog about later. We need to get better, as an industry, at cross-training people. On that note, I'm currently hiring junior web developers to cross-train as web analytics specialists.

How can you reliably work out a new "visit" to your sites?

How can you reliably determine if a particular page view on which your analytics code is running is a new arrival at your site? This is a difficult problem and I can't see any easy solution, so the "Direct" visit source will always be overreported along with all the other sources of inflation for that source.

I'm currently using this logic:
  • URL doesn't have a campaign code
  • Referrer is either blank or is from a domain that isn't in your list of internal domains

However this fails for links between HTTPS and HTTP pages within your own sites. When someone goes from a secure page to a non-secure page, the referrer is wiped out.
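Spelled out as code, the logic above looks roughly like this. It's a sketch under assumptions: the campaign code is taken to live in a `cid` query parameter, the internal domain list is invented, and `looksLikeNewArrival` is just a name for the example. Campaign-tagged landings are assumed to be handled separately.

```javascript
// Hypothetical configuration; swap in your own domains and campaign parameter.
var INTERNAL_DOMAINS = ['example.com', 'example.org'];
var CAMPAIGN_PARAM = 'cid';

// True when the referrer's hostname is one of our domains or a subdomain of one.
function isInternalReferrer(referrer) {
  var host;
  try {
    host = new URL(referrer).hostname;
  } catch (e) {
    return false; // unparseable referrer: treat it as external
  }
  return INTERNAL_DOMAINS.some(function (domain) {
    return host === domain || host.endsWith('.' + domain);
  });
}

// True when the URL has no campaign code and the referrer is blank or external.
function looksLikeNewArrival(pageUrl, referrer) {
  var hasCampaign = new URL(pageUrl).searchParams.has(CAMPAIGN_PARAM);
  if (hasCampaign) {
    return false;
  }
  return !referrer || !isInternalReferrer(referrer);
}

// External referrer: counted as a new arrival.
console.log(looksLikeNewArrival('http://www.example.com/', 'http://www.google.com/search?q=x'));
// Internal link: not a new arrival.
console.log(looksLikeNewArrival('http://www.example.com/page', 'http://www.example.com/'));
// HTTPS to HTTP within our own sites: the referrer arrives blank,
// so this is wrongly counted as a new arrival.
console.log(looksLikeNewArrival('http://www.example.com/page', ''));
```

In the browser the two arguments would be `location.href` and `document.referrer`. The last call is the failure case: nothing on the page distinguishes it from a genuinely direct visit.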

The only alternative I can think of is setting a cookie, but that won't work in my case because we have to support multiple domains.

The gold standard would be to send your beacons to a third party, which could apply some kind of time-based test to "direct" traffic to determine if it's truly "direct". That won't work in my current architecture.

Any ideas?