Web analytics in the real world

(Gmail messed up the original post. Fixed manually.)

Over the break a news story bubbled up about Euclid's retail analytics product. These kinds of tools are pretty damn exciting, and scary too.

The premise is this: track individuals as they move through a retail store using the unique MAC address their smartphone's wireless gives out. Euclid explains that by tracking signal levels, they can triangulate the individual's location within the store. Visitors don't need to have their WiFi actually connected to the store's network, just have it switched on.
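To make the signal-level idea a bit more concrete, here's a minimal sketch. It is not Euclid's actual method, which I haven't seen. It assumes a log-distance path-loss model to turn a signal reading (RSSI) into a rough distance, then trilaterates from three sensors at known positions. The calibration values and the sensor layout are made up for illustration.

```javascript
// Hypothetical calibration values, not taken from any real deployment.
const RSSI_AT_1M = -40;        // dBm measured 1 m from the sensor (assumed)
const PATH_LOSS_EXPONENT = 2;  // 2 is free space; indoor values are usually higher (assumed)

// Log-distance path-loss model: rssi = RSSI_AT_1M - 10 * n * log10(distance)
function rssiToDistance(rssi) {
  return Math.pow(10, (RSSI_AT_1M - rssi) / (10 * PATH_LOSS_EXPONENT));
}

// Each sensor is {x, y, d}: a known position plus the estimated distance d.
// Subtracting the circle equations pairwise leaves two linear equations
// in x and y, solved here with Cramer's rule.
function trilaterate(s1, s2, s3) {
  const a = 2 * (s2.x - s1.x);
  const b = 2 * (s2.y - s1.y);
  const c = s1.d * s1.d - s2.d * s2.d - s1.x * s1.x + s2.x * s2.x - s1.y * s1.y + s2.y * s2.y;
  const d = 2 * (s3.x - s2.x);
  const e = 2 * (s3.y - s2.y);
  const f = s2.d * s2.d - s3.d * s3.d - s2.x * s2.x + s3.x * s3.x - s2.y * s2.y + s3.y * s3.y;
  const det = a * e - b * d;
  if (det === 0) {
    throw new Error('sensors are collinear, position is ambiguous');
  }
  return { x: (c * e - b * f) / det, y: (a * f - c * d) / det };
}

// Three hypothetical sensors in the corners of a 10 m x 10 m shop floor.
const position = trilaterate(
  { x: 0, y: 0, d: rssiToDistance(-54) },
  { x: 10, y: 0, d: rssiToDistance(-58) },
  { x: 0, y: 10, d: rssiToDistance(-56.5) }
);
```

Real signal readings are noisy, so the three circles rarely meet at a single point and a production system would presumably smooth readings over time or fit across more than three sensors.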

Another company doing this is Helsinki-based RapidBlue, who illustrate it with this diagram:
This is pretty amazing stuff and brings the kind of analytics and optimization we regularly do online into the retail environment. Conversion rates, dwell times, split tests, the whole lot.

How many individuals walked past your store? How many then went in? Then how many bought?
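Those three questions are just a funnel. As a rough sketch, with made-up daily counts of unique devices at each stage and rate names I've picked for illustration:

```javascript
// Turn raw counts at each stage into funnel rates.
function funnelRates(passedBy, entered, purchased) {
  return {
    captureRate: passedBy > 0 ? entered / passedBy : 0,    // walked past -> went in
    conversionRate: entered > 0 ? purchased / entered : 0, // went in -> bought
    overallRate: passedBy > 0 ? purchased / passedBy : 0   // walked past -> bought
  };
}

// Hypothetical day: 2000 devices walked past, 300 came in, 45 bought.
const rates = funnelRates(2000, 300, 45);
```

With those numbers, 15% of passers-by come in and 15% of visitors buy, which works out to 2.25% of everyone who walked past.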

Thinking about this further, you could get even better at it:
  • Directional antennae to isolate specific areas of the store
  • Highly directional antennae pointed at the checkouts to record sales
  • Match sales using credit card numbers or loyalty cards, and suddenly you've matched the MAC address to a CRM identifier
  • Give visitors an app with some kind of discount and you can automatically match MAC to CRM identifier
  • Free in-store WiFi and you can see what sites they're looking at as they browse through the store
I can think of some further ways retailers might track people beyond smartphone MACs:
  • Long-distance reads of RFID devices (public transport cards, contactless credit cards, security passes)
  • Partner with mobile telcos to bring mobile coverage into the store, in exchange for sharing anonymized identity information
  • Bluetooth MACs

Web Analytics Wednesday in Sydney

Some of you have probably heard me talk about this for some time. Well, I've finally got around to setting up a Web Analytics Wednesday in Sydney. In 2013, this will run every month on the second Wednesday of the month.

The first event will be:

Wednesday, January 09, 2013 at 6:30 PM
at the City Hotel on Kent Street in the Sydney CBD.

What's Web Analytics Wednesday?
It's a casual social meetup of like-minded web analytics, digital marketing and optimization types. We're a reasonably small community, so it's worth getting together and learning from each other.

What format will it take?
We'll have a couple of short presentations, but the focus is on networking. If you've got something you'd like to present, please get in touch with me!

What does it cost?
It's free! I'm looking for sponsors willing to cover food, drinks and venue hire. Please contact me.


I'm integrating Omniture with a (very poorly done) web chat system. The solution involves their iframe pointing at their domain sitting on our site which creates a popup for the chat session. The popup contains an iframe pointing at our site with parameters passed into the querystring. My code reads the querystring parameters and inserts the analytics beacon. Iframeception. Apparently it's not possible to do this in any remotely sane way, according to the guy I'm dealing with anyway.

Amazingly it actually works. Even in IE. Without any frigging around with P3P headers. It seems IE sends the cookies in the header for third-party cookies if they were set earlier in the session, so it's all okay. I was surprised.
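For the curious, the querystring-reading part looks roughly like the sketch below. This is an illustration rather than my production code: the parameter names (chatPage, chatId) and the choice of eVar1 are hypothetical, and s stands for the usual SiteCatalyst object from the s_code file.

```javascript
// Parse a querystring such as "?chatPage=Chat%20Start&chatId=42" into an object.
function parseQueryString(search) {
  const params = {};
  const query = search.replace(/^\?/, '');
  if (query === '') {
    return params;
  }
  query.split('&').forEach(function (pair) {
    const idx = pair.indexOf('=');
    const key = idx === -1 ? pair : pair.slice(0, idx);
    const value = idx === -1 ? '' : pair.slice(idx + 1);
    // "+" means a space in form-encoded querystrings
    params[decodeURIComponent(key)] = decodeURIComponent(value.replace(/\+/g, ' '));
  });
  return params;
}

// Copy the (hypothetical) chat parameters onto the analytics object.
function applyChatParams(s, params) {
  if (params.chatPage) {
    s.pageName = params.chatPage;
  }
  if (params.chatId) {
    s.eVar1 = params.chatId; // eVar1 is an arbitrary choice for this sketch
  }
  return s;
}

// Inside the inner iframe's page this would be used roughly as:
//   applyChatParams(s, parseQueryString(window.location.search));
//   s.t(); // send the page-view beacon
```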

Optimizely throws down the gauntlet to Adobe

This piece on TechCrunch talks about how Optimizely is rapidly catching up on industry leader Test&Target.


In the comments, an Adobe product mangler points to a blog where they discuss unnamed competitors. Optimizely follow up by throwing down a challenge: try Optimizely and Test&Target side by side and decide.

If you haven't tried the Optimizely demo, well worth a go. It's awesome. Love the product. Be warned though, the pricing isn't the Enterprise Grade you're used to ;)

Real time: you probably don't need it, but you're going to have it anyway

A big buzz in web analytics for the last while has been real-time. It's one of those really cool features that everyone gets all tingly about. But you don't need it. Seriously, you don't.

Real-time is popular because managers have a Napoleon complex. We all see ourselves as generals sitting on top of the hill, directing our minions into battle. "Send reinforcements down the left flank." War happens in real-time and important decisions need to be made based on real-time outcomes. Marketing and business don't work this way.

The geeks among us all want to feel like we're in the control room of a nuclear power plant, or the War Room from Doctor Strangelove. Those real-time graphs tickle something deep in our consciousness, make us feel like we're alive with fresh data, completely up-to-date. But it's an illusion.

If you're making decisions based on real-time data, you're probably doing it wrong and spinning your wheels. The kinds of changes you'd make on real-time results are rarely worth implementing. Spend the time doing quality analysis of a decent stretch of data, then make a decision.

That's not to say there's absolutely no valid use cases for this stuff. I'm sure in a digital newsroom it'd be great to see the results of a breaking story in real-time, tweaking headlines and allocating resources to popular stories. Tools like Chartbeat pretty much have that market nailed.

Google Analytics has a pretty half-arsed attempt at real-time. It's pretty crappy though. No conversions makes it a complete non-starter for any ecommerce business, and the design is completely wrong for the standard use case, sticking it up on a big screen hanging on the wall.

So there's plenty of reasons not to use real-time. You'll probably end up using it anyway. The pull of a real-time dashboard is just too great. Your boss's Napoleon complex will be stroked by a Big Board dashboard. It'll raise your team's profile. There'll be people standing in front of it, mesmerized, watching the data come in. Particularly when there's a big launch.

The challenge for us analysts is to ensure we do the minimum possible and keep the real-time data as unactionable as possible, reducing to a bare minimum the number of "why'd that go down in the last hour" questions we waste time answering.

Automated web analytics testing with Selenium WebDriver and Sauce Labs

I've been vaguely aware of Selenium and WebDriver for years through my mate Simon, but hadn't dipped my toe yet. Last night I had a little time to check it out.

For those who don't know, Selenium is a framework for automated testing of web applications. WebDriver is the component that allows you to drive individual browsers. There's now hosted services that will spin up browsers for your tests at your command, one of which is Sauce Labs, so you don't have to go through the pain of setting up and maintaining all the browsers and platforms you want to test.

There's two ways you can test analytics tagging in this environment. The ideal is to set up a proxy (see Steve Stojanovski's approach) and then inspect the beacons as they pass through. That would definitely be an ideal situation, but it introduces another element to deal with. My approach is a bit simpler and involves getting WebDriver to run a bit of JavaScript that finds and outputs the beacon URLs for us to then handle.

To run this script you'll need Ruby, a Sauce Labs account and the WebDriver Ruby bindings. Follow Sauce Labs' instructions to get set up. The Sauce Labs free account gives you plenty of minutes to get started. You'll need to replace the username and API key with your details. Then run the script.

This script is incredibly simplistic. It just loads the page, runs the JavaScript, and checks to see if there was anything in the output. I'm having trouble with using it in IE6, so it'll need some improving. At the moment this is just for Omniture beacons, but it wouldn't be difficult to get it looking into any other types of beacons.
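The JavaScript side of that check can be sketched as below. It's a simplified stand-in, not the exact snippet my script runs. It assumes Omniture beacon URLs contain "/b/ss/" in the path and carry the page name in a pageName parameter, and it takes a plain list of URLs so it can be tried outside a browser. The example hostname and report suite ID are invented.

```javascript
// Return only the URLs that look like Omniture image requests.
function findOmnitureBeacons(urls) {
  return urls.filter(function (url) {
    return url.indexOf('/b/ss/') !== -1;
  });
}

// Pull a single parameter (e.g. pageName) out of a beacon URL.
function beaconParam(beaconUrl, name) {
  return new URL(beaconUrl).searchParams.get(name);
}

// Hypothetical list of requests made by a page.
const requests = [
  'http://www.example.com/logo.png',
  'http://metrics.example.com/b/ss/myrsid/1/H.24/s123?pageName=Home&ch=news'
];
const beacons = findOmnitureBeacons(requests);

// In a browser with the Resource Timing API, the input list could come from:
//   performance.getEntriesByType('resource').map(function (e) { return e.name; })
// The Ruby bindings' execute_script can then hand the filtered list back to the test.
```

Old browsers like IE6 don't have the Resource Timing API, so the list of URLs would have to come from somewhere else there.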

So this allows a simple sanity check that your analytics beacons are at least firing. A good start. I'll start doing more soon, though I suppose I should learn some Ruby first.

Omniture is so 1996

Training a new junior analyst with a web dev background I found myself cursing, yet again, how old fashioned Omniture implementation still is. JavaScript is a wonderfully expressive language with a beautiful object syntax for representing real-world behaviour. Omniture doesn't use any of this. Everything is strings with cryptic, archaic delimiters. Their JavaScript is Stringly Typed.

In case you don't know, the syntax of Omniture's product string looks like this:

 category;product name;quantity;totalprice[,category;product name;quantity;totalprice] 
For example:
The quantity and price are optional for any event except "purchase". Of course. Category is deprecated, so instead you have to always add a leading semicolon. Though the examples in the knowledgebase still helpfully include the deprecated category. Nice work.
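To show how those delimiters break down, here's a small sketch that splits a product string back into its parts. It only handles the four fields in the syntax above, the product names and prices are made up, and it will fall over if a product name itself contains a comma or semicolon.

```javascript
// Split an s.products-style string into one object per product.
function parseProductString(str) {
  return str.split(',').map(function (entry) {
    const fields = entry.split(';');
    return {
      category: fields[0] || '',     // deprecated, normally left empty
      productname: fields[1] || '',
      quantity: fields[2] === undefined ? undefined : Number(fields[2]),
      totalprice: fields[3] === undefined ? undefined : Number(fields[3])
    };
  });
}

// Two products: one with quantity and price, one with just a name.
// Note the leading semicolon standing in for the empty category.
const parsed = parseProductString(';lemonade (cans);3;3.30,;cola (bottles)');
```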
Now try explaining that to a developer who won't read anything longer than a sentence, and expecting him to get it right. The number of times I've seen "REPLACE THIS WITH THE VALUE" showing up in my Omniture reports is insane.
How about this as a better idea?
s.products = [];
s.products.push({
    productname: 'lemonade (cans)',
    quantity: 3,
    totalprice: 3.30
});
s.products.push({
    productname: 'cola (bottles)',
    quantity: 2,
    unitprice: 1.23
});
Any developer will understand this syntax immediately, no documentation required.
Now to convert that into Omniture's preferred Stringly Typed variable. Put this inside the s_doPlugins() block of your s_code file:
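// Only convert when s.products is still an array of objects, not already a string
if (s.products && typeof(s.products) === 'object') {
    var outputString = '';
    for (var i = 0; i < s.products.length; i++) {
        // Derive the total price from the unit price when it isn't given
        if (s.products[i].totalprice === undefined) {
            s.products[i].totalprice = s.products[i].unitprice * s.products[i].quantity;
        }
        // Products are separated by commas
        if (i > 0) {
            outputString += ",";
        }
        // Leading semicolon stands in for the deprecated category
        outputString += ";" + s.products[i].productname;
        if (s.products[i].quantity && s.products[i].totalprice) {
            outputString += ";" + s.products[i].quantity + ";" + s.products[i].totalprice;
        }
    }
    s.products = outputString;
}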
Note, this isn't really tested and won't handle some of the more esoteric uses of s.products.
So really Omniture, this shouldn't be so hard. Why can't you update your code mechanisms to the 21st Century and start using the full expressiveness of the language? It will make implementation easier, because it will make more sense to developers!

SnowPlow web analytics

I'm really excited about this new open-source web analytics system. The loosely coupled architecture is brilliant, meaning you can swap out components as needed.

One example I've already found is the limitation of using CloudFront as the data collector. It's very easy to set up, but limits you to GA-style first-party cookies and an inability to reliably track across domains. I'm already planning to write something in NodeJS to replace this component.

There's no GUI, but it's incredibly easy to set up data collection and just start collecting. Unlike ordinary web analytics tools, you're collecting everything. Tools like GA and Omniture throw away things like full URLs, full User-Agent strings and full referrer URLs, which means there's some types of questions you can't answer unless you were capturing the data the right way from the beginning.

As an example, I recently got asked to analyse our on-site search. Some of the searches are recorded in SiteCatalyst, but others aren't. Even though the keywords are included in the request URL every time, we're bang out of luck doing any keyword-level analysis on historical searches. If we had something like this recording data, the analysis would be tricky but possible using Hive.
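To give a feel for what that keyword analysis involves once you have the full URLs, here's a small JavaScript sketch. It's a stand-in for the idea, not a Hive query, and the parameter name ("q") and the example URLs are hypothetical.

```javascript
// Count on-site search keywords from a list of logged page URLs.
function countSearchKeywords(urls, paramName) {
  const counts = {};
  urls.forEach(function (raw) {
    const value = new URL(raw).searchParams.get(paramName);
    if (value === null) {
      return; // not a search page
    }
    // Normalise so "Red Shoes" and "red shoes" count as one keyword
    const keyword = value.trim().toLowerCase();
    if (keyword === '') {
      return;
    }
    counts[keyword] = (counts[keyword] || 0) + 1;
  });
  return counts;
}

const keywordCounts = countSearchKeywords([
  'http://www.example.com/search?q=Red+Shoes',
  'http://www.example.com/search?q=red%20shoes',
  'http://www.example.com/about'
], 'q');
```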

Very cool stuff. If you're reading this on my blog site, you're already being recorded to it.

Introduction to web analytics reading list

I've got a new person starting who's a bit of an information sponge. In the lead up to him starting, I thought I'd put together a bit of a reading list to get him up to speed on web analytics, digital marketing and all the other things we look at.

Since this is probably generally applicable, I'll share it here. I'll continue adding to this as I think of new things.

  • Omniture SiteCatalyst: the gold standard of enterprise analytics, though gold isn't supposed to tarnish, right?
  • Mixpanel: a new tool that captures every data point, and does amazing things with it.
  • KISSmetrics: next generation web analytics tool.
  • QuBit OpenTag: container tag management platform, to make all the tags manageable. This one has a free offering so you can stick it on your own site and play with it.
  • Ghostery: (all browsers) find out what beacons are on any page
  • Collusion: (Firefox) who's tracking you online, and how do they overlap?
  • Omniture SiteCat debugger: (Chrome) what's being sent to SiteCat on the current page
  • Occam's Razor: Avinash Kaushik's always writing great stuff, great insight.
  • ClickZ: a broad look at the whole online marketing space.
  • Adam Greco: one of the gurus of SiteCatalyst implementation.
  • Andrew Chen: exciting stuff at the boundary of behavioural psychology and analytics. The new field of "Growth Hackers".
  • Which Test Won?: how good is your gut over an A/B test?
  • Nir and far: Another Growth Hacker with interesting things to say.


Quora topics