Tracking Down Bad Clicks

And now, a little history on reporting here at Geniuslink…

Reporting has been an important part of the Geniuslink offering (even as far back as our early GeoRiot days) for two reasons:  

  1. We want to empower you with the best data to make the smartest marketing decisions. Our reports give you, the marketer, insight into how well your links and campaigns are performing.
  2. We want to give you the full picture. Because of the dynamic nature of our localized links, it’s important for us to show you where people in other countries, on other devices, speaking other languages are being routed when they click your links.

When we rolled out our first click reporting tools, way back in 2012, we considered every request to resolve a link as a click. It was simple, easy, and fairly accurate, and this approach worked well for years.

In 2014, we took a major step toward cleaning up the click stream and started classifying clicks as either “Bot” or not. We did this the standard way: by looking at the “User Agent” of each click, from which we could usually determine whether it came from a person or a computer. Clicks that came from a computer, like Google’s crawler, we marked as bots and removed from our Performance report (you could, of course, toggle the report to include bot clicks).

For this process to work well, we needed to know what a bot looked like. Most were pretty obvious and included “bot” in the “User Agent” string, but some weren’t. For example, the User Agent for Google’s web crawler, Googlebot, looks like this:

Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1;
+http://www.google.com/bot.html)
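
For illustration, here’s a minimal sketch of this kind of user-agent classification in Python. The patterns below are hypothetical stand-ins, not our actual known-bot list:

    import re

    # Hypothetical known-bot patterns, grown over time as new bots are found.
    # These are illustrative; a production list is far longer.
    KNOWN_BOT_PATTERNS = [
        re.compile(r"bot", re.IGNORECASE),        # catches Googlebot, Bingbot, etc.
        re.compile(r"crawler|spider", re.IGNORECASE),
        re.compile(r"curl|wget", re.IGNORECASE),  # command-line HTTP clients
    ]

    def is_bot_user_agent(user_agent: str) -> bool:
        """Return True if the User-Agent string matches a known bot pattern."""
        return any(p.search(user_agent) for p in KNOWN_BOT_PATTERNS)

    googlebot_ua = (
        "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 "
        "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    )
    print(is_bot_user_agent(googlebot_ua))  # True: "Googlebot" matches "bot"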

This kicked off our first round of the “Whack-A-Mole” game: a client would alert us that their reporting seemed off, we’d do a dump of click data from our database and work through the user agent strings, one by one, to find the culprit. Once we found it, we’d add it to our known-bot list and the process would start over.

While constantly evolving as we tracked down and added new bots to the list, this process worked quite well for some time.  

IP lookup for an AWS server: an example of an IP address belonging to Amazon Web Services.

The next iteration came when we realized that some clients were seeing weird traffic patterns, yet we couldn’t find any suspect User Agent strings. We did, however, notice patterns in the IP addresses. This led to a new push on detecting bots by classifying IP blocks. We found that clicks from IPs belonging to DigitalOcean, Amazon’s AWS, and other known hosting and cloud providers showed the classic signature of a bot, even though their user agent strings were disguised to look like real people.
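
To sketch the idea, here’s a minimal IP-block check in Python. The CIDR ranges below are illustrative stand-ins; real lists come from each provider’s published IP ranges and are far longer:

    import ipaddress

    # Illustrative CIDR blocks for hosting/cloud providers. These two ranges
    # are stand-ins; real block lists are much longer and change over time.
    DATACENTER_BLOCKS = [
        ipaddress.ip_network("52.0.0.0/11"),    # a slice of Amazon AWS space
        ipaddress.ip_network("159.65.0.0/16"),  # a slice of DigitalOcean space
    ]

    def is_datacenter_ip(ip: str) -> bool:
        """Flag clicks from hosting providers, where real people rarely browse from."""
        addr = ipaddress.ip_address(ip)
        return any(addr in block for block in DATACENTER_BLOCKS)

    print(is_datacenter_ip("52.20.10.5"))   # True: inside the AWS block above
    print(is_datacenter_ip("203.0.113.7"))  # False: a reserved documentation address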

This made our “Whack-A-Mole” game multidimensional, as we were now actively working with clients to identify new bots as well as new IP blocks.

Needless to say, while it was a lot of maintenance, this process kept our bot filtering, and thus our reporting, quite accurate, and our clients happy.

The next chapter in the story comes from investigating some weird behavior for a couple of clients who posted their links heavily on social media, Facebook and its mobile app in particular. Unfortunately, it seems that Facebook has no problem ignoring standards or being inconsistent with other platforms when it comes to reporting (which is a shame, given their history of misreporting).

From our investigations, we noticed three things relating to links on Facebook being clicked (or not!) in the mobile app.

Facebook Misreports iTunes Clicks

Unfortunately, we found that clicks on an iTunes link produce different totals depending on the link’s structure. For a country-specific iTunes link (itunes.apple.com), Facebook records one click (as it should), and the user is directed on to the iTunes mobile app.

For a localized iTunes link (geo.itunes.apple.com), or any iTunes link inside a short-linking tool (like Geniuslink or Bitly), Facebook records one click, but the user is not directed to iTunes. Instead, Facebook shows an alert in its app asking if the user wants to leave the app. Regardless of how they answer, another click is then recorded. The end result? Click counts from third parties are at least double what Facebook reports.

We alerted Facebook and iTunes to this issue in the summer of 2016 and crossed our fingers that something would happen. Unfortunately, nothing came of it, and not for lack of effort from our friends at iTunes.

Phantom Facebook Clicks

This next one was a real head-scratcher: we were seeing “clicks” from the Facebook app when we weren’t actually clicking on anything. These clicks were happening while we were just thumbing through our news feeds! After lots of digging, we found that Facebook has an algorithm to determine how likely you are to click on a link and will then “prefetch” the links it thinks you’ll click on.

But what is prefetching?

Prefetching was introduced in HTML5 and is commonly used by mobile apps to improve speed and user experience. Content is preloaded before a link is clicked, so there is less load time once you decide to click. Prefetching is especially important to Facebook for mobile users on slow connections, for example when clicking on ads, in order to reduce bounces.

To a third-party reporting tool, a prefetch call looks just like a real, live click. To Facebook’s credit, they follow the industry standard of adding a special header to prefetch requests. So you would think it’s just a matter of watching for this header and marking the click as a prefetch, right? Unfortunately, no. We could detect that a click was a prefetch, but we found that Facebook’s app doesn’t fire a second, “real” click when the user actually clicks the link. Ugh.
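
For illustration, here’s a minimal sketch of header-based prefetch detection in Python. The header names and values below reflect conventions seen across platforms and are assumptions for the example, not a definitive list of what Facebook’s app sends:

    # Prefetch hint headers vary by client; this set is an assumption for the
    # example, not a definitive list of what any one app sends.
    PREFETCH_HEADER_HINTS = {
        ("purpose", "prefetch"),
        ("x-purpose", "preview"),
        ("x-moz", "prefetch"),
    }

    def is_prefetch(headers: dict) -> bool:
        """Return True if the request headers mark this click as a prefetch."""
        normalized = {k.lower(): v.lower() for k, v in headers.items()}
        return any(normalized.get(name) == value
                   for name, value in PREFETCH_HEADER_HINTS)

    print(is_prefetch({"Purpose": "prefetch", "User-Agent": "Mozilla/5.0"}))  # True
    print(is_prefetch({"User-Agent": "Mozilla/5.0"}))                         # False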

While we don’t yet have a way to work around Facebook’s app not firing a second click when the user actually does click, we now mark all prefetch clicks as Junk clicks to ensure that, if anything, we are under-reporting organic clicks rather than over-reporting them.

Note: While Facebook has a recommended way to opt out of their prefetch solution, it requires that you have access to the final destination’s page code.  Our clients commonly post geni.us links to Facebook, or other social media platforms, that don’t go directly to their own website or app, but rather to places like Amazon.com or itunes.apple.com.  Facebook’s solution isn’t helpful in this case.

Only Record One Click When One Click Happens  

Makes perfect sense, right? After conquering phantom clicks and misreported clicks, we took a look at why multiple “clicks” were registered in a short time span for a single “real” click. During our testing, we noticed that one click could come through as many clicks on our side, and this could happen with any link, not just iTunes or Amazon. And at significant scale! We saw as many as 12 clicks coming through for one real human click. These duplicate clicks would all come from the same IP, browser, and device, within a minute or less. I told you we were going to take you down a rabbit hole!

Multiple clicks within a minute shared the same IP and user agent.

We were so close to accurately representing Facebook and other mobile app and social media click traffic! So what did we do? Well, of course, we developed an algorithm that intelligently sifts through these duplicates and marks them as Junk!
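
As a rough sketch, assuming clicks keyed by IP, user agent, and link, and the one-minute window described above, the deduplication might look something like this in Python (the real algorithm involves more signals and tuning):

    from datetime import datetime, timedelta

    WINDOW = timedelta(minutes=1)  # duplicates showed up within a minute or less

    def classify_duplicates(clicks):
        """Keep the first click per (ip, user_agent, link) key; mark repeats
        arriving within WINDOW of the previous one as junk. `clicks` must be
        sorted by timestamp."""
        last_seen = {}
        organic, junk = [], []
        for click in clicks:
            key = (click["ip"], click["user_agent"], click["link"])
            prev = last_seen.get(key)
            if prev is not None and click["ts"] - prev <= WINDOW:
                junk.append(click)      # a duplicate of a recent click
            else:
                organic.append(click)   # first, or sufficiently spaced, click
            last_seen[key] = click["ts"]
        return organic, junk

    clicks = [
        {"ip": "1.2.3.4", "user_agent": "ua", "link": "geni.us/x",
         "ts": datetime(2017, 1, 1, 12, 0, 0)},
        {"ip": "1.2.3.4", "user_agent": "ua", "link": "geni.us/x",
         "ts": datetime(2017, 1, 1, 12, 0, 20)},
        {"ip": "1.2.3.4", "user_agent": "ua", "link": "geni.us/x",
         "ts": datetime(2017, 1, 1, 12, 0, 45)},
    ]
    organic, junk = classify_duplicates(clicks)
    print(len(organic), len(junk))  # 1 organic click, 2 junk duplicates

Because the window slides with each repeat, a rapid-fire burst of duplicates collapses to a single organic click.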

Goodbye Junk Clicks!

After lots of tuning, we nailed it. At least for now. We’ll keep digging, and please keep letting us know when you see something funky in your reporting. Our hope is to offer you the best third-party collective click reporting out there!

P.S. – Remember you can always turn off our Junk traffic filter by just flipping the switch.

Toggle between showing organic clicks and Bot + Prefetch + Duplicate clicks.
