Beware of the Robots!

Simon Mathews

February 07, 2014

How Internet Bots are Clouding Insight from Web Metrics: A Case Study in Screen Size for Responsive Design

With our strong emphasis on the use of data in the digital experience design process, one of the first things we do when kicking off a new design process is to ask the data a key question: Who, using what devices, are we designing for?

We want to know the devices, browsers and operating systems that current visitors are using so that we can optimize the experience for that mix, and deliver a responsive design that works best at the most common screen resolutions we are seeing (and will be seeing moving forward). Knowing the networks (mobile/fixed) and bandwidth is also helpful for determining the typical, sweet-spot experience a user will have with the new sites and apps we are designing.

We started working with a new client in the B2B technology space a few months back and started digging into their current Google Analytics data from the first day. First we wanted to take a look at the current typical screen sizes to determine if a responsive or adaptive (or hybrid) design was the best approach for them, and to determine which breakpoints were optimal.

Screen resolution settings in Google Analytics

We pulled the screen resolution out of Google Analytics for the last quarter and added a secondary dimension of Device Category (so we could filter to just desktop, removing tablet and mobile). We then imported the data into Excel so we could extract and average the horizontal and vertical resolutions separately. We ended up with the following 'average' screen size:

1,119 x 798 pixels
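
For readers who would rather script this step than use Excel, here is a minimal Python sketch of the same averaging. It assumes the export has been reduced to (resolution label, visits) pairs for desktop only and that the average is weighted by visits; the article does not say how the Excel average was weighted. The function names and sample rows are hypothetical and are not the client's data.

```python
def parse_resolution(label):
    """Turn a label such as '1024x768' into (1024, 768); return None if it does not parse."""
    parts = label.lower().split("x")
    if len(parts) != 2:
        return None
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None


def average_resolution(rows):
    """Visit-weighted mean width and height over (label, visits) rows."""
    total_visits = 0
    width_sum = 0
    height_sum = 0
    for label, visits in rows:
        parsed = parse_resolution(label)
        if parsed is None:
            continue  # skip unusable values such as "(not set)"
        width, height = parsed
        total_visits += visits
        width_sum += width * visits
        height_sum += height * visits
    if total_visits == 0:
        raise ValueError("no usable rows")
    return width_sum / total_visits, height_sum / total_visits


# Illustrative rows only (assumption): three visits at 1024x768, one at 1920x1080.
sample_rows = [("1024x768", 3), ("1920x1080", 1), ("(not set)", 5)]
average_width, average_height = average_resolution(sample_rows)
```

Averaging width and height separately, as here, gives a size that may not match any real monitor, which is one reason the average alone is a weak guide.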

With an aspect ratio close to 4:3 (about 4:2.85), this seemed low compared to others we see, but was within the realms of the possible. However, the average screen size is not that helpful for determining design sizes, as we need to know the full distribution of screen sizes. Taking the raw horizontal screen resolutions and turning them into a cumulative chart produced this more visual representation of the data.

Screen Resolution 2

The chart basically shows, for each horizontal screen size, what % of users have a screen larger than that size - so 1 pixel has 100% of users with larger screens, etc. It also filters out mobile and tablet for now.
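
The cumulative view described above can be sketched in a few lines of Python. This is an illustration only: the function name and the sample counts are hypothetical, and it reports the share of visits at or above each width, which is one reasonable reading of "larger than that size".

```python
def cumulative_share(visits_by_width):
    """For each width, the percentage of visits on screens at least that wide.

    visits_by_width maps a horizontal resolution in pixels to a visit count.
    Returns (width, percent) pairs sorted by ascending width.
    """
    total = sum(visits_by_width.values())
    if total == 0:
        raise ValueError("no visits")
    remaining = total
    curve = []
    for width in sorted(visits_by_width):
        curve.append((width, 100.0 * remaining / total))
        remaining -= visits_by_width[width]
    return curve


# Made-up desktop counts (assumption), shaped like the bot-heavy data in the article.
sample_counts = {1024: 75, 1366: 15, 1920: 10}
curve = cumulative_share(sample_counts)
```

A curve that stays at 100% up to 1,024 pixels and then falls off sharply is the kind of shape that should prompt a closer look at the traffic.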

Instantly we can see a problem. 75% of all users seem to be at a screen resolution of 1024×768. And, as we have filtered out tablets, this must mean a lot of people have very small screens. In fact, a cursory glance at the Best Buy website shows that it's not even possible to buy a computer or monitor with such a low resolution today.

So, we have a bot problem.

Some recent research from Incapsula showed that more than half of internet traffic comes from bots: some good (like Google's search crawlers and monitoring tools) and some nefarious (like scrapers and malware). And both our friendly and less friendly bots tend to report as 'standard' agents, i.e., 1024×768 running on Windows and IE.

So, with our known infestation of bots we went back to Google Analytics and worked closely with the client to identify and filter out as many of these bots as possible. The easy ones (like Google) were already filtered. Some more were easy to spot, based on identifiers and IP networks. But for others we had to dig deeper and look at behavior on the site, such as visiting every page, something a real user would rarely, if ever, do!
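
As an illustration of the behavioral check mentioned above (a visitor who touches nearly every page), here is a rough Python sketch. It is not the filter we built in Google Analytics; the `Visitor` record, the `flag_crawler_like` function and the 90% coverage threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Visitor:
    """Hypothetical per-visitor summary built from exported page-view data."""
    visitor_id: str
    pages_viewed: set


def flag_crawler_like(visitors, site_pages, coverage_threshold=0.9):
    """Return IDs of visitors who viewed at least coverage_threshold of all site pages."""
    if not site_pages:
        raise ValueError("site_pages must not be empty")
    flagged = []
    for visitor in visitors:
        coverage = len(visitor.pages_viewed & site_pages) / len(site_pages)
        if coverage >= coverage_threshold:
            flagged.append(visitor.visitor_id)
    return flagged


# Illustrative data only (assumption): a four-page site and two visitors.
site_pages = {"/", "/products", "/pricing", "/contact"}
visitors = [
    Visitor("human-like", {"/", "/pricing"}),
    Visitor("crawler-like", {"/", "/products", "/pricing", "/contact"}),
]
suspects = flag_crawler_like(visitors, site_pages)
```

A list like this is best treated as candidates to review with the client, since the right threshold depends on how large the site is.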

The filtering had a dramatic effect, as the chart below with total traffic to the site over the last five months shows.

Visits per day 3

Once we had stabilized the data, we could revisit the screen resolution and other key data that would help inform the design. The chart below shows the "before & after" data for the horizontal screen resolution.

Horizontal Resolution 4

This totally transforms the view of the physical setup of our users' desktop devices. The data also looks more 'natural', with jumps at standard resolutions. The average resolution moved from 1,119×798 to 1,501×922, a really significant change.

So, now that we were confident that we had killed off most of the bots (victory to the humans!) we could get back to the first question - what screens should we optimize the design to?

Merging the tablet and mobile data back in, and adding the vertical resolution to the horizontal chart, we end up with this true picture of the users of the current experience.

Visits 5

We can quickly see the main device resolutions and the share of users seeing each. The big takeaway for this project was that 65% of the users were seeing the site on a screen wider than 1,200 pixels, and almost 20% were using monitors at the full HD horizontal resolution of 1,920.

This changed our design strategy completely. Instead of capping the design at a screen width of 1,024, we are now focusing on optimizing the desktop experience at 1,200 and the tablet experience at resolutions closer to 1,000, all within an overall responsive framework.

The lessons?

Always validate your web analytics to ensure your data is correct. Bots are insidious on the web, so filter them out. And as the actual desktop users are using big screens, optimize for them, not just for tablets and mobile.