SnowPlow update: first part of source code published

March 13th, 2012 by Yali

The SnowPlow technical blog post series

Since introducing SnowPlow 3 weeks ago, we’ve been contacted by a number of people who are interested in implementing SnowPlow on their websites. We are now working with a couple of companies on initial implementations, and have made the first part of the SnowPlow source code available in its own Github repository here.

The release took longer than expected because we were keen to carefully document everything that was released, to make it as easy as possible to implement the code. All the documentation is available on the Github repository, alongside the source code.

Read the rest of this entry »

Call for developers: interested in spending a couple of hours getting young people into coding?

March 5th, 2012 by Yali

We’re looking for developers who would be willing to help out with 1-2 hour sessions of an after-school Coding Club at Burlington Danes Academy, on one or more Wednesday afternoons from 3.30pm.

We started running the club last week, as part of a broader initiative we’re developing to get more young people interested in technology, and programming in particular (more details below). The club is a pilot: if it is successful, we’re looking to roll out similar initiatives elsewhere in London.

What happens at the Coding Club?

20-30 students with an interest in programming at Burlington Danes stay behind after school on Wednesday afternoons. The sessions are very informal: we set them challenges and problems to solve using Python. The students work in pairs and use the excellent Hello World! guidebook for assistance. Volunteer developers walk around the room, answer questions and provide advice and hints when asked by the students. Read the rest of this entry »

How “slowcubators” will replace the traditional corporation

February 23rd, 2012 by Alex

As startuppers ourselves, at Keplar we do a lot of thinking about what the future shape of the startup ecosystem looks like in the UK, Europe and beyond. Yesterday Fast Company ran an interesting profile of successful West Coast incubator Y Combinator, entitled “Paul Graham: Why Y Combinator Replaces The Traditional Corporation”. The article broadly set out how Y Combinator is building a sophisticated ecosystem of complementary, collaborative startups who help each other out on their path to building traction and profitability, in ways which rival the more ossified, hierarchical form of a traditional corporation; the piece has engendered a lively debate on Hacker News.

Pilot fish accompanying an oceanic whitetip shark in the Red Sea

In my response in the thread I agreed with the underlying trend (keiretsu collectives of innovative, smaller companies outcompeting traditional corporations), but argued that Y Combinator is the wrong poster child for this movement, for a couple of reasons:

Read the rest of this entry »

Introducing SnowPlow: the world’s most powerful web analytics platform

February 21st, 2012 by Yali

The SnowPlow technical blog post series

Download the SnowPlow brochure

What is SnowPlow?

SnowPlow does three things:

  1. Identifies users, and tracks the way they engage with one or more websites
  2. Stores the associated data in a scalable “clickstream” data warehouse
  3. Makes it possible to leverage a big data toolset (e.g. Hadoop, Pig, Hive) to analyse that data

That sounds like a web analytics tool. There are many of them. (And many of them are free, including Google Analytics.) Why build a new one?

We built SnowPlow for our own web analytics purposes. We recognise that we are unusual web analysts, for a number of reasons, but for us, there are a lot of frustrations with the different solutions that are currently available:

1. We want access to atomic, customer-level and event-level data

Read the rest of this entry »

Approaches to developing your company’s analytics capability in a big data world

February 10th, 2012 by Yali

Whilst there is a lot to love about big data, it gives CIOs and business folks reason to moan: using it means developing expertise in new approaches to analytics, new technologies and new business processes. However, for companies that have not, to date, successfully implemented a data warehouse, big data offers one huge reason to smile: it makes it possible for companies to develop their data warehousing platform in an incremental, step-by-step fashion, with much lower initial costs than the traditional, “big bang” approach that was the orthodoxy in the pre-big data world.

In this blog post, we explain why that is the case, and outline what we believe is the best approach for companies looking to build out an internal analytics capability that takes advantage of big data technologies like Hadoop. Read the rest of this entry »

Using Tableau and Google Analytics to analyse the drivers of growth in online retail

February 8th, 2012 by Yali

The Google-export-to-csv blog post series

At Keplar, we often find ourselves using web analytics data as one source of data to help us understand how our clients’ businesses (particularly in online retail) have performed in the last 3-5 years, and what has driven changes in that performance. We tend to perform that analysis by extracting the data from the web analytics software (normally Google Analytics) so that we can visualise it in a way that makes it easy to spot the drivers of growth and home in on the causal factors responsible for any changes in performance. In this blog post, we run through the steps to perform this analysis quickly, because they are steps that any online retailer (or in fact web business in general) would want to perform.

Three very common questions for an online retailer to ask are:

  1. How have sales and traffic grown in my online store over time?
  2. What has driven that growth?
  3. What can I do to increase growth in a cost-effective way?

Web analytics data can be helpful in answering the first two questions – by identifying:

  1. Where the people who visited and bought from a website came from
  2. How different sources of traffic have changed their contribution over the time period in question

These can then form a basis for answering the third question.

A plot like this clearly shows the relative importance of different channels in driving traffic growth

Unfortunately, Google Analytics does not provide an easy way to visualise, via its web interface, how the relative contributions of different traffic sources have changed over time. Fortunately, Google does make it easy to grab the relevant data from the Google Analytics API and ultimately generate the above visualisation. In this blog post, I will show how to perform this analysis, using Google-Analytics-Export-to-CSV to extract the data, and Tableau to quickly graph and drill into the results.
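To make the idea of "relative contribution over time" concrete, here is a minimal Python sketch of the underlying calculation: each channel's share of total visits within each month. It is not part of Google-Analytics-Export-to-CSV or Tableau; the function name `channel_shares`, the row layout (month, channel, visits) and the sample figures are all assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical row layout: (month, channel, visits).
Row = Tuple[str, str, float]


def channel_shares(rows: Iterable[Row]) -> Dict[Tuple[str, str], float]:
    """Return each channel's share of total visits within each month."""
    month_totals: Dict[str, float] = defaultdict(float)
    channel_totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for month, channel, visits in rows:
        month_totals[month] += visits
        channel_totals[(month, channel)] += visits
    # Skip months with no visits to avoid dividing by zero.
    return {
        (month, channel): visits / month_totals[month]
        for (month, channel), visits in channel_totals.items()
        if month_totals[month] > 0
    }


# Illustrative, made-up figures (not Psychic Bazaar data).
sample_rows = [
    ("2011-12", "organic", 50.0),
    ("2011-12", "cpc", 25.0),
    ("2011-12", "referral", 25.0),
    ("2012-01", "organic", 100.0),
    ("2012-01", "cpc", 100.0),
]

for (month, channel), share in sorted(channel_shares(sample_rows).items()):
    print(f"{month} {channel}: {share:.0%}")
```

Plotting these shares as a stacked area chart, with one band per channel, is one way to produce the kind of view described above.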

I also hope to demonstrate what we call train-of-thought-analysis – where a fast business intelligence tool such as Tableau is used to answer questions which suggest new questions, which can in turn be answered by follow-on analysis, in particular by drilling in on subsets of the data.

The steps presented below were performed for Psychic Bazaar, a startup specialist retailer in the mind, body and spirit sector. (Many thanks to the folks at Psychic Bazaar for being willing to share their data.) However, the same steps can be applied to any online shop (or indeed, any website) with Google Analytics integrated. And if you don’t have Tableau, you can download a 30-day trial version of the software, and treat this blog post as an introduction to Tableau. Alternatively, comparable analyses should be easy to perform using other BI tools (e.g. Microstrategy or Qlikview).

Read the rest of this entry »

Installing Google-Analytics-export-to-CSV

January 31st, 2012 by Yali

The Google-export-to-csv blog post series

Google-Analytics-export-to-CSV is a straightforward, command-line tool for getting data out of Google Analytics (via the API) and into a CSV file, so you can open it in your favourite analytics program.

For an introduction, see here. For instructions on how to use the tool to run queries and extract data, see here. The program can be downloaded here. It is packaged as a ZIP file. It only needs to be unzipped before it can be used at the command line.

The program requires Java to run. If you do not have a Java runtime environment installed, you will need to install one. The following is a step-by-step guide to installing Google-Analytics-export-to-CSV (including Java if necessary) and running your first query:

Read the rest of this entry »

Using Google-Analytics-export-to-CSV: a step-by-step guide

January 31st, 2012 by Yali

The Google-export-to-csv blog post series

Google-Analytics-Export-to-CSV is a free (open source) command-line tool that makes it easy to pull large volumes of data out of Google Analytics and process it in your favourite analytics tool, including Tableau, R or (even) Excel. For an introduction see here; to download it, click here.

Extracting data from Google Analytics is a simple, 3-step process:

  1. Develop your query
  2. Run the query
  3. Import the results into an analytics tool

The three steps are described in more detail below.

Read the rest of this entry »

Introducing Google-Analytics-export-to-CSV: a fast, simple way to get your Google Analytics data into your favourite analytics program

January 31st, 2012 by Yali

The Google-export-to-csv blog post series

Update: this is the first in a complete blog post series. The full set of posts is as follows:

  1. Introducing Google-Analytics-export-to-CSV
  2. Installing Google-Analytics-export-to-CSV
  3. Using Google-Analytics-export-to-CSV: a step-by-step guide
  4. Using Tableau and Google Analytics to analyse the drivers of growth in online retail

The source code is now available on Github here: https://github.com/keplar/google-analytics-export-to-csv

A compiled version of Google-Analytics-export-to-csv is also available via Github here

———–

Download Google-Analytics-export-to-CSV, a free (open source), quick and simple tool for easily pulling data out of Google Analytics via the API.

Google Analytics contains a wealth of interesting data. Often, however, it makes sense to take the data out of GA and analyse it in a separate tool, e.g. Tableau, R or Excel. There are a number of reasons why this is sometimes desirable:

  1. A number of analyses are hard / clunky to do in Google Analytics via the web UI. (Indeed a number are impossible.)
  2. Whilst it is generally impossible to join data in Google Analytics with other data sources (e.g. CRM systems), it is often desirable to compare graphs alongside others generated from different data sources. This is much easier if both sets of data are available in the same analytics tool.

Because the tool uses Google’s Data Export API, it can extract much larger volumes of more detailed data than is possible using the web UI: up to 7 dimensions and 10 metrics with each pull. Further, if the query you run with it returns more than 10,000 lines of data (the limit returned by the API in a single call), the tool automatically makes extra calls to fetch the additional data and adds it to your CSV, so the CSV has all the data you require.
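The paging behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch, not the tool's own code (the tool itself runs on Java). `fetch_page`, `fetch_all_rows`, `write_csv` and `make_fake_api` are hypothetical names, the 1-based start index is an assumption about how the API pages, and the in-memory fake stands in for real API calls so the example runs without credentials.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List, TextIO

PAGE_LIMIT = 10_000  # per-call row limit mentioned in the post

Row = Dict[str, str]
# Hypothetical signature: fetch_page(start_index, max_results) -> rows
FetchPage = Callable[[int, int], List[Row]]


def fetch_all_rows(fetch_page: FetchPage, page_limit: int = PAGE_LIMIT) -> Iterator[Row]:
    """Keep requesting pages until a short (or empty) page signals the end."""
    start_index = 1  # assumption: the API counts rows from 1
    while True:
        page = fetch_page(start_index, page_limit)
        yield from page
        if len(page) < page_limit:
            break
        start_index += len(page)


def write_csv(rows: Iterable[Row], out: TextIO) -> int:
    """Write rows to CSV, taking the header from the first row; return the row count."""
    writer = None
    count = 0
    for row in rows:
        if writer is None:
            writer = csv.DictWriter(out, fieldnames=list(row.keys()))
            writer.writeheader()
        writer.writerow(row)
        count += 1
    return count


def make_fake_api(total_rows: int) -> FetchPage:
    """Stand-in for the real API: serves slices of made-up in-memory data."""
    data = [{"date": f"day-{i}", "visits": str(i % 7)} for i in range(total_rows)]

    def fetch_page(start_index: int, max_results: int) -> List[Row]:
        offset = start_index - 1
        return data[offset:offset + max_results]

    return fetch_page


if __name__ == "__main__":
    buffer = io.StringIO()
    # A small page limit keeps the demo quick; 25 rows need three calls.
    written = write_csv(fetch_all_rows(make_fake_api(25), page_limit=10), buffer)
    print(f"wrote {written} rows")
```

Stopping on a short page means a result set that is an exact multiple of the page size costs one extra, empty call. That is a simple trade-off for a sketch that does not rely on the API reporting a total row count.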

Google-Analytics-export-to-CSV is a command-line tool we developed internally at Keplar to make it easy for us to grab data out of our own (or our clients’) Google Analytics accounts, enabling us to perform more powerful analyses, faster. We are now making it available to everyone on the internet, for free, as an open source project.

In the next couple of days we plan to make the source code available on Github. In the meantime, if you are a data analyst hungry to get your Google Analytics data out into your favourite analytics tool, you can download it here.

Why big data matters to companies in retail and media. (A straightforward guide for business folk)

January 30th, 2012 by Yali

A downloadable version of this blog post (in PDF format) is available here

Introduction

The term “big data” is very much in vogue at the moment. In this white paper, we explore what big data means, what opportunities it presents to companies in the retail and media sectors and outline what companies need to do to take advantage of big data.

The purpose of this white paper is to provide an overview of the opportunities, challenges and success factors around big data. There is a lot to explore in each of the areas that we introduce: we plan to do this in subsequent white papers and blog posts, all of which will be made available on the Keplar website.

OK - so we should up our marketing spend on this customer segment?

A little history

The idea that important decisions in companies should be data-driven is much older than big data. Toyota’s pioneering use of data to drive efficiency in their manufacturing process helped them to steal a march on their American competitors back in the 1970s. Fast forward to Nineties Britain, and Tesco’s pioneering use of customer data collected via their club card scheme (run by Dunnhumby) helped them to establish themselves as the largest supermarket in the UK and in the top ten retailers globally.
Read the rest of this entry »