Simple web analytics with Python and Pandas
data-analysis pandas python web-analyticsWe are going to do some analytics with our web visits data. As a simple report we are going to obtain the unique and total visits respect the date and many other paramenters like browser, page wisited, language, operative system...
Requirements
% pip install python-dateutil pandas
Getting and filtering the data
Let's assume the structure of our data like:
uuid | ip | city | country_code | country_name | language | browser | os | page | date | ... |
---|---|---|---|---|---|---|---|---|---|---|
ea2d3169-2b71-4beb-9665-108d302c3a67 | 78.146.232.107 | London | UK | United Kingdom | EN | Firefox | Linux | /foo | 2015-02-12 09:25:17.770175 | ... |
bdb18e99-fc80-4d4b-b4a1-286e67ba374f | 95.142.167.120 | Paris | FR | France | FR | Safari | Mac | /bar | 2015-02-09 21:11:02.134322 | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
The python code:
from datetime import datetime, timedelta import pandas as pd source = pd.read_csv('data.csv', index_col='uuid', parse_dates=['date']) date_1 = datetime.utcnow() date_0 = date_1 - timedelta(days=30) data = source[(source['date'] > date_0) & (source['date'] < date_1)]
Aggregating the data
We'll obtain data structured as:
unique_visits | total_visits | |
---|---|---|
2015-01-13 | 90 | 140 |
2015-01-14 | 104 | 170 |
2015-01-15 | 80 | 193 |
... | ... | ... |
unique_visits | total_visits | |
---|---|---|
Linux | 76 | 111 |
Mac | 101 | 180 |
Windows | 40 | 73 |
The Python code:
def get_visits(groupby): ip_visits = data.groupby(groupby)['ip'] return pd.DataFrame( {'unique_visits': ip_visits.apply(lambda x: len(set(x))), 'total_visits': ip_visits.apply(len)}) data['date_day'] = data['date'].apply(datetime.date) visits_by_date = get_visits('date_day') # Redefine with date index to avoid lack of dates. visits_by_date = pd.DataFrame(visits_by_date, index=pd.date_range(date_0, date_1)).fillna(0) # Rest of filtered visits. visits_by_os = get_visits('os') visits_by_city = get_visits('city')