Adding Geo and ISP data to your analytics hits with Snowplow and Cloudflare Workers
In this post we'll look at how to add geo and ISP data to your analytics hits with Snowplow and Cloudflare Workers, an approach that you can also re-use for GA4.
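As a taste of the approach: Cloudflare exposes geo and ISP fields on `request.cf`, so a Worker sitting in front of your collector can copy those fields onto the hit before forwarding it. Here's a minimal sketch, assuming a proxy setup; the collector hostname and query parameter names are placeholders, not the ones used in the post:

```ts
// Minimal sketch: enrich an analytics hit with the geo/ISP fields Cloudflare
// exposes on request.cf, then forward it to the collector.
// request.cf is provided by the Workers runtime (typed via @cloudflare/workers-types);
// "collector.example.com" and the parameter names below are placeholders.
export default {
  async fetch(request: Request): Promise<Response> {
    const cf = (request.cf ?? {}) as Record<string, string | number | undefined>;
    const url = new URL(request.url);

    // Copy the interesting request.cf fields onto the hit as query parameters.
    url.searchParams.set("geo_country", String(cf.country ?? ""));
    url.searchParams.set("geo_city", String(cf.city ?? ""));
    url.searchParams.set("isp_asn", String(cf.asn ?? ""));
    url.searchParams.set("isp_org", String(cf.asOrganization ?? ""));

    // Forward the enriched hit to the collector.
    url.hostname = "collector.example.com";
    return fetch(new Request(url.toString(), request));
  },
};
```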
Running Snowplow for your (web) analytics pipeline too expensive? Here's a €0.02/day minimal, serverless version of Snowplow open source that you can deploy for your blog or website with Terraform (on GCP/BigQuery) in 5 minutes, giving you full ownership of a web and app analytics pipeline from data collection to custom data models (👋 goodbye Google Analytics).
Bots usually run on one of the major cloud providers, so identifying them goes a long way in determining the quality of your traffic. Whether that's for web analytics or threat mitigation, it's useful to have an overview of cloud IP ranges to use in bot scoring.
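Conceptually the check is simple: take the hit's IP and see whether it falls inside any of the ranges the providers publish (e.g. AWS's ip-ranges.json). A rough TypeScript sketch, IPv4 only, with placeholder ranges rather than the real lists:

```ts
// Hedged sketch: flag an IPv4 address that falls inside a cloud-provider CIDR
// range as a likely bot. The ranges below are illustrative placeholders; the
// real lists are published by each provider.
const CLOUD_RANGES: string[] = ["3.0.0.0/8", "34.64.0.0/10", "20.0.0.0/8"]; // placeholders

function ipToInt(ip: string): number {
  // Fold the four octets into a single unsigned 32-bit integer.
  return ip.split(".").reduce((acc, octet) => (acc << 8) + Number(octet), 0) >>> 0;
}

function inCidr(ip: string, cidr: string): boolean {
  const [base, bits] = cidr.split("/");
  const mask = bits === "0" ? 0 : (~0 << (32 - Number(bits))) >>> 0;
  return (ipToInt(ip) & mask) === (ipToInt(base) & mask);
}

// A hit coming from a cloud IP is a strong signal in bot scoring.
export function isLikelyBot(ip: string): boolean {
  return CLOUD_RANGES.some((range) => inCidr(ip, range));
}
```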
Snowplow schemas are a great way to codify expected data in JSON format. Using GitHub Actions you can make them even more powerful by automatically checking for typos, validity, and other errors, as well as directly publishing them to your production environment with no manual action.
With the recent update to Google Search Console (GSC) allowing exports to BigQuery, we can now leverage some powerful features of BigQuery to do text processing and extract topics from our search queries with a simple JavaScript UDF.
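The UDF itself can stay very small. As an illustration of the kind of logic involved (in BigQuery the body would be plain JavaScript inside a `CREATE TEMP FUNCTION ... LANGUAGE js` statement; the function name and stop-word list below are made up for the example):

```ts
// Illustrative only: pull candidate "topics" out of a search query by
// lowercasing, tokenising and dropping stop words. In BigQuery this logic
// would live inside a JavaScript UDF applied to each query string.
const STOP_WORDS = new Set(["the", "a", "an", "of", "for", "to", "how", "what", "is"]);

export function extractTopics(query: string): string[] {
  return query
    .toLowerCase()
    .split(/[^a-z0-9]+/) // split on anything that isn't alphanumeric
    .filter((token) => token.length > 2 && !STOP_WORDS.has(token));
}

// extractTopics("how to export google search console to bigquery")
// -> ["export", "google", "search", "console", "bigquery"]
```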
Web analytics still feels 'messy' in 2023. Why is it so hard to solve the problem of web analytics? Let's dive into some of the misconceptions that fuel the mess, like the ideas that websites are easy, that they are visited by people, that web analytics is about tracking people, that we have all the tools we need, and that web analytics is actually important.
Managing incrementality (change over time) in a large database is hard. Dbt can help us alleviate some of the pain by giving us a clear set of incremental strategies to choose from. Let's look at updating an example sales table with actuals and estimates over time.
Want to make data from BigQuery publicly available? Create a simple API with BigQuery scheduled queries, JSON exports and a Cloudflare Worker to map the right URL to the right data.
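The Worker side of that can be as small as a path-to-file lookup. A hedged sketch, assuming the scheduled query exports JSON files to a public Cloud Storage bucket (the bucket name and path scheme here are placeholders):

```ts
// Hedged sketch of the Worker part: map an incoming path to a JSON file that a
// BigQuery scheduled query exported to a public bucket.
// The bucket URL and the /api prefix are placeholders, not the ones in the post.
const BUCKET = "https://storage.googleapis.com/my-public-exports"; // placeholder

export default {
  async fetch(request: Request): Promise<Response> {
    const { pathname } = new URL(request.url);
    // e.g. /api/daily-sales -> .../daily-sales.json
    const target = `${BUCKET}${pathname.replace(/^\/api/, "")}.json`;

    const upstream = await fetch(target);
    if (!upstream.ok) {
      return new Response("Not found", { status: 404 });
    }
    return new Response(upstream.body, {
      headers: {
        "content-type": "application/json",
        "cache-control": "public, max-age=3600", // cache at the edge for an hour
      },
    });
  },
};
```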
Dbt is a great tool for data transformation. Snowplow is great for collecting web analytics data. What if you could harness the power of both for just a few cents a day by running dbt in a Docker container on Google Cloud Run Jobs?
Over the last few years SQL has really started embracing its second adolescence. That's cool, but what if you could easily extend your queries beyond the SQL domain and add in Python- and JavaScript-based serverless functions to get real-time stock information, enrich location data, or build a language detection function!? That's what we'll do.
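For the language detection case, the serverless side boils down to a small HTTP endpoint that speaks BigQuery's remote function contract: it receives a `calls` array (one entry per row) and answers with a `replies` array of the same length. A rough sketch with a placeholder heuristic standing in for a real language-detection library:

```ts
// Hedged sketch of the serverless side of a BigQuery remote function:
// a POST body carrying a "calls" array (one array of arguments per row) and a
// JSON response with a "replies" array of the same length.
// detectLanguage is a naive placeholder, not a real detection library.
import express from "express";

function detectLanguage(text: string): string {
  // Placeholder heuristic: real code would call a language-detection library.
  if (/[àâçéèêëîïôùûü]/i.test(text)) return "fr";
  if (/[äöüß]/i.test(text)) return "de";
  return "en";
}

const app = express();
app.use(express.json());

app.post("/", (req, res) => {
  const calls: string[][] = req.body.calls ?? [];
  // One reply per incoming row, in the same order.
  const replies = calls.map(([text]) => detectLanguage(text ?? ""));
  res.json({ replies });
});

app.listen(Number(process.env.PORT) || 8080);
```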