Time series anomaly detection with BigQuery ML and Dataform

In my previous post, I gave a brief overview of Dataform in BigQuery. Now that your Dataform project is mature enough to support business decisions, it’s time to try something new and build a machine learning anomaly detection pipeline with BigQuery ML. Let’s assume that you already store the daily numbers of sign-ups and payments in BigQuery. You want to get a notification if there is an unusual drop or spike in any of these metrics....

February 24, 2023 · 6 min · Alex Danilin

BigQuery, Airbyte, Intercom, and another story of Google Cloud Platform cost optimization

My first story on Google Cloud Platform cost optimization was about the hidden secrets of Looker Studio (formerly Google Data Studio). Let me tell you another story about how to find the source of soaring costs. Some cool new features for cost optimization are now available on GCP, so let’s start. The support team lead reached out to me asking if we could get the content of comments from Intercom....

November 18, 2022 · 6 min · Alex Danilin

Building data pipelines in Google BigQuery with Dataform

I’ve been using Google BigQuery since its public release, and I like where it has been going over the years. The team behind the product is doing an amazing job, and I do not remember any public feature with no apparent use case. For me, one of the most noticeable branches of BigQuery’s evolution runs from a simple storage and query engine to a built-in scheduled queries mechanism and, later on, to non-linear, interconnected query logic for data manipulation....

October 10, 2022 · 7 min · Alex Danilin

How to Stop Wasting Money on Google BigQuery in Google Data Studio

Update. I updated this article with a new case of BigQuery cost optimization. One of my responsibilities as a product manager is tracking the product’s influence on key metrics like revenue or MRR. I also prefer to share this data with my team so that we are on the same page while we are developing new features or improving something that already exists. A dashboard with a number of key metrics is a good choice here....

November 26, 2021 · 6 min · Alex Danilin

Constructing SQL Queries Using Google BigQuery Scripting

Let’s start with a simple question: for each tool in my product, what fraction of daily active users visits that tool each day? Maybe it is not so simple, so let me add some assumptions to simplify the solution. A daily active user is one who visits at least one tool per day. You have a lot of tools in the product, and you add new ones from time to time....

March 15, 2021 · 8 min · Alex Danilin