Time series anomaly detection with BigQuery ML and Dataform
In my previous post, I gave a brief overview of Dataform in BigQuery. Now that your Dataform project is mature enough to support business decisions, it’s time to try something new and build a machine learning anomaly detection pipeline with BigQuery ML. Let’s assume that you already store the number of daily sign-ups and payments in BigQuery. You want to get a notification if there is an unusual drop or spike in any of these metrics....
BigQuery, Airbyte, Intercom, and another story of Google Cloud Platform cost optimization
My first story on Google Cloud Platform cost optimization was about the hidden secrets of Looker Studio (formerly Google Data Studio). Let me tell you another story about how to find the source of soaring costs. Some cool new features for cost optimization are now available on GCP, so let’s start. The support team lead reached out to me asking whether we could get the content of comments from Intercom....
Building data pipelines in Google BigQuery with Dataform
I’ve been using Google BigQuery since its public release and I like where it has been going all these years. The team behind the product is doing an amazing job and I do not remember any public feature with no apparent use case. One of the most noticeable branches of BigQuery’s evolution for me is the path from a simple storage and query engine to an internal scheduled queries mechanism and, later on, a non-linear interconnected query logic for data manipulation....
How to Stop Wasting Money on Google BigQuery in Google Data Studio
Update. I updated this article with a new case of BigQuery cost optimization. One of my responsibilities as a product manager is tracking the influence of the product on key metrics like revenue or MRR. I also prefer to share this data with my team so that we are on the same page while we are developing new features or improving something that already exists. A dashboard with a number of key metrics is a good choice here....
How to Use Self-made Service Account Key with Expiration Date on Google Cloud Platform
One of the best practices for storing service account keys is to rotate them on a regular basis. You can do that manually, but it would be much better to have a mechanism that doesn’t let you miss the key renewal date. I will use some gcloud console commands here, but they are not necessary, as you can do the same in the Google Cloud Platform (GCP) interface....
Improving product decision confidence with Kano model
As a product person, you are usually trying to find a balance between time to market and the amount of value delivered to your users. In other words, if something produces a lot of value and you don’t have to spend ages before you release it - you are good to go. Of course, you should have some level of confidence in what you are going to do, and raising this level is a really good habit....
I Thought I Would Never Use Math Again
It was yet another online meeting where I discussed design patterns with my colleagues. Somewhere during the conversation I used my knowledge of disjunctive normal form in boolean logic as an argument to throw away some unnecessary UX elements. This saved us tens of hours on design validation and development. At the end of the meeting one guy told me that he thought he would never use math again after he graduated from university....
Constructing SQL Queries Using Google BigQuery Scripting
Let’s start with a simple question: for each tool in my product, what fraction of daily active users visits the tool each day? Maybe not so simple, so let me add some assumptions to simplify the solution. A daily active user is one who visits at least one tool per day. You have a lot of tools in the product and you add some new tools from time to time....
Vagrant - a quick how to
I must confess - I’m a Windows user :) Maybe it is not as cool as Linux or other operating systems, but it handles all the tasks I need, so who cares. But there are some rare cases when it is easier for me to use Ubuntu. Like when I took an online Docker course and had some issues with setting up Docker on Windows. I didn’t want to use raw VirtualBox or similar virtualization software because it is a good exercise to practice command line skills from time to time....
How to Grant an Access for R googlesheets4 package Using Google Cloud Platform Service Account
One of my weekly tasks brought me to a Google Cloud Platform instance to deploy a tiny R script as a cron job. This script makes some data transformations and writes the results to Google Sheets. I use the googlesheets4 package to communicate with Google Sheets and it works without friction on my local machine. But an Ubuntu cloud instance is not a local Windows machine. I do not use interactive mode or a browser on instances, so the usual OAuth authorization is not an option here....