Data and Cloud Technology

Most of the work I’ve done can’t be shared for legal reasons, but here is the stuff that can. I’m working on generating more, so please reach out to collaborate!

I answer questions about Analytics and ETL for Google CloudOnAir. Learn about separating data storage and compute, and analytical tools like BigQuery and Dataflow.

Videos

Scale your Python code by using Dask and RAPIDS to distribute your processing across GPUs. Find out how to do this on Google Cloud Platform. Recorded at the Dask Summit 2021.

Articles

June 2022: Transform satellite imagery from Earth Engine into tabular data in BigQuery

Geospatial cloud tools can help customers solve business problems using spatial datasets, high-performance computation, and modeling. One of the most common questions customers ask is how to move data from Earth Engine to BigQuery; read this article to find out more.

March 2023: Unlocking Retail Location Data with CARTO and BigQuery

Add geospatial intelligence to your retail use cases by leveraging the CARTO platform on top of BigQuery.

Feb 2021: Scale model training in minutes with RAPIDS + Dask and NVIDIA GPUs on Vertex

Follow this tutorial to use Dask with GPUs for a major reduction in model training time, with distributed processing on Google’s AI Platform.

August 2022: Spatial clustering on BigQuery - Best practices

BigQuery performs spatial clustering under the hood to increase performance and reduce the cost of geospatial queries. Learn how it works, and how best to use other spatial indexes like geohash and H3, in this blog post.

Feb 2023: Part 1 - How to choose between regional, dual-region and multi-region Cloud Storage

Part 2 - How to migrate Cloud Storage data from multi-region to regional

Are you sure your Cloud Storage buckets should be multi-region? Read this blog series to find out if, and how, you should migrate to regional storage.