data

Oh, I'm sure it's probably nothing

How we do (or don't) think about null values and why the polyglot push makes it all the more important

Update: grouped data quality check PR merged to dbt-utils

After a prior post on the merits of grouped data quality checks, I demo my newly merged implementation for dbt

Using databases with Shiny

Key issues when adding persistent storage to a Shiny application, featuring {golem} app development and DigitalOcean serving

Make grouping a first-class citizen in data quality checks

Which of these numbers doesn’t belong? -1, 0, 1, NA. You can't judge data quality without data context, so our tools should enable as much context as possible.

Update: column-name contracts with dbtplyr

Following up on 'Embedding Column-Name Contracts... with dbt' with a demo of my new dbtplyr package, which further streamlines the process

A lightweight data validation ecosystem with R, GitHub, and Slack

A right-sized solution to automated data monitoring, alerting, and reporting using R (`pointblank`, `projmgr`), GitHub (Actions, Pages, issues), and Slack

Understanding the data (error) generating processes for data validation

A data consumer's guide to validating data based on the failure modes data producers try to avoid

A Tale of Six States: Flexible data extraction with scraping and browser automation

Exploring how `Playwright`'s headless browser automation (and its friends) can help unite the states' data

Embedding column-name contracts in data pipelines with dbt

dbt supercharges SQL with Jinja templating, macros, and testing -- all of which can be customized to enforce controlled vocabularies and their implied contracts on a data model

Causal design patterns for data analysts

An informal primer on causal analysis designs and data structures