Data Infrastructure Layers 2: Compute

In my last post I described how you can think of your organization’s data infrastructure as a grid of blocks defined by category of use case and stage of the pipeline. Each block can be further broken down into three layers: Control, Compute and Storage. Last time I briefly described these layers, then discussed different …

Data Infrastructure Layers 1: Control

In my last two posts, I started to break down the types of areas where an organization might need to deploy data tools/infrastructure along two axes: the categories of common use cases and the stages that you’ll encounter in most of these use cases. You can think of these as defining a grid of functionality.

Categories of Data Use Cases

As the head of software engineering at a small startup with ambitions to grow much larger, I think a lot about how to design data infrastructure that will both address our immediate needs and adapt to future needs. I’ve seen what happens at large companies when each team has their own set of data infrastructure: …