Building Your Data Governance Toolbox

When you first start learning about data governance, it often seems like a hairball of tightly knit ideas where you can’t understand any one piece until you’ve studied and learned the whole thing. I’m not an expert by any stretch, but I’ve wrestled with learning about data governance long enough to find a way of…

Layers of Data Infrastructure 3: Storage

In my last two posts I’ve explored the high-level design decisions related to two of the three layers that define each pipeline stage of each category of data use cases: Control and Compute. The Control layer defines how the user interacts with the system, while the Compute layer defines how the system does the work.

Data Infrastructure Layers 2: Compute

In my last post I described how you can think of your organization’s data infrastructure as a grid of blocks defined by category of use case and stage of the pipeline. Each block can be further broken down into three layers: Control, Compute and Storage. Last time I briefly described these layers, then discussed different…

Data Infrastructure Layers 1: Control

In my last two posts, I started to break down the types of areas where an organization might need to deploy data tools/infrastructure along two axes: the categories of common use cases and the stages that you’ll encounter in most of these use cases. You can think of these as defining a grid of functionality.

Categories of Data Use Cases

As the head of software engineering at a small startup with ambitions to grow much larger, I think a lot about how to design data infrastructure that will both address our immediate needs and adapt to future needs. I’ve seen what happens at large companies when each team has their own set of data infrastructure: …

Requirement Diameters and Abstraction

In my last post, I discussed an idea called Requirement Diameters – the distance between all the lines of code that enforce a given software requirement – and the coding principle that these diameters should be kept as small as possible, particularly for requirements that are more likely to change. In this post, I will…

Code Factoring and Requirement Diameters

Experienced software teams know that to agree on the design of a project, you must first clearly define and communicate its requirements. But even when this is done well, disagreements over code design often persist due to different understandings of how the requirements are likely to change as the project evolves, and how to prepare…

Writing Software from the Outside In

Whenever you’re developing software, a certain amount of refactoring and rewriting is inevitable. This is sometimes due to a new idea that will simplify the design or a change to the project requirements. But unfortunately, it often also happens because of a misunderstanding about how the software will connect and interact with its external environment.