Engineering

Software engineering is the discipline of making commitments you can keep, at a scale you cannot personally verify.


What Software Engineering Actually Is

Software engineering is not programming at scale.

Programming solves a problem. Engineering produces a solution that can be maintained, extended, tested, verified, and handed to someone else — someone who was not in the room when the original decisions were made.

The distinction matters because most of what goes wrong in software development is not a programming failure. The code often works. What fails is the commitment: the assertion that the software does what it claims, reliably, under conditions that were not fully enumerated when it was written.

Engineering is about making that commitment credible.


Architecture Is Strategic When It Enables Work You Haven't Yet Defined

Good architecture does not optimize for today's requirements.

It creates options. It reduces the cost of future decisions. It maintains the ability to change.

The measure of an architecture is not whether it solved the current problem elegantly. It is whether, two years later, the team can add a capability that was not anticipated without dismantling what was already built.

This is not the same as over-engineering. Over-engineering creates infrastructure for problems that never arrive. Strategic architecture creates leverage — shared capability whose cost is absorbed by the first use and whose value compounds across subsequent ones.

In practice this means:

The regulated software environment makes this concrete: an architecture that cannot produce traceability without separate documentation effort is not a strategic architecture. The evidence should be a byproduct of how the work is done, not a separate deliverable.

→ See also: Platform Strategy in Regulated Environments


Why Verification Is Becoming the Bottleneck

For most of software history, the constraint was generation.

Writing code was slow. Writing correct code was slower. The bottleneck was getting qualified engineers to produce working software at acceptable rates.

AI changes this. Generation is becoming cheap. The model writes the function. The agent proposes the architecture. The tool generates the tests. What does not change is the engineering judgment required to verify that the output is correct, safe, and appropriate for its context.

The bottleneck shifts from generation to verification.

This is not a small change. It means:

Quality gates become the critical path. If reviews, testing, and evidence generation were designed for a world where generation was slow, they are mismatched to a world where generation is fast. The same governance processes that were once a fraction of development cost become the dominant term.

The ability to evaluate output becomes more valuable than the ability to produce it. An engineer who can determine whether AI-generated code is correct, secure, and maintainable in a regulated context is more valuable than one who can only produce such code manually.

Verification must be continuous. Episodic quality gates — review at the end of a phase, test before release — cannot keep pace with generation that is effectively continuous. Evidence must be generated as a byproduct of work, not as a concluding ritual.
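
The first of these points can be made concrete with a little arithmetic. The sketch below uses invented hour counts (40, 10, and 4 are assumptions for illustration, not measurements) and a hypothetical helper, verification_share, to show how an unchanged governance process goes from a minor share of effort to the dominant one when generation gets cheaper.

```python
def verification_share(generation_hours: float, verification_hours: float) -> float:
    """Return the fraction of total effort spent on verification."""
    total = generation_hours + verification_hours
    if total <= 0:
        raise ValueError("total effort must be positive")
    return verification_hours / total


# Hypothetical feature: 40 h to write, 10 h to review, test, and document.
before = verification_share(generation_hours=40, verification_hours=10)

# Same feature with generation cut to 4 h and the governance process unchanged.
after = verification_share(generation_hours=4, verification_hours=10)

print(f"verification share: {before:.0%} -> {after:.0%}")
# prints "verification share: 20% -> 71%"
```

Nothing about the review process changed between the two cases. Only its share of the total did, which is why it ends up on the critical path.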

→ See also: Interpretable ML and Composite Kernels


Traceability as an Engineering Asset

In regulated environments, traceability is often treated as a compliance burden: a matrix to be populated, a document to be written, an artifact to be submitted.

This framing is wrong. It makes traceability expensive and adversarial.

The right framing: traceability is how an engineering team proves that the software does what it claims. It is the evidence that connects a user need to a requirement, a requirement to a design decision, a design decision to an implementation, an implementation to a test result, and a test result to documented evidence.

When that chain is built into the workflow — when tests are linked to requirements at the moment they are written, when design decisions are recorded where they are made — the cost of traceability is near zero. The evidence is generated as a byproduct of engineering discipline.
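
One way to link tests to requirements at the moment they are written is to tag each test with the requirement IDs it covers and let the trace matrix fall out of the test suite. The following is a minimal sketch under stated assumptions: the verifies decorator, the REQ-... identifiers, and the clamp_dose function are all hypothetical, and a real project would more likely hang this on its existing test framework and requirements tool.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical in-memory registry: requirement ID -> names of tests covering it.
_COVERAGE: Dict[str, List[str]] = defaultdict(list)


def verifies(*requirement_ids: str) -> Callable:
    """Tag a test function with the requirement IDs it provides evidence for."""
    def decorator(test_fn: Callable) -> Callable:
        for req_id in requirement_ids:
            _COVERAGE[req_id].append(test_fn.__name__)
        return test_fn
    return decorator


def clamp_dose(requested_ml: float, limit_ml: float) -> float:
    """Illustrative function under test (an assumption, not from the text)."""
    return min(requested_ml, limit_ml)


@verifies("REQ-012")
def test_dose_is_clamped_to_limit():
    assert clamp_dose(requested_ml=12.0, limit_ml=10.0) == 10.0


@verifies("REQ-012", "REQ-015")
def test_dose_below_limit_is_unchanged():
    assert clamp_dose(requested_ml=4.0, limit_ml=10.0) == 4.0


def trace_report(all_requirements: List[str]) -> Dict[str, List[str]]:
    """Map every known requirement to its tests; an empty list marks a gap."""
    return {req: list(_COVERAGE.get(req, [])) for req in all_requirements}


report = trace_report(["REQ-012", "REQ-015", "REQ-020"])
for req, tests in report.items():
    print(req, "->", tests if tests else "NO EVIDENCE")
```

The link is recorded when the test is written, so the report costs nothing extra to produce, and a requirement with no evidence (REQ-020 here) shows up as an explicit gap rather than being discovered during a retrospective audit.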

When that chain is assembled retrospectively — when someone reconstructs what was intended weeks after the implementation was complete — the cost is high and the quality is low. The engineer who wrote the code may not remember the requirement that motivated a particular implementation decision. The reviewer who approves the evidence was not present when the decision was made.

The difference between these two approaches is not documentation discipline. It is architectural and process design: building systems and workflows where evidence generation is natural rather than effortful.

→ See also: Platform Strategy in Regulated Environments


Platform Engineering as Systems Engineering

Platform engineering is often described as a subset of software engineering focused on infrastructure.

That framing is too narrow.

A software platform in a complex system — regulated devices, industrial systems, connected infrastructure — is a systems engineering problem. It requires understanding:

This is not what most software engineering teams are organized to do. Teams are organized around products. Products have owners, schedules, and revenue targets. The platform belongs to everyone and therefore often belongs to no one — or to whoever is willing to carry the cost.

The organizational design of platform engineering is as hard as the technical design. It requires explicit funding mechanisms, shared ownership models, and leadership willing to credit platform investment rather than charge it against individual product margins.

→ See also: Platform Strategy in Regulated Environments · Voyager, KLN, and the Consortium


Metrics That Actually Matter

The temptation in engineering management is to measure what is visible.

Lines of code. Story points completed. Test count. Code coverage percentage. Defect rate. Velocity.

These are not wrong. Some of them are useful. But they share a structural problem: they measure activity rather than capability, and they measure the present rather than the future.

The metrics that actually matter are harder to collect and slower to appear:

Time to verify a claim. When someone asserts that the software meets a requirement, how long does it take to confirm or refute that assertion? Fast verification means the engineering process is generating evidence continuously. Slow verification means evidence is being reconstructed, which is expensive and unreliable.

Cost of the second product. If the first product in a portfolio took twelve months, how long did the second take? Compressing second-product delivery is the signal that platform investment is working. No compression signals that every product is rebuilding from scratch.

Capability persistence. How much of the knowledge and capability built for one project survived to the next? Engineering organizations that lose capability at every transition are not compounding — they are resetting.

Defect discovery timing. Where in the development lifecycle are defects found? Finding defects late is generally far more expensive than finding them early. Shifting defect discovery earlier is one of the highest-leverage improvements an engineering organization can make.
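
Two of these metrics reduce to simple ratios once the underlying data is recorded. The sketch below is illustrative only: the function names, the seven-month figure for the second product, the phase labels, and the choice of which phases count as late are all assumptions.

```python
from collections import Counter
from typing import List, Set


def delivery_compression(first_months: float, second_months: float) -> float:
    """Fraction of delivery time saved on the second product (0 means none)."""
    if first_months <= 0:
        raise ValueError("first_months must be positive")
    return 1 - second_months / first_months


def late_discovery_share(defect_phases: List[str], late_phases: Set[str]) -> float:
    """Share of defects found in phases the team has decided to treat as late."""
    if not defect_phases:
        raise ValueError("no defects recorded")
    counts = Counter(defect_phases)
    late = sum(counts[phase] for phase in late_phases)
    return late / len(defect_phases)


# First product took twelve months; assume the second took seven.
compression = delivery_compression(first_months=12, second_months=7)

# One entry per defect, labeled with the phase in which it was found.
defects = ["review", "unit_test", "unit_test", "system_test", "field"]
late_share = late_discovery_share(defects, late_phases={"system_test", "field"})

print(f"second-product compression: {compression:.0%}")
print(f"defects found late: {late_share:.0%}")
```

A compression near zero across successive products, or a late-discovery share that does not fall over time, is the kind of signal these metrics are meant to surface.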


The Infrastructure-Before-Mandate Pattern

A recurring observation: the most successful technology transitions begin with infrastructure, not mandates.

Cutter ran Linux before anyone was asked to adopt Linux. The enrollment data warehouse was built before anyone requested analytical enrollment reports. The PALS conversion tools existed before any institution was required to migrate. The engineering curriculum existed before the degree programs were approved.

In each case, the pattern was:

  1. Build the thing that makes adoption possible
  2. Let adoption happen voluntarily, driven by demonstrated utility
  3. Use adoption evidence to justify expansion

The alternative — mandate adoption, then build infrastructure — produces compliance without capability. People comply with the mandate while routing around the infrastructure because the infrastructure doesn't yet serve their actual needs.

The infrastructure-before-mandate pattern requires tolerance for investment without visible return. The work is done before the mandate justifies it. That requires a particular kind of confidence — not arrogance, but belief that the utility will materialize once people can see it working.

→ See also: Cutter: Democratizing Linux on Campus · Enterprise Registration & Data Warehouse


The Gap Between Simulation and Reality

There is a lesson that surfaces repeatedly in embedded systems teaching and in engineering practice more broadly:

Systems that work in simulation fail in physical environments for reasons that were not modeled.

The interrupt fires at an unexpected time. The I2C bus has a timing dependency the spec doesn't emphasize. The motor driver behaves differently under load than under no-load. The regression test passes in CI and fails on the device under the conditions the patient creates.

This gap is not a failure of simulation. It is a property of complex systems: the model is always a simplification, and reality punishes simplifications that matter.

The engineering implication is that verification must eventually cross into physical reality. Tests that run only in simulation are not evidence that the system works in its deployment context. Evidence must be collected in conditions that represent — or bound — the conditions under which the system will actually be used.
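
One common way to bound operating conditions is to test at the corners of the operating envelope: every combination of each condition's minimum and maximum. The sketch below enumerates those corners so they can drive a bench or on-device test campaign. The condition names and limits are hypothetical, and passing at the corners does not by itself prove correct behavior in the interior of the envelope.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical operating envelope: (minimum, maximum) for each condition.
ENVELOPE: Dict[str, Tuple[float, float]] = {
    "temperature_c": (5.0, 40.0),
    "supply_voltage": (3.0, 3.6),
    "motor_load_pct": (0.0, 100.0),
}


def corner_cases(envelope: Dict[str, Tuple[float, float]]) -> List[Dict[str, float]]:
    """Return every min/max combination of the envelope's conditions."""
    names = list(envelope)
    limits = (envelope[name] for name in names)
    return [dict(zip(names, values)) for values in product(*limits)]


cases = corner_cases(ENVELOPE)
print(len(cases))  # 2 ** 3 = 8 corners for three conditions
for case in cases:
    print(case)
```

Each dictionary describes one physical setup to reproduce on real hardware. The number of corners doubles with every added condition, so larger envelopes usually call for a sampled or risk-ranked subset.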

In regulated environments this is not optional: it is the explicit requirement of design validation. The system must be tested in conditions that reflect real use, not simulated ideal conditions.

The pedagogical implication is that engineers who have only worked in simulation are less prepared than engineers who have encountered the gap between model and reality and developed the intuition to navigate it.


→ Related: Mental Models · Patterns · Platform