AI Transformation in Regulated Development

1. Context

By 2022, generative AI had crossed from research curiosity to daily engineering practice. Code completion, test scaffolding, documentation drafts, and exploratory refactors that once took days could be produced in minutes. Medical device software organizations noticed immediately — and so did their quality and regulatory functions.

The work described here sits inside that transition. The role was Technical Director for medical device software at Baxter Healthcare, operating under IEC 62304, FDA design controls, cybersecurity expectations, and 510(k) submission requirements. The portfolio already faced the binding constraint of regulated development (evidence, traceability, review cycles) before AI arrived. AI did not create that constraint. It changed where the bottleneck sat.

The intellectual lineage matters. A decade earlier, composite-kernel research at AAAI asked whether ML outputs could be both accurate and interpretable. Doctoral work on OWL constraint generation asked whether automated reasoning could be honest about what it actually knew. Those were academic formulations of a problem that regulated AI adoption now forces in production: speed is worthless if you cannot defend what you shipped.

2. The Problem

The immediate organizational question was not "should we use AI?" Engineers were already using it. The question was whether AI-assisted work could enter the same evidence chain as everything else — requirements, design, implementation, verification, validation — without breaking trust with quality, regulatory, and clinical stakeholders.

Regulated software has a specific failure mode for new tooling: episodic governance. Teams generate work quickly, then scramble to document it before review. AI amplifies that failure mode because it increases artifact volume faster than retrospective documentation can keep up. A sprint that produces twice as much code and three times as much draft documentation does not produce twice as much credible evidence unless the workflow changes to match.

The deeper problem was strategic. Organizations that treat AI as either forbidden or exempt from design controls will get one of two bad outcomes: slow adoption with underground use, or fast adoption with uncharacterized risk. Neither is sustainable in a domain where defects discovered late compound into submission delays, cybersecurity findings, and field remediation.

3. Why Existing Thinking Failed

"Ban AI until the regulators clarify." Regulators were not going to issue a checklist that made the engineering problem go away. IEC 62304 and FDA design controls already describe what evidence must exist. The gap was organizational: how to produce that evidence when generation is no longer the bottleneck.

"AI is just a better autocomplete — no process change needed." Autocomplete at scale is still authorship. Someone must verify correctness, appropriateness, security, and traceability to requirements. Treating AI output as inherently low-risk because a human clicked "accept" confuses generation with judgment.

"We'll add governance after we see what AI can do." In consumer software, that sequence sometimes works. In regulated software, retrofitting evidence onto AI-generated work is more expensive than building evidence into the workflow from the start — and it trains the organization to associate AI with compliance debt.

"Centralize AI policy in a committee that doesn't write code." Policy without workflow integration becomes shelfware. Engineers route around it. The useful question is not whether AI is permitted but under what conditions AI-assisted artifacts enter the controlled configuration.

4. My Approach

Treat AI adoption as an evidence and verification problem, not a tooling purchase.

The same principles that applied to harmonized platform architecture applied here: governance should accelerate engineering, not follow it. AI-assisted work belongs in the design-controlled lifecycle from the first commit — linked to requirements, reviewed with the same skepticism as human-authored work, tested with the same rigor, documented as part of the workflow rather than reconstructed before submission.

Three commitments shaped the approach:

Interpretability over opacity. Prefer AI uses where outputs can be inspected, challenged, and traced — code with tests, documentation tied to source, analysis with explicit assumptions. The AAAI-era lesson holds in production: a slightly less fluent draft that a reviewer can reason about beats a polished draft that nobody can defend.

Continuous evidence, not episodic review. Automated unit testing, static analysis (SAST), dynamic analysis (DAST), and traceability practices already embedded in CI are the infrastructure AI adoption inherits. AI increases the return on that infrastructure; it does not replace it.

Judgment remains human. AI expands exploration — alternative designs, edge-case tests, draft risk language — but the decision to merge, release, or submit remains a human responsibility with auditable rationale.
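The commitments above can be sketched as a single evidence check applied to every change before it enters the controlled configuration. This is a minimal illustration, not the workflow actually used: the `REQ-123` identifier format, the `Change` fields, and the `evidence_gaps` function are all hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed requirement ID format (e.g. "REQ-42"); real projects define their own.
REQ_PATTERN = re.compile(r"\bREQ-\d+\b")


@dataclass(frozen=True)
class Change:
    change_id: str
    message: str
    ai_assisted: bool      # recorded for the audit trail, not used to relax checks
    approver: str = ""     # human who accepted the change
    rationale: str = ""    # why the approver accepted it


def evidence_gaps(change: Change) -> List[str]:
    """Return the evidence missing from a change; an empty list means none is missing."""
    gaps = []
    if not REQ_PATTERN.search(change.message):
        gaps.append("no linked requirement")
    if not change.approver.strip():
        gaps.append("no human approver")
    if not change.rationale.strip():
        gaps.append("no recorded rationale")
    return gaps
```

The check never branches on `ai_assisted`. The flag is kept so reviewers can see where AI was involved, while the evidence required is identical for human-authored and AI-assisted work.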

5. Technical Solution

The concrete engineering response combined platform practices already in motion with AI-specific guardrails:

Harmonized architecture and reuse — shared components and documented interfaces so AI-assisted changes in one product line do not fork the portfolio silently. Reuse reduces the surface area that must be re-validated.
Agile workflow with design controls — iterative delivery inside a controlled configuration: defined baselines, change impact analysis, and verification records that scale with iteration rather than fighting it.
Automated verification in the pipeline — unit tests, SAST, and DAST as mandatory gates. AI-generated code that fails the same gates as human code is rejected the same way. The pipeline does not care who wrote the line.
Documentation generated from source — software documentation standards tied to the codebase and review workflow, reducing the gap between implementation and submission artifacts.
Cybersecurity integrated into development — secure coding practices, dependency scrutiny, and security testing treated as part of feature work, not a pre-release scramble.
Agentic scope and audit discipline — where autonomous tools plan multi-step work, explicit boundaries on authority, retained logs of actions taken, and human approval before controlled artifacts change state. Regulated environments require an answer to "what did the system do, and why?" — not only "what is the output?"
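The mandatory-gate item above can be illustrated with a small merge check. It is a sketch under assumptions: the gate names come from the text, but the `GateResult` record and the `failed_gates` and `may_merge` functions are hypothetical and do not correspond to any particular CI product.

```python
from dataclasses import dataclass
from typing import Dict, List

# Gate names follow the text; the result format is an assumption for illustration.
MANDATORY_GATES = ("unit_tests", "sast", "dast")


@dataclass(frozen=True)
class GateResult:
    gate: str
    passed: bool


def failed_gates(results: List[GateResult]) -> List[str]:
    """List mandatory gates that failed or never reported a result."""
    outcome: Dict[str, bool] = {r.gate: r.passed for r in results}
    # A gate that did not run is treated as a failure, not as a pass.
    return [g for g in MANDATORY_GATES if not outcome.get(g, False)]


def may_merge(results: List[GateResult]) -> bool:
    """Authorship is deliberately not an input: the same gates apply to every change."""
    return not failed_gates(results)
```

Treating a missing result as a failure is a design choice for this sketch. It keeps a skipped scan from passing silently when the volume of changes rises.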

None of this is AI-specific magic. It is the regulated software toolchain doing what it was always supposed to do — with AI raising the volume and speed of what flows through it.
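The agentic scope and audit discipline listed above can be sketched the same way. In this hypothetical example, `AgentSession`, `CONTROLLED_ACTIONS`, and the action names are assumptions made for illustration. The point is only the shape: every request is logged with its reason, out-of-scope actions are refused, and actions that change controlled artifacts need a named human approver.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Assumed set of actions that change the state of controlled artifacts.
CONTROLLED_ACTIONS = {"merge", "release"}


@dataclass
class AgentSession:
    allowed_actions: Set[str]                      # explicit boundary on authority
    log: List[Dict[str, object]] = field(default_factory=list)

    def request(self, action: str, target: str, reason: str,
                approved_by: str = "") -> bool:
        """Log the request, then allow it only if in scope and approved where required."""
        in_scope = action in self.allowed_actions
        needs_approval = action in CONTROLLED_ACTIONS
        allowed = in_scope and (not needs_approval or bool(approved_by.strip()))
        # Refused requests are logged too, so the record answers
        # "what did the system do, and why?" for every attempt.
        self.log.append({
            "action": action,
            "target": target,
            "reason": reason,
            "approved_by": approved_by,
            "allowed": allowed,
        })
        return allowed
```

A session created with `AgentSession(allowed_actions={"read", "propose_patch", "merge"})` could propose a patch on its own, but a `merge` request would return `False` until `approved_by` names a human reviewer.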

6. Organizational Challenges

Technical guardrails fail without cross-functional ownership.

Quality and regulatory partners had to trust that "AI-assisted" did not mean "unreviewed." That required shared vocabulary: what counts as AI-generated, what level of review applies, what evidence satisfies design controls, when clinical or cybersecurity stakeholders must be in the loop.

Product and marketing timelines pressure teams to skip characterization — to ship the demo before the failure modes are documented. Platform and portfolio economics help here: harmonized architecture means AI acceleration on product two benefits from verification patterns established on product one.

Culture change was the long pole. Mentoring engineers to treat AI as a junior collaborator — fast, helpful, frequently wrong in subtle ways — mirrors how good senior engineers already review junior work. The goal was not fear of AI but professional skepticism with productive speed: explore widely, verify narrowly, merge only what you can defend.

Representing software at the strategy table remained essential. AI investment decisions made without software leadership tend to optimize for generation licenses while underinvesting in verification capacity — the exact mismatch regulated domains cannot afford.

7. Outcome

The portfolio moved toward shorter development cycles and stronger consistency across product lines as harmonized architecture matured — the platform economics described in Platform Strategy in Regulated Environments. AI adoption layered onto that foundation rather than replacing it.

Engineering practice shifted in a clear direction, even if no single metric captures it: broader use of automated testing and static analysis, documentation treated as part of delivery, security practices embedded earlier, and teams that could discuss AI outputs with the same rigor they applied to human-authored work. Software leadership supported a major new product submission (510(k)) in an environment where evidence generation kept pace with development velocity rather than lagging it.

The longer outcome is capability, not a toolchain. Teams that learn to verify AI-assisted work in a regulated context can adopt new models and methods without relearning governance from scratch each time. That adaptability, not any particular vendor feature, is what "AI transformation" actually means in IEC 62304 land.

8. Lessons That Generalize

Generation got cheap; verification got critical. The constraint shifted. Organizations that invest only in AI generation without verification capacity will discover the bottleneck moved, not disappeared.

Regulated AI is an evidence problem dressed as a technology problem. Training data, validation methodology, failure modes, performance envelopes, post-market monitoring — regulators ask evidence questions. Workflows that produce evidence continuously win; workflows that bolt evidence on at the end lose.

Interpretability is not nostalgia. It is a risk management strategy. Whether the opaque output is an SVM kernel or a generated module, the organization that cannot explain behavior cannot improve it, defend it, or recover from it.

Governance enables speed when it is built in. Episodic review scales poorly with AI-assisted volume. Continuous traceability, automated gates, and shared architecture scale better. It is the same lesson as platform strategy, applied to the new verification bottleneck.

Intellectual honesty compounds. The through-line from OWL constraint generation — characterize uncertainty, support revision when evidence changes — to regulated AI adoption is direct. Systems that pretend to know more than they do create compliance and safety debt. Systems honest about limits earn the trust required to move faster.