
Published: May 23, 2026
Effective medical device cybersecurity testing requires a layered approach, moving beyond single-tool solutions like vulnerability scanning. The FDA's February 3, 2026 premarket guidance emphasizes diverse testing methods, including Static Application Security Testing (SAST), Software Composition Analysis (SCA), Dynamic Application Security Testing (DAST), fuzzing, threat model validation, and penetration testing. Each method addresses distinct defect classes at different lifecycle stages, providing evidence for secure design, implementation, and resilience against evolving threats. Submissions should demonstrate how these layers collectively validate security controls and mitigate identified risks.
"We did security testing" is one of the most common - and least useful - statements in a premarket submission. There are at least seven distinct things that phrase can mean, and the FDA expects sponsors to know the difference.
This is the pillar reference for how the pieces fit together: what each testing type actually finds, where it sits in the lifecycle, and what reviewers expect to see as evidence.
Key Takeaways
- Security testing is layered, not a single activity.
- FDA expects evidence from multiple testing types.
- SAST/SCA find code & component issues early.
- DAST/fuzzing identify runtime & input flaws.
- Threat model validation links tests to risks.
- Penetration testing provides real-world attack simulation.
Table of Contents
- Key Takeaways
- The umbrella problem
- The seven testing types
- How the layers actually fit together
- What a submission actually needs
- Common failure modes
- Where to go next
Why this matters
The FDA's Cybersecurity in Medical Devices: Quality System Considerations and Content of Premarket Submissions (Feb 3, 2026 final guidance) made cybersecurity documentation a gating criterion for clearance under Section 524B of the FD&C Act. Reviewers now apply this guidance to medical device security testing the same way they apply software lifecycle expectations from IEC 62304 and security risk-management expectations from AAMI TIR57 and ANSI/AAMI SW96:2023.
Gaps in this area are the single most common driver of first-cycle cybersecurity Additional Information (AI) requests. The FDA's FY2024 CDRH performance reports show cybersecurity is among the top deficiency categories cited in 510(k) and PMA AI letters, behind only software documentation and clinical evidence. Treating it as a checklist exercise rather than a design-controlled engineering artifact is what creates the gap.
The umbrella problem
"Security testing" gets used as if it were one activity. It isn't. A SAST scan, a Nessus run, a fuzzing campaign, and a red-team penetration test all produce findings - but they find fundamentally different classes of defect, at different points in development, with different cost profiles. Treating them as interchangeable is how teams end up with a clean scan report and an exploitable device.
The FDA's premarket cybersecurity guidance and AAMI TIR57 both signal the same expectation: layered testing, with rationale for what each layer covers and what it doesn't.
The seven testing types
1. Static Application Security Testing (SAST)
SAST analyzes source code or compiled binaries without executing them. It catches insecure API usage, hardcoded secrets, unsafe string handling, taint flow into sinks, and common CWE patterns.
Strengths: Runs early. Cheap to repeat. Finds defects before they reach a build.
Limits: High false-positive rate. Blind to runtime behavior, authentication logic, and chained vulnerabilities. A SAST tool will not tell you that an authenticated API endpoint accepts a forged JWT.
FDA fit: Reasonable evidence of secure coding practice. Not sufficient on its own.
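To make "analyzes source code without executing it" concrete, here is a minimal sketch of a static check built on Python's standard `ast` module. The two rules, the `SECRET_HINTS` list, and the `SAMPLE` snippet are illustrative assumptions, not the rule set of any commercial SAST product, and real tools add data-flow and taint analysis on top of pattern rules like these.

```python
import ast

# Assumed, illustrative rule data: names that suggest a credential, and risky builtins.
SECRET_HINTS = ("password", "passwd", "secret", "token", "api_key")
DANGEROUS_CALLS = {"eval", "exec"}


def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line, message) findings for a source string without running it."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Rule 1: a string literal assigned to a secret-looking variable name.
        if (
            isinstance(node, ast.Assign)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        ):
            for target in node.targets:
                if isinstance(target, ast.Name) and any(
                    hint in target.id.lower() for hint in SECRET_HINTS
                ):
                    findings.append((node.lineno, f"hardcoded secret in '{target.id}'"))
        # Rule 2: a direct call to a dangerous builtin.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DANGEROUS_CALLS
        ):
            findings.append((node.lineno, f"call to {node.func.id}()"))
    return sorted(findings)


# Hypothetical code under analysis; it is parsed, never executed.
SAMPLE = '''
api_key = "abc123"
def run(cmd):
    return eval(cmd)
'''

if __name__ == "__main__":
    for line, message in scan_source(SAMPLE):
        print(f"line {line}: {message}")
```

The sketch also shows the limit described above: it sees the shape of the code, not what happens when an authenticated endpoint is handed a forged token at runtime.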
2. Software Composition Analysis (SCA)
SCA inventories third-party components - open source libraries, vendor SDKs, firmware blobs - and matches them against known vulnerability databases. This is the engine behind a useful SBOM.
Strengths: The fastest way to find known-CVE exposure. Maps directly to the FDA's SBOM expectations under Section 524B.
Limits: Only as good as the database and the component identification. Tells you nothing about how a vulnerable function is actually called in your codebase. A VEX document is what makes SCA output decision-grade.
FDA fit: Required, in practice. Submissions without component-level vulnerability evidence draw a Refuse to Accept (RTA) decision.
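The relationship between SBOM, SCA, and VEX can be sketched in a few lines. Everything named here is hypothetical: the `Component` type, the `VULN_FEED` and `VEX` dictionaries, and the placeholder advisory IDs stand in for a real SBOM, a real vulnerability database, and real VEX statements. The status strings follow commonly used VEX labels, which is an assumption about the format in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: str


# Hypothetical vulnerability feed: (name, version) -> advisory IDs (placeholders, not real CVEs).
VULN_FEED = {
    ("examplelib", "1.2.0"): ["EXAMPLE-0001"],
    ("demo-tls", "3.0.1"): ["EXAMPLE-0002"],
}

# Hypothetical VEX statements: advisory ID -> exploitability status for this device.
VEX = {"EXAMPLE-0002": "not_affected"}


def triage(sbom):
    """Match SBOM components against the feed and keep only items VEX has not cleared."""
    open_items = []
    for component in sbom:
        for advisory in VULN_FEED.get((component.name, component.version), []):
            status = VEX.get(advisory, "under_investigation")
            if status != "not_affected":
                open_items.append((component.name, component.version, advisory, status))
    return open_items


SBOM = [
    Component("examplelib", "1.2.0"),
    Component("demo-tls", "3.0.1"),
    Component("tinyjson", "0.9.4"),
]

if __name__ == "__main__":
    for item in triage(SBOM):
        print(item)
```

Raw matching flags two of the three components; the VEX statement removes one of them with a recorded justification, leaving a single open item. That reduction is what "decision-grade" means in practice.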
3. Vulnerability Scanning
Network and host scanners (Nessus, Nexpose, OpenVAS) probe a running device for known weaknesses: missing patches, weak ciphers, exposed services, default credentials.
Strengths: Broad coverage. Repeatable. Useful as a regression check between releases.
Limits: Surface-level. Will not find logic flaws, broken auth, or chained exploits. Often noisy on embedded medical devices where the scanner misinterprets a minimal Linux stack.
FDA fit: Expected as one input to vulnerability management. Not a substitute for adversarial testing.
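The regression-check use of scanning amounts to diffing findings between two releases. The sketch below assumes each finding has already been reduced to a `(service, finding_id)` pair; that shape and the sample data are hypothetical and do not reflect the export format of Nessus, Nexpose, or OpenVAS.

```python
def diff_findings(previous, current):
    """Compare scan findings from two releases and classify each one."""
    prev, cur = set(previous), set(current)
    return {
        "new": sorted(cur - prev),         # introduced since the last release
        "resolved": sorted(prev - cur),    # no longer reported
        "persisting": sorted(cur & prev),  # still open, needs a documented rationale
    }


# Hypothetical findings for two firmware releases.
RELEASE_1_0 = [("22/tcp", "weak-ssh-cipher"), ("443/tcp", "tls1.0-enabled")]
RELEASE_1_1 = [("443/tcp", "tls1.0-enabled"), ("8080/tcp", "default-credentials")]

if __name__ == "__main__":
    for category, items in diff_findings(RELEASE_1_0, RELEASE_1_1).items():
        print(category, items)
```

A diff like this shows whether a release regressed. It says nothing about logic flaws or exploit chains, which is why it remains one input rather than the evidence.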
4. Dynamic Application Security Testing (DAST)
DAST tools exercise a running application - usually web or API surfaces - looking for injection, auth bypass, misconfiguration, and similar issues. Burp Suite and OWASP ZAP are the canonical examples.
Strengths: Finds defects that only show up at runtime. Good fit for the cloud back-end and clinician-facing portals that increasingly sit beside a device.
Limits: Coverage depends on whether the scanner can authenticate and reach every endpoint. For physical devices with non-HTTP protocols (BLE, serial, proprietary), DAST is largely irrelevant.
FDA fit: Strong evidence for connected device companion services. Less useful for embedded firmware.
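As a rough illustration of what "exercising a running application" means, the sketch below starts a throwaway local HTTP service and probes it without credentials. The `DemoHandler` service, its endpoint paths, and the demo token are all hypothetical; this is a single hand-written check, far narrower than what Burp Suite or OWASP ZAP automate.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical companion API in which one endpoint is missing its auth check."""

    def do_GET(self):
        protected = self.path == "/api/patients"
        if protected and self.headers.get("Authorization") != "Bearer demo-token":
            self.send_response(401)
        else:
            self.send_response(200)  # /api/device/config is served to anyone
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def probe_unauthenticated(host, port, paths):
    """Request each path with no credentials; return the paths that answer 200."""
    exposed = []
    for path in paths:
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", path)
            if conn.getresponse().status == 200:
                exposed.append(path)
        finally:
            conn.close()
    return exposed


def run_demo():
    """Start the demo service on a free local port, probe it, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return probe_unauthenticated("127.0.0.1", port, ["/api/patients", "/api/device/config"])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The probe only finds the exposed endpoint because it was given the path. That is the coverage limit noted above: an endpoint the scanner cannot reach or authenticate to is an endpoint it never tests.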
5. Fuzz Testing
Fuzzing throws malformed, unexpected, or randomly mutated input at a target to provoke crashes, hangs, or unsafe states. For medical devices, this often means protocol fuzzers against DICOM, HL7, BLE, or proprietary command interfaces.
Strengths: Finds the kinds of memory corruption and parser bugs that turn into denial-of-service or remote code execution. The FDA explicitly names fuzz testing in its premarket guidance.
Limits: Requires harnessing. Generates findings that need triage to separate exploitable from cosmetic. Coverage-guided fuzzing is more useful than dumb fuzzing but takes engineering effort to set up.
FDA fit: Increasingly expected for any device that parses network protocols or external input. AAMI TIR57 treats malformed input testing as a baseline.
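The mechanics of mutation fuzzing fit in a short sketch. The target here is a hypothetical, deliberately buggy frame parser (`parse_frame`) invented for illustration, not a real DICOM, HL7, or BLE stack. This is "dumb" mutation fuzzing with a fixed random seed, not coverage-guided fuzzing. In Python the overrun surfaces as an `IndexError`; in C firmware the same mistake would be an out-of-bounds read.

```python
import random


def parse_frame(data: bytes) -> bytes:
    """Hypothetical command frame: [length][payload...][checksum]. Deliberately buggy."""
    if len(data) < 2:
        raise ValueError("frame too short")  # clean rejection
    length = data[0]
    payload = data[1:1 + length]
    checksum = data[1 + length]  # BUG: trusts the length field, so short frames overrun
    if checksum != sum(payload) % 256:
        raise ValueError("bad checksum")  # clean rejection
    return payload


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one to three random byte-level mutations to a valid seed input."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 3)):
        choice = rng.randrange(3)
        if choice == 0 and data:
            data[rng.randrange(len(data))] = rng.randrange(256)  # overwrite a byte
        elif choice == 1 and data:
            del data[rng.randrange(len(data)):]  # truncate
        else:
            data.insert(rng.randrange(len(data) + 1), rng.randrange(256))  # insert a byte
    return bytes(data)


def fuzz(seed: bytes, iterations: int = 2000, rng_seed: int = 0):
    """Run mutated inputs through the parser; collect anything that is not a clean rejection."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        case = mutate(seed, rng)
        try:
            parse_frame(case)
        except ValueError:
            pass  # expected: malformed input rejected cleanly
        except Exception as exc:  # unexpected failure: record for triage
            crashes.append((case, type(exc).__name__))
    return crashes


# A valid seed frame: length 3, payload "abc", checksum (97 + 98 + 99) % 256 = 38.
VALID_FRAME = bytes([3]) + b"abc" + bytes([38])

if __name__ == "__main__":
    found = fuzz(VALID_FRAME)
    print(f"{len(found)} crashing inputs, e.g. {found[0] if found else None}")
```

The split between `ValueError` and everything else is the triage step in miniature: clean rejections are expected behavior, while anything else is a finding that someone has to classify as exploitable or cosmetic.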
6. Threat Model Validation
A threat model identifies what an attacker might do; threat model validation tests whether the mitigations actually work. If your model says "TLS prevents traffic interception," validation means proving the TLS implementation is correctly configured, certificate-pinned where appropriate, and resistant to downgrade.
Strengths: The only testing type that directly answers "did our risk analysis hold up?" This is the connective tissue between threat modeling and verification.
Limits: Only as good as the threat model. A weak model produces weak validation.
FDA fit: Reviewers increasingly ask for evidence that mitigations claimed in the security risk file have been tested, not just designed.
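For the TLS example, part of the validation can be automated as a configuration check. The sketch below uses Python's standard `ssl` module; `build_client_context` is a hypothetical stand-in for the device's real TLS client setup, and the TLS 1.2 floor is an assumed policy. It checks configuration only. Certificate pinning and on-the-wire downgrade resistance need separate tests against a live endpoint.

```python
import ssl


def build_client_context() -> ssl.SSLContext:
    """Hypothetical device-side TLS client configuration under test."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy floor
    return ctx


def validate_tls_mitigation(ctx: ssl.SSLContext) -> list[str]:
    """Return the failed checks; an empty list means every configuration check passed."""
    failures = []
    if ctx.verify_mode != ssl.CERT_REQUIRED:
        failures.append("server certificate is not verified")
    if not ctx.check_hostname:
        failures.append("hostname checking is disabled")
    if ctx.minimum_version < ssl.TLSVersion.TLSv1_2:
        failures.append("protocol versions below TLS 1.2 are allowed")
    return failures


def build_weak_context() -> ssl.SSLContext:
    """A deliberately misconfigured context, to show the checks can actually fail."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False  # must be disabled before verify_mode can be relaxed
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


if __name__ == "__main__":
    print("device config:", validate_tls_mitigation(build_client_context()))
    print("weak config:  ", validate_tls_mitigation(build_weak_context()))
```

Running the same checks against a known-bad configuration matters as much as the passing result: a validation test that cannot fail is not evidence that the mitigation works.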
7. Penetration Testing
A penetration test is human-led, goal-oriented, adversarial testing against a near-final or finished device. A tester takes the role of a realistic attacker with defined objectives - extract PHI, brick the device, pivot to a hospital network, defeat the update mechanism - and chains vulnerabilities the way a real adversary would.
Strengths: Finds business-logic flaws, broken authentication, design weaknesses, and exploit chains that no automated tool will surface. Produces the closest evidence of real-world resilience.
Limits: More expensive than scanning. Point-in-time. Quality varies enormously between providers - a pen test scoped poorly is mostly theater.
FDA fit: Specifically named in the FDA's premarket cybersecurity guidance. For most Class II and Class III devices, a credible pen test report is now table stakes. Choosing the right provider matters more than the line item suggests.
How the layers actually fit together
A useful mental model:
DESIGN       THREAT MODEL ──► VALIDATION
             │
BUILD        SAST ─── SCA ─── (continuous)
             │
INTEGRATE    DAST ─── FUZZ ── VULN SCAN
             │
RELEASE      PENETRATION TEST
             │
POSTMARKET   SCA + VULN SCAN (continuous, tied to SBOM/VEX)
Each row catches a class of defect the row above missed. SAST and SCA find what humans wrote or imported. DAST and fuzzing find what shows up under execution. Pen testing finds what survives all of the above. Postmarket SCA finds what becomes a vulnerability after release.
Skip a row and you create blind spots that the FDA - and an attacker - will eventually identify.
What a submission actually needs
For a premarket submission under Section 524B, reviewers expect evidence covering, at minimum:
- An SBOM with associated SCA results and VEX justifications for unpatched components.
- Static analysis evidence tied to secure coding practice.
- Dynamic and/or fuzz testing for input-handling and protocol surfaces.
- A penetration test report with realistic objectives, not just a vulnerability scan in different clothing.
- Traceability from threat model entries to the test that validated each mitigation.
The exact mix depends on device class, connectivity, and risk profile. What does not change is the expectation that you can defend why each layer was included or excluded.
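The traceability item in the list above lends itself to an automated check. A minimal sketch, assuming the threat model and test results can be exported as simple mappings; the IDs, descriptions, and data shapes (`THREATS`, `TESTS`) are hypothetical:

```python
# Hypothetical threat model entries: threat ID -> description.
THREATS = {
    "T-01": "Interception of telemetry in transit",
    "T-02": "Unsigned firmware accepted by updater",
    "T-03": "Malformed BLE command crashes controller",
}

# Hypothetical test results: test case ID -> (threat ID it validates, passed?).
TESTS = {
    "TC-101": ("T-01", True),
    "TC-102": ("T-03", False),
}


def traceability_gaps(threats, tests):
    """Find threats with no validating test and threats whose tests did not all pass."""
    covered = {}
    for test_id, (threat_id, passed) in tests.items():
        covered.setdefault(threat_id, []).append((test_id, passed))
    untested = sorted(t for t in threats if t not in covered)
    failing = sorted(
        t for t, results in covered.items() if not all(ok for _, ok in results)
    )
    return {"untested": untested, "failing": failing}


if __name__ == "__main__":
    print(traceability_gaps(THREATS, TESTS))
```

With this sample data, T-02 has a claimed mitigation and no test, and T-03 has a test that failed. Both are gaps worth closing before a reviewer finds them.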
Common failure modes
Three patterns show up repeatedly in deficiency letters:
- One-tool submissions. A clean Nessus report submitted as "security testing." Reviewers see through this immediately.
- Pen test as scanner output. A "pen test" report that is 80% automated scan findings with no exploitation, chaining, or business-logic work. Read the methodology section; if it does not describe attacker objectives, it is not a pen test.
- Threat model and tests in different universes. The risk file claims mitigations the test plan never validated. This is the easiest deficiency to avoid and the most common one to receive.
Where to go next
If you want to go deeper on any single layer:
- The Importance of Medical Device Vulnerability Testing
- Fuzz Testing in Medical Device Cybersecurity
- Cost of Medical Device Penetration Testing
- SBOM vs VEX: Difference
- MedTech Cyber Standards Every Device Team Must Know
Security testing is not a single deliverable. It is a layered evidence program. Treat it that way and the FDA submission writes itself; treat it as a checkbox and the deficiency letter writes itself first.
How Blue Goat approaches this
Blue Goat Cyber's medical device practice is led by engineers with CISSP, OSCP, and prior military red-team backgrounds. We treat cybersecurity documentation as design-controlled engineering output, not a submission template: every artifact (threat model, SBOM, security risk assessment, penetration test, labeling) traces back to a controlled requirement and a verified result.
Our engagements deliver the full Feb 3, 2026 guidance documentation set scoped to the device's risk profile, integrated with the existing IEC 62304 software lifecycle and ISO 14971 risk file. See our medical device cybersecurity services for the full scope. If the FDA raises cybersecurity deficiencies after our submission, we resolve them at no additional cost.
FAQ
Is a single penetration test enough to clear an FDA submission?
No. The FDA's premarket cybersecurity guidance and AAMI TIR57 expect layered evidence - typically SAST or SCA findings, dynamic or fuzz testing, threat-model-driven test cases, and a penetration test that actually exploits and chains issues. A standalone pen test report, especially one dominated by automated scanner output, is among the most common patterns in deficiency letters.
What is the difference between a vulnerability scan and a penetration test?
A vulnerability scan enumerates known issues against signatures - it is reproducible, automated, and shallow. A penetration test is human-led: scoped against attacker objectives, with exploitation, chaining, and post-exploitation evidence. If the methodology section of a 'pen test' report does not describe attacker goals or exploitation paths, the FDA will treat it as a scan.
Where does SBOM-based vulnerability monitoring fit in the testing taxonomy?
It sits between software composition analysis (SCA) and postmarket surveillance. SCA establishes the baseline component inventory at release; ongoing SBOM monitoring against CISA KEV, NVD, and vendor advisories is what keeps the inventory honest after the device ships. Reviewers increasingly expect to see the operational process, not just the document.
Does the FDA require fuzz testing?
The statute does not: Section 524B never mentions fuzzing. The FDA's premarket guidance, however, names fuzz testing explicitly, and AAMI TIR57 expects testing against malformed and unexpected input on every external interface. For parsers, network listeners, and file ingest paths, fuzz testing is the cleanest way to evidence that requirement. Teams that skip fuzzing usually have to defend why their dynamic testing was sufficient.
How should threat model validation tests be documented?
Each threat in the threat model should map to one or more test cases, and each test case should produce evidence (logs, screenshots, exploit output) that the mitigation works or that the threat is not exploitable. A risk file that claims mitigations the test plan never validates is the easiest deficiency to receive and the easiest to avoid.
Can we run all of this in-house or do we need an external pen test?
SAST, SCA, DAST, and fuzz testing typically belong inside engineering. Penetration testing benefits from independence, both for objectivity and because reviewers expect to see an external perspective in the evidence package. Most clearance-ready submissions mix internal layered testing with an external pen test.
About the author
Christian Espinosa, CISSP, Founder, Blue Goat Cyber. Christian leads a team focused exclusively on medical device cybersecurity for FDA premarket submissions and postmarket compliance.
Sources & references
Primary sources cited in this article.
- Cybersecurity in Medical Devices: Quality System Considerations and Content of Premarket Submissions - U.S. FDA