
The FDA’s June 2025 cybersecurity guidance made one thing unmistakably clear: penetration testing is no longer something manufacturers can bolt on at the end of development. For cyber devices under Section 524B of the FD&C Act, it is a required, documented, and reviewable part of premarket submissions. A shallow test report, or no report at all, is a common trigger for cybersecurity deficiency letters that stall 510(k) or De Novo submissions by months.
This article is a practical playbook. It covers what a medical device pen test actually includes, what FDA and UL 2900 specifically require, what testers keep finding across device categories, how to scope and budget the work, and what your final deliverables need to look like to satisfy a reviewer. At Blue Goat Cyber, we work exclusively in medical device cybersecurity and have built our process around what actually clears submissions. What follows is what we know works.
What a medical device pen test actually covers
A medical device penetration test is not a vulnerability scan dressed up in a nicer report. The attack surface for a connected medical device spans hardware, firmware, wireless radios, APIs, cloud backends, and user interfaces simultaneously. Each layer requires a different toolset, different expertise, and a different mindset from the tester.
The full attack surface manufacturers need to scope
Scoping is where many manufacturers go wrong, and where shallow tests begin. A credible assessment covers firmware extraction and binary analysis, physical access points and hardware maintenance ports (JTAG, UART, debug interfaces), wireless protocols including BLE, Zigbee, and Wi-Fi, APIs serving both mobile applications and cloud platforms, the cloud device management infrastructure itself, and authentication and access control systems across all entry points. Leaving any of these out of scope gives an attacker an unchecked path and gives an FDA reviewer a reason to question your submission.
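One way to keep scoping honest is to treat the areas listed above as a checklist and compare a draft statement of work against it. The Python sketch below is a minimal illustration; the area names and the `missing_scope_areas` helper are hypothetical labels for the categories named above, not an FDA-defined taxonomy.

```python
# Hypothetical labels for the attack-surface areas listed above.
REQUIRED_SCOPE_AREAS = (
    "firmware_extraction_and_binary_analysis",
    "physical_and_debug_ports",       # JTAG, UART, debug interfaces
    "wireless_protocols",             # BLE, Zigbee, Wi-Fi
    "mobile_and_cloud_apis",
    "cloud_device_management",
    "authentication_and_access_control",
)


def missing_scope_areas(in_scope):
    """Return the required areas a draft scope leaves out, in checklist order."""
    covered = set(in_scope)
    return [area for area in REQUIRED_SCOPE_AREAS if area not in covered]


# Example: a draft scope that only covers radios and APIs.
draft_scope = ["wireless_protocols", "mobile_and_cloud_apis"]
for area in missing_scope_areas(draft_scope):
    print("Not in scope:", area)
```

Any area the helper reports is a candidate for either inclusion or a documented rationale for exclusion.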
Gray-box, white-box, and black-box: choosing the right model
The FDA’s premarket guidance specifically values white-box and gray-box (translucent) testing. In a white-box engagement, testers receive full system knowledge: architecture diagrams, source code, credentials, and design documentation. This produces the most thorough and actionable findings. Gray-box testing provides partial system knowledge, allowing testers to formulate targeted attack vectors without full disclosure. Black-box testing alone, where testers start with no prior knowledge, is generally insufficient for demonstrating device-specific contextual evidence, and FDA reviewers often find it difficult to evaluate without the white- or gray-box results the 2025 guidance expects.
How a well-structured test actually runs
The phases are sequential and each one feeds the next: planning and scoping, reconnaissance, scanning and system mapping, vulnerability assessment, exploitation, and cleanup. Two operational rules apply specifically to medical devices and are non-negotiable. Testing must occur in a lab or vendor-approved environment only, never on devices connected to actual patients. All testing should be staged through lower environments first, using synthetic data, before any production-equivalent hardware is assessed. These constraints are not bureaucratic formalities; they exist because a poorly run test on clinical equipment can cause real harm.
FDA and UL 2900 requirements manufacturers often miss
The FDA’s June 2025 guidance supersedes the 2023 version and expanded both scope and post-market obligations significantly. If your cybersecurity documentation was built around the 2023 guidance, there are gaps. Closing them before submission is far less expensive than responding to a deficiency after the fact.
What the 2025 FDA cybersecurity guidance actually requires
The FDA expects penetration testing documentation in the cybersecurity verification and validation section of your eSTAR submission. That documentation must include a clear scope description, the methodology used, timeframe, technical risk factors assessed, white-box testing results, and a technical analysis of how each finding affects device safety and effectiveness. The FDA also prefers original third-party test reports over manufacturer-summarized findings. Summarized reports raise credibility questions with reviewers and tend to generate additional information requests. For a focused breakdown of how testing maps to premarket submissions, see our guide to the FDA’s 18 cybersecurity deliverables.
How UL 2900-2-1 fits into your submission strategy
UL 2900-2-1 provides standardized security requirements for network-connectable medical devices, including static code analysis, fuzz testing of all external interfaces, vulnerability scanning, and penetration testing criteria. The FDA does not explicitly mandate UL 2900-2-1 compliance, but alignment to it is a recognized path to demonstrating security rigor. Manufacturers who structure their testing programs against UL 2900-2-1 pre-empt the follow-up questions that generalist approaches tend to generate from reviewers.
Post-market obligations that don’t end at clearance
Section 524B of the FD&C Act requires manufacturers of cyber devices to maintain ongoing surveillance, active vulnerability monitoring, and a plan for regular penetration testing across the entire product lifecycle. For Class II and Class III cyber devices, this is not optional. The practical implication is that post-market pen test cadence needs to be built into your quality management system from the start, not retrofitted after clearance. Your QMS should define who owns post-market testing, how often it occurs, and how findings feed back into your risk management file.
Recurring findings from medical device security assessments
The vulnerability patterns that surface in medical device penetration tests repeat across device categories and device classes. Knowing what to expect helps engineering teams make better development decisions before testing begins, rather than scrambling to remediate findings that could have been designed out.
Authentication failures and broken access control
Authentication weaknesses and code defects account for roughly 60% of findings across medical device assessments, based on industry data. The most frequent issues include insecure token handling, missing authorization checks on API endpoints, insecure direct object reference (IDOR) vulnerabilities where one user can access another user’s data, and weak password recovery mechanisms. Industry research has documented exploitable connectivity flaws in patient monitoring equipment at rates that make these risks the norm rather than the exception in devices where security was not integrated into the development process from day one.
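To make the insecure direct object reference pattern concrete, the sketch below contrasts a lookup that trusts a caller-supplied record ID with one that checks ownership first. It is a simplified illustration: the `READINGS` store, the user names, and both function names are hypothetical and stand in for a real device cloud backend.

```python
# Hypothetical in-memory store standing in for a device cloud backend.
READINGS = {
    101: {"owner": "patient-a", "glucose_mg_dl": 98},
    102: {"owner": "patient-b", "glucose_mg_dl": 143},
}


def get_reading_vulnerable(requesting_user, reading_id):
    """Vulnerable: returns any record whose ID the caller can guess."""
    return READINGS[reading_id]


def get_reading_checked(requesting_user, reading_id):
    """Same lookup, but verifies that the caller owns the record first."""
    record = READINGS.get(reading_id)
    if record is None or record["owner"] != requesting_user:
        # One error for "missing" and "not yours" avoids leaking which IDs exist.
        raise PermissionError("reading not available")
    return record
```

A tester probing for this flaw authenticates as one user and requests another user's record ID; the first function returns the data, while the second raises `PermissionError`.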
Firmware and hardware-level weaknesses
Hardcoded credentials embedded in firmware are among the most common and most serious findings in physical device assessments. Insecure firmware update mechanisms, including those lacking cryptographic integrity checks or vulnerable to downgrade attacks, appear broadly across connected devices we assess. Exposed debug ports and outdated third-party libraries with known CVEs round out the list. Firmware extraction from physical devices frequently reveals secrets that development teams assumed were safe because they were not exposed through software interfaces.
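A first pass over an extracted firmware image often resembles running the Unix `strings` tool and filtering for credential-like text. The sketch below shows that idea in Python under simplifying assumptions: the keyword list is illustrative only, the sample bytes are synthetic, and `find_candidate_secrets` is a hypothetical helper, not a substitute for full binary analysis.

```python
import re

# Printable ASCII runs of 6+ bytes, similar in spirit to the Unix `strings` tool.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")
# Illustrative keywords only; a real assessment uses far broader rules.
SUSPICIOUS = re.compile(
    rb"(?i)(password|passwd|secret|api[_-]?key|BEGIN [A-Z ]*PRIVATE KEY)"
)


def find_candidate_secrets(image):
    """Return (offset, text) pairs for printable strings that look credential-like."""
    hits = []
    for match in PRINTABLE_RUN.finditer(image):
        if SUSPICIOUS.search(match.group()):
            hits.append((match.start(), match.group().decode("ascii")))
    return hits


# Synthetic bytes standing in for an extracted firmware image.
sample_image = b"\x00\x01\x7fELF\x00admin_password=changeme\x00\xff\xfeboot ok\x00"
for offset, text in find_candidate_secrets(sample_image):
    print(hex(offset), text)
```

Each hit is only a candidate; a tester still has to confirm whether the string is a live credential and where it is used.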
Legacy protocols and network segmentation failures
Unencrypted DICOM and HL7 traffic moving across hospital networks represents a persistent risk that standard vulnerability scans often miss. Poor segmentation between IT and clinical OT networks allows lateral movement from a general hospital network straight to safety-critical clinical systems. In simulated smart infusion pump attack scenarios, testers exploit legacy operating systems, unencrypted protocol traffic, and weak segmentation to reach devices and alter dosing parameters. Standard perimeter scans do not catch this attack path; only a purpose-built penetration test does.
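Cleartext HL7 v2 traffic is usually recognizable on the wire because it is carried in MLLP framing: a start-of-block byte (0x0B), the message beginning with an `MSH` segment, and an end-of-block sequence (0x1C 0x0D). The sketch below is a rough heuristic for spotting that pattern in a captured TCP payload. It assumes the common `|` field separator and a message that fits in one payload, and the sample message is synthetic.

```python
MLLP_START = b"\x0b"      # MLLP start-of-block byte (vertical tab)
MLLP_END = b"\x1c\x0d"    # MLLP end-of-block: file separator + carriage return


def looks_like_cleartext_hl7(payload):
    """Heuristic: True if the payload holds an unencrypted, MLLP-framed HL7 v2 message."""
    start = payload.find(MLLP_START + b"MSH|")
    return start != -1 and payload.find(MLLP_END, start) != -1


# Synthetic HL7 v2 message, as it would appear without transport encryption.
sample_payload = (
    b"\x0bMSH|^~\\&|PUMP|WARD|EHR|HOSP|20250101120000||ORU^R01|1|P|2.5\r\x1c\x0d"
)
print(looks_like_cleartext_hl7(sample_payload))
```

Traffic wrapped in TLS will not match, which is the point: a match on a clinical network segment is evidence that patient data is crossing it unencrypted.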
Scoping your test: timelines, costs, and vendor criteria
Budget and timeline expectations vary significantly based on device class and scope. Getting these estimates wrong at the planning stage leads to truncated tests, surprise costs mid-engagement, and reports that do not satisfy FDA reviewers because the scope was too narrow.
Cost and timeline ranges by device class
Simple connected devices, such as wireless sensors or basic patient monitors, typically cost between $10,000 and $30,000 and take two to six weeks. Complex systems, including imaging platforms, implantables, and multi-component networked devices, range from $30,000 to $100,000 or more, with timelines of four to twelve weeks or longer depending on scope depth. Retesting after remediation adds one to four weeks at roughly 50 to 70% of the initial engagement cost. Medical device specialization adds 20 to 50% over general IT penetration testing because of the regulatory documentation requirements, the need for biomedical engineering coordination, and the specialized tooling required for firmware and hardware analysis.
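For planning purposes, the ranges above can be combined into a rough total that includes retesting. The sketch below is simple arithmetic on the figures quoted in this section; the `budget_range` helper is hypothetical, and the upper bound for complex systems is treated as $100,000 even though real engagements can exceed it.

```python
# Initial engagement ranges quoted above, in US dollars.
INITIAL_COST = {"simple": (10_000, 30_000), "complex": (30_000, 100_000)}
# Retest cost as a percentage of the initial engagement (50 to 70%).
RETEST_PERCENT = (50, 70)


def budget_range(device_type, retest_rounds=1):
    """Return (low, high) total cost for the initial test plus retest rounds."""
    low, high = INITIAL_COST[device_type]
    total_low = low + retest_rounds * (low * RETEST_PERCENT[0] // 100)
    total_high = high + retest_rounds * (high * RETEST_PERCENT[1] // 100)
    return total_low, total_high


print(budget_range("simple"))   # initial test plus one retest
print(budget_range("complex"))
```

With one retest round, a simple device works out to $15,000 to $51,000 and a complex system to $45,000 to $170,000, which is why retesting belongs in the budget from the start.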
Scoping decisions that determine test quality
The variables that drive both cost and depth include the number of wireless protocols in scope, the depth of firmware extraction and binary analysis, the API endpoint count, whether cloud infrastructure is included, and whether the engagement uses white-box access. These decisions need to be resolved before a statement of work is signed; use our comprehensive penetration testing checklist to guide scoping and avoid common coverage gaps. Skipping a rigorous scoping conversation almost always produces reports with coverage gaps that surface during FDA review, when they are most costly to address.
Choosing a vendor who understands FDA documentation requirements
Most IT security firms can run a vulnerability scan. Far fewer understand what an FDA reviewer expects to see in the final report, how to document findings against AAMI TIR57 and IEC 62304, or how to structure evidence for an eSTAR submission. Blue Goat Cyber works exclusively in medical device cybersecurity, which means our test reports are built from the first page to serve as FDA submission documentation, not retrofitted after the fact by a generalist firm unfamiliar with premarket review expectations.
When evaluating any vendor, apply these criteria: documented FDA submission experience with 510(k) and De Novo submissions specifically, demonstrated understanding of UL 2900-2-1 and how it maps to FDA expectations, the ability to deliver white-box and gray-box testing, and a clear process for retesting and remediation validation. A vendor who cannot explain how their report maps to the eSTAR cybersecurity V&V section is the wrong vendor for a regulatory submission. For practical industry perspectives on aligning testing with submission expectations, see the Censinet overview.
What your deliverables and remediation evidence must show
Passing a pen test is not the goal. Producing documentation that convinces an FDA reviewer your device is resilient to real-world threats is the goal. The report structure and the remediation evidence trail are what make or break a submission, and both need to be planned before testing begins.
Report structure regulators actually want to see
A submission-ready penetration test report needs a clearly defined scope and methodology section, findings organized by severity with proof-of-concept evidence, white-box testing results, a technical analysis of how each finding affects device safety and effectiveness, and an executive summary written for non-technical reviewers. FDA reviewers prefer original third-party reports. Internally produced summaries raise credibility questions and tend to generate additional information requests that slow the review process.
Remediation tracking and retest validation
Every finding needs a remediation owner and a defined timeline tied to severity. Per FDA expectations, critical findings should be addressed promptly and in proportion to risk; guidance examples cite timelines such as 30 days for very high-severity issues. Documented evidence confirming each fix was implemented is required, and retest validation confirming the vulnerability was resolved is never optional. FDA reviewers and auditors expect to see the full cycle documented: finding, fix, confirmation. An incomplete cycle is a deficiency waiting to happen.
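The finding, fix, confirmation cycle can be modeled as a small record that is only considered closed when all three pieces are on file. The sketch below is one possible shape for such a tracker, not a prescribed format. Only the 30-day window for very high severity comes from the text above; the other windows, the field names, and the identifiers in the example are placeholder assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Only the 30-day figure comes from the text above; the other windows are
# placeholder assumptions to be replaced by the manufacturer's own policy.
REMEDIATION_DAYS = {"very_high": 30, "high": 60, "medium": 90, "low": 180}


@dataclass
class Finding:
    finding_id: str
    severity: str
    reported_on: date
    owner: str
    fix_evidence: Optional[str] = None    # e.g. a change record reference
    retest_passed: Optional[bool] = None  # None until the retest has been run

    @property
    def due_date(self):
        """Remediation deadline derived from severity."""
        return self.reported_on + timedelta(days=REMEDIATION_DAYS[self.severity])

    @property
    def cycle_complete(self):
        """Finding, fix, confirmation: all three must be on record."""
        return bool(self.fix_evidence) and self.retest_passed is True


# Hypothetical example finding.
finding = Finding("PT-001", "very_high", date(2025, 7, 1), "firmware-team")
print(finding.due_date, finding.cycle_complete)
```

A finding with a fix on record but no passing retest still reports `cycle_complete` as false, which mirrors how a reviewer would read an incomplete evidence trail.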
Building the evidence package for your FDA submission
The penetration test report is one component of a broader cybersecurity documentation package that includes threat model outputs, SBOM, risk analysis, and SPDF documentation. Pen test findings must feed back into the risk management file under ISO 14971. For any finding that is mitigated but not fully eliminated, residual risk acceptance must be documented explicitly, with a rationale that a reviewer can evaluate. This traceability from finding to risk file to residual risk acceptance is what transforms a test report into credible submission evidence.
Getting the test right before submission
Penetration testing for medical devices is a structured, regulatory-aligned process that demands specialized expertise, thorough scoping, and documentation built for FDA reviewers from the first page. Manufacturers who treat it as a commodity IT service get reports that generate deficiency letters. Those who approach it as an integrated part of their cybersecurity submission process get cleared faster with fewer rounds of review.
Blue Goat Cyber handles the full scope: from the initial scoping conversation through white-box testing, remediation documentation, and retest validation, all structured to satisfy FDA reviewers on the first submission. While your engineering and regulatory teams stay focused on the device, we build the cybersecurity evidence package that gets it cleared.
If your next submission includes a cyber device, now is the time to get the test right. Reach out to Blue Goat Cyber to start the scoping conversation before your development timeline makes the decision for you.