Explainable Diabetic Retinopathy Screening System

Field | Details
Domain | Healthcare
Assurance Goal | Explainability

Overview

The NHS Midlands Diabetic Eye Screening Programme has deployed an AI-powered system to assist in screening retinal images for signs of diabetic retinopathy (DR)—a complication of diabetes that damages blood vessels in the retina and can lead to vision loss if untreated. The system analyses fundus photographs (i.e. images of the back of the eye) to detect early signs of the condition, processing approximately 200,000 images annually across 50 screening locations.

Because screening results directly affect patient care pathways, determining whether patients are referred for specialist treatment or continue routine monitoring, the Programme has commissioned an assurance case. The case focuses on demonstrating that the system provides explanations that enable clinicians to understand, verify, and appropriately act on AI recommendations.

System Description

What the System Does

The Diabetic Retinopathy Screening System (DRSS) supports the NHS screening programme by:

  • Analysing digital fundus photographs for signs of diabetic retinopathy
  • Classifying images according to the NHS grading scheme (e.g. R0-R3 for retinopathy severity, M0-M1 for maculopathy)
  • Generating visual explanations highlighting areas of concern
  • Providing confidence scores and uncertainty estimates
  • Flagging cases requiring urgent specialist review

How It Works

During a routine diabetic eye screening appointment:

  1. Image Capture: A trained screener captures fundus photographs of both eyes using a digital retinal camera
  2. Image Quality Assessment: The system first evaluates whether image quality is sufficient for reliable analysis
  3. Feature Detection: A deep learning model identifies relevant clinical features: microaneurysms (tiny bulges in blood vessels), haemorrhages (bleeding), exudates (fatty deposits), and neovascularisation (abnormal new blood vessel growth)
  4. Severity Classification: Based on detected features, the system assigns a grade according to NHS grading criteria
  5. Explanation Generation: The system produces visual attention maps (highlighting which image regions influenced the decision) and natural language explanations
  6. Clinical Review: A qualified grader reviews the AI output alongside the images and makes the final grading decision

Key Technical Details

Aspect | Details
Model Architecture | EfficientNet-B4 (a neural network optimised for image classification) with attention mechanism (highlighting influential image regions) for explanation generation
Training Data | 500,000 graded retinal images from UK screening programmes with expert consensus labels
Input | Two fundus photographs per eye (macula-centred and disc-centred views)
Output | Grade (R0-R3, M0-M1), confidence score (0-100%), attention heatmap, feature-level predictions, natural language summary
Performance | Sensitivity: 95.2%, Specificity: 89.7% for referable retinopathy (validated on held-out UK dataset)
Explainability Methods | Grad-CAM (gradient-weighted class activation mapping, showing which image regions most influenced the classification), attention visualisation, prototype-based reasoning (comparing to similar known cases)
Validation | Prospective clinical validation; quarterly performance monitoring; annual external audit
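
To make the Performance figures above concrete, the sketch below shows how sensitivity and specificity are computed from confusion-matrix counts. The counts are hypothetical, chosen only so that they reproduce the reported 95.2% and 89.7%; the source does not give the size or prevalence of the held-out dataset.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of truly referable cases that the system flags as referable."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of truly non-referable cases that the system clears."""
    return tn / (tn + fp)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Share of AI-flagged cases that are truly referable."""
    return tp / (tp + fp)


# Hypothetical counts (not from the source): 1,000 referable and
# 10,000 non-referable cases in a validation set.
tp, fn = 952, 48
tn, fp = 8970, 1030

print(f"Sensitivity: {sensitivity(tp, fn):.1%}")            # 95.2%
print(f"Specificity: {specificity(tn, fp):.1%}")            # 89.7%
print(f"PPV: {positive_predictive_value(tp, fp):.1%}")      # 48.0%
```

Under these assumed counts, roughly half of the cases the AI flags would not be referable. Positive predictive value depends heavily on prevalence, so the real figure may differ, but the example suggests why graders need explanations they can check rather than a bare flag.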

Deployment Context

  • Coverage: Multiple screening programmes across NHS Midlands, serving populations in Birmingham, Coventry, Leicester, Nottingham, and Derby
  • Volume: ~200,000 screening episodes annually
  • Patient Population: Adults with Type 1 or Type 2 diabetes
  • Workflow Integration: Embedded within existing grading workflow; all cases receive human review
  • Operational Since: September 2023

Stakeholders

Stakeholder | Interest | Concern
Patients | Accurate screening and understandable results | Need to understand why they’re being referred (or not)
Screening Graders | Reliable AI assistance that supports their expertise | Must understand AI reasoning to make informed decisions
Ophthalmologists | Appropriate referrals with relevant clinical context | Need explanations that inform treatment planning
Programme Managers | Efficient screening with maintained quality | Balance throughput with clinical safety
MHRA | Safe and effective medical device operation | Regulatory compliance and post-market surveillance
NHS England | National screening programme integrity | Consistent standards across all centres

Regulatory Context

The system operates within several regulatory frameworks:

  • UK Medical Devices Regulations 2002: DRSS is classified as a Class IIa medical device requiring CE/UKCA marking
  • MHRA Guidance on AI as a Medical Device: Requirements for clinical evidence, post-market surveillance, and transparency
  • NHS Diabetic Eye Screening Programme Standards: National standards for screening quality and grading accuracy
  • NICE Guidance: Evidence requirements for AI in diagnostic pathways
  • UK GDPR: Special category health data processing requirements; data protection impact assessment required
  • Duty of Candour: Healthcare providers must be open with patients about their care

Explainability Considerations

Several aspects of this system require careful attention to explainability:

Clinical Decision Support

To make informed decisions, graders must understand why the AI has assigned a particular grade. A bare classification without reasoning could lead either to over-reliance on AI recommendations or to their inappropriate dismissal.

Multiple Audiences

Explanations must serve different users with different needs:

  • Graders need technical detail about detected features and their locations
  • Ophthalmologists need clinically relevant information for treatment planning
  • Patients need accessible explanations of their results

Uncertainty Quantification and Communication

The system must clearly communicate when it is uncertain, enabling graders to apply additional scrutiny to borderline cases rather than treating all AI outputs with equal confidence.
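
One possible way to turn model output into an uncertainty signal is the normalised entropy of the predicted class probabilities. The sketch below assumes the model exposes a probability for each retinopathy grade (R0 to R3); the source does not say how DRSS derives its uncertainty estimates, and the 0.5 threshold is purely illustrative.

```python
import math
from typing import Sequence


def normalised_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of a probability vector, scaled to the range [0, 1]."""
    if len(probs) < 2:
        raise ValueError("need at least two classes")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-6):
        raise ValueError("probabilities must sum to 1")
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Dividing by log(n) maps a uniform distribution to exactly 1.0.
    return entropy / math.log(len(probs))


def needs_extra_scrutiny(probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Flag cases whose predicted distribution is spread across grades."""
    return normalised_entropy(probs) >= threshold


# Hypothetical probabilities over R0, R1, R2, R3.
confident = [0.97, 0.01, 0.01, 0.01]
borderline = [0.45, 0.40, 0.10, 0.05]

print(needs_extra_scrutiny(confident))   # False
print(needs_extra_scrutiny(borderline))  # True
```

A single top-class confidence score would hide the difference between "probably R0, possibly R1" and "could be anything"; entropy captures that spread, although any threshold would need to be validated against grader outcomes.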

Feature Attribution Accuracy

Visual explanations (heatmaps) must accurately reflect the regions the model used for its decision. Misleading explanations could be worse than no explanation at all.
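
A common way to probe whether a heatmap reflects what the model actually used is a deletion test: mask the pixels the heatmap ranks highest and see how far the model's score falls. The sketch below uses a flat list of pixel values and a toy stand-in model; it is an illustration of the idea under those assumptions, not the validation method used for DRSS.

```python
from typing import Callable, List, Sequence


def masked(image: Sequence[float], indices: Sequence[int], fill: float = 0.0) -> List[float]:
    """Return a copy of the image with the given pixel indices replaced by fill."""
    hidden = set(indices)
    return [fill if i in hidden else v for i, v in enumerate(image)]


def deletion_drop(
    model: Callable[[Sequence[float]], float],
    image: Sequence[float],
    heatmap: Sequence[float],
    k: int,
    fill: float = 0.0,
) -> float:
    """Score drop after masking the k pixels the heatmap ranks highest."""
    top_k = sorted(range(len(image)), key=lambda i: heatmap[i], reverse=True)[:k]
    return model(image) - model(masked(image, top_k, fill))


# Toy model (hypothetical): its score depends only on pixels 2 and 3.
def toy_model(image: Sequence[float]) -> float:
    return (image[2] + image[3]) / 2


image = [0.1, 0.2, 0.9, 0.8, 0.1, 0.3]
faithful_heatmap = [0, 0, 1, 1, 0, 0]    # points at the pixels the model uses
misleading_heatmap = [1, 1, 0, 0, 0, 0]  # points at pixels the model ignores

print(deletion_drop(toy_model, image, faithful_heatmap, k=2))
print(deletion_drop(toy_model, image, misleading_heatmap, k=2))
```

The faithful heatmap produces a large drop because masking its top pixels removes the evidence the model relies on, while the misleading heatmap produces none. In practice the drop would be compared against a baseline such as masking randomly chosen pixels.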

Edge Cases and Limitations

The system must clearly indicate when images or cases fall outside its reliable operating envelope, such as unusual pathology, poor image quality, or rare presentations.

Assurance Focus

The assurance case should demonstrate that:

The Diabetic Retinopathy Screening System provides explanations that enable qualified graders to understand, verify, and appropriately act on AI recommendations for patient care.

Deliberative Prompts

  • What makes an explanation “good enough” for a clinical decision, and who decides?
  • How do you balance explanation detail against the time pressures of high-volume screening?
  • When an explanation is technically accurate but clinically misleading, whose responsibility is the resulting harm?
  • How should uncertainty be communicated without undermining trust in the system or encouraging graders to ignore it?
  • What do patients need to understand about AI involvement in their care, and when should this be disclosed?

Suggested Strategies

When developing your assurance case, consider these potential approaches:

Strategy 1: Explanation Fidelity

Ensure that visual and textual explanations accurately represent the model’s actual reasoning process, so that clinicians are not misled about why a particular recommendation was made.

Strategy 2: Clinical Workflow Integration

Design explanations that fit naturally within graders’ existing decision-making processes, providing the right information at the right time without creating cognitive overload or workflow disruption.

Strategy 3: Uncertainty and Limitation Communication

Develop clear, calibrated ways of communicating model confidence and of indicating when the system is operating outside its reliable envelope. This enables graders to adjust their scrutiny and to know when to exercise additional caution or seek alternative assessment.
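
Whether confidence scores are calibrated can be checked with a measure such as expected calibration error (ECE), which compares stated confidence with observed accuracy in each confidence bin. The sketch below is a minimal version, assuming confidences are expressed on a 0 to 1 scale (the system's 0-100% scores would be divided by 100) and that grader-confirmed labels are available to mark each prediction correct or incorrect.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted mean gap between confidence and accuracy across equal-width bins."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical data: four predictions at 95% confidence, three of them correct.
overconfident = expected_calibration_error([0.95] * 4, [True, True, True, False])
# Four predictions at 75% confidence, three of them correct.
well_calibrated = expected_calibration_error([0.75] * 4, [True, True, True, False])
print(round(overconfident, 3), round(well_calibrated, 3))
```

A value near zero means that, for example, cases reported at 75% confidence are correct about 75% of the time, which is what allows a grader to take the displayed number at face value.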

Strategy 4: Audience-Appropriate Explanations

Create explanation formats tailored to the distinct needs of graders, ophthalmologists, and patients, ensuring each group receives information they can understand and act upon.
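
As a rough illustration of this strategy, a single structured result can be rendered differently for each audience. All names below (`Finding`, `Result`, `explain`) and all message wording are hypothetical placeholders; real patient-facing text would need clinical and communications review.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    feature: str  # e.g. "microaneurysm"
    region: str   # e.g. "superior temporal quadrant"


@dataclass
class Result:
    grade: str         # hypothetical encoding, e.g. "R2M0"
    confidence: float  # 0.0 to 1.0
    findings: List[Finding]
    referred: bool


def explain(result: Result, audience: str) -> str:
    """Render the same result at a level of detail suited to the audience."""
    if audience == "grader":
        # Graders get features with their locations, plus the confidence.
        located = "; ".join(f"{f.feature} ({f.region})" for f in result.findings)
        return (f"Grade {result.grade}, confidence {result.confidence:.0%}. "
                f"Detected: {located or 'no features'}.")
    if audience == "ophthalmologist":
        # Ophthalmologists get a de-duplicated list of feature types.
        features = ", ".join(sorted({f.feature for f in result.findings}))
        return f"Grade {result.grade}. Features present: {features or 'none'}."
    if audience == "patient":
        # Placeholder wording only; not clinically approved text.
        if result.referred:
            return ("Your eye photographs showed changes that a specialist "
                    "should look at. A trained grader confirmed this result.")
        return ("Your eye photographs did not show changes that need a "
                "specialist. A trained grader confirmed this result.")
    raise ValueError(f"unknown audience: {audience}")


result = Result(
    grade="R2M0",
    confidence=0.87,
    findings=[Finding("haemorrhage", "inferior nasal quadrant"),
              Finding("microaneurysm", "superior temporal quadrant")],
    referred=True,
)
for audience in ("grader", "ophthalmologist", "patient"):
    print(explain(result, audience))
```

Deriving every view from one underlying record helps keep the audiences consistent with each other, so a patient letter cannot drift away from what the grader saw.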

The following techniques from the TEA Techniques library may be useful when gathering evidence for this assurance case:

Further Reading