claims-summarizer · ~/eval-lab
--:--:-- UTC
claims summarizer·eval lab

Prompt-strategy comparison on 20 hand-labeled MA denial scenarios

Three variants scored against the same eval set. Composite is 0.3·category + 0.3·appealability + 0.2·action + 0.2·artifact. Refused→appeal requires both the model’s refusal and a judge action ≥ 0.7 — the “right for the right reason” gate.

VariantCompositeCategoryAppealableRefused→AppealActionArtifactLatencyCacheRun
Zero-shotzero-shot
Few-shotfew-shot
Structured-firststructured-extraction-first
scenarios20 hand-labeled · click to inspect
Medical Necessity — aquatic therapy denied as not medically necessary
Medical Necessityappealable
Medical Necessity — Lumbar MRI denied; conservative care timeline insufficient
Medical Necessityappealable
Medical Necessity — Cardiac rehab extension beyond standard 36 sessions deniedborderline
Medical Necessityappealable
Prior Auth Not Obtained — Outpatient shoulder arthroscopy
Prior Auth Not Obtainedappealable
Prior Auth Not Obtained — IV iron infusionborderline
Prior Auth Not Obtainedappealable
Prior Auth Not Obtained — Emergency surgery during inpatient admission
Prior Auth Not Obtainedappealable
Out-of-Network — ED visit at non-network facility while traveling
Out-of-Networkappealable
Out-of-Network — Elective specialty visit, no network provider unavailableborderline
Out-of-Networkappealable
Step Therapy — Biologic for rheumatoid arthritis denied; methotrexate trial required
Step Therapy / Part D Formularyappealable
Step Therapy — GLP-1 weight-loss indication denied; tier-1 metformin requiredborderline
Step Therapy / Part D Formularyappealable
SNF — Extended post-stroke rehab denied; plan asserts plateau
SNF Skilled-vs-Custodialappealable
SNF — Admission denied; three-day inpatient rule not metborderline
SNF Skilled-vs-Custodialappealable
DME — CPAP rental terminated for documented non-compliance
DME Coverageappealable
DME — Power mobility scooter denied; manual wheelchair preferred
DME Coverageappealable
Duplicate Claim — Provider resubmitted identical paid claim
Duplicate Claimno-appeal
Duplicate Claim — Looks like duplicate but corrected billing pendingborderline
Duplicate Claimno-appeal
Duplicate Claim — Service already bundled into prior procedure code
Duplicate Claimno-appeal
COB — Working-aged member; employer plan primary
COB Second-Payerno-appeal
COB — VA-authorized care for service-connected condition
COB Second-Payerno-appeal
COB — Workers' Compensation pending for occupational injuryborderline
COB Second-Payerno-appeal
$ eval-lab run --variants all --set hand-labeled-20 [idle]
Claims Summarizer — MA denial triage cockpit