RESEARCH

Real science behind the claims

Our work is grounded in peer-reviewed research. Every claim is backed by published methodology, independent evaluation, and reproducible results. We don't do press releases — we do papers.

Laboratory
5 · Published Papers
87.8M · Training Pairs
90% · Match Accuracy
+49% · vs. Keyword Baseline

Publications

Paper 01 · Published

Engineering Truthiness: A Validation Framework for Silver-Label Evaluation in Machine Learning

Dhunga D. · Zenodo 2026

Introduces a systematic framework for validating silver labels used in machine learning training pipelines. Demonstrates that instrument validation techniques from psychometrics can detect label noise and quantify measurement error in automatically generated training data.

ML Evaluation · Silver Labels · Psychometrics
Paper 02 · In Submission

When Explanations Disagree: A Case for Multi-Method Explainability in Clinical AI

Dhunga D. · Under Review 2026

Examines how different explainability methods (SHAP, LIME, attention weights, integrated gradients) produce divergent explanations for the same clinical predictions. Argues that multi-method agreement scoring provides more trustworthy explanations than any single method.

XAI · Clinical AI · SHAP · LIME
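One simple way to quantify "multi-method agreement" is to compare the feature rankings two attribution methods induce for the same prediction. The sketch below uses Spearman rank correlation on made-up attribution values; the feature names and numbers are illustrative, not the paper's scoring method.

```python
# Hypothetical sketch: measure agreement between two explanation methods
# by rank-correlating their per-feature attribution magnitudes.
# rho near 1.0 = the methods rank features the same way; low or negative
# rho = the explanations disagree and should be surfaced as uncertain.

def ranks(xs):
    """Rank values, 1 = largest (values assumed distinct for simplicity)."""
    order = sorted(range(len(xs)), key=lambda i: -xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(a, b):
    """Spearman rho = Pearson correlation of the two rank vectors."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = sum((x - ma) ** 2 for x in ra) ** 0.5
    sb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (sa * sb)

# Toy attributions for the same prediction, e.g. [age, ECOG, HbA1c, BMI].
shap_attr = [0.40, 0.25, 0.10, 0.05]
lime_attr = [0.35, 0.30, 0.08, 0.02]
agreement = spearman(shap_attr, lime_attr)   # identical ordering -> 1.0
```

In practice a production agreement score would also handle ties and signed attributions; this sketch only shows the ranking comparison at the heart of the idea.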
Paper 03 · In Submission

Silver Labels at Scale: Instrument Validation Without Gold Standards in Clinical Trial Matching

Dhunga D. · Under Review 2026

Proposes methods for validating silver-label quality in clinical trial matching when gold-standard expert labels are scarce. Applies factor analysis and internal consistency metrics to 87.8 million patient-trial pairs.

Label Quality · Factor Analysis · 87.8M Pairs
Paper 04 · In Submission

Where Clinical Trial Matching Breaks: Boolean Logic as the Primary Failure Mode

Dhunga D. · Under Review 2026

Analyzes failure modes in neural clinical trial matching systems. Finds that boolean eligibility criteria (age ranges, lab value thresholds, binary diagnoses) account for the majority of ranking errors, and proposes hybrid architectures that combine neural ranking with rule-based constraint checking.

Failure Analysis · Hybrid Architecture · Boolean Logic
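The hybrid idea, neural ranking gated by explicit boolean rules, can be sketched in a few lines. All field names, thresholds, and scores below are invented for illustration; they are not the paper's architecture.

```python
# Hypothetical sketch of neural ranking + rule-based constraint checking:
# boolean eligibility criteria (age range, lab thresholds) act as a hard
# gate, so a trial the patient cannot join is never ranked above one
# they can, no matter how high its neural similarity score is.

def passes_rules(patient, trial):
    if not (trial["min_age"] <= patient["age"] <= trial["max_age"]):
        return False
    if patient["egfr"] < trial["min_egfr"]:      # lab-value threshold
        return False
    return True

def hybrid_score(patient, trial, neural_score):
    # Rules gate; the neural score only orders the eligible trials.
    return neural_score if passes_rules(patient, trial) else float("-inf")

patient = {"age": 67, "egfr": 55}
trials = [
    {"id": "NCT-A", "min_age": 18, "max_age": 75, "min_egfr": 30},
    {"id": "NCT-B", "min_age": 18, "max_age": 65, "min_egfr": 30},
]
neural = {"NCT-A": 0.71, "NCT-B": 0.93}          # neural model alone prefers B
ranked = sorted(trials, reverse=True,
                key=lambda t: hybrid_score(patient, t, neural[t["id"]]))
top = [t["id"] for t in ranked]                  # NCT-A first: B fails age rule
```

The design point is that the rule check is deterministic and auditable, exactly the property the abstract says pure neural rankers lack for boolean criteria.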
Paper 05 · In Submission

MatchBERT: A Cross-Encoder Architecture for Clinical Trial Patient Matching at Scale

Dhunga D. · Under Review 2026

Presents MatchBERT, a fine-tuned cross-encoder for clinical trial eligibility scoring. Evaluated against TREC 2021 Clinical Trials Track gold labels, achieving NDCG@10 of 0.94 on internal validation and AUC 0.703 on external test set — a 49% improvement over BM25 baseline.

MatchBERT · TREC 2021 · Cross-Encoder
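For readers unfamiliar with the NDCG@10 metric quoted above, here is the standard computation on a made-up ranking. The relevance grades are illustrative (2 = eligible, 1 = relevant but excluded, 0 = not relevant), not TREC data.

```python
# Sketch of NDCG@10: discounted cumulative gain of the system's ranking,
# normalized by the gain of the ideal (perfectly sorted) ranking.
import math

def dcg(rels, k=10):
    # Gain at rank i is discounted by log2(i + 2), i.e. log2 of 1-based
    # rank + 1, so early positions count for more.
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg(rels, k=10):
    ideal = dcg(sorted(rels, reverse=True), k)
    return dcg(rels, k) / ideal if ideal > 0 else 0.0

# Relevance grades of the top-10 trials, in the order the system ranked them.
ranking = [2, 2, 1, 0, 2, 0, 1, 0, 0, 0]
score = ndcg(ranking)
```

A perfect ranking scores 1.0; the toy ranking above scores just under 0.96 because one eligible trial was ranked fifth instead of third.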

MODEL CARD

MatchVox Engine

Proprietary deep learning model for clinical trial eligibility scoring. Trained on 87.8M patient-trial pairs with validated silver labels. Multi-stage architecture: semantic retrieval + precision reranking.
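The two-stage shape described above (cheap semantic retrieval to narrow the candidate pool, then an expensive reranker over the survivors) can be sketched as follows. The embeddings, scores, and trial IDs are stand-ins; nothing here is the actual MatchVox model.

```python
# Hypothetical two-stage pipeline sketch: stage 1 ranks the whole index by
# a cheap vector similarity and keeps a shortlist; stage 2 applies a
# costlier precision scorer only to that shortlist. This is why per-patient
# latency can stay low even with hundreds of thousands of trials indexed.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_vec, trial_vecs, k=2):
    """Stage 1: embedding similarity over all trials, keep top-k."""
    scored = sorted(trial_vecs.items(), key=lambda kv: -dot(query_vec, kv[1]))
    return [trial_id for trial_id, _ in scored[:k]]

def rerank(candidates, precise_score):
    """Stage 2: expensive scoring, but only over the short candidate list."""
    return sorted(candidates, key=lambda t: -precise_score[t])

query = [0.9, 0.1, 0.3]                          # toy patient embedding
trial_vecs = {"NCT-A": [0.8, 0.0, 0.2],
              "NCT-B": [0.1, 0.9, 0.0],
              "NCT-C": [0.7, 0.2, 0.4]}
shortlist = retrieve(query, trial_vecs)          # B is filtered out cheaply
precise_score = {"NCT-A": 0.62, "NCT-C": 0.88}   # scored only for survivors
final = rerank(shortlist, precise_score)         # C overtakes A on rerank
```

The key trade-off: the retriever must be recall-oriented (it can only lose a good trial, never recover one), while the reranker is free to be slow and precise.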

90% · Match accuracy

87.8M · Training pairs

3 sec · Per patient

520K+ · Trials indexed

matchvox.ai/research

Our methodology

External validation

Evaluated against TREC 2021 Clinical Trials Track gold-standard expert labels. Not self-reported metrics on hand-picked benchmarks.

AUC 0.703 on external test set

Multi-method explainability

We use multiple independent explanation methods. When methods disagree, clinicians are told — not shown cherry-picked explanations.

Multiple independent methods compared

Hybrid architecture

Neural ranking for semantic understanding + rule-based constraints for boolean eligibility criteria. Best of both worlds.

5,418× faster than manual review

Want to see it work?

Create a free account and run a match in under 2 minutes.