Use Cases – Public-Data Demo Scenarios (Phase I)
Phase I demonstrations use public datasets (e.g., GEO/SRA, FlowRepository) and public literature (PubMed/PMC) to show feasibility, traceability, and calibrated uncertainty in hypothesis prioritization. Consistent with the Phase I design, confidence is computed by the Bayesian engine; the LLM is used only for evidence-constrained explanations.
What each demo outputs
- Ranked hypotheses as testable statements (templates).
- Posterior confidence with credible intervals (CrI).
- Evidence coverage: supporting vs. conflicting vs. missing.
- Traceability: every score links to evidence objects + citations.
- Actionability: recommended next-step experiments/analyses.
Bayesian scoring (engine)
For each hypothesis \(H\), we construct a structured prior and update with standardized evidence features. The engine outputs a posterior distribution (not a single opaque score).
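As a purely illustrative sketch of how a prior combined with evidence yields a posterior distribution rather than a single score, the Python below uses a Beta-Binomial model in which each dataset counts as one equally weighted supporting or conflicting observation. This is a simplifying assumption, not the engine's actual model (which uses a structured prior and standardized evidence features), so it is not expected to reproduce the example values in the demo tables. The function names and the uniform Beta(1, 1) prior are hypothetical.

```python
import math

def beta_quantile(q, a, b, n=20000):
    """Approximate the q-quantile of Beta(a, b) by midpoint-rule integration."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    step = 1.0 / n
    cumulative = 0.0
    for i in range(n):
        x = (i + 0.5) * step  # midpoint of the i-th grid cell, always inside (0, 1)
        density = math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))
        cumulative += density * step
        if cumulative >= q:
            return x
    return 1.0

def posterior_summary(supporting, conflicting, prior_a=1.0, prior_b=1.0):
    """Return (posterior mean, 2.5% quantile, 97.5% quantile) for a Beta posterior."""
    a = prior_a + supporting   # ASSUMPTION: each supporting dataset adds one pseudo-count
    b = prior_b + conflicting  # ASSUMPTION: each conflicting dataset adds one pseudo-count
    mean = a / (a + b)
    return mean, beta_quantile(0.025, a, b), beta_quantile(0.975, a, b)

mean, lo, hi = posterior_summary(supporting=5, conflicting=1)
print(f"posterior mean={mean:.2f}, 95% CrI=[{lo:.2f}, {hi:.2f}]")
```

With five supporting and one conflicting dataset, the posterior under this toy model is Beta(6, 2), whose mean is 0.75. A production engine would presumably weight each dataset by effect size, uncertainty, and QC rather than by a simple count.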
RNA-Seq meta-evidence: “Pathway activation in Disease Y is consistent across cohorts.”
Scenario
A user queries a disease-centric question and requests hypotheses about mechanisms that are reproducible across public cohorts. PromptGenix retrieves multiple GEO/SRA studies, runs standardized processing, and creates evidence objects per dataset (effect size, uncertainty, QC, context descriptors).
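One way to picture a per-dataset evidence object and the resulting coverage tally is the minimal sketch below. The class name, field names, dataset identifiers, and the classification rule are all hypothetical: the rule assumes that an interval entirely on the hypothesized side of zero counts as supporting, an interval entirely on the opposite side counts as conflicting, and QC failures or intervals spanning zero count as missing.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class EvidenceObject:
    accession: str      # placeholder dataset ID, not a real accession
    effect_size: float  # e.g., a log2 fold change or pathway score
    std_error: float    # uncertainty of the effect size
    passed_qc: bool
    context: dict = field(default_factory=dict)  # e.g., tissue, platform

def classify(ev, expected_sign=1, z=1.96):
    """Label one evidence object relative to the hypothesized direction."""
    if not ev.passed_qc:
        return "missing"
    lower = ev.effect_size - z * ev.std_error
    upper = ev.effect_size + z * ev.std_error
    if expected_sign * lower > 0 and expected_sign * upper > 0:
        return "supporting"
    if expected_sign * lower < 0 and expected_sign * upper < 0:
        return "conflicting"
    return "missing"  # ASSUMPTION: intervals spanning zero are treated as missing

def coverage(evidence, expected_sign=1):
    """Count supporting, conflicting, and missing evidence objects."""
    counts = Counter(classify(ev, expected_sign) for ev in evidence)
    return {k: counts.get(k, 0) for k in ("supporting", "conflicting", "missing")}

# Made-up values for illustration only.
evidence = [
    EvidenceObject("DATASET_A", 1.2, 0.3, True, {"tissue": "blood"}),
    EvidenceObject("DATASET_B", 0.8, 0.2, True, {"tissue": "blood"}),
    EvidenceObject("DATASET_C", -0.9, 0.3, True, {"tissue": "biopsy"}),
    EvidenceObject("DATASET_D", 0.1, 0.4, True, {"tissue": "blood"}),
    EvidenceObject("DATASET_E", 1.0, 0.2, False, {"tissue": "blood"}),
]
print(coverage(evidence))
```

Keeping the context descriptors on the object itself is what would allow later stratification (for example by tissue) without re-running the processing step.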
| Hypothesis template | Posterior / 95% CrI | Coverage | Next steps |
|---|---|---|---|
| H1: “Pathway \(X\) is upregulated in Disease \(Y\) vs controls.” | \(P(H_1\mid E)=0.86\); 95% CrI \([0.78,\,0.92]\) | Supporting: 5; Conflicting: 1; Missing: 0 | Targeted qPCR panel for pathway genes; validate in an independent cohort; test upstream regulator perturbation. |
| H2: “Gene \(G\) is a context-specific driver (cell-type enriched) in Disease \(Y\).” | \(P(H_2\mid E)=0.63\); 95% CrI \([0.49,\,0.76]\) | Supporting: 3; Conflicting: 2; Missing: 1 | Stratify by metadata (age/sex/tissue); check cell-type markers; run sensitivity analysis with alternative normalization/QC thresholds. |
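To make "consistent across cohorts" concrete, the sketch below pools per-dataset effect sizes with standard fixed-effect inverse-variance weighting and reports Cochran's Q as a simple heterogeneity measure. This is a textbook meta-analysis technique used here for illustration; it is an assumption that something similar feeds the engine, and the function name and input values are made up.

```python
import math

def pool_fixed_effect(effects, std_errors):
    """Inverse-variance pooled effect, its standard error, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in std_errors]  # precise studies weigh more
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Q sums weighted squared deviations from the pooled effect;
    # values well above (number of studies - 1) suggest heterogeneity.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return pooled, pooled_se, q

# Made-up per-cohort pathway effects and standard errors.
pooled, pooled_se, q = pool_fixed_effect([0.9, 1.1, 0.7], [0.2, 0.2, 0.4])
print(f"pooled={pooled:.3f}, se={pooled_se:.3f}, Q={q:.2f}")
```

A random-effects model would be the natural alternative when Q indicates that cohorts disagree by more than sampling error.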
Flow cytometry evidence: “A cell subset expands and correlates with outcome.”
Scenario
A user asks whether a specific immune subset (e.g., activated T cell phenotype) differs between groups. PromptGenix retrieves public FCS/processed matrices where available, standardizes metadata, and extracts evidence features (frequency differences, uncertainty, batch/study effects, and robustness checks).
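As a minimal sketch of one such evidence feature, the snippet below computes a per-study difference in subset frequency between two groups together with a Welch-style standard error. The function name and the sample values are hypothetical, and real inputs would come from gated FCS data after batch and study effects have been handled.

```python
import math
import statistics

def frequency_difference(group_a, group_b):
    """Mean difference (A - B) in subset frequency and its Welch standard error."""
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Unequal-variance (Welch) standard error of the difference in means.
    se = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return diff, se

# Made-up subset frequencies (% of parent population) per sample.
diff, se = frequency_difference([12.0, 14.0, 16.0], [8.0, 10.0, 12.0])
print(f"difference={diff:.1f} percentage points, se={se:.2f}")
```

The resulting pair (difference, standard error) is the kind of quantity a per-study evidence object could carry forward into scoring.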
| Hypothesis template | Posterior / 95% CrI | Coverage | Next steps |
|---|---|---|---|
| H3: “Subset \(S\) frequency is higher in Group A vs Group B.” | \(P(H_3\mid E)=0.71\); 95% CrI \([0.58,\,0.82]\) | Supporting: 4; Conflicting: 3; Missing: 0 | Report sensitivity to batch correction; confirm with orthogonal marker panel; define pre-registered gating/threshold rules for validation. |
| H4: “Subset \(S\) correlates with outcome \(O\) (direction consistent).” | \(P(H_4\mid E)=0.57\); 95% CrI \([0.41,\,0.72]\) | Supporting: 2; Conflicting: 2; Missing: 3 | Acquire additional cohorts; harmonize outcome definition; run robustness checks with alternative models. |
Evidence synthesis: “Mechanism is supported by both omics and immune profiling.”
Scenario
A user wants hypotheses that are supported across modalities (gene expression + immune phenotypes). PromptGenix constructs modality-specific evidence objects and performs an integrated update that favors robust cross-modal agreement while surfacing conflicts explicitly.
| Hypothesis template | Posterior / 95% CrI | Coverage | Next steps |
|---|---|---|---|
| H5: “Mechanism \(M\) explains both transcriptional and immune-phenotype changes.” | \(P(H_5\mid E)=0.80\); 95% CrI \([0.69,\,0.88]\) | Supporting: 6; Conflicting: 1; Missing: 1 | Propose a minimal validation set: confirm pathway activation + phenotype shift in a single prospective dataset; prioritize perturbation targets. |
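The integrated cross-modal update can be pictured with the sketch below, which adds per-modality log Bayes factors to the prior log-odds and flags disagreement in direction. It assumes the modalities are conditionally independent given the hypothesis, which is a strong simplification, and it returns a point posterior only, without a credible interval. The function name, modality labels, and numbers are hypothetical.

```python
import math

def integrated_update(prior_prob, log_bayes_factors):
    """Combine per-modality log Bayes factors into a posterior probability.

    ASSUMPTION: modalities are conditionally independent given the hypothesis,
    so their log Bayes factors simply add on the log-odds scale.
    """
    prior_log_odds = math.log(prior_prob / (1.0 - prior_prob))
    log_odds = prior_log_odds + sum(log_bayes_factors.values())
    posterior = 1.0 / (1.0 + math.exp(-log_odds))
    # Surface a conflict when modalities point in opposite directions.
    directions = {lbf > 0 for lbf in log_bayes_factors.values() if lbf != 0}
    conflict = len(directions) > 1
    return posterior, conflict

# Agreement: both modalities favor the hypothesis.
agree = integrated_update(0.5, {"rna_seq": math.log(3), "flow": math.log(2)})
# Disagreement: the modalities cancel out and the conflict flag is raised.
disagree = integrated_update(0.5, {"rna_seq": math.log(3), "flow": -math.log(3)})
print(agree, disagree)
```

In the agreement case the prior odds of 1 are multiplied by 3 and by 2, giving posterior odds of 6 (probability 6/7). In the disagreement case the posterior returns to the prior of 0.5, and the flag makes the underlying conflict visible rather than hiding it behind an unchanged score.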
What reviewers will see in the generated report
Report artifacts
- Ranked hypothesis table: posterior + CrI + coverage.
- Evidence appendix: per-study effect sizes, uncertainty, QC, and context.
- Citation panel: supporting vs. conflicting papers (with timestamps/links).
- “Next steps” plan: minimal experiments + required data.
Phase I KPIs (examples)
- Runtime: accession-driven report generated in under 24 hours (dataset-dependent).
- Reproducibility: deterministic reruns with pinned versions + config snapshots + checksums.
- Usefulness: pilot feedback that rankings guided ≥1 concrete analysis/experiment decision.
- Traceability: every ranked hypothesis links to evidence objects + citations.
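The reproducibility KPI above mentions config snapshots and checksums. One minimal way to fingerprint a run configuration, sketched here with only the Python standard library, is to hash a canonical JSON serialization so that two reruns with the same settings yield the same digest. The configuration keys and values are hypothetical placeholders.

```python
import hashlib
import json

def config_checksum(config):
    """SHA-256 hex digest of a canonical JSON serialization of the config."""
    # Sorted keys and fixed separators make the serialization order-independent.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical run configuration; every key and value is a placeholder.
run_config = {
    "pipeline_version": "0.0.0-example",
    "random_seed": 12345,
    "qc": {"min_reads": 1000000},
    "accessions": ["DATASET_A", "DATASET_B"],
}
# Same settings, different key order: the checksum should not change.
reordered = {
    "accessions": ["DATASET_A", "DATASET_B"],
    "qc": {"min_reads": 1000000},
    "random_seed": 12345,
    "pipeline_version": "0.0.0-example",
}
print(config_checksum(run_config) == config_checksum(reordered))
```

Storing the digest alongside each generated report would let a reviewer confirm that a rerun used an identical configuration; list order still matters, so accession lists would need a fixed ordering convention.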