Use Cases – Public-Data Demo Scenarios (Phase I)

PromptGenix LLC · Contact: dohoon.kim1@icloud.com · promptgenix.org
Why these demos

Phase I demonstrations use public datasets (e.g., GEO/SRA, FlowRepository) and public literature (PubMed/PMC) to show feasibility, traceability, and calibrated uncertainty in hypothesis prioritization. Consistent with the Phase I design, confidence is computed by the Bayesian engine; the LLM is used only for evidence-constrained explanations.

What each demo outputs

  • Ranked hypotheses as testable statements (templates).
  • Posterior confidence with credible intervals (CrI).
  • Evidence coverage: supporting vs. conflicting vs. missing.
  • Traceability: every score links to evidence objects + citations.
  • Actionability: recommended next-step experiments/analyses.
Posterior \(P(H\mid E)\) · 95% CrI · Coverage · Evidence-linked
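
How the engine derives its credible intervals is not specified here. As one possible illustration, the sketch below assumes the additive log-odds form used by the engine, treats each evidence feature as normally distributed around its estimate, and reads the 95% CrI off Monte Carlo draws. The function names, the normal assumption, and all numeric values are hypothetical.

```python
import math
import random

def sample_posteriors(prior, weights, means, ses, n_draws=4000, seed=0):
    """Draw posterior probabilities by sampling each feature from N(mean, se).

    Assumption: features are independent and normally distributed; this is
    an illustrative choice, not the documented engine behavior.
    """
    rng = random.Random(seed)  # fixed seed keeps reruns deterministic
    prior_logit = math.log(prior / (1.0 - prior))
    draws = []
    for _ in range(n_draws):
        z = prior_logit + sum(
            w * rng.gauss(m, se) for w, m, se in zip(weights, means, ses)
        )
        draws.append(1.0 / (1.0 + math.exp(-z)))
    return draws

def credible_interval(samples, level=0.95):
    """Equal-tailed interval taken from the sorted draws."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lo = ordered[int(math.floor(tail * (n - 1)))]
    hi = ordered[int(math.ceil((1.0 - tail) * (n - 1)))]
    return lo, hi

# Hypothetical prior, weights, feature estimates, and standard errors.
draws = sample_posteriors(0.5, [1.0, 0.5], [1.2, -0.4], [0.3, 0.2])
lo, hi = credible_interval(draws)
print(f"95% CrI: [{lo:.2f}, {hi:.2f}]")
```

Fixing the seed makes the interval reproducible across reruns, in line with the deterministic-rerun goal stated for Phase I.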

Bayesian scoring (engine)

For each hypothesis \(H\), we construct a structured prior and update with standardized evidence features. The engine outputs a posterior distribution (not a single opaque score).

Core update \[ P(H\mid E) \propto P(H)\,P(E\mid H) \]
Operational example (Phase I) \[ \text{logit}\big(P(H\mid E)\big) = \text{logit}\big(P(H)\big) + \sum_{j=1}^{m} w_j \, s_j(E) \]
where \(s_j(E)\) are evidence-derived features (e.g., effect size, uncertainty, cross-dataset consistency, QC), and \(w_j\) are fixed or learned weights pinned under deterministic versioning.
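
The log-odds update above can be sketched in a few lines of Python. This is a minimal illustration of the formula only: the function names are hypothetical, and the prior, weights, and feature values are made-up numbers rather than Phase I parameters.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def sigmoid(x: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))

def posterior_from_features(prior, weights, features):
    """logit(P(H|E)) = logit(P(H)) + sum_j w_j * s_j(E), mapped back to a probability."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    z = logit(prior) + sum(w * s for w, s in zip(weights, features))
    return sigmoid(z)

# Hypothetical example: neutral prior, two standardized evidence features.
posterior = posterior_from_features(prior=0.5, weights=[1.0, 0.5], features=[1.2, -0.4])
print(round(posterior, 3))
```

With no evidence features the function returns the prior unchanged, which matches the formula: the sum is empty and only the prior log-odds remain.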
Interpretability rule: The LLM can only explain what is present in evidence objects and citations; no evidence → no claim.
Demo 1

RNA-Seq meta-evidence: “Pathway activation in Disease Y is consistent across cohorts.”

Scenario

Public RNA-Seq · Cross-study consistency · Strong supporting evidence

A user queries a disease-centric question and requests hypotheses about mechanisms that are reproducible across public cohorts. PromptGenix retrieves multiple GEO/SRA studies, runs standardized processing, and creates evidence objects per dataset (effect size, uncertainty, QC, context descriptors).
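
A per-dataset evidence object of the kind described above might look like the following sketch. The class name, field names, and the accession placeholder are assumptions made for illustration; the interval uses a normal approximation (effect ± 1.96 × SE), which is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass(frozen=True)
class EvidenceObject:
    """One dataset's contribution to a hypothesis (hypothetical schema)."""
    evidence_id: str
    accession: str            # public dataset identifier (placeholder below)
    effect_size: float        # e.g., a standardized effect or log fold change
    standard_error: float     # uncertainty of the effect estimate
    qc_passed: bool           # summary QC flag
    context: Dict[str, str] = field(default_factory=dict)  # tissue, cohort, etc.
    citations: Tuple[str, ...] = ()                         # linked literature IDs

    def ci95(self) -> Tuple[float, float]:
        """Normal-approximation 95% interval around the effect size."""
        half_width = 1.96 * self.standard_error
        return (self.effect_size - half_width, self.effect_size + half_width)

# Hypothetical record; the accession and citation are placeholders, not real IDs.
ev = EvidenceObject(
    evidence_id="ev-001",
    accession="ACCESSION_PLACEHOLDER",
    effect_size=0.8,
    standard_error=0.2,
    qc_passed=True,
    context={"tissue": "blood"},
    citations=("citation-placeholder",),
)
low, high = ev.ci95()
print(f"{ev.evidence_id}: effect {ev.effect_size} (95% CI {low:.3f} to {high:.3f})")
```

Making the record immutable (`frozen=True`) is a design choice that suits deterministic reruns: once created, an evidence object cannot drift between scoring and reporting.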

Hypothesis templates, each with posterior / 95% CrI, coverage, and next steps:

H1: “Pathway \(X\) is upregulated in Disease \(Y\) vs controls.”
  • Posterior / 95% CrI: \(P(H_1\mid E)=0.86\), 95% CrI \([0.78,\,0.92]\)
  • Coverage: Supporting: 5 · Conflicting: 1 · Missing: 0
  • Next steps: Targeted qPCR panel for pathway genes; validate in an independent cohort; test upstream regulator perturbation.

H2: “Gene \(G\) is a context-specific driver (cell-type enriched) in Disease \(Y\).”
  • Posterior / 95% CrI: \(P(H_2\mid E)=0.63\), 95% CrI \([0.49,\,0.76]\)
  • Coverage: Supporting: 3 · Conflicting: 2 · Missing: 1
  • Next steps: Stratify by metadata (age/sex/tissue); check cell-type markers; run sensitivity analysis with alternative normalization/QC thresholds.

Traceability: Each posterior links to evidence objects (per-study effect sizes, SE/CI, QC flags, batch/study descriptors) and citations that support or contradict the hypothesis.
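
The traceability requirement can be checked mechanically. The sketch below assumes a simple mapping from hypothesis IDs to evidence IDs and from evidence IDs to citations; both structures, the function name, and the rule that every evidence object must carry at least one citation are illustrative assumptions.

```python
from typing import Dict, List

def untraceable(
    hypotheses: Dict[str, List[str]],
    evidence_citations: Dict[str, List[str]],
) -> List[str]:
    """Return hypothesis IDs whose score cannot be traced to cited evidence.

    A hypothesis passes only if it links to at least one evidence ID and
    every linked ID resolves to an evidence object with a citation.
    """
    failures = []
    for hyp_id, evidence_ids in hypotheses.items():
        ok = bool(evidence_ids) and all(
            evidence_citations.get(eid) for eid in evidence_ids
        )
        if not ok:
            failures.append(hyp_id)
    return failures

# Hypothetical links: H2 cites evidence with no citation, H3 has no evidence.
hypotheses = {"H1": ["ev1", "ev2"], "H2": ["ev3"], "H3": []}
evidence_citations = {"ev1": ["cite-a"], "ev2": ["cite-b"], "ev3": []}
print(untraceable(hypotheses, evidence_citations))
```

Running such a check before a report is generated would flag any ranked hypothesis that lacks an evidence trail.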
Demo 2

Flow cytometry evidence: “A cell subset expands and correlates with outcome.”

Scenario

Public Flow cytometry · Subset frequency · Mixed evidence

A user asks whether a specific immune subset (e.g., activated T cell phenotype) differs between groups. PromptGenix retrieves public FCS/processed matrices where available, standardizes metadata, and extracts evidence features (frequency differences, uncertainty, batch/study effects, and robustness checks).

Hypothesis templates, each with posterior / 95% CrI, coverage, and next steps:

H3: “Subset \(S\) frequency is higher in Group A vs Group B.”
  • Posterior / 95% CrI: \(P(H_3\mid E)=0.71\), 95% CrI \([0.58,\,0.82]\)
  • Coverage: Supporting: 4 · Conflicting: 3 · Missing: 0
  • Next steps: Report sensitivity to batch correction; confirm with orthogonal marker panel; define pre-registered gating/threshold rules for validation.

H4: “Subset \(S\) correlates with outcome \(O\) (direction consistent).”
  • Posterior / 95% CrI: \(P(H_4\mid E)=0.57\), 95% CrI \([0.41,\,0.72]\)
  • Coverage: Supporting: 2 · Conflicting: 2 · Missing: 3
  • Next steps: Acquire additional cohorts; harmonize outcome definition; run robustness checks with alternative models.

Coverage summary (reported) \[ \text{Coverage}(H) = \big(\#\text{support},\,\#\text{conflict},\,\#\text{missing}\big) \]
Coverage is reported alongside \(P(H\mid E)\) and the CrI.
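
The coverage triple is a straightforward count over per-evidence labels. The sketch below assumes each evidence item has already been labeled `"support"`, `"conflict"`, or `"missing"`; the label strings and names are hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple

class Coverage(NamedTuple):
    support: int
    conflict: int
    missing: int

ALLOWED_LABELS = {"support", "conflict", "missing"}

def coverage(labels: Iterable[str]) -> Coverage:
    """Count supporting, conflicting, and missing evidence for one hypothesis."""
    counts = Counter(labels)
    unknown = set(counts) - ALLOWED_LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return Coverage(counts["support"], counts["conflict"], counts["missing"])

# Labels matching the H3 row above: 4 supporting, 3 conflicting, 0 missing.
h3_labels = ["support"] * 4 + ["conflict"] * 3
h3_coverage = coverage(h3_labels)
print(h3_coverage)
```

Rejecting unknown labels keeps silent miscounts out of the report.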
Demo 3

Evidence synthesis: “Mechanism is supported by both omics and immune profiling.”

Scenario

Multi-modal · RNA-Seq + Flow · Convergent signals

A user wants hypotheses that are supported across modalities (gene expression + immune phenotypes). PromptGenix constructs modality-specific evidence objects and performs an integrated update that favors robust cross-modal agreement while surfacing conflicts explicitly.

Multi-source update (illustrative) \[ \text{logit}\big(P(H\mid E_{\text{RNA}}, E_{\text{Flow}})\big) = \text{logit}\big(P(H)\big) + \alpha \, S_{\text{RNA}} + \beta \, S_{\text{Flow}} - \gamma \, S_{\text{conflict}} \]
where \(S_{\text{RNA}}\) and \(S_{\text{Flow}}\) summarize standardized evidence strength per modality, and \(S_{\text{conflict}}\) penalizes directional contradictions or context mismatches.
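
The illustrative multi-source update can be sketched as follows. The coefficient defaults and the input values are hypothetical; only the additive form with a subtracted conflict term comes from the formula above.

```python
import math

def multimodal_posterior(prior, s_rna, s_flow, s_conflict,
                         alpha=1.0, beta=1.0, gamma=1.0):
    """logit(P) = logit(prior) + alpha*S_RNA + beta*S_Flow - gamma*S_conflict."""
    prior_logit = math.log(prior / (1.0 - prior))
    z = prior_logit + alpha * s_rna + beta * s_flow - gamma * s_conflict
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical evidence summaries; not values from any Phase I run.
agreeing = multimodal_posterior(prior=0.5, s_rna=0.9, s_flow=0.6, s_conflict=0.0)
with_conflict = multimodal_posterior(prior=0.5, s_rna=0.9, s_flow=0.6, s_conflict=0.5)
print(f"no conflict: {agreeing:.3f}, with conflict: {with_conflict:.3f}")
```

Holding the modality scores fixed and raising only the conflict score lowers the posterior, which is the behavior the penalty term is meant to produce.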
Hypothesis template, with posterior / 95% CrI, coverage, and next steps:

H5: “Mechanism \(M\) explains both transcriptional and immune-phenotype changes.”
  • Posterior / 95% CrI: \(P(H_5\mid E)=0.80\), 95% CrI \([0.69,\,0.88]\)
  • Coverage: Supporting: 6 · Conflicting: 1 · Missing: 1
  • Next steps: Propose a minimal validation set: confirm pathway activation + phenotype shift in a single prospective dataset; prioritize perturbation targets.

LLM role here: produce a reviewer-friendly narrative explaining why the engine ranked \(H_5\) high, explicitly citing supporting and conflicting evidence links; it does not change \(P(H\mid E)\).
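
One way to enforce an evidence-constrained narrative is to filter generated sentences against the known evidence IDs before they reach the report. The sketch below assumes the LLM output has already been parsed into (sentence, cited evidence IDs) pairs; that parsing step, the names, and the example sentences are hypothetical.

```python
from typing import List, Set, Tuple

Claim = Tuple[str, List[str]]  # (sentence, evidence IDs the sentence cites)

def filter_claims(claims: List[Claim], known_evidence_ids: Set[str]) -> List[Claim]:
    """Keep only claims that cite at least one evidence ID, all of them known."""
    kept = []
    for sentence, cited in claims:
        if cited and all(eid in known_evidence_ids for eid in cited):
            kept.append((sentence, cited))
    return kept

# Hypothetical narrative fragments and evidence IDs.
known = {"ev1", "ev2"}
claims = [
    ("Pathway activation is consistent across cohorts.", ["ev1", "ev2"]),
    ("The effect is likely causal.", []),               # no evidence cited
    ("A third cohort confirms the result.", ["ev9"]),   # unknown evidence ID
]
kept = filter_claims(claims, known)
print([sentence for sentence, _ in kept])
```

The filter only removes text; it never touches the posterior, so the narrative stays downstream of the engine's score.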
Phase I evaluation outputs

What reviewers will see in the generated report

Report artifacts

  • Ranked hypothesis table: posterior + CrI + coverage.
  • Evidence appendix: per-study effect sizes, uncertainty, QC, and context.
  • Citation panel: supporting vs. conflicting papers (with timestamps/links).
  • “Next steps” plan: minimal experiments + required data.

Phase I KPIs (examples)

  • Runtime: accession-driven report generated in under 24 hours (dataset-dependent).
  • Reproducibility: deterministic reruns with pinned versions + config snapshots + checksums.
  • Usefulness: pilot feedback that rankings guided ≥1 concrete analysis/experiment decision.
  • Traceability: every ranked hypothesis links to evidence objects + citations.
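
The reproducibility KPI relies on config snapshots and checksums. A minimal sketch, assuming the run configuration is a JSON-serializable dictionary (the keys and version strings shown are hypothetical), is to hash a canonical serialization so that identical settings always yield the same digest:

```python
import hashlib
import json

def config_checksum(config: dict) -> str:
    """SHA-256 of a canonical JSON snapshot (sorted keys, compact separators)."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical run configuration with pinned versions.
config_a = {"engine_version": "0.1.0", "weights_version": "w-2024-01", "seed": 0}
config_b = {"seed": 0, "weights_version": "w-2024-01", "engine_version": "0.1.0"}

# Key order does not matter, so both snapshots hash identically.
print(config_checksum(config_a) == config_checksum(config_b))
```

Storing the digest next to each report gives a quick way to confirm that a rerun used the same pinned versions and settings.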
Reviewer-friendly principle: transparent ranking with uncertainty — not black-box generation.