Specific Aims (Phase I): PromptGenix – Hypothesis Intelligence Engine for Public Biomedical Data (v3)

PromptGenix LLC (Women & Minority-Owned) · Contact: dohoon.kim1@icloud.com · Website: promptgenix.org/sbir.html

Significance

Public biomedical resources contain vast omics/immune profiling datasets and a rapidly expanding literature, yet a key bottleneck remains: converting heterogeneous evidence into prioritized, testable hypotheses with quantified uncertainty. Fragmented tools and bespoke scripts slow turnaround and reduce reproducibility, while interpretation depends on scarce expert time. What is missing is a scalable, transparent mechanism for evidence-weighted scientific synthesis and decision support.

Innovation

PromptGenix introduces a hypothesis intelligence engine that prioritizes mechanistic hypotheses using statistical evidence modeling and Bayesian updating over public biomedical data and literature-derived priors. LLMs are restricted to interpretive tasks (explaining rankings, summarizing evidence, citing sources) and do not determine confidence, which is computed by the statistical layer. Unlike tools that produce plots or narrative summaries alone, PromptGenix delivers evidence-linked hypothesis rankings with explicit uncertainty labeling and actionable next steps.

Overall objective & central hypothesis

Objective: Deliver a validated Phase I prototype that retrieves public datasets and literature, constructs standardized evidence objects, and outputs ranked biological hypotheses with probabilistic confidence, traceable evidence links, and recommended follow-up experiments.

Central hypothesis: An evidence-weighted decision engine that combines statistical scoring and Bayesian inference with LLM use restricted to interpretation will reduce time-to-hypothesis while improving transparency, reproducibility, and usefulness for study design decisions in immunology and infectious disease research.

Specific aims

  • Aim 1. Build a robust public-data ingest and evidence feature layer that generates standardized evidence objects (effect sizes, uncertainty-aware descriptors, reproducibility checks, context) from selected datasets and metadata.
  • Aim 2. Implement an evidence-weighted hypothesis prioritization engine using statistical scoring and Bayesian updating to compute posterior confidence and rank candidate hypotheses with explicit uncertainty annotations.
  • Aim 3. Deliver reviewer-ready HTML/PDF outputs and evaluate usability via 2–3 pilot studies, measuring traceability, rerun reproducibility, turnaround time, and “useful” ratings for hypothesis recommendations.
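
To make the link between Aim 1 and Aim 2 concrete, the following is a minimal sketch of how standardized evidence objects might feed a Bayesian update. It is illustrative only and not the PromptGenix implementation: the field names (source_id, effect_size, std_error), the two-point Gaussian likelihood model, the expected_effect parameter, the cap on each contribution, and the assumption that evidence items are conditionally independent are all assumptions introduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceObject:
    """Hypothetical standardized evidence record (field names are assumptions)."""
    source_id: str       # placeholder label for a dataset or publication
    effect_size: float   # observed standardized effect estimate
    std_error: float     # standard error of that estimate (must be > 0)


def log_likelihood_ratio(ev, expected_effect, cap=5.0):
    """Log likelihood ratio of H1 (true effect = expected_effect) vs. H0 (no effect).

    Assumes a Gaussian likelihood with known standard error, so that
    log LR = (2 * x * mu - mu**2) / (2 * se**2). The result is clipped to
    [-cap, cap] so a single dataset cannot dominate the ranking.
    """
    raw = (2.0 * ev.effect_size * expected_effect - expected_effect ** 2) / (
        2.0 * ev.std_error ** 2
    )
    return max(-cap, min(cap, raw))


def posterior_confidence(prior, evidence, expected_effect):
    """Update a prior probability (strictly between 0 and 1) in log-odds space.

    Assumes the evidence objects are conditionally independent given the hypothesis.
    """
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(log_likelihood_ratio(ev, expected_effect) for ev in evidence)
    return 1.0 / (1.0 + math.exp(-log_odds))


def rank_hypotheses(candidates):
    """Rank hypotheses by posterior confidence, highest first.

    candidates maps a hypothesis name to (prior, evidence list, expected_effect).
    """
    scored = {
        name: posterior_confidence(prior, evs, mu)
        for name, (prior, evs, mu) in candidates.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up values for illustration only; not real data.
    candidates = {
        "hypothesis_A": (0.2, [EvidenceObject("dataset_1", 1.1, 0.5),
                               EvidenceObject("dataset_2", 0.9, 0.6)], 1.0),
        "hypothesis_B": (0.2, [EvidenceObject("dataset_1", -0.3, 0.5)], 1.0),
    }
    for name, posterior in rank_hypotheses(candidates):
        print(f"{name}: posterior confidence {posterior:.3f}")
```

In this sketch the literature-derived prior enters only through the prior argument, and each dataset contributes one additive term in log-odds space, which keeps every change in confidence traceable to a specific evidence object.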

Expected outcomes & deliverables

  • Deterministic, reproducible evidence objects and ranked hypotheses across reruns (versioned configs, checksums, traceable artifacts).
  • Evidence-linked hypotheses with posterior confidence estimates and explicit coverage of supporting versus conflicting evidence.
  • Reviewer-ready reports generated in under 24 hours for accession-driven runs (dataset-dependent), with reduced manual synthesis time.
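
As one illustration of how rerun reproducibility could be checked, the sketch below hashes a canonical JSON serialization of the evidence objects together with the versioned configuration. The function names and the payload layout are hypothetical, and SHA-256 over sorted-key JSON is only one reasonable choice; it is not stated to be the project's actual mechanism.

```python
import hashlib
import json


def canonical_bytes(obj):
    """Serialize to JSON with sorted keys and fixed separators so that
    logically identical inputs always produce identical bytes."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    ).encode("utf-8")


def artifact_checksum(evidence, config):
    """Return a SHA-256 hex digest covering both the evidence objects and the
    configuration that produced them (hypothetical payload layout)."""
    payload = {"config": config, "evidence": evidence}
    return hashlib.sha256(canonical_bytes(payload)).hexdigest()


if __name__ == "__main__":
    # Made-up values for illustration only.
    config = {"pipeline_version": "0.1", "seed": 7}
    evidence = [{"source_id": "dataset_1", "effect_size": 1.1, "std_error": 0.5}]
    first_run = artifact_checksum(evidence, config)
    second_run = artifact_checksum(evidence, config)
    print("reruns match:", first_run == second_run)
```

Because the digest covers the configuration as well as the evidence, a change in either one yields a different checksum. Floating-point values that differ in their last bits across platforms would also change the digest, so rounding to a fixed precision before hashing may be needed in practice.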

Impact

Successful completion of Phase I will de-risk Phase II development of a scalable hypothesis intelligence platform for faster, more transparent evidence integration and prioritization that supports, rather than replaces, human scientific decision-making.

Phase I SBIR – NIH/NIAID focus · Public-data validation · Women & Minority-Owned Small Business