Specific Aims (Phase I): PromptGenix – BioAI Automation Platform
PromptGenix LLC (Women & Minority-Owned) · Contact: dohoon.kim1@icloud.com · Website: promptgenix.org/sbir.html
Significance

Public immunology and infectious disease research increasingly relies on large-scale RNA-Seq, scRNA-Seq, and high-dimensional flow cytometry datasets, yet hypothesis generation remains a major bottleneck. Fragmented tools, bespoke scripts, and limited expert time slow analysis, reduce reproducibility, and delay experimental decision-making. The critical unmet need is a traceable, Bayesian framework that integrates heterogeneous public evidence into prioritized, testable biological hypotheses with quantified uncertainty.

Innovation

PromptGenix introduces a traceable hypothesis intelligence engine that ranks mechanistic hypotheses using statistical evidence modeling and Bayesian updating across public omics datasets and literature-derived priors. For each hypothesis H, a prior probability is constructed from pathway relevance, cell-type specificity, and literature support. That prior is then updated via likelihood functions derived from observed effect sizes, confidence metrics, and cross-dataset consistency, yielding a posterior confidence score. Large language models are used strictly for interpretability (explaining rankings and citing supporting or conflicting evidence), not to determine confidence.
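
As an illustration only, the prior-to-posterior update described above can be sketched in log-odds form. Everything in this sketch is an assumption made for clarity rather than the platform's actual model: the equal-weight averaging of the three prior components, the representation of each dataset as a single likelihood ratio, the treatment of datasets as conditionally independent, and the names Evidence, prior_probability, posterior_probability, and rank_hypotheses.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One dataset's contribution to a hypothesis (hypothetical structure)."""
    dataset_id: str          # placeholder label, not a real accession
    likelihood_ratio: float  # P(data | H) / P(data | not H); must be > 0


def prior_probability(pathway_relevance, cell_type_specificity, literature_support):
    """Average three scores in [0, 1] into a prior (equal weights are an assumption)."""
    prior = (pathway_relevance + cell_type_specificity + literature_support) / 3.0
    # Clamp away from 0 and 1 so the log-odds below stay finite.
    return min(max(prior, 1e-6), 1.0 - 1e-6)


def posterior_probability(prior, evidence):
    """Bayesian update in log-odds space, assuming independent datasets."""
    log_odds = math.log(prior / (1.0 - prior))
    for item in evidence:
        log_odds += math.log(item.likelihood_ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


def rank_hypotheses(candidates):
    """Return (name, posterior) pairs sorted from most to least supported.

    `candidates` maps a hypothesis name to a (prior, list of Evidence) tuple.
    """
    scored = [
        (name, posterior_probability(prior, evidence))
        for name, (prior, evidence) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    candidates = {
        "H1": (prior_probability(0.6, 0.3, 0.6),
               [Evidence("dataset_A", 3.0), Evidence("dataset_B", 2.0)]),
        "H2": (prior_probability(0.6, 0.3, 0.6),
               [Evidence("dataset_A", 0.5)]),
    }
    for name, posterior in rank_hypotheses(candidates):
        print(f"{name}: posterior = {posterior:.3f}")
```

Working in log-odds keeps each dataset's contribution additive, so the score can be traced back to individual evidence items. A likelihood ratio below 1 lowers the posterior, which is how conflicting evidence would enter this sketch.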

Overall objective & central hypothesis

This project will deliver a validated Phase I prototype that automatically ingests public RNA-Seq and flow cytometry datasets, constructs standardized evidence objects, and outputs ranked biological hypotheses with probabilistic confidence, traceable evidence links, and recommended validation experiments. We hypothesize that combining deterministic automated pipelines with evidence-weighted Bayesian inference and constrained LLM-based interpretability will significantly reduce hypothesis generation time while improving transparency, reproducibility, and decision quality.
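
To make the notion of a standardized, traceable evidence object more concrete, the following is a minimal sketch of one possible shape. The field names, the EvidenceObject class, the fingerprint method, and the accession string "GSE000000" are all hypothetical placeholders; the actual schema would be defined during Phase I.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EvidenceObject:
    """A single versioned finding derived from one public dataset (hypothetical schema)."""
    accession: str          # public dataset identifier
    assay: str              # e.g. "RNA-Seq" or "flow cytometry"
    pipeline_version: str   # version of the deterministic pipeline that produced it
    feature: str            # gene, pathway, or cell population
    effect_size: float
    adjusted_p_value: float

    def fingerprint(self) -> str:
        """Content hash: identical inputs and pipeline version give an identical ID."""
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    finding = EvidenceObject(
        accession="GSE000000",  # placeholder, not a real accession
        assay="RNA-Seq",
        pipeline_version="0.1.0",
        feature="IFNG",
        effect_size=1.8,
        adjusted_p_value=0.003,
    )
    print(finding.fingerprint())
```

Because the fingerprint is computed from the serialized fields, a report could cite it as a stable link back to the exact evidence and pipeline version behind a ranked hypothesis.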

Specific aims
  • Aim 1: Build deterministic, end-to-end automated RNA-Seq and flow cytometry pipelines using selected public datasets, producing reproducible and versioned evidence outputs.
  • Aim 2: Implement an evidence-weighted hypothesis prioritization engine using statistical scoring and Bayesian updating to compute posterior confidence and rank candidate hypotheses.
  • Aim 3: Generate reviewer-ready reports that demonstrate quantifiable traceability and utility (target rating of >80%) for guiding concrete experimental decisions.
Expected outcomes & deliverables
  • A validated prototype producing reproducible evidence objects and ranked hypotheses with posterior confidence estimates.
  • Reviewer-ready reports generated in under 24 hours for accession-driven runs.
Impact

Phase I will de-risk Phase II development of a scalable hypothesis intelligence platform that enables faster, more transparent evidence integration and prioritization for experimental planning, supporting—but not replacing—human scientific decision-making.

Phase I SBIR – NIH/NIAID focus · Public-data validation · Women & Minority-Owned Small Business