Problem-Directed Compilation
A research question is not a database query. It is a compilation target. The question compiles into morphism chains that surgically extract only what is relevant to the answer.
The Triangle DSL
Research protocols are written in Triangle, an LL(1) domain-specific language where each statement maps to a morphism chain through S-entropy space.
    investigate "Association between ACTN3 genotype and cardiac adaptation in elite sprinters"
        with confidence > 0.95
        with significance < 0.01

    parallel {
        genotype = slice genomics.ACTN3 @ cohort(elite_sprinters) @ variant(rs1815739)
        cardiac = slice echocardiography @ cohort(elite_sprinters) @ measure(LV_mass, EF, GLS)
        protein = slice proteomics @ target(alpha_actinin_3) @ tissue(cardiac_muscle)
    }

    joined = compose genotype with cardiac preserving athlete_id
    result = navigate joined to target via correlation_analysis
        converge at confidence > 0.95
The researcher specifies what to investigate and what evidence is needed. The system handles how: which domain models to invoke, what morphism chains to construct, when the analysis has converged.
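To make the LL(1) claim concrete, here is a minimal Python sketch of a recursive-descent parser for a single slice statement. The grammar in the comment is an assumption inferred from the example protocol, not the actual Triangle grammar, and the names Slice, Parser, and parse_slice are hypothetical. Each parsing decision looks at exactly one upcoming token, which is the defining property of LL(1).

```python
import re
from dataclasses import dataclass, field

# Hypothetical grammar for one slice statement (inferred from the example):
#   stmt   := IDENT '=' 'slice' path filter*
#   path   := IDENT ('.' IDENT)*
#   filter := '@' IDENT '(' IDENT (',' IDENT)* ')'
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z_0-9]*|[=.@(),])")


@dataclass
class Slice:
    name: str
    source: list
    filters: list = field(default_factory=list)  # [(filter_name, [args]), ...]


def tokenize(text):
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        # One token of lookahead is all the parser ever inspects.
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def expect(self, literal):
        tok = self.peek()
        if tok != literal:
            raise SyntaxError(f"expected {literal!r}, got {tok!r}")
        self.i += 1

    def ident(self):
        tok = self.peek()
        if tok is None or not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected identifier, got {tok!r}")
        self.i += 1
        return tok

    def slice_stmt(self):
        name = self.ident()
        self.expect("=")
        self.expect("slice")
        source = [self.ident()]
        while self.peek() == ".":
            self.expect(".")
            source.append(self.ident())
        filters = []
        while self.peek() == "@":
            self.expect("@")
            fname = self.ident()
            self.expect("(")
            args = [self.ident()]
            while self.peek() == ",":
                self.expect(",")
                args.append(self.ident())
            self.expect(")")
            filters.append((fname, args))
        if self.peek() is not None:
            raise SyntaxError(f"unexpected trailing token {self.peek()!r}")
        return Slice(name, source, filters)


def parse_slice(text):
    return Parser(tokenize(text)).slice_stmt()
```

Calling parse_slice on the genotype line of the example yields a source path of genomics, ACTN3 followed by a cohort filter and a variant filter, in order. In this reading, that ordered filter list is the raw material from which a morphism chain would be built.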
Surgical Extraction Results
Figure: ghost bars show the full dataset size; solid bars show the surgically extracted data. Each source is reduced by a factor of 10⁸ to 10⁹ through problem-directed extraction.
For any research question Q with answer A_Q and any dataset D, the extracted representation σ is a sufficient statistic for A_Q, with information content bounded by the mutual information I(D; A_Q). The raw data, whose entropy is H(D), is never accessed beyond this bound.
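The bound can be spelled out with standard information-theoretic identities. The sketch below assumes that A_Q denotes the answer to Q and that σ = f(D) is a deterministic function of the data; neither assumption is stated explicitly above.

```latex
% Data processing: extraction cannot create information about the answer.
I(\sigma; A_Q) \le I(D; A_Q)
% Sufficiency: the extraction loses none of it.
I(\sigma; A_Q) = I(D; A_Q)
% The mutual information is capped by both entropies.
I(D; A_Q) = H(D) - H(D \mid A_Q) \le \min\{ H(D),\, H(A_Q) \}
```

Under these assumptions the bound is governed by the entropy of the answer rather than of the data, so when H(A_Q) is far smaller than H(D) the extracted representation can be far smaller than the dataset.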
The protocol type system enforces four properties: dimensional consistency, conservation compliance, modality compatibility, and confidence monotonicity. All four are checked at compile time, before any data is accessed.
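As an illustration of checking a protocol before any data is touched, the Python sketch below validates two of the properties over step declarations alone. It assumes that modality compatibility means each step's output modality matches the next step's input modality, and that confidence monotonicity means declared confidence thresholds never decrease along the chain. Both readings, and the names StepType and check_protocol, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepType:
    # Declared signature of one protocol step (hypothetical fields).
    name: str
    input_modality: str
    output_modality: str
    min_confidence: float


def check_protocol(steps):
    """Return a list of type errors; an empty list means the chain passed.

    Only declarations are inspected, so no dataset is ever opened.
    """
    errors = []
    for prev, cur in zip(steps, steps[1:]):
        # Assumed reading of modality compatibility.
        if prev.output_modality != cur.input_modality:
            errors.append(
                f"{prev.name} -> {cur.name}: modality mismatch "
                f"({prev.output_modality} != {cur.input_modality})"
            )
        # Assumed reading of confidence monotonicity.
        if cur.min_confidence < prev.min_confidence:
            errors.append(
                f"{prev.name} -> {cur.name}: confidence drops from "
                f"{prev.min_confidence} to {cur.min_confidence}"
            )
    return errors
```

A chain that passes returns an empty list, and a failing chain returns one message per violated rule, so a compiler front end could reject the protocol and report every problem at once.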
Any well-typed protocol decomposes into a sequence of atomic morphisms, each preserving S-entropy conservation. Complex analyses are compositions of simple, verified steps.
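The decomposition claim can be pictured as ordinary function composition with a per-step check. The Python sketch below is a toy model only: it assumes a state is a tuple of three coordinates and that conservation means their sum is invariant. That conservation law, and the names verified, compose, and the example morphisms, are hypothetical stand-ins rather than the definition of S-entropy conservation.

```python
import math
from functools import reduce, wraps


def verified(morphism):
    """Wrap an atomic morphism so every application checks the invariant."""
    @wraps(morphism)
    def wrapped(state):
        out = morphism(state)
        # Toy conservation law (an assumption): the coordinate sum is unchanged.
        if not math.isclose(sum(out), sum(state), abs_tol=1e-9):
            raise ValueError(f"{morphism.__name__} violates conservation")
        return out
    return wrapped


def compose(*morphisms):
    """Apply morphisms left to right; each one is checked when it runs."""
    return lambda state: reduce(lambda s, m: m(s), morphisms, state)


@verified
def shift_first_to_second(state):
    a, b, c = state
    return (a - 0.1, b + 0.1, c)


@verified
def shift_second_to_third(state):
    a, b, c = state
    return (a, b - 0.2, c + 0.2)


@verified
def leaky(state):
    # Deliberately broken step: it discards part of the first coordinate.
    a, b, c = state
    return (a * 0.5, b, c)
```

Composing the two shift steps preserves the invariant end to end because each step preserves it individually, while any chain containing leaky raises at the offending step. This mirrors the idea that a complex analysis is trustworthy when each atomic step is verified.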