A methodology and limitations note describing how Treeova evaluates AI trading agents in its paper environment and how its reinforcement-learning calibration loop updates regime-segmented expectations from observed outcomes. Past performance does not guarantee future results; paper-trading fills are simulated. Reward weights, classifier thresholds, and regime-detection internals are intentionally withheld.

    Methodology Note: Paper Trading Backtesting & RL Calibration

    The Paper-Fill-Simulator models limit-fill conditions, intrinsic-value fallbacks, and phantom-fill protection.

    Phase-aware success classification interprets outcomes by lifecycle phase, not just terminal PnL.

    Bayesian-style alpha/beta updates are maintained separately for each detected market regime.

    Past performance does not guarantee future results; paper fills are simulated.

    Reward weights, phase thresholds, and regime-detection internals are intentionally withheld.

    Methodology · RL Calibration · Compliance
    Treeova Whitepaper · v1.0

    WP-10 — Methodology Note: Paper Trading Backtesting & RL Calibration

    Authored by Treeova Research · Research Collective · Updated 2026-04-18

    #1. Overview

    This note documents how Treeova evaluates the behavior of AI trading agents in its paper environment, and how the platform's reinforcement-learning calibration loop turns observed outcomes into regime-segmented updates. It is intentionally a methodology and limitations document, not a performance brochure.

    Numbers reported anywhere on the platform under "paper trading performance" describe a simulated environment. They are useful for comparing strategies and detecting regressions; they are not predictive of live performance.

    #2. Paper-Fill-Simulator

    The Paper-Fill-Simulator is the engine that models executed fills inside the paper environment. Its commitments are:

    • Limit-order fidelity. Limit orders fill only when the simulated quote conditions actually justify a fill, never on the bare assumption that the limit price was reached.
    • Intrinsic-value fallback. When market data for an options contract is unreliable, settlement falls back to intrinsic value rather than booking a phantom price.
    • Phantom-fill protection. Fills derived from stale or one-sided quotes are refused. The simulator will not crystallize a number it cannot trust.
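
    The three commitments above can be sketched in a few lines of Python. This is an illustrative sketch only, not Treeova's simulator: the `Quote` type, the five-second staleness cutoff, the fill-at-ask rule, and the mid-price settlement are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical staleness cutoff; the real value is not published.
MAX_QUOTE_AGE_SECONDS = 5.0


@dataclass(frozen=True)
class Quote:
    bid: Optional[float]  # None models a missing side of the book
    ask: Optional[float]
    age_seconds: float    # time since the quote was last refreshed


def quote_is_trustworthy(quote: Quote) -> bool:
    """Reject one-sided, stale, or crossed quotes."""
    if quote.bid is None or quote.ask is None:
        return False
    if quote.age_seconds > MAX_QUOTE_AGE_SECONDS:
        return False
    return quote.bid <= quote.ask


def simulate_limit_buy(limit_price: float, quote: Quote) -> Optional[float]:
    """Return a fill price, or None when no fill is justified."""
    if not quote_is_trustworthy(quote):
        return None  # phantom-fill protection: refuse to fill
    if quote.ask <= limit_price:
        return quote.ask  # assumed rule: fill at the quoted ask
    return None  # the limit was not reached, so the order stays open


def intrinsic_value(is_call: bool, strike: float, underlying: float) -> float:
    """Intrinsic value of an option, floored at zero."""
    if is_call:
        return max(underlying - strike, 0.0)
    return max(strike - underlying, 0.0)


def settlement_price(
    quote: Quote, is_call: bool, strike: float, underlying: float
) -> float:
    """Use the quote mid when it can be trusted, else fall back to intrinsic value."""
    if quote_is_trustworthy(quote):
        return (quote.bid + quote.ask) / 2
    return intrinsic_value(is_call, strike, underlying)
```

    In this sketch an untrustworthy quote never produces a fill, and an option whose quote cannot be trusted settles at intrinsic value rather than at whatever number the stale quote shows.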

    #3. Phase-Aware Success Classification

    Trade outcomes are not classified purely by terminal PnL. Each trade is segmented into lifecycle phases (entry, management, exit), and the reinforcement-learning signal interprets the outcome of each phase in the context appropriate to it. A trade that lost money but was exited cleanly under deteriorating conditions can still contribute a positive management-phase signal; a trade that made money on a position that was held past the point where it should have been closed can still produce a negative management-phase signal.

    The point of this design is to give the calibration loop the information it actually needs to improve agent behavior, rather than reducing every outcome to the single bit of "won/lost."
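
    One way to picture the difference between a single won/lost bit and a per-phase record is the data-structure sketch below. The type and field names are hypothetical, and the per-phase verdicts are taken as inputs because the classifier's actual rules and thresholds are withheld.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    ENTRY = "entry"
    MANAGEMENT = "management"
    EXIT = "exit"


@dataclass(frozen=True)
class TradeOutcome:
    terminal_pnl: float
    # Verdict per phase, produced upstream by rules this sketch does not model.
    phase_signals: dict[Phase, bool]


def summarize(outcome: TradeOutcome) -> dict[str, bool]:
    """Expose the terminal result alongside each phase verdict."""
    summary = {"won": outcome.terminal_pnl > 0}
    for phase, positive in outcome.phase_signals.items():
        summary[f"{phase.value}_positive"] = positive
    return summary


# A losing trade that was nonetheless managed well (illustrative values).
losing_but_well_managed = TradeOutcome(
    terminal_pnl=-150.0,
    phase_signals={
        Phase.ENTRY: False,
        Phase.MANAGEMENT: True,
        Phase.EXIT: True,
    },
)
report = summarize(losing_but_well_managed)
```

    Here `report` keeps the loss and the positive management verdict side by side, which is the information a single won/lost bit would discard.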

    #4. Regime-Segmented Bayesian Calibration

    The platform maintains alpha/beta calibration parameters segmented by detected market regime. New observations update the parameters for the regime in which they occurred, in a Bayesian-style posterior update, without overwriting the parameters of unrelated regimes.

    The practical consequence: confidence in a strategy under "high volatility, mean-reverting" is tracked separately from confidence in the same strategy under "low volatility, trending." Cross-regime contamination — where a streak in one regime inflates expectations in another — is structurally prevented.
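
    A minimal sketch of regime-segmented updating follows, assuming a Beta-Bernoulli model in which each observation is a binary success or failure and each regime starts from a uniform Beta(1, 1) prior. Both assumptions, along with the class names and regime labels used as keys, are illustrative; the platform's actual update rule and regime detector are not disclosed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BetaParams:
    alpha: float = 1.0  # assumed prior pseudo-count of successes
    beta: float = 1.0   # assumed prior pseudo-count of failures

    def update(self, success: bool) -> None:
        """Conjugate Beta-Bernoulli update for one observation."""
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        """Posterior mean success rate."""
        return self.alpha / (self.alpha + self.beta)


class RegimeCalibrator:
    """Keeps one independent BetaParams per regime label."""

    def __init__(self) -> None:
        self._params: dict[str, BetaParams] = defaultdict(BetaParams)

    def observe(self, regime: str, success: bool) -> None:
        # Only the parameters of the observed regime are touched.
        self._params[regime].update(success)

    def expectation(self, regime: str) -> float:
        return self._params[regime].mean


calibrator = RegimeCalibrator()
for _ in range(3):
    calibrator.observe("high_vol_mean_reverting", success=True)
```

    After three successes in one regime, that regime's posterior mean rises to 0.8 while `calibrator.expectation("low_vol_trending")` stays at the prior mean of 0.5, which is the isolation property described above.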

    #5. How Aggregate Statistics Are Reported

    When the platform surfaces aggregate paper-trading statistics, those statistics are computed from the same audited event log used by the calibration loop. They are scoped to the simulator environment and include explicit dating so a reader can tell the window the statistics describe.

    Aggregate statistics are descriptive, not predictive. A reader should treat them as a record of how strategies behaved under the observed regimes during the observed window — not as a forecast of how they will behave next week.
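
    The idea of a statistic that carries its own window and environment label can be sketched as follows. The record layout, the `"paper-simulator"` label, and the choice of win rate as the statistic are assumptions for illustration; the event data in the usage note is made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TradeEvent:
    closed_on: date
    pnl: float


@dataclass(frozen=True)
class WindowedStats:
    window_start: date
    window_end: date
    trade_count: int
    win_rate: Optional[float]  # None when the window holds no trades
    environment: str = "paper-simulator"  # hypothetical scope label


def summarize_window(
    events: list[TradeEvent], start: date, end: date
) -> WindowedStats:
    """Describe the trades closed within [start, end], inclusive."""
    in_window = [e for e in events if start <= e.closed_on <= end]
    wins = sum(1 for e in in_window if e.pnl > 0)
    win_rate = wins / len(in_window) if in_window else None
    return WindowedStats(start, end, len(in_window), win_rate)
```

    Calling `summarize_window(events, date(2026, 1, 1), date(2026, 1, 31))` returns a record whose dates and environment label travel with the number, so the figure cannot be quoted without the window it describes.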

    #6. What This Methodology Withholds

    By design, this whitepaper does not disclose:

    • The reinforcement-learning reward function or its weights.
    • The specific thresholds used by the phase-success classifier.
    • The regime-detection algorithm and its parameters.
    • The per-pass model assignments inside Treeova's intelligence stack.
    • Exact agent prompt strings.

    #7. Limitations & Disclaimers

    • Past performance does not guarantee future results. This applies to every paper-trading number Treeova publishes.
    • Paper-trading fills are simulated. Slippage assumptions are conservative but not adversarial; live execution introduces slippage, partial fills, broker outages, halts, and gaps that cannot be fully modeled.
    • Regime detection is inherently lagged. Calibration updates that depend on regime classification carry the same lag.
    • The reinforcement-learning loop converges only with sufficient sample density per regime. Sparse regimes carry materially larger uncertainty bands than dense ones.
    • Nothing in this whitepaper is investment advice. Trading options involves substantial risk of loss; users should review Treeova's risk disclosures.
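
    The sample-density limitation in the list above can be made concrete under one assumption: that each regime's calibration is a Beta posterior over a success rate, starting from a Beta(1, 1) prior. Under that assumption (which this note does not confirm), the posterior standard deviation shrinks as observations accumulate. The win and loss counts below are invented for illustration.

```python
import math


def beta_std(alpha: float, beta: float) -> float:
    """Standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return math.sqrt(alpha * beta / (total * total * (total + 1.0)))


def posterior_std(wins: int, losses: int) -> float:
    """Posterior spread after the given outcomes, from an assumed Beta(1, 1) prior."""
    return beta_std(1.0 + wins, 1.0 + losses)


# Same 60% observed win rate, very different sample densities.
sparse_regime_std = posterior_std(wins=3, losses=2)
dense_regime_std = posterior_std(wins=300, losses=200)
```

    Both regimes show the same observed win rate, yet the sparse regime's posterior is several times wider than the dense one's, so its expectation deserves correspondingly less weight.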

    © 2026 Treeova Technologies Inc · This whitepaper documents architecture and qualitative behavior only; proprietary internals (formulas, thresholds, prompts, model routing) are intentionally withheld.