Cross-Validation

⏩ Rolling Cross-Validation

Train on the first folds, validate on the next fold, then roll the window forward. Works for classification and regression.

Example visualization

Train: [1 2 3] | Test: [4]
Train: [1 2 3 4] | Test: [5]
Train: [1 2 3 4 5] | Test: [6]
...

var cv = new RollingCrossValidator(pipelineGrid);
var result = cv.Run(X, y);
var bestModel = result.BestPipeline;
var score = result.BestScore;

Key points:

  • Always respects temporal order
  • Prevents data leakage
  • Works well for time series forecasting
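
To make the rolling mechanics concrete, here is a minimal, self-contained C# sketch of how expanding train windows can be generated over ordered samples. It is illustrative only; it does not show the library's internal implementation, and the class and variable names are hypothetical.

using System;
using System.Linq;

class RollingSplitDemo
{
    static void Main()
    {
        int n = 6;        // total number of ordered samples
        int minTrain = 3; // size of the initial training window
        for (int test = minTrain; test < n; test++)
        {
            // Train on every sample strictly before the test index,
            // so future information never leaks into training.
            var train = Enumerable.Range(1, test);
            Console.WriteLine($"Train: [{string.Join(" ", train)}] | Test: [{test + 1}]");
        }
    }
}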

🔁 K-Fold Cross-Validation

Split the data into K equally sized folds. Each fold is used once as the test set while the remaining folds form the training set. Works for classification and regression on tabular data.

Visualization (K = 5)

Data: [ 1 2 3 4 5 ]

Fold 1: Train [2 3 4 5] | Test [1]
Fold 2: Train [1 3 4 5] | Test [2]
Fold 3: Train [1 2 4 5] | Test [3]
Fold 4: Train [1 2 3 5] | Test [4]
Fold 5: Train [1 2 3 4] | Test [5]

var cv = new KFoldCrossValidator(pipelineGrid, folds: 5);
var result = cv.Run(X, y);

var bestModel = result.BestPipeline;
var score = result.BestScore;

Key points:

  • Order of samples does not matter
  • No temporal assumptions
  • All samples are evaluated exactly once
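
As a minimal illustration of the fold arithmetic (a sketch, not the library's implementation), the following standalone snippet partitions sample indices into K folds and prints the resulting splits:

using System;
using System.Linq;

class KFoldDemo
{
    static void Main()
    {
        int n = 5, k = 5;
        var indices = Enumerable.Range(0, n).ToArray();
        for (int fold = 0; fold < k; fold++)
        {
            // Every index lands in exactly one test fold.
            var test = indices.Where(i => i % k == fold).ToArray();
            var train = indices.Except(test).ToArray();
            Console.WriteLine(
                $"Fold {fold + 1}: Train [{string.Join(" ", train.Select(i => i + 1))}] " +
                $"| Test [{string.Join(" ", test.Select(i => i + 1))}]");
        }
    }
}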

🧮 Stratified K-Fold Cross-Validation

Used for classification with imbalanced classes. Ensures that each fold preserves the class proportions.

Example visualization (K = 5)

Class distribution in dataset: 90% class 0, 10% class 1

Fold 1: Train -> 90% class0 / 10% class1 | Test -> 90% class0 / 10% class1
Fold 2: Train -> 90% class0 / 10% class1 | Test -> 90% class0 / 10% class1
...

var cv = new StratifiedKFoldCrossValidator(pipelineGrid, folds: 5);
var result = cv.Run(X, y); // y contains class labels

var bestModel = result.BestPipeline;
var score = result.BestScore;

Key points:

  • Maintains class distribution in every fold
  • Works only for classification
  • Ideal for imbalanced datasets
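
The core idea can be sketched in a few lines: deal each class's samples round-robin across the folds so every fold inherits the dataset's class ratio. This is an illustrative sketch with made-up data, not the library's algorithm.

using System;
using System.Linq;

class StratifiedDemo
{
    static void Main()
    {
        // 90% class 0, 10% class 1, matching the example above.
        int[] y = Enumerable.Repeat(0, 90).Concat(Enumerable.Repeat(1, 10)).ToArray();
        int k = 5;
        var foldOf = new int[y.Length];

        // Assign each class's samples round-robin across the k folds so that
        // every fold keeps (roughly) the same class proportions as the dataset.
        foreach (var cls in y.Distinct().ToArray())
        {
            int slot = 0;
            for (int i = 0; i < y.Length; i++)
                if (y[i] == cls) foldOf[i] = slot++ % k;
        }

        for (int fold = 0; fold < k; fold++)
        {
            var test = Enumerable.Range(0, y.Length).Where(i => foldOf[i] == fold).ToArray();
            int ones = test.Count(i => y[i] == 1);
            Console.WriteLine($"Fold {fold + 1} test set: {test.Length - ones} class0 / {ones} class1");
        }
    }
}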

🔀 ShuffleSplit Cross-Validation

Randomly splits data into a training set and a test set multiple times. Works for classification and regression. Unlike K-Fold, not all samples are guaranteed to appear in a test set.

Example visualization (3 splits, 20% test size)

Split 1: Train [1 2 3 4] | Test [5]
Split 2: Train [1 3 4 5] | Test [2]
Split 3: Train [2 3 4 5] | Test [1]
...

var cv = new ShuffleSplitCrossValidator(
    pipelineGrid,
    n_splits: 5,
    testSize: 0.2,
    trainSize: 0.8,
    randomState: 42);

var result = cv.Run(X, y);

var bestModel = result.BestPipeline;
var score = result.BestScore;

Key points:

  • Randomly shuffles data before each split
  • Can perform multiple iterations (n_splits)
  • Does not guarantee all samples are tested exactly once
  • Useful for large datasets where full K-Fold is costly
  • Can be combined with Pipelines, Series, or TimeSeries
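
A minimal sketch of the split mechanics (illustrative only; the parameter names mirror the constructor above, but the logic is simplified):

using System;
using System.Linq;

class ShuffleSplitDemo
{
    static void Main()
    {
        int n = 5, nSplits = 3;
        double testSize = 0.2;
        var rng = new Random(42); // fixed seed → reproducible splits

        for (int s = 0; s < nSplits; s++)
        {
            // Shuffle all indices, then carve off the first testSize fraction as test.
            var shuffled = Enumerable.Range(1, n).OrderBy(_ => rng.Next()).ToArray();
            int nTest = Math.Max(1, (int)(n * testSize));
            var test = shuffled.Take(nTest).OrderBy(i => i);
            var train = shuffled.Skip(nTest).OrderBy(i => i);
            Console.WriteLine($"Split {s + 1}: Train [{string.Join(" ", train)}] | Test [{string.Join(" ", test)}]");
        }
    }
}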

🎲 Monte Carlo Cross-Validation

Runs many random train/test splits (typically 100–1000) and collects the resulting scores into a full score distribution. Built on top of the library's MonteCarloSimulator engine.

Unlike ShuffleSplit, which returns a single aggregate score, Monte Carlo CV returns a complete MonteCarloResult with confidence intervals, histograms, standard error, and a convergence curve.

Example visualization (200 iterations, 20% test size)

Iteration   1: Train [random 80%] | Test [random 20%] → score = 0.88
Iteration   2: Train [random 80%] | Test [random 20%] → score = 0.85
Iteration   3: Train [random 80%] | Test [random 20%] → score = 0.91
...
Iteration 200: Train [random 80%] | Test [random 20%] → score = 0.87

→ Mean = 0.87, StdDev = 0.03, 95% CI = [0.84, 0.90]

Standard usage (drop-in ICrossValidator)

var cv = new MonteCarloCrossValidator(
    pipelineGrid,
    iterations: 200,
    testSize: 0.2,
    seed: 42);

var result = cv.Run(X, y);

var bestModel = result.BestPipeline;
var score = result.BestScore;

Extended usage (full score distributions)

var cv = new MonteCarloCrossValidator(
    pipelineGrid,
    iterations: 200,
    testSize: 0.2,
    seed: 42);

var detailed = cv.RunDetailed(X, y);

// Confidence interval for the best pipeline
var ci = detailed.BestConfidenceInterval; // e.g. (0.84, 0.90)
double stdDev = detailed.BestScoreStdDev; // e.g. 0.03

// Convergence curve — verify that enough iterations were run
double[] convergence = detailed.ConvergenceCurve;

// Full MonteCarloResult per pipeline
foreach (var (pipeline, mcResult) in detailed.DetailedScores)
{
    Console.WriteLine($"{pipeline} → {mcResult.Mean:F3} ± {mcResult.StandardDeviation:F3}");
    Console.WriteLine($"  SE: {mcResult.StandardError:F4}");

    // Histogram of score distribution
    var histogram = mcResult.Histogram(10);
}
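
A practical way to read the convergence curve: if it has flattened well before the final iteration, the estimated mean score is stable and further iterations would change it little; if it is still drifting, increase the iteration count.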

Key points:

  • Quantifies model evaluation uncertainty — not just a point estimate
  • Reports confidence intervals for scores (e.g. "accuracy = 0.87 ± 0.03")
  • Convergence curve shows whether enough iterations were run
  • Histogram visualizes the full score distribution
  • Standard error decreases with more iterations (SE = σ/√n)
  • All pipelines are evaluated on identical random splits (fair comparison)
  • Implements ICrossValidator — drop-in replacement for other validators
  • Reproducible via seed parameter
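
As a worked example with the illustrative numbers above: a per-iteration score standard deviation of σ = 0.03 over n = 200 iterations gives SE = 0.03 / √200 ≈ 0.0021; quadrupling the iterations to 800 would halve it to ≈ 0.0011.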

📅 Leave-One-Out Cross-Validation

Train on all rows except one, test on the held-out row, then iterate. Works for tabular or grouped data.

Example visualization

Data: [ 1 2 3 4 5 ]

Fold 1: Train [2 3 4 5] | Test [1]
Fold 2: Train [1 3 4 5] | Test [2]
Fold 3: Train [1 2 4 5] | Test [3]
Fold 4: Train [1 2 3 5] | Test [4]
Fold 5: Train [1 2 3 4] | Test [5]

var cv = new LeaveOneOutCrossValidator(pipelineGrid);
var result = cv.Run(X, y);

var bestModel = result.BestPipeline;
var score = result.BestScore;

Key points:

  • Extreme case of K-Fold where K = n
  • Guarantees each sample is used as test exactly once
  • Can be combined with groups if needed
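
Since Leave-One-Out is K-Fold with one sample per fold, the split loop is trivial; note that it implies n model fits, which gets expensive on large datasets. A minimal sketch (illustrative only):

using System;
using System.Linq;

class LeaveOneOutDemo
{
    static void Main()
    {
        int n = 5; // n samples → n folds → n model fits
        for (int held = 1; held <= n; held++)
        {
            var train = Enumerable.Range(1, n).Where(i => i != held);
            Console.WriteLine($"Fold {held}: Train [{string.Join(" ", train)}] | Test [{held}]");
        }
    }
}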

📦 Grouped Cross-Validation

Used when samples belong to groups and all samples from the same group must stay together. Works for classification and regression.

Example visualization (📊 Series)

Groups: [A] [B] [C] [D] [E]

Fold 1: Train -> B, C, D, E | Test -> A
Fold 2: Train -> A, C, D, E | Test -> B
Fold 3: Train -> A, B, D, E | Test -> C
...

var cv = new LeaveOneOutCrossValidator(pipelineGrid);
var result = cv.Run(series, targetColumn: "Target", groupColumn: "Department");

Key points:

  • Groups can be anything: customer, company, department, gender
  • Ensures all group members stay together
  • Often called Leave-One-Group-Out
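
The invariant is easy to state in plain C#: pick a held-out group, then route every row by its group key. This sketch uses made-up (sample, group) pairs and is not the library's implementation:

using System;
using System.Linq;

class GroupedSplitDemo
{
    static void Main()
    {
        // (sample id, group key) pairs — all rows of a group must stay together.
        var rows = new[] { (Id: 1, Group: "A"), (Id: 2, Group: "A"),
                           (Id: 3, Group: "B"), (Id: 4, Group: "C"), (Id: 5, Group: "C") };

        foreach (var held in rows.Select(r => r.Group).Distinct())
        {
            var train = rows.Where(r => r.Group != held).Select(r => r.Id);
            var test  = rows.Where(r => r.Group == held).Select(r => r.Id);
            Console.WriteLine($"Hold out {held}: Train [{string.Join(" ", train)}] | Test [{string.Join(" ", test)}]");
        }
    }
}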

Example visualization âąī¸ TimeSeries

Train on all groups except one, test on the held-out group, then iterate. Groups can be days, weeks, or custom intervals.

Groups:  [Day1] [Day2] [Day3] [Day4] [Day5]

Fold 1: Train -> Day2-Day5 | Test -> Day1
Fold 2: Train -> Day1,Day3-Day5 | Test -> Day2
Fold 3: Train -> Day1-Day2,Day4-Day5 | Test -> Day3
...

var ts = TimeSeries.FromCsv("data.csv");

var cv = new LeaveOneOutCrossValidator(pipelineGrid);
var result = cv.Run(ts, "Target", new DailyGrouping());

Key points:

  • Order matters
  • Leakage must be avoided
  • Grouping often represents time intervals

Validator | Uses grouping | Temporal awareness | Notes
--- | --- | --- | ---
KFoldCrossValidator | ❌ | ❌ | Classic tabular K-Fold; all samples used exactly once.
LeaveOneOutCrossValidator | ✅ (optional) | ❌ | Extreme case of K-Fold; can act as Leave-One-Group-Out if groups are provided.
MonteCarloCrossValidator | ❌ | ❌ | Many random splits; returns full score distributions (confidence intervals, histograms, convergence); reproducible via seed.
RollingCrossValidator | ✅ (implicit) | ✅ | Designed for time series; respects temporal order to prevent leakage.
ShuffleSplitCrossValidator | ❌ | ❌ | Random train/test splits; multiple iterations; not all rows guaranteed to be tested.
StratifiedKFoldCrossValidator | ❌ | ❌ | Maintains class proportions; only for classification; useful for imbalanced datasets.