# Machine Learning Visualizations
This example demonstrates various Plot marks to visualize machine learning metrics, model performance, and training data using the dg-plot handle.
## Dataset
The example dataset contains 50 data points with the following metrics:
- Training and validation loss over epochs
- Feature importance scores
- Model predictions vs actual values
- Confusion matrix metrics
- Cross-validation performance
- Hyperparameter tuning results
## Learning Curves
Track training and validation loss over epochs:
```yaml
data:
  source: |
    [
      {"epoch": 1, "train_loss": 0.82, "val_loss": 0.85, "train_acc": 0.65, "val_acc": 0.63},
      {"epoch": 2, "train_loss": 0.75, "val_loss": 0.79, "train_acc": 0.71, "val_acc": 0.68},
      {"epoch": 3, "train_loss": 0.68, "val_loss": 0.74, "train_acc": 0.76, "val_acc": 0.72},
      {"epoch": 4, "train_loss": 0.61, "val_loss": 0.69, "train_acc": 0.80, "val_acc": 0.75},
      {"epoch": 5, "train_loss": 0.55, "val_loss": 0.65, "train_acc": 0.83, "val_acc": 0.77},
      {"epoch": 6, "train_loss": 0.49, "val_loss": 0.62, "train_acc": 0.86, "val_acc": 0.79},
      {"epoch": 7, "train_loss": 0.44, "val_loss": 0.60, "train_acc": 0.88, "val_acc": 0.80},
      {"epoch": 8, "train_loss": 0.40, "val_loss": 0.58, "train_acc": 0.90, "val_acc": 0.81},
      {"epoch": 9, "train_loss": 0.36, "val_loss": 0.57, "train_acc": 0.91, "val_acc": 0.82},
      {"epoch": 10, "train_loss": 0.33, "val_loss": 0.56, "train_acc": 0.92, "val_acc": 0.82}
    ]
marks:
  - type: line
    x: epoch
    y: train_loss
    stroke: steelblue
    strokeWidth: 2
  - type: line
    x: epoch
    y: val_loss
    stroke: coral
    strokeWidth: 2
    strokeDasharray: [4, 4]
  - type: dot
    x: epoch
    y: train_loss
    fill: steelblue
    r: 4
  - type: dot
    x: epoch
    y: val_loss
    fill: coral
    r: 4
  - type: crosshair
    x: epoch
    y: train_loss
    opacity: 0.4
    tip: true
grid: true
title: Learning Curves
style:
  fontSize: 12
  width: 1000
```

## Feature Importance
Visualize relative importance of model features:
```yaml
data:
  source: |
    [
      {"feature": "feature_1", "importance": 0.85, "std": 0.05, "category": "Primary"},
      {"feature": "feature_2", "importance": 0.72, "std": 0.06, "category": "Primary"},
      {"feature": "feature_3", "importance": 0.65, "std": 0.04, "category": "Secondary"},
      {"feature": "feature_4", "importance": 0.58, "std": 0.07, "category": "Secondary"},
      {"feature": "feature_5", "importance": 0.45, "std": 0.05, "category": "Secondary"},
      {"feature": "feature_6", "importance": 0.38, "std": 0.04, "category": "Tertiary"},
      {"feature": "feature_7", "importance": 0.32, "std": 0.06, "category": "Tertiary"},
      {"feature": "feature_8", "importance": 0.25, "std": 0.03, "category": "Tertiary"}
    ]
marks:
  - type: bar
    x: importance
    y: feature
    fill: category
    sort: y
  - type: rule
    x1: "d => d.importance - d.std"
    x2: "d => d.importance + d.std"
    y: feature
    stroke: currentColor
    strokeOpacity: 0.4
grid: true
title: Feature Importance
style:
  fontSize: 12
scales:
  color:
    type: ordinal
    scheme: tableau10
```

## Model Performance
Compare predicted vs actual values, with point size encoding prediction confidence:
```yaml
data:
  source: |
    [
      {"actual": 10.2, "predicted": 9.8, "confidence": 0.92, "group": "Group A"},
      {"actual": 15.7, "predicted": 16.1, "confidence": 0.88, "group": "Group A"},
      {"actual": 20.5, "predicted": 19.9, "confidence": 0.95, "group": "Group A"},
      {"actual": 25.3, "predicted": 26.0, "confidence": 0.91, "group": "Group A"},
      {"actual": 30.8, "predicted": 29.5, "confidence": 0.89, "group": "Group A"},
      {"actual": 12.4, "predicted": 11.9, "confidence": 0.87, "group": "Group B"},
      {"actual": 17.9, "predicted": 18.5, "confidence": 0.93, "group": "Group B"},
      {"actual": 22.6, "predicted": 21.8, "confidence": 0.90, "group": "Group B"},
      {"actual": 27.1, "predicted": 28.0, "confidence": 0.86, "group": "Group B"},
      {"actual": 32.5, "predicted": 31.2, "confidence": 0.94, "group": "Group B"}
    ]
marks:
  - type: line
    x: actual
    y: actual
    stroke: gray
    strokeDasharray: [2, 2]
    strokeOpacity: 0.3
  - type: dot
    x: actual
    y: predicted
    r: "d => 3 + d.confidence * 4"
    fill: steelblue
    fillOpacity: 0.5
    stroke: steelblue
    tip: true
  - type: crosshair
    x: actual
    y: predicted
    opacity: 0.4
    tip: true
grid: true
title: Model Performance (Predicted vs Actual)
style:
  fontSize: 12
scales:
  x:
    domain: [0, 40]
    label: Actual Values
  y:
    domain: [0, 40]
    label: Predicted Values
```

## Cross-validation Performance
Visualize model performance across different folds:
```yaml
data:
  source: |
    [
      {"fold": 1, "accuracy": 0.82, "precision": 0.80, "recall": 0.83, "f1": 0.81},
      {"fold": 2, "accuracy": 0.85, "precision": 0.83, "recall": 0.86, "f1": 0.84},
      {"fold": 3, "accuracy": 0.79, "precision": 0.78, "recall": 0.81, "f1": 0.79},
      {"fold": 4, "accuracy": 0.83, "precision": 0.82, "recall": 0.85, "f1": 0.83},
      {"fold": 5, "accuracy": 0.81, "precision": 0.79, "recall": 0.82, "f1": 0.80}
    ]
marks:
  - type: rect
    x: fold
    y1: 0
    y2: accuracy
    fill: steelblue
    fillOpacity: 0.3
  - type: rule
    x: fold
    y1: precision
    y2: recall
    stroke: coral
    strokeWidth: 2
  - type: dot
    x: fold
    y: f1
    fill: purple
    r: 6
  - type: text
    x: fold
    y: "d => d.f1 + 0.05"
    text: "d => d.f1.toFixed(2)"
    fontSize: 10
    textAnchor: middle
grid: true
```

## Hyperparameter Tuning
Visualize hyperparameter search results:
```yaml
data:
  source: |
    [
      {"learning_rate": 0.001, "batch_size": 32, "score": 0.75, "runtime": 120},
      {"learning_rate": 0.001, "batch_size": 64, "score": 0.78, "runtime": 100},
      {"learning_rate": 0.001, "batch_size": 128, "score": 0.76, "runtime": 80},
      {"learning_rate": 0.01, "batch_size": 32, "score": 0.82, "runtime": 110},
      {"learning_rate": 0.01, "batch_size": 64, "score": 0.85, "runtime": 90},
      {"learning_rate": 0.01, "batch_size": 128, "score": 0.83, "runtime": 70},
      {"learning_rate": 0.1, "batch_size": 32, "score": 0.79, "runtime": 100},
      {"learning_rate": 0.1, "batch_size": 64, "score": 0.81, "runtime": 80},
      {"learning_rate": 0.1, "batch_size": 128, "score": 0.80, "runtime": 60}
    ]
marks:
  - type: dot
    x: learning_rate
    y: score
    r: "d => d.runtime / 10"
    fill: "d => d.batch_size"
    fillOpacity: 0.6
    title: "d => `Batch Size: ${d.batch_size}\nRuntime: ${d.runtime}s`"
  - type: crosshair
    x: learning_rate
    y: score
    opacity: 0.4
    tip: true
grid: true
title: Hyperparameter Tuning (Learning Rate vs Score)
scales:
  x:
    type: log
    domain: [0.0001, 1]
    label: Learning Rate (log scale)
  y:
    domain: [0.7, 0.9]
    label: Model Score
  color:
    scheme: viridis
    legend: true
```

## Model Residuals
Analyze prediction residuals:
```yaml
data:
  source: |
    [
      {"predicted": 25.3, "residual": -2.1, "confidence": 0.85, "feature_val": 12.5},
      {"predicted": 28.7, "residual": 1.8, "confidence": 0.92, "feature_val": 15.8},
      {"predicted": 31.2, "residual": -1.5, "confidence": 0.88, "feature_val": 18.2},
      {"predicted": 35.8, "residual": 2.3, "confidence": 0.90, "feature_val": 22.5},
      {"predicted": 38.5, "residual": -1.9, "confidence": 0.87, "feature_val": 25.7},
      {"predicted": 42.1, "residual": 1.6, "confidence": 0.91, "feature_val": 28.9},
      {"predicted": 45.6, "residual": -2.4, "confidence": 0.86, "feature_val": 32.4},
      {"predicted": 48.9, "residual": 2.0, "confidence": 0.89, "feature_val": 35.8},
      {"predicted": 52.3, "residual": -1.7, "confidence": 0.93, "feature_val": 38.9},
      {"predicted": 55.8, "residual": 1.9, "confidence": 0.90, "feature_val": 42.3}
    ]
marks:
  - type: dot
    x: predicted
    y: residual
    r: "d => d.confidence * 8"
    fill: "d => Math.abs(d.residual)"
    fillOpacity: 0.6
  - type: rule
    x1: 20
    x2: 60
    y: 0
    stroke: currentColor
    strokeOpacity: 0.2
  - type: area
    x: predicted
    y1: "d => Math.min(0, d.residual)"
    y2: "d => Math.max(0, d.residual)"
    fill: "d => d.residual > 0 ? 'coral' : 'steelblue'"
    fillOpacity: 0.2
  - type: crosshair
    x: predicted
    y: residual
    opacity: 0.4
    tip: true
grid: true
scales:
  color:
    type: linear
    scheme: rdbu
    legend: true
```

## Training Progress
Detailed view of training metrics over time:
```yaml
data:
  source: |
    [
      {"step": 100, "loss": 0.85, "gradient_norm": 0.95, "learning_rate": 0.01},
      {"step": 200, "loss": 0.75, "gradient_norm": 0.82, "learning_rate": 0.01},
      {"step": 300, "loss": 0.68, "gradient_norm": 0.75, "learning_rate": 0.008},
      {"step": 400, "loss": 0.62, "gradient_norm": 0.68, "learning_rate": 0.008},
      {"step": 500, "loss": 0.55, "gradient_norm": 0.62, "learning_rate": 0.006},
      {"step": 600, "loss": 0.50, "gradient_norm": 0.58, "learning_rate": 0.006},
      {"step": 700, "loss": 0.45, "gradient_norm": 0.52, "learning_rate": 0.004},
      {"step": 800, "loss": 0.42, "gradient_norm": 0.48, "learning_rate": 0.004},
      {"step": 900, "loss": 0.38, "gradient_norm": 0.45, "learning_rate": 0.002},
      {"step": 1000, "loss": 0.35, "gradient_norm": 0.42, "learning_rate": 0.002}
    ]
marks:
  - type: line
    x: step
    y: loss
    stroke: steelblue
    strokeWidth: 2
  - type: area
    x: step
    y: loss
    fill: steelblue
    fillOpacity: 0.1
  - type: dot
    x: step
    y: gradient_norm
    r: "d => d.learning_rate * 1000"
    fill: coral
    fillOpacity: 0.6
  - type: rule
    x: step
    y1: loss
    y2: gradient_norm
    stroke: gray
    strokeOpacity: 0.2
  - type: text
    x: step
    y: "d => Math.max(d.loss, d.gradient_norm) + 0.1"
    text: "d => d.learning_rate.toFixed(3)"
    fontSize: 8
    textAnchor: middle
  - type: crosshair
    x: step
    y: loss
    opacity: 0.4
    tip: true
grid: true
```

## Usage Tips
Each plot demonstrates different mark types and their combinations:
- `line` for trend visualization
- `area` for filled regions
- `dot` for scatter points
- `rule` for error bars and reference lines
- `bar` for categorical comparisons
- `rect` for rectangular marks
- `text` for labels
- `crosshair` for interactive data exploration
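Several of these marks accept computed channels rather than plain field names. As an illustration of the arithmetic behind the `rule` mark's error bars in the feature-importance example, the endpoint accessors are ordinary JavaScript functions (the row values below are illustrative, not taken from the dataset):

```javascript
// Error-bar endpoints for a rule mark: importance ± std.
// The row shape mirrors the feature-importance example; the values are illustrative.
const row = { feature: "feature_1", importance: 0.75, std: 0.25 };

const x1 = (d) => d.importance - d.std; // left end of the error bar
const x2 = (d) => d.importance + d.std; // right end

console.log(x1(row), x2(row)); // 0.5 1
```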
Common styling options:
- Use `fillOpacity` and `strokeOpacity` for layering
- Set `grid: true` for reference lines
- Use `tip: true` with crosshair for tooltips
- Apply color scales with `scheme` for consistent palettes
Interactive features:
- Hover over points for tooltips
- Use crosshair marks for precise data reading
- Combine multiple marks for rich visualizations
Data handling:
- Use computed values: `"d => expression"`
- Format text with templates
- Scale values for visual encoding (e.g., radius)
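Each computed-value string is an ordinary JavaScript arrow function: it receives one datum `d` and returns the encoded value. A minimal sketch of how two accessors from the examples above behave (the sample datum here is hypothetical):

```javascript
// A sample datum shaped like the hyperparameter rows above (values are illustrative).
const d = { batch_size: 64, runtime: 90, confidence: 0.9 };

// Radius scaled from a confidence score, as in "d => d.confidence * 8".
const radius = (d) => d.confidence * 8;

// Template-formatted tooltip text, as in the hyperparameter example.
const tooltip = (d) => `Batch Size: ${d.batch_size}\nRuntime: ${d.runtime}s`;

console.log(radius(d)); // 7.2
console.log(tooltip(d)); // prints "Batch Size: 64" and "Runtime: 90s" on two lines
```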