Automated Domain Inference Examples
This document demonstrates the automated domain inference feature, which preserves the data order of categorical fields instead of sorting them alphabetically.
Overview
Automated domain inference detects categorical columns and preserves the order in which their values first appear in your data. This is useful for:
- Status workflows: Planning → In Progress → Review → Done (custom order, not alphabetical)
- Priority levels: Critical → High → Medium → Low
- Time periods: December → January → February (non-alphabetical months)
- Custom orderings: Any domain with a meaningful sequence
The feature is:
- Enabled by default - No configuration needed for basic usage
- Smart - Only applies to categorical fields (not dates or identifiers)
- Respectful - Doesn't override explicit scale configurations
- Configurable - Can be disabled or tuned per project
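The core idea, ordering categories by first appearance rather than alphabetically, can be sketched in a few lines of Python. This is an illustrative sketch only, not the plugin's implementation; `infer_domain` is a hypothetical helper name.

```python
def infer_domain(values):
    """Return the unique values in order of first appearance."""
    # dicts preserve insertion order (Python 3.7+), so duplicates are dropped
    # while the first occurrence of each value keeps its position.
    return list(dict.fromkeys(values))


statuses = ["Planning", "In Progress", "Review", "Done", "Review", "Planning"]

print(infer_domain(statuses))  # ['Planning', 'In Progress', 'Review', 'Done']
print(sorted(set(statuses)))   # ['Done', 'In Progress', 'Planning', 'Review']
```

The second line of output shows the alphabetical ordering that inference is meant to avoid.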
Example 1: Simple Status Chart - Data Order Preserved
This example shows how data order is automatically preserved for status categories.
type: bar
data:
  source: |
    status,count
    Planning,3
    In Progress,7
    Review,2
    Done,8
x: status
y: count
title: Task Status Distribution

Result: The X-axis displays statuses in the order they appear in the data: Planning → In Progress → Review → Done
Without domain inference (the previous behavior), the X-axis would be sorted alphabetically: Done → In Progress → Planning → Review
Configuration
type: bar
data:
  source: |
    status,count
    Planning,3
    In Progress,7
    Review,2
    Done,8
x: status  # Categorical field
y: count   # Numeric field (no inference needed)
title: Task Status Distribution

Key Points
- ✅ Domain inference is automatic for categorical fields
- ✅ Order matches the first appearance in your data
- ✅ No configuration needed - it just works
Example 2: Priority Levels with Custom Order
This example demonstrates how priority levels keep their custom order.
type: bar
data:
  source: |
    priority,incidents
    Critical,5
    High,10
    Medium,15
    Low,20
x: priority
y: incidents
title: Incident Priority Breakdown
xLabel: Priority Level
yLabel: Number of Incidents

Expected X-axis order: Critical → High → Medium → Low
This preserves the urgency hierarchy without needing explicit configuration.
Example 3: Disable Inference with Explicit Scales
If you want different ordering or to use alphabetical sorting, explicitly specify the domain:
type: bar
data:
  source: |
    status,count
    Planning,3
    In Progress,7
    Review,2
    Done,8
x: status
y: count
title: Tasks (Alphabetical Order)
scales:
  x:
    domain:
      - Done
      - In Progress
      - Planning
      - Review

Result: X-axis will follow the explicit domain order (alphabetical in this case)
When to use this:
- You want alphabetical sorting despite non-alphabetical data order
- You want custom ordering different from data appearance
- You need to enforce domain consistency across multiple charts
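The precedence between an explicit domain and an inferred one can be sketched as follows. This assumes that the presence of a `domain` key in the scale configuration is what switches inference off; `resolve_domain` and `infer_domain` are hypothetical names, not the plugin's API.

```python
def infer_domain(values):
    """Unique values in order of first appearance."""
    return list(dict.fromkeys(values))


def resolve_domain(scale_config, values):
    """Pick the domain for one scale: an explicit domain wins over inference."""
    # Assumption: any explicit `domain` entry disables inference for that scale.
    if scale_config and "domain" in scale_config:
        return list(scale_config["domain"])
    return infer_domain(values)


values = ["Planning", "In Progress", "Review", "Done"]
explicit = {"domain": ["Done", "In Progress", "Planning", "Review"]}

print(resolve_domain(explicit, values))  # explicit (alphabetical) order
print(resolve_domain({}, values))        # inferred data order
```

Reusing the same explicit domain in several chart definitions is one way to keep category order consistent across them.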
Example 4: Mixed Data Types - Selective Inference
This example shows how inference works with mixed categorical and numeric data.
type: scatter
data:
  source: |
    status,priority,value
    Active,Critical,850
    Inactive,High,450
    Active,Medium,680
    Pending,Low,320
    Inactive,Critical,920
x: status
y: value
color: priority
title: Mixed Categorical Fields with Inference

Results:
- X-axis (status): Inferred as Active → Inactive → Pending (data order)
- Color scale (priority): Inferred as Critical → High → Medium → Low (data order)
- Y-axis (value): No inference needed (numeric field)
All categorical fields automatically get ordered by first appearance!
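To make the per-field behavior concrete, the sketch below parses the same CSV and infers a domain for every field that is not purely numeric. It is a simplified illustration: the numeric check and the `infer_domains` helper are assumptions, not the plugin's actual detection logic.

```python
import csv
import io

SOURCE = """\
status,priority,value
Active,Critical,850
Inactive,High,450
Active,Medium,680
Pending,Low,320
Inactive,Critical,920
"""


def is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def infer_domains(csv_text, fields):
    """Infer a first-appearance domain for each non-numeric field."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    domains = {}
    for field in fields:
        column = [row[field] for row in rows]
        if all(is_number(cell) for cell in column):
            continue  # numeric field: no categorical domain to infer
        domains[field] = list(dict.fromkeys(column))
    return domains


domains = infer_domains(SOURCE, ["status", "value", "priority"])
print(domains["status"])    # ['Active', 'Inactive', 'Pending']
print(domains["priority"])  # ['Critical', 'High', 'Medium', 'Low']
```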
Example 5: Comparing Data Order vs Alphabetical
Side-by-side comparison showing the difference between inferred and alphabetical ordering.
Chart 1: With Domain Inference (Default)
type: bar
data:
  source: |
    month,sales
    December,1000
    January,1200
    February,1100
    March,1300
    April,1450
x: month
y: sales
title: Sales by Month - Data Order (Inferred)

X-axis: December → January → February → March → April (calendar order)
Chart 2: Forcing Alphabetical Order
type: bar
data:
  source: |
    month,sales
    December,1000
    January,1200
    February,1100
    March,1300
    April,1450
x: month
y: sales
title: Sales by Month - Alphabetical Order (Explicit)
scales:
  x:
    domain:
      - April
      - December
      - February
      - January
      - March

X-axis: April → December → February → January → March (alphabetical)
The difference: Domain inference preserves meaningful order automatically.
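When alphabetical order is what you want, typing the sorted domain by hand gets tedious for longer lists. The helper below is a hypothetical convenience script, not part of the plugin: it prints a `scales` block with the unique values sorted, using the `scales` → channel → `domain` nesting from the examples above. It does not quote values, so it assumes plain category names without YAML special characters.

```python
def alphabetical_domain_yaml(values, channel="x"):
    """Render a `scales` block whose domain lists the sorted unique values."""
    lines = ["scales:", f"  {channel}:", "    domain:"]
    for value in sorted(set(values)):
        lines.append(f"      - {value}")
    return "\n".join(lines)


months = ["December", "January", "February", "March", "April"]
print(alphabetical_domain_yaml(months))
```

The printed block can be pasted into a chart definition as its explicit domain.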
Example 6: Date Fields Are Skipped
Domain inference is smart about date detection and won't infer domains for date fields.
type: line
data:
  source: |
    date,revenue
    2024-01-15,1000
    2024-01-16,1200
    2024-01-17,1100
x: date
y: revenue
title: Daily Revenue Trend

Result: The date field is recognized as a date, not a categorical field, so no domain inference is applied. Obsidian/Plot handles date formatting automatically.
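One plausible way to recognize a date column is to check whether every value parses as an ISO date. The sketch below assumes that approach; the plugin's real detection rules are not documented here and may accept other formats. `looks_like_date` and `is_date_field` are hypothetical names.

```python
from datetime import date


def looks_like_date(value):
    """True if the string parses as an ISO 8601 calendar date."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def is_date_field(values):
    """Treat a column as a date field only if every value parses as a date."""
    return bool(values) and all(looks_like_date(v) for v in values)


print(is_date_field(["2024-01-15", "2024-01-16", "2024-01-17"]))  # True
print(is_date_field(["December", "January", "February"]))         # False
```

Under this rule, month names stay categorical, which matches Example 5, where they are ordered by first appearance.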
Example 7: Identifier Exclusion
Fields with too many unique values are skipped (likely identifiers, not categories).
type: scatter
data:
  source: |
    user_id,score
    user_001,750
    user_002,680
    user_003,920
    user_004,450
    ... (many more users)
x: user_id
y: score
title: User Performance Scores

Result: The user_id field with hundreds of unique values won't be inferred as a categorical domain (too many categories). Explicit configuration would be needed for custom handling.
Configuration Reference
Settings
You can control domain inference behavior in Settings > DataGlass > Domain Configuration:
| Setting | Default | Description |
|---|---|---|
| Enable Auto Domain Inference | ✅ On | Turn domain inference on/off globally |
| Domain Inference Threshold | 2 | Minimum unique values to trigger inference (1-20) |
| Show Domain Validation Warnings | ❌ Off | Display warnings for domain mismatches |
| Show Inferred Domain Info | ❌ Off | Show indicator if domain was inferred vs explicit |
YAML Configuration
type: bar
x: category_field
y: numeric_field
color: another_category
scales:
  # Use explicit domain to override inference
  x:
    domain:
      - Custom1
      - Custom2
      - Custom3
  # Leave empty or omit to use inference
  color: {}  # Will infer from data

Common Patterns
Pattern 1: Status Workflow
data:
  source: |
    status,count
    Backlog,10
    Ready,15
    In Progress,8
    Testing,5
    Done,12
x: status
y: count
type: bar
title: Development Pipeline

Inference result: Backlog → Ready → In Progress → Testing → Done (workflow order preserved)
Pattern 2: Severity/Priority
data:
  source: |
    severity,bugs
    Critical,5
    High,12
    Medium,28
    Low,45
x: severity
y: bugs
type: bar
title: Bug Distribution by Severity

Inference result: Critical → High → Medium → Low (priority order preserved)
Pattern 3: Custom Rankings
data:
  source: |
    rating,count
    Excellent,320
    Good,450
    Fair,180
    Poor,50
x: rating
y: count
type: bar
title: Customer Satisfaction Ratings

Inference result: Excellent → Good → Fair → Poor (quality order preserved)
Troubleshooting
Q: Why is my X-axis alphabetical when I want data order?
A: Check if you have an explicit scales.x.domain set. Remove it to enable inference:
# ❌ This forces alphabetical sorting
scales:
  x:
    domain: []  # Remove this
# ✅ This allows inference
# (Don't specify domain at all)

Q: How do I enforce alphabetical order?
A: Explicitly set the domain with sorted values:
scales:
  x:
    domain:
      - Ascending
      - Chronological
      - Order
      - With
      - Your
      - Values

Q: Why isn't inference working?
A: Check these conditions:
- Field must be categorical (strings, not numbers)
- Must have 2-50 unique values (not dates, not identifiers)
- At least 80% of values must be strings (not mixed)
- Auto inference must be enabled in settings (it is by default)
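The conditions above can be combined into a single eligibility check. The sketch below is one reading of that checklist, not the plugin's code: `InferenceSettings`, `should_infer`, and the field names are hypothetical, cells are assumed to arrive as raw strings, and "string" is taken to mean a cell that does not parse as a number.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InferenceSettings:
    # Hypothetical names; the defaults mirror the conditions listed above.
    enabled: bool = True
    min_unique: int = 2
    max_unique: int = 50
    min_string_ratio: float = 0.8


def _is_number(cell):
    try:
        float(cell)
        return True
    except (TypeError, ValueError):
        return False


def _is_date(cell):
    try:
        date.fromisoformat(cell)
        return True
    except (TypeError, ValueError):
        return False


def should_infer(cells, settings=InferenceSettings()):
    """Decide whether a column qualifies for domain inference."""
    if not settings.enabled or not cells:
        return False
    if all(_is_date(cell) for cell in cells):
        return False  # date field
    text_ratio = sum(not _is_number(cell) for cell in cells) / len(cells)
    if text_ratio < settings.min_string_ratio:
        return False  # numeric or too mixed
    unique = len(set(cells))
    return settings.min_unique <= unique <= settings.max_unique


print(should_infer(["Planning", "In Progress", "Review", "Done"]))  # True
print(should_infer([f"user_{i:03d}" for i in range(200)]))          # False
```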
Q: Can I see which domains were inferred?
A: Enable "Show Inferred Domain Info" in Settings > DataGlass > Domain Configuration. This will show an indicator on each chart.
Q: How does this work with transformations?
A: Domain inference runs after transformations, so it infers from the transformed data:
data:
  source: |
    status,count
    Active,100
    Inactive,50
transformations:
  - type: filter
    configuration:
      where:
        count: { gte: 75 }
x: status
y: count
type: bar

Result: Only "Active" remains after filtering, so no domain is inferred (fewer than 2 categories). Add an explicit domain if needed.
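The ordering of the two steps is what matters here: the filter runs first, and inference only sees what is left. A minimal sketch of that sequence, using hypothetical `apply_filter` and `infer_domain` helpers and the default threshold of 2 unique values:

```python
rows = [
    {"status": "Active", "count": 100},
    {"status": "Inactive", "count": 50},
]


def apply_filter(rows, field, gte):
    """Keep rows whose `field` is greater than or equal to `gte`."""
    return [row for row in rows if row[field] >= gte]


def infer_domain(values, min_unique=2):
    """First-appearance domain, or None when there are too few categories."""
    domain = list(dict.fromkeys(values))
    return domain if len(domain) >= min_unique else None


filtered = apply_filter(rows, "count", gte=75)
print(infer_domain([row["status"] for row in filtered]))  # None
print(infer_domain([row["status"] for row in rows]))      # ['Active', 'Inactive']
```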
Performance
Domain inference adds little overhead:
- ⚡ Samples only first 100 rows for large datasets
- ⚡ Single pass through data
- ⚡ No sorting required
- ⚡ Typical overhead: < 10ms per chart
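The first three points can be illustrated together: a single pass over a bounded sample, with no sort step. The sketch below assumes a fixed sample of the first 100 values; `infer_domain_sampled` is a hypothetical name.

```python
from itertools import islice


def infer_domain_sampled(values, sample_size=100):
    """Infer a first-appearance domain from at most `sample_size` values."""
    seen = {}
    # islice stops after `sample_size` items, so large inputs (including
    # generators) are never read in full.
    for value in islice(values, sample_size):
        seen.setdefault(value, None)
    return list(seen)


values = ["Low", "High"] * 60 + ["Critical"]
print(infer_domain_sampled(values))  # ['Low', 'High']
```

In this sketch, a category that first appears after the sampled rows (here `Critical`) is not part of the inferred domain, which is the trade-off of sampling.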
Real-world Applications
1. Analytics Dashboard
type: bar
data:
  file: data/sales.csv
x: region
y: revenue
color: quarter
title: Sales by Region and Quarter

Region order from the CSV and quarter order are preserved automatically (Q1, Q2, Q3, Q4 if that is the data order).
2. Project Management
type: bar
data:
  query: from:projects where:status:open
x: status
y: task_count
title: Tasks by Status

The workflow status order from your data is preserved automatically.
3. Incident Reports
type: scatter
data:
  file: data/incidents.parquet
x: severity
y: response_time_minutes
color: team
title: Incident Response Analysis

Severity levels and team order from the data are preserved automatically.
Best Practices
- ✅ Let domain inference work - No configuration needed for natural orderings
- ✅ Order data intentionally - Put categorical values in the order you want
- ✅ Use explicit domains for enforcement - Only when you need to override inference
- ✅ Test with sample data - Verify order looks correct
- ❌ Don't mix orderings - Either let inference work or use explicit domains (not both)