Automated Domain Inference Examples

This document demonstrates the automated domain inference feature, which preserves the data order of categorical fields instead of sorting them alphabetically.

Overview

Automated Domain Inference detects categorical columns and preserves the order in which their values first appear in your data. This is useful for:

  • Status workflows: Planning → In Progress → Review → Done (custom order, not alphabetical)
  • Priority levels: Critical → High → Medium → Low
  • Time periods: December → January → February (non-alphabetical months)
  • Custom orderings: Any domain with a meaningful sequence

The feature is:

  • Enabled by default - No configuration needed for basic usage
  • Smart - Only applies to categorical fields (not dates or identifiers)
  • Respectful - Doesn't override explicit scale configurations
  • Configurable - Can be disabled or tuned per project

Example 1: Simple Status Chart - Data Order Preserved

This example shows how data order is automatically preserved for status categories.

yaml
type: bar
data:
  source: |
    status,count
    Planning,3
    In Progress,7
    Review,2
    Done,8

x: status
y: count
title: Task Status Distribution

Result: The X-axis displays statuses in the order they appear in the data: Planning → In Progress → Review → Done

Without domain inference (the previous behavior), the X-axis would be sorted alphabetically: Done → In Progress → Planning → Review
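
The two orderings can be compared with a short Python sketch. It only illustrates first-appearance ordering versus alphabetical sorting and is not DataGlass's actual implementation; the function names are hypothetical.

```python
def first_appearance_order(values):
    """Return unique values in the order they first appear."""
    # A dict keeps insertion order, so duplicates collapse onto their first occurrence.
    return list(dict.fromkeys(values))


def alphabetical_order(values):
    """Return unique values sorted alphabetically."""
    return sorted(set(values))


statuses = ["Planning", "In Progress", "Review", "Done"]
inferred = first_appearance_order(statuses)
alphabetical = alphabetical_order(statuses)
print(inferred)
print(alphabetical)
```

The first list keeps the workflow order from the data, while the second matches the alphabetical axis described above.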

Configuration

yaml
type: bar
data:
  source: |
    status,count
    Planning,3
    In Progress,7
    Review,2
    Done,8

x: status          # Categorical field
y: count           # Numeric field (no inference needed)
title: Task Status Distribution

Key Points

  • ✅ Domain inference is automatic for categorical fields
  • ✅ Order matches the first appearance in your data
  • ✅ No configuration needed - it just works

Example 2: Priority Levels with Custom Order

This example demonstrates how priority levels keep their custom order.

yaml
type: bar
data:
  source: |
    priority,incidents
    Critical,5
    High,10
    Medium,15
    Low,20

x: priority
y: incidents
title: Incident Priority Breakdown
xLabel: Priority Level
yLabel: Number of Incidents

Expected X-axis order: Critical → High → Medium → Low

This preserves the urgency hierarchy without needing explicit configuration.

Example 3: Disable Inference with Explicit Scales

If you want a different order, such as alphabetical sorting, specify the domain explicitly:

yaml
type: bar
data:
  source: |
    status,count
    Planning,3
    In Progress,7
    Review,2
    Done,8

x: status
y: count
title: Tasks (Alphabetical Order)

scales:
  x:
    domain:
      - Done
      - In Progress
      - Planning
      - Review

Result: The X-axis follows the explicit domain order (alphabetical in this case).

When to use this:

  • You want alphabetical sorting despite non-alphabetical data order
  • You want custom ordering different from data appearance
  • You need to enforce domain consistency across multiple charts
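
The precedence described here, where an explicit domain wins and inference is only a fallback, can be sketched as follows. This is a minimal illustration under that assumption; `resolve_domain` is a hypothetical name, not part of DataGlass.

```python
def resolve_domain(values, explicit_domain=None):
    """Pick the domain for a categorical scale.

    Assumption: an explicit domain is used as-is; otherwise the domain
    falls back to first-appearance order of the data values.
    """
    if explicit_domain is not None:
        return list(explicit_domain)
    return list(dict.fromkeys(values))


data = ["Planning", "In Progress", "Review", "Done"]
explicit = ["Done", "In Progress", "Planning", "Review"]

print(resolve_domain(data))            # no explicit domain: data order
print(resolve_domain(data, explicit))  # explicit domain: used unchanged
```

Passing the same explicit list to several charts is one way to keep their axes consistent.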

Example 4: Mixed Data Types - Selective Inference

This example shows how inference works with mixed categorical and numeric data.

yaml
type: scatter
data:
  source: |
    status,priority,value
    Active,Critical,850
    Inactive,High,450
    Active,Medium,680
    Pending,Low,320
    Inactive,Critical,920

x: status
y: value
color: priority
title: Mixed Categorical Fields with Inference

Results:

  • X-axis (status): Inferred as Active → Inactive → Pending (data order)
  • Color scale (priority): Inferred as Critical → High → Medium → Low (data order)
  • Y-axis (value): No inference needed (numeric field)

All categorical fields automatically get ordered by first appearance!
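
As a rough illustration of selective inference, the sketch below parses the same CSV, skips the numeric column, and orders each remaining column by first appearance. The helper names and the numeric test are assumptions made for this example, not the plugin's internals.

```python
import csv
import io

CSV_TEXT = """status,priority,value
Active,Critical,850
Inactive,High,450
Active,Medium,680
Pending,Low,320
Inactive,Critical,920
"""


def is_number(text):
    """Return True if the cell parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def infer_domains(csv_text):
    """Map each non-numeric column to its values in first-appearance order."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    domains = {}
    for field in rows[0].keys():
        column = [row[field] for row in rows]
        if all(is_number(cell) for cell in column):
            continue  # numeric field: no domain to infer
        domains[field] = list(dict.fromkeys(column))
    return domains


domains = infer_domains(CSV_TEXT)
print(domains)
```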

Example 5: Comparing Data Order vs Alphabetical

Side-by-side comparison showing the difference between inferred and alphabetical ordering.

Chart 1: With Domain Inference (Default)

yaml
type: bar
data:
  source: |
    month,sales
    December,1000
    January,1200
    February,1100
    March,1300
    April,1450

x: month
y: sales
title: Sales by Month - Data Order (Inferred)

X-axis: December → January → February → March → April (chronological order, as listed in the data)

Chart 2: Forcing Alphabetical Order

yaml
type: bar
data:
  source: |
    month,sales
    December,1000
    January,1200
    February,1100
    March,1300
    April,1450

x: month
y: sales
title: Sales by Month - Alphabetical Order (Explicit)

scales:
  x:
    domain:
      - April
      - December
      - February
      - January
      - March

X-axis: April → December → February → January → March (alphabetical)

The difference: Domain inference preserves meaningful order automatically.

Example 6: Date Fields Are Skipped

Domain inference detects date fields and does not infer domains for them.

yaml
type: line
data:
  source: |
    date,revenue
    2024-01-15,1000
    2024-01-16,1200
    2024-01-17,1100

x: date
y: revenue
title: Daily Revenue Trend

Result: The date field is recognized as a date, not a categorical field, so no domain inference is applied. Obsidian/Plot handles date formatting automatically.
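
How dates are detected is not described here. As one possible approach, the sketch below treats a column as a date field when every value parses as an ISO date (YYYY-MM-DD). The function name and the ISO-only rule are assumptions for illustration.

```python
from datetime import date


def looks_like_date_column(values):
    """Return True if every value parses as an ISO date (YYYY-MM-DD)."""
    if not values:
        return False
    for value in values:
        try:
            date.fromisoformat(value)
        except ValueError:
            return False
    return True


dates = ["2024-01-15", "2024-01-16", "2024-01-17"]
statuses = ["Planning", "In Progress", "Done"]
print(looks_like_date_column(dates))
print(looks_like_date_column(statuses))
```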

Example 7: Identifier Exclusion

Fields with too many unique values are skipped (likely identifiers, not categories).

yaml
type: scatter
data:
  source: |
    user_id,score
    user_001,750
    user_002,680
    user_003,920
    user_004,450
    ... (many more users)

x: user_id
y: score
title: User Performance Scores

Result: The user_id field with hundreds of unique values won't be inferred as a categorical domain (too many categories). Explicit configuration would be needed for custom handling.

Configuration Reference

Settings

You can control domain inference behavior in Settings > DataGlass > Domain Configuration:

| Setting | Default | Description |
| --- | --- | --- |
| Enable Auto Domain Inference | ✅ On | Turn domain inference on or off globally |
| Domain Inference Threshold | 2 | Minimum number of unique values required to trigger inference (1-20) |
| Show Domain Validation Warnings | Off | Display warnings for domain mismatches |
| Show Inferred Domain Info | Off | Show an indicator of whether a domain was inferred or set explicitly |

YAML Configuration

yaml
type: bar
x: category_field
y: numeric_field
color: another_category

scales:
  # Use explicit domain to override inference
  x:
    domain:
      - Custom1
      - Custom2
      - Custom3
  
  # Leave empty or omit to use inference
  color: {}  # Will infer from data

Common Patterns

Pattern 1: Status Workflow

yaml
data:
  source: |
    status,count
    Backlog,10
    Ready,15
    In Progress,8
    Testing,5
    Done,12

x: status
y: count
type: bar
title: Development Pipeline

Inference result: Backlog → Ready → In Progress → Testing → Done (workflow order preserved)

Pattern 2: Severity/Priority

yaml
data:
  source: |
    severity,bugs
    Critical,5
    High,12
    Medium,28
    Low,45

x: severity
y: bugs
type: bar
title: Bug Distribution by Severity

Inference result: Critical → High → Medium → Low (priority order preserved)

Pattern 3: Custom Rankings

yaml
data:
  source: |
    rating,count
    Excellent,320
    Good,450
    Fair,180
    Poor,50

x: rating
y: count
type: bar
title: Customer Satisfaction Ratings

Inference result: Excellent → Good → Fair → Poor (quality order preserved)

Troubleshooting

Q: Why is my X-axis alphabetical when I want data order?

A: Check if you have an explicit scales.x.domain set. Remove it to enable inference:

yaml
# ❌ An explicit domain (even an empty one) disables inference, so the axis falls back to alphabetical order
scales:
  x:
    domain: []  # Remove this

# ✅ This allows inference
# (Don't specify domain at all)

Q: How do I enforce alphabetical order?

A: Explicitly set the domain with sorted values:

yaml
scales:
  x:
    domain:
      - Ascending
      - Chronological
      - Order
      - Values
      - With
      - Your
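
If the list of values is long, the sorted domain can be generated instead of typed by hand. The helper below is a hypothetical convenience script, not a DataGlass feature. It assumes plain values that need no YAML quoting and uses Python's default, case-sensitive sort.

```python
def alphabetical_domain_yaml(values, axis="x"):
    """Build a `scales` snippet listing unique values alphabetically."""
    lines = ["scales:", f"  {axis}:", "    domain:"]
    for value in sorted(set(values)):
        # Values containing YAML special characters would need quoting here.
        lines.append(f"      - {value}")
    return "\n".join(lines)


months = ["December", "January", "February", "March", "April"]
snippet = alphabetical_domain_yaml(months)
print(snippet)
```

The printed snippet can be pasted below the chart definition.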

Q: Why isn't inference working?

A: Check these conditions:

  1. Field must be categorical (strings, not numbers)
  2. Must have 2-50 unique values (not dates, not identifiers)
  3. At least 80% of values must be strings (not mixed)
  4. Auto inference enabled in settings (it is by default)
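
The checklist above can be expressed as a small diagnostic function. This is a sketch built only from the thresholds listed here (2-50 unique values, 80% strings); the function name and parameters are hypothetical, it assumes values are already typed, and it leaves out date detection.

```python
def why_not_inferred(values, enabled=True, min_unique=2, max_unique=50,
                     min_string_ratio=0.8):
    """Return a reason inference would be skipped, or None if it would run."""
    if not enabled:
        return "auto inference is disabled in settings"
    if not values:
        return "no values"
    strings = [v for v in values if isinstance(v, str)]
    if len(strings) / len(values) < min_string_ratio:
        return f"fewer than {min_string_ratio:.0%} of values are strings"
    unique = len(set(values))
    if unique < min_unique:
        return f"fewer than {min_unique} unique values"
    if unique > max_unique:
        return f"more than {max_unique} unique values"
    return None


print(why_not_inferred(["Planning", "Done", "Planning"]))
print(why_not_inferred([f"user_{i:03d}" for i in range(200)]))
```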

Q: Can I see which domains were inferred?

A: Enable "Show Inferred Domain Info" in Settings > DataGlass > Domain Configuration. This will show an indicator on each chart.

Q: How does this work with transformations?

A: Domain inference runs after transformations, so it infers from the transformed data:

yaml
data:
  source: |
    status,count
    Active,100
    Inactive,50

transformations:
  - type: filter
    configuration:
      where:
        count: { gte: 75 }

x: status
y: count
type: bar

Result: Only "Active" remains after filtering, so no domain is inferred (fewer than 2 categories). Add an explicit domain if needed.
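
The ordering of the two steps can be illustrated in Python. The sketch assumes the filter keeps rows with a count of at least 75 and that inference needs at least 2 categories, as stated above; `infer_domain` is a hypothetical helper.

```python
def infer_domain(values, min_unique=2):
    """First-appearance order, or None when there are too few categories."""
    domain = list(dict.fromkeys(values))
    return domain if len(domain) >= min_unique else None


rows = [
    {"status": "Active", "count": 100},
    {"status": "Inactive", "count": 50},
]

# Step 1: apply the filter (count >= 75), mirroring the transformation above.
filtered = [row for row in rows if row["count"] >= 75]

# Step 2: infer from the transformed rows, not the raw ones.
before = infer_domain([row["status"] for row in rows])
after = infer_domain([row["status"] for row in filtered])
print(before)
print(after)
```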

Performance

Domain inference adds little overhead:

  • ⚡ Samples only the first 100 rows for large datasets
  • ⚡ Single pass through data
  • ⚡ No sorting required
  • ⚡ Typical overhead: < 10ms per chart
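
A single-pass, sampled scan of this kind might look like the sketch below. The sample size of 100 comes from the list above; the function name and the row format are assumptions for illustration.

```python
from itertools import islice


def infer_domain_sampled(rows, field, sample_size=100):
    """Collect unique values of `field` from the first `sample_size` rows in one pass."""
    seen = {}
    for row in islice(rows, sample_size):
        seen.setdefault(row[field], None)  # keeps first-appearance order, no sorting
    return list(seen)


rows = (
    [{"status": "Open"}] * 60
    + [{"status": "Closed"}] * 60
    + [{"status": "Archived"}] * 60
)
domain = infer_domain_sampled(rows, "status")
print(domain)
```

In this sketch, a category that first appears after the sampled rows is never seen. Whether DataGlass handles late-appearing categories the same way is not stated here, so it is worth checking charts built from large files.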

Real-world Applications

1. Analytics Dashboard

yaml
type: bar
data:
  file: data/sales.csv

x: region
y: revenue
color: quarter
title: Sales by Region and Quarter

Auto-preserves region order from CSV and quarter order (Q1, Q2, Q3, Q4 if that's the data order).

2. Project Management

yaml
type: bar
data:
  query: from:projects where:status:open

x: status
y: task_count
title: Tasks by Status

Auto-preserves the workflow status order from your data.

3. Incident Reports

yaml
type: scatter
data:
  file: data/incidents.parquet

x: severity
y: response_time_minutes
color: team
title: Incident Response Analysis

Auto-preserves severity levels and team order from the data.

Best Practices

  1. Let domain inference work - No configuration needed for natural orderings
  2. Order data intentionally - Put categorical values in the order you want
  3. Use explicit domains for enforcement - Only when you need to override inference
  4. Test with sample data - Verify order looks correct
  5. Don't mix orderings - Either let inference work or use explicit domains (not both)

Released under the MIT License. Built by Boundary Lab.