Skip to content

Data Transformation Examples - Pivot Wide to Long Format

Data transformation enables converting wide-format data (multiple metrics in separate columns) into long-format data (tidy format with metric column and value column). This is essential for creating flexible visualizations with Observable Plot.

1. Basic Quarterly Sales Pivot

Convert quarterly sales data from wide to long format:

View Source
type: line
x: quarter
y: revenue
color: year
data:
  source: '[{"year": "2023", "q1": 1000, "q2": 1200, "q3": 1400, "q4": 1600}, {"year": "2024", "q1": 1100, "q2": 1300, "q3": 1500, "q4": 1700}]'
transformations:
  - type: pivot
    configuration:
      columns: ["q1", "q2", "q3", "q4"]
      keyColumn: quarter
      valueColumn: revenue
      identifierColumns: ["year"]

Configuration Details:

  • columns: The quarter columns to pivot (q1, q2, q3, q4)
  • keyColumn: Create a "quarter" column containing the column names (q1, q2, q3, q4)
  • valueColumn: Create a "revenue" column containing the values from those columns
  • identifierColumns: Preserve the year column as a grouping identifier

Result: Wide format with 2 rows and 4 quarter columns becomes long format with 8 rows (2 years × 4 quarters). The quarter column will contain values like "q1", "q2", etc., which can be displayed on the x-axis.

2. Multi-Metric Dashboard Conversion

Convert dashboard metrics to long format for flexible visualization:

View Source
type: bar
x: date
y: value
color: metric
data:
  source: '[{"date": "2024-01-01", "activeUsers": 1200, "sessionTime": 450, "pageViews": 5600}, {"date": "2024-01-02", "activeUsers": 1350, "sessionTime": 480, "pageViews": 6100}, {"date": "2024-01-03", "activeUsers": 1280, "sessionTime": 465, "pageViews": 5800}]'
transformations:
  - type: pivot
    configuration:
      columns: ["activeUsers", "sessionTime", "pageViews"]
      keyColumn: metric
      valueColumn: value

Result: Creates one row per metric per date, allowing color-coding by metric type.

3. Department Performance Metrics

Compare quarterly performance scores across departments:

View Source
type: bar
x: period_metric
y: score
color: department
data:
  source: '[{"department": "Engineering", "Q1_efficiency": 0.85, "Q1_productivity": 92, "Q2_efficiency": 0.88, "Q2_productivity": 95}, {"department": "Sales", "Q1_efficiency": 0.72, "Q1_productivity": 88, "Q2_efficiency": 0.75, "Q2_productivity": 91}, {"department": "Marketing", "Q1_efficiency": 0.80, "Q1_productivity": 85, "Q2_efficiency": 0.83, "Q2_productivity": 88}]'
transformations:
  - type: pivot
    configuration:
      columns: ["Q1_efficiency", "Q1_productivity", "Q2_efficiency", "Q2_productivity"]
      identifierColumns: ["department"]
      keyColumn: period_metric
      valueColumn: score

Configuration Details:

  • After pivot, data has columns: department, period_metric, score
  • period_metric contains values like "Q1_efficiency", "Q2_productivity"
  • score contains the numeric values
  • Grouped bars show metrics side-by-side for each department

4. Financial Performance - Profit/Loss Analysis

View Source
type: bar
x: month
y: amount
fill: category
data:
  source: '[{"month": "Jan", "revenue": 15000, "expenses": 8000, "profit": 7000}, {"month": "Feb", "revenue": 18000, "expenses": 9500, "profit": 8500}, {"month": "Mar", "revenue": 16500, "expenses": 8800, "profit": 7700}]'
transformations:
  - type: pivot
    configuration:
      columns: ["revenue", "expenses", "profit"]
      keyColumn: category
      valueColumn: amount

Use Case: Stack bars by category to show financial breakdown per month.

5. Regional Sales Comparison

View Source
type: line
x: month
y: sales
color: region
data:
  source: '[{"month": "Jan", "North": 12000, "South": 8500, "East": 10200, "West": 9800}, {"month": "Feb", "North": 13500, "South": 9200, "East": 11100, "West": 10500}, {"month": "Mar", "North": 14200, "South": 9800, "East": 11900, "West": 11200}]'
transformations:
  - type: pivot
    configuration:
      columns: ["North", "South", "East", "West"]
      keyColumn: region
      valueColumn: sales

Result: Multi-line chart showing sales trends for each region.

6. Year-over-Year Comparison

View Source
type: bar
x: quarter
y: revenue
color: year
data:
  source: '[{"quarter": "Q1", "2022": 45000, "2023": 52000, "2024": 58000}, {"quarter": "Q2", "2022": 48000, "2023": 55000, "2024": 61000}, {"quarter": "Q3", "2022": 52000, "2023": 58500, "2024": 64000}, {"quarter": "Q4", "2022": 55000, "2023": 62000, "2024": 68000}]'
transformations:
  - type: pivot
    configuration:
      columns: ["2022", "2023", "2024"]
      keyColumn: year
      valueColumn: revenue

Result: Grouped bars showing year-over-year revenue trends by quarter.

7. Product Category Performance

View Source
type: bar
x: period_metric
y: value
color: product
data:
  source: '[{"product": "ProductA", "category": "Electronics", "Q1_price": 99.99, "Q1_units": 1200, "Q2_price": 94.99, "Q2_units": 1400, "Q3_price": 89.99, "Q3_units": 1650}, {"product": "ProductB", "category": "Accessories", "Q1_price": 24.99, "Q1_units": 3200, "Q2_price": 22.99, "Q2_units": 3800, "Q3_price": 19.99, "Q3_units": 4100}]'
transformations:
  - type: pivot
    configuration:
      identifierColumns: ["product", "category"]
      columns: ["Q1_price", "Q1_units", "Q2_price", "Q2_units", "Q3_price", "Q3_units"]
      keyColumn: period_metric
      valueColumn: value

Configuration Details:

  • After pivot, data has columns: product, category, period_metric, value
  • period_metric contains: "Q1_price", "Q1_units", "Q2_price", etc.
  • Bars grouped by product show price and unit trends across quarters

Transformation Configuration Guide

Required Properties

  • columns (array): List of column names to pivot from wide to long format

Optional Properties

  • keyColumn (string, default: "metric"): Name of the column that will contain the pivoted column names
  • valueColumn (string, default: "value"): Name of the column that will contain the values
  • identifierColumns (array): Specific columns to preserve as identifiers. If not specified, all columns not in the pivot list are preserved

Best Practices

  1. Use Descriptive Column Names: Make metric names meaningful so they display well in charts

    • Good: Q1_revenue, Region_North
    • Avoid: col1, metric_1
  2. Choose Appropriate Key Names:

    • Use "category", "type", "region" for spatial data
    • Use "quarter", "month", "period" for time data
    • Use "metric", "measure" for generic metrics
  3. Preserve Important Context:

    • Include product, department, or location columns in identifierColumns
    • This allows grouping and color-coding in visualizations
  4. Handle Missing Data:

    • Pivot transformation automatically skips null/undefined values
    • Consider imputing or removing rows with missing metrics before transformation
  5. Performance Considerations:

    • Efficient for datasets up to 100,000 rows
    • Pivot from 5-50 columns is typical use case
    • Large pivots (100+ columns) may slow rendering

Real-World Scenarios

E-commerce Dashboard

Convert product metrics (revenue, orders, returns) by category into long format for unified dashboard view.

HR Analytics

Transform quarterly performance ratings (technical, leadership, communication) into tidy format for trend analysis.

Infrastructure Monitoring

Convert host metrics (CPU%, Memory%, Disk%) into long format for cross-host comparison charts.

Financial Reporting

Pivot P&L statement data (revenue, COGS, operating expenses) into long format for detailed financial visualization.

Chaining Transformations

While currently only pivot transformation is implemented, the pipeline supports chaining multiple transformations:

yaml
transformations:
  - type: pivot
    configuration:
      columns: ["q1", "q2", "q3", "q4"]
      keyColumn: quarter
      valueColumn: revenue
  # Future: additional transformations can be added
  # - type: filter
  #   configuration:
  #     where: "revenue > 1000"
  # - type: aggregate
  #   configuration:
  #     groupBy: ["year"]
  #     aggregations: {"revenue": "sum"}

Troubleshooting

Issue: Data not appearing after transformation

  • Solution: Verify column names match exactly (case-sensitive)
  • Solution: Check for null values that may be skipped

Issue: Too many rows after transformation

  • Solution: Reduce the number of columns being pivoted
  • Solution: Use identifierColumns to specify only needed context columns

Issue: Memory error with large dataset

  • Solution: Filter data before transformation
  • Solution: Split transformation into multiple charts with different column subsets

Released under the MIT License. Built by Boundary Lab.