Crate substrait_explain

§Substrait-Explain

Transform complex Substrait protobuf plans into readable, SQL EXPLAIN-like text

A Rust library that converts Substrait query plans between protobuf format and a human-readable text format. It transforms verbose, nested protobuf structures into concise, SQL-like text that’s easy to read and debug.

§Key Features

  • Human-readable output: Convert complex Substrait plans into simple, readable text
  • Bidirectional conversion: Parse text format back into Substrait plans
  • Extension support: Full support for Substrait extensions and custom functions
  • Error handling: Formatting errors are collected and returned alongside best-effort output instead of aborting
  • Flexible formatting: Configurable output options for different use cases
  • Complete grammar: Full specification of the text format in the grammar module

For installation instructions, see the README.

§Quick Start

§Parse and Format Plans

The core workflow is to parse a plan from the text format and format it back to text. This example demonstrates both basic usage and extension handling:

use substrait_explain::{parse, format};

// Parse a plan from text format (includes extensions for custom functions)
let plan_text = r#"
=== Extensions
URNs:
  @  1: https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml
Functions:
  # 10 @  1: add

=== Plan
Project[$0, $1, add($0, $1)]
  Read[table1 => col1:i32?, col2:i32?]
"#;

let plan = parse(plan_text).unwrap();
let (output, errors) = format(&plan);
println!("{}", output);

// Check for any formatting warnings
if !errors.is_empty() {
    println!("Warnings: {:?}", errors);
}

§Custom Formatting

Control output detail with formatting options:

use substrait_explain::{parse, format_with_options, OutputOptions, Visibility};

let plan = parse(r#"
=== Plan
Project[$0, 42, 54:i16]
  Read[data => name:string?, num:i64]
"#).unwrap();

// Verbose output with all details
let verbose = OutputOptions::verbose();
let (verbose_text, _) = format_with_options(&plan, &verbose);
println!("{}", verbose_text);

// Custom options: always show literal types, indent with four spaces
let custom = OutputOptions {
    literal_types: Visibility::Always,
    indent: "    ".to_string(),
    ..OutputOptions::default()
};
let (custom_text, _) = format_with_options(&plan, &custom);
println!("{}", custom_text);

§Error Handling

Formatting does not fail outright: format returns best-effort text together with a list of any errors it encountered. Parsing, by contrast, returns a Result that must be checked:

use substrait_explain::{parse, format};

match parse("=== Plan\nInvalidRelation[invalid]") {
    Ok(plan) => {
        let (text, errors) = format(&plan);
        println!("Formatted: {}", text);
        if !errors.is_empty() {
            println!("Warnings: {:?}", errors);
        }
    }
    Err(e) => println!("Parse error: {}", e),
}
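
The "best-effort output plus collected errors" contract can be sketched without the library. The following is a minimal standalone model, not the crate's implementation: the Node type, the format_best_effort function, and the <error> placeholder are all hypothetical names chosen for illustration.

```rust
use std::fmt::Write;

/// Hypothetical stand-in for a plan node; not a type from substrait_explain.
enum Node {
    Field(u32),
    Unknown(String),
}

/// Formats every node it can and records a message for each one it cannot.
/// The caller always gets text back, plus a (possibly empty) error list.
fn format_best_effort(nodes: &[Node]) -> (String, Vec<String>) {
    let mut out = String::new();
    let mut errors = Vec::new();
    for (i, node) in nodes.iter().enumerate() {
        if i > 0 {
            out.push_str(", ");
        }
        match node {
            // Writing to a String cannot fail, so unwrap is safe here.
            Node::Field(n) => write!(out, "${}", n).unwrap(),
            Node::Unknown(what) => {
                // Emit a visible placeholder (hypothetical) instead of aborting.
                out.push_str("<error>");
                errors.push(format!("cannot format {}", what));
            }
        }
    }
    (out, errors)
}

fn main() {
    let nodes = [Node::Field(0), Node::Unknown("mystery".into()), Node::Field(2)];
    let (text, errors) = format_best_effort(&nodes);
    println!("Formatted: {}", text);
    if !errors.is_empty() {
        println!("Warnings: {:?}", errors);
    }
}
```

Returning a tuple rather than a Result means one bad node does not hide the rest of the plan, which is usually what you want when debugging.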

§Custom Extension Types

The library supports custom extension relations with user-defined protobuf payloads, enabling round-trip conversion of plans with custom data sources or specialized operations.

§Requirements

A custom extension type must implement:

  1. prost::Message - For protobuf serialization (usually derived or generated)
  2. prost::Name - For type URL encoding (auto-generated by prost-build with enable_type_names(), or implement manually)
  3. Explainable - For text format conversion (always implemented manually)

§The Explainable Trait

The Explainable trait defines how your type converts to/from the text format:

  • name() - The extension name used in text (e.g., "ParquetScan")
  • from_args(args) - Parse text arguments into your type
  • to_args(&self) - Convert your type to text arguments

Use ArgsExtractor for convenient argument parsing:

  • extractor.expect_named_arg::<T>(name) - Required argument
  • extractor.get_named_or::<T>(name, default) - Optional with default
  • extractor.check_exhausted() - Verify no unexpected arguments
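
The three extractor operations above follow a common pattern: consume each known argument, then reject whatever is left. The sketch below models that pattern with the standard library only. It is not the crate's ArgsExtractor; the Extractor and ScanConfig types, the method names expect and get_or, and the path and batch_size arguments are hypothetical.

```rust
use std::collections::BTreeMap;
use std::str::FromStr;

/// Hypothetical, simplified stand-in for an argument extractor.
struct Extractor {
    remaining: BTreeMap<String, String>,
}

impl Extractor {
    fn new(pairs: &[(&str, &str)]) -> Self {
        let remaining = pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        Extractor { remaining }
    }

    /// Required argument: an error if it is missing or does not parse.
    fn expect<T: FromStr>(&mut self, name: &str) -> Result<T, String> {
        let raw = self
            .remaining
            .remove(name)
            .ok_or_else(|| format!("missing argument '{}'", name))?;
        raw.parse::<T>()
            .map_err(|_| format!("invalid value for '{}': {}", name, raw))
    }

    /// Optional argument: falls back to `default` when absent.
    fn get_or<T: FromStr>(&mut self, name: &str, default: T) -> Result<T, String> {
        match self.remaining.remove(name) {
            Some(raw) => raw
                .parse::<T>()
                .map_err(|_| format!("invalid value for '{}': {}", name, raw)),
            None => Ok(default),
        }
    }

    /// Fails if any argument was supplied but never consumed.
    fn check_exhausted(&self) -> Result<(), String> {
        match self.remaining.keys().next() {
            Some(k) => Err(format!("unexpected argument '{}'", k)),
            None => Ok(()),
        }
    }
}

#[derive(Debug, PartialEq)]
struct ScanConfig {
    path: String,
    batch_size: u32,
}

fn scan_from_args(pairs: &[(&str, &str)]) -> Result<ScanConfig, String> {
    let mut ex = Extractor::new(pairs);
    let path = ex.expect::<String>("path")?;
    let batch_size = ex.get_or::<u32>("batch_size", 1024)?;
    ex.check_exhausted()?;
    Ok(ScanConfig { path, batch_size })
}

fn main() {
    println!("{:?}", scan_from_args(&[("path", "data.parquet")]));
    println!("{:?}", scan_from_args(&[("path", "p"), ("extra", "1")]));
}
```

Checking for leftover arguments last catches typos in argument names that would otherwise be silently ignored.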

§Extension Namespaces

Extensions are organized into namespaces by their type:

  • Relation - Custom relation types (ExtensionLeaf, ExtensionSingle, ExtensionMulti)
  • Enhancement - Metadata attached to relations (displayed with + Enh: prefix)
  • Optimization - Optimization hints (displayed with + Opt: prefix)

§Using the ExtensionRegistry

Register extensions to the appropriate namespace:

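use prost::{Message, Name};

#[derive(Clone, PartialEq, Message)]
pub struct MySourceConfig {
    // Fields, typically generated by prost
}
impl Name for MySourceConfig {
    // Implement manually, or generate with prost-build's enable_type_names()
}
// MySourceConfig must also implement Explainable (omitted here)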

let mut registry = ExtensionRegistry::new();

// Register a relation extension
registry.register_relation::<MySourceConfig>().unwrap();

// Enhancement and optimization extensions use:
// registry.register_enhancement::<MyEnhancement>().unwrap();
// registry.register_optimization::<MyOptimization>().unwrap();

let parser = Parser::new().with_extension_registry(registry.clone());

See examples/extensions.rs for a complete working example with a custom ParquetScan extension type.

§Output Format

The library produces a structured text format that’s easy to read and parse. For a complete specification of the text format grammar, see the grammar module.

§Basic Plan Structure

=== Extensions
URNs:
  @  1: https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml
  @  2: https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate.yaml
Functions:
  # 10 @  1: add
  # 11 @  2: sum
  # 12 @  2: count
=== Plan
Root[result]
  Aggregate[$0 => $0, sum($1), count($1)]
    Project[$0, add($1, $2)]
      Read[table1 => category:string, col1:i32?, col2:i32?]
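
To make the Functions lines in the extensions header concrete, here is a small standalone sketch that splits a line such as "# 10 @  1: add" into its parts. It assumes the first number is the function's anchor and the second refers to a URN entry, which is how the example above reads. The parse_function_decl helper is hypothetical and is not the crate's parser.

```rust
/// Splits a function declaration line into (function anchor, URN anchor, name).
/// Returns None if the line does not have the `# <n> @ <m>: <name>` shape.
fn parse_function_decl(line: &str) -> Option<(u32, u32, &str)> {
    let rest = line.trim().strip_prefix('#')?;
    let (anchor, rest) = rest.split_once('@')?;
    let (urn, name) = rest.split_once(':')?;
    Some((
        anchor.trim().parse::<u32>().ok()?,
        urn.trim().parse::<u32>().ok()?,
        name.trim(),
    ))
}

fn main() {
    for line in ["  # 10 @  1: add", "  # 11 @  2: sum", "not a declaration"] {
        println!("{:?}", parse_function_decl(line));
    }
}
```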

§Relation Format

Each relation is displayed on a single line with the format: RelationName[arguments => columns]

  • arguments: Input expressions, field references, or function calls
  • columns: Output column names and types
  • indentation: Shows the relationship hierarchy
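
The line structure described above can be illustrated with a short standalone sketch that recovers the nesting depth, the relation name, and the bracket contents from one line. It assumes two spaces per indentation level, as in the example plans on this page; the split_relation_line helper is hypothetical and not part of the crate.

```rust
/// Splits one plan line into (depth, relation name, bracket contents).
/// Assumes two-space indentation per nesting level.
fn split_relation_line(line: &str) -> Option<(usize, &str, &str)> {
    let trimmed = line.trim_start_matches(' ');
    let depth = (line.len() - trimmed.len()) / 2;
    let open = trimmed.find('[')?;
    let close = trimmed.rfind(']')?;
    if close < open {
        return None;
    }
    Some((depth, &trimmed[..open], &trimmed[open + 1..close]))
}

fn main() {
    let plan = [
        "Root[result]",
        "  Aggregate[$0 => $0, sum($1), count($1)]",
        "    Project[$0, add($1, $2)]",
    ];
    for line in plan {
        println!("{:?}", split_relation_line(line));
    }
}
```

A child relation is any line whose depth is one greater than that of the nearest preceding line with a smaller depth.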

§Expression Format

  • Field references: $0, $1, etc.
  • Literals: 42, "hello", true
  • Function calls: add($0, $1), sum($2) (scalar and aggregate functions)
  • Types: i32, string?, list<i64>

§Configuration Options

Control output formatting with OutputOptions:

use substrait_explain::{OutputOptions, Visibility};

// Default - concise output
let default = OutputOptions::default();

// Verbose - show all details
let verbose = OutputOptions::verbose();

// Custom - show literal types and use 4-space indentation
let custom = OutputOptions {
    literal_types: Visibility::Always,
    indent: "    ".to_string(),
    ..OutputOptions::default()
};

§Command Line Interface

The library includes a command line interface for converting between different Substrait plan formats and validating plans. The CLI is available behind the cli feature flag.

§Installation

Install the CLI with:

cargo install substrait-explain --features cli

Or build from source:

cargo build --release --features cli

§Commands

§Convert Command

The convert command transforms plans between different formats:

# Convert text format to JSON
substrait-explain convert -f text -t json -i plan.substrait -o plan.json

# Convert JSON back to text
substrait-explain convert -f json -t text -i plan.json -o plan.substrait

# Convert to binary protobuf format
substrait-explain convert -f text -t protobuf -i plan.substrait -o plan.pb

# Use stdin/stdout (default)
cat plan.substrait | substrait-explain convert -f text -t json > plan.json

Supported formats:

  • text - Human-readable Substrait text format
  • json - JSON serialized protobuf
  • yaml - YAML serialized protobuf
  • protobuf/proto/pb - Binary protobuf format

Options:

  • -f, --from <FORMAT> - Input format (default: text)
  • -t, --to <FORMAT> - Output format (default: text)
  • -i, --input <FILE> - Input file (default: stdin)
  • -o, --output <FILE> - Output file (default: stdout)
  • --show-literal-types - Show type annotations on literals
  • --show-expression-types - Show type annotations on expressions
  • --verbose - Show detailed progress information

§Validate Command

The validate command performs a roundtrip test on text format plans:

# Validate a plan file
substrait-explain validate -i plan.substrait

# Validate from stdin
cat plan.substrait | substrait-explain validate

# Validate with verbose output
substrait-explain validate -i plan.substrait --verbose

Options:

  • -i, --input <FILE> - Input file (default: stdin)
  • -o, --output <FILE> - Output file (default: stdout)
  • --verbose - Show detailed progress information

§Examples

# Validate the example plans
substrait-explain validate -i example-plans/basic.substrait
substrait-explain validate -i example-plans/simple.substrait

# Convert with verbose output and type information
substrait-explain convert -f text -t json --show-literal-types --show-expression-types --verbose -i example-plans/basic.substrait

# Roundtrip test: text → protobuf → text
substrait-explain convert -f text -t protobuf -i plan.substrait -o plan.pb
substrait-explain convert -f protobuf -t text -i plan.pb -o plan_roundtrip.substrait
diff plan.substrait plan_roundtrip.substrait

§Feature Requirements

The CLI is only built when the cli feature is enabled. When declaring the crate as a dependency, enable the feature in Cargo.toml:

[dependencies]
substrait-explain = { version = "0.1.0", features = ["cli"] }

For JSON/YAML support, also enable the serde feature:

[dependencies]
substrait-explain = { version = "0.1.0", features = ["cli", "serde"] }

Re-exports§

pub use parser::ParseError;
pub use textify::foundation::FormatError;
pub use textify::foundation::OutputOptions;
pub use textify::foundation::Visibility;

Modules§

cli
extensions
Types for handling extensions: the simple extensions (functions, types, and type variations) that can appear in Substrait simple extension YAML files, and the extension registry system for custom extension relations.
fixtures
Test fixtures for working with Substrait plans and substrait_explain
grammar
Substrait Text Format Grammar
parser
textify
Output a plan in text format.

Functions§

format
Format a Substrait plan as human-readable text.
format_with_options
Format a Substrait plan with custom options.
format_with_registry
Format a Substrait plan with custom options and an extension registry.
parse
Parse a Substrait plan from text format.