Function json_table_provider 

pub async fn json_table_provider(
    ctx: &SessionContext,
    url: &str,
) -> Result<Arc<dyn TableProvider>, Error>

Creates a TableProvider for a JSON file with a pre-computed schema.

This function infers the schema once and returns a TableProvider that can be registered in multiple SessionContexts without re-inferring the schema.

DataFusion supports JSONL (newline-delimited JSON) format, where each line contains a complete JSON object.
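
As an illustration of the JSONL layout described above, the following standard-library-only sketch writes two records, one JSON object per line, and builds a file:// URL of the shape this function accepts. The helper names write_jsonl and file_url, the file name, and the record fields are hypothetical, and the URL helper assumes an absolute Unix-style path with no characters that need percent-encoding.

```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};

/// Hypothetical helper: writes each record on its own line (JSONL)
/// and returns the number of lines written.
fn write_jsonl(path: &Path, records: &[&str]) -> std::io::Result<usize> {
    let mut file = fs::File::create(path)?;
    for record in records {
        // One complete JSON object per line, terminated by a newline.
        writeln!(file, "{}", record)?;
    }
    Ok(records.len())
}

/// Hypothetical helper: builds a file:// URL from an absolute path.
/// Assumes a Unix-style path with no characters needing percent-encoding.
fn file_url(path: &Path) -> String {
    format!("file://{}", path.display())
}

fn main() -> std::io::Result<()> {
    let path: PathBuf = std::env::temp_dir().join("example_events.jsonl");
    let records = [
        r#"{"id": 1, "level": "info", "msg": "started"}"#,
        r#"{"id": 2, "level": "warn", "msg": "slow request"}"#,
    ];
    write_jsonl(&path, &records)?;
    // The resulting string could then be passed as the `url` argument.
    println!("{}", file_url(&path));
    Ok(())
}
```

A single pretty-printed JSON document spread over several lines does not fit this one-object-per-line layout.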

§Arguments

  • ctx - SessionContext the function uses while inferring the schema
  • url - URL to the JSON file (e.g., "file:///path/to/data.json" or "s3://bucket/data.json")

§Returns

Returns an Arc<dyn TableProvider> that can be registered using SessionContext::register_table().

§Example

use anyhow::Result;
use datafusion::execution::context::SessionContext;
use micromegas_analytics::dfext::json_table_provider::json_table_provider;

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    // Create table provider with pre-computed schema (done once at startup)
    let table = json_table_provider(&ctx, "file:///path/to/data.json").await?;

    // Register in session context (fast, no schema inference)
    ctx.register_table("my_table", table)?;

    Ok(())
}

§Performance

Schema inference happens once, during this function call. The returned TableProvider caches the inferred schema, so registering it in additional SessionContexts does not repeat the inference step.
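
The sharing pattern behind this can be sketched with the standard library alone. The code below is a simplified model, not the micromegas or DataFusion implementation: CachedProvider, Registry, build_provider, and register_everywhere are hypothetical stand-ins for the table provider, the session context, and the startup code, and the column names are made up. It shows that the expensive step runs once while each registry only receives a cloned Arc.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a table provider that carries a schema
/// computed up front (modeled here as a list of column names).
struct CachedProvider {
    schema: Vec<String>,
}

/// Hypothetical stand-in for a session context: a name -> provider map.
#[derive(Default)]
struct Registry {
    tables: HashMap<String, Arc<CachedProvider>>,
}

impl Registry {
    fn register_table(&mut self, name: &str, provider: Arc<CachedProvider>) {
        self.tables.insert(name.to_string(), provider);
    }
}

/// Models the expensive step: "infers" the schema and counts how often it ran.
fn build_provider(inferences: &AtomicUsize) -> Arc<CachedProvider> {
    inferences.fetch_add(1, Ordering::SeqCst);
    Arc::new(CachedProvider {
        schema: vec!["time".to_string(), "msg".to_string()],
    })
}

/// Builds the provider once, then registers the same Arc in `n` registries.
fn register_everywhere(n: usize, inferences: &AtomicUsize) -> Vec<Registry> {
    let provider = build_provider(inferences);
    (0..n)
        .map(|_| {
            let mut registry = Registry::default();
            // Cloning the Arc copies a pointer; the schema is not recomputed.
            registry.register_table("my_table", Arc::clone(&provider));
            registry
        })
        .collect()
}

fn main() {
    let inferences = AtomicUsize::new(0);
    let registries = register_everywhere(3, &inferences);
    let first = &registries[0].tables["my_table"];
    let last = &registries[2].tables["my_table"];
    // Every registry points at the same provider allocation.
    assert!(Arc::ptr_eq(first, last));
    println!(
        "registries: {}, columns: {}, inferences: {}",
        registries.len(),
        first.schema.len(),
        inferences.load(Ordering::SeqCst)
    );
}
```

With the real API, the analogous step would be calling Arc::clone on the returned Arc<dyn TableProvider> before each register_table call.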