Natural Language Search with the Search Orchestrator

The KMDS Search Orchestrator lets you ask free-form questions about a project knowledge base and receive a plain-English answer. Under the hood, an LLM acts as a router: it classifies your intent and selects the best-matching observation-query template, the orchestrator executes that template, and the LLM synthesises the raw results into a readable response. When no structured template matches, or the chosen template returns no results, it falls back automatically to semantic vector search.


Overview

The orchestrator follows five logical steps every time you call it:

  1. Context Injection – A tool description (a catalogue of all available search templates and their purposes) is injected into the LLM prompt so the model always knows its options.

  2. Intent Classification & Entity Extraction – The LLM analyses your query and returns a Pydantic-validated JSON payload identifying which template to invoke and any filter parameters (observation type, keyword, sequence range) it extracted.

  3. Template Execution – The matching KMDS API function is called against the loaded knowledge base with the extracted filters applied.

  4. LLM Synthesis – The LLM converts the raw observation records into a concise natural language answer.

  5. Semantic Fallback (catch-all) – If no template matches, or if a template returns zero results, the ChromaDB semantic vector index is queried instead.

Note

Why not LlamaIndex (for now)? KMDS currently uses a custom orchestrator because it provides strict control over template routing, filter validation, and fallback behaviour with a smaller runtime dependency surface. This keeps behaviour predictable for ontology-specific queries and simplifies maintenance. We may revisit LlamaIndex later if we need broader multi-retriever orchestration, connector breadth, or rapid RAG pipeline experimentation.

Available Search Templates

Each template and what it is best for:

  • exploratory_search – Data quality, missing values, outliers, distributions, initial data understanding

  • data_representation_search – Feature engineering, transformations, encodings, scaling, data preparation decisions

  • modelling_choice_search – Algorithm selection rationale, modelling assumptions, hyperparameter decisions

  • model_selection_search – Model comparison, evaluation metrics, benchmarking, final model recommendation

  • all_observations_search – Broad cross-phase questions

  • semantic_search (fallback) – Nuanced natural-language similarity queries
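
To make the context-injection step more concrete, here is a minimal sketch of how a template catalogue could be turned into a routing prompt. The template names and descriptions come from the list above; the prompt wording and the build_router_prompt function are assumptions made for illustration and are not the actual KMDS internals.

```python
# Illustrative sketch only: the prompt wording and the function name are
# assumptions, not the real KMDS implementation.
TEMPLATES = {
    "exploratory_search": "Data quality, missing values, outliers, distributions, initial data understanding",
    "data_representation_search": "Feature engineering, transformations, encodings, scaling, data preparation decisions",
    "modelling_choice_search": "Algorithm selection rationale, modelling assumptions, hyperparameter decisions",
    "model_selection_search": "Model comparison, evaluation metrics, benchmarking, final model recommendation",
    "all_observations_search": "Broad cross-phase questions",
    "semantic_search": "Nuanced natural-language similarity queries (fallback)",
}


def build_router_prompt(query: str) -> str:
    """Return a routing prompt that lists every template before the user query."""
    catalogue = "\n".join(f"- {name}: {purpose}" for name, purpose in TEMPLATES.items())
    return (
        "Choose the search template that best matches the user query.\n"
        "Available templates:\n"
        f"{catalogue}\n\n"
        f"User query: {query}\n"
        "Reply with a JSON object containing intent_class, filters and explanation."
    )


prompt = build_router_prompt("Which encodings were applied to categorical features?")
print(prompt)
```

Listing every template in every prompt is what lets the model pick a route without any memory of earlier calls.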


Step-by-Step Usage

Step A – Prerequisites

Install the package with all dependencies:

pip install kmds          # core (pydantic, chromadb, sentence-transformers included)

For LLM-powered routing and synthesis you need a Google GenAI API key (gemini-1.5-flash is used by default):

export GOOGLE_API_KEY="your-google-api-key"

Note

If you prefer a different LLM (OpenAI, Anthropic, local Ollama, etc.), pass a llm_fn callable — see Using a Custom LLM Backend below.

Step B – Build or Load a Knowledge Base

You need a .xml knowledge-base file produced by a KMDS workflow. A test knowledge base ships with the package:

from importlib.resources import files
kb_path = str(files("kmds.examples").joinpath("example_analytics_kb_app_workflow.xml"))

Or use your own project file:

kb_path = "path/to/my_project_kb.xml"

Step C – Initialise the Orchestrator

from kmds.search import SearchOrchestrator

orc = SearchOrchestrator(
    kb_path=kb_path,
    persist_dir="./my_index",   # omit for in-memory (rebuilt each session)
)

What happens here:

  • The knowledge base is loaded into memory once.

  • The semantic vector index (ChromaDB) is built from all observation findings and persisted to ./my_index for fast reload on subsequent runs.

Step D – Ask a Question

result = orc.ask("What data quality issues were found during exploration?")

print(result.answer)          # synthesised natural language answer
print(result.intent_class)    # which template was used
print(result.route_explanation)
print(result.results)         # raw observation records

The returned OrchestratorResult always has these attributes:

  • answer – Synthesised natural language answer (string).

  • intent_class – The template that was ultimately executed.

  • route_explanation – The LLM’s one-sentence reason for choosing that template.

  • results – List of raw observation dicts (obs_type, finding, finding_seq, optional intent, optional distance).

Step E – Inspect Raw Results

for r in result.results:
    print(r["obs_type"], "|", r["finding"])
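
Because the records are plain dicts, ordinary Python is enough to post-process them. The sketch below groups records by obs_type and orders each group by finding_seq. The sample records are made up for illustration (including the obs_type labels), and group_by_type is a hypothetical helper, not part of KMDS.

```python
from collections import defaultdict

# Made-up records shaped like the dicts described above
# (obs_type, finding, finding_seq); the values are purely illustrative.
sample_results = [
    {"obs_type": "exploratory", "finding": "Column age has missing values.", "finding_seq": 2},
    {"obs_type": "model_selection", "finding": "Gradient boosting scored best on validation.", "finding_seq": 1},
    {"obs_type": "exploratory", "finding": "Income is right-skewed.", "finding_seq": 1},
]


def group_by_type(records):
    """Group records by obs_type, ordering each group by finding_seq."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["obs_type"]].append(rec)
    return {
        obs_type: sorted(recs, key=lambda r: r["finding_seq"])
        for obs_type, recs in groups.items()
    }


for obs_type, recs in group_by_type(sample_results).items():
    print(obs_type)
    for rec in recs:
        print(f"  {rec['finding_seq']}. {rec['finding']}")
```

With a real result you would pass result.results in place of sample_results.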

Step F – Use the CLI

The orchestrator is also available as a command-line tool:

# Basic usage
kmds-ask --project-file my_project.xml \
          --query "What feature engineering steps were taken?"

# Persist the index for fast repeated queries
kmds-ask --project-file my_project.xml \
          --query "Which model was selected and why?" \
          --persist-dir ./my_idx

# Show routing decision and raw records
kmds-ask --project-file my_project.xml \
          --query "transformation decisions" \
          --verbose

# Machine-readable JSON output
kmds-ask --project-file my_project.xml \
          --query "model evaluation metrics" \
          --output-format json

# Use a different LLM model
kmds-ask --project-file my_project.xml \
          --query "data quality" \
          --model gemini-2.0-flash

Using a Custom LLM Backend

Pass any callable that accepts a str prompt and returns a str response:

import openai

def my_llm(prompt: str) -> str:
    # Example using OpenAI; any client works as long as the function maps str -> str
    client = openai.OpenAI(api_key="sk-...")
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    # message.content can be None, so fall back to an empty string
    return resp.choices[0].message.content or ""

orc = SearchOrchestrator(kb_path=kb_path, llm_fn=my_llm)
result = orc.ask("What modelling assumptions were made?")
print(result.answer)
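
Since llm_fn is just a str -> str callable, it can be wrapped before it is handed to the orchestrator. The following is a minimal sketch of a retry wrapper; with_retries and flaky_llm are hypothetical names introduced here for illustration and are not part of KMDS.

```python
import time
from typing import Callable


def with_retries(
    llm_fn: Callable[[str], str], attempts: int = 3, delay: float = 0.0
) -> Callable[[str], str]:
    """Wrap a str -> str LLM callable so that failed calls are retried."""

    def wrapped(prompt: str) -> str:
        last_error = None
        for attempt in range(attempts):
            try:
                return llm_fn(prompt)
            except Exception as exc:  # real code would catch narrower errors
                last_error = exc
                if attempt < attempts - 1:
                    time.sleep(delay)
        raise RuntimeError(f"LLM call failed after {attempts} attempts") from last_error

    return wrapped


# Demonstration with a stand-in backend that fails once, then succeeds.
calls = {"n": 0}


def flaky_llm(prompt: str) -> str:
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("temporary outage")
    return f"echo: {prompt}"


robust_llm = with_retries(flaky_llm, attempts=3)
reply = robust_llm("ping")
print(reply)
```

A wrapped callable is passed the same way as any other backend, for example SearchOrchestrator(kb_path=kb_path, llm_fn=with_retries(my_llm)).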

Fallback Behaviour

The orchestrator is designed to return a usable result even when an individual step fails:

  1. If the LLM is unreachable → router falls back to semantic vector search.

  2. If the JSON response cannot be parsed → falls back to semantic search.

  3. If the chosen template returns zero observations → falls back to semantic search.

  4. If the synthesis LLM call fails → raw records are formatted as plain text.
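
The four rules above can be read as a single control flow. The sketch below is one possible way to express it, with every collaborator injected as a plain callable. All names here (answer_with_fallbacks, run_template, semantic_search as a parameter, the prompt strings) are assumptions for illustration and do not mirror the KMDS source.

```python
import json
from typing import Callable

Records = list[dict]


def answer_with_fallbacks(
    query: str,
    llm_fn: Callable[[str], str],
    run_template: Callable[[str], Records],
    semantic_search: Callable[[str], Records],
) -> tuple[str, str, Records]:
    """Return (answer, intent_class, records), degrading gracefully at each step."""
    intent = "semantic_search"
    try:
        # Rules 1 and 2: an unreachable LLM or unparseable JSON keeps the semantic route.
        intent = json.loads(llm_fn(f"Route this query: {query}"))["intent_class"]
    except Exception:
        pass

    records = run_template(intent) if intent != "semantic_search" else []
    if not records:
        # Rule 3: a template with zero observations falls back to semantic search.
        intent, records = "semantic_search", semantic_search(query)

    try:
        answer = llm_fn(f"Summarise for '{query}': {records}")
    except Exception:
        # Rule 4: if synthesis fails, format the raw records as plain text.
        answer = "\n".join(f"[{r['obs_type']}] {r['finding']}" for r in records)
    return answer, intent, records


def offline_llm(prompt: str) -> str:
    """Stand-in backend that simulates an unreachable LLM."""
    raise ConnectionError("LLM unreachable")


answer, intent, records = answer_with_fallbacks(
    "Which model was selected?",
    llm_fn=offline_llm,
    run_template=lambda name: [],
    semantic_search=lambda q: [{"obs_type": "model_selection", "finding": "Random forest chosen."}],
)
print(intent)
print(answer)
```

In this demonstration the LLM is never reachable, so the query is answered from the semantic route and the record is rendered as plain text.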


Worked Example

import os
from importlib.resources import files
from kmds.search import SearchOrchestrator

os.environ["GOOGLE_API_KEY"] = "your-key-here"

kb_path = str(files("kmds.examples").joinpath("example_ml_kb_exp_workflow.xml"))

orc = SearchOrchestrator(kb_path=kb_path, persist_dir="./ml_idx")

questions = [
    "What data quality issues were encountered?",
    "Which features were engineered and why?",
    "What modelling assumptions were made?",
    "Which model was finally selected and what were its evaluation metrics?",
    "Summarise the key findings across all phases.",
]

for q in questions:
    result = orc.ask(q)
    print(f"Q: {q}")
    print(f"   Template used : {result.intent_class}")
    print(f"   Answer        : {result.answer[:200]}")
    print()

Pydantic Schema Reference

The router output is validated against:

class kmds.search.search_orchestrator.SearchFilters(*, obs_type_filter: str | None = None, finding_seq_min: int | None = None, finding_seq_max: int | None = None, keyword: str | None = None)

Optional parameters the LLM may extract from the user query.

All fields are optional. Any field left as None is ignored during the post-retrieval filtering step.

finding_seq_max: int | None

Include only observations with finding_seq <= this value.

finding_seq_min: int | None

Include only observations with finding_seq >= this value.

keyword: str | None

Additional keyword to filter the finding text (case-insensitive substring).

model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

obs_type_filter: str | None

Substring to match against the observation-type label (case-insensitive).
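
As an illustration of the post-retrieval filtering step, the sketch below applies the four documented fields to a list of record dicts. It uses a standard-library dataclass as a stand-in for SearchFilters, and apply_filters and the sample records are hypothetical; the real KMDS filtering code may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Filters:
    """Stdlib stand-in for SearchFilters with the same four optional fields."""

    obs_type_filter: Optional[str] = None
    finding_seq_min: Optional[int] = None
    finding_seq_max: Optional[int] = None
    keyword: Optional[str] = None


def apply_filters(records: list[dict], f: Filters) -> list[dict]:
    """Keep only the records that satisfy every filter that is not None."""
    kept = []
    for rec in records:
        if f.obs_type_filter is not None and f.obs_type_filter.lower() not in rec["obs_type"].lower():
            continue
        if f.finding_seq_min is not None and rec["finding_seq"] < f.finding_seq_min:
            continue
        if f.finding_seq_max is not None and rec["finding_seq"] > f.finding_seq_max:
            continue
        if f.keyword is not None and f.keyword.lower() not in rec["finding"].lower():
            continue
        kept.append(rec)
    return kept


# Made-up records for demonstration only.
records = [
    {"obs_type": "Exploratory", "finding": "Outliers found in income.", "finding_seq": 1},
    {"obs_type": "Exploratory", "finding": "Missing values in age.", "finding_seq": 2},
    {"obs_type": "Modelling choice", "finding": "Chose trees to handle outliers.", "finding_seq": 3},
]
matches = apply_filters(records, Filters(obs_type_filter="explor", keyword="OUTLIERS"))
print(matches)
```

Filters left as None are skipped entirely, so an empty Filters() returns every record unchanged.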

class kmds.search.search_orchestrator.OrchestratorRoute(*, intent_class: Literal['exploratory_search', 'data_representation_search', 'modelling_choice_search', 'model_selection_search', 'all_observations_search', 'semantic_search'], filters: SearchFilters = SearchFilters(obs_type_filter=None, finding_seq_min=None, finding_seq_max=None, keyword=None), explanation: str = '')

Structured output produced by the LLM router.

The LLM is instructed to return a JSON object that conforms to this schema. Pydantic validates and coerces the payload before execution.

explanation: str

Brief explanation of why this route was chosen (surfaced to the caller).

filters: SearchFilters

Optional query parameters extracted from the query text.

intent_class: IntentClass

Which search template best matches the user query.

model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
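
To show what validating a router reply involves, here is a simplified sketch that checks a JSON string against the documented shape of OrchestratorRoute. It deliberately uses only the standard library in place of Pydantic, so it approximates the behaviour rather than reproducing it; parse_route is a hypothetical helper, not a KMDS function.

```python
import json

INTENT_CLASSES = {
    "exploratory_search",
    "data_representation_search",
    "modelling_choice_search",
    "model_selection_search",
    "all_observations_search",
    "semantic_search",
}
FILTER_TYPES = {
    "obs_type_filter": str,
    "finding_seq_min": int,
    "finding_seq_max": int,
    "keyword": str,
}


def parse_route(raw: str) -> dict:
    """Parse router JSON; raise ValueError if it does not match the documented shape."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"router reply is not valid JSON: {exc}") from exc
    intent = payload.get("intent_class") if isinstance(payload, dict) else None
    if not isinstance(intent, str) or intent not in INTENT_CLASSES:
        raise ValueError("missing or unknown intent_class")
    filters = payload.get("filters", {})
    if not isinstance(filters, dict):
        raise ValueError("filters must be a JSON object")
    for name, expected in FILTER_TYPES.items():
        value = filters.get(name)
        if value is not None and not isinstance(value, expected):
            raise ValueError(f"{name} must be {expected.__name__} or null")
    return {
        "intent_class": intent,
        "filters": {name: filters.get(name) for name in FILTER_TYPES},
        "explanation": str(payload.get("explanation", "")),
    }


route = parse_route('{"intent_class": "exploratory_search", "filters": {"keyword": "outlier"}}')
print(route)
```

A reply that fails this kind of check is the case in which the orchestrator would take its semantic-search fallback.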

API Reference

class kmds.search.search_orchestrator.SearchOrchestrator(kb_path: str, *, persist_dir: str | None = None, llm_fn: Callable[[str], str] | None = None, model: str = 'gemini-1.5-flash', embedding_model: str = 'all-MiniLM-L6-v2', n_results: int = 5)

LLM-driven search orchestrator for a KMDS knowledge base.

The orchestrator routes natural language queries through an LLM to identify the best search template, executes it against the loaded knowledge base, and synthesises the results into a natural language answer.

Parameters:
  • kb_path – Path to the KMDS .xml knowledge-base file.

  • persist_dir – Directory to persist the semantic vector index. None keeps the index in memory (rebuilt on each interpreter session).

  • llm_fn – Optional callable (prompt: str) -> str for your own LLM backend. If None, the orchestrator uses Google GenAI (requires GOOGLE_API_KEY environment variable).

  • model – Google GenAI model name (ignored when llm_fn is supplied).

  • embedding_model – Sentence-transformers model used for the semantic fallback index.

  • n_results – Default maximum number of observation records returned per query.

Examples

Using Google GenAI (default):

import os
os.environ["GOOGLE_API_KEY"] = "your-key"

from kmds.search import SearchOrchestrator

orc = SearchOrchestrator("my_project.xml", persist_dir="./idx")
result = orc.ask("What data quality issues were found?")
print(result.answer)

Using a custom LLM backend:

def my_llm(prompt: str) -> str:
    # call any LLM here; my_model is a placeholder for your own client object
    return my_model.generate(prompt)

orc = SearchOrchestrator("my_project.xml", llm_fn=my_llm)
result = orc.ask("Which model was selected and why?")
print(result.answer)
print(result.results)      # raw records

__init__(kb_path: str, *, persist_dir: str | None = None, llm_fn: Callable[[str], str] | None = None, model: str = 'gemini-1.5-flash', embedding_model: str = 'all-MiniLM-L6-v2', n_results: int = 5) -> None

ask(query: str) -> OrchestratorResult

Route a natural language query and return a synthesised answer.

This is the single public entry point for the orchestrator. It performs all five steps internally (context injection, routing, execution, synthesis, fallback) and returns an OrchestratorResult.

Parameters:

query – Free-form natural language question about the knowledge base.

Return type:

OrchestratorResult

class kmds.search.search_orchestrator.OrchestratorResult(answer: str, intent_class: str, route_explanation: str, results: list[dict[str, Any]])

Result returned by SearchOrchestrator.ask().

answer

Synthesised natural language answer.

intent_class

The search template that was ultimately executed.

route_explanation

The LLM’s own explanation for its routing choice.

results

Raw observation record dicts that informed the answer.