Natural Language Search with the Search Orchestrator
The KMDS Search Orchestrator lets you ask free-form questions about a project knowledge base and receive a plain-English answer. Under the hood it uses an LLM as a router to classify your intent, execute the best-matching observation-query template, and synthesise the raw results into a readable response. When no structured template matches, or the matched template returns no results, it falls back automatically to semantic vector search.
Overview
The orchestrator follows five logical steps every time you call it:
| Step | Label | What happens |
|---|---|---|
| 1 | Context Injection | A tool description (a catalogue of all available search templates and their purpose) is injected into the LLM prompt so the model always knows its options. |
| 2 | Intent Classification & Entity Extraction | The LLM analyses your query and returns a Pydantic-validated JSON payload identifying which template to invoke and any filter parameters (observation type, keyword, sequence range) it extracted. |
| 3 | Template Execution | The matching KMDS API function is called against the loaded knowledge base with the extracted filters applied. |
| 4 | LLM Synthesis | The LLM converts the raw observation records into a concise natural language answer. |
| 5 | Semantic Fallback (catch-all) | If no template matches, or if a template returns zero results, the ChromaDB semantic vector index is queried instead. |
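The five steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the KMDS implementation: `RECORDS`, `TEMPLATES`, `run_template`, `semantic_search`, `run_pipeline`, and the stub `fake_llm` are hypothetical names, the records are made up, and a naive word-overlap match stands in for the ChromaDB vector index.

```python
import json
from typing import Callable

# Hypothetical in-memory records; real KMDS records come from the .xml file.
RECORDS = [
    {"obs_type": "exploratory", "finding": "Column age has 12% missing values."},
    {"obs_type": "model_selection", "finding": "Gradient boosting had the best F1."},
]

# Catalogue of templates and their purpose (assumed wording, two entries only).
TEMPLATES = {
    "exploratory_search": "Data quality and initial data understanding",
    "model_selection_search": "Model comparison and final recommendation",
}


def run_template(intent: str) -> list[dict]:
    """Step 3 stand-in: return records whose type matches the template name."""
    wanted = intent.removesuffix("_search")
    return [r for r in RECORDS if r["obs_type"] == wanted]


def semantic_search(query: str) -> list[dict]:
    """Step 5 stand-in: naive word overlap instead of a vector index."""
    words = set(query.lower().split())
    return [r for r in RECORDS if words & set(r["finding"].lower().split())]


def run_pipeline(query: str, llm_fn: Callable[[str], str]) -> dict:
    # Step 1: inject the template catalogue into the routing prompt.
    catalogue = "\n".join(f"- {name}: {desc}" for name, desc in TEMPLATES.items())
    prompt = f"Templates:\n{catalogue}\n\nQuery: {query}\nReply with JSON."
    # Step 2: ask the LLM for a route and parse its JSON reply.
    try:
        intent = json.loads(llm_fn(prompt))["intent_class"]
    except (ValueError, KeyError, TypeError):
        intent = "semantic_search"
    # Step 3: run the chosen template, if it is a known one.
    results = run_template(intent) if intent in TEMPLATES else []
    # Step 5: fall back when nothing matched or nothing came back.
    if not results:
        intent, results = "semantic_search", semantic_search(query)
    # Step 4: synthesis is reduced to joining the findings here.
    answer = " ".join(r["finding"] for r in results) or "No matching observations."
    return {"intent_class": intent, "answer": answer, "results": results}


def fake_llm(prompt: str) -> str:
    """Stub router: always picks the exploratory template."""
    return json.dumps({"intent_class": "exploratory_search"})


print(run_pipeline("What data quality issues were found?", fake_llm))
```

Passing a callable that returns invalid JSON, such as `lambda p: "not json"`, exercises the fallback branch of the sketch.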
Note
Why not LlamaIndex (for now)? KMDS currently uses a custom orchestrator because it provides strict control over template routing, filter validation, and fallback behaviour with a smaller runtime dependency surface. This keeps behaviour predictable for ontology-specific queries and simplifies maintenance. We may revisit LlamaIndex later if we need broader multi-retriever orchestration, connector breadth, or rapid RAG pipeline experimentation.
Available Search Templates
| Template | Best for |
|---|---|
| exploratory_search | Data quality, missing values, outliers, distributions, initial data understanding |
| data_representation_search | Feature engineering, transformations, encodings, scaling, data preparation decisions |
| modelling_choice_search | Algorithm selection rationale, modelling assumptions, hyperparameter decisions |
| model_selection_search | Model comparison, evaluation metrics, benchmarking, final model recommendation |
| all_observations_search | Broad cross-phase questions |
| semantic_search | Nuanced natural-language similarity queries |
Step-by-Step Usage
Step A – Prerequisites
Install the package with all dependencies:
pip install kmds # core (pydantic, chromadb, sentence-transformers included)
For LLM-powered routing and synthesis you need a Google GenAI API key
(gemini-1.5-flash is used by default):
export GOOGLE_API_KEY="your-google-api-key"
Note
If you prefer a different LLM (OpenAI, Anthropic, local Ollama, etc.),
pass an llm_fn callable; see Using a Custom LLM Backend below.
Step B – Build or Load a Knowledge Base
You need a .xml knowledge-base file produced by a KMDS workflow. A test
knowledge base ships with the package:
from importlib.resources import files
kb_path = str(files("kmds.examples").joinpath("example_analytics_kb_app_workflow.xml"))
Or use your own project file:
kb_path = "path/to/my_project_kb.xml"
Step C – Initialise the Orchestrator
from kmds.search import SearchOrchestrator
orc = SearchOrchestrator(
    kb_path=kb_path,
    persist_dir="./my_index",  # omit for in-memory (rebuilt each session)
)
What happens here:
- The knowledge base is loaded into memory once.
- The semantic vector index (ChromaDB) is built from all observation findings and persisted to ./my_index for fast reload on subsequent runs.
Step D – Ask a Question
result = orc.ask("What data quality issues were found during exploration?")
print(result.answer) # synthesised natural language answer
print(result.intent_class) # which template was used
print(result.route_explanation)
print(result.results) # raw observation records
The returned OrchestratorResult always has these attributes:
| Attribute | Description |
|---|---|
| answer | Synthesised natural language answer (string). |
| intent_class | The template that was ultimately executed. |
| route_explanation | The LLM’s one-sentence reason for choosing that template. |
| results | List of raw observation dicts. |
Step E – Inspect Raw Results
for r in result.results:
    print(r["obs_type"], "|", r["finding"])
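When a query returns many records, grouping them by observation type can make them easier to scan. The sketch below assumes only the two keys used in the loop above (`obs_type` and `finding`); `sample_results` and `group_findings` are hypothetical names, and the records are made up so the snippet runs without a knowledge base.

```python
from collections import defaultdict

# Made-up records with the two keys used above; real results may carry more keys.
sample_results = [
    {"obs_type": "exploratory", "finding": "Income has missing values."},
    {"obs_type": "exploratory", "finding": "Age contains outliers."},
    {"obs_type": "model_selection", "finding": "Random forest was selected."},
]


def group_findings(results: list[dict]) -> dict[str, list[str]]:
    """Collect finding texts under their observation type, preserving order."""
    grouped = defaultdict(list)
    for record in results:
        grouped[record["obs_type"]].append(record["finding"])
    return dict(grouped)


for obs_type, findings in group_findings(sample_results).items():
    print(f"{obs_type}: {len(findings)} finding(s)")
```

With a real orchestrator, `result.results` would take the place of `sample_results`.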
Step F – Use the CLI
The orchestrator is also available as a command-line tool:
# Basic usage
kmds-ask --project-file my_project.xml \
--query "What feature engineering steps were taken?"
# Persist the index for fast repeated queries
kmds-ask --project-file my_project.xml \
--query "Which model was selected and why?" \
--persist-dir ./my_idx
# Show routing decision and raw records
kmds-ask --project-file my_project.xml \
--query "transformation decisions" \
--verbose
# Machine-readable JSON output
kmds-ask --project-file my_project.xml \
--query "model evaluation metrics" \
--output-format json
# Use a different LLM model
kmds-ask --project-file my_project.xml \
--query "data quality" \
--model gemini-2.0-flash
Using a Custom LLM Backend
Pass any callable that accepts a str prompt and returns a str
response:
def my_llm(prompt: str) -> str:
    # Example using OpenAI
    import openai
    client = openai.OpenAI(api_key="sk-...")
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
orc = SearchOrchestrator(kb_path=kb_path, llm_fn=my_llm)
result = orc.ask("What modelling assumptions were made?")
print(result.answer)
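Because the backend is just a callable from str to str, it can be wrapped before it is handed to the orchestrator. The following is a minimal sketch of a retry wrapper, assuming transient failures surface as exceptions; `with_retries`, `flaky_llm`, and `robust_llm` are hypothetical names and are not part of KMDS.

```python
import time
from typing import Callable


def with_retries(
    llm_fn: Callable[[str], str], attempts: int = 3, delay: float = 0.0
) -> Callable[[str], str]:
    """Return a callable with the same str -> str shape that retries on errors."""

    def wrapped(prompt: str) -> str:
        last_error = None
        for attempt in range(attempts):
            try:
                return llm_fn(prompt)
            except Exception as exc:  # any backend error triggers a retry
                last_error = exc
                if attempt < attempts - 1:
                    time.sleep(delay)
        raise RuntimeError(f"LLM call failed after {attempts} attempts") from last_error

    return wrapped


# Demo backend that fails twice and then succeeds.
calls = {"n": 0}


def flaky_llm(prompt: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("temporary failure")
    return f"echo: {prompt}"


robust_llm = with_retries(flaky_llm, attempts=3)
print(robust_llm("hello"))  # succeeds on the third attempt
```

The wrapped callable keeps the documented signature, so it could be passed as `llm_fn=robust_llm`. Retries delay the orchestrator's own fallback to semantic search, so a small `attempts` value is probably sensible.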
Fallback Behaviour
The orchestrator is designed to always return an answer:
- If the LLM is unreachable, the router falls back to semantic vector search.
- If the JSON response cannot be parsed, it falls back to semantic search.
- If the chosen template returns zero observations, it falls back to semantic search.
- If the synthesis LLM call fails, the raw records are formatted as plain text.
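The last rule, degrading to plain text when synthesis fails, can be illustrated with a short sketch. It is not the KMDS code: `format_records`, `synthesise`, and `broken_llm` are hypothetical names, and the output format is an assumption.

```python
from typing import Callable


def format_records(records: list[dict]) -> str:
    """Plain-text rendering used when synthesis is unavailable."""
    if not records:
        return "No matching observations."
    return "\n".join(f"[{r['obs_type']}] {r['finding']}" for r in records)


def synthesise(records: list[dict], llm_fn: Callable[[str], str]) -> str:
    """Try the LLM first; on any failure return the formatted records."""
    prompt = "Summarise these findings:\n" + format_records(records)
    try:
        return llm_fn(prompt)
    except Exception:  # broad on purpose: the caller must always get an answer
        return format_records(records)


def broken_llm(prompt: str) -> str:
    """Stub backend that simulates an unreachable LLM."""
    raise ConnectionError("LLM unreachable")


records = [{"obs_type": "exploratory", "finding": "Two columns contain outliers."}]
print(synthesise(records, broken_llm))
# prints: [exploratory] Two columns contain outliers.
```

Catching a broad `Exception` is a deliberate choice in this sketch, since the stated goal is that the caller always receives some answer.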
Worked Example
import os
from importlib.resources import files
from kmds.search import SearchOrchestrator
os.environ["GOOGLE_API_KEY"] = "your-key-here"
kb_path = str(files("kmds.examples").joinpath("example_ml_kb_exp_workflow.xml"))
orc = SearchOrchestrator(kb_path=kb_path, persist_dir="./ml_idx")
questions = [
"What data quality issues were encountered?",
"Which features were engineered and why?",
"What modelling assumptions were made?",
"Which model was finally selected and what were its evaluation metrics?",
"Summarise the key findings across all phases.",
]
for q in questions:
    result = orc.ask(q)
    print(f"Q: {q}")
    print(f" Template used : {result.intent_class}")
    print(f" Answer : {result.answer[:200]}")
    print()
Pydantic Schema Reference
The router output is validated against:
- class kmds.search.search_orchestrator.SearchFilters(*, obs_type_filter: str | None = None, finding_seq_min: int | None = None, finding_seq_max: int | None = None, keyword: str | None = None)
Optional parameters the LLM may extract from the user query.
All fields are optional. Any field left as None is ignored during the post-retrieval filtering step.
- finding_seq_max: int | None
Include only observations with finding_seq <= this value.
- finding_seq_min: int | None
Include only observations with finding_seq >= this value.
- keyword: str | None
Additional keyword to filter the finding text (case-insensitive substring).
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- obs_type_filter: str | None
Substring to match against the observation-type label (case-insensitive).
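The documented filter semantics can be sketched with a standard-library dataclass. This is an illustration only: `Filters` and `apply_filters` are hypothetical stand-ins for the real Pydantic model and the internal filtering step, and the record keys `obs_type`, `finding`, and `finding_seq` are assumed from the field descriptions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Filters:
    """Stdlib stand-in mirroring the documented SearchFilters fields."""

    obs_type_filter: Optional[str] = None
    finding_seq_min: Optional[int] = None
    finding_seq_max: Optional[int] = None
    keyword: Optional[str] = None


def apply_filters(records: list[dict], f: Filters) -> list[dict]:
    """Keep records that satisfy every filter that is not None."""
    kept = []
    for r in records:
        # Case-insensitive substring match on the observation-type label.
        if f.obs_type_filter is not None and f.obs_type_filter.lower() not in r["obs_type"].lower():
            continue
        # Inclusive lower and upper bounds on the sequence number.
        if f.finding_seq_min is not None and r["finding_seq"] < f.finding_seq_min:
            continue
        if f.finding_seq_max is not None and r["finding_seq"] > f.finding_seq_max:
            continue
        # Case-insensitive substring match on the finding text.
        if f.keyword is not None and f.keyword.lower() not in r["finding"].lower():
            continue
        kept.append(r)
    return kept


# Made-up records for demonstration.
records = [
    {"obs_type": "Exploratory", "finding_seq": 1, "finding": "Missing values in income."},
    {"obs_type": "Exploratory", "finding_seq": 2, "finding": "Outliers in age."},
    {"obs_type": "Model Selection", "finding_seq": 1, "finding": "Random forest chosen."},
]
print(apply_filters(records, Filters(obs_type_filter="explor", finding_seq_min=2)))
```

An empty `Filters()` leaves the record list unchanged, which matches the rule that fields left as None are ignored.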
- class kmds.search.search_orchestrator.OrchestratorRoute(*, intent_class: Literal['exploratory_search', 'data_representation_search', 'modelling_choice_search', 'model_selection_search', 'all_observations_search', 'semantic_search'], filters: SearchFilters = SearchFilters(obs_type_filter=None, finding_seq_min=None, finding_seq_max=None, keyword=None), explanation: str = '')
Structured output produced by the LLM router.
The LLM is instructed to return a JSON object that conforms to this schema. Pydantic validates and coerces the payload before execution.
- explanation: str
Brief explanation of why this route was chosen (surfaced to the caller).
- filters: SearchFilters
Optional query parameters extracted from the query text.
- intent_class: IntentClass
Which search template best matches the user query.
- model_config = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
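To make the shape of the router payload concrete, the sketch below checks a JSON reply against the documented fields using only the standard library. It is a simplified stand-in for the Pydantic validation, not a copy of it: `parse_route` is a hypothetical helper, it performs no type coercion, and it silently ignores unknown filter keys.

```python
import json

INTENT_CLASSES = {
    "exploratory_search",
    "data_representation_search",
    "modelling_choice_search",
    "model_selection_search",
    "all_observations_search",
    "semantic_search",
}
FILTER_KEYS = ("obs_type_filter", "finding_seq_min", "finding_seq_max", "keyword")


def parse_route(raw: str) -> dict:
    """Parse a router reply; raise ValueError if it does not fit the schema."""
    payload = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(payload, dict):
        raise ValueError("route must be a JSON object")
    intent = payload.get("intent_class")
    if not isinstance(intent, str) or intent not in INTENT_CLASSES:
        raise ValueError(f"unknown intent_class: {intent!r}")
    raw_filters = payload.get("filters") or {}
    if not isinstance(raw_filters, dict):
        raise ValueError("filters must be a JSON object")
    return {
        "intent_class": intent,
        # Missing filter fields default to None, as in the documented schema.
        "filters": {key: raw_filters.get(key) for key in FILTER_KEYS},
        "explanation": str(payload.get("explanation", "")),
    }


reply = json.dumps(
    {
        "intent_class": "exploratory_search",
        "filters": {"keyword": "outlier"},
        "explanation": "The query asks about data quality.",
    }
)
route = parse_route(reply)
print(route["intent_class"], route["filters"]["keyword"])
```

A reply that fails these checks raises `ValueError`, which a caller could treat as the signal to fall back to semantic search.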
API Reference
- class kmds.search.search_orchestrator.SearchOrchestrator(kb_path: str, *, persist_dir: str | None = None, llm_fn: Callable[[str], str] | None = None, model: str = 'gemini-1.5-flash', embedding_model: str = 'all-MiniLM-L6-v2', n_results: int = 5)
LLM-driven search orchestrator for a KMDS knowledge base.
The orchestrator routes natural language queries through an LLM to identify the best search template, executes it against the loaded knowledge base, and synthesises the results into a natural language answer.
- Parameters:
kb_path – Path to the KMDS .xml knowledge-base file.
persist_dir – Directory to persist the semantic vector index. None keeps the index in memory (rebuilt on each interpreter session).
llm_fn – Optional callable (prompt: str) -> str for your own LLM backend. If None, the orchestrator uses Google GenAI (requires the GOOGLE_API_KEY environment variable).
model – Google GenAI model name (ignored when llm_fn is supplied).
embedding_model – Sentence-transformers model used for the semantic fallback index.
n_results – Default maximum number of observation records returned per query.
Examples
Using Google GenAI (default):
import os
os.environ["GOOGLE_API_KEY"] = "your-key"
from kmds.search import SearchOrchestrator
orc = SearchOrchestrator("my_project.xml", persist_dir="./idx")
result = orc.ask("What data quality issues were found?")
print(result.answer)
Using a custom LLM backend:
def my_llm(prompt: str) -> str:
    # call any LLM here (my_model is a placeholder for your own client)
    return my_model.generate(prompt)

orc = SearchOrchestrator("my_project.xml", llm_fn=my_llm)
result = orc.ask("Which model was selected and why?")
print(result.answer)
print(result.results)  # raw records
- __init__(kb_path: str, *, persist_dir: str | None = None, llm_fn: Callable[[str], str] | None = None, model: str = 'gemini-1.5-flash', embedding_model: str = 'all-MiniLM-L6-v2', n_results: int = 5) None
- ask(query: str) OrchestratorResult
Route a natural language query and return a synthesised answer.
This is the single public entry point for the orchestrator. It performs all five steps internally (context injection, routing, execution, synthesis, fallback) and returns an OrchestratorResult.
- Parameters:
query – Free-form natural language question about the knowledge base.
- Return type:
OrchestratorResult
- class kmds.search.search_orchestrator.OrchestratorResult(answer: str, intent_class: str, route_explanation: str, results: list[dict[str, Any]])
Result returned by SearchOrchestrator.ask().
- answer
Synthesised natural language answer.
- intent_class
The search template that was ultimately executed.
- route_explanation
The LLM’s own explanation for its routing choice.
- results
Raw observation record dicts that informed the answer.