DSL examples with LangChain

Introduction

This blog post (notebook) demonstrates the usage of the Python data package “DSLExamples”, [AAp1], with examples of translating Domain Specific Language (DSL) commands into programming code.

The provided DSL examples are suitable for LLM few-shot training. LangChain can be used to create translation pipelines that utilize those examples; such LLM-translation pipelines are exemplified below.

The Python package closely follows the Raku package “DSL::Examples”, [AAp2], and the Wolfram Language paclet “DSLExamples”, [AAp3], and has (or should have) the same DSL examples data.

Remark: Similar translations — with far fewer computational resources — are achieved with grammar-based DSL translators; see “DSL::Translators”, [AAp4].


Setup

Load the packages used below:

from DSLExamples import dsl_examples, dsl_workflow_separators
from langchain_core.output_parsers import StrOutputParser
from langchain_ollama import ChatOllama
import pandas as pd
import os

Retrieval

Get all examples and retrieve specific language/workflow slices.

all_examples = dsl_examples()
python_lsa = dsl_examples("Python", "LSAMon")
separators = dsl_workflow_separators("WL", "LSAMon")
list(all_examples.keys()), list(python_lsa.keys())[:5]
# (['WL', 'Python', 'R', 'Raku'],
#  ['load the package',
#   'use the documents aDocs',
#   'use dfTemp',
#   'make the document-term matrix',
#   'make the document-term matrix with automatic stop words'])
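The examples are organized as a nested dictionary: language → workflow → {command: code}. The retrieval above can be sketched with plain Python over a miniature, made-up data sample (the commands and code below are illustrative, not the actual package contents):

```python
# Hypothetical miniature of the DSLExamples data layout:
# language -> workflow -> {command: code}
examples = {
    "Python": {
        "LSAMon": {
            "load the package": "from LatentSemanticAnalyzer import *",
            "make the document-term matrix": "make_document_term_matrix()",
        }
    },
    "WL": {"LSAMon": {"load the package": 'Needs["AntonAntonov`MonadicLatentSemanticAnalysis`"]'}},
}

def get_examples(data, language=None, workflow=None):
    """Return the whole dictionary, a language slice, or a workflow slice."""
    if language is None:
        return data
    if workflow is None:
        return data[language]
    return data[language][workflow]

python_lsa_demo = get_examples(examples, "Python", "LSAMon")
print(list(python_lsa_demo.keys()))
```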

Tabulate Languages and Workflows

rows = [
    {"language": lang, "workflow": workflow}
    for lang, workflows in all_examples.items()
    for workflow in workflows.keys()
]
pd.DataFrame(rows).sort_values(["language", "workflow"]).reset_index(drop=True)
language  workflow
Python    LSAMon
Python    QRMon
Python    SMRMon
Python    pandas
R         DataReshaping
R         LSAMon
R         QRMon
R         SMRMon
Raku      DataReshaping
Raku      SMRMon
Raku      TriesWithFrequencies
WL        ClCon
WL        DataReshaping
WL        LSAMon
WL        QRMon
WL        SMRMon
WL        Tabular
WL        TriesWithFrequencies

Python LSA Examples

pd.DataFrame([{"command": k, "code": v} for k, v in python_lsa.items()])
command                                            | code
load the package                                   | from LatentSemanticAnalyzer import *
use the documents aDocs                            | LatentSemanticAnalyzer(aDocs)
use dfTemp                                         | LatentSemanticAnalyzer(dfTemp)
make the document-term matrix                      | make_document_term_matrix()
make the document-term matrix with automatic s…    | make_document_term_matrix[stemming_rules=None,…
make the document-term matrix without stemming     | make_document_term_matrix[stemming_rules=False…
apply term weight functions                        | apply_term_weight_functions()
apply term weight functions: global IDF, local…    | apply_term_weight_functions(global_weight_func…
extract 30 topics using the method SVD             | extract_topics(number_of_topics=24, method='SVD')
extract 24 topics using the method NNMF, max s…    | extract_topics(number_of_topics=24, min_number…
Echo topics table                                  | echo_topics_interpretation(wide_form=True)
show the topics                                    | echo_topics_interpretation(wide_form=True)
Echo topics table with 10 terms per topic          | echo_topics_interpretation(number_of_terms=10,…
find the statistical thesaurus for the words n…    | echo_statistical_thesaurus(terms=stemmerObj.st…
show statistical thesaurus for king, castle, p…    | echo_statistical_thesaurus(terms=stemmerObj.st…

LangChain Few-Shot Prompt

Build a few-shot prompt from the DSL examples, then run it over commands.

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate
# Use a small subset of examples as few-shot demonstrations
example_pairs = list(python_lsa.items())[:5]
examples = [
    {"command": cmd, "code": code}
    for cmd, code in example_pairs
]
example_prompt = PromptTemplate(
    input_variables=["command", "code"],
    template="Command: {command}\nCode: {code}",
)
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=(
        "You translate DSL commands into Python code that builds an LSA pipeline. "
        "Follow the style of the examples."
    ),
    suffix="Command: {command}\nCode:",
    input_variables=["command"],
)
print(few_shot_prompt.format(command="show the topics"))
# You translate DSL commands into Python code that builds an LSA pipeline. Follow the style of the examples.
#
# Command: load the package
# Code: from LatentSemanticAnalyzer import *
#
# Command: use the documents aDocs
# Code: LatentSemanticAnalyzer(aDocs)
#
# Command: use dfTemp
# Code: LatentSemanticAnalyzer(dfTemp)
#
# Command: make the document-term matrix
# Code: make_document_term_matrix()
#
# Command: make the document-term matrix with automatic stop words
# Code: make_document_term_matrix[stemming_rules=None,stopWords=True)
#
# Command: show the topics
# Code:
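Conceptually, FewShotPromptTemplate just joins the prefix, the formatted example pairs, and the suffix into one string. A rough stdlib-only sketch of that behavior (an illustration, not LangChain's actual implementation):

```python
def format_few_shot(prefix, examples, suffix, command, example_separator="\n\n"):
    """Join a prefix, formatted (command, code) demonstrations, and a suffix
    into a single few-shot prompt string."""
    demos = [f"Command: {cmd}\nCode: {code}" for cmd, code in examples]
    parts = [prefix] + demos + [suffix.format(command=command)]
    return example_separator.join(parts)

prompt = format_few_shot(
    "You translate DSL commands into Python code that builds an LSA pipeline.",
    [("load the package", "from LatentSemanticAnalyzer import *")],
    "Command: {command}\nCode:",
    "show the topics",
)
print(prompt)
```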

Translation With Ollama Model

Run the few-shot prompt against a local Ollama model.

llm = ChatOllama(model=os.getenv("OLLAMA_MODEL", "gemma3:12b"))
commands = [
    "use the dataset aAbstracts",
    "make the document-term matrix without stemming",
    "extract 40 topics using the method non-negative matrix factorization",
    "show the topics",
]
chain = few_shot_prompt | llm | StrOutputParser()
sep = dsl_workflow_separators("Python", "LSAMon")
result = []
for command in commands:
    result.append(chain.invoke({"command": command}))
print(sep.join([x.strip() for x in result]))
# LatentSemanticAnalyzer(aAbstracts)
# .make_document_term_matrix(stemming_rules=None)
# .extract_topics(40, method='non-negative_matrix_factorization')
# .show_topics()
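The workflow separator is what turns the per-command translations into a single fluent pipeline. Judging by the output above, the Python/LSAMon separator appears to be a newline followed by a dot (an assumption here; verify with dsl_workflow_separators("Python", "LSAMon")). The joining step alone can be sketched as:

```python
# Assumed separator for Python/LSAMon pipelines; the actual value comes
# from dsl_workflow_separators("Python", "LSAMon").
assumed_sep = "\n."

# Per-command translations, as an LLM chain might return them.
steps = [
    "LatentSemanticAnalyzer(aAbstracts)",
    "make_document_term_matrix(stemming_rules=None)",
    "extract_topics(40, method='NNMF')",
]

pipeline_code = assumed_sep.join(step.strip() for step in steps)
print(pipeline_code)
```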

Simulated Translation With a Fake LLM

For testing purposes it can be useful to use a fake LLM, so that the notebook runs without model setup or API keys.

try:
    from langchain_community.llms.fake import FakeListLLM
except Exception:
    from langchain_core.language_models.fake import FakeListLLM
commands = [
    "use the dataset aAbstracts",
    "make the document-term matrix without stemming",
    "extract 40 topics using the method non-negative matrix factorization",
    "show the topics",
]
# Fake responses to demonstrate the flow
fake_responses = [
    'lsamon = lsamon_use_dataset("aAbstracts")',
    'lsamon = lsamon_make_document_term_matrix(stemming=False)',
    'lsamon = lsamon_extract_topics(method="NMF", n_topics=40)',
    'lsamon_show_topics(lsamon)',
]
llm = FakeListLLM(responses=fake_responses)
# Create a simple chain by piping the prompt into the LLM
chain = few_shot_prompt | llm
for command in commands:
    result = chain.invoke({"command": command})
    print("Command:", command)
    print("Code:", result)
    print("-")
# Command: use the dataset aAbstracts
# Code: lsamon = lsamon_use_dataset("aAbstracts")
# -
# Command: make the document-term matrix without stemming
# Code: lsamon = lsamon_make_document_term_matrix(stemming=False)
# -
# Command: extract 40 topics using the method non-negative matrix factorization
# Code: lsamon = lsamon_extract_topics(method="NMF", n_topics=40)
# -
# Command: show the topics
# Code: lsamon_show_topics(lsamon)
# -

References

[AAp1] Anton Antonov, DSLExamples, Python package, (2026), GitHub/antononcube.

[AAp2] Anton Antonov, DSL::Examples, Raku package, (2025-2026), GitHub/antononcube.

[AAp3] Anton Antonov, DSLExamples, Wolfram Language paclet, (2025-2026), Wolfram Language Paclet Repository.

[AAp4] Anton Antonov, DSL::Translators, Raku package, (2020-2024), GitHub/antononcube.