Introduction
This blog post (notebook) demonstrates the usage of the Python data package “DSLExamples”, [AAp1], with examples of Domain Specific Language (DSL) command translations to programming code.
The provided DSL examples are suitable for few-shot LLM prompting. LangChain can be used to create translation pipelines that use those examples; such LLM-translation pipelines are demonstrated below.
The Python package closely follows the Raku package “DSL::Examples”, [AAp2], and the Wolfram Language paclet “DSLExamples”, [AAp3], and has (or should have) the same DSL example data.
Remark: Similar translations can be achieved with grammar-based DSL translators, using far fewer computational resources; see “DSL::Translators”, [AAp4].
Setup
Load the packages used below:
```python
from DSLExamples import dsl_examples, dsl_workflow_separators
from langchain_core.output_parsers import StrOutputParser
from langchain_ollama import ChatOllama
import pandas as pd
import os
```
Retrieval
Get all examples and retrieve specific language/workflow slices.
```python
all_examples = dsl_examples()
python_lsa = dsl_examples("Python", "LSAMon")
separators = dsl_workflow_separators("WL", "LSAMon")
list(all_examples.keys()), list(python_lsa.keys())[:5]
```
```
# (['WL', 'Python', 'R', 'Raku'], ['load the package', 'use the documents aDocs', 'use dfTemp', 'make the document-term matrix', 'make the document-term matrix with automatic stop words'])
```
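Each language/workflow slice returned by `dsl_examples` behaves like a plain dictionary mapping a DSL command to its code translation, so individual examples can be looked up directly; a minimal check using one of the keys listed above:

```python
# Look up the code translation of a single DSL command
python_lsa["make the document-term matrix"]
# 'make_document_term_matrix()'
```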
Tabulate Languages and Workflows
```python
rows = [
    {"language": lang, "workflow": workflow}
    for lang, workflows in all_examples.items()
    for workflow in workflows.keys()
]
pd.DataFrame(rows).sort_values(["language", "workflow"]).reset_index(drop=True)
```
| language | workflow |
|---|---|
| Python | LSAMon |
| Python | QRMon |
| Python | SMRMon |
| Python | pandas |
| R | DataReshaping |
| R | LSAMon |
| R | QRMon |
| R | SMRMon |
| Raku | DataReshaping |
| Raku | SMRMon |
| Raku | TriesWithFrequencies |
| WL | ClCon |
| WL | DataReshaping |
| WL | LSAMon |
| WL | QRMon |
| WL | SMRMon |
| WL | Tabular |
| WL | TriesWithFrequencies |
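The same nested structure can be used to see how many examples each workflow provides; here is a small variation of the tabulation above (the counts themselves depend on the package version):

```python
# Count the command-to-code examples per (language, workflow) pair
counts = [
    {"language": lang, "workflow": wf, "examples": len(cmds)}
    for lang, workflows in all_examples.items()
    for wf, cmds in workflows.items()
]
pd.DataFrame(counts).sort_values(["language", "workflow"]).reset_index(drop=True)
```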
Python LSA Examples
```python
pd.DataFrame([{"command": k, "code": v} for k, v in python_lsa.items()])
```
| command | code |
|---|---|
| load the package | from LatentSemanticAnalyzer import * |
| use the documents aDocs | LatentSemanticAnalyzer(aDocs) |
| use dfTemp | LatentSemanticAnalyzer(dfTemp) |
| make the document-term matrix | make_document_term_matrix() |
| make the document-term matrix with automatic s… | make_document_term_matrix[stemming_rules=None,… |
| make the document-term matrix without stemming | make_document_term_matrix[stemming_rules=False… |
| apply term weight functions | apply_term_weight_functions() |
| apply term weight functions: global IDF, local… | apply_term_weight_functions(global_weight_func… |
| extract 30 topics using the method SVD | extract_topics(number_of_topics=24, method='SVD') |
| extract 24 topics using the method NNMF, max s… | extract_topics(number_of_topics=24, min_number… |
| Echo topics table | echo_topics_interpretation(wide_form=True) |
| show the topics | echo_topics_interpretation(wide_form=True) |
| Echo topics table with 10 terms per topic | echo_topics_interpretation(number_of_terms=10,… |
| find the statistical thesaurus for the words n… | echo_statistical_thesaurus(terms=stemmerObj.st… |
| show statistical thesaurus for king, castle, p… | echo_statistical_thesaurus(terms=stemmerObj.st… |
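Since the examples are plain command-to-code pairs, a small pipeline can also be assembled without an LLM by joining selected code snippets with the corresponding workflow separator. The sketch below assumes the value returned by `dsl_workflow_separators` is a string, as it is used in the translation section further down:

```python
# Assemble a small LSA pipeline directly from the example data,
# joining the code snippets with the Python/LSAMon workflow separator
sep_py = dsl_workflow_separators("Python", "LSAMon")
steps = [
    python_lsa["use the documents aDocs"],
    python_lsa["make the document-term matrix"],
    python_lsa["apply term weight functions"],
]
print(sep_py.join(steps))
```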
LangChain Few-Shot Prompt
Build a few-shot prompt from the DSL examples, then run it over commands.
```python
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

# Use a small subset of examples as few-shot demonstrations
example_pairs = list(python_lsa.items())[:5]
examples = [{"command": cmd, "code": code} for cmd, code in example_pairs]

example_prompt = PromptTemplate(
    input_variables=["command", "code"],
    template="Command: {command}\nCode: {code}"
)

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=(
        "You translate DSL commands into Python code that builds an LSA pipeline. "
        "Follow the style of the examples."
    ),
    suffix="Command: {command}\nCode:",
    input_variables=["command"],
)

print(few_shot_prompt.format(command="show the topics"))
```
```
# You translate DSL commands into Python code that builds an LSA pipeline. Follow the style of the examples.
#
# Command: load the package
# Code: from LatentSemanticAnalyzer import *
#
# Command: use the documents aDocs
# Code: LatentSemanticAnalyzer(aDocs)
#
# Command: use dfTemp
# Code: LatentSemanticAnalyzer(dfTemp)
#
# Command: make the document-term matrix
# Code: make_document_term_matrix()
#
# Command: make the document-term matrix with automatic stop words
# Code: make_document_term_matrix[stemming_rules=None,stopWords=True)
#
# Command: show the topics
# Code:
```
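The same construction works for any language/workflow slice tabulated above. For instance, here is a sketch of an analogous prompt over the Python QRMon examples; only the slice name comes from the package, and the prefix wording is an assumption:

```python
# Build an analogous few-shot prompt for another workflow slice (Python / QRMon)
python_qr = dsl_examples("Python", "QRMon")
qr_examples = [{"command": cmd, "code": code} for cmd, code in list(python_qr.items())[:5]]

qr_prompt = FewShotPromptTemplate(
    examples=qr_examples,
    example_prompt=example_prompt,
    prefix=(
        "You translate DSL commands into Python code that builds a quantile regression pipeline. "
        "Follow the style of the examples."
    ),
    suffix="Command: {command}\nCode:",
    input_variables=["command"],
)
```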
Translation With Ollama Model
Run the few-shot prompt against a local Ollama model.
```python
llm = ChatOllama(model=os.getenv("OLLAMA_MODEL", "gemma3:12b"))

commands = [
    "use the dataset aAbstracts",
    "make the document-term matrix without stemming",
    "extract 40 topics using the method non-negative matrix factorization",
    "show the topics",
]

chain = few_shot_prompt | llm | StrOutputParser()
sep = dsl_workflow_separators("Python", "LSAMon")

result = []
for command in commands:
    result.append(chain.invoke({"command": command}))

print(sep.join([x.strip() for x in result]))
```
```
# LatentSemanticAnalyzer(aAbstracts)
# .make_document_term_matrix(stemming_rules=None)
# .extract_topics(40, method='non-negative_matrix_factorization')
# .show_topics()
```
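Chat models occasionally wrap their answers in Markdown code fences; a small post-processing step keeps the joined pipeline clean. Here `strip_code_fences` is a hypothetical helper written for this notebook, not part of the packages used above:

```python
def strip_code_fences(text: str) -> str:
    # Drop Markdown code-fence lines, if the model added any, and trim whitespace
    lines = [ln for ln in text.strip().splitlines() if not ln.strip().startswith("```")]
    return "\n".join(lines).strip()

print(sep.join(strip_code_fences(x) for x in result))
```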
Simulated Translation With a Fake LLM
For testing purposes it is convenient to use a fake LLM, so that the notebook runs without a local model setup or API keys.
```python
try:
    from langchain_community.llms.fake import FakeListLLM
except Exception:
    from langchain_core.language_models.fake import FakeListLLM

commands = [
    "use the dataset aAbstracts",
    "make the document-term matrix without stemming",
    "extract 40 topics using the method non-negative matrix factorization",
    "show the topics",
]

# Fake responses to demonstrate the flow
fake_responses = [
    "lsamon = lsamon_use_dataset(\"aAbstracts\")",
    "lsamon = lsamon_make_document_term_matrix(stemming=False)",
    "lsamon = lsamon_extract_topics(method=\"NMF\", n_topics=40)",
    "lsamon_show_topics(lsamon)",
]

llm = FakeListLLM(responses=fake_responses)

# Create a simple chain by piping the prompt into the LLM
chain = few_shot_prompt | llm

for command in commands:
    result = chain.invoke({"command": command})
    print("Command:", command)
    print("Code:", result)
    print("-")
```
```
# Command: use the dataset aAbstracts
# Code: lsamon = lsamon_use_dataset("aAbstracts")
# -
# Command: make the document-term matrix without stemming
# Code: lsamon = lsamon_make_document_term_matrix(stemming=False)
# -
# Command: extract 40 topics using the method non-negative matrix factorization
# Code: lsamon = lsamon_extract_topics(method="NMF", n_topics=40)
# -
# Command: show the topics
# Code: lsamon_show_topics(lsamon)
# -
```
References
[AAp1] Anton Antonov, DSLExamples, Python package, (2026), GitHub/antononcube.
[AAp2] Anton Antonov, DSL::Examples, Raku package, (2025-2026), GitHub/antononcube.
[AAp3] Anton Antonov, DSLExamples, Wolfram Language paclet, (2025-2026), Wolfram Language Paclet Repository.
[AAp4] Anton Antonov, DSL::Translators, Raku package, (2020-2024), GitHub/antononcube.
