DSL examples with LangChain

Introduction

This blog post (notebook) demonstrates the usage of the Python data package “DSLExamples”, [AAp1], with examples of translating Domain Specific Language (DSL) commands into programming code.

The provided DSL examples are suitable for LLM few-shot training. LangChain can be used to create translation pipelines that utilize those examples; such LLM-translation pipelines are exemplified below.

The Python package closely follows the Raku package “DSL::Examples”, [AAp2], and the Wolfram Language paclet “DSLExamples”, [AAp3], and has (or should have) the same DSL examples data.

Remark: Similar translations — with far fewer computational resources — are achieved with grammar-based DSL translators; see “DSL::Translators”, [AAp4].


Setup

Load the packages used below:

from DSLExamples import dsl_examples, dsl_workflow_separators
from langchain_core.output_parsers import StrOutputParser
from langchain_ollama import ChatOllama
import pandas as pd
import os

Retrieval

Get all examples and retrieve specific language/workflow slices.

all_examples = dsl_examples()
python_lsa = dsl_examples("Python", "LSAMon")
separators = dsl_workflow_separators("WL", "LSAMon")
list(all_examples.keys()), list(python_lsa.keys())[:5]
# (['WL', 'Python', 'R', 'Raku'],
#  ['load the package',
#   'use the documents aDocs',
#   'use dfTemp',
#   'make the document-term matrix',
#   'make the document-term matrix with automatic stop words'])
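The examples are organized as a nested dictionary: language → workflow → {command: code}. The retrieval above can be sketched with plain Python over a miniature, made-up data sample (the commands and code below are illustrative, not the actual package contents):

```python
# Hypothetical miniature of the DSLExamples data layout:
# language -> workflow -> {command: code}
examples = {
    "Python": {
        "LSAMon": {
            "load the package": "from LatentSemanticAnalyzer import *",
            "make the document-term matrix": "make_document_term_matrix()",
        }
    },
    "WL": {"LSAMon": {"load the package": 'Needs["AntonAntonov`MonadicLatentSemanticAnalysis`"]'}},
}

def get_examples(data, language=None, workflow=None):
    """Return the whole dictionary, a language slice, or a workflow slice."""
    if language is None:
        return data
    if workflow is None:
        return data[language]
    return data[language][workflow]

python_lsa_demo = get_examples(examples, "Python", "LSAMon")
print(list(python_lsa_demo.keys()))
```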

Tabulate Languages and Workflows

rows = [
    {"language": lang, "workflow": workflow}
    for lang, workflows in all_examples.items()
    for workflow in workflows.keys()
]
pd.DataFrame(rows).sort_values(["language", "workflow"]).reset_index(drop=True)
language  workflow
Python    LSAMon
Python    QRMon
Python    SMRMon
Python    pandas
R         DataReshaping
R         LSAMon
R         QRMon
R         SMRMon
Raku      DataReshaping
Raku      SMRMon
Raku      TriesWithFrequencies
WL        ClCon
WL        DataReshaping
WL        LSAMon
WL        QRMon
WL        SMRMon
WL        Tabular
WL        TriesWithFrequencies

Python LSA Examples

pd.DataFrame([{"command": k, "code": v} for k, v in python_lsa.items()])
command                                            | code
load the package                                   | from LatentSemanticAnalyzer import *
use the documents aDocs                            | LatentSemanticAnalyzer(aDocs)
use dfTemp                                         | LatentSemanticAnalyzer(dfTemp)
make the document-term matrix                      | make_document_term_matrix()
make the document-term matrix with automatic s…    | make_document_term_matrix[stemming_rules=None,…
make the document-term matrix without stemming     | make_document_term_matrix[stemming_rules=False…
apply term weight functions                        | apply_term_weight_functions()
apply term weight functions: global IDF, local…    | apply_term_weight_functions(global_weight_func…
extract 30 topics using the method SVD             | extract_topics(number_of_topics=24, method='SVD')
extract 24 topics using the method NNMF, max s…    | extract_topics(number_of_topics=24, min_number…
Echo topics table                                  | echo_topics_interpretation(wide_form=True)
show the topics                                    | echo_topics_interpretation(wide_form=True)
Echo topics table with 10 terms per topic          | echo_topics_interpretation(number_of_terms=10,…
find the statistical thesaurus for the words n…    | echo_statistical_thesaurus(terms=stemmerObj.st…
show statistical thesaurus for king, castle, p…    | echo_statistical_thesaurus(terms=stemmerObj.st…

LangChain Few-Shot Prompt

Build a few-shot prompt from the DSL examples, then run it over commands.

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate
# Use a small subset of examples as few-shot demonstrations
example_pairs = list(python_lsa.items())[:5]
examples = [
    {"command": cmd, "code": code}
    for cmd, code in example_pairs
]
example_prompt = PromptTemplate(
    input_variables=["command", "code"],
    template="Command: {command}\nCode: {code}",
)
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=(
        "You translate DSL commands into Python code that builds an LSA pipeline. "
        "Follow the style of the examples."
    ),
    suffix="Command: {command}\nCode:",
    input_variables=["command"],
)
print(few_shot_prompt.format(command="show the topics"))
# You translate DSL commands into Python code that builds an LSA pipeline. Follow the style of the examples.
#
# Command: load the package
# Code: from LatentSemanticAnalyzer import *
#
# Command: use the documents aDocs
# Code: LatentSemanticAnalyzer(aDocs)
#
# Command: use dfTemp
# Code: LatentSemanticAnalyzer(dfTemp)
#
# Command: make the document-term matrix
# Code: make_document_term_matrix()
#
# Command: make the document-term matrix with automatic stop words
# Code: make_document_term_matrix[stemming_rules=None,stopWords=True)
#
# Command: show the topics
# Code:
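Conceptually, FewShotPromptTemplate just joins the prefix, the formatted example pairs, and the suffix into one string. A rough stdlib-only sketch of that behavior (an illustration, not LangChain's actual implementation):

```python
def format_few_shot(prefix, examples, suffix, command, example_separator="\n\n"):
    """Join a prefix, formatted (command, code) demonstrations, and a suffix
    into a single few-shot prompt string."""
    demos = [f"Command: {cmd}\nCode: {code}" for cmd, code in examples]
    parts = [prefix] + demos + [suffix.format(command=command)]
    return example_separator.join(parts)

prompt = format_few_shot(
    "You translate DSL commands into Python code that builds an LSA pipeline.",
    [("load the package", "from LatentSemanticAnalyzer import *")],
    "Command: {command}\nCode:",
    "show the topics",
)
print(prompt)
```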

Translation With Ollama Model

Run the few-shot prompt against a local Ollama model.

llm = ChatOllama(model=os.getenv("OLLAMA_MODEL", "gemma3:12b"))
commands = [
    "use the dataset aAbstracts",
    "make the document-term matrix without stemming",
    "extract 40 topics using the method non-negative matrix factorization",
    "show the topics",
]
chain = few_shot_prompt | llm | StrOutputParser()
sep = dsl_workflow_separators("Python", "LSAMon")
result = []
for command in commands:
    result.append(chain.invoke({"command": command}))
print(sep.join([x.strip() for x in result]))
# LatentSemanticAnalyzer(aAbstracts)
# .make_document_term_matrix(stemming_rules=None)
# .extract_topics(40, method='non-negative_matrix_factorization')
# .show_topics()
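The workflow separator is what turns the per-command translations into a single fluent pipeline. Judging by the output above, the Python/LSAMon separator appears to be a newline followed by a dot (an assumption here; verify with dsl_workflow_separators("Python", "LSAMon")). The joining step alone can be sketched as:

```python
# Assumed separator for Python/LSAMon pipelines; the actual value comes
# from dsl_workflow_separators("Python", "LSAMon").
assumed_sep = "\n."

# Per-command translations, as an LLM chain might return them.
steps = [
    "LatentSemanticAnalyzer(aAbstracts)",
    "make_document_term_matrix(stemming_rules=None)",
    "extract_topics(40, method='NNMF')",
]

pipeline_code = assumed_sep.join(step.strip() for step in steps)
print(pipeline_code)
```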

Simulated Translation With a Fake LLM

For testing purposes it can be useful to use a fake LLM, so that the notebook runs without model setup or API keys.

try:
    from langchain_community.llms.fake import FakeListLLM
except Exception:
    from langchain_core.language_models.fake import FakeListLLM
commands = [
    "use the dataset aAbstracts",
    "make the document-term matrix without stemming",
    "extract 40 topics using the method non-negative matrix factorization",
    "show the topics",
]
# Fake responses to demonstrate the flow
fake_responses = [
    'lsamon = lsamon_use_dataset("aAbstracts")',
    'lsamon = lsamon_make_document_term_matrix(stemming=False)',
    'lsamon = lsamon_extract_topics(method="NMF", n_topics=40)',
    'lsamon_show_topics(lsamon)',
]
llm = FakeListLLM(responses=fake_responses)
# Create a simple chain by piping the prompt into the LLM
chain = few_shot_prompt | llm
for command in commands:
    result = chain.invoke({"command": command})
    print("Command:", command)
    print("Code:", result)
    print("-")
# Command: use the dataset aAbstracts
# Code: lsamon = lsamon_use_dataset("aAbstracts")
# -
# Command: make the document-term matrix without stemming
# Code: lsamon = lsamon_make_document_term_matrix(stemming=False)
# -
# Command: extract 40 topics using the method non-negative matrix factorization
# Code: lsamon = lsamon_extract_topics(method="NMF", n_topics=40)
# -
# Command: show the topics
# Code: lsamon_show_topics(lsamon)
# -

References

[AAp1] Anton Antonov, DSLExamples, Python package, (2026), GitHub/antononcube.

[AAp2] Anton Antonov, DSL::Examples, Raku package, (2025-2026), GitHub/antononcube.

[AAp3] Anton Antonov, DSLExamples, Wolfram Language paclet, (2025-2026), Wolfram Language Paclet Repository.

[AAp4] Anton Antonov, DSL::Translators, Raku package, (2020-2024), GitHub/antononcube.