Is the chat-and-paste workflow bad?

No, it often works. The argument for Savanty is not that the LLM is smarter inside a pipeline — it is the same class of model — but that the surrounding scaffolding turns the model's failures into typed events the system can act on, without a clipboard.

When is chat-and-paste fine?

When you only need to solve the problem once, by hand, today; when you are comfortable reading OR-Tools Python and verifying the constraints by eye; or when you specifically want CP-SAT or another specialised OR-Tools solver.

Savanty vs ChatGPT writing OR-Tools code

Both approaches use a large language model. Only one of them closes the loop with a solver. That is the whole comparison.

A common workflow today: paste your scheduling problem into a chat window, ask for OR-Tools code, copy it out, run it, paste the error back if it fails, repeat. It works surprisingly often, because the LLM is competent at textbook problem shapes. What it does not do is sit inside a structured feedback loop with the solver. When the generated model is infeasible, you usually cannot tell why: the script exits cleanly and prints "no solution found" without saying which constraint is at fault. The loop is in your head.

Savanty makes that loop a first-class component. The same translation step happens, but the target is ASP, and the runtime classifies solver failures into three types, computes a minimal unsatisfiable core for the unsat case, and routes typed diagnostics back through a dedicated repair signature. No clipboard.
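
As a rough illustration of what "typed diagnostics" could look like, the sketch below classifies a solver message and picks a repair instruction per type. It is not Savanty's code: FailureKind, Diagnostic, classify and repair_prompt are hypothetical names, the string matching stands in for real log parsing, and the third failure type is left as a placeholder because it is not named here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class FailureKind(Enum):
    SYNTAX_ERROR = "syntax_error"  # name taken from the comparison table
    UNSAT = "unsat"                # the "unsat case" described above
    OTHER = "other"                # placeholder: the third type is an assumption


@dataclass(frozen=True)
class Diagnostic:
    kind: FailureKind
    detail: str  # parse error text, unsat core, or raw solver message


def classify(solver_output: str) -> Diagnostic:
    """Toy classifier over a hypothetical solver log; real rules will differ."""
    text = solver_output.strip()
    lowered = text.lower()
    if "syntax error" in lowered:
        return Diagnostic(FailureKind.SYNTAX_ERROR, text)
    if "unsatisfiable" in lowered:
        return Diagnostic(FailureKind.UNSAT, text)
    return Diagnostic(FailureKind.OTHER, text)


def repair_prompt(diag: Diagnostic) -> str:
    """Pick a repair instruction based on the failure type."""
    templates: Dict[FailureKind, str] = {
        FailureKind.SYNTAX_ERROR: (
            "Fix only the syntax error below; keep all other rules unchanged."
        ),
        FailureKind.UNSAT: (
            "These constraints conflict. Decide whether the encoding is "
            "over-constrained or the problem is genuinely infeasible."
        ),
        FailureKind.OTHER: (
            "The solver failed for another reason. Explain the cause before "
            "changing the encoding."
        ),
    }
    return f"{templates[diag.kind]}\n\n{diag.detail}"
```

The point of the enum is that the caller branches on a type rather than on a pasted stack trace, so each failure can get its own narrowly scoped repair instruction.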

Axis	Savanty	ChatGPT emitting OR-Tools
Where the LLM lives	Inside a programmatic pipeline (DSPy Predict modules with typed signatures).	In a chat window. The output is code you paste into a separate runtime.
What the LLM produces	Strict JSON: { facts, rules, optimize } over a canonical assign(Var, Value) contract.	Free-form Python source — variable types, solver setup, and result extraction all decided by the model.
Solver	Clingo (ASP). Selected once, fixed.	Usually CP-SAT, sometimes the linear solver or routing library — the model picks, and sometimes mixes them up.
Failure — syntax error	Typed syntax_error classification; the parse error is fed to a dedicated repair signature.	You copy the Python stack trace back to the chat. The model usually fixes it, but may rewrite unrelated parts.
Failure — infeasible model	Minimal unsatisfiable core extracted by deletion filtering; the LLM is asked to choose "over-constrained" vs "genuinely infeasible".	Script exits INFEASIBLE. The model gets no help locating the wrong constraint unless you wrote assumption tracking.
Failure — silently wrong model	The encoding is returned next to the solution (result.asp_code); the canonical contract is easy to skim.	You must read the generated Python and confirm every constraint matches intent. Easy to miss an omission.
Suitability check	First call in the pipeline; redirects continuous/statistical problems with a suggested_tool.	None. The model will happily write OR-Tools code for a problem that does not fit it.
Clarifying questions	A dedicated gap-identification step surfaces missing entities before code is generated.	Depends on model and prompt; often the model just guesses defaults and proceeds.
Determinism / reproducibility	LLM temperature defaults to 0.0; the canonical contract limits output variance.	Up to you. Default chat settings are not deterministic.
Where the solver runs	In-process; pip install savanty brings Clingo via the clingo package.	Wherever you paste the code; OR-Tools is a separate install the model often forgets to mention.
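
The "deletion filtering" mentioned in the infeasible-model row can be sketched in a few lines: drop one constraint at a time, and keep it out only if the remainder is still unsatisfiable. The example below is a minimal, solver-independent sketch, assuming a brute-force satisfiability check over a tiny integer domain; satisfiable, minimal_unsat_core and the constraint names are hypothetical, and a real pipeline would call the solver instead of enumerating assignments.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def satisfiable(
    constraints: Sequence[Constraint],
    variables: Sequence[str],
    domain: Sequence[int],
) -> bool:
    """Brute-force check: does any assignment satisfy every constraint?"""
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(check(assignment) for check in constraints):
            return True
    return False


def minimal_unsat_core(
    named: Dict[str, Constraint],
    variables: Sequence[str],
    domain: Sequence[int],
) -> List[str]:
    """Deletion filtering: returns a subset that is unsatisfiable, and
    becomes satisfiable if any single member is removed."""
    core = list(named)
    if satisfiable([named[n] for n in core], variables, domain):
        raise ValueError("constraint set is satisfiable; there is no core")
    for name in list(core):
        trial = [n for n in core if n != name]
        # If the set is still unsatisfiable without `name`, it is not needed.
        if not satisfiable([named[n] for n in trial], variables, domain):
            core = trial
    return core


constraints: Dict[str, Constraint] = {
    "x_gt_y": lambda a: a["x"] > a["y"],
    "y_gt_x": lambda a: a["y"] > a["x"],
    "x_nonneg": lambda a: a["x"] >= 0,
    "y_le_3": lambda a: a["y"] <= 3,
}
core = minimal_unsat_core(constraints, ["x", "y"], range(4))
print(core)  # ['x_gt_y', 'y_gt_x']
```

Deletion filtering needs one satisfiability check per constraint, and the core it returns is minimal but not necessarily the smallest possible one; which core you get depends on the order in which constraints are tried.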

When chat-and-paste is fine

You only need to solve this problem once, by hand, today.
You are comfortable reading OR-Tools Python and verifying the constraints by eye.
You want CP-SAT specifically, or one of OR-Tools' specialised solvers.

When the closed-loop pipeline wins

You are calling the solver from inside a larger application and need a stable interface.
You want failure modes diagnosable without manual stack-trace pasting.
You care about faithful infeasibility — telling a user "your constraints have no solution" with confidence.
The problem is formulated by someone who cannot read or audit Python OR-Tools code.
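
A stable interface usually means checking the model's output before it reaches the solver and reading the answer back in one fixed shape. The sketch below shows one way that could look for the { facts, rules, optimize } payload and assign(Var, Value) atoms described in the table. The function names and the atom regex are assumptions for illustration, not Savanty's actual schema or API, and the types of the three fields are deliberately left unchecked because they are not specified here.

```python
import json
import re
from typing import Any, Dict, List

REQUIRED_KEYS = {"facts", "rules", "optimize"}

# Matches simple atoms such as assign(alice, monday). Nested terms are out of
# scope for this sketch.
ASSIGN_ATOM = re.compile(r"^assign\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)$")


def parse_payload(raw: str) -> Dict[str, Any]:
    """Reject anything that is not a JSON object with exactly the three keys."""
    payload = json.loads(raw)  # raises a ValueError subclass on bad JSON
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    keys = set(payload)
    if keys != REQUIRED_KEYS:
        missing = sorted(REQUIRED_KEYS - keys)
        extra = sorted(keys - REQUIRED_KEYS)
        raise ValueError(f"bad keys: missing={missing}, unexpected={extra}")
    return payload


def parse_assignments(atoms: List[str]) -> Dict[str, str]:
    """Turn assign(Var, Value) atoms into a mapping, one value per variable."""
    result: Dict[str, str] = {}
    for atom in atoms:
        match = ASSIGN_ATOM.match(atom.strip())
        if match is None:
            raise ValueError(f"not an assign/2 atom: {atom!r}")
        var, value = match.groups()
        if var in result:
            raise ValueError(f"variable assigned twice: {var}")
        result[var] = value
    return result
```

Because every solution comes back as the same kind of mapping, the calling application never has to parse free-form output or guess how results were extracted.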

The chat workflow is not silly; it often works. The argument for Savanty is that the surrounding scaffolding turns the model's failures into typed events the system can act on, reproducibly, instead of a loop that runs in your head and your terminal history.