Savanty vs. ChatGPT writing OR-Tools code
Both approaches use a large language model. Only one of them closes the loop with a solver. That is the whole comparison.
The frame
A very common workflow today: paste your scheduling or allocation problem into a chat window, ask the assistant to write OR-Tools (or PuLP, or python-mip) code that solves it, copy the code out, run it locally, paste the output back if it errors. Repeat until something runs.
This works surprisingly often. The LLM is genuinely competent at constructing OR-Tools models for textbook problem shapes (knapsack, n-queens, simple shift scheduling). What it does not do is sit inside a structured feedback loop with the solver. When the generated code errors, you paste the error back. When the generated code encodes an infeasible model, you usually cannot tell why, because the script exits cleanly and prints “no solution found” without telling you which of your constraints was the problem. The loop is in your head and in your terminal history.
Savanty makes that loop a first-class component. The same translation step happens (English to formal code) but the target is ASP rather than OR-Tools, and the runtime classifies solver failures into three types, computes a minimal unsatisfiable core for the unsat case, and routes typed diagnostics back to the LLM through a separate ASPRepair signature.
No clipboard.
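To make "minimal unsatisfiable core" concrete, here is a small sketch of deletion filtering in plain Python. It is not Savanty's implementation: the brute-force `satisfiable` function stands in for a real solver call, and the constraint names and domains are made up for illustration. The idea is to drop one constraint at a time and keep it out only if the remainder is still unsatisfiable.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def satisfiable(constraints: Sequence[Constraint], domains: Dict[str, range]) -> bool:
    """Brute-force stand-in for a solver call: try every assignment."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        candidate = dict(zip(names, values))
        if all(check(candidate) for check in constraints):
            return True
    return False


def deletion_filter(named: Dict[str, Constraint], domains: Dict[str, range]) -> List[str]:
    """Return the names of one minimal unsatisfiable subset of `named`."""
    core = list(named)
    if satisfiable([named[n] for n in core], domains):
        return []  # the full set is satisfiable, so there is nothing to explain
    for name in list(core):
        trial = [n for n in core if n != name]
        if not satisfiable([named[n] for n in trial], domains):
            core = trial  # still unsat without `name`, so it is not part of the core
    return core


# Hypothetical toy problem: two of the four constraints contradict each other.
domains = {"x": range(0, 6), "y": range(0, 6)}
constraints = {
    "x_at_least_4": lambda a: a["x"] >= 4,
    "x_at_most_2": lambda a: a["x"] <= 2,
    "y_is_even": lambda a: a["y"] % 2 == 0,
    "sum_at_most_8": lambda a: a["x"] + a["y"] <= 8,
}

core = deletion_filter(constraints, domains)
print(core)  # ['x_at_least_4', 'x_at_most_2']
```

The filter needs one solver call per constraint, and what it returns is a short list of named constraints rather than a bare "no solution found". That list is the kind of diagnostic that can be handed back to a repair step.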
Side by side
| Axis | Savanty | ChatGPT emitting OR-Tools |
|---|---|---|
| Where the LLM lives | Inside a programmatic pipeline (DSPy Predict modules with typed signatures). | In a chat window. The output is code you paste into a separate runtime. |
| What the LLM produces | STRICT JSON: {facts, rules, optimize} over a canonical assign(Var, Value) contract. | Free-form Python source code — variable types, solver setup, constraint construction, and result extraction all decided by the model. |
| Solver | Clingo (ASP). Selected once, fixed. | Usually CP-SAT, sometimes linear_solver, sometimes the routing library, sometimes mixed up between them. The model picks. |
| Failure handling — syntax error | Typed syntax_error classification; the parse error is fed back to a dedicated repair signature. | You copy the Python stack trace back to the chat. The model usually fixes it but may also rewrite unrelated parts of the script. |
| Failure handling — infeasible model | Minimal unsatisfiable core extracted by deletion filtering; the LLM is asked to choose between “I over-constrained” and “the problem is genuinely infeasible”. | Script exits with “INFEASIBLE” or equivalent. The model gets no help locating which constraint is wrong unless you wrote assumption tracking by hand. |
| Failure handling — silently wrong model | The encoding is returned next to the solution (result.asp_code) for inspection. The canonical contract makes constraints unusually easy to skim. | You have to read the generated Python and confirm every constraint matches your intent. Easy to miss an omission. |
| Suitability check | First call in the pipeline. If the problem looks continuous or statistical, returns not_suitable=True with suggested_tool set to scipy, cvxpy, sklearn, or pandas. | None. The model will happily write OR-Tools code for a problem that doesn't fit OR-Tools. |
| Clarifying questions | A dedicated gap-identification step surfaces missing entities and counts as a list of questions before code is generated; you reply via additional_info. | Depends on the model and the prompt. Often the model just guesses defaults and proceeds. |
| Programmatic invocation | Library call: solve_optimization_problem(...). CLI. REST API at /solve. | API calls to the LLM provider followed by code execution, written by you. |
| Determinism / reproducibility | LLM temperature defaults to 0.0; the canonical contract limits variance in output structure. | Up to you. Default chat settings are not deterministic. |
| Where the solver runs | In-process. pip install savanty brings Clingo via the clingo Python package. | Wherever you paste the code. OR-Tools is a separate pip install the model often forgets to mention. |
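The "What the LLM produces" row is easier to picture with an example. The sketch below checks that a model response has exactly the three keys named in the table and then joins them into one ASP program. The helper names (`parse_encoding`, `to_program`), the assumption that each key holds a list of ASP statements as strings, and the sample nurse and shift facts are all illustrative and not taken from Savanty's code.

```python
import json
from typing import Dict, List

REQUIRED_KEYS = {"facts", "rules", "optimize"}


def parse_encoding(raw: str) -> Dict[str, List[str]]:
    """Parse model output and reject anything outside the three-key shape."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != REQUIRED_KEYS:
        raise ValueError(f"expected keys {sorted(REQUIRED_KEYS)}, got {sorted(data)}")
    for key, value in data.items():
        # Assumption: each section is a list of ASP statements as strings.
        if not isinstance(value, list) or not all(isinstance(s, str) for s in value):
            raise ValueError(f"{key!r} must be a list of strings")
    return data


def to_program(encoding: Dict[str, List[str]]) -> str:
    """Concatenate the sections into a single ASP program text."""
    return "\n".join(encoding["facts"] + encoding["rules"] + encoding["optimize"])


# Hypothetical response: every nurse is assigned exactly one shift.
raw = json.dumps({
    "facts": ["nurse(alice).", "nurse(bob).", "shift(mon).", "shift(tue)."],
    "rules": ["1 { assign(N, S) : shift(S) } 1 :- nurse(N)."],
    "optimize": [],
})

program = to_program(parse_encoding(raw))
print(program)
```

Because every decision is expressed through `assign(Var, Value)`, a reviewer has one predicate to look for when skimming the rules, which is harder to arrange when the model emits free-form Python.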
When the chat-and-paste workflow is fine
- You only need to solve this problem once, by hand, today.
- You are comfortable reading OR-Tools Python and verifying the constraints by eye.
- You want CP-SAT specifically, or one of OR-Tools' specialised solvers (routing, linear).
- The cost of pasting between two windows is lower than the cost of a Python dependency.
When the closed-loop pipeline is the right answer
- You are calling the solver from inside a larger application and need a stable interface.
- You want the failure modes to be diagnosable without manual stack-trace pasting.
- You care about faithful infeasibility — being able to tell the user “your constraints have no solution” with confidence instead of “the script said infeasible, maybe try again”.
- The problem is being formulated by someone who cannot read or audit Python OR-Tools code.
- You want to A/B the typed repair loop against the Logic-LM-style baseline that Savanty also ships, because that comparison is what the benchmark harness is for.
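As a rough illustration of what "diagnosable without manual stack-trace pasting" can look like in code, the sketch below runs a bounded solve-and-repair loop over typed failure events. Everything in it is hypothetical: `FailureKind`, `Diagnostic`, `closed_loop`, and the fake solver and repair functions are stand-ins, not Savanty's API. The third failure type is left as a placeholder because it is not named in this comparison.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Tuple


class FailureKind(Enum):
    SYNTAX_ERROR = "syntax_error"  # named in the comparison table
    UNSAT = "unsat"                # the case that comes with an unsat core
    OTHER = "other"                # placeholder: the third type is not named here


@dataclass
class Diagnostic:
    kind: FailureKind
    detail: str


@dataclass
class Outcome:
    solution: Optional[str]
    attempts: int
    history: List[Diagnostic]


SolveFn = Callable[[str], Tuple[Optional[str], Optional[Diagnostic]]]
RepairFn = Callable[[str, Diagnostic], str]


def closed_loop(program: str, solve: SolveFn, repair: RepairFn, max_attempts: int = 3) -> Outcome:
    """Solve, and on failure hand the typed diagnostic to the repair step."""
    history: List[Diagnostic] = []
    for attempt in range(1, max_attempts + 1):
        solution, diagnostic = solve(program)
        if diagnostic is None:
            return Outcome(solution, attempt, history)
        history.append(diagnostic)
        program = repair(program, diagnostic)
    return Outcome(None, max_attempts, history)


def fake_solve(program: str) -> Tuple[Optional[str], Optional[Diagnostic]]:
    """Stand-in solver: fails on a missing '.' and on the word 'conflict'."""
    if "." not in program:
        return None, Diagnostic(FailureKind.SYNTAX_ERROR, "missing terminating '.'")
    if "conflict" in program:
        return None, Diagnostic(FailureKind.UNSAT, "core: conflict")
    return "assign(alice, mon)", None


def fake_repair(program: str, diagnostic: Diagnostic) -> str:
    """Stand-in for the LLM repair call: one fix per failure kind."""
    if diagnostic.kind is FailureKind.SYNTAX_ERROR:
        return program + "."
    if diagnostic.kind is FailureKind.UNSAT:
        return program.replace("conflict ", "")
    return program


outcome = closed_loop("conflict assign(alice, mon)", fake_solve, fake_repair)
print(outcome.attempts, [d.kind.value for d in outcome.history], outcome.solution)
```

The point of the structure is that each failure arrives as a value with a kind and a detail, so the caller can log it, branch on it, or cap retries, instead of reading a terminal.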
The honest part
The chat-and-paste workflow is not silly. It often works. The argument for Savanty is not that the LLM is smarter inside its pipeline than inside a chat window — it is the same class of model — but that the surrounding scaffolding turns the model's failures into typed events that the system can act on. The chat workflow's loop runs in your head. The Savanty loop runs in the runtime, returns structured results, and is reproducible across runs.