In a world increasingly driven by data, relying on intuition alone can be costly—especially in legal services, where decisions involve multiple variables, risks, and tight margins. Quantitative methods help here. The Bayesian model is one such tool: it estimates how likely a hypothesis is to be true as additional evidence arrives. In a law firm’s day-to-day, this means turning uncertainty—about efficiency, cost, timing, user acceptance, and competition—into clear, measurable probabilities that support whether to invest in something new or wait for more data.
In financial automation, a common pain point is bank reconciliation: manual errors, missed deadlines, mismatches between bank statements and internal records, and a lack of immediate visibility to spot discrepancies. ZV Advogados, operating its internal platform ZV Control Master, already faces this. The proposed Reconciliation Pro module automates the task: integrates bank statements into the system, applies intelligent matching rules, flags differences, and shows dashboards that highlight where there is risk or opportunity in real time.
There are case studies of firms that invested in automation and achieved higher accuracy, lower operating costs, and better internal user satisfaction. The difference here is using the Bayesian model as a quantitative lens to evaluate not only the outcome but the path: which pieces of evidence build confidence, which modules have impact, and how much is still missing to reach acceptable safety levels. Instead of blind bets or debates based on opinion, we propose decisions grounded in accumulated evidence.
Objective
This article demonstrates—through a realistic scenario in the context of ZV Advogados Associados—how the Bayesian method can serve as a practical decision instrument to validate the feasibility of Reconciliation Pro – Automated Bank Reconciliation. We show how heterogeneous evidence—research, pilot, marketing, partnerships, competition, seasonality, and macro noise—combines into probabilities; how to define a GO/NO-GO threshold; how to use a target posterior to guide efforts; and even how to distribute investment across the factors with the greatest impact.
By the end, you should not only understand the technique, but also know how to apply it to decide whether, in your firm or project, it is worth moving forward with a new feature or product. Instead of relying only on gut feeling or conflicting opinions, we use a practical Python simulator that integrates research/evidence with probabilities. The premise is simple: start with an initial chance (prior) and update that belief as new evidence arrives. The result is a posterior probability that acts as a scoreboard to guide GO/NO-GO and to prioritize short-term actions. You’ll leave with a ready-to-run script, a seed CSV, and a visual panel to try scenarios, see which factors matter most, and plan evidence collection for the next sprint.
Bayesian technique, variables, and logic

The Bayesian technique starts with an intuitive idea: updating beliefs. We define two hypotheses: H₁ (“the module launch will succeed”) and H₀ (“the launch will fail”). Before looking at data, we set a prior (e.g., 35%). Then we map evidence into easy-to-grasp modules: W (user research), P (pilot/MVP), M (marketing), I (institutional: partnerships/credit lines), A (alternatives/competition), T (timing/seasonality), and N (macro noise). Each module receives a weight called logBF (log Bayes Factor): positive values favor H₁; negative values favor H₀; zero is neutral. Example: a pilot with high conversion and low churn might be +2.0; a dominant competitor might weigh −1.5. Computationally, we add these weights to the logit of the prior and convert back to the posterior probability. Each piece of evidence nudges the balance, enabling transparent, auditable decisions and sensitivity analyses (e.g., what needs to improve to reach an 80% chance of success).
Bayesian model: concept and math
The Bayesian model updates beliefs as new evidence arrives. The core formula is:

P(H₁|E) = P(E|H₁) · P(H₁) / [P(E|H₁) · P(H₁) + P(E|H₀) · P(H₀)]

Where:
- H₁ = success hypothesis (e.g., “Reconciliation Pro will be viable”).
- H₀ = contrary hypothesis.
- E = observed evidence (research, pilot, marketing, etc.).
- P(H₁) = prior probability (before evidence).
- P(E|H₁) = likelihood of seeing the evidence assuming H₁ is true.
- P(H₀) = 1 − P(H₁).
- P(E|H₀) = likelihood of the evidence under H₀.
- P(H₁|E) = the posterior probability (after observing the data).
For decisions involving multiple evidence sources, a convenient parameterization is in log-odds (logit), summing log Bayes Factors:

logit(P(H₁|E)) = logit(P(H₁)) + Σᵢ logBFᵢ

Here, each logBFᵢ is the contribution of module i (W, P, M, I, A, T, N). This lets heterogeneous signals—user satisfaction (W), pilot results (P), marketing performance (M), institutional partnerships (I), competitive pressure (A), seasonality (T), and macro noise (N)—be integrated into one updated probability. Managers can then compare the posterior against a GO threshold or a target posterior, turning fragmented data into clear, defensible decisions.
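To make the arithmetic concrete, here is a minimal dependency-free sketch that reproduces the update with the seed scenario used throughout the article (prior 0.35 and the module weights listed in the scripts below):

import math

prior = 0.35  # initial belief that the launch succeeds
mods = {"W": 0.8, "P": 1.6, "M": 0.5, "I": 0.3, "A": -0.8, "T": 0.2, "N": -0.2}

log_odds = math.log(prior / (1 - prior)) + sum(mods.values())  # logit(prior) + ∑logBF
posterior = 1 / (1 + math.exp(-log_odds))                      # sigmoid back to probability

print(f"∑logBF = {sum(mods.values()):+.3f}")  # +2.400
print(f"posterior = {posterior:.4f}")          # ≈ 0.8558, up from the 0.35 prior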
Why apply Bayes to automated reconciliation in legal ops
In corporate legal work, bank reconciliation directly affects cash flow, commissions, partner payouts, contingency provisioning, and risk visibility. Decisions about integrating banks, standardizing OFX/CSV, training description matching, and automating reconciliation involve uncertainties: customer adoption, implementation effort, legacy friction, and look-alike competitors. The Bayesian method shines here: it accepts imperfect evidence, combines heterogeneous signals, and updates the probability at each sprint. Leaders stop asking “do we launch or not?” and start asking “how close are we to the threshold—and where should we act to cross the line?”. For a law firm building B2B products (SaaS subscriptions, managed services), the approach clarifies priorities: if P (pilot) and A (competition) drive the outcome, the plan becomes pragmatic—improve conversion and differentiation, rather than spending time on low-impact areas.
Application scenario – ZV Advogados and the Reconciliation Pro module

Imagine ZV Advogados Associados, managing contracts and finance via ZV Control Master, assessing the feasibility of launching Reconciliation Pro. The module would automate bank-statement ingestion, apply intelligent alias-based matching, reconcile open installments, and trigger immediate alerts for financial divergences—all integrated into management dashboards. The 6-month success target is clear: ≥150 paying accounts, monthly churn < 4%, and LTV/CAC ≥ 3.
- Sprint 1. Early signals were promising: a 200-lead survey showed 60% acceptance for automation (W = +0.8); the pilot with 30 accounts hit 24% conversion and 4.8% churn (P = +1.2). Marketing met targets with 2.5% CTR and competitive CPL (M = +0.5). A bank partnership was under negotiation (I = +0.3). The Alpha competitor exerted moderate pressure (A = −1.0). The pre-January window helped (T = +0.3), and macro noise was mild (N = −0.2).
- Sprint 2. Results improved: pilot conversion rose to 28% with 4.1% churn (P = +1.8), the bank agreement advanced (I = +0.6), and marketing outperformed (M = +0.6).
- Sprint 3. The pilot consolidated 32% conversion with 3.5% churn (P = +2.2), the partnership was formalized (I = +0.8), competition weakened (A = −0.7), CTR doubled vs. benchmark (M = +0.7), and seasonality stayed positive (T = +0.4).
In the Bayesian analysis, the simulator returned a posterior ≈ 0.86—well above the GO Threshold (0.75) and the Target Posterior (0.80). Even as a fictitious scenario, these are plausible indicators for a law firm that wants to innovate. They demonstrate how the Bayesian method turns fragmented signals into a reliable decision view, validating that investment in Reconciliation Pro is safe, strategic, and advantageous for strengthening the firm’s market position.
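As a cross-check, the sprint-by-sprint posteriors can be recomputed from the seed CSV values (kpi_sprints.csv, generated by the seed script later in this article). Note that a few weights in the seed data differ slightly from the sprint narrative above, so treat these as illustrative figures:

import math

def posterior(prior, logbfs):
    """Posterior from prior plus summed module logBFs (logit update)."""
    lo = math.log(prior / (1 - prior)) + sum(logbfs)
    return 1 / (1 + math.exp(-lo))

# Sprint rows from kpi_sprints.csv, in module order W, P, M, I, A, T, N
sprints = {
    1: [0.8, 1.6, 0.5, 0.3, -0.8, 0.2, -0.2],
    2: [0.9, 1.9, 0.6, 0.4, -0.9, 0.3, -0.2],
    3: [1.0, 2.2, 0.7, 0.5, -0.7, 0.4, -0.3],
}
for s, mods in sprints.items():
    print(f"Sprint {s}: posterior = {posterior(0.35, mods):.3f}")
# Sprint 1: 0.856 | Sprint 2: 0.915 | Sprint 3: 0.960 — confidence rises each round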
The simulator (simple GUI) and the CSV
To make onboarding easy, we provide a Python GUI (Tkinter) that runs locally (e.g., VSCode + venv). Dependencies: matplotlib (required) and pyyaml (optional, for YAML). The CSV is the core: each row is a sprint of a scenario, with columns sprint, name, prior, W, P, M, I, A, T, N. You can type logBFs manually (see the rubric in the README) or load a CSV exported from your KPI tracking. The GUI lets you select the scenario, set the GO threshold, upload files, and export a summary. For beginners, this creates tactile feedback: change values (or update the CSV) and the probability chart reacts; move “P” from +0.8 to +2.0 and the curve climbs immediately—you feel the effect of a strong pilot. The idea is to learn by simulation: test hypotheses, measure sensitivity, prioritize actions, and reduce the risk of “flying blind”.
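For reference, the wide CSV generated by the seed script (shown later) looks like this, with one row per sprint and one column per module:

sprint,name,prior,W,P,M,I,A,T,N
1,Reconciliation Pro,0.35,0.8,1.6,0.5,0.3,-0.8,0.2,-0.2
2,Reconciliation Pro,0.35,0.9,1.9,0.6,0.4,-0.9,0.3,-0.2
3,Reconciliation Pro,0.35,1.0,2.2,0.7,0.5,-0.7,0.4,-0.3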
Python scripts – Bayesian Simulator (Tkinter + CSV)
Requirements: pip install matplotlib pyyaml
- A) Main GUI — bayes_gui_tk_ptbr.py / bayes_gui_tk_eng.py.
Tkinter application to load/edit scenarios (JSON/YAML/CSV), compute the posterior, and visualize module contributions and the prior-vs-posterior “thermometer”.
# ================================================
# Script: Bayesian Scenario Tester (Tkinter GUI) – v3 (EN)
# Author: Izairton Oliveira de Vasconcelos
# Description:
# - Tkinter GUI to load/edit scenarios and compute P(H1|E)
# - Supports: JSON/YAML | CSV "line-by-line" | CSV "wide/sprint"
# - What's new in v3:
# * Sprint ComboBox (typing aligned with Pylance)
# * Sensitivity panel + "Distribute ΔlogBF" button
# * Detailed terminal summary on each Recalculate
# - Charts: bars (modules) and probability "thermometer" (prior vs posterior)
# Requirements: matplotlib (mandatory), pyyaml (optional for YAML)
# Compatible: Python 3.8+
# ================================================
import json, csv, os, math, random
from collections import defaultdict
from typing import Dict, List, Tuple, Any, Optional
# --- Optional YAML support ----------------------------------------------------
# We try to import PyYAML. If not available, we keep running but reject YAML files.
try:
    import yaml  # type: ignore
    HAS_YAML = True
except Exception:
    HAS_YAML = False
# --- Tkinter (GUI) + embedded matplotlib backend ------------------------------
import tkinter as tk
from tkinter import ttk, filedialog, messagebox
import matplotlib
matplotlib.use("TkAgg") # <- embed interactive figures inside Tkinter windows
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
from matplotlib.figure import Figure # <- explicit import silences Pylance warnings
# Canonical module list used across the app (order matters for plotting/layout)
MODULES = ["W","P","M","I","A","T","N"]
# =========================
# Bayes Utility Functions
# =========================
def prob_to_odds(p: float) -> float:
    """Convert probability p to odds p/(1-p). Handles boundary cases explicitly."""
    if p <= 0.0: return 0.0
    if p >= 1.0: return float("inf")
    return p / (1.0 - p)

def odds_to_prob(o: float) -> float:
    """Convert odds o back to probability o/(1+o)."""
    if o == float("inf"): return 1.0
    return o / (1.0 + o)

def logit(p: float) -> float:
    """Log-odds (logit) of a probability p."""
    o = prob_to_odds(p)
    return -float("inf") if o == 0.0 else math.log(o)

def inv_logit(l: float) -> float:
    """Inverse logit (sigmoid) mapping log-odds back to probability."""
    o = math.exp(l) if l > -float("inf") else 0.0
    return odds_to_prob(o)

def logbf_from_likelihoods(p_e_given_h1: float, p_e_given_h0: float) -> float:
    """Compute log Bayes Factor from likelihoods P(E|H1) and P(E|H0)."""
    if p_e_given_h0 <= 0.0: return 50.0   # cap extreme values to avoid inf
    if p_e_given_h1 <= 0.0: return -50.0
    return math.log(p_e_given_h1 / p_e_given_h0)

def compute_posterior(prior_p: float, mods: Dict[str, float]) -> Tuple[float, float]:
    """
    Combine prior (in log-odds space) with the sum of module logBFs
    to produce posterior probability and its log-odds.
    """
    lo = logit(prior_p) + sum(mods.values())
    return inv_logit(lo), lo

def delta_logbf_needed(prior_p: float, mods: Dict[str, float], target_p: float) -> float:
    """
    How much additional total logBF is needed to reach a target posterior.
    Positive value means a gap still exists; negative/zero means target achieved.
    """
    current_lo = logit(prior_p) + sum(mods.values())
    target_lo = logit(target_p)
    return target_lo - current_lo
# =========================
# File Loading Utilities
# =========================
def _as_float(x: Any, default: float=0.0) -> float:
    """Robust float parser: if casting fails, return a safe default."""
    try: return float(x)
    except Exception: return default

def load_json_yaml(path: str) -> List[dict]:
    """
    Load scenarios from JSON or YAML.
    Structure accepted:
      - Single object or a list of objects.
      - Each scenario:
          name: str
          prior: float
          modules: dict[str -> (log_bf | {log_bf|p_e_given_h1/p_e_given_h0|score/gain})]
    """
    with open(path, "r", encoding="utf-8") as f:
        raw = f.read()
    ext = os.path.splitext(path.lower())[1]
    if ext in [".json"]:
        data = json.loads(raw)
    elif ext in [".yml", ".yaml"]:
        if not HAS_YAML:
            raise RuntimeError("YAML provided, but 'pyyaml' is not installed.")
        data = yaml.safe_load(raw)  # type: ignore
    else:
        # Graceful fallback: try JSON, then YAML (if available)
        try:
            data = json.loads(raw)
        except Exception:
            if not HAS_YAML:
                raise RuntimeError("Unrecognized format. Use JSON or install 'pyyaml' for YAML.")
            data = yaml.safe_load(raw)  # type: ignore
    # Normalize to list
    if isinstance(data, dict):
        data = [data]
    if not isinstance(data, list):
        raise ValueError("Invalid structure: expected an object or a list of scenarios.")
    scenarios = []
    for sc in data:
        name = sc.get("name", "scenario")
        prior = _as_float(sc.get("prior", 1e-6), 1e-6)
        # Parse modules: accept direct logBF floats or richer objects
        mods: Dict[str, float] = {}
        for m, v in (sc.get("modules") or {}).items():
            if isinstance(v, dict):
                if "log_bf" in v:
                    mods[m] = _as_float(v["log_bf"], 0.0)
                elif "p_e_given_h1" in v and "p_e_given_h0" in v:
                    mods[m] = logbf_from_likelihoods(
                        _as_float(v["p_e_given_h1"]),
                        _as_float(v["p_e_given_h0"])
                    )
                elif "score" in v and "gain" in v:
                    mods[m] = _as_float(v["score"]) * _as_float(v["gain"])
                else:
                    mods[m] = 0.0
            else:
                mods[m] = _as_float(v, 0.0)
        # Guarantee all canonical modules exist (missing -> 0.0)
        scenarios.append({"name": name, "prior": prior, "modules": {k: mods.get(k, 0.0) for k in MODULES}})
    return scenarios

def load_csv_line_by_line(path: str) -> List[dict]:
    """
    Load a CSV where each row contributes evidence to a scenario/module.
    Expected headers: name, prior, module, evidence, source, logBF
    Evidence rows are aggregated (sum of logBF by module).
    """
    acc: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    priors: Dict[str, float] = {}
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for r in reader:
            name = (r.get("name", "scenario") or "scenario").strip()
            priors[name] = _as_float(r.get("prior", priors.get(name, 1e-6)), 1e-6)
            module = (r.get("module", "") or "").strip().upper()
            if module in MODULES:
                acc[name][module] += _as_float(r.get("logBF", 0.0), 0.0)
    scenarios = []
    for name, mods in acc.items():
        prior = priors.get(name, 1e-6)
        scenarios.append({"name": name, "prior": prior, "modules": {k: mods.get(k, 0.0) for k in MODULES}})
    return scenarios

def load_csv_wide_rows(path: str) -> List[dict]:
    """
    Load a 'wide' CSV (possibly with a 'sprint' column and one column per module).
    We return raw rows to be post-processed (filtered/aggregated) elsewhere.
    """
    rows: List[dict] = []
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for r in reader:
            rows.append(r)
    return rows

def scenarios_from_wide_rows(rows: List[dict], sprint_filter: Optional[int]) -> List[dict]:
    """
    Aggregate wide rows into scenarios, optionally filtering by a specific sprint.
    Sum module logBFs across rows with the same 'name'.
    """
    acc: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    priors: Dict[str, float] = {}
    for r in rows:
        if sprint_filter is not None and "sprint" in r:
            try:
                if int(float(r["sprint"])) != sprint_filter:
                    continue
            except Exception:
                # if sprint value is malformed, ignore the filter for this row
                pass
        name = (r.get("name", "scenario") or "scenario").strip()
        priors[name] = _as_float(r.get("prior", priors.get(name, 1e-6)), 1e-6)
        for m in MODULES:
            acc[name][m] += _as_float(r.get(m, 0.0), 0.0)
    scenarios = []
    for name, mods in acc.items():
        prior = priors.get(name, 1e-6)
        scenarios.append({"name": name, "prior": prior, "modules": {k: mods.get(k, 0.0) for k in MODULES}})
    return scenarios
# =========================
# Tkinter GUI Application
# =========================
class BayesGUI(tk.Tk):
    """
    Main window (Tk root) orchestrating:
      - Top toolbar (file loading, sprint filter, thresholds, export/save)
      - Left list of scenarios
      - Right editor (prior, per-module logBFs)
      - Results area (posterior, decision, deltas)
      - Recommendations text
      - Two charts (bar by modules, prior vs posterior thermometer)
    """
    def __init__(self):
        super().__init__()
        # ---- Window meta ------------------------------------------------------
        self.title("Bayesian Scenario Tester – Tkinter v3")
        self.geometry("1200x800")
        self.minsize(1060, 720)
        # ---- Data holders -----------------------------------------------------
        self.scenarios: List[dict] = []   # list of scenario dicts
        self.current_index: int = -1      # index of selected scenario in listbox
        # For CSV-wide usage
        self.wide_rows: List[dict] = []
        self.available_sprints: List[int] = []
        # Build the full layout (frames, widgets, bindings)
        self._build_layout()

    # ---------- Layout build ---------------------------------------------------
    def _build_layout(self):
        # Top toolbar with file actions and parameters
        top = ttk.Frame(self, padding=6)
        top.pack(side=tk.TOP, fill=tk.X)
        ttk.Button(top, text="Open JSON/YAML", command=self.open_json_yaml).pack(side=tk.LEFT, padx=3)
        ttk.Button(top, text="Open CSV (line-by-line)", command=self.open_csv_lines).pack(side=tk.LEFT, padx=3)
        ttk.Button(top, text="Open CSV (wide/sprint)", command=self.open_csv_wide).pack(side=tk.LEFT, padx=3)
        ttk.Label(top, text="Sprint (CSV wide):").pack(side=tk.LEFT, padx=(14,3))
        self.cmb_sprint = ttk.Combobox(top, state="disabled", width=10, values=[])
        self.cmb_sprint.pack(side=tk.LEFT)
        self.cmb_sprint.bind("<<ComboboxSelected>>", self.on_sprint_change)
        ttk.Label(top, text="Decision Threshold:").pack(side=tk.LEFT, padx=(18,3))
        self.ent_threshold = ttk.Entry(top, width=7); self.ent_threshold.insert(0, "0.75"); self.ent_threshold.pack(side=tk.LEFT)
        ttk.Label(top, text="Target Posterior:").pack(side=tk.LEFT, padx=(18,3))
        self.ent_target = ttk.Entry(top, width=7); self.ent_target.insert(0, "0.80"); self.ent_target.pack(side=tk.LEFT)
        ttk.Button(top, text="Save current scenario (JSON)", command=self.save_current_json).pack(side=tk.RIGHT, padx=3)
        ttk.Button(top, text="Export summary CSV", command=self.export_summary_csv).pack(side=tk.RIGHT, padx=3)
        # Middle split pane: left (scenario list) + right (editor/results/charts)
        mid = ttk.PanedWindow(self, orient=tk.HORIZONTAL)
        mid.pack(fill=tk.BOTH, expand=True, padx=6, pady=6)
        # ---- Left: scenarios list --------------------------------------------
        left = ttk.Frame(mid, padding=6); mid.add(left, weight=1)
        ttk.Label(left, text="Scenarios").pack(anchor="w")
        self.listbox = tk.Listbox(left, height=12)
        self.listbox.pack(fill=tk.BOTH, expand=True)
        self.listbox.bind("<<ListboxSelect>>", self.on_select)
        btns = ttk.Frame(left); btns.pack(fill=tk.X, pady=(6,0))
        ttk.Button(btns, text="Randomize modules", command=self.randomize_modules).pack(side=tk.LEFT, padx=2)
        ttk.Button(btns, text="Recalculate", command=self.update_calc).pack(side=tk.LEFT, padx=2)
        ttk.Button(btns, text="Distribute ΔlogBF", command=self.distribute_delta).pack(side=tk.LEFT, padx=2)
        # ---- Right: editor + results + charts --------------------------------
        right = ttk.Frame(mid, padding=6); mid.add(right, weight=4)
        # Scenario editor (name, prior, per-module logBFs)
        editor = ttk.LabelFrame(right, text="Scenario Editor", padding=6); editor.pack(fill=tk.X)
        self.var_name = tk.StringVar(value="")
        self.var_prior = tk.StringVar(value="0.35")
        row = 0
        ttk.Label(editor, text="Name:").grid(row=row, column=0, sticky="e", padx=3, pady=3)
        ttk.Entry(editor, textvariable=self.var_name, width=30).grid(row=row, column=1, sticky="w", padx=3, pady=3)
        ttk.Label(editor, text="Prior:").grid(row=row, column=2, sticky="e", padx=3, pady=3)
        ttk.Entry(editor, textvariable=self.var_prior, width=10).grid(row=row, column=3, sticky="w", padx=3, pady=3)
        # Module entries laid out in a compact grid
        self.vars_mod: Dict[str, tk.StringVar] = {}
        row += 1
        for i, m in enumerate(MODULES):
            ttk.Label(editor, text=m).grid(row=row + (i//4), column=(i%4)*2, sticky="e", padx=3, pady=3)
            var = tk.StringVar(value="0.0"); self.vars_mod[m] = var
            ttk.Entry(editor, textvariable=var, width=8).grid(
                row=row + (i//4), column=(i%4)*2 + 1, sticky="w", padx=3, pady=3
            )
        # Results frame (posterior, decision, deltas)
        res = ttk.LabelFrame(right, text="Result", padding=6); res.pack(fill=tk.X, pady=(8,0))
        self.lbl_sum = ttk.Label(res, text="∑logBF: 0.000"); self.lbl_sum.pack(side=tk.LEFT, padx=3)
        self.lbl_post = ttk.Label(res, text="Posterior: 0.0000"); self.lbl_post.pack(side=tk.LEFT, padx=12)
        self.lbl_dec = ttk.Label(res, text="Decision: —"); self.lbl_dec.pack(side=tk.LEFT, padx=12)
        self.lbl_delta_thr = ttk.Label(res, text="ΔlogBF→threshold: 0.000"); self.lbl_delta_thr.pack(side=tk.LEFT, padx=12)
        self.lbl_delta_tar = ttk.Label(res, text="ΔlogBF→target: 0.000"); self.lbl_delta_tar.pack(side=tk.LEFT, padx=12)
        # Recommendations text (brief guidance based on current gaps)
        self.lbl_rec = ttk.Label(right, text="Suggestions: —", padding=6, foreground="#333")
        self.lbl_rec.pack(fill=tk.X, pady=(6,0))
        # Charts area: (1) module contributions, (2) prior vs posterior thermometer
        charts = ttk.Frame(right); charts.pack(fill=tk.BOTH, expand=True, pady=(8,0))
        self.fig1: Figure = Figure(figsize=(5.2,3.4))
        self.ax1 = self.fig1.add_subplot(111)
        self.canvas1 = FigureCanvasTkAgg(self.fig1, master=charts)
        self.canvas1.get_tk_widget().pack(side=tk.LEFT, fill=tk.BOTH, expand=True)
        self.fig2: Figure = Figure(figsize=(5.2,3.4))
        self.ax2 = self.fig2.add_subplot(111)
        self.canvas2 = FigureCanvasTkAgg(self.fig2, master=charts)
        self.canvas2.get_tk_widget().pack(side=tk.LEFT, fill=tk.BOTH, expand=True)
        # Footer usage tips (quick start for students)
        tips = ttk.Label(
            self,
            foreground="#555",
            text=("Tip: load JSON/YAML/CSV; pick a scenario; edit prior and logBFs; click Recalculate.\n"
                  "CSV 'line-by-line' aggregates evidence; CSV 'wide' lets you pick a Sprint in the ComboBox.")
        )
        tips.pack(fill=tk.X, padx=6, pady=(0,6))
        # Seed with an example scenario so students can test immediately
        self.scenarios = [{
            "name": "Reconciliation Pro (example)",
            "prior": 0.35,
            "modules": {"W":0.8,"P":1.6,"M":0.5,"I":0.3,"A":-0.8,"T":0.2,"N":-0.2}
        }]
        self.refresh_listbox(select=0)
    # ---------- File actions (toolbar) -----------------------------------------
    def open_json_yaml(self):
        """Open and parse a JSON or YAML file with scenarios."""
        path = filedialog.askopenfilename(
            title="Open JSON/YAML",
            filetypes=[("JSON/YAML","*.json;*.yml;*.yaml;*.*")]
        )
        if not path: return
        try:
            # Reset sprint context (not relevant for JSON/YAML)
            self.wide_rows = []; self.available_sprints = []
            self.cmb_sprint.config(state="disabled", values=[])
            self.scenarios = load_json_yaml(path)
            self.refresh_listbox(select=0)
        except Exception as e:
            messagebox.showerror("Load error", str(e))

    def open_csv_lines(self):
        """Open and parse a 'line-by-line' CSV, aggregating module logBFs."""
        path = filedialog.askopenfilename(
            title="Open CSV (line-by-line)",
            filetypes=[("CSV","*.csv;*.*")]
        )
        if not path: return
        try:
            self.wide_rows = []; self.available_sprints = []
            self.cmb_sprint.config(state="disabled", values=[])
            self.scenarios = load_csv_line_by_line(path)
            self.refresh_listbox(select=0)
        except Exception as e:
            messagebox.showerror("Load error", str(e))

    def open_csv_wide(self):
        """Open a 'wide' CSV and populate sprint options + scenarios aggregation."""
        path = filedialog.askopenfilename(
            title="Open CSV (wide/sprint)",
            filetypes=[("CSV","*.csv;*.*")]
        )
        if not path: return
        try:
            self.wide_rows = load_csv_wide_rows(path)
            # Discover available sprints (integers). Non-numeric values are ignored.
            sprints = set()
            for r in self.wide_rows:
                if "sprint" in r and str(r["sprint"]).strip() != "":
                    try: sprints.add(int(float(r["sprint"])))
                    except Exception: pass
            self.available_sprints = sorted(list(sprints))
            if self.available_sprints:
                # Pylance prefers list[str]
                values = [str(x) for x in self.available_sprints]
                self.cmb_sprint.config(state="readonly", values=values)
                self.cmb_sprint.set(values[0])
                sf = int(values[0])
            else:
                self.cmb_sprint.config(state="disabled", values=[])
                sf = None
            self.scenarios = scenarios_from_wide_rows(self.wide_rows, sprint_filter=sf)
            self.refresh_listbox(select=0)
        except Exception as e:
            messagebox.showerror("Load error", str(e))

    def on_sprint_change(self, _event=None):
        """Rebuild scenarios when the sprint filter changes (wide CSV only)."""
        if not self.wide_rows: return
        try: sf = int(float(self.cmb_sprint.get()))
        except Exception: sf = None
        self.scenarios = scenarios_from_wide_rows(self.wide_rows, sprint_filter=sf)
        self.refresh_listbox(select=0)

    def save_current_json(self):
        """Save the currently edited scenario as a single-element JSON list."""
        if self.current_index < 0: return
        sc = self._collect_from_editor()
        path = filedialog.asksaveasfilename(
            title="Save current scenario",
            defaultextension=".json",
            filetypes=[("JSON","*.json")]
        )
        if not path: return
        with open(path, "w", encoding="utf-8") as f:
            json.dump([sc], f, ensure_ascii=False, indent=2)
        messagebox.showinfo("OK", f"Scenario saved to:\n{path}")

    def export_summary_csv(self):
        """
        Export a CSV summary containing:
          name, prior, sum_logbf, posterior, decision, ΔlogBF to threshold, ΔlogBF to target
        Useful for quick comparisons across many scenarios.
        """
        if not self.scenarios:
            messagebox.showwarning("Warning", "No scenarios loaded."); return
        path = filedialog.asksaveasfilename(
            title="Export summary CSV",
            defaultextension=".csv",
            filetypes=[("CSV","*.csv")]
        )
        if not path: return
        th = _as_float(self.ent_threshold.get(), 0.75)
        tgt = _as_float(self.ent_target.get(), 0.80)
        with open(path, "w", newline="", encoding="utf-8") as f:
            w = csv.writer(f)
            w.writerow([
                "name","prior","sum_logbf","posterior",
                "decision","delta_logbf_to_threshold","delta_logbf_to_target"
            ])
            for sc in self.scenarios:
                p, _ = compute_posterior(sc["prior"], sc["modules"])
                s = sum(sc["modules"].values())
                dec = "GO" if p >= th else "NO-GO"
                d_thr = max(0.0, delta_logbf_needed(sc["prior"], sc["modules"], th))
                d_tar = max(0.0, delta_logbf_needed(sc["prior"], sc["modules"], tgt))
                w.writerow([
                    sc["name"], f"{sc['prior']:.6f}", f"{s:.6f}", f"{p:.6f}", dec,
                    f"{d_thr:.3f}", f"{d_tar:.3f}"
                ])
        messagebox.showinfo("OK", f"Summary exported to:\n{path}")
    # ---------- List & Editor sync --------------------------------------------
    def refresh_listbox(self, select: int = -1):
        """Refresh the scenarios listbox and optionally select a specific index."""
        self.listbox.delete(0, tk.END)
        for sc in self.scenarios:
            self.listbox.insert(tk.END, sc["name"])
        if self.scenarios and (0 <= select < len(self.scenarios)):
            self.listbox.selection_clear(0, tk.END)
            self.listbox.selection_set(select)
            self.listbox.event_generate("<<ListboxSelect>>")

    def on_select(self, _event=None):
        """When a scenario is selected in the list, load it into the editor and recompute."""
        sel = self.listbox.curselection()
        if not sel:
            self.current_index = -1
            return
        idx = sel[0]; self.current_index = idx
        sc = self.scenarios[idx]
        self._populate_editor(sc)
        self.update_calc()

    def _populate_editor(self, sc: dict):
        """Populate the editor fields from a scenario dict."""
        self.var_name.set(sc["name"])
        self.var_prior.set(f"{sc['prior']:.6f}")
        for m in MODULES:
            self.vars_mod[m].set(f"{sc['modules'].get(m,0.0):.3f}")

    def _collect_from_editor(self) -> dict:
        """Build a scenario dict from current editor values (with safe parsing)."""
        name = self.var_name.get().strip() or "scenario"
        prior = _as_float(self.var_prior.get(), 1e-6)
        mods = {m: _as_float(self.vars_mod[m].get(), 0.0) for m in MODULES}
        return {"name": name, "prior": prior, "modules": mods}

    def randomize_modules(self):
        """For exploration: assign random logBFs (-2.0..2.0) to all modules of the current scenario."""
        if self.current_index < 0: return
        for m in MODULES:
            self.vars_mod[m].set(f"{random.uniform(-2.0, 2.0):.3f}")
        self.update_calc()

    # ---------- Sensitivity helper: distribute ΔlogBF -------------------------
    def distribute_delta(self):
        """
        Distribute the missing ΔlogBF to hit the target posterior:
          - 70% to P (conversion/churn improvements from pilot)
          - 30% to A (reduce negative differentiation vs competitors)
        If A is negative, adding a positive amount reduces its magnitude (a mitigation).
        """
        if self.current_index < 0: return
        sc = self._collect_from_editor()
        tgt = _as_float(self.ent_target.get(), 0.80)
        gap = delta_logbf_needed(sc["prior"], sc["modules"], tgt)
        if gap <= 0:
            messagebox.showinfo("Info", "Already at the target (or above).")
            return
        add_P = 0.70 * gap
        add_A = 0.30 * gap
        # Apply deltas directly to the editor fields
        self.vars_mod["P"].set(f"{_as_float(self.vars_mod['P'].get()) + add_P:.3f}")
        self.vars_mod["A"].set(f"{_as_float(self.vars_mod['A'].get()) + add_A:.3f}")
        self.update_calc()
    # ---------- Core computation + charts -------------------------------------
    def update_calc(self):
        """
        Recompute posterior and UI artifacts from the current editor state.
        Also prints a concise summary to the terminal (useful in VSCode).
        """
        if self.current_index < 0 and self.scenarios:
            self.current_index = 0
        if self.current_index < 0: return
        # Parse current editor state
        sc = self._collect_from_editor()
        th = _as_float(self.ent_threshold.get(), 0.75)
        tgt = _as_float(self.ent_target.get(), 0.80)
        # Bayesian update
        posterior, post_lo = compute_posterior(sc["prior"], sc["modules"])
        sum_logbf = sum(sc["modules"].values())
        decision = "GO" if posterior >= th else "NO-GO"
        # Gaps to threshold/target
        d_thr = max(0.0, delta_logbf_needed(sc["prior"], sc["modules"], th))
        d_tar = max(0.0, delta_logbf_needed(sc["prior"], sc["modules"], tgt))
        # Update labels
        self.lbl_sum.config(text=f"∑logBF: {sum_logbf:+.3f}")
        self.lbl_post.config(text=f"Posterior: {posterior:.4f}")
        self.lbl_dec.config(text=f"Decision: {decision} (thr={th:.2f})")
        self.lbl_delta_thr.config(text=f"ΔlogBF→threshold: {d_thr:+.3f}")
        self.lbl_delta_tar.config(text=f"ΔlogBF→target: {d_tar:+.3f}")
        # Simple recommendations (guidance for students)
        recs = []
        if decision == "NO-GO" or d_tar > 0.0:
            if d_tar > 0.0:
                recs.append(f"Missing ~{d_tar:.2f} logBF to reach target ({tgt:.0%}).")
            elif d_thr > 0.0:
                recs.append(f"Missing ~{d_thr:.2f} logBF to reach threshold ({th:.0%}).")
            recs.append("Prioritize raising P (pilot conversion/churn) and reducing |A| (competitive differentiation).")
        else:
            recs.append("Above threshold: consolidate P/I; keep M efficient; monitor A and N.")
        self.lbl_rec.config(text="Suggestions: " + " ".join(recs))
        # Refresh charts
        self._plot_modules(sc["modules"])
        self._plot_thermometer(sc["prior"], posterior)
        # Sync the internal scenario copy (so exports reflect current edits)
        self.scenarios[self.current_index] = sc
        # --- Terminal summary (great when running from VSCode) -----------------
        print("\n=== Summary (Recalculate) ===")
        print(f"Scenario: {sc['name']}")
        print(f"Prior: {sc['prior']:.4f} | ∑logBF: {sum_logbf:+.3f} | Posterior: {posterior:.4f} -> {decision}")
        print(f"Posterior log-odds: {post_lo:+.3f}")
        print(f"ΔlogBF→threshold({th:.0%}): {d_thr:+.3f} | ΔlogBF→target({tgt:.0%}): {d_tar:+.3f}")
        print("Suggestions:", " ".join(recs))

    def _plot_modules(self, mods: Dict[str, float]):
        """Bar chart: contribution (logBF) by module."""
        self.ax1.clear()
        xs = MODULES
        ys = [mods.get(m, 0.0) for m in xs]
        self.ax1.bar(xs, ys)
        self.ax1.set_title("Module Contributions (logBF)")
        self.ax1.set_ylabel("logBF")
        for i, v in enumerate(ys):
            self.ax1.text(
                i, v + (0.02 if v >= 0 else -0.06), f"{v:+.2f}",
                ha="center", va="bottom" if v >= 0 else "top", fontsize=9
            )
        self.fig1.tight_layout()
        self.canvas1.draw()

    def _plot_thermometer(self, prior_p: float, post_p: float):
        """
        'Thermometer' chart: shows prior vs posterior on a log-scaled y-axis.
        The log scale helps visualize small probabilities without flattening.
        """
        self.ax2.clear()
        stages = ["Prior", "Posterior"]; vals = [prior_p, post_p]
        self.ax2.bar(stages, vals)
        self.ax2.set_yscale("log")
        self.ax2.set_title("Probability Thermometer (log scale)")
        self.ax2.set_ylabel("Probability")
        for i, v in enumerate(vals):
            self.ax2.text(i, v, f"{v:.1e}", ha="center", va="bottom", fontsize=9)
        self.fig2.tight_layout()
        self.canvas2.draw()
# =========================
# Main Entry Point
# =========================
if __name__ == "__main__":
    # Create and run the GUI application.
    # For students: running this file directly (python this_file.py) will open the window.
    app = BayesGUI()
    app.mainloop()
- B) Seed data — seed_files_bayes_ptbr.py / seed_files_bayes_eng.py.
Generates example files (CSV wide/sprint, CSV line-by-line, JSON and YAML) so readers can reproduce the same results by switching sprints and adjusting modules.
# ================================================
# Script: Seed Files for Bayesian Scenario Testing (EN)
# Author: Izairton Oliveira de Vasconcelos
# Description:
# Generates example files for the Tkinter Bayesian Scenario Tester:
# 1) CSV (wide/sprint)
# 2) CSV (line-by-line)
# 3) JSON scenarios
# 4) YAML scenarios (optional: requires pyyaml)
# How to run:
# python seed_files_bayes_eng.py
# Compatible: Python 3.8+ | Tested on VSCode
# ================================================
import csv, json
from typing import List
def write_csv(path: str, rows: List[List[object]]) -> None:
    """Utility to write rows to a CSV file with UTF-8 encoding."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    print(f"[OK] {path}")
def main() -> None:
    # 1) CSV wide/sprint — one row per sprint, columns per module
    csv_wide = "kpi_sprints.csv"
    rows_wide = [
        ["sprint","name","prior","W","P","M","I","A","T","N"],
        [1,"Reconciliation Pro",0.35,0.8,1.6,0.5,0.3,-0.8,0.2,-0.2],
        [2,"Reconciliation Pro",0.35,0.9,1.9,0.6,0.4,-0.9,0.3,-0.2],
        [3,"Reconciliation Pro",0.35,1.0,2.2,0.7,0.5,-0.7,0.4,-0.3],
    ]
    write_csv(csv_wide, rows_wide)
    # 2) CSV line-by-line — multiple evidence rows aggregated per module
    csv_lines = "evidence_lines.csv"
    rows_lines = [
        ["name","prior","module","evidence","source","logBF"],
        ["Reconciliation Pro",0.35,"W","Lead survey 200, top-2-box 60%","Survey",+0.8],
        ["Reconciliation Pro",0.35,"P","Pilot 30 accts, conv 24%, churn 4.8%","MVP",+1.2],
        ["Reconciliation Pro",0.35,"M","CTR 2.5%, CPL on target","Ads",+0.5],
        ["Reconciliation Pro",0.35,"I","MoU with partner bank","Partnership",+0.3],
        ["Reconciliation Pro",0.35,"A","Alpha competitor, moderate switching","Competitive",-1.0],
        ["Reconciliation Pro",0.35,"T","Pre-January window aids seasonality","Calendar",+0.3],
        ["Reconciliation Pro",0.35,"N","Mild macro noise","Macro",-0.2],
    ]
    write_csv(csv_lines, rows_lines)
    # 3) JSON scenarios — ready for the GUI loader
    json_path = "scenarios.json"
    json_data = [
        {
            "name": "Reconciliation Pro – base",
            "prior": 0.35,
            "modules": {"W":0.8,"P":1.6,"M":0.5,"I":0.3,"A":-0.8,"T":0.2,"N":-0.2}
        },
        {
            "name": "Conservative",
            "prior": 0.35,
            "modules": {"W":1.0,"P":0.8,"M":0.7,"I":0.5,"A":-2.0,"T":0.4,"N":-0.3}
        }
    ]
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(json_data, f, ensure_ascii=False, indent=2)
    print(f"[OK] {json_path}")
    # 4) YAML scenarios — optional; requires pyyaml
    try:
        import yaml  # type: ignore
        yaml_path = "scenarios.yaml"
        with open(yaml_path, "w", encoding="utf-8") as f:
            yaml.safe_dump(json_data, f, allow_unicode=True, sort_keys=False)
        print(f"[OK] {yaml_path}")
    except Exception:
        print("[INFO] 'pyyaml' not installed: YAML not generated (optional).")
if __name__ == "__main__":
    main()
Simulation Results – ZV Advogados with Reconciliation Pro
The Bayesian simulator indicated a posterior of 0.8558, above the GO Threshold (0.75) and the Target Posterior (0.80), with ∑logBF = +2.400—a strong evidence set in favor of launching Reconciliation Pro. In the breakdown, Research (W, +0.8) and especially Pilot (P, +1.6) were the biggest confidence drivers; Competition (A, −0.8) was the main headwind. Marketing (M, +0.5), Partnerships/Institutional (I, +0.3), and Timing/Seasonality (T, +0.2) provided additional lift, while Macro Noise (N, −0.2) had a modest effect. Together, these signals raised belief from a prior of 0.35 to a posterior of 0.8558, supporting a GO decision based on probability, not intuition.
The charts reinforce the reading: the Module Contributions bars highlight each factor’s weight (taller bars for W and P, a negative bar for A), and the probability “thermometer” (log scale) makes the jump from prior to posterior visible—a handy way to communicate the rationale to non-technical stakeholders. The terminal summary printed on each “Recalculate” logs essentials (posterior, ∑logBF, log-odds, and deltas to threshold/target), easing auditability and versioning of simulations.
Beyond a static snapshot, the simulator proved sensitive to “What if…?”. In hypotheses with stronger competitive pressure (more negative A) or a weaker pilot (smaller P), the posterior adjusted consistently downward. When the target is not reached, the “Distribute ΔlogBF” button helps planning by suggesting 70% of effort in P (improving pilot conversion/churn) and 30% in A (reducing competitive disadvantage). The entire pipeline is reproducible: seed data can be generated by seed_files_bayes_ptbr.py / seed_files_bayes_eng.py (CSV/JSON/YAML) and analyzed in the GUIs bayes_gui_tk_ptbr.py / bayes_gui_tk_eng.py, ensuring readers of both languages can replicate results by switching sprints and adjusting modules.
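To see the mechanics on a scenario that starts below target, take the "Conservative" scenario from scenarios.json (prior 0.35, ∑logBF = +1.1, posterior ≈ 0.62). A minimal sketch of the same gap-splitting rule the GUI's distribute_delta method applies:

import math

def logit(p):
    return math.log(p / (1 - p))

prior, target = 0.35, 0.80
mods = {"W": 1.0, "P": 0.8, "M": 0.7, "I": 0.5, "A": -2.0, "T": 0.4, "N": -0.3}

gap = logit(target) - (logit(prior) + sum(mods.values()))  # ≈ +0.905 logBF missing
mods["P"] += 0.70 * gap  # strengthen the pilot evidence
mods["A"] += 0.30 * gap  # mitigate competitive pressure (A becomes less negative)

lo = logit(prior) + sum(mods.values())
print(f"posterior after distribution = {1 / (1 + math.exp(-lo)):.2f}")  # 0.80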
Conclusion

Turning Bayes into a didactic-operational simulator changes the game: decisions move from “gut feel” to evidence-based management. For ZV Advogados, Reconciliation Pro shows a high probability of success (posterior 0.8558), justifying investment. The model also explains why: an explicit prior, clear modules (W, P, M, I, A, T, N), and transparent weights (logBF) that expose where confidence was gained and where friction remains—insights rarely produced by qualitative debate alone.
From an execution angle, the recommendation is to proceed with GO across four fronts: (1) consolidate gains in P (pilot)—conversion, retention, and engagement metrics; (2) keep M/I/T efficient—disciplined marketing, active partnerships, and windows of opportunity; (3) reduce pressure from A (competition)—product differentiation, stronger value propositions, and switching costs; (4) monitor N (macro)—to react to external shocks. The “Distribute ΔlogBF” feature can serve as a prioritization script when the posterior falls short of the target, quantifying “how much is missing” per axis and guiding incremental experiments.
The broader gain is organizational: teams start prioritizing by impact on ∑logBF/posterior, professionalizing the roadmap and shortening the path from hunch to action plan. The technical ecosystem accompanying the article—seed_files_bayes_ptbr.py / seed_files_bayes_eng.py (seed data) and bayes_gui_tk_ptbr.py / bayes_gui_tk_eng.py (GUI simulator)—ensures reproducibility and hands-on learning in both languages. The approach remains accessible to those without advanced statistics, while encouraging discipline in evidence collection and in interpreting charts/reports—the kind of culture that lowers risk and accelerates deliveries with impact.
📖 User Manual – Bayesian Simulator for Management
1) What it is and why it matters
The simulator applies Bayes’ theorem to update probabilities from multiple sources of evidence. It turns scattered signals (research, pilots, marketing, partnerships, competition, seasonality, and macro noise) into a single, reliable posterior, indicating whether the decision is closer to GO (proceed) or NO-GO (re-plan). It fits any business decision under uncertainty, such as launching a new service.
2) Interface – key elements
- Scenario List (left): select or add hypotheses.
- Scenario Editor (center):
- Prior: initial probability of success.
- W, P, M, I, A, T, N: evidence factors (positives help, negatives hurt).
- GO Threshold & Target Posterior (top bar):
- GO Threshold = minimum to proceed (e.g., 0.75).
- Target Posterior = ambitious goal (e.g., 0.80).
- Sprint ComboBox: activates after loading a wide CSV (kpi_sprints.csv); switch among evidence rounds (1, 2, 3…).
- Charts (bottom):
- Module Contributions: weight of each piece of evidence.
- Probability Thermometer: compares prior vs posterior (log scale).
- Distribute ΔlogBF button: allocates the gap to reach the target (70% to P, 30% to A).
- Export/Save: exports a summary CSV or saves the current scenario as JSON.
3) Step-by-step flow
- Load data: click Open CSV (wide/sprint) and select kpi_sprints.csv, or open saved JSON/YAML.
- Select Sprint: use the Sprint ComboBox (it activates after loading the wide CSV).
- Edit values: adjust the prior and module weights (e.g., increase P if the pilot improved; ease A if competition weakened).
- Recalculate: click Recalculate to view the posterior, GO/NO-GO, suggestions, updated charts, and a terminal summary.
- Distribute ΔlogBF: if the target isn’t reached, use the button to simulate improvements.
- Export or Save: save the scenario to JSON or export results to CSV.
4) Tips for evidence values
- +2.0 to +3.0: very strong in favor.
- +0.5 to +1.5: moderate in favor.
- −0.5 to −1.5: moderate against.
- −2.0 to −3.0: strong against.
Example: strong MVP (+1.8), strong competitor (−1.2), decent research (+0.8).
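These rubric values are just natural-log likelihood ratios. The GUI's logbf_from_likelihoods helper computes the same quantity when you supply P(E|H₁) and P(E|H₀) directly; the likelihood numbers below are assumptions chosen purely for illustration:

import math

# If the observed pilot result is 3x more likely under success than under failure:
p_e_h1, p_e_h0 = 0.60, 0.20           # assumed likelihoods, for illustration only
log_bf = math.log(p_e_h1 / p_e_h0)    # ln(3) ≈ +1.10 -> "moderate in favor"
print(f"logBF = {log_bf:+.2f}")
# Symmetrically, evidence 3x more likely under failure scores ≈ -1.10.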
5) Interpreting results
- Posterior > Threshold: GO — proceed, but monitor.
- Posterior < Threshold: NO-GO — identify weak factors.
- ΔlogBF→target: how much evidence is still needed to reach the Target Posterior.
- Suggestions: where to act (e.g., strengthen the pilot or differentiation).
- Negative ∑logBF: unfavorable evidence — replan or gather better data before investing.
6) Supporting concepts
- Reconciliation Pro: practical example of an automation module; the simulator assesses whether to invest, considering research, pilot, competition, etc.
- Sprint: rounds of evidence collection (Sprint 1 = early tests; Sprint 2 = expanded feedback; Sprint 3 = marketing and partnerships). Switching sprints shows how confidence evolves.
- kpi_sprints.csv values: prior = 0.35 (initial chance), W and P improve across sprints, A (competition) eases slightly, T (timing) and I (partnerships) support, N (noise) stays mild.
- Post-edits: any value can be edited directly in the GUI to test alternative scenarios.
7) Benefits for non-specialists
- No need to master Bayes—the simulator does the math.
- Safe “What if…?” testing with immediate feedback.
- Turns subjective debates into clear, visual numbers.
- Helps prioritize actions and focus where impact is greatest.
8) Final takeaway
The Bayesian Simulator works like a confidence thermometer. Load data, adjust factors, and watch the posterior to decide. It won’t give absolute certainty, but it reduces guesswork and orients decisions with evidence. With the collected data, the Reconciliation Pro example shows a high probability of success (~85.6%), suggesting you should proceed while keeping an eye on competition and market conditions.
Follow & Connect
Izairton Vasconcelos is a technical content creator with degrees in Software Engineering, Business Administration, and Statistics, as well as several specializations in Technology.
He is a Computer Science student and Python specialist focused on productivity, automation, finance, and data science. He develops scripts, dashboards, predictive models, and applications, and also provides consulting services for companies and professionals seeking to implement smart digital solutions in their businesses.
Currently, he publishes bilingual articles and tips in his LinkedIn Newsletter, helping people from different fields apply Python in a practical, fast, and scalable way to their daily tasks.
💼 LinkedIn & Newsletters:
👉https://www.linkedin.com/in/izairton-oliveira-de-vasconcelos-a1916351/
👉https://www.linkedin.com/newsletters/scripts-em-python-produtividad-7287106727202742273
👉https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7319069038595268608
💼 Company Page:
👉https://www.linkedin.com/company/106356348/
💻 GitHub:
👉https://github.com/IOVASCON