Evening Retrospective — 2026-02-27
Evening Retrospective — 2026-02-27
Today's Fixes (from git log)
| Commit | Description |
|---|---|
c8d91de | fix(route): exit 1 when router fails to signal error to callers |
ea6ef79 | fix(backend): return 1 for invalid IDs in db_task_field (#420) |
ffcaaa2 | fix: add error handling for empty/missing PR title in create-pr.sh (#419) |
9453fe6 | fix(docs): add missing front matter closing delimiter in morning review post (#414) |
a448aee | test: add comprehensive test coverage for critical scripts |
065c1bf | fix(cleanup): handle gh errors vs no-merged-pr in cleanup_worktrees.sh (#403) |
ed1b56c | fix(run_task): move _gh_ensure_repo out of subshell in detect_retry_loop (#405) |
a936f89 | fix(backend): quote variable to prevent word splitting in label validation (#404) |
Major improvements today:
- Router error handling — Fixed
route_task.shto exit 1 when router fails (was silently succeeding) - Backend robustness — Fixed error handling for invalid IDs in
db_task_field - PR creation safety — Added validation for empty/missing PR titles in
create-pr.sh - Test coverage — Added comprehensive tests for critical scripts
- Cleanup reliability — Fixed
cleanup_worktrees.shto distinguish gh errors from "no merged PRs" - Retry loop detection — Fixed subshell variable scoping bug in
detect_retry_loop - Label validation — Fixed unquoted variable causing word splitting issues
Current Task Status
| Status | Count |
|---|---|
| new | 0 |
| routed | 0 |
| in_progress | 2 |
| needs_review | 4 |
| done | 63 |
In Progress:
- #421 (opencode) — This evening retrospective task
- #418 (kimi) — Fix route_task.sh exit code — recovered from stuck state (1848s)
Needs Review (pattern: agent response issues):
- #376 (claude) — Add test coverage for gh_mentions.sh — 6 attempts, invalid JSON responses
- #381 (opencode) — Fix agent runner bugs — 5 attempts, missing status field
- #382 (kimi) — Fix non-atomic sidecar writes — needs_review
- #402 (kimi) — Add test coverage for jobs_tick.sh — in_progress
What Went Well
8 commits merged today — All targeted fixes for real bugs identified in recent reviews
Poll recovery working — Task #418 was stuck for 1848s (~31 minutes) and was automatically recovered by poll
Morning review completed — Issue #407 closed successfully with analysis and recommendations
Backend fixes landed — PRs #420, #419, #414, #403, #405, #404 all merged — significant reliability improvements
Evening retrospective job fired correctly — Catch-up mechanism created task #421 at 21:02:34Z
What Failed / Needs Attention
Task #376 stuck in retry loop — 6 attempts with claude, all failing with "invalid YAML/JSON" responses. Pattern:
- Attempt 1: claude rate limited, rerouted to opencode
- Attempts 2-6: claude with model
opencode/glm-5-free— all invalid JSON - Root cause: Model
opencode/glm-5-freemay not support JSON output properly
Task #381 stuck in retry loop — 5 attempts with opencode, all failing with "missing status":
- First attempt succeeded (commit 7c3b8a6) but status was
done - Subsequent attempts all fail to produce valid JSON with status field
- Root cause: opencode with
gpt-5-mininot consistently returning valid JSON
- First attempt succeeded (commit 7c3b8a6) but status was
Agent response validation — Multiple tasks failing because agents return malformed JSON or missing required fields. This is a systemic issue affecting reliability.
Prompt Effectiveness
Reviewed prompts/system.md and prompts/route.md:
- System prompt — Clear workflow rules, good JSON schema definition
- Route prompt — Good complexity guidance, proper skill selection
Issue identified: The prompts are clear, but some models (especially opencode/glm-5-free and possibly gpt-5-mini) are not consistently following the JSON output format. This suggests:
- Models may need stronger JSON enforcement
- Some models may not support
--output-format jsonproperly - Fallback parsing (stdout extraction) may be catching incomplete responses
Recommendation: Review which models actually support structured JSON output and adjust routing accordingly.
Routing Analysis
- Round-robin mode active — Tasks distributed across agents
- Claude rerouting — Task #376 hit claude rate limit and was rerouted to opencode successfully
- Model selection issues:
opencode/glm-5-freeassigned to claude task — this model appears incompatible with claude's JSON output expectationsgpt-5-miniused for opencode — producing inconsistent JSON
Root cause: The model_map in config.yml may be routing tasks to models that don't support the required output format for their assigned agent.
Performance
- Worktree cleanup: ~13-22s per project (normal)
- Poll interval: ~45s between ticks
- Service: v0.56.32 running cleanly
- Rate limits: Claude hit rate limit once (task #376), successfully rerouted
Issues observed:
- Task #418 stuck for 1848s before recovery
- Multiple tasks in retry loops with JSON parsing failures
Tomorrow's Priorities
Fix model/agent compatibility — The root cause of #376 and #381 failures is model selection:
opencode/glm-5-freeshould not be used with claude (expects different output format)- Review model_map configuration for all agents
- Ensure models support JSON output for their assigned agent
Unblock stuck tasks — Reset #376 and #381 to
newafter fixing model config:- #376: Assign to claude with a proper claude-compatible model
- #381: Assign to opencode with a model known to work (gpt-5-mini worked on attempt 1)
Add JSON validation — Consider adding pre-flight JSON validation or stronger model constraints
Close completed tasks — Several tasks marked done but issues still open:
- #390 (marked done in log but issue still open)
- #417 (marked done)
Flag for Follow-up
- Model compatibility matrix — Document which models work with which agents
- JSON output enforcement — Some models ignore
--output-format json; need fallback validation - Retry loop detection — Tasks #376 and #381 have 6 and 5 attempts respectively — should have been blocked earlier
- Task #418 — Investigate why kimi got stuck for 31 minutes (agent startup issue?)
Morning Review Carry-Forward
From #407 (today's morning review):
- ✅ Multiple backend fixes landed (all the PRs mentioned were merged)
- ⚠️ Model compatibility issues identified — need to address
- ⚠️ Task #376 still failing — was flagged, still needs fix
(End of file)