- bridge GET /tiger/health/system checks each layer a message travels
through: host memory/swap (/proc/meminfo), LLM gateway liveness,
OpenClaw container state, per-cron lastStatus. Rolls up to
healthy/degraded/critical with human-readable issues.
- dashboard /api/health/system proxy (bridge-down itself reported as
critical) + HealthBanner on the homepage: invisible while healthy,
amber/red expandable strip when not. Polls every 30s.
Telegram should never again be the first place a failure shows up.
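
A minimal sketch of the roll-up, assuming each layer check reports its own level plus an optional issue string and that the overall status is the worst level seen. The type and function names (CheckResult, rollUp) are illustrative and not taken from the bridge source:

```typescript
// Hypothetical shapes: the commit names the three levels, not the types.
type Level = "healthy" | "degraded" | "critical";

interface CheckResult {
  layer: string;   // e.g. "memory", "gateway", "container", "cron:<name>"
  level: Level;
  issue?: string;  // human-readable explanation when the layer is not healthy
}

const SEVERITY: Record<Level, number> = { healthy: 0, degraded: 1, critical: 2 };

// Overall status is the worst level reported by any layer;
// issues are collected from every layer that is not healthy.
function rollUp(checks: CheckResult[]): { status: Level; issues: string[] } {
  let status: Level = "healthy";
  const issues: string[] = [];
  for (const check of checks) {
    if (SEVERITY[check.level] > SEVERITY[status]) status = check.level;
    if (check.level !== "healthy" && check.issue) {
      issues.push(check.layer + ": " + check.issue);
    }
  }
  return { status, issues };
}
```

With this shape the dashboard proxy can report a bridge outage as one more critical CheckResult, so the banner needs no special case for it.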
Root cause of intermittent '⚠️ Tiger timed out or is offline' replies:
TWO consumers raced for getUpdates on one bot token. OpenClaw's native
channel owns the conversation; the bridge poller lost with a 409 every
~40s, and when it occasionally WON it relayed the stolen message into a
fresh context-less tg_* session with a 120s budget — slow turns produced
the ⚠️ reply, and the message never reached the native transcript (so it
was also missing from the dashboard mirror).
Outbound notify (raw Bot API) is unaffected. Re-enable explicitly with
TIGER_TELEGRAM_POLLER=on only if native telegram is disabled.
Four skills that wire Tiger into its own control plane (requires
TIGER_BRIDGE_TOKEN in the container env): delegate work to specialists,
read live P&L (read-only, never trades), manage the TASKS.md inbox,
self-diagnose host RAM/gateway/bridge health.
- bridge GET /tiger/activity/audit merges every durable action store at
read time: executions (spawns), tasks lifecycle, outputs, OpenClaw cron
run JSONL. Cursor pagination (before=ISO), type filters. Merging at read
time makes the audit retroactively complete: no action without a row.
- dashboard /api/activity merges in recent file-modification events
- /activity page: type filter chips, status colors, Load older
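
The read-time merge with a before cursor could look roughly like the sketch below. The row shape and the names (AuditRow, mergeAudit, nextBefore) are assumptions for illustration; the real endpoint reads from SQLite and the cron JSONL rather than in-memory arrays:

```typescript
// Hypothetical row shape shared by every store once normalized.
interface AuditRow {
  type: string;     // e.g. "execution", "task", "output", "cron"
  at: string;       // ISO-8601 timestamp
  summary: string;
}

interface AuditQuery {
  before?: string;  // cursor: only rows strictly older than this timestamp
  types?: string[]; // optional type filter
  limit: number;
}

// Merge all stores, newest first, and hand back a cursor for the next page.
function mergeAudit(
  sources: AuditRow[][],
  query: AuditQuery
): { rows: AuditRow[]; nextBefore: string | null } {
  const beforeMs = query.before ? Date.parse(query.before) : Infinity;
  const types = query.types;
  const merged = ([] as AuditRow[])
    .concat(...sources)
    .filter((row) => Date.parse(row.at) < beforeMs)
    .filter((row) => !types || types.indexOf(row.type) >= 0)
    .sort((a, b) => Date.parse(b.at) - Date.parse(a.at));
  const rows = merged.slice(0, query.limit);
  const nextBefore = merged.length > query.limit ? rows[rows.length - 1].at : null;
  return { rows, nextBefore };
}
```

A strictly-older cursor is simple but can skip rows that share the exact timestamp of the last row on a page; a tie-breaker such as a row id would close that gap.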
angel.manohargupta.com (standalone position-tracker) is the single owner of
the live positions UI; the replicated page here had drifted from it despite
reading the same Angel One data. The page now auto-redirects there, with a
fallback link.
- skip assistant messages carrying toolCall blocks (working narration like
'Let me check if codexbar is available' was never sent to Telegram)
- ignore thinking blocks
- strip injected '(untrusted metadata)' json fences from user messages
- drop synthetic system messages (session startup, heartbeats)
- /api/chat/telegram-thread proxy (token stays server-side)
- TelegramThreadCard rewrite: scrollable full history, 'Load older'
cursor pagination, 15s polling with referential-equality short-circuit,
scroll-anchored prepends, auto-stick to bottom only when already there
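
The transcript filtering rules above can be sketched as a single mapping function. The message and block shapes here are assumptions (the actual OpenClaw JSONL schema is not shown in this log), as is the exact layout of the injected metadata, which is taken to be a label line containing '(untrusted metadata)' followed by a json fence:

```typescript
// Hypothetical transcript shapes; block types follow the names used above.
interface Block { type: string; text?: string } // "text" | "thinking" | "toolCall"
interface TranscriptMessage { role: "user" | "assistant" | "system"; content: Block[] }
interface ThreadMessage { role: "user" | "assistant"; text: string }

// Assumed layout: a label line, then a json fence. \x60 is a backtick.
const METADATA_FENCE =
  /[^\n]*\(untrusted metadata\)[^\n]*\n\x60{3}json\n[\s\S]*?\n\x60{3}\n?/g;

// Returns the message as it should appear in the thread, or null to hide it.
function toThreadMessage(msg: TranscriptMessage): ThreadMessage | null {
  const role = msg.role;
  // Synthetic system messages (session startup, heartbeats) are dropped.
  if (role === "system") return null;
  // Working narration that accompanies a tool call was never sent to Telegram.
  if (role === "assistant" && msg.content.some((b) => b.type === "toolCall")) return null;
  // Keep only text blocks, which also ignores thinking blocks.
  let text = msg.content
    .filter((b) => b.type === "text")
    .map((b) => b.text || "")
    .join("\n");
  if (role === "user") text = text.replace(METADATA_FENCE, "");
  text = text.trim();
  return text ? { role, text } : null;
}
```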
- lib/inbox.ts: every 30min (09-20 IST) dispatch the first '- [ ]' item
under '## INBOX' in TASKS.md: classifyAgent -> spawnTask -> rewrite the
line with the run id. Bridge-side scheduling: no model tokens burned on
empty checks, no bearer token embedded in cron prompts.
Manual trigger: POST /tiger/inbox/drain
- routes/chat-telegram.ts: GET /tiger/chat/telegram reads OpenClaw's native
telegram session transcript (JSONL) with cursor pagination and mtime
caching. Replaces the webhook/chat-mirror design, which could never work:
the bot is owned by OpenClaw's long-polling channel, and Telegram forbids
webhook + getUpdates on one token, so chat_messages never saw a row.
- index.ts: wire both + start inbox scheduler
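
For the inbox drain, picking the first unchecked item and rewriting its line might look like the sketch below. The function names and the rewritten line format ('- [~] ... (run <id>)') are hypothetical; the commit only says the line is rewritten with the run id:

```typescript
// Finds the first "- [ ]" item under the "## INBOX" heading, or null if none.
function findFirstInboxItem(markdown: string): { lineIndex: number; text: string } | null {
  const lines = markdown.split("\n");
  let inInbox = false;
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    // Any other level-2 heading ends the INBOX section; "###" subheadings do not.
    if (/^##\s/.test(line)) {
      inInbox = line.trim() === "## INBOX";
      continue;
    }
    const match = /^- \[ \] (.*)$/.exec(line);
    if (inInbox && match) return { lineIndex: i, text: match[1].trim() };
  }
  return null;
}

// Hypothetical rewrite format: change the checkbox so the item is not
// dispatched twice, and append the run id.
function markDispatched(markdown: string, lineIndex: number, runId: string): string {
  const lines = markdown.split("\n");
  lines[lineIndex] = lines[lineIndex].replace("- [ ]", "- [~]") + " (run " + runId + ")";
  return lines.join("\n");
}
```

Because the empty-inbox check is a plain string scan, it costs no model tokens when findFirstInboxItem returns null.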
- lib/agents.ts: canonical specialist registry (cody/ethan/cathy/elon),
legacy alias normalization (coder/researcher/writer/pm), personas,
documented upgrade path to true per-agent OpenClaw config
- spawn.ts: executes isolated OpenClaw sessions via docker exec with
temp-file message transport, tracks runs in the executions table,
serializes turns (MAX_CONCURRENT=1, RAM-constrained host), reports
completion to Telegram via /tiger/notify
- new: GET /runs, GET /runs/:id for dashboard status
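
Serializing turns with MAX_CONCURRENT=1 amounts to a one-at-a-time queue. A minimal sketch using a promise chain follows; SerialQueue is an illustrative name, and spawn.ts may implement the limit differently:

```typescript
// Runs async jobs strictly one at a time, in submission order.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  run<T>(job: () => Promise<T>): Promise<T> {
    // Start this job only after everything queued before it has settled.
    const result = this.tail.then(job);
    // Swallow the outcome on the internal chain so one failed job
    // does not block the jobs queued behind it.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}
```

Callers still receive each job's own result or rejection; only the internal chain ignores failures.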
OpenRouter credits ran dry and silently killed classifyAgent (task routing).
Non-Anthropic slugs now go to llm.manohargupta.com (own MiniMax/Anthropic
keys). New env: LLM_GATEWAY_URL, LLM_GATEWAY_KEY. Default router model:
minimax-3.
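
A sketch of the routing decision, written to fail loudly when the gateway is not configured rather than silently as the OpenRouter outage did. It assumes Anthropic models are recognizable by an 'anthropic/' slug prefix, which this log does not confirm; gatewayTargetFor and GatewayEnv are illustrative names:

```typescript
interface GatewayEnv {
  LLM_GATEWAY_URL?: string;
  LLM_GATEWAY_KEY?: string;
}

// Assumption: Anthropic slugs carry an "anthropic/" prefix.
function isAnthropicSlug(slug: string): boolean {
  return slug.toLowerCase().indexOf("anthropic/") === 0;
}

// Returns gateway connection details for non-Anthropic slugs,
// or null when the slug should stay on the existing Anthropic path.
function gatewayTargetFor(
  slug: string,
  env: GatewayEnv
): { baseUrl: string; apiKey: string } | null {
  if (isAnthropicSlug(slug)) return null;
  if (!env.LLM_GATEWAY_URL || !env.LLM_GATEWAY_KEY) {
    throw new Error("LLM_GATEWAY_URL and LLM_GATEWAY_KEY must be set to route " + slug);
  }
  return { baseUrl: env.LLM_GATEWAY_URL, apiKey: env.LLM_GATEWAY_KEY };
}
```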
- New /api/positions route proxies to angel.manohargupta.com (positions + pnl-history)
- Positions page: 4 stat cards (total/unrealised/realised/open count), open table, closed-today table
- Auto-refreshes every 30s; manual refresh triggers force-poll on tracker
- Add Positions nav item to app-sidebar
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove bridge/src/routes/chat.ts.pre-ws-migration (obsolete backup)
- Remove IDENTITY.md + SOUL.md from repo root (canonical copies live in
Docker named volume, not git — these were incorrectly tracked)
- Add scripts/smoke-test.sh: 11-check test suite for bridge + OpenClaw
Run after every deploy. All 11 checks passing on current build.
The UNTRACKED count used "git status --short | grep '^??' | wc -l | xargs".
When the working tree is clean, grep exits 1 (no matches found). Combined
with 'set -euo pipefail' at the top of the script, that exit 1 killed the
script mid-preflight.
Fix: use 'grep -c' with '|| true' fallback. grep -c prints 0 when nothing
matches but still exits 1; the fallback absorbs that exit code so set -e
does not abort the script.
deploy.sh:
Validated explicit-deploy workflow. Pre-flight checks (local build,
uncommitted changes, server reachability) run on Mac before touching
server. Code pushed to server via 'git push ssh://...' over the
existing SSH connection — no Mac SSH server required. Server does
git reset --hard to the pushed commit, reinstalls deps if
package.json changed, rebuilds dashboard, restarts services, verifies
health. Full troubleshooting guide in file header.
local-dev.sh:
Runs bridge (:3457) and dashboard (:3101) locally on Mac while
reaching Tiger via SSH. Separate ports + separate SQLite DB keep it
isolated from prod (still live on :3100/:3456). Hot-reload in both
layers. Clean Ctrl-C shutdown.
bridge remote mode:
Added TIGER_REMOTE=true support in bridge/src/tiger.ts and chat.ts.
When set, 'docker exec tiger-openclaw' calls are prefixed with
'ssh $TIGER_REMOTE_SSH'. Backward-compatible: VPS leaves TIGER_REMOTE
unset and runs docker locally as before.
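
The remote-mode prefixing can be sketched as a pure function that builds the argv. It assumes TIGER_REMOTE_SSH holds a single ssh destination such as user@host; the function name and the env-as-parameter shape are illustrative, not the code in tiger.ts:

```typescript
interface RemoteEnv {
  TIGER_REMOTE?: string;
  TIGER_REMOTE_SSH?: string; // assumed to be one ssh destination, e.g. user@host
}

// Builds the argv for running a command inside the tiger-openclaw container,
// locally by default and over ssh when TIGER_REMOTE=true.
function dockerExecArgv(containerArgs: string[], env: RemoteEnv): string[] {
  const local = ["docker", "exec", "tiger-openclaw", ...containerArgs];
  if (env.TIGER_REMOTE !== "true") return local; // VPS path: unchanged
  if (!env.TIGER_REMOTE_SSH) {
    throw new Error("TIGER_REMOTE=true requires TIGER_REMOTE_SSH");
  }
  return ["ssh", env.TIGER_REMOTE_SSH, ...local];
}
```

One caveat with this approach: ssh joins the remaining arguments into a single string for the remote shell, so arguments containing spaces or shell metacharacters need quoting before they are passed through.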
Workflow moving forward:
• Edit locally on Mac
• ./local-dev.sh to test against real Tiger
• git commit small + often
• ./deploy.sh to push to production
Chat history now survives hard refresh, tab close, and multi-device use.
Schema:
chat_messages(id, session_id, role, content, meta, created_at)
+ index on (session_id, created_at DESC)
Bridge endpoints:
POST /tiger/chat — unchanged externally, now persists
user + agent messages alongside the
existing LLM dispatch
GET /tiger/chat/history — ?sessionId=X&limit=200 → ordered messages
DELETE /tiger/chat/history — ?sessionId=X → wipe history
Dashboard:
/api/chat/history — proxy route, bridge token stays server-side
contexts/chat-context.tsx — ChatProvider hydrates messages from the
history endpoint on mount; clearChat()
now also hits DELETE /api/chat/history
Design: single-session model for now (DEFAULT_SESSION_ID constant matches
the openclaw agent --session-id used by the dispatch call). Multi-session
support would require session UI + session-aware routing — deferred to a
later feature sprint.
Tradeoff noted: message data is duplicated between our SQLite and whatever
state OpenClaw keeps internally. Chose duplication over coupling — if
OpenClaw session semantics change, dashboard history remains intact.
- Rebrand Tarzan → Tiger in layout metadata and header
- Bridge: point WORKSPACE_SYMLINK at the docker volume path post-migration
(/var/lib/docker/volumes/tiger_tiger-workspace/_data) — the old
/root/tiger-workspace symlink was orphaned after the April standalone
migration, so the workspace endpoint returned empty results.
- Bridge: read agents.defaults.model.primary for model info + expose all
configured availableModels so the dashboard card can show them.
- Dashboard: page.tsx renders currentModel/fallbackModels/availableModels.
- Chat streaming fix (client side):
* proper SSE buffering across TCP chunks (split on \n\n, keep tail in
buffer, {stream: true} decoder for multi-byte UTF-8)
* separate status vs chunk handlers — status no longer pollutes content
* fall back to data.content in done event if streamingRef is empty
* visible parse errors instead of silent catch
* plain-text rendering while streaming, ReactMarkdown only after done —
avoids per-token markdown reparse which was killing the typing feel
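
The buffering fix can be illustrated with a small parser that is fed raw network chunks. This is a sketch of the technique (split on a blank line, keep the tail, streaming decoder), not the dashboard's code; SseParser and SseEvent are illustrative names:

```typescript
interface SseEvent { event: string; data: string }

class SseParser {
  private decoder = new TextDecoder("utf-8");
  private buffer = "";

  // Feed one network chunk; returns every event completed by it.
  push(chunk: Uint8Array): SseEvent[] {
    // { stream: true } holds back a partial multi-byte character
    // until the rest of it arrives in a later chunk.
    this.buffer += this.decoder.decode(chunk, { stream: true });
    const frames = this.buffer.split("\n\n");
    // The last piece is an incomplete frame (or empty); keep it for next time.
    this.buffer = frames.pop() || "";
    const events: SseEvent[] = [];
    for (const frame of frames) {
      let event = "message"; // SSE default when no event: line is present
      const data: string[] = [];
      for (const line of frame.split("\n")) {
        if (line.indexOf("event:") === 0) event = line.slice(6).trim();
        else if (line.indexOf("data:") === 0) data.push(line.slice(5).replace(/^ /, ""));
      }
      if (data.length > 0) events.push({ event, data: data.join("\n") });
    }
    return events;
  }
}
```

Keeping the event name on each parsed event is what lets the caller route status and chunk events to separate handlers instead of appending both to the message content.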
Root causes:
1. tiger-bridge crash-looped for 36h on EADDRINUSE because a manual
nohup restart squatted on port 3456, so the systemd-managed tsx process
couldn't bind. Killing the squatter restored the expected tsx-src workflow.
2. ChunkLoadError on /chat: npm run build ran under a live next start
prod server, so the server's in-memory manifest no longer matched the
on-disk build. Fixed by disciplined build-then-restart.
3. Dashboard chat silently dropped responses: SSE 'status' event text
was being concatenated into the agent message content.
- Bridge: Express API server with SQLite (projects, tasks, executions, outputs)
- Dashboard: Next.js app rewired from WebSocket gateway to Tiger Bridge HTTP API
- Tasks: Kanban board with drag-drop, project management with CRUD
- Dispatch: Task dispatch to sandbox with file watcher for status updates
- UI: Container health panel, workspace browser, logs viewer, output viewer
Critical fixes:
- Use execInSandbox instead of execOnHost for container operations
- Watch symlink path instead of container-internal path
- URL-encoded params for GET requests instead of body
- PUT/DELETE support added to useBridgeRequest
Sprints 1-5 complete. Ready for VPS deployment.