Battle-tested prevention and recovery procedures from 30+ days of production OpenClaw use. What I broke, how I fixed it, how you avoid it.
Daily Prevention
Run these checks daily. Catch problems before they become outages.
openclaw gateway status — should show "running"
df -h ~/.openclaw — alert if >80% used
openclaw cron.
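The disk check above is easy to script so it can run unattended. The following is a minimal sketch, assuming a POSIX shell and `df -P`; the function names `disk_usage_pct` and `disk_over_threshold` are illustrative and not part of OpenClaw.

```shell
#!/bin/sh
# Hypothetical helper: print the used-space percentage (without the % sign)
# for the filesystem that holds the given path.
disk_usage_pct() {
    # df -P prints one POSIX-format line per filesystem;
    # column 5 is the capacity, e.g. "89%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed (exit 0) when usage is above the threshold.
# $1 = path, $2 = threshold in percent.
disk_over_threshold() {
    pct=$(disk_usage_pct "$1")
    [ -n "$pct" ] && [ "$pct" -gt "$2" ]
}
```

A daily job could then call something like `disk_over_threshold ~/.openclaw 80 && echo "disk above 80%"` and route the message to whatever alerting you already use.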
War Stories
From 30+ days of production use. Each problem includes the symptom, root cause, and exact fix.
Discord channel status: 🔴 Failed
Error: 401 Unauthorized
Discord bot tokens expire or get invalidated. This happens when you regenerate the token in the Discord Developer Portal, when the bot is kicked or banned, or when the token was copied incorrectly.
# 1. Go to Discord Developer Portal → Bot → Reset Token
# 2. Update config:
nano ~/.openclaw/openclaw.json
# 3. Restart:
openclaw gateway restart
Store tokens in a password manager. Don't commit tokens to git. Test bot connection after any token change.
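To test a token after changing it, you can ask Discord directly instead of waiting for the gateway to fail. This sketch assumes `curl` is installed and that the token is exported as `DISCORD_BOT_TOKEN` (a hypothetical variable name). It calls Discord's `/users/@me` endpoint and looks only at the HTTP status. The function names are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status code from Discord to a short verdict.
discord_token_verdict() {
    case "$1" in
        200) echo "valid" ;;
        401) echo "invalid-or-reset" ;;
        429) echo "rate-limited" ;;
        *)   echo "unknown($1)" ;;
    esac
}

# Ask Discord which user the token belongs to; only the status code is kept.
# Assumes the token is exported as DISCORD_BOT_TOKEN (hypothetical name).
check_discord_token() {
    status=$(curl -s -o /dev/null -w '%{http_code}' \
        -H "Authorization: Bot ${DISCORD_BOT_TOKEN}" \
        https://discord.com/api/v10/users/@me)
    discord_token_verdict "$status"
}
```

Reading the token from the environment keeps it out of your shell history and out of any script you might commit.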
$ openclaw models list
# SumoPod/BytePlus models missing
The config file says the provider exists, but the gateway didn't load it. The usual causes: the base URL is wrong, the API key is invalid, or the gateway needs a restart.
# 1. Test API:
curl -s https://ai.sumopod.com/v1/models
# 2. Restart:
openclaw gateway restart
# 3. Verify:
openclaw models list | grep sumopod
Always test the API with curl before adding it to the config. Restart the gateway after any provider config change.
❌ Knowledge Backup — Failed: directory is not a git repository
The Obsidian vault wasn't initialized as a git repo. The sync script tries to push to GitHub but fails.
cd ~/path/to/obsidian-vault
git init && git remote add origin https://github.com/youruser/your-vault.git
git add . && git commit -m "Initial commit"
git branch -M main && git push -u origin main
Initialize git on day one when setting up Obsidian. Test push/pull before relying on automated sync.
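A sync script can check these preconditions before it tries to push, which turns a failed backup into a clear message. This is a minimal sketch using standard git commands; `vault_sync_ready` is a hypothetical name, and the check for an `origin` remote is an assumption about how the sync is set up.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a vault sync script.
# Prints the first problem found and returns 1, or prints "ok" and
# returns 0 when the directory is a git work tree with an "origin" remote.
vault_sync_ready() {
    dir=$1
    if ! git -C "$dir" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
        echo "not a git repository"
        return 1
    fi
    if ! git -C "$dir" remote get-url origin >/dev/null 2>&1; then
        echo "no origin remote"
        return 1
    fi
    echo "ok"
}
```

Calling `vault_sync_ready ~/path/to/obsidian-vault` at the top of the sync script lets it stop early with a readable reason.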
# You edit openclaw.json, but behavior doesn't change
OpenClaw reads config on startup. Editing the file doesn't automatically reload it.
# Option A:
openclaw gateway restart
# Option B:
kill -USR1 $(pgrep -f openclaw)
Get in the habit: edit config → restart gateway → test. Use openclaw config.validate before restarting.
Bot responds in DMs ✅
Bot ignores group messages ❌
Telegram has "privacy mode" for bots. By default, bots can't read all group messages — only messages that mention them.
# Option A: Set requireMention: true in config
# Option B: @BotFather → /setprivacy → Disable
Set requireMention: true from the start. Document expected behavior for users.
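You can confirm which privacy setting Telegram has on record for the bot. The Bot API's `getMe` method returns a `can_read_all_group_messages` field that is true when privacy mode is disabled. This sketch assumes `curl` and a token exported as `TELEGRAM_BOT_TOKEN` (a hypothetical variable name). The function names are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: read a getMe JSON response on stdin and print
# "disabled" when privacy mode is off (the bot can read all group
# messages), otherwise "enabled".
privacy_mode_state() {
    if grep -Eq '"can_read_all_group_messages"[[:space:]]*:[[:space:]]*true'; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

# Assumes the token is exported as TELEGRAM_BOT_TOKEN (hypothetical name).
check_telegram_privacy() {
    curl -s "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getMe" | privacy_mode_state
}
```

If the output does not match what you set in @BotFather, removing the bot from the group and adding it again is a common way to make the change take effect.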
Error: HTTP 429 Too Many Requests
Fallback chain exhausted
You hit the API provider's rate limit. Gemini: 60 req/min, Claude: 50 req/min, OpenAI: varies.
# Switch primary model temporarily
"primary": "bailian/glm-4.7"
openclaw gateway restart
Always configure 2-3 fallback models. Monitor rate limit headers. Consider a paid tier if you hit the limit consistently.
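For your own scripts that call a provider directly, a generic client-side technique is to retry with exponential backoff when the API answers 429. The sketch below is not something OpenClaw is documented to do here. The function names, the default base delay of 1 second, and the 60 second cap are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: seconds to wait before retry N (0-based).
# The delay doubles from a base value and is capped at a maximum.
# $1 = attempt, $2 = base delay (default 1), $3 = cap (default 60).
backoff_delay() {
    attempt=$1
    base=${2:-1}
    cap=${3:-60}
    delay=$base
    i=0
    while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
        delay=$((delay * 2))
        i=$((i + 1))
    done
    if [ "$delay" -gt "$cap" ]; then
        delay=$cap
    fi
    echo "$delay"
}

# Request a URL, retrying while it answers 429, up to max_tries attempts.
# Prints the final HTTP status; returns 1 if every attempt was rate limited.
fetch_with_backoff() {
    url=$1
    max_tries=${2:-5}
    try=0
    while [ "$try" -lt "$max_tries" ]; do
        status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
        if [ "$status" != "429" ]; then
            echo "$status"
            return 0
        fi
        sleep "$(backoff_delay "$try")"
        try=$((try + 1))
    done
    echo "429"
    return 1
}
```

With the defaults, successive waits are 1, 2, 4, 8, and 16 seconds. If the provider sends a Retry-After header, honoring that value is usually better than a fixed schedule.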
$ df -h ~/.openclaw
/dev/sda1 50G 42G 5G 89% ⚠️
Session logs, memory indexes, and config backups grow over time. Without cleanup, they fill the disk.
# Clean old logs:
rm -rf ~/.openclaw/logs/*.log.*
# Or compress them instead of deleting:
gzip ~/.openclaw/logs/*.log.*
# Prune sessions older than 7 days:
find ~/.openclaw/sessions -mtime +7 -delete
Set up a log rotation cron job. Monitor disk usage. Store large files outside ~/.openclaw.
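The cleanup commands can be wrapped in a small script and scheduled with cron. This is a sketch under assumptions: rotated logs are named `*.log.*`, and your `find` supports `-delete` (GNU and BSD versions do). The function names and retention periods are illustrative, not OpenClaw defaults.

```shell
#!/bin/sh
# Hypothetical helper: gzip rotated logs (*.log.*) in directory $1 that were
# last modified more than $2 days ago, skipping files already compressed.
compress_old_logs() {
    find "$1" -type f -name '*.log.*' ! -name '*.gz' -mtime +"$2" -exec gzip {} +
}

# Hypothetical helper: delete regular files under directory $1 that were
# last modified more than $2 days ago. Directories are left in place.
prune_old_files() {
    find "$1" -type f -mtime +"$2" -delete
}
```

A nightly script could call `compress_old_logs ~/.openclaw/logs 1` and `prune_old_files ~/.openclaw/sessions 7`. Before scheduling it, replace `-delete` with `-print` once to see what would be removed.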
Error: Invalid JSON in config file
You edited openclaw.json and introduced a syntax error (missing comma, unclosed brace, trailing comma).
# 1. Validate:
openclaw config.validate
# 2. Restore backup:
cp ~/.openclaw/openclaw.json.backup ~/.openclaw/openclaw.json
# 3. Restart:
openclaw gateway start
Always run openclaw config.validate after editing. Keep automatic backups enabled.
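If you want a manual safety net in addition to the built-in validation, you can refresh the backup only while the config still parses. This sketch assumes `python3` is installed and uses its standard `json.tool` module as a strict JSON syntax check. It may be stricter than OpenClaw's own validator, and it does not check OpenClaw-specific keys. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file parses as strict JSON.
json_is_valid() {
    python3 -m json.tool "$1" >/dev/null 2>&1
}

# Hypothetical helper: copy the config to <file>.backup, but only when the
# current file is valid, so a broken edit never overwrites a good backup.
backup_if_valid() {
    json_is_valid "$1" && cp "$1" "$1.backup"
}
```

Running `backup_if_valid ~/.openclaw/openclaw.json` before each edit means the restore step in the fix above always has a known-good file to copy back.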
When Things Break
Something went wrong. Don't panic. Follow the playbook.
journalctl -u openclaw -n 50
openclaw config.validate
cp ~/.openclaw/openclaw.json.backup ~/.openclaw/openclaw.json
openclaw gateway restart
This is a condensed version. The full guide includes weekly maintenance, backup strategies, and 15+ recovery playbooks.