Synced on 12/2/2026, 19:42:48
voice-and-image-rules
path://voice-and-image-rules.md
🎙️ Voice Message Transcription (VITAL)
CRITICAL rule for accuracy:
- ALWAYS use the small model (2GB) of Faster-Whisper for Spanish
- NEVER use other models (base is less accurate)
- User feedback is critical: if I transcribe incorrectly, I respond incorrectly
Script: ~/voice/scripts/transcribe.sh audio.mp3 [language]
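A minimal sketch of calling the transcription script with an explicit language. The `transcribe_voice` wrapper, the `TRANSCRIBE` variable, and the `es` language code are assumptions for illustration; the note only documents the script path and an optional language argument, and model selection is assumed to happen inside the script itself.

```shell
# Sketch only: wrapper name, TRANSCRIBE variable, and the "es" default are assumptions.
TRANSCRIBE="${TRANSCRIBE:-$HOME/voice/scripts/transcribe.sh}"

transcribe_voice() {
  local audio="$1" lang="${2:-es}"
  # Fail early if the audio file is missing instead of passing a bad path along.
  if [ ! -f "$audio" ]; then
    echo "audio file not found: $audio" >&2
    return 1
  fi
  "$TRANSCRIBE" "$audio" "$lang"
}
```

Usage, under the same assumptions: `transcribe_voice audio.mp3` for Spanish, or `transcribe_voice audio.mp3 en` to pass a different language.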
Voice Messages (TTS)
Always use the local script for voice messages:
- Option 1 (Default): /home/elias/deepgram-tts.sh (Deepgram - Aura-2 Agustina)
- Option 2: /home/elias/google-tts.sh (Edge TTS - Elvira Neural)
Workflow for delivery (CRITICAL):
- Generate the audio file in /tmp/.
- Send the file using the message tool with filePath.
- DELETE the local file immediately after sending to save disk space.
- Response to user: NO_REPLY (the audio is the reply).
NEVER save or archive voice messages. If the user asks for a past voice note, inform them that audio is ephemeral and not stored.
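The delivery workflow above could be sketched as follows. This is an illustration under stated assumptions, not the actual tooling: the argument order of the TTS script (text, then output path) is not documented in this note, and `send_file` is a hypothetical stand-in for the message tool called with filePath.

```shell
# Sketch of the delivery workflow: generate in /tmp/, send, delete, reply NO_REPLY.
TTS_SCRIPT="${TTS_SCRIPT:-/home/elias/deepgram-tts.sh}"

# Assumption: the TTS script takes the text and an output path, in that order.
tts_generate() {
  "$TTS_SCRIPT" "$1" "$2"
}

# Hypothetical stand-in for the message tool called with filePath.
send_file() {
  echo "message filePath=$1" >&2
}

deliver_voice() {
  local text="$1" out status=0
  out="$(mktemp /tmp/voice-XXXXXX)" || return 1
  # Stop at the first failing step but remember its exit status.
  { tts_generate "$text" "$out" && send_file "$out"; } || status=$?
  # Always delete the local file, even when generation or sending failed.
  rm -f -- "$out"
  if [ "$status" -eq 0 ]; then
    echo "NO_REPLY"
  fi
  return "$status"
}
```

Deleting the file on every path, not only after a successful send, keeps failed attempts from leaving audio behind in /tmp/.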
🖼️ Image Generation
ALWAYS use the nano-banana-pro skill when asked to create/generate images:
- Skill: nano-banana-pro (bundled with OpenClaw)
- API: Google Gemini 3 Pro Image
- Binary required: uv (~/.local/bin/uv)
- Script: generate_image.py in the skill's scripts folder
Usage:
cd ~/.npm-global/lib/node_modules/openclaw/skills/nano-banana-pro/scripts/
uv run generate_image.py --prompt "your prompt" --filename "name.png" --resolution 1K
Resolutions: 1K (default), 2K, 4K
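One way to wrap the usage above so that only the listed resolutions are accepted. The function names `valid_resolution` and `generate_image` and the `SKILL_SCRIPTS` variable are hypothetical; the flags are the ones shown in the usage lines, and this sketch assumes `uv` is on the PATH.

```shell
# Sketch only: function names and SKILL_SCRIPTS are assumptions for illustration.
SKILL_SCRIPTS="${SKILL_SCRIPTS:-$HOME/.npm-global/lib/node_modules/openclaw/skills/nano-banana-pro/scripts}"

# Return 0 only for the resolutions this note lists.
valid_resolution() {
  case "$1" in
    1K|2K|4K) return 0 ;;
    *) return 1 ;;
  esac
}

generate_image() {
  local prompt="$1" filename="$2" resolution="${3:-1K}"
  if ! valid_resolution "$resolution"; then
    echo "unsupported resolution: $resolution (use 1K, 2K, or 4K)" >&2
    return 1
  fi
  # The subshell keeps the caller's working directory unchanged.
  ( cd "$SKILL_SCRIPTS" && uv run generate_image.py \
      --prompt "$prompt" --filename "$filename" --resolution "$resolution" )
}
```

Called as `generate_image "your prompt" "name.png"`, this falls back to 1K, matching the default stated above; an unlisted value such as 8K is rejected before `uv` is invoked.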