Synced on 12/2/2026, 19:42:48

voice-and-image-rules

path://voice-and-image-rules.md

🎙️ Voice Message Transcription (VITAL)

CRITICAL rule for accuracy:

  • ALWAYS use the Faster-Whisper small model (2GB) for Spanish
  • NEVER use other models (base is less accurate)
  • User feedback is critical: if I transcribe badly, I respond badly

Script: ~/voice/scripts/transcribe.sh audio.mp3 [language]
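
A minimal sketch of calling the transcription script from a wrapper, so a missing audio file is caught before the script runs. The function name transcribe_voice and the TRANSCRIBE_SCRIPT variable are hypothetical, and the default language value "es" is an assumption: the note only says the second argument is an optional language.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the transcription script from the note.
# Assumption: the optional second argument accepts a code such as "es".

TRANSCRIBE_SCRIPT="${TRANSCRIBE_SCRIPT:-$HOME/voice/scripts/transcribe.sh}"

transcribe_voice() {
  local audio="$1" lang="${2:-es}"

  # Fail early with a clear message instead of passing a bad path along.
  if [ ! -f "$audio" ]; then
    echo "audio file not found: $audio" >&2
    return 1
  fi

  "$TRANSCRIBE_SCRIPT" "$audio" "$lang"
}
```

Usage would look like transcribe_voice audio.mp3, or transcribe_voice audio.mp3 en if the script accepts other language values.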

Voice Messages (TTS)

Always use one of the local scripts for voice messages:

  1. Option 1 (Default): /home/elias/deepgram-tts.sh (Deepgram - Aura-2 Agustina)
  2. Option 2: /home/elias/google-tts.sh (Edge TTS - Elvira Neural)

Workflow for delivery (CRITICAL):

  1. Generate the audio file in /tmp/.
  2. Send the file using the message tool with filePath.
  3. DELETE the local file immediately after sending to save disk space.
  4. Response to user: NO_REPLY (the audio is the reply).

NEVER save or archive voice messages. If the user asks for a past voice note, inform them that audio is ephemeral and not stored.
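
The delivery workflow above can be sketched roughly as follows. This is an illustration under assumptions, not the actual scripts: the note does not say which arguments the TTS scripts take, so the "text, then output file" order is assumed, and send_file is a hypothetical stand-in for the message tool call with filePath.

```shell
#!/usr/bin/env bash
# Sketch of the delivery workflow: generate in /tmp, send, delete.
# Assumptions (not stated in the note): the TTS script is called as
# "<script> <text> <output-file>", and send_file stands in for the
# message tool.

TTS_SCRIPT="${TTS_SCRIPT:-/home/elias/deepgram-tts.sh}"

# Hypothetical placeholder: replace with the real message-tool call.
send_file() {
  echo "sending $1"
}

deliver_voice() {
  local text="$1" workdir audio status=0

  # Step 1: generate the audio inside a private directory under /tmp.
  workdir="$(mktemp -d /tmp/voice-XXXXXX)" || return 1
  audio="$workdir/reply.mp3"

  # Step 2: send only if generation succeeded; keep the failing status.
  "$TTS_SCRIPT" "$text" "$audio" && send_file "$audio" || status=$?

  # Step 3: always delete the local audio, even when a step failed.
  rm -rf "$workdir"
  return "$status"
}
```

Step 4 (replying NO_REPLY) happens in the agent rather than in the script. Deleting the whole temporary directory, instead of a single file, is a design choice that keeps cleanup correct even if the TTS script leaves extra files behind.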

🖼️ Image Generation

ALWAYS use the nano-banana-pro skill when asked to create or generate images:

  • Skill: nano-banana-pro (bundled with OpenClaw)
  • API: Google Gemini 3 Pro Image
  • Binary required: uv (~/.local/bin/uv)
  • Script: generate_image.py in skill scripts folder

Usage:

cd ~/.npm-global/lib/node_modules/openclaw/skills/nano-banana-pro/scripts/
uv run generate_image.py --prompt "your prompt" --filename "name.png" --resolution 1K

Resolutions: 1K (default), 2K, 4K
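
A small wrapper can check the resolution before the image script is invoked. The sketch below uses only the flags shown in the usage above; the function name generate_image and the UV_BIN and SKILL_DIR variables are hypothetical, and rejecting values outside 1K, 2K and 4K is a choice made here, not documented behavior of the script.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around generate_image.py from the skill folder.
# Only the flags shown in the note are used; the validation is added here.

UV_BIN="${UV_BIN:-$HOME/.local/bin/uv}"
SKILL_DIR="${SKILL_DIR:-$HOME/.npm-global/lib/node_modules/openclaw/skills/nano-banana-pro/scripts}"

generate_image() {
  local prompt="$1" filename="$2" resolution="${3:-1K}"

  # Accept only the resolutions listed in the note.
  case "$resolution" in
    1K|2K|4K) ;;
    *)
      echo "unsupported resolution: $resolution (use 1K, 2K or 4K)" >&2
      return 2
      ;;
  esac

  # Run in a subshell so the caller's working directory is unchanged.
  (
    cd "$SKILL_DIR" && "$UV_BIN" run generate_image.py \
      --prompt "$prompt" --filename "$filename" --resolution "$resolution"
  )
}
```

For example, generate_image "a red fox in the snow" fox.png 2K would run the script at 2K, while omitting the third argument falls back to 1K.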
