Transcribe audio files via OpenRouter using audio-capable models: Skill Usage Guide

2026-03-29 | Source: 网淘吧 | Views: 149

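OpenRouter Audio Transcription

Transcribe audio files through OpenRouter's chat completions API using the input_audio content type. Works with any model that accepts audio input.

Quick Start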

{baseDir}/scripts/transcribe.sh /path/to/audio.m4a

The transcript is written to standard output.

Useful Flags

# Custom model (default: google/gemini-2.5-flash)
{baseDir}/scripts/transcribe.sh audio.ogg --model openai/gpt-4o-audio-preview

# Custom instructions
{baseDir}/scripts/transcribe.sh audio.m4a --prompt "Transcribe with speaker labels"

# Save to file
{baseDir}/scripts/transcribe.sh audio.m4a --out /tmp/transcript.txt

# Custom caller identifier (for OpenRouter dashboard)
{baseDir}/scripts/transcribe.sh audio.m4a --title "MyApp"

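How It Works

Converts the audio to WAV (mono, 16 kHz) with ffmpeg
Base64-encodes the audio
Sends it to the OpenRouter chat completions endpoint as input_audio content
Extracts the transcript text from the response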
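
The first two steps can be sketched as follows. This is a minimal illustration, not the contents of transcribe.sh: the function names to_wav and encode_audio are hypothetical, and only standard ffmpeg options (-ac for channel count, -ar for sample rate) are used.

```shell
# Sketch only: to_wav and encode_audio are hypothetical names, not taken from transcribe.sh.

# Step 1: convert any input file ($1) to a mono, 16 kHz WAV file ($2).
to_wav() {
  # -ac 1 = one audio channel, -ar 16000 = 16 kHz sample rate, -y = overwrite output
  ffmpeg -loglevel error -y -i "$1" -ac 1 -ar 16000 "$2"
}

# Step 2: Base64-encode a file and strip line wrapping so the result
# fits in a single JSON string. Reading from stdin sidesteps the
# differing file-argument syntax of GNU and BSD base64.
encode_audio() {
  base64 < "$1" | tr -d '\n'
}
```

A typical call sequence would be to_wav input.m4a audio.wav followed by encode_audio audio.wav > audio.b64, keeping the encoded data in a file rather than in a shell variable.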

API Key

Set the OPENROUTER_API_KEY environment variable, or configure the key in ~/.clawdbot/clawdbot.json:

{
  skills: {
    "openrouter-transcribe": {
      apiKey: "YOUR_OPENROUTER_KEY"
    }
  }
}

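Request Headers

The script sends identification headers to OpenRouter:

X-Title: caller name (default: "Peanut/Clawdbot")
HTTP-Referer: referrer URL (default: "https://clawdbot.com")

These values appear in your OpenRouter dashboard for tracking.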
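
As a rough illustration of how these defaults might be applied, the sketch below prints the two header lines and falls back to the documented defaults when no value is given. identity_headers is a hypothetical helper and its argument order is an assumption; the real script may build the headers differently.

```shell
# Sketch only: identity_headers is a hypothetical helper, not a function from transcribe.sh.
# $1 = caller name (optional), $2 = referrer URL (optional)
identity_headers() {
  printf 'X-Title: %s\nHTTP-Referer: %s\n' \
    "${1:-Peanut/Clawdbot}" \
    "${2:-https://clawdbot.com}"
}
```

The output can be saved with identity_headers "MyApp" > headers.txt and passed to curl as -H @headers.txt, which curl 7.55.0 and newer read one header per line.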

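Troubleshooting

ffmpeg format error: The script uses a temporary directory (rather than mktemp -t file.wav) because macOS mktemp appends a random suffix after the extension, which breaks ffmpeg's format detection.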
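
The temporary-directory approach can be sketched like this. The helper name make_wav_path and the file name audio.wav are assumptions made for illustration; the point is only that the random part of the name goes on the directory, so the file path still ends in .wav.

```shell
# Sketch only: make_wav_path is a hypothetical helper, not code from transcribe.sh.
make_wav_path() {
  # The directory gets the random name; the file inside keeps a clean ".wav" extension.
  # The fallback form covers mktemp versions that require -t with a prefix.
  tmp_dir=$(mktemp -d 2>/dev/null || mktemp -d -t transcribe) || return 1
  printf '%s/audio.wav\n' "$tmp_dir"
}

wav_path=$(make_wav_path) || exit 1
# Remove the whole temporary directory when the script exits.
trap 'rm -rf "${wav_path%/*}"' EXIT
```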

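Argument list too long: Large audio files produce enormous Base64 strings that exceed the shell's argument-length limit. The script writes the data to temporary files (--rawfile for jq, @file for curl) instead of passing it as arguments.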
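
A minimal sketch of that file-based approach is shown below, assuming jq 1.6 or newer (where --rawfile is available). build_payload and send_request are hypothetical helpers, and the exact request body the script builds is not shown in this guide; the shape here follows the input_audio content type described above.

```shell
# Sketch only: build_payload and send_request are hypothetical helpers.

# $1 = file holding the Base64 audio, $2 = model, $3 = prompt, $4 = output JSON file
build_payload() {
  # --rawfile loads the file contents into $audio, so the Base64 text
  # never appears on a command line.
  jq -n --rawfile audio "$1" --arg model "$2" --arg prompt "$3" '
    {
      model: $model,
      messages: [{
        role: "user",
        content: [
          {type: "text", text: $prompt},
          {type: "input_audio",
           input_audio: {data: ($audio | rtrimstr("\n")), format: "wav"}}
        ]
      }]
    }' > "$4"
}

# $1 = JSON body file; the @ prefix makes curl read the body from disk.
send_request() {
  curl -sS https://openrouter.ai/api/v1/chat/completions \
    -H "Authorization: Bearer $OPENROUTER_API_KEY" \
    -H "Content-Type: application/json" \
    --data-binary @"$1"
}
```

Because only file paths are passed as arguments, the size of the audio no longer affects the length of any command line.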

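Empty response: If you see "Empty response from API", the script dumps the raw response for debugging. Common causes:

Invalid API key
The model does not support audio input
The audio file is too large or corrupted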
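
The extract-or-dump behaviour can be approximated as below. extract_transcript is a hypothetical helper and the message wording is illustrative; it assumes the response was saved to a file and follows the usual chat completions shape, with the text at choices[0].message.content.

```shell
# Sketch only: extract_transcript is a hypothetical helper, not code from transcribe.sh.
# $1 = file containing the raw API response
extract_transcript() {
  # "// empty" yields nothing when the field is missing or null;
  # a non-JSON body makes jq fail, which also leaves text empty.
  text=$(jq -r '.choices[0].message.content // empty' "$1" 2>/dev/null)
  if [ -z "$text" ]; then
    echo "Empty response from API. Raw response:" >&2
    cat "$1" >&2
    return 1
  fi
  printf '%s\n' "$text"
}
```

When the dump shows a JSON body with an error field, its message usually points to one of the causes listed above.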
