AudioPod Skill Usage Guide

2026-03-29 Source: 网淘吧

AudioPod AI

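A complete audio processing API: music generation, stem separation, text-to-speech, noise reduction, transcription, speaker diarization, and wallet management.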

Setup

pip install audiopod  # Python
npm install audiopod  # Node.js

Authentication: set the AUDIOPOD_API_KEY environment variable, or pass the key to the client constructor.

Getting an API key

  1. Sign up at https://audiopod.ai/auth/signup (free, no credit card required)
  2. Go to https://www.audiopod.ai/dashboard/account/api-keys
  3. Click "Create API Key" and copy the key (it starts with ap_)
  4. Top up your wallet at https://www.audiopod.ai/dashboard/account/wallet (pay as you go, no subscription)

from audiopod import AudioPod
client = AudioPod()  # uses AUDIOPOD_API_KEY env var
# or: client = AudioPod(api_key="ap_...")
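
The key lookup described above can be checked before a client is constructed. This is a minimal sketch: resolve_api_key is a hypothetical helper, not part of the AudioPod SDK, and the prefix check relies only on the ap_ key format mentioned in the signup steps.

```python
import os


def resolve_api_key(explicit_key=None, env=None):
    """Return the key from an explicit argument or AUDIOPOD_API_KEY.

    Hypothetical helper for illustration; the SDK does its own lookup.
    """
    env = os.environ if env is None else env
    key = explicit_key or env.get("AUDIOPOD_API_KEY", "")
    if not key:
        raise ValueError("Set AUDIOPOD_API_KEY or pass api_key explicitly")
    if not key.startswith("ap_"):
        raise ValueError("AudioPod API keys are expected to start with 'ap_'")
    return key
```

Failing early with a clear message is usually easier to debug than an authentication error returned by the first API call.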

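AI Music Generation

Generate songs, rap, instrumentals, samples, and vocals from text prompts.

Tasks: text2music (song with vocals), text2rap (rap), prompt2instrumental (instrumental), lyric2vocals (vocals only), text2samples (loops/samples), audio-to-audio (style transfer), and song bloom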

Python SDK

# Generate a full song with lyrics
result = client.music.song(
    prompt="Upbeat pop, synth, drums, 120 bpm, female vocals, radio-ready",
    lyrics="Verse 1:\nWalking down the street on a sunny day\n\nChorus:\nWe're on fire tonight!",
    duration=60
)
print(result["output_url"])

# Generate rap
result = client.music.rap(
    prompt="Lo-Fi Hip Hop, 100 BPM, male rap, melancholy, keyboard chords",
    lyrics="Verse 1:\nStarted from the bottom, now we climbing...",
    duration=60
)

# Generate instrumental (no lyrics needed)
result = client.music.instrumental(
    prompt="Atmospheric ambient soundscape, uplifting, driving mood",
    duration=30
)

# Generic generate with explicit task
result = client.music.generate(
    prompt="Electronic dance music, high energy",
    task="text2samples",  # any task type
    duration=30
)

# Async: submit then poll
job = client.music.create(
    prompt="Chill lofi beat", 
    duration=30, 
    task="prompt2instrumental"
)
result = client.music.wait_for_completion(job["id"], timeout=600)

# Get available genre presets
presets = client.music.get_presets()

# List/manage jobs
jobs = client.music.list(skip=0, limit=50)
job = client.music.get(job_id=123)
client.music.delete(job_id=123)

cURL

# Song with lyrics
curl -X POST "https://api.audiopod.ai/api/v1/music/text2music" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"upbeat pop, synth, 120bpm, female vocals", "lyrics":"Walking down the street...", "audio_duration":60}'

# Rap
curl -X POST "https://api.audiopod.ai/api/v1/music/text2rap" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Lo-Fi Hip Hop, male rap, 100 BPM", "lyrics":"Started from the bottom...", "audio_duration":60}'

# Instrumental
curl -X POST "https://api.audiopod.ai/api/v1/music/prompt2instrumental" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"ambient soundscape, uplifting", "audio_duration":30}'

# Samples/loops
curl -X POST "https://api.audiopod.ai/api/v1/music/text2samples" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"drum loop, sad mood", "audio_duration":15}'

# Vocals only
curl -X POST "https://api.audiopod.ai/api/v1/music/lyric2vocals" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"clean vocals, happy", "lyrics":"Eternal chorus of unity...", "audio_duration":30}'

# Check job status / get result
curl "https://api.audiopod.ai/api/v1/music/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Get genre presets
curl "https://api.audiopod.ai/api/v1/music/presets" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List jobs
curl "https://api.audiopod.ai/api/v1/music/jobs?skip=0&limit=50" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Delete job
curl -X DELETE "https://api.audiopod.ai/api/v1/music/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

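Parameters

Field | Required | Description
prompt | - | Style/genre description
lyrics | for song/rap/vocals | Song lyrics with verse/chorus structure
audio_duration | - | Duration in seconds (default: 30)
genre preset | - | Genre preset name (from the presets endpoint)
display name | - | Display name for the track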
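
The parameter rules above can be applied before a request is sent. This is a sketch under stated assumptions: build_music_request is a hypothetical helper, it treats prompt as always required (the table does not say so explicitly), and it uses only the prompt, lyrics, and audio_duration fields that appear in the cURL examples.

```python
import json

# Tasks that take lyrics according to the parameter table (song / rap / vocals).
TASKS_NEEDING_LYRICS = {"text2music", "text2rap", "lyric2vocals"}


def build_music_request(task, prompt, lyrics=None, audio_duration=30):
    """Return (path, json_body) for a music generation request."""
    if not prompt:
        raise ValueError("prompt is required (assumption of this sketch)")
    if task in TASKS_NEEDING_LYRICS and not lyrics:
        raise ValueError(f"task {task!r} needs lyrics")
    body = {"prompt": prompt, "audio_duration": audio_duration}
    if lyrics:
        body["lyrics"] = lyrics
    return f"/api/v1/music/{task}", json.dumps(body)
```

The returned path and body correspond to the URL suffix and the -d payload in the cURL examples.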

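Stem Separation

Separate audio into individual instrument and vocal stems.

Modes

Mode | Stems | Output | Use case
single | 1 | Specified stem only | Vocal isolation, drum extraction
two | 2 | Vocals + instrumental | Karaoke tracks
four | 4 | Vocals, drums, bass, other | Standard remixing (default)
six | 6 | + guitar, piano | Full instrument separation
producer | 8 | + kick, snare, hi-hat | Beat production
studio | 12 | + cymbals, sub-bass, synths | Professional mixing
mastering | 16 | Maximum detail | Stem analysis

Single-stem options: vocals, drums, bass, guitar, piano, other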
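
Picking a mode from the list of stems you need can be automated. The sketch below covers only the single, four, and six modes, since those are the mode values used in the examples. choose_mode is a hypothetical helper, and the stem names are assumed to match the keys shown in the responses.

```python
# Stem sets per mode, transcribed from the mode table (four and six only).
MODE_STEMS = [
    ("four", {"vocals", "drums", "bass", "other"}),
    ("six", {"vocals", "drums", "bass", "other", "guitar", "piano"}),
]
SINGLE_STEM_OPTIONS = {"vocals", "drums", "bass", "guitar", "piano", "other"}


def choose_mode(wanted):
    """Return form fields selecting the smallest mode that covers `wanted`."""
    wanted = set(wanted)
    if len(wanted) == 1 and wanted <= SINGLE_STEM_OPTIONS:
        return {"mode": "single", "stem": next(iter(wanted))}
    for mode, stems in MODE_STEMS:
        if wanted <= stems:
            return {"mode": mode}
    raise ValueError(f"no mode in this sketch covers {sorted(wanted)}")
```

The returned dictionary maps directly onto the mode and stem form fields of the extract request.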

Python SDK

# Sync: extract and wait for result
result = client.stems.separate(
    url="https://youtube.com/watch?v=VIDEO_ID",
    mode="six",
    timeout=600
)
for stem, url in result["download_urls"].items():
    print(f"{stem}: {url}")

# From local file
result = client.stems.separate(file="/path/to/song.mp3", mode="four")

# Single stem extraction
result = client.stems.separate(
    url="https://youtube.com/watch?v=ID",
    mode="single",
    stem="vocals"
)

# Async: submit then poll
job = client.stems.extract(url="https://youtube.com/watch?v=ID", mode="six")
print(f"Job ID: {job['id']}")
status = client.stems.status(job["id"])
# or wait:
result = client.stems.wait_for_completion(job["id"], timeout=600)

# List available modes
modes = client.stems.modes()

# Job management
jobs = client.stems.list(skip=0, limit=50, status="COMPLETED")
job = client.stems.get(job_id=1234)
client.stems.delete(job_id=1234)

cURL

# Extract from URL
curl -X POST "https://api.audiopod.ai/api/v1/stem-extraction/api/extract" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "url=https://youtube.com/watch?v=VIDEO_ID" \
  -F "mode=six"

# Extract from file
curl -X POST "https://api.audiopod.ai/api/v1/stem-extraction/api/extract" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "file=@/path/to/song.mp3" \
  -F "mode=four"

# Single stem
curl -X POST "https://api.audiopod.ai/api/v1/stem-extraction/api/extract" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "url=URL" \
  -F "mode=single" \
  -F "stem=vocals"

# Check job status
curl "https://api.audiopod.ai/api/v1/stem-extraction/status/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List available modes
curl "https://api.audiopod.ai/api/v1/stem-extraction/modes" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List jobs (filter by status: PENDING, PROCESSING, COMPLETED, FAILED)
curl "https://api.audiopod.ai/api/v1/stem-extraction/jobs?skip=0&limit=50&status=COMPLETED" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Get specific job
curl "https://api.audiopod.ai/api/v1/stem-extraction/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Delete job
curl -X DELETE "https://api.audiopod.ai/api/v1/stem-extraction/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

Response format

{
  "id": 1234,
  "status": "COMPLETED",
  "download_urls": {
    "vocals": "https://...",
    "drums": "https://...",
    "bass": "https://...",
    "other": "https://..."
  },
  "quality_scores": {
    "vocals": 0.95,
    "drums": 0.88
  }
}
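
A completed job in this shape can be flattened into a ranked list. This is a minimal sketch: summarize_stems is a hypothetical helper, and reading a higher quality_scores value as a better separation is an assumption, since the source does not define the scale.

```python
import json

SAMPLE_RESPONSE = """
{
  "id": 1234,
  "status": "COMPLETED",
  "download_urls": {
    "vocals": "https://...",
    "drums": "https://...",
    "bass": "https://...",
    "other": "https://..."
  },
  "quality_scores": {"vocals": 0.95, "drums": 0.88}
}
"""


def summarize_stems(job):
    """Return (stem, url, score) tuples: scored stems first, highest score first."""
    if job.get("status") != "COMPLETED":
        raise RuntimeError(f"job {job.get('id')} is not completed")
    scores = job.get("quality_scores", {})
    rows = [(stem, url, scores.get(stem))
            for stem, url in job["download_urls"].items()]
    # Stems without a score sort last, alphabetically.
    return sorted(rows, key=lambda r: (r[2] is None, -(r[2] or 0.0), r[0]))


for stem, url, score in summarize_stems(json.loads(SAMPLE_RESPONSE)):
    print(stem, score, url)
```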

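Text-to-Speech

Convert text to speech with 50+ voices in 60+ languages. Voice cloning is supported.

Voice types

  • 50+ production-ready voices: 60+ languages supported, with automatic language detection
  • Custom clones: clone any voice from an audio sample of about 5 seconds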

Python SDK

# Generate speech and wait for result
result = client.voice.generate(
    text="Hello, world! This is a test.",
    voice_id=123,
    speed=1.0
)
print(result["output_url"])

# Async: submit then poll
job = client.voice.speak(
    text="Hello world",
    voice_id=123,
    speed=1.0
)
status = client.voice.get_job(job["id"])
result = client.voice.wait_for_completion(job["id"], timeout=300)

# List all available voices
voices = client.voice.list()
for v in voices:
    print(f"{v['id']}: {v['name']}")

# Clone a voice (needs ~5 sec audio sample)
new_voice = client.voice.create(
    name="My Voice Clone",
    audio_file="./sample.mp3",
    description="Cloned from recording"
)

# Get/delete voice
voice = client.voice.get(voice_id=123)
client.voice.delete(voice_id=123)

cURL (raw HTTP, most reliable)

# List all voices
curl "https://api.audiopod.ai/api/v1/voice/voice-profiles" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Generate speech (FORM DATA, not JSON!)
curl -X POST "https://api.audiopod.ai/api/v1/voice/voices/{VOICE_UUID}/generate" \
  -H "Authorization: Bearer $AUDIOPOD_API_KEY" \
  -d "input_text=Hello world, this is a test" \
  -d "audio_format=mp3" \
  -d "speed=1.0"

# Poll job status
curl "https://api.audiopod.ai/api/v1/voice/tts-jobs/{JOB_ID}/status" \
  -H "Authorization: Bearer $AUDIOPOD_API_KEY"

# SDK-style endpoints (alternative)
# Generate via SDK endpoint
curl -X POST "https://api.audiopod.ai/api/v1/voice/tts/generate" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello world","voice_id":123,"speed":1.0}'

# Poll via SDK endpoint
curl "https://api.audiopod.ai/api/v1/voice/tts/status/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List voices (SDK endpoint)
curl "https://api.audiopod.ai/api/v1/voice/voices" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Clone a voice
curl -X POST "https://api.audiopod.ai/api/v1/voice/voices" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "name=My Voice" \
  -F "file=@sample.mp3" \
  -F "description=Cloned voice"

# Delete voice
curl -X DELETE "https://api.audiopod.ai/api/v1/voice/voices/VOICE_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

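Generation parameters

Field | Required | Description
input_text | - | Text to speak (max 5000 characters). Raw HTTP requests use input_text; the SDK uses text
audio_format | - | mp3, wav, ogg (default: mp3)
speed | - | 0.25 - 4.0 (default: 1.0)
language | no | ISO code; auto-detected if omitted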
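
The limits in this table and the two request encodings can be expressed as small helpers. This is a sketch, not SDK code: validate_tts, raw_http_body, and sdk_endpoint_body are hypothetical names, and the field names are taken from the cURL examples.

```python
import json
from urllib.parse import urlencode

ALLOWED_FORMATS = {"mp3", "wav", "ogg"}


def validate_tts(text, speed=1.0, audio_format="mp3"):
    """Check the documented limits: 5000 chars, speed 0.25-4.0, known format."""
    if not text or len(text) > 5000:
        raise ValueError("text must be 1 to 5000 characters")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"audio_format must be one of {sorted(ALLOWED_FORMATS)}")


def raw_http_body(text, speed=1.0, audio_format="mp3"):
    """Form-encoded body for the raw generate endpoint (field: input_text)."""
    validate_tts(text, speed, audio_format)
    return urlencode(
        {"input_text": text, "audio_format": audio_format, "speed": speed}
    )


def sdk_endpoint_body(text, voice_id, speed=1.0):
    """JSON body for /api/v1/voice/tts/generate (field: text)."""
    validate_tts(text, speed)
    return json.dumps({"text": text, "voice_id": voice_id, "speed": speed})
```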

Response format

// Generate response
{"job_id": 12345, "status": "pending", "credits_reserved": 25}

// Status response (completed)
{"status": "completed", "output_url": "https://r2-url/generated.mp3"}
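
The pending and completed responses above suggest a simple polling loop. The sketch below keeps the HTTP call out of the loop by taking a fetch_status callable. wait_for_job is a hypothetical helper, and treating a "failed" status as terminal is an assumption borrowed from the job statuses listed for stem separation.

```python
import time


def wait_for_job(fetch_status, job_id, timeout=300.0, interval=2.0,
                 sleep=time.sleep):
    """Poll fetch_status(job_id) until the job completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        state = str(status.get("status", "")).lower()
        if state == "completed":
            return status
        if state == "failed":  # assumed terminal state
            raise RuntimeError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout}s")
        sleep(interval)
```

In practice fetch_status would wrap the status request shown in the cURL examples; injecting it keeps the loop easy to test without network access.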

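Important notes

  • The raw HTTP generate endpoint takes form data, not JSON, and the field is input_text, not text
  • The SDK endpoint (/api/v1/voice/tts/generate) takes JSON, and the field is text
  • The output file may be a WAV file disguised as .mp3; convert it with ffmpeg -i output.mp3 -c:a aac real.m4a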
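
Whether a downloaded file really is a WAV can be checked from its first bytes before running ffmpeg. This is a minimal sketch with hypothetical helper names. It relies only on the standard RIFF/WAVE container header: the bytes RIFF, a 4-byte size, then WAVE.

```python
def looks_like_wav(header: bytes) -> bool:
    """True if the first 12 bytes form a RIFF/WAVE container header."""
    return (
        len(header) >= 12
        and header[:4] == b"RIFF"
        and header[8:12] == b"WAVE"
    )


def is_disguised_wav(path):
    """Read just the header of a file (for example output.mp3) and test it."""
    with open(path, "rb") as f:
        return looks_like_wav(f.read(12))
```

If is_disguised_wav returns True for a file named .mp3, the ffmpeg conversion from the note above applies.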

Each generation costs about 55 credits, billed to the wallet.

Speaker Diarization

Separate audio by speaker using automatic speech classification.

Python SDK

# Diarize and wait for result
result = client.speaker.identify(
    file="./meeting.mp3",
    num_speakers=3,  # optional hint for accuracy
    timeout=600
)
for segment in result["segments"]:
    print(f"Speaker {segment['speaker']}: {segment['text']} [{segment['start']:.1f}s - {segment['end']:.1f}s]")

# From URL
result = client.speaker.identify(
    url="https://youtube.com/watch?v=VIDEO_ID",
    num_speakers=2
)

# Async: submit then poll
job = client.speaker.diarize(
    file="./meeting.mp3",
    num_speakers=3
)
result = client.speaker.wait_for_completion(job["id"], timeout=600)

# Job management
jobs = client.speaker.list(skip=0, limit=50, status="COMPLETED")
job = client.speaker.get(job_id=123)
client.speaker.delete(job_id=123)

cURL

# Diarize from file
curl -X POST "https://api.audiopod.ai/api/v1/speaker/diarize" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "file=@meeting.mp3" \
  -F "num_speakers=3"

# Diarize from URL
curl -X POST "https://api.audiopod.ai/api/v1/speaker/diarize" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "url=https://youtube.com/watch?v=VIDEO_ID" \
  -F "num_speakers=2"

# Check job status
curl "https://api.audiopod.ai/api/v1/speaker/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List jobs
curl "https://api.audiopod.ai/api/v1/speaker/jobs?skip=0&limit=50" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Delete job
curl -X DELETE "https://api.audiopod.ai/api/v1/speaker/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
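
Once a diarization result is available, its segments can be aggregated locally. The sketch below totals speaking time per speaker. It assumes the segments carry the speaker, start, and end keys used in the Python example for this section, and talk_time_by_speaker is a hypothetical helper.

```python
from collections import defaultdict


def talk_time_by_speaker(segments):
    """Sum segment durations per speaker, longest total first."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


example_segments = [
    {"speaker": "A", "start": 0.0, "end": 4.0, "text": "Welcome, everyone."},
    {"speaker": "B", "start": 4.0, "end": 6.5, "text": "Thanks."},
    {"speaker": "A", "start": 6.5, "end": 10.0, "text": "Let's begin."},
]
print(talk_time_by_speaker(example_segments))
```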

Speech-to-Text (Transcription)

Python SDK

# Transcribe URL and wait
result = client.transcription.transcribe(
    url="https://youtube.com/watch?v=VIDEO_ID",
    speaker_diarization=True,
    min_speakers=2,
    max_speakers=5,
    timeout=600
)
print(f"Language: {result['detected_language']}")
for seg in result["segments"]:
    print(f"[{seg['start']:.1f}s] {seg.get('speaker','?')}: {seg['text']}")

# Batch: multiple URLs at once
result = client.transcription.transcribe(
    urls=["https://youtube.com/watch?v=ID1", "https://youtube.com/watch?v=ID2"],
    speaker_diarization=True
)

# Upload local file
job = client.transcription.upload(
    file_path="./recording.mp3",
    language="en",
    speaker_diarization=True
)
result = client.transcription.wait_for_completion(job["id"], timeout=600)

# Async: submit then poll
job = client.transcription.create(
    url="https://youtube.com/watch?v=ID",
    language="en",
    speaker_diarization=True,
    word_timestamps=True,
    min_speakers=2,
    max_speakers=4
)
result = client.transcription.wait_for_completion(job["id"], timeout=600)

# Get transcript in different formats
transcript_json = client.transcription.get_transcript(job_id=123, format="json")
transcript_srt = client.transcription.get_transcript(job_id=123, format="srt")
transcript_vtt = client.transcription.get_transcript(job_id=123, format="vtt")
transcript_txt = client.transcription.get_transcript(job_id=123, format="txt")

# Job management
jobs = client.transcription.list(skip=0, limit=50, status="COMPLETED")
job = client.transcription.get(job_id=123)
client.transcription.delete(job_id=123)

cURL

# Transcribe from URL
curl -X POST "https://api.audiopod.ai/api/v1/transcribe/transcribe" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://youtube.com/watch?v=ID","enable_speaker_diarization":true,"word_timestamps":true}'

# Transcribe multiple URLs
curl -X POST "https://api.audiopod.ai/api/v1/transcribe/transcribe" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"urls":["URL1","URL2"],"enable_speaker_diarization":true}'

# Upload file for transcription
curl -X POST "https://api.audiopod.ai/api/v1/transcribe/transcribe-upload" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "files=@recording.mp3" \
  -F "language=en" \
  -F "enable_speaker_diarization=true"

# Get job status
curl "https://api.audiopod.ai/api/v1/transcribe/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Get transcript in specific format (json, srt, vtt, txt)
curl "https://api.audiopod.ai/api/v1/transcribe/jobs/JOB_ID/transcript?format=srt" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List jobs
curl "https://api.audiopod.ai/api/v1/transcribe/jobs?offset=0&limit=50" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Delete job
curl -X DELETE "https://api.audiopod.ai/api/v1/transcribe/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

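Parameters

Field | Required | Description
url / urls | yes (or file) | URL(s) to transcribe (YouTube, SoundCloud, and direct links are supported)
language | - | ISO 639-1 code (auto-detected if omitted)
enable_speaker_diarization | - | Enable speaker identification (default: false)
min_speakers / max_speakers | - | Speaker-count hints that help diarization
word_timestamps | - | Enable word-level timestamps (default: true)

Output formats

  • json: full structured output with segments, timestamps, and speakers
  • srt: SubRip subtitle format
  • vtt: WebVTT subtitle format
  • txt: plain-text transcript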
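
The API can return SRT directly, but the relationship between the json segments and the srt format can be sketched locally. The helpers below are hypothetical and assume each segment has start, end, and text keys, with an optional speaker, as in the Python example. The service's own SRT output may differ in its details.

```python
def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm (SubRip uses a comma before millis)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        speaker = seg.get("speaker")
        text = f"{speaker}: {seg['text']}" if speaker else seg["text"]
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{text}\n"
        )
    return "\n".join(blocks)
```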

Noise Reduction

Remove background noise from audio and video files.

Python SDK

# Denoise and wait for result
result = client.denoiser.denoise(file="./noisy-audio.mp3", timeout=600)
print(f"Clean audio: {result['output_url']}")

# From URL
result = client.denoiser.denoise(url="https://example.com/noisy.mp3")

# Async: submit then poll
job = client.denoiser.create(file="./noisy-audio.mp3")
result = client.denoiser.wait_for_completion(job["id"], timeout=600)

# From URL (async)
job = client.denoiser.create(url="https://example.com/noisy.mp3")

# Job management
jobs = client.denoiser.list(skip=0, limit=50, status="COMPLETED")
job = client.denoiser.get(job_id=123)
client.denoiser.delete(job_id=123)

cURL

# Denoise from file
curl -X POST "https://api.audiopod.ai/api/v1/denoiser/denoise" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "file=@noisy-audio.mp3"

# Denoise from URL
curl -X POST "https://api.audiopod.ai/api/v1/denoiser/denoise" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "url=https://example.com/noisy.mp3"

# Check job status
curl "https://api.audiopod.ai/api/v1/denoiser/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# List jobs
curl "https://api.audiopod.ai/api/v1/denoiser/jobs?skip=0&limit=50" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Delete job
curl -X DELETE "https://api.audiopod.ai/api/v1/denoiser/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

Wallet and Billing

Check your balance, estimate costs, and view usage history.

Python SDK

# Get current balance
balance = client.wallet.get_balance()
print(f"Balance: ${balance['balance_usd']}")

# Check if balance is sufficient for an operation
check = client.wallet.check_balance(
    service_type="stem_extraction",
    duration_seconds=180
)
print(f"Sufficient: {check['sufficient']}")

# Estimate cost before running
estimate = client.wallet.estimate_cost(
    service_type="transcription",
    duration_seconds=300
)
print(f"Cost: ${estimate['cost_usd']}")

# Get pricing for all services
pricing = client.wallet.get_pricing()

# View usage history
usage = client.wallet.get_usage(page=1, limit=50)

cURL

# Get balance
curl "https://api.audiopod.ai/api/v1/api-wallet/balance" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Check balance sufficiency
curl -X POST "https://api.audiopod.ai/api/v1/api-wallet/check-balance" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"service_type":"stem_extraction","duration_seconds":180}'

# Estimate cost
curl -X POST "https://api.audiopod.ai/api/v1/api-wallet/estimate-cost" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"service_type":"transcription","duration_seconds":300}'

# Get pricing
curl "https://api.audiopod.ai/api/v1/api-wallet/pricing" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"

# Usage history
curl "https://api.audiopod.ai/api/v1/api-wallet/usage?page=1&limit=50" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
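
A balance check can gate an expensive job so that it never starts without funds. This is a minimal sketch: run_if_affordable and InsufficientFunds are hypothetical names, and the wallet argument is assumed to expose check_balance with the keyword arguments and the sufficient key shown in the Python SDK example.

```python
class InsufficientFunds(RuntimeError):
    """Raised when the wallet check reports too little balance."""


def run_if_affordable(wallet, service_type, duration_seconds, operation):
    """Call operation() only if the wallet check reports enough balance."""
    check = wallet.check_balance(
        service_type=service_type,
        duration_seconds=duration_seconds,
    )
    if not check["sufficient"]:
        raise InsufficientFunds(
            f"not enough balance for {duration_seconds}s of {service_type}"
        )
    return operation()
```

With the SDK client this might be called as run_if_affordable(client.wallet, "stem_extraction", 180, lambda: client.stems.separate(file="/path/to/song.mp3", mode="four")).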

API Endpoint Summary

Service | Endpoint | Method
Music | /api/v1/music/{task} | POST
Music jobs | /api/v1/music/jobs/{id} | GET/DELETE
Music presets | /api/v1/music/presets | GET
Stem separation | /api/v1/stem-extraction/api/extract | POST (multipart)
Stem separation status | /api/v1/stem-extraction/status/{id} | GET
Stem separation modes | /api/v1/stem-extraction/modes | GET
Stem separation jobs | /api/v1/stem-extraction/jobs | GET
TTS generate | /api/v1/voice/voices/{uuid}/generate | POST (form data)
TTS generate (SDK) | /api/v1/voice/tts/generate | POST (JSON)
TTS status | /api/v1/voice/tts-jobs/{id}/status | GET
TTS status (SDK) | /api/v1/voice/tts/status/{id} | GET
Voice list | /api/v1/voice/voice-profiles | GET
Voice list (SDK) | /api/v1/voice/voices | GET
Speaker | /api/v1/speaker/diarize | POST (multipart)
Speaker jobs | /api/v1/speaker/jobs/{id} | GET/DELETE
Transcribe URL | /api/v1/transcribe/transcribe | POST (JSON)
Transcribe upload | /api/v1/transcribe/transcribe-upload | POST (multipart)
Transcript output | /api/v1/transcribe/jobs/{id}/transcript?format= | GET
Transcribe jobs | /api/v1/transcribe/jobs | GET
Denoise | /api/v1/denoiser/denoise | POST (multipart)
Denoise jobs | /api/v1/denoiser/jobs/{id} | GET/DELETE
Wallet balance | /api/v1/api-wallet/balance | GET
Wallet pricing | /api/v1/api-wallet/pricing | GET
Wallet usage | /api/v1/api-wallet/usage | GET

Authentication Headers

Both authentication methods work:

  • X-API-Key: ap_... (works for most endpoints)
  • Authorization: Bearer ap_... (works for TTS generate/status)
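
The split between the two headers can be captured in one helper. This sketch mirrors the cURL examples in this guide: Bearer for the raw TTS generate and tts-jobs status paths, X-API-Key everywhere else. auth_headers is a hypothetical function, and the path patterns are inferred from those examples rather than from an official rule.

```python
import re

# Paths whose cURL examples use the Bearer header.
BEARER_PATHS = (
    re.compile(r"^/api/v1/voice/voices/[^/]+/generate$"),
    re.compile(r"^/api/v1/voice/tts-jobs/[^/]+/status$"),
)


def auth_headers(path, api_key):
    """Return the auth header the examples use for the given endpoint path."""
    if any(pattern.match(path) for pattern in BEARER_PATHS):
        return {"Authorization": f"Bearer {api_key}"}
    return {"X-API-Key": api_key}
```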

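Known Issues

  • SDK method signatures may differ from the raw API; when in doubt, refer to the cURL examples
  • TTS output files are stored in Cloudflare R2 and can be downloaded via the output_url in the job status
  • TTS output files may be WAV files disguised as .mp3; convert them with ffmpeg before sending via WhatsApp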
