YouTube Apify Transcript Skill: Usage Guide

2026-03-29 Source: 网淘吧

youtube-apify-transcript

Fetch YouTube transcripts through the APIFY API. It works from cloud IP addresses and gets past YouTube's bot detection.

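Why APIFY?

YouTube blocks transcript requests that come from cloud IPs (AWS, GCP, and so on). APIFY routes requests through residential proxies, which reliably gets past the bot detection.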

Free tier

$5 of free credit per month (about 714 videos)
No credit card required
Well suited to personal use

Cost

$0.007 per video (less than 1 cent)
Track usage at: https://console.apify.com/billing
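
The "about 714 videos" figure follows directly from the two prices quoted above. A minimal sketch of the arithmetic, using `Decimal` to avoid floating-point rounding; the function names are illustrative and not part of the skill:

```python
from decimal import Decimal

FREE_CREDIT_USD = Decimal("5.00")       # monthly free credit quoted above
COST_PER_VIDEO_USD = Decimal("0.007")   # per-video price quoted above


def videos_covered(credit, cost_per_video):
    """Whole number of videos a credit covers, rounded down."""
    return int(credit // cost_per_video)


def batch_cost(api_calls, cost_per_video=COST_PER_VIDEO_USD):
    """Cost in USD of a given number of paid (non-cached) fetches."""
    return api_calls * cost_per_video


if __name__ == "__main__":
    print(videos_covered(FREE_CREDIT_USD, COST_PER_VIDEO_USD))
    print(batch_cost(2))
```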

Setup

Create a free APIFY account: https://apify.com/
Get your API token: https://console.apify.com/account/integrations
Set the environment variable:

# Add to ~/.bashrc or ~/.zshrc
export APIFY_API_TOKEN="apify_api_YOUR_TOKEN_HERE"

# Or use .env file (never commit this!)
echo 'APIFY_API_TOKEN=apify_api_YOUR_TOKEN_HERE' >> .env
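
How `fetch_transcript.py` actually reads the token is not shown in this guide. As a rough sketch only, a script could look up `APIFY_API_TOKEN` in the environment first and fall back to a `.env` file. The helpers `parse_dotenv` and `load_token` below are hypothetical names, and the parser handles only plain `KEY=VALUE` lines:

```python
import os
from pathlib import Path


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_token(environ=None, dotenv_path=".env"):
    """Return the token from the environment, else from a .env file.

    Assumption: the environment variable takes priority over the file.
    """
    environ = os.environ if environ is None else environ
    token = environ.get("APIFY_API_TOKEN")
    path = Path(dotenv_path)
    if not token and path.is_file():
        token = parse_dotenv(path.read_text(encoding="utf-8")).get("APIFY_API_TOKEN")
    if not token:
        raise RuntimeError("APIFY_API_TOKEN is not set")
    return token
```

Failing early with a clear message when the token is missing avoids a more confusing authentication error later in the run.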

Usage

Basic usage

# Get transcript as text (uses cache by default)
python3 scripts/fetch_transcript.py "https://www.youtube.com/watch?v=VIDEO_ID"

# Short URL also works
python3 scripts/fetch_transcript.py "https://youtu.be/VIDEO_ID"
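
Both URL forms above point at the same video ID. The script's own parsing is not shown here; the following is a sketch of one way to extract the ID with the standard library. `extract_video_id` is a hypothetical helper, and it deliberately covers only the `watch?v=` and `youtu.be` forms used in this guide:

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a watch?v= or youtu.be URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Short form: the ID is the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "m.youtube.com"):
        # Long form: the ID is the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""

    if not video_id:
        raise ValueError(f"Not a recognised YouTube URL: {url}")
    return video_id
```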

Options

# Output to file
python3 scripts/fetch_transcript.py "URL" --output transcript.txt

# JSON format (includes timestamps)
python3 scripts/fetch_transcript.py "URL" --json

# Both: JSON to file
python3 scripts/fetch_transcript.py "URL" --json --output transcript.json

# Specify language preference
python3 scripts/fetch_transcript.py "URL" --lang de

Caching (saves money!)

Transcripts are cached locally by default. Repeat requests for the same video cost $0.

# First request: fetches from APIFY ($0.007)
python3 scripts/fetch_transcript.py "URL"

# Second request: uses cache (FREE!)
python3 scripts/fetch_transcript.py "URL"
# Output: [cached] Transcript for: VIDEO_ID

# Bypass cache (force fresh fetch)
python3 scripts/fetch_transcript.py "URL" --no-cache

# View cache stats
python3 scripts/fetch_transcript.py --cache-stats

# Clear all cached transcripts
python3 scripts/fetch_transcript.py --clear-cache

Cache location: .cache/ inside the skill directory (override with the YT_TRANSCRIPT_CACHE_DIR environment variable).
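
The caching behaviour described above can be pictured as a small read-through file cache. This is a sketch under stated assumptions, not the script's actual code: one JSON file per video ID is an assumed layout, and `fetch` stands in for whatever function performs the paid APIFY call.

```python
import json
import os
import tempfile
from pathlib import Path


def cache_dir(default=".cache"):
    """Cache directory, overridable via YT_TRANSCRIPT_CACHE_DIR."""
    return Path(os.environ.get("YT_TRANSCRIPT_CACHE_DIR", default))


def get_transcript(video_id, fetch, directory, use_cache=True):
    """Return (data, was_cached). Calls fetch() only on a cache miss."""
    path = Path(directory) / f"{video_id}.json"  # assumed file layout
    if use_cache and path.is_file():
        return json.loads(path.read_text(encoding="utf-8")), True
    data = fetch(video_id)  # the paid call
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data), encoding="utf-8")
    return data, False


if __name__ == "__main__":
    # Demo with a stand-in fetch function and a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        fake_fetch = lambda vid: {"video_id": vid, "full_text": "hello"}
        print(get_transcript("VIDEO_ID", fake_fetch, tmp))  # miss
        print(get_transcript("VIDEO_ID", fake_fetch, tmp))  # hit
```

Passing `use_cache=False` mirrors `--no-cache`: the fetch always runs, and the fresh result overwrites the stored copy.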

Batch mode

Process multiple videos in one run:

# Create a file with URLs (one per line)
cat > urls.txt << EOF
https://youtube.com/watch?v=VIDEO1
https://youtu.be/VIDEO2
https://youtube.com/watch?v=VIDEO3
EOF

# Process all URLs
python3 scripts/fetch_transcript.py --batch urls.txt

# Output: 
# [1/3] Fetching VIDEO1...
# [2/3] [cached] VIDEO2
# [3/3] Fetching VIDEO3...
# Batch complete: 2 fetched, 1 cached, 0 failed
# [Cost: ~$0.014 for 2 API call(s)]

# Batch with JSON output to file
python3 scripts/fetch_transcript.py --batch urls.txt --json --output all_transcripts.json
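
The summary line in the sample output above can be reproduced with a simple counting loop. The sketch below is illustrative only: `run_batch`, `is_cached`, and `fetch` are hypothetical names, and it assumes that only successful fetches count toward the cost estimate.

```python
def run_batch(video_ids, is_cached, fetch, cost_per_call=0.007):
    """Process IDs in order and return fetched/cached/failed counts plus cost."""
    counts = {"fetched": 0, "cached": 0, "failed": 0}
    total = len(video_ids)
    for i, vid in enumerate(video_ids, start=1):
        if is_cached(vid):
            counts["cached"] += 1
            print(f"[{i}/{total}] [cached] {vid}")
            continue
        print(f"[{i}/{total}] Fetching {vid}...")
        try:
            fetch(vid)
            counts["fetched"] += 1
        except Exception as exc:
            # One bad video should not abort the rest of the batch.
            counts["failed"] += 1
            print(f"  failed: {exc}")
    # Assumption: only successful fetches are billed.
    counts["cost"] = round(counts["fetched"] * cost_per_call, 3)
    return counts


if __name__ == "__main__":
    summary = run_batch(
        ["VIDEO1", "VIDEO2", "VIDEO3"],
        is_cached=lambda vid: vid == "VIDEO2",
        fetch=lambda vid: None,
    )
    print(summary)
```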

Output formats

Text (default):

Hello and welcome to this video.
Today we're going to talk about...

JSON (--json):

{
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "transcript": [
    {"start": 0.0, "duration": 2.5, "text": "Hello and welcome"},
    {"start": 2.5, "duration": 3.0, "text": "to this video"}
  ],
  "full_text": "Hello and welcome to this video..."
}
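
To show how the JSON shape above might be consumed, here is a short sketch that turns each segment's `start` offset into an `HH:MM:SS` prefix. It relies only on the fields shown in the sample; the helper names are illustrative.

```python
import json


def format_timestamp(seconds):
    """Convert a start offset in seconds to HH:MM:SS (fractions dropped)."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def timestamped_lines(payload):
    """One '[HH:MM:SS] text' line per transcript segment."""
    return [
        f"[{format_timestamp(seg['start'])}] {seg['text']}"
        for seg in payload["transcript"]
    ]


# Sample payload copied from the JSON example above.
SAMPLE = json.loads("""
{
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "transcript": [
    {"start": 0.0, "duration": 2.5, "text": "Hello and welcome"},
    {"start": 2.5, "duration": 3.0, "text": "to this video"}
  ],
  "full_text": "Hello and welcome to this video..."
}
""")

if __name__ == "__main__":
    for line in timestamped_lines(SAMPLE):
        print(line)
```

To use it on real output, load the file written by `--json --output transcript.json` with `json.load` and pass the result to `timestamped_lines`.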

Error handling

The script handles common errors:

Invalid YouTube URL
Video has no transcript
API quota exceeded
Network errors

Metadata

metadata:
  clawdbot:
    emoji: "📹"
    requires:
      env: ["APIFY_API_TOKEN"]
      bins: ["python3"]
