Youtube Apify Transcript

2026-03-29 Source: 网淘吧

youtube-apify-transcript

Fetch YouTube transcripts through the APIFY API (works from cloud IP addresses and gets past YouTube's bot detection).

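Why APIFY?

YouTube blocks transcript requests that come from cloud IPs (AWS, GCP, etc.). APIFY routes requests through residential proxies, which reliably gets past the bot detection.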

Free tier

  • $5 of free credit per month (about 714 videos)
  • No credit card required
  • Well suited to personal use

Cost
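
The figures quoted in this README ($5 of monthly credit, roughly $0.007 per fetched transcript) can be sanity-checked with a little arithmetic. The following is a minimal sketch, not part of the skill itself; the constant and function names are hypothetical, and the per-video price is simply the value shown in the caching examples, so actual APIFY billing may differ.

```python
# Assumed figures, taken from this README rather than from APIFY's price list.
MONTHLY_CREDIT_USD = 5.00
COST_PER_VIDEO_USD = 0.007


def videos_per_month(credit: float = MONTHLY_CREDIT_USD,
                     cost: float = COST_PER_VIDEO_USD) -> int:
    """Return how many whole transcript fetches the credit covers."""
    return int(credit / cost)


def batch_cost(api_calls: int, cost: float = COST_PER_VIDEO_USD) -> float:
    """Estimated cost in USD of a number of uncached fetches."""
    return round(api_calls * cost, 3)


print(videos_per_month())  # whole videos covered by the free credit
print(batch_cost(2))       # estimated cost of two uncached fetches
```

Cached transcripts are not counted here, since repeat requests for the same video are served locally.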

Links

Setup

  1. Create a free APIFY account: https://apify.com/
  2. Get your API token: https://console.apify.com/account/integrations
  3. Set the environment variable:
# Add to ~/.bashrc or ~/.zshrc
export APIFY_API_TOKEN="apify_api_YOUR_TOKEN_HERE"

# Or use .env file (never commit this!)
echo 'APIFY_API_TOKEN=apify_api_YOUR_TOKEN_HERE' >> .env
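
If you want to read the same token from your own Python code, something along these lines may help. This is a hedged sketch: `parse_env_text` and `load_apify_token` are hypothetical helpers, not functions from `fetch_transcript.py`, and the `.env` parsing only covers plain `KEY=VALUE` lines like the one written above.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_apify_token(env_file: str = ".env") -> str:
    """Prefer the process environment, then fall back to a local .env file."""
    token = os.environ.get("APIFY_API_TOKEN")
    if not token:
        path = Path(env_file)
        if path.is_file():
            parsed = parse_env_text(path.read_text(encoding="utf-8"))
            token = parsed.get("APIFY_API_TOKEN")
    if not token:
        raise RuntimeError("APIFY_API_TOKEN is not set; see the setup steps above")
    return token
```

Checking the environment first keeps the behaviour predictable when both a shell export and a `.env` file are present.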

Usage

Basic usage

# Get transcript as text (uses cache by default)
python3 scripts/fetch_transcript.py "https://www.youtube.com/watch?v=VIDEO_ID"

# Short URL also works
python3 scripts/fetch_transcript.py "https://youtu.be/VIDEO_ID"
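
Both URL forms above point at the same video ID. How the script parses them is not shown in this README; the sketch below is one plausible way to do it with the standard library, and `extract_video_id` is a hypothetical name used only for illustration.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url: str) -> str:
    """Return the video ID from a youtube.com/watch or youtu.be URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Short form: the ID is the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "m.youtube.com"):
        # Long form: the ID is the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""

    if not video_id:
        raise ValueError(f"Not a recognised YouTube video URL: {url!r}")
    return video_id
```

Other URL shapes (for example embed or Shorts links) are deliberately left out of this sketch.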

Options

# Output to file
python3 scripts/fetch_transcript.py "URL" --output transcript.txt

# JSON format (includes timestamps)
python3 scripts/fetch_transcript.py "URL" --json

# Both: JSON to file
python3 scripts/fetch_transcript.py "URL" --json --output transcript.json

# Specify language preference
python3 scripts/fetch_transcript.py "URL" --lang de

Caching (saves money!)

Transcripts are cached locally by default. Repeat requests for the same video cost $0.

# First request: fetches from APIFY ($0.007)
python3 scripts/fetch_transcript.py "URL"

# Second request: uses cache (FREE!)
python3 scripts/fetch_transcript.py "URL"
# Output: [cached] Transcript for: VIDEO_ID

# Bypass cache (force fresh fetch)
python3 scripts/fetch_transcript.py "URL" --no-cache

# View cache stats
python3 scripts/fetch_transcript.py --cache-stats

# Clear all cached transcripts
python3 scripts/fetch_transcript.py --clear-cache

Cache location: .cache/ inside the skill directory (can be overridden with the YT_TRANSCRIPT_CACHE_DIR environment variable).
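
The cache lookup described above could be sketched roughly as follows. Only the `.cache/` default and the `YT_TRANSCRIPT_CACHE_DIR` override come from this README; the one-JSON-file-per-video layout and all function names are assumptions made for illustration.

```python
import json
import os
from pathlib import Path


def cache_dir(skill_dir: str = ".") -> Path:
    """Resolve the cache directory, honouring YT_TRANSCRIPT_CACHE_DIR if set."""
    override = os.environ.get("YT_TRANSCRIPT_CACHE_DIR")
    return Path(override) if override else Path(skill_dir) / ".cache"


def cache_path(video_id: str, skill_dir: str = ".") -> Path:
    # Assumption: one JSON file per video ID; the real layout may differ.
    return cache_dir(skill_dir) / f"{video_id}.json"


def read_cached(video_id: str, skill_dir: str = "."):
    """Return the cached transcript dict, or None on a cache miss."""
    path = cache_path(video_id, skill_dir)
    if path.is_file():
        return json.loads(path.read_text(encoding="utf-8"))
    return None


def write_cached(video_id: str, data: dict, skill_dir: str = ".") -> None:
    """Store a transcript dict so later requests can skip the API call."""
    path = cache_path(video_id, skill_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")
```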

Batch mode

Process several videos in one run:

# Create a file with URLs (one per line)
cat > urls.txt << EOF
https://youtube.com/watch?v=VIDEO1
https://youtu.be/VIDEO2
https://youtube.com/watch?v=VIDEO3
EOF

# Process all URLs
python3 scripts/fetch_transcript.py --batch urls.txt

# Output: 
# [1/3] Fetching VIDEO1...
# [2/3] [cached] VIDEO2
# [3/3] Fetching VIDEO3...
# Batch complete: 2 fetched, 1 cached, 0 failed
# [Cost: ~$0.014 for 2 API call(s)]

# Batch with JSON output to file
python3 scripts/fetch_transcript.py --batch urls.txt --json --output all_transcripts.json
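
The bookkeeping behind that batch summary can be illustrated with a small loop. This is a sketch under stated assumptions, not the script's actual code: `run_batch` is a hypothetical function, the cache is modelled as a plain dict, `fetch` stands in for whatever performs the API call, and the $0.007 price is the figure quoted in this README. Whether failed calls are billed is not stated in the source, so they are not costed here.

```python
from typing import Callable, Dict, Iterable

COST_PER_FETCH_USD = 0.007  # assumed per-call price from this README


def run_batch(video_ids: Iterable[str],
              fetch: Callable[[str], str],
              cache: Dict[str, str]) -> Dict[str, int]:
    """Fetch each video once, reusing cached entries and counting outcomes."""
    ids = [v.strip() for v in video_ids if v.strip()]
    stats = {"fetched": 0, "cached": 0, "failed": 0}
    total = len(ids)

    for index, video_id in enumerate(ids, start=1):
        if video_id in cache:
            print(f"[{index}/{total}] [cached] {video_id}")
            stats["cached"] += 1
            continue
        print(f"[{index}/{total}] Fetching {video_id}...")
        try:
            cache[video_id] = fetch(video_id)
            stats["fetched"] += 1
        except Exception:
            # One bad video should not abort the rest of the batch.
            stats["failed"] += 1

    cost = stats["fetched"] * COST_PER_FETCH_USD
    print(f"Batch complete: {stats['fetched']} fetched, "
          f"{stats['cached']} cached, {stats['failed']} failed")
    print(f"[Cost: ~${cost:.3f} for {stats['fetched']} API call(s)]")
    return stats
```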

Output formats

Text (default):

Hello and welcome to this video.
Today we're going to talk about...

JSON (--json):

{
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "transcript": [
    {"start": 0.0, "duration": 2.5, "text": "Hello and welcome"},
    {"start": 2.5, "duration": 3.0, "text": "to this video"}
  ],
  "full_text": "Hello and welcome to this video..."
}
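
As a usage example, the JSON above can be turned into timestamped lines with a few lines of Python. This sketch assumes only the fields shown in the sample (`transcript`, `start`, `text`); the helper names are hypothetical.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS, or H:MM:SS once past an hour."""
    total = int(seconds)
    minutes, secs = divmod(total, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours:d}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def timestamped_lines(payload: dict) -> list:
    """Prefix each transcript segment with its start time."""
    return [
        f"[{format_timestamp(seg['start'])}] {seg['text']}"
        for seg in payload["transcript"]
    ]


sample = json.loads("""
{
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "transcript": [
    {"start": 0.0, "duration": 2.5, "text": "Hello and welcome"},
    {"start": 2.5, "duration": 3.0, "text": "to this video"}
  ],
  "full_text": "Hello and welcome to this video..."
}
""")

for line in timestamped_lines(sample):
    print(line)
# [00:00] Hello and welcome
# [00:02] to this video
```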

Error handling

The script handles common errors:

  • Invalid YouTube URL
  • Video has no transcript
  • API quota exceeded
  • Network errors
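
When calling the script from another Python program, you may want to surface these failures as exceptions. The wrapper below is a sketch that assumes the script exits with a non-zero status and writes a message to stderr on failure; this README does not confirm either behaviour, and `TranscriptError`, `run_command`, and `fetch_transcript` are hypothetical names.

```python
import subprocess
import sys


class TranscriptError(RuntimeError):
    """Raised when the transcript command fails or times out."""


def run_command(cmd: list, timeout: float = 120.0) -> str:
    """Run a command and return its stdout, raising TranscriptError on failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise TranscriptError(f"Timed out after {timeout}s") from exc
    if result.returncode != 0:
        # Assumption: the failure reason is reported on stderr.
        message = result.stderr.strip() or f"exit code {result.returncode}"
        raise TranscriptError(message)
    return result.stdout


def fetch_transcript(url: str) -> str:
    """Invoke the skill's script for one URL (path as used in this README)."""
    return run_command([sys.executable, "scripts/fetch_transcript.py", url])
```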

Metadata

metadata:
  clawdbot:
    emoji: "📹"
    requires:
      env: ["APIFY_API_TOKEN"]
      bins: ["python3"]