YouTube Apify Transcript
2026-03-29
Source: 网淘吧
youtube-apify-transcript
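Fetch YouTube transcripts through the APIFY API (works from cloud IP addresses and gets past YouTube's bot detection).
Why APIFY?
YouTube blocks transcript requests that come from cloud IPs (AWS, GCP, and so on). APIFY routes requests through residential proxies, which reliably gets past bot detection.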
Free tier
- $5 of free credit per month (about 714 videos)
- No credit card required
- Well suited to personal use
Cost
- $0.007 per video (less than 1 cent!)
- Track usage at: https://console.apify.com/billing
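The free-tier figure follows directly from the per-video price. A minimal sketch of that arithmetic, assuming the $0.007 price and $5 credit quoted above stay fixed (the function names are illustrative and not part of the script):

```python
from decimal import Decimal

PRICE_PER_VIDEO = Decimal("0.007")  # USD per video, as quoted above
FREE_CREDIT = Decimal("5")          # USD per month, as quoted above


def videos_per_credit(credit=FREE_CREDIT, price=PRICE_PER_VIDEO):
    """Whole number of uncached fetches that a given credit covers."""
    return int(credit // price)


def cost_for(n_videos, price=PRICE_PER_VIDEO):
    """Total cost in USD for n_videos uncached fetches."""
    return price * n_videos


print(videos_per_credit())  # 714
print(cost_for(2))          # 0.014
```

Decimal is used rather than float so that small per-video amounts add up without rounding noise.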
Setup
- Create a free APIFY account: https://apify.com/
- Get your API token: https://console.apify.com/account/integrations
- Set the environment variable:
# Add to ~/.bashrc or ~/.zshrc
export APIFY_API_TOKEN="apify_api_YOUR_TOKEN_HERE"
# Or use .env file (never commit this!)
echo 'APIFY_API_TOKEN=apify_api_YOUR_TOKEN_HERE' >> .env
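How the script itself reads the token is not shown here. As a rough sketch, a Python program could resolve APIFY_API_TOKEN from the environment first and fall back to a .env file; the `load_token` helper and its parsing rules below are assumptions for illustration, not the script's actual code:

```python
import os
from pathlib import Path


def load_token(env_file=".env", environ=None):
    """Return APIFY_API_TOKEN from the environment, else from a .env file."""
    environ = os.environ if environ is None else environ
    token = environ.get("APIFY_API_TOKEN")
    if token:
        return token
    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip comments and lines that are not KEY=VALUE pairs.
            if line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == "APIFY_API_TOKEN":
                # Drop optional surrounding quotes.
                return value.strip().strip("'\"")
    raise RuntimeError("APIFY_API_TOKEN is not set")
```

Checking the environment first means an exported variable always wins over a stale .env entry.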
Usage
Basic usage
# Get transcript as text (uses cache by default)
python3 scripts/fetch_transcript.py "https://www.youtube.com/watch?v=VIDEO_ID"
# Short URL also works
python3 scripts/fetch_transcript.py "https://youtu.be/VIDEO_ID"
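Both URL forms above point at the same video ID. A minimal sketch of how such URLs could be normalized with the standard library, assuming only the `watch?v=` and `youtu.be/` forms need support (`extract_video_id` is a hypothetical helper, not necessarily what the script does):

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a watch URL or a youtu.be short URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short form: the ID is the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "m.youtube.com"):
        # Long form: the ID is the `v` query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"Not a recognized YouTube URL: {url}")
    return video_id


print(extract_video_id("https://www.youtube.com/watch?v=VIDEO_ID"))  # VIDEO_ID
print(extract_video_id("https://youtu.be/VIDEO_ID"))                 # VIDEO_ID
```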
Options
# Output to file
python3 scripts/fetch_transcript.py "URL" --output transcript.txt
# JSON format (includes timestamps)
python3 scripts/fetch_transcript.py "URL" --json
# Both: JSON to file
python3 scripts/fetch_transcript.py "URL" --json --output transcript.json
# Specify language preference
python3 scripts/fetch_transcript.py "URL" --lang de
Caching (saves money!)
By default, transcripts are cached locally. Repeat requests for the same video cost $0.
# First request: fetches from APIFY ($0.007)
python3 scripts/fetch_transcript.py "URL"
# Second request: uses cache (FREE!)
python3 scripts/fetch_transcript.py "URL"
# Output: [cached] Transcript for: VIDEO_ID
# Bypass cache (force fresh fetch)
python3 scripts/fetch_transcript.py "URL" --no-cache
# View cache stats
python3 scripts/fetch_transcript.py --cache-stats
# Clear all cached transcripts
python3 scripts/fetch_transcript.py --clear-cache
Cache location: .cache/ in the skill directory (override with the YT_TRANSCRIPT_CACHE_DIR environment variable)
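The caching behavior described above can be sketched as a small file-per-video cache. This is an illustration under stated assumptions: one JSON file per video ID, a default of `.cache` relative to the current directory (the script uses the skill directory), and a hypothetical `fetch` callable standing in for the paid API call. Whether `--no-cache` also refreshes the stored copy is not stated; this sketch does refresh it.

```python
import json
import os
from pathlib import Path


def cache_dir():
    """Resolve the cache directory, honoring YT_TRANSCRIPT_CACHE_DIR."""
    return Path(os.environ.get("YT_TRANSCRIPT_CACHE_DIR", ".cache"))


def get_transcript(video_id, fetch, use_cache=True):
    """Return (transcript, was_cached); `fetch` performs the paid API call."""
    path = cache_dir() / f"{video_id}.json"
    if use_cache and path.is_file():
        return json.loads(path.read_text(encoding="utf-8")), True
    data = fetch(video_id)
    # Store the fresh result so the next request is free.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data), encoding="utf-8")
    return data, False
```

Passing `use_cache=False` mirrors the `--no-cache` flag: the cached file is ignored and the fetch runs again.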
Batch mode
Process multiple videos in one run:
# Create a file with URLs (one per line)
cat > urls.txt << EOF
https://youtube.com/watch?v=VIDEO1
https://youtu.be/VIDEO2
https://youtube.com/watch?v=VIDEO3
EOF
# Process all URLs
python3 scripts/fetch_transcript.py --batch urls.txt
# Output:
# [1/3] Fetching VIDEO1...
# [2/3] [cached] VIDEO2
# [3/3] Fetching VIDEO3...
# Batch complete: 2 fetched, 1 cached, 0 failed
# [Cost: ~$0.014 for 2 API call(s)]
# Batch with JSON output to file
python3 scripts/fetch_transcript.py --batch urls.txt --json --output all_transcripts.json
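The batch summary shown above can be reproduced with a simple tally. In the sketch below, `fetch_one` is a hypothetical callable that returns "fetched" or "cached" and raises on failure. Charging only for fetched videos is an assumption based on the sample output; how failed calls are billed is not stated.

```python
def run_batch(urls, fetch_one, price=0.007):
    """Process URLs in order and build the summary and cost lines."""
    counts = {"fetched": 0, "cached": 0, "failed": 0}
    urls = [u.strip() for u in urls if u.strip()]  # ignore blank lines
    for i, url in enumerate(urls, start=1):
        try:
            status = fetch_one(url)  # expected: "fetched" or "cached"
        except Exception as exc:
            status = "failed"
            print(f"[{i}/{len(urls)}] failed: {url} ({exc})")
        counts[status] += 1
    summary = (f"Batch complete: {counts['fetched']} fetched, "
               f"{counts['cached']} cached, {counts['failed']} failed")
    cost = (f"[Cost: ~${counts['fetched'] * price:.3f} "
            f"for {counts['fetched']} API call(s)]")
    return counts, summary, cost
```

One failing URL does not stop the run; it is counted and the loop moves on to the next line.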
Output formats
Text (default):
Hello and welcome to this video.
Today we're going to talk about...
JSON (--json):
{
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "transcript": [
    {"start": 0.0, "duration": 2.5, "text": "Hello and welcome"},
    {"start": 2.5, "duration": 3.0, "text": "to this video"}
  ],
  "full_text": "Hello and welcome to this video..."
}
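The timestamps in the JSON output make it easy to post-process a transcript. A minimal sketch, assuming the field names shown in the sample above (`format_segments` is an illustrative helper, not part of the script):

```python
import json

SAMPLE = """
{
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "transcript": [
    {"start": 0.0, "duration": 2.5, "text": "Hello and welcome"},
    {"start": 2.5, "duration": 3.0, "text": "to this video"}
  ],
  "full_text": "Hello and welcome to this video..."
}
"""


def format_segments(doc):
    """Render each segment as '[MM:SS] text' using its start time."""
    lines = []
    for seg in doc["transcript"]:
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text']}")
    return lines


for line in format_segments(json.loads(SAMPLE)):
    print(line)
# [00:00] Hello and welcome
# [00:02] to this video
```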
Error handling
The script handles these common errors:
- Invalid YouTube URL
- Video has no transcript
- API quota exceeded
- Network errors
Metadata
metadata:
  clawdbot:
    emoji: "📹"
    requires:
      env: ["APIFY_API_TOKEN"]
      bins: ["python3"]