Academic Research Hub Skill Usage Guide

2026-03-28 Source: 网淘吧

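Academic Research Hub

Search and retrieve academic papers from multiple sources, including arXiv, PubMed, and Semantic Scholar. Download PDFs, extract citations, generate bibliographies, and build literature reviews.

⚠️ Prerequisite: Install OpenClawCLI (Windows, macOS)

Installation best practices: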

# Standard installation
pip install arxiv scholarly pubmed-parser semanticscholar requests

# If you encounter permission errors, use a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install arxiv scholarly pubmed-parser semanticscholar requests

Never use --break-system-packages, as it can damage your system's Python installation.
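Before running the script, it can help to confirm that the packages above are importable in the active environment. The following is a minimal sketch, not part of the skill: `missing_modules` is a hypothetical helper, and the import names are assumptions (pubmed-parser is imported as `pubmed_parser`; the others are assumed to match their pip names).

```python
import importlib.util

# Import names assumed for the pip packages listed above.
REQUIRED = ["arxiv", "scholarly", "pubmed_parser", "semanticscholar", "requests"]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All dependencies found.")
```

If anything is reported as missing, re-run the pip command inside the virtual environment.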

Quick Reference

Task	Command
Search arXiv	`python scripts/research.py arxiv "quantum computing"`
Search PubMed	`python scripts/research.py pubmed "covid vaccine"`
Search Semantic Scholar	`python scripts/research.py semantic "machine learning"`
Download papers	`python scripts/research.py arxiv "topic" --download`
Get citations	`python scripts/research.py arxiv "topic" --citations`
Generate a bibliography	`python scripts/research.py arxiv "topic" --format bibtex`
Save results	`python scripts/research.py arxiv "topic" --format json --output results.json`

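Core Features

1. Multi-Source Search

Search multiple academic databases through a single interface.

Supported sources:

arXiv - Physics, mathematics, computer science, quantitative biology, quantitative finance, statistics
PubMed - Biomedical and life sciences literature
Semantic Scholar - Computer science and interdisciplinary research
Google Scholar - Broad academic search (limited; no official API)

2. Paper Download

Download full-text PDFs when available.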

python scripts/research.py arxiv "deep learning" --download --output-dir papers/

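3. Citation Extraction

Extract and format citations from papers.

Supported formats:

BibTeX
RIS
JSON
Plain text

4. Metadata Retrieval

Retrieve comprehensive metadata for each paper:

Title, authors, abstract
Publication date
Journal/conference
DOI, arXiv ID, PubMed ID
Citation count
References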

Source-Specific Commands

arXiv Search

Search the arXiv repository for preprints.

# Basic search
python scripts/research.py arxiv "quantum computing"

# Filter by category
python scripts/research.py arxiv "neural networks" --category cs.LG

# Filter by date
python scripts/research.py arxiv "transformers" --year 2023

# Download papers
python scripts/research.py arxiv "attention mechanism" --download --max-results 10

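Available categories:

cs.AI - Artificial Intelligence
cs.LG - Machine Learning
cs.CV - Computer Vision
cs.CL - Computation and Language
math.CO - Combinatorics
physics.optics - Optics
q-bio.GN - Genomics
Full list

Output: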

1. Attention Is All You Need
   Authors: Vaswani et al.
   Published: 2017-06-12
   arXiv ID: 1706.03762
   Categories: cs.CL, cs.LG
   Abstract: The dominant sequence transduction models...
   PDF: http://arxiv.org/pdf/1706.03762v5
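This guide does not say how scripts/research.py talks to arXiv. As a rough sketch only, an equivalent request against the public arXiv API could be built as follows. `build_arxiv_url` is a hypothetical helper, and the mapping of --category and --max-results onto the API's `search_query` and `max_results` parameters is an assumption.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_url(query, category=None, max_results=10):
    """Build an arXiv API query URL for a keyword search (hypothetical helper)."""
    search = f"all:{query}"
    if category:
        # The arXiv API uses the cat: prefix to restrict by subject category.
        search += f" AND cat:{category}"
    params = {
        "search_query": search,
        "start": 0,
        "max_results": max_results,
        "sortBy": "relevance",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


print(build_arxiv_url("neural networks", category="cs.LG", max_results=5))
```

The API answers with an Atom feed, from which fields such as title, authors, and PDF link can be read.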

PubMed Search

Search biomedical literature indexed in PubMed.

# Basic search
python scripts/research.py pubmed "cancer immunotherapy"

# Filter by date range
python scripts/research.py pubmed "CRISPR" --start-date 2023-01-01 --end-date 2023-12-31

# Filter by publication type
python scripts/research.py pubmed "covid vaccine" --publication-type "Clinical Trial"

# Get full text links
python scripts/research.py pubmed "gene therapy" --full-text

Publication types:

Clinical Trial
Meta-Analysis
Review
Systematic Review
Randomized Controlled Trial

Output:

1. mRNA vaccine effectiveness against COVID-19
   Authors: Smith J, Jones K, et al.
   Journal: New England Journal of Medicine
   Published: 2023-03-15
   PMID: 36913851
   DOI: 10.1056/NEJMoa2301234
   Abstract: Background: mRNA vaccines have shown...
   Full Text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9876543/
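For orientation, the date and publication-type filters above correspond naturally to parameters of the NCBI E-utilities ESearch endpoint. The sketch below shows one way such a request could be built; whether scripts/research.py does this internally is not stated here, and `build_pubmed_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_url(query, publication_type=None,
                     start_date=None, end_date=None, max_results=10):
    """Build an ESearch URL for PubMed (hypothetical helper)."""
    term = query
    if publication_type:
        # PubMed field tag restricting results to one publication type.
        term += f' AND "{publication_type}"[Publication Type]'
    params = {"db": "pubmed", "term": term,
              "retmax": max_results, "retmode": "json"}
    if start_date and end_date:
        # ESearch expects YYYY/MM/DD; pdat selects the publication date.
        params.update({
            "datetype": "pdat",
            "mindate": start_date.replace("-", "/"),
            "maxdate": end_date.replace("-", "/"),
        })
    return f"{ESEARCH}?{urlencode(params)}"


print(build_pubmed_url("CRISPR", start_date="2023-01-01", end_date="2023-12-31"))
```

ESearch returns matching PMIDs only; full records need a follow-up request (for example ESummary or EFetch).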

Semantic Scholar Search

Search computer science and interdisciplinary research.

# Basic search
python scripts/research.py semantic "reinforcement learning"

# Filter by year
python scripts/research.py semantic "graph neural networks" --year 2022

# Get highly cited papers
python scripts/research.py semantic "transformers" --min-citations 100

# Include references
python scripts/research.py semantic "BERT" --include-references

Output includes:

Citation count
Influential citation count
Reference list
Citing papers
Fields of study

Output:

1. BERT: Pre-training of Deep Bidirectional Transformers
   Authors: Devlin J, Chang MW, Lee K, Toutanova K
   Published: 2019
   Paper ID: df2b0e26d0599ce3e70df8a9da02e51594e0e992
   Citations: 15000+
   Influential Citations: 2000+
   Fields: Computer Science, Linguistics
   Abstract: We introduce a new language representation model...
   PDF: https://arxiv.org/pdf/1810.04805.pdf
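The fields listed above match ones exposed by the Semantic Scholar Graph API. As a hedged sketch, a search request and a client-side equivalent of --min-citations might look like this. Both function names are hypothetical, and applying the citation threshold after the fetch is an assumption about how such a filter could work, not a description of the script.

```python
from urllib.parse import urlencode

S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"
FIELDS = "title,year,citationCount,influentialCitationCount,fieldsOfStudy"


def build_s2_url(query, year=None, limit=10):
    """Build a Graph API paper-search URL (hypothetical helper)."""
    params = {"query": query, "limit": limit, "fields": FIELDS}
    if year:
        params["year"] = str(year)
    return f"{S2_SEARCH}?{urlencode(params)}"


def filter_min_citations(papers, min_citations):
    """Keep papers whose citationCount meets the threshold."""
    return [p for p in papers if (p.get("citationCount") or 0) >= min_citations]


papers = [
    {"title": "A", "citationCount": 150},
    {"title": "B", "citationCount": 12},
    {"title": "C", "citationCount": None},  # count missing in the response
]
print([p["title"] for p in filter_min_citations(papers, 100)])
```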

Basic Options

Result Limit

Control the number of results returned.

--max-results N    # Default: 10, range: 1-100

Examples:

python scripts/research.py arxiv "machine learning" --max-results 5
python scripts/research.py pubmed "diabetes" --max-results 50

Output Format

Choose how results are formatted.

--format <text|json|bibtex|ris|markdown>

Text - Human-readable format (default)

python scripts/research.py arxiv "quantum" --format text

JSON - Structured data for processing

python scripts/research.py arxiv "quantum" --format json

BibTeX - For LaTeX documents

python scripts/research.py arxiv "quantum" --format bibtex

RIS - For reference managers (such as Zotero, Mendeley)

python scripts/research.py arxiv "quantum" --format ris

Markdown - For documentation

python scripts/research.py arxiv "quantum" --format markdown

Save to File

Save results to a file.

--output <filepath>

Examples:

python scripts/research.py arxiv "AI" --output results.txt
python scripts/research.py pubmed "cancer" --format json --output papers.json
python scripts/research.py semantic "NLP" --format bibtex --output references.bib

Download Papers

Download full-text PDFs when available.

--download
--output-dir <directory>    # Where to save PDFs (default: downloads/)

Examples:

# Download to default directory
python scripts/research.py arxiv "deep learning" --download --max-results 5

# Download to specific directory
python scripts/research.py arxiv "transformers" --download --output-dir papers/nlp/

Advanced Features

Citation Extraction

Extract citations from papers.

--citations              # Extract citations
--citation-format <format>    # bibtex, ris, json (default: bibtex)

Example:

python scripts/research.py arxiv "attention mechanism" --citations --citation-format bibtex --output citations.bib

Date Filtering

Filter by publication date.

arXiv:

--year <YYYY>           # Specific year
--start-date <YYYY-MM-DD>
--end-date <YYYY-MM-DD>

PubMed:

--start-date <YYYY-MM-DD>
--end-date <YYYY-MM-DD>

Examples:

python scripts/research.py arxiv "quantum" --year 2023
python scripts/research.py pubmed "vaccine" --start-date 2022-01-01 --end-date 2023-12-31

Author Search

Search for papers by a specific author.

--author "Last, First"

Examples:

python scripts/research.py arxiv "neural networks" --author "Hinton, Geoffrey"
python scripts/research.py semantic "deep learning" --author "Bengio, Yoshua"

Sorting Options

Sort results by different criteria.

--sort-by <relevance|date|citations>

Examples:

python scripts/research.py arxiv "machine learning" --sort-by date
python scripts/research.py semantic "NLP" --sort-by citations

Common Workflows

Literature Review

Collect papers on a topic for a literature review.

# Step 1: Search multiple sources
python scripts/research.py arxiv "graph neural networks" --max-results 20 --format json --output arxiv_gnn.json
python scripts/research.py semantic "graph neural networks" --max-results 20 --format json --output semantic_gnn.json

# Step 2: Download key papers
python scripts/research.py arxiv "graph neural networks" --download --max-results 10 --output-dir papers/gnn/

# Step 3: Generate bibliography
python scripts/research.py arxiv "graph neural networks" --max-results 20 --format bibtex --output gnn_references.bib
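The three steps above can also be scripted so that the same workflow is repeatable for other topics. This is a minimal sketch that assumes it is run from the skill's root directory; `build_command`, `literature_review_commands`, and `run_all` are hypothetical names, and only flags documented in this guide are used.

```python
import subprocess
import sys

SCRIPT = "scripts/research.py"  # path used throughout this guide


def build_command(source, query, *options):
    """Return the argument list for one research.py invocation."""
    return [sys.executable, SCRIPT, source, query, *options]


def literature_review_commands(topic, slug):
    """Mirror steps 1-3 above for an arbitrary topic."""
    return [
        build_command("arxiv", topic, "--max-results", "20",
                      "--format", "json", "--output", f"arxiv_{slug}.json"),
        build_command("semantic", topic, "--max-results", "20",
                      "--format", "json", "--output", f"semantic_{slug}.json"),
        build_command("arxiv", topic, "--download", "--max-results", "10",
                      "--output-dir", f"papers/{slug}/"),
        build_command("arxiv", topic, "--max-results", "20",
                      "--format", "bibtex", "--output", f"{slug}_references.bib"),
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Preview the commands; call run_all(...) to actually execute them.
for command in literature_review_commands("graph neural networks", "gnn"):
    print(" ".join(command))
```

Passing arguments as a list avoids shell quoting issues with multi-word queries.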

Finding Recent Research

Track the latest papers in a field.

# Last year's papers
python scripts/research.py arxiv "large language models" --year 2023 --sort-by date --max-results 30

# Last month's biomedical papers
python scripts/research.py pubmed "gene therapy" --start-date 2023-11-01 --end-date 2023-11-30 --format markdown --output recent_gene_therapy.md
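The dates in the PubMed example are hard-coded. When the search is re-run regularly, the previous month's range can be computed instead. A small sketch follows; `previous_month_range` is a hypothetical helper, not a feature of the script.

```python
from datetime import date, timedelta


def previous_month_range(today):
    """Return (first_day, last_day) of the month before `today` as ISO strings."""
    first_of_this_month = today.replace(day=1)
    last_of_previous = first_of_this_month - timedelta(days=1)
    first_of_previous = last_of_previous.replace(day=1)
    return first_of_previous.isoformat(), last_of_previous.isoformat()


start, end = previous_month_range(date(2023, 12, 15))
print(f"--start-date {start} --end-date {end}")
# prints: --start-date 2023-11-01 --end-date 2023-11-30
```

In practice `date.today()` would be passed in place of the fixed date.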

Highly Cited Papers

Find influential papers in a field.

python scripts/research.py semantic "reinforcement learning" --min-citations 500 --sort-by citations --max-results 25

Author Publication History

Follow a particular author's research.

python scripts/research.py arxiv "deep learning" --author "LeCun, Yann" --sort-by date --max-results 50 --format json --output lecun_papers.json

Building a Reference Library

Create a comprehensive collection of references.

# Create directory structure
mkdir -p references/{papers,citations}

# Search and download papers
python scripts/research.py arxiv "transformers NLP" --download --max-results 15 --output-dir references/papers/

# Generate citations
python scripts/research.py arxiv "transformers NLP" --max-results 15 --format bibtex --output references/citations/transformers.bib

Cross-Source Verification

Verify findings across multiple databases.

# Search same topic across sources
python scripts/research.py arxiv "federated learning" --max-results 10 --output arxiv_fl.txt
python scripts/research.py semantic "federated learning" --max-results 10 --output semantic_fl.txt
python scripts/research.py pubmed "federated learning" --max-results 10 --output pubmed_fl.txt

# Compare results
diff arxiv_fl.txt semantic_fl.txt
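A plain diff of text output mostly reflects formatting differences between sources. Comparing titles is usually more informative. The sketch below assumes the searches were saved with --format json and that every record carries a "title" field as in the JSON example in this guide; that layout is only shown for arXiv, so for other sources it is an assumption. The helper names are hypothetical.

```python
import json
import re


def normalize_title(title):
    """Lowercase and collapse punctuation so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def titles_in_common(records_a, records_b):
    """Return titles from records_b that also appear in records_a."""
    seen = {normalize_title(r["title"]) for r in records_a}
    return [r["title"] for r in records_b if normalize_title(r["title"]) in seen]


def load_records(path):
    """Load a JSON results file written with --format json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


arxiv = [{"title": "Attention Is All You Need"}, {"title": "Federated Optimization"}]
semantic = [{"title": "Attention is all you need."}, {"title": "Something Else"}]
print(titles_in_common(arxiv, semantic))
```

With real files, `load_records("arxiv_fl.json")` and `load_records("semantic_fl.json")` would replace the inline lists.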

Output Format Examples

Text Format (Default)

Search Results: 3 papers found

1. Attention Is All You Need
   Authors: Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; et al.
   Published: 2017-06-12
   arXiv ID: 1706.03762
   Categories: cs.CL, cs.LG
   Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...
   PDF: http://arxiv.org/pdf/1706.03762v5

2. BERT: Pre-training of Deep Bidirectional Transformers
   Authors: Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina
   Published: 2018-10-11
   arXiv ID: 1810.04805
   Categories: cs.CL
   Abstract: We introduce a new language representation model called BERT...
   PDF: http://arxiv.org/pdf/1810.04805v2

JSON Format

[
  {
    "title": "Attention Is All You Need",
    "authors": ["Vaswani, Ashish", "Shazeer, Noam", "Parmar, Niki"],
    "published": "2017-06-12",
    "arxiv_id": "1706.03762",
    "categories": ["cs.CL", "cs.LG"],
    "abstract": "The dominant sequence transduction models...",
    "pdf_url": "http://arxiv.org/pdf/1706.03762v5",
    "doi": "10.48550/arXiv.1706.03762"
  }
]
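JSON output is the easiest format to post-process. As an illustration, the sketch below reads records shaped like the example above and prints a one-line summary per paper. The inline `SAMPLE` string stands in for a saved results file, and `summarize` is a hypothetical helper.

```python
import json

# Stand-in for the contents of a file saved with --format json.
SAMPLE = """[
  {
    "title": "Attention Is All You Need",
    "authors": ["Vaswani, Ashish", "Shazeer, Noam", "Parmar, Niki"],
    "published": "2017-06-12",
    "arxiv_id": "1706.03762",
    "categories": ["cs.CL", "cs.LG"]
  }
]"""


def summarize(records):
    """Return 'First author (year): title' for each record."""
    lines = []
    for record in records:
        authors = record.get("authors") or ["Unknown"]
        year = record.get("published", "")[:4]
        lines.append(f"{authors[0]} ({year}): {record['title']}")
    return lines


for line in summarize(json.loads(SAMPLE)):
    print(line)
# prints: Vaswani, Ashish (2017): Attention Is All You Need
```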

BibTeX Format

@article{vaswani2017attention,
  title={Attention Is All You Need},
  author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
  journal={arXiv preprint arXiv:1706.03762},
  year={2017},
  url={http://arxiv.org/abs/1706.03762}
}
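To make the relationship between the JSON record and this entry concrete, here is a rough sketch of how such an entry could be produced. It is not the script's actual implementation: `bibtex_key` and `to_bibtex` are hypothetical, the key scheme (surname, year, first title word) is inferred from the example, and no escaping of special characters (such as the Ł in the entry above) is attempted.

```python
import re


def bibtex_key(record):
    """Build a key like vaswani2017attention from a JSON-style record."""
    surname = record["authors"][0].split(",")[0].strip().lower()
    year = record["published"][:4]
    first_word = re.findall(r"[A-Za-z]+", record["title"])[0].lower()
    return f"{surname}{year}{first_word}"


def to_bibtex(record):
    """Render an arXiv record as an @article entry (no character escaping)."""
    authors = " and ".join(record["authors"])
    arxiv_id = record["arxiv_id"]
    year = record["published"][:4]
    return "\n".join([
        f"@article{{{bibtex_key(record)},",
        f"  title={{{record['title']}}},",
        f"  author={{{authors}}},",
        f"  journal={{arXiv preprint arXiv:{arxiv_id}}},",
        f"  year={{{year}}},",
        f"  url={{http://arxiv.org/abs/{arxiv_id}}}",
        "}",
    ])


record = {
    "title": "Attention Is All You Need",
    "authors": ["Vaswani, Ashish", "Shazeer, Noam"],
    "published": "2017-06-12",
    "arxiv_id": "1706.03762",
}
print(to_bibtex(record))
```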

RIS Format

TY  - JOUR
TI  - Attention Is All You Need
AU  - Vaswani, Ashish
AU  - Shazeer, Noam
AU  - Parmar, Niki
PY  - 2017
DA  - 2017/06/12
JO  - arXiv preprint
VL  - arXiv:1706.03762
UR  - http://arxiv.org/abs/1706.03762
ER  -
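The RIS record follows a simple tag-per-line layout, so it can be generated from the same JSON-style record. The sketch below reproduces the tags shown above; `to_ris` is a hypothetical helper and the choice of JOUR as the type for preprints simply mirrors the example.

```python
def to_ris(record):
    """Render an arXiv record as RIS lines using the tags shown above."""
    published = record["published"]
    arxiv_id = record["arxiv_id"]
    lines = ["TY  - JOUR", f"TI  - {record['title']}"]
    # RIS repeats the AU tag once per author.
    lines += [f"AU  - {author}" for author in record["authors"]]
    lines += [
        f"PY  - {published[:4]}",
        f"DA  - {published.replace('-', '/')}",
        "JO  - arXiv preprint",
        f"VL  - arXiv:{arxiv_id}",
        f"UR  - http://arxiv.org/abs/{arxiv_id}",
        "ER  - ",
    ]
    return "\n".join(lines)


record = {
    "title": "Attention Is All You Need",
    "authors": ["Vaswani, Ashish", "Shazeer, Noam", "Parmar, Niki"],
    "published": "2017-06-12",
    "arxiv_id": "1706.03762",
}
print(to_ris(record))
```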

Markdown Format

# Search Results: 3 papers found

## 1. Attention Is All You Need

**Authors:** Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; et al.

**Published:** 2017-06-12

**arXiv ID:** 1706.03762

**Categories:** cs.CL, cs.LG

**Abstract:** The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...

**PDF:** [Download](http://arxiv.org/pdf/1706.03762v5)

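Best Practices

Search Strategy

Start broad - Use general terms to get an overview
Refine iteratively - Add filters based on initial results
Use multiple sources - Cross-validate findings
Check recent papers - Use date filters to get the latest research

Result Management

Save your searches - Use the --output option to save results
Organize downloads - Set up a logical directory structure
Export citations early - Generate BibTeX as you search
Track sources - Record which database each paper came from

Download Guidelines

Respect rate limits - Avoid downloading hundreds of papers at once
Check licenses - Confirm you have the right to use the paper
Organize by topic - Use clear directory names
Keep metadata - Save JSON files alongside the PDFs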
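One way to follow the last two guidelines is to write a small JSON sidecar file next to each PDF, named from the paper's identifier and title. This is a sketch under assumptions: records are shaped like the JSON example in this guide, and `safe_stem` and `write_sidecar` are hypothetical helpers, not part of the script.

```python
import json
import re
from pathlib import Path


def safe_stem(record):
    """Build a filesystem-safe file stem from the arXiv ID and title."""
    ident = re.sub(r"[^A-Za-z0-9.]+", "-", record["arxiv_id"])
    slug = re.sub(r"[^a-z0-9]+", "-", record["title"].lower()).strip("-")
    return f"{ident}_{slug[:60]}"


def write_sidecar(record, directory):
    """Write the record as <stem>.json inside directory and return the path."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{safe_stem(record)}.json"
    path.write_text(json.dumps(record, indent=2, ensure_ascii=False),
                    encoding="utf-8")
    return path


record = {"title": "Attention Is All You Need", "arxiv_id": "1706.03762"}
print(safe_stem(record))
# prints: 1706.03762_attention-is-all-you-need
```

Calling `write_sidecar(record, "papers/nlp/")` would then place the metadata beside PDFs downloaded to that directory.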

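Citation Practices

Verify citation details - Check DOIs and URLs
Use standard formats - BibTeX for LaTeX, RIS for reference managers
Include abstracts - Makes later review easier
Update regularly - Re-run searches to pick up new papers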

Troubleshooting

Installation Issues

"Missing required dependencies"

# Install all dependencies
pip install arxiv scholarly pubmed-parser semanticscholar requests

# Or use virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install arxiv scholarly pubmed-parser semanticscholar requests

"OpenClawCLI not found"

Download it from https://clawhub.ai/
Install the version for your operating system (Windows/macOS)

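Search Issues

"No results found"

Try broader search terms
Check spelling and terminology
Remove restrictive filters
Try a different database

"Rate limit exceeded"

Wait a few minutes and try again
Reduce the --max-results value
Space out your requests

"Download failed"

Check your network connection
Some papers may not have a PDF available
Confirm that you have access rights
Try downloading papers individually

API Issues

"API timeout"

The service may be temporarily unavailable
Try again later
Check the status on the service's website

"Invalid API response"

Check whether the service is down
Verify your query syntax
Try a simpler query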
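The "wait and retry" advice for rate limits and timeouts can be automated with exponential backoff when searches are driven from a script. A minimal sketch follows; `RateLimitError` and `with_backoff` are hypothetical names, and how a real failure is detected (exit code, HTTP status, error text) is left open because this guide does not specify it.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error signalling that a service asked us to slow down."""


def with_backoff(fetch, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), doubling the wait after each rate-limit failure."""
    for attempt in range(retries):
        try:
            return fetch()
        except RateLimitError:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)


# Demonstration with a fake request that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_fetch():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError()
    return "ok"


waits = []
print(with_backoff(flaky_fetch, sleep=waits.append), waits)
# prints: ok [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the demonstration instant; real use would keep the default `time.sleep`.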

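Limitations

Access Restrictions

Not every paper has a downloadable PDF
Some content requires institutional access
Paywalled journals may show abstracts only
Google Scholar enforces strict rate limits

Data Completeness

Citation counts may be out of date
Not every metadata field is available for every paper
Some older papers may have incomplete records
Preprints may lack final publication details

Search Capabilities

Boolean operators vary by source
There is no unified query syntax across databases
Some databases do not support every filter
Results may differ from those of the web interface

Legal Considerations

Respect copyright and license agreements
Do not redistribute downloaded papers
Follow your institution's access policies
Check each database's terms of service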

Command Reference

python scripts/research.py <source> "<query>" [OPTIONS]

SOURCES:
  arxiv              Search arXiv repository
  pubmed             Search PubMed database
  semantic           Search Semantic Scholar

REQUIRED:
  query              Search query string (in quotes)

GENERAL OPTIONS:
  -n, --max-results  Maximum results (default: 10, max: 100)
  -f, --format       Output format (text|json|bibtex|ris|markdown)
  -o, --output       Save to file path
  --sort-by          Sort by (relevance|date|citations)

FILTERING:
  --year             Filter by specific year (YYYY)
  --start-date       Start date (YYYY-MM-DD)
  --end-date         End date (YYYY-MM-DD)
  --author           Author name
  --min-citations    Minimum citation count

ARXIV-SPECIFIC:
  --category         arXiv category (e.g., cs.AI, cs.LG)

PUBMED-SPECIFIC:
  --publication-type Publication type filter
  --full-text        Include full text links

SEMANTIC-SPECIFIC:
  --include-references   Include paper references

DOWNLOAD:
  --download         Download paper PDFs
  --output-dir       Download directory (default: downloads/)

CITATIONS:
  --citations        Extract citations
  --citation-format  Citation format (bibtex|ris|json)

HELP:
  --help             Show all options
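For readers who want to see how an interface like this maps onto code, the sketch below builds an argparse parser covering a subset of the options listed above. It is an illustration only, not the source of scripts/research.py; `build_parser` and `max_results_type` are hypothetical names.

```python
import argparse


def max_results_type(value):
    """Accept integers in the documented 1-100 range."""
    number = int(value)
    if not 1 <= number <= 100:
        raise argparse.ArgumentTypeError("must be between 1 and 100")
    return number


def build_parser():
    """Parser mirroring part of the command reference (illustrative)."""
    parser = argparse.ArgumentParser(prog="research.py")
    parser.add_argument("source", choices=["arxiv", "pubmed", "semantic"])
    parser.add_argument("query")
    parser.add_argument("-n", "--max-results", type=max_results_type, default=10)
    parser.add_argument("-f", "--format", default="text",
                        choices=["text", "json", "bibtex", "ris", "markdown"])
    parser.add_argument("-o", "--output")
    parser.add_argument("--sort-by", choices=["relevance", "date", "citations"])
    parser.add_argument("--year", type=int)
    parser.add_argument("--download", action="store_true")
    parser.add_argument("--output-dir", default="downloads/")
    return parser


args = build_parser().parse_args(
    ["arxiv", "quantum computing", "-n", "5", "--format", "json"])
print(args.source, args.max_results, args.format)
# prints: arxiv 5 json
```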

Examples by Use Case

Quick Search

# Search arXiv preprints
python scripts/research.py arxiv "quantum computing"

# Search biomedical literature
python scripts/research.py pubmed "alzheimer disease"

Comprehensive Research

# Search multiple sources
python scripts/research.py arxiv "neural networks" --max-results 30 --format json --output arxiv.json
python scripts/research.py semantic "neural networks" --max-results 30 --format json --output semantic.json

# Download important papers
python scripts/research.py arxiv "neural networks" --download --max-results 10

Citation Management

# Generate BibTeX
python scripts/research.py arxiv "deep learning" --format bibtex --output dl_refs.bib

# Export to reference manager
python scripts/research.py pubmed "gene editing" --format ris --output genes.ris

Tracking New Research

# Papers published since 2024-01-01
python scripts/research.py arxiv "LLM" --start-date 2024-01-01 --sort-by date

# Recent highly-cited work
python scripts/research.py semantic "transformers" --year 2023 --min-citations 50

Support

For problems or questions:

Consult this documentation
Run python scripts/research.py --help
Verify that the dependencies are installed
Consult the database-specific documentation

Resources:

OpenClawCLI: https://clawhub.ai/
arXiv API: https://arxiv.org/help/api
PubMed API: https://www.ncbi.nlm.nih.gov/books/NBK25501/
Semantic Scholar API: https://api.semanticscholar.org/
