[Scrapy] Command-Line Tool

Posted Aug 16, 2020

By Zhao Zhengyang

Scrapy ships with a command-line tool, scrapy, installed at {Python install directory}\Scripts\scrapy.exe. It is implemented by the module scrapy.cmdline.
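Because the tool is backed by scrapy.cmdline, the same commands can also be started from a Python script. The following is a minimal sketch, assuming Scrapy is installed; scrapy_argv and run_scrapy are hypothetical helper names and not part of Scrapy.

```python
def scrapy_argv(command, *args):
    """Build an argv list shaped like the command line: program name first."""
    return ["scrapy", command, *args]


def run_scrapy(command, *args):
    """Run a Scrapy command in-process, as if it had been typed in a terminal."""
    # Imported lazily so this helper module loads even where Scrapy is missing.
    from scrapy.cmdline import execute

    # execute() finishes by calling sys.exit(), so nothing after it will run.
    execute(scrapy_argv(command, *args))
```

Calling run_scrapy("version") should behave like typing scrapy version in a terminal. The same entry point can also be reached with python -m scrapy.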

Running the command with no arguments prints the help text:

D:\PyCharm\projects>scrapy
Scrapy 2.3.0 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
...

If the current directory is inside a Scrapy project, the project name is printed and the project-only commands become available:

D:\PyCharm\projects\myproject>scrapy
Scrapy 2.3.0 - project: myproject

Usage:
  scrapy <command> [options] [args]

Available commands:
...

Use scrapy <command> -h to see detailed help for a specific command.

Common commands

scrapy startproject <project_name> [project_dir]

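Creates a new Scrapy project. If project_dir is omitted, the project is created in a directory with the same name as the project.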
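The directory rule can be sketched in a few lines of Python. This is an illustration only: planned_layout is a hypothetical helper, it writes nothing to disk, and the scrapy.cfg and spiders paths reflect the layout Scrapy's default project template is expected to produce.

```python
from pathlib import Path
from typing import Optional


def planned_layout(project_name: str, project_dir: Optional[str] = None) -> dict:
    """Return the paths `scrapy startproject` is expected to create."""
    # With no project_dir, the directory is named after the project.
    root = Path(project_dir if project_dir is not None else project_name)
    return {
        "root": root,
        "config": root / "scrapy.cfg",               # marks the project root
        "package": root / project_name,              # Python package with the code
        "spiders": root / project_name / "spiders",  # where genspider writes spiders
    }


print(planned_layout("myproject")["spiders"])
print(planned_layout("myproject", "work")["config"])
```

The second call shows that an explicit project_dir changes only the outer directory; the inner package keeps the project name.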

scrapy genspider [-t <template>] <name> <domain>

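Creates a new spider in the current directory or, when run inside a project, in that project's spiders directory. <name> sets the spider's name attribute; <domain> is used to generate its allowed_domains and start_urls attributes.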

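The available templates live in the templates\spiders directory of the installed scrapy package (typically {Python install directory}\Lib\site-packages\scrapy\templates\spiders); scrapy genspider -l lists them. The default template is basic.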
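To make the role of <name> and <domain> concrete, here is a rough imitation of what filling in a basic-style template could look like. It is a sketch, not Scrapy's own code: SPIDER_TEMPLATE is a simplified stand-in rather than a copy of the shipped template, and render_spider with its class-naming rule is an assumption made for illustration.

```python
from string import Template

# Simplified stand-in for a spider template; not copied from Scrapy's files.
SPIDER_TEMPLATE = Template('''\
import scrapy


class ${classname}(scrapy.Spider):
    name = "${name}"
    allowed_domains = ["${domain}"]
    start_urls = ["http://${domain}/"]

    def parse(self, response):
        pass
''')


def render_spider(name: str, domain: str) -> str:
    """Fill the template from the <name> and <domain> arguments."""
    # Assumed naming rule: my_site -> MySiteSpider
    classname = "".join(part.capitalize() for part in name.split("_")) + "Spider"
    return SPIDER_TEMPLATE.substitute(classname=classname, name=name, domain=domain)


print(render_spider("example", "example.com"))
```

The printed source shows <name> ending up in the name attribute and <domain> in both allowed_domains and start_urls, which matches the description of genspider above.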

scrapy shell [url]

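Opens an interactive console. If a URL is given, a request is sent to it first and the response is available in the console.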

scrapy runspider <spider_file>

Runs a spider defined in a single Python file, without needing a project.

scrapy crawl [options] <spider>

Runs the spider with the given name. Spider arguments can be passed with the -a NAME=VALUE option, which may be repeated.
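When a crawl is launched from a script, the -a pairs can be assembled from ordinary keyword arguments. The sketch below only builds the command line; crawl_argv, the spider name myspider, and the arguments category and max_pages are hypothetical names chosen for illustration.

```python
import shlex


def crawl_argv(spider, **spider_args):
    """Build the argv for `scrapy crawl <spider>` with one -a NAME=VALUE per argument."""
    argv = ["scrapy", "crawl", spider]
    for name, value in spider_args.items():
        # Values reach the spider as strings, so they are formatted as text here.
        argv += ["-a", f"{name}={value}"]
    return argv


argv = crawl_argv("myspider", category="books", max_pages=3)
print(shlex.join(argv))
# prints: scrapy crawl myspider -a category=books -a max_pages=3
```

The resulting list can be handed to subprocess.run(argv, check=True) from inside the project directory. On the spider side, each pair arrives as a keyword argument to the spider's constructor, and the base Spider class stores it as an attribute, so the value above could be read as self.category (always as a string).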

scrapy list

Lists the names of all spiders available in the current project, one per line.

This post is licensed under CC BY 4.0 by the author.