Building a scrapyd Image with Docker

pipidi

zhujingdi1998@gmail.com

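1. Write the scrapyd configuration file

Create a file named scrapyd.conf and fill in the configuration. The default configuration from the official documentation works as-is. The official configuration file: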

    [scrapyd]
    eggs_dir    = eggs
    logs_dir    = logs
    items_dir   =
    jobs_to_keep = 5
    dbs_dir     = dbs
    max_proc    = 0
    max_proc_per_cpu = 10
    finished_to_keep = 100
    poll_interval = 5.0
    bind_address = 0.0.0.0
    http_port   = 6800
    debug       = off
    runner      = scrapyd.runner
    application = scrapyd.app.application
    launcher    = scrapyd.launcher.Launcher
    webroot     = scrapyd.website.Root

    [services]
    schedule.json     = scrapyd.webservice.Schedule
    cancel.json       = scrapyd.webservice.Cancel
    addversion.json   = scrapyd.webservice.AddVersion
    listprojects.json = scrapyd.webservice.ListProjects
    listversions.json = scrapyd.webservice.ListVersions
    listspiders.json  = scrapyd.webservice.ListSpiders
    delproject.json   = scrapyd.webservice.DeleteProject
    delversion.json   = scrapyd.webservice.DeleteVersion
    listjobs.json     = scrapyd.webservice.ListJobs
    daemonstatus.json = scrapyd.webservice.DaemonStatus

Create the requirements file

`vim requirements.txt`
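The contents of the file are not listed here. A minimal sketch, assuming the image only needs scrapyd and Scrapy (both are real PyPI packages; leaving them unpinned is an assumption, and any libraries your spiders import would go in this file too):

```shell
# Write a minimal requirements.txt.
# Assumed contents: pin versions if you need reproducible builds.
cat > requirements.txt <<'EOF'
scrapyd
scrapy
EOF
```
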

### Dockerfile

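`vi Dockerfile`

    # Base image: Python 3.5
    FROM python:3.5
    # Copy the build context (requirements.txt, scrapyd.conf) into the image
    COPY . /code
    WORKDIR /code
    RUN pip install -r ./requirements.txt
    # Port set by http_port in scrapyd.conf
    EXPOSE 6800
    # scrapyd reads /etc/scrapyd/scrapyd.conf on startup
    COPY ./scrapyd.conf /etc/scrapyd/
    CMD ["scrapyd"]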

Build the image

`docker build -t scrapyd:test .`

Start the container (the image was tagged `scrapyd:test`, so the tag must be given here too)

`docker run -d -p 6800:6800 scrapyd:test`
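Because the container starts in the background, it can help to confirm that scrapyd is answering before scheduling anything. A minimal sketch, assuming `curl` is installed on the host; `scrapyd_url` and `wait_for_scrapyd` are hypothetical helper names, while `daemonstatus.json` is the endpoint listed in the `[services]` section of scrapyd.conf:

```shell
# Build the URL of a scrapyd JSON endpoint from host, port, and endpoint name.
scrapyd_url() {
    printf 'http://%s:%s/%s\n' "$1" "$2" "$3"
}

# Poll daemonstatus.json once per second until scrapyd answers
# or the attempts (third argument, default 10) run out.
wait_for_scrapyd() {
    status_url=$(scrapyd_url "$1" "$2" daemonstatus.json)
    attempts=${3:-10}
    while [ "$attempts" -gt 0 ]; do
        if curl -fsS "$status_url"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Usage: wait_for_scrapyd localhost 6800
```
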

### Schedule a spider

`curl http://ip:6800/schedule.json -d project=jdcrawler -d spider=DetailSpider`
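After scheduling, the job can be followed through `listjobs.json`, which is also listed in the `[services]` section. A minimal sketch, assuming `curl` is available; `list_jobs` is a hypothetical helper name:

```shell
# Print the jobs of one project as JSON.
# Arguments: base URL of the scrapyd server, project name.
list_jobs() {
    curl -fsS "$1/listjobs.json?project=$2"
}

# Usage: list_jobs http://ip:6800 jdcrawler
```

The JSON response groups jobs into pending, running, and finished.
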

List the running containers

`docker ps`

### Enter the container

`sudo docker exec -it 90dfde684bc4 bash`

Replace `90dfde684bc4` with the container ID shown by `docker ps`.

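### Deploy the spider project

A project must be uploaded to scrapyd (through `addversion.json`, or with `scrapyd-deploy` from the scrapyd-client package) before its spiders can be scheduled. Once it is uploaded, schedule a run, making sure the URL includes port 6800:

`curl http://ip:6800/schedule.json -d project=jdcrawler -d spider=DetailSpider`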
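The schedule call only succeeds for a project that scrapyd already knows about. A minimal sketch of the upload step through `addversion.json` (listed in the `[services]` section), assuming the project has already been packaged as an egg; `upload_egg` is a hypothetical helper name, and the version label and egg path in the usage line are placeholders:

```shell
# Upload a packaged project (egg file) to scrapyd through addversion.json.
# Arguments: base URL, project name, version label, path to the egg file.
upload_egg() {
    curl -fsS "$1/addversion.json" \
        -F "project=$2" \
        -F "version=$3" \
        -F "egg=@$4"
}

# Usage (version label and egg path are hypothetical):
# upload_egg http://ip:6800 jdcrawler r1 ./jdcrawler.egg
```
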

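  • Log in to the Alibaba Cloud Docker Registry: `sudo docker login --username=[user] registry.cn-hangzhou.aliyuncs.com`. The login username is your full Alibaba Cloud account name, and the password is the one you set when activating the service. You can change the login password on the product console home page.
  • Pull an image from the Registry: `sudo docker pull registry.cn-hangzhou.aliyuncs.com/xiantang/xiantang:[image version]`
  • Push an image to the Registry: run `sudo docker login --username=[user] registry.cn-hangzhou.aliyuncs.com`, then `sudo docker tag [ImageId] registry.cn-hangzhou.aliyuncs.com/xiantang/xiantang:[image version]`, then `sudo docker push registry.cn-hangzhou.aliyuncs.com/xiantang/xiantang:[image version]`. Replace [ImageId] and [image version] in the example with your actual image information.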
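The tag and push commands have to use exactly the same image reference, so it can be convenient to compose it once. A minimal sketch; `image_ref` and `tag_and_push` are hypothetical helper names, and the tag `v1` in the usage lines is a placeholder version:

```shell
# Compose a full image reference: registry/namespace/repository:tag.
image_ref() {
    printf '%s/%s/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Tag a local image and push it. Arguments: image ID, full image reference.
tag_and_push() {
    sudo docker tag "$1" "$2" && sudo docker push "$2"
}

# Usage (run docker login first; "v1" is a hypothetical version):
# ref=$(image_ref registry.cn-hangzhou.aliyuncs.com xiantang xiantang v1)
# tag_and_push "[ImageId]" "$ref"
```
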
