<ruby id="bdb3f"></ruby>

    <p id="bdb3f"><cite id="bdb3f"></cite></p>

      <p id="bdb3f"><cite id="bdb3f"><th id="bdb3f"></th></cite></p><p id="bdb3f"></p>
        <p id="bdb3f"><cite id="bdb3f"></cite></p>

          <pre id="bdb3f"></pre>
          <pre id="bdb3f"><del id="bdb3f"><thead id="bdb3f"></thead></del></pre>

          <ruby id="bdb3f"><mark id="bdb3f"></mark></ruby><ruby id="bdb3f"></ruby>
          <pre id="bdb3f"><pre id="bdb3f"><mark id="bdb3f"></mark></pre></pre><output id="bdb3f"></output><p id="bdb3f"></p><p id="bdb3f"></p>

          <pre id="bdb3f"><del id="bdb3f"><progress id="bdb3f"></progress></del></pre>

                <ruby id="bdb3f"></ruby>

                ThinkChat2.0新版上線,更智能更精彩,支持會話、畫圖、視頻、閱讀、搜索等,送10W Token,即刻開啟你的AI之旅 廣告
                ## 1. 擴展(Extensions) ### 1.1 Extensions的作用 > 摘取自官方文檔:http://scrapy-chs.readthedocs.io/zh_CN/1.0/topics/extensions.html#topics-extensions **scrapy的事件控制是通過信號(Signals)和擴展(Extensions)完成的** > 1. 擴展框架提供一個機制,使得你能將自定義功能綁定到Scrapy。 > 2. 擴展只是正常的類,它們在Scrapy啟動時被實例化、初始化。 > 3. Scrapy擴展(包括middlewares和pipelines)的主要入口是 from_crawler 類方法, 它接收一個 Crawler 類的實例.通過這個對象訪問settings,signals,stats,控制爬蟲的行為。 > 4. 如果 from_crawler 方法拋出 NotConfigured 異常, 擴展會被禁用。否則,擴展會被開啟。 ### 1.2 擴展例子(Sample extension) > * 這里我們將實現一個簡單的擴展來演示上面描述到的概念。 該擴展會在以下事件時記錄一條日志: * spider被打開 * spider被關閉 * 爬取了特定數量的條目(items) > 該擴展通過 MYEXT_ENABLED 配置項開啟, items的數量通過 MYEXT_ITEMCOUNT 配置項設置。 以下是擴展的代碼: ~~~ import logging from scrapy import signals from scrapy.exceptions import NotConfigured logger = logging.getLogger(__name__) class SpiderOpenCloseLogging(object): def __init__(self, item_count): self.item_count = item_count self.items_scraped = 0 @classmethod def from_crawler(cls, crawler): # first check if the extension should be enabled and raise # NotConfigured otherwise if not crawler.settings.getbool('MYEXT_ENABLED'): raise NotConfigured # get the number of items from settings item_count = crawler.settings.getint('MYEXT_ITEMCOUNT', 1000) # instantiate the extension object ext = cls(item_count) # connect the extension object to signals crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened) crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed) crawler.signals.connect(ext.item_scraped, signal=signals.item_scraped) # return the extension object return ext def spider_opened(self, spider): logger.info("opened spider %s", spider.name) def spider_closed(self, spider): logger.info("closed spider %s", spider.name) def item_scraped(self, item, spider): self.items_scraped += 1 if self.items_scraped % self.item_count == 0: logger.info("scraped %d items", self.items_scraped) ~~~
                  <ruby id="bdb3f"></ruby>

                  <p id="bdb3f"><cite id="bdb3f"></cite></p>

                    <p id="bdb3f"><cite id="bdb3f"><th id="bdb3f"></th></cite></p><p id="bdb3f"></p>
                      <p id="bdb3f"><cite id="bdb3f"></cite></p>

                        <pre id="bdb3f"></pre>
                        <pre id="bdb3f"><del id="bdb3f"><thead id="bdb3f"></thead></del></pre>

                        <ruby id="bdb3f"><mark id="bdb3f"></mark></ruby><ruby id="bdb3f"></ruby>
                        <pre id="bdb3f"><pre id="bdb3f"><mark id="bdb3f"></mark></pre></pre><output id="bdb3f"></output><p id="bdb3f"></p><p id="bdb3f"></p>

                        <pre id="bdb3f"><del id="bdb3f"><progress id="bdb3f"></progress></del></pre>

                              <ruby id="bdb3f"></ruby>

                              哎呀哎呀视频在线观看