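## Example site: `http://m.soxs.cc/shuku/`

## Logic

1. Define an entry point that feeds the starting URL into the spider.
2. Write a method that extracts the next-page URL, filtering the page's links with the XPath selectors covered earlier.
3. In the main callback, call that method; if it returns a link, request it with `scrapy.Request`.
4. The request is issued as `yield scrapy.Request(url, callback=self.callback_method_name)`.
5. This request form is fixed: the URL comes first, and `callback` names the method that receives the response.

```
import scrapy


class booklist2Spider(scrapy.Spider):
    name = 'booklist2'
    allowed_domains = ['m.soxs.cc']  # only crawl URLs on these domains
    start_urls = ["http://m.soxs.cc/"]  # URLs the crawl starts from

    # Entry point
    def parse(self, response):
        # Scrapy calls parse() with the response for each start URL
        print("Spider loaded...")
        # Fixed form: the first argument is the URL, and callback is the
        # method called with the response once the request succeeds
        yield scrapy.Request("http://m.soxs.cc/shuku/", callback=self.next)

    # Main callback: handles each listing page
    def next(self, response):
        print(response)  # prints the status and URL; 200 means the page was fetched
        next_link = self.getnextlink(response)
        if next_link is None:
            print("Reached the last page...")
            return
        yield scrapy.Request(next_link, callback=self.next)

    # Return the next-page URL, or None on the last page
    def getnextlink(self, response):
        for link in response.xpath("//div[@class='pagelist']/a"):
            # The text must match the site's "next page" link label exactly
            if link.xpath("text()").extract_first() == "下一頁":
                # urljoin resolves the href against the current page URL
                return response.urljoin(link.xpath("@href").extract_first())
        return None
```

Result:

![](https://img.kancloud.cn/66/4f/664f649bdaca54593247e6350d2a064a_915x683.png)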
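The next-page URL has to be absolute before it is passed to `scrapy.Request`. As a minimal sketch of that step on its own, the standard library's `urllib.parse.urljoin` resolves an `href` against the URL of the page it was found on. The helper name `build_next_url` and the `page2.html` paths are hypothetical, chosen only for illustration; the real site's pagination paths may look different.

```python
from urllib.parse import urljoin


def build_next_url(page_url: str, href: str) -> str:
    """Resolve a pagination link's href against the page it was found on."""
    return urljoin(page_url, href)


page = "http://m.soxs.cc/shuku/"

# Hypothetical hrefs: root-relative, page-relative, and already absolute
print(build_next_url(page, "/shuku/page2.html"))
print(build_next_url(page, "page2.html"))
print(build_next_url(page, "http://m.soxs.cc/shuku/page2.html"))
```

All three calls print `http://m.soxs.cc/shuku/page2.html`. Resolving against the current page URL handles each form of `href`, whereas concatenating `"http://"`, the domain, and the `href` by hand only works when the `href` is root-relative.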