<ruby id="bdb3f"></ruby>

    <p id="bdb3f"><cite id="bdb3f"></cite></p>

      <p id="bdb3f"><cite id="bdb3f"><th id="bdb3f"></th></cite></p><p id="bdb3f"></p>
        <p id="bdb3f"><cite id="bdb3f"></cite></p>

          <pre id="bdb3f"></pre>
          <pre id="bdb3f"><del id="bdb3f"><thead id="bdb3f"></thead></del></pre>

          <ruby id="bdb3f"><mark id="bdb3f"></mark></ruby><ruby id="bdb3f"></ruby>
          <pre id="bdb3f"><pre id="bdb3f"><mark id="bdb3f"></mark></pre></pre><output id="bdb3f"></output><p id="bdb3f"></p><p id="bdb3f"></p>

          <pre id="bdb3f"><del id="bdb3f"><progress id="bdb3f"></progress></del></pre>

                <ruby id="bdb3f"></ruby>

                合規國際互聯網加速 OSASE為企業客戶提供高速穩定SD-WAN國際加速解決方案。 廣告
                ### 1.說明 爬取今日頭條街拍美圖,并保存到MongoDB中 ### 2.準備 [安裝requests庫](/1kai-fa-huan-jing-pei-zhi/12-qing-qiu-ku-de-an-zhuang/121-requestsde-an-zhuang.md) ### 3.抓取分析 鏈接:[https://www.toutiao.com/search/?keyword=街拍](https://www.toutiao.com/search/?keyword=街拍) ![](/assets/6.4-1.png)打開開發者工具 &gt; Network面板&gt;選中XHR,篩選ajax請求 分析[https://www.toutiao.com/search\_content/?offset=0&format=json&keyword=街拍&autoload=true&count=20&cur\_tab=1&from=search\_tab](https://www.toutiao.com/search_content/?offset=0&format=json&keyword=街拍&autoload=true&count=20&cur_tab=1&from=search_tab)鏈接 可以看到有幾個參數,往下不停刷新,可以得到幾個重要參數的含義 offset:偏移量,每次刷新后,從第幾條開始顯示的數據 keyword:搜索關鍵字 count:顯示的數據條數 ### 4. 實戰演練 {#3-實戰演練} ``` import requests,os from urllib.parse import urlencode from multiprocessing import Pool from hashlib import md5 baseurl = "https://www.toutiao.com/search_content/?" headers = { 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36', 'x-requested-with': 'XMLHttpRequest', } GROUP_START = 1 GROUP_END = 20 def get_page(offset=0): params = { 'offset': offset, 'format': 'json', 'keyword': '街拍', 'autoload': 'true', 'count': '20', 'cur_tab': '1', 'from': 'search_tab', } url = baseurl + urlencode(params) try: response = requests.get(url,headers=headers) if response.status_code == 200: return response.json() except requests.ConnectionError as e: print(e.args) def get_images(response): if response: items = response.get("data") if items: for item in items: title = item.get("title") images = item.get("image_list") if title and images: for image in images: yield { 'image': image.get("url"), 'title': title, } def save_images(item): title = item.get("title") if not os.path.exists(title): os.mkdir(title) try: image = item.get("image") response = requests.get("http:"+image) if response.status_code == 200: # 讀取二進制流數據 content = response.content # 利用md5函數判斷重復 filepath = "{0}/{1}.{2}".format(title,md5(content).hexdigest(),'jpg') if not os.path.exists(filepath): with open(filepath,'wb' ) as f: f.write(response.content) else: print("Already Download {}".format(filepath)) except requests.ConnectionError as e: print("Failed to Save Image") def main(offset=0): response = get_page(offset) for item in get_images(response): print(item) save_images(item) if __name__ == "__main__": # 開啟進程池 pool = Pool() groups = [x*20 for x in range(GROUP_START,GROUP_END+1)] print(groups) pool.map(main,groups) pool.close() pool.join() ``` ![](/assets/6.4-10.png)
                  <ruby id="bdb3f"></ruby>

                  <p id="bdb3f"><cite id="bdb3f"></cite></p>

                    <p id="bdb3f"><cite id="bdb3f"><th id="bdb3f"></th></cite></p><p id="bdb3f"></p>
                      <p id="bdb3f"><cite id="bdb3f"></cite></p>

                        <pre id="bdb3f"></pre>
                        <pre id="bdb3f"><del id="bdb3f"><thead id="bdb3f"></thead></del></pre>

                        <ruby id="bdb3f"><mark id="bdb3f"></mark></ruby><ruby id="bdb3f"></ruby>
                        <pre id="bdb3f"><pre id="bdb3f"><mark id="bdb3f"></mark></pre></pre><output id="bdb3f"></output><p id="bdb3f"></p><p id="bdb3f"></p>

                        <pre id="bdb3f"><del id="bdb3f"><progress id="bdb3f"></progress></del></pre>

                              <ruby id="bdb3f"></ruby>

                              哎呀哎呀视频在线观看