# httpclient6
<div><div><div style="font-family:Verdana, Arial, Tahoma, sans-serif;font-size:large;font-weight:bold;text-align:left;"><span style="font-size:19px;">Chapter 6. Advanced topics</span></div><div style="font-family:Verdana, Arial, Tahoma, sans-serif;font-size:9.5pt;"><h3 style="font-family:Verdana, Arial, Tahoma, sans-serif;">6.1 Custom client connections</h3>
<p style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">In certain situations it may be necessary to customize how HTTP messages are transmitted across the wire, beyond what HTTP parameters allow, in order to deal with non-standard, non-compliant behaviour. For instance, a web crawler may need to force HttpClient to accept malformed response headers so that the content of the message can still be salvaged.</p>
<p style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">Plugging in a custom message parser or a custom connection implementation usually involves several steps:</p>
<p style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">Provide a custom implementation of the LineParser/LineFormatter interface. Implement the message parsing/formatting logic as required.</p>
<blockquote style="background-color:#f9f9ff;font-family:Verdana, Arial, Tahoma, sans-serif;"><div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">class MyLineParser extends BasicLineParser {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;@Override</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;public Header parseHeader(</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;final CharArrayBuffer buffer) throws ParseException {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return super.parseHeader(buffer);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;} catch (ParseException ex) {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Suppress the ParseException and keep the raw line under the name "invalid"</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return new BasicHeader("invalid", buffer.toString());</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">}</div></blockquote>
<p style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">Provide a custom OperatedClientConnection implementation. Replace the default request/response parsers and request/response formatters with custom ones as required. Implement different message writing/reading code if necessary.</p>
<blockquote style="background-color:#f9f9ff;font-family:Verdana, Arial, Tahoma, sans-serif;"><div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">class MyClientConnection extends DefaultClientConnection {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;@Override</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;protected HttpMessageParser createResponseParser(</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;final SessionInputBuffer buffer,</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;final HttpResponseFactory responseFactory,</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;final HttpParams params) {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Use the lenient line parser when reading responses</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return new DefaultResponseParser(buffer,</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;new MyLineParser(), responseFactory, params);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">}</div></blockquote>
<p style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">Provide a custom implementation of the ClientConnectionOperator interface in order to create connections of the new class. Implement different socket initialization code if necessary.</p>
<blockquote style="background-color:#f9f9ff;font-family:Verdana, Arial, Tahoma, sans-serif;"><div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">class MyClientConnectionOperator extends</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;DefaultClientConnectionOperator {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;public MyClientConnectionOperator(</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;final SchemeRegistry sr) {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;super(sr);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;@Override</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;public OperatedClientConnection createConnection() {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return new MyClientConnection();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">}</div></blockquote>
<p style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">Provide a custom implementation of the ClientConnectionManager interface in order to create connection operators of the new class.</p>
<blockquote style="background-color:#f9f9ff;font-family:Verdana, Arial, Tahoma, sans-serif;"><div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">class MyClientConnManager extends SingleClientConnManager {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;public MyClientConnManager(</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;final HttpParams params,</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;final SchemeRegistry sr) {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;super(params, sr);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;@Override</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;protected ClientConnectionOperator createConnectionOperator(</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;final SchemeRegistry sr) {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return new MyClientConnectionOperator(sr);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">}</div></blockquote>
<h3 style="font-family:Verdana, Arial, Tahoma, sans-serif;">6.2 Stateful HTTP connections</h3>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">The HTTP specification assumes that session state is always embedded in HTTP messages in the form of HTTP cookies, and therefore that HTTP connections are always stateless. This assumption does not always hold in practice. In some cases an HTTP connection is created with a particular user identity or within a particular security context, so it cannot be shared with other users and can be reused only by the same user. Examples of such stateful HTTP connections are NTLM-authenticated connections and SSL connections with client certificate authentication.</div>
<h4 style="font-family:Verdana, Arial, Tahoma, sans-serif;">6.2.1 User token handler</h4>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpClient relies on the UserTokenHandler interface to determine whether a given execution context is user-specific. The token object returned by this handler is expected to uniquely identify the current user if the context is user-specific, or to be null if the context contains no resources or details specific to the current user. The user token is used to ensure that user-specific resources are not shared with or reused by other users.</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;"><p style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">The default implementation of the UserTokenHandler interface uses an instance of the Principal class to represent the state object of an HTTP connection, provided one can be obtained from the given execution context. It uses the user principal of a connection-based authentication scheme such as NTLM, or that of an SSL session with client authentication turned on. If neither is available, a null token is returned.</p>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">Users can provide a custom implementation if the default one does not meet their needs:</div>
<blockquote style="background-color:#f9f9ff;font-family:Verdana, Arial, Tahoma, sans-serif;"><div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">DefaultHttpClient httpclient = new DefaultHttpClient();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">httpclient.setUserTokenHandler(new UserTokenHandler() {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;public Object getUserToken(HttpContext context) {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// Take the token from an application-defined context attribute</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return context.getAttribute("my-token");</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">});</div></blockquote>
<h4 style="font-family:Verdana, Arial, Tahoma, sans-serif;">6.2.2 User token and execution context</h4>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">In the course of HTTP request execution, HttpClient adds the following user-identity-related object to the execution context:</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;"><p style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">'http.user-token': an Object instance representing the actual user identity, usually expected to be an instance of the Principal interface.</p>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">To find out whether the connection used to execute a request was stateful, examine the content of the local HTTP context after the request has been executed.</div>
<blockquote style="background-color:#f9f9ff;font-family:Verdana, Arial, Tahoma, sans-serif;"><div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">DefaultHttpClient httpclient = new DefaultHttpClient();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpContext localContext = new BasicHttpContext();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpGet httpget = new HttpGet("http://localhost:8080/");</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpResponse response = httpclient.execute(httpget, localContext);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpEntity entity = response.getEntity();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">if (entity != null) {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;// Consume the body so the connection can be released</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;entity.consumeContent();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">// A non-null token indicates that the connection was stateful</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">Object userToken = localContext.getAttribute(ClientContext.USER_TOKEN);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">System.out.println(userToken);</div></blockquote>
<h5 style="font-family:Verdana, Arial, Tahoma, sans-serif;">6.2.2.1 Persistent stateful connections</h5>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">Note that a persistent connection that carries a state object can be reused only if the same state object is bound to the execution context when the request is executed. It is therefore important either to reuse the same context for subsequent HTTP requests by the same user, or to bind the user token to the context before the request is executed.</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;"><blockquote style="background-color:#f9f9ff;font-family:Verdana, Arial, Tahoma, sans-serif;"><div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">DefaultHttpClient httpclient = new DefaultHttpClient();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpContext localContext1 = new BasicHttpContext();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpGet httpget1 = new HttpGet("http://localhost:8080/");</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpResponse response1 = httpclient.execute(httpget1, localContext1);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpEntity entity1 = response1.getEntity();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">if (entity1 != null) {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;entity1.consumeContent();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">}</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">// Read the user token that HttpClient stored during the first request</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">Principal principal = (Principal) localContext1.getAttribute(</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ClientContext.USER_TOKEN);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpContext localContext2 = new BasicHttpContext();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">// Bind the same token to the new context so the stateful connection can be reused</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">localContext2.setAttribute(ClientContext.USER_TOKEN, principal);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpGet httpget2 = new HttpGet("http://localhost:8080/");</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpResponse response2 = httpclient.execute(httpget2, localContext2);</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">HttpEntity entity2 = response2.getEntity();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">if (entity2 != null) {</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">&nbsp;&nbsp;&nbsp;&nbsp;entity2.consumeContent();</div>
<div style="color:inherit;font-family:inherit;font-size:inherit;font-style:inherit;">}</div></blockquote></div></div></div></div></div><br></div>
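
The reuse rule for persistent stateful connections can be pictured with a small, dependency-free model. The sketch below is not HttpClient's connection pool: `StatefulPoolSketch`, `PooledConnection`, `lease` and `release` are hypothetical names invented for illustration, and the model assumes that a new connection is tagged with the user token as soon as it is created. It only shows the matching idea, namely that an idle connection is handed out again when both its route and its state object equal those of the request.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

/**
 * Simplified, hypothetical model of state-aware connection reuse.
 * This is NOT HttpClient's pool; it only illustrates the matching rule.
 */
final class StatefulPoolSketch {

    /** A connection tagged with the route it serves and its state object. */
    static final class PooledConnection {
        final int id;
        final String route;
        final Object state; // null would mean a stateless connection

        PooledConnection(int id, String route, Object state) {
            this.id = id;
            this.route = route;
            this.state = state;
        }
    }

    private final List<PooledConnection> idle = new ArrayList<>();
    private int nextId = 1;

    /** Returns an idle connection whose route and state both match, else a new one. */
    PooledConnection lease(String route, Object userToken) {
        for (Iterator<PooledConnection> it = idle.iterator(); it.hasNext();) {
            PooledConnection c = it.next();
            if (c.route.equals(route) && Objects.equals(c.state, userToken)) {
                it.remove();
                return c;
            }
        }
        // Assumption of this model: a new connection is tagged with the token right away.
        return new PooledConnection(nextId++, route, userToken);
    }

    /** Puts a connection back so that a later lease may reuse it. */
    void release(PooledConnection c) {
        idle.add(c);
    }

    public static void main(String[] args) {
        StatefulPoolSketch pool = new StatefulPoolSketch();
        PooledConnection first = pool.lease("localhost:8080", "alice");
        pool.release(first);
        PooledConnection second = pool.lease("localhost:8080", "alice");
        pool.release(second);
        PooledConnection third = pool.lease("localhost:8080", "bob");
        System.out.println("alice reused her connection: " + (first == second));
        System.out.println("bob got alice's connection: " + (first == third));
    }
}
```

In this model the second request with the token "alice" gets the original connection back, while the request with the token "bob" gets a fresh one. The real library's matching policy may differ in detail, so treat this only as an aid for reading the example in 6.2.2.1.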