Parsing request headers · Nginx Development: From Beginner to Mastery

#### Parsing request headers

In ngx_http_process_request_line(), once the request line has been parsed, if the URI in the request line contains a host part, nginx saves that host in the server field of the request structure's headers_in member. headers_in stores all request headers, and its type is ngx_http_headers_in_t:

    typedef struct {
        ngx_list_t          headers;

        ngx_table_elt_t    *host;
        ngx_table_elt_t    *connection;
        ngx_table_elt_t    *if_modified_since;
        ngx_table_elt_t    *if_unmodified_since;
        ngx_table_elt_t    *user_agent;
        ngx_table_elt_t    *referer;
        ngx_table_elt_t    *content_length;
        ngx_table_elt_t    *content_type;

        ngx_table_elt_t    *range;
        ngx_table_elt_t    *if_range;

        ngx_table_elt_t    *transfer_encoding;
        ngx_table_elt_t    *expect;

    #if (NGX_HTTP_GZIP)
        ngx_table_elt_t    *accept_encoding;
        ngx_table_elt_t    *via;
    #endif

        ngx_table_elt_t    *authorization;

        ngx_table_elt_t    *keep_alive;

    #if (NGX_HTTP_PROXY || NGX_HTTP_REALIP || NGX_HTTP_GEO)
        ngx_table_elt_t    *x_forwarded_for;
    #endif

    #if (NGX_HTTP_REALIP)
        ngx_table_elt_t    *x_real_ip;
    #endif

    #if (NGX_HTTP_HEADERS)
        ngx_table_elt_t    *accept;
        ngx_table_elt_t    *accept_language;
    #endif

    #if (NGX_HTTP_DAV)
        ngx_table_elt_t    *depth;
        ngx_table_elt_t    *destination;
        ngx_table_elt_t    *overwrite;
        ngx_table_elt_t    *date;
    #endif

        ngx_str_t           user;
        ngx_str_t           passwd;

        ngx_array_t         cookies;

        ngx_str_t           server;
        off_t               content_length_n;
        time_t              keep_alive_n;

        unsigned            connection_type:2;
        unsigned            msie:1;
        unsigned            msie6:1;
        unsigned            opera:1;
        unsigned            gecko:1;
        unsigned            chrome:1;
        unsigned            safari:1;
        unsigned            konqueror:1;
    } ngx_http_headers_in_t;

Next, the function checks whether the incoming request uses HTTP/0.9. If it does, nginx calls ngx_http_find_virtual_server() with the host taken from the request line to look up the virtual server configuration that should handle the request; the default configuration previously selected by port and address is no longer used. Once the configuration has been found, nginx calls ngx_http_process_request() directly. HTTP/0.9 is the original HTTP protocol and defines no request headers, so there is nothing to read.

    if (r->host_start && r->host_end) {

        host = r->host_start;
        n = ngx_http_validate_host(r, &host,
                                   r->host_end - r->host_start, 0);

        if (n == 0) {
            ngx_log_error(NGX_LOG_INFO, c->log, 0,
                          "client sent invalid host in request line");
            ngx_http_finalize_request(r, NGX_HTTP_BAD_REQUEST);
            return;
        }

        if (n < 0) {
            ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
            return;
        }

        r->headers_in.server.len = n;
        r->headers_in.server.data = host;
    }

    if (r->http_version < NGX_HTTP_VERSION_10) {

        if (ngx_http_find_virtual_server(r, r->headers_in.server.data,
                                         r->headers_in.server.len)
            == NGX_ERROR)
        {
            ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
            return;
        }

        ngx_http_process_request(r);
        return;
    }

For HTTP/1.0 or newer, the next step is to read the request headers. nginx first allocates space for them. The headers field of ngx_http_headers_in_t is a list that stores all request headers; it is initialized with room for 20 elements, each of type ngx_table_elt_t, which holds the name/value pair of one header. ngx_http_headers_in_t also has many pointer members of type ngx_table_elt_t *, and their names show that they correspond to common request headers. nginx keeps a reference to each of these common headers in the structure so that later code can reach them directly through these members. An array with room for 2 elements is also allocated in advance for Cookie headers. After this memory preparation, the handler of the request's read event is set to ngx_http_process_request_headers, and that function is called immediately.

    if (ngx_list_init(&r->headers_in.headers, r->pool, 20,
                      sizeof(ngx_table_elt_t))
        != NGX_OK)
    {
        ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
        return;
    }

    if (ngx_array_init(&r->headers_in.cookies, r->pool, 2,
                       sizeof(ngx_table_elt_t *))
        != NGX_OK)
    {
        ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
        return;
    }

    c->log->action = "reading client request headers";

    rev->handler = ngx_http_process_request_headers;
    ngx_http_process_request_headers(rev);

ngx_http_process_request_headers() reads all request headers in a loop, and saves and initializes the structures related to them. The function is analyzed in detail below. nginx limits how long reading the request headers may take, and since ngx_http_process_request_headers() is the read event handler, it also handles the timeout event. If the read times out, nginx answers the request with a 408 error:

    if (rev->timedout) {
        ngx_log_error(NGX_LOG_INFO, c->log, NGX_ETIMEDOUT, "client timed out");
        c->timedout = 1;
        ngx_http_close_request(r, NGX_HTTP_REQUEST_TIME_OUT);
        return;
    }

The logic for reading and parsing request headers is much like that for the request line. The overall flow is again a loop that calls ngx_http_read_request_header() to read data and then calls a parsing function to extract a header from the data read, until all headers have been parsed or a parse error occurs. Because network I/O is involved, this flow may be spread over several I/O events.

Looking at the function more closely: it first calls ngx_http_read_request_header() to read data. If no data has arrived on the connection, it returns immediately and waits for the next read event. If some data was read, it calls ngx_http_parse_header_line() to parse it. Like the request-line parser, this function is implemented as a finite state machine; its logic is simple and just parses request headers according to the HTTP protocol. Each call parses at most one header, and the function returns one of four values, each indicating a different result. The four cases are described in turn below.
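Before going through the four results one by one, a much-simplified model of a "one header line per call" parser may help make them concrete. This is only a sketch: parse_header_line(), hdr_line_t and the HDR_* codes are hypothetical names invented for this example, the logic is a plain linear scan rather than nginx's state machine, and it is not the code of ngx_http_parse_header_line().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes standing in for the four outcomes in the text. */
enum { HDR_OK, HDR_AGAIN, HDR_INVALID, HDR_DONE };

typedef struct {
    const char *name;   size_t name_len;
    const char *value;  size_t value_len;
    int         invalid_name;  /* name has a char outside [A-Za-z0-9-] */
} hdr_line_t;

/*
 * Parse at most one header line from buf[*pos .. len).
 * HDR_OK / HDR_DONE advance *pos past the line terminator.
 * HDR_AGAIN consumes nothing, so the caller can retry once more data arrives.
 */
static int
parse_header_line(const char *buf, size_t len, size_t *pos, hdr_line_t *h)
{
    size_t  start = *pos, end, eol, colon, i;

    assert(start <= len);

    /* without a terminating LF the line is still incomplete */
    for (end = start; end < len && buf[end] != '\n'; end++) { /* scan */ }
    if (end == len) {
        return HDR_AGAIN;
    }

    eol = end;
    if (eol > start && buf[eol - 1] == '\r') {
        eol--;                          /* drop the CR of a CRLF pair */
    }

    if (eol == start) {                 /* empty line ends the header block */
        *pos = end + 1;
        return HDR_DONE;
    }

    for (colon = start; colon < eol && buf[colon] != ':'; colon++) { /* scan */ }
    if (colon == eol || colon == start) {
        return HDR_INVALID;             /* no colon, or empty header name */
    }

    h->invalid_name = 0;
    for (i = start; i < colon; i++) {
        if (!isalnum((unsigned char) buf[i]) && buf[i] != '-') {
            h->invalid_name = 1;        /* the caller decides whether to ignore it */
        }
    }

    h->name = buf + start;
    h->name_len = colon - start;

    for (i = colon + 1; i < eol && buf[i] == ' '; i++) { /* skip spaces */ }
    h->value = buf + i;
    h->value_len = eol - i;

    *pos = end + 1;
    return HDR_OK;
}
```

A caller would invoke parse_header_line() in a loop, store each HDR_OK result, read more data on HDR_AGAIN, and stop on HDR_DONE or HDR_INVALID, which is the same shape as the loop described above.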
1. NGX_OK: one header line was parsed. nginx still has to check whether the parsed header name contains illegal characters. Legal characters in a name are letters, digits and the hyphen (-). If the underscores_in_headers directive is set to on, the underscore is legal as well, but by default nginx treats it as illegal. When a header name contains an illegal character, nginx by default simply ignores that header line. If everything is in order, nginx stores the header, together with the hash of its name, in the headers list of the request structure's headers_in member. In addition, for some common headers such as Host and Connection, nginx uses an approach similar to configuration directives: each of these headers is assigned a handler in advance. When a header has been parsed, nginx checks whether a handler is registered for it and, if so, calls it.

All headers that have a handler are recorded in the global array ngx_http_headers_in:

    typedef struct {
        ngx_str_t                         name;
        ngx_uint_t                        offset;
        ngx_http_header_handler_pt        handler;
    } ngx_http_header_t;

    ngx_http_header_t  ngx_http_headers_in[] = {
        { ngx_string("Host"), offsetof(ngx_http_headers_in_t, host),
                     ngx_http_process_host },

        { ngx_string("Connection"), offsetof(ngx_http_headers_in_t, connection),
                     ngx_http_process_connection },

        { ngx_string("If-Modified-Since"),
                     offsetof(ngx_http_headers_in_t, if_modified_since),
                     ngx_http_process_unique_header_line },

        { ngx_string("If-Unmodified-Since"),
                     offsetof(ngx_http_headers_in_t, if_unmodified_since),
                     ngx_http_process_unique_header_line },

        { ngx_string("User-Agent"), offsetof(ngx_http_headers_in_t, user_agent),
                     ngx_http_process_user_agent },

        { ngx_string("Referer"), offsetof(ngx_http_headers_in_t, referer),
                     ngx_http_process_header_line },

        { ngx_string("Content-Length"),
                     offsetof(ngx_http_headers_in_t, content_length),
                     ngx_http_process_unique_header_line },

        { ngx_string("Content-Type"),
                     offsetof(ngx_http_headers_in_t, content_type),
                     ngx_http_process_header_line },

        { ngx_string("Range"), offsetof(ngx_http_headers_in_t, range),
                     ngx_http_process_header_line },

        { ngx_string("If-Range"),
                     offsetof(ngx_http_headers_in_t, if_range),
                     ngx_http_process_unique_header_line },

        { ngx_string("Transfer-Encoding"),
                     offsetof(ngx_http_headers_in_t, transfer_encoding),
                     ngx_http_process_header_line },

        { ngx_string("Expect"),
                     offsetof(ngx_http_headers_in_t, expect),
                     ngx_http_process_unique_header_line },

    #if (NGX_HTTP_GZIP)
        { ngx_string("Accept-Encoding"),
                     offsetof(ngx_http_headers_in_t, accept_encoding),
                     ngx_http_process_header_line },

        { ngx_string("Via"), offsetof(ngx_http_headers_in_t, via),
                     ngx_http_process_header_line },
    #endif

        { ngx_string("Authorization"),
                     offsetof(ngx_http_headers_in_t, authorization),
                     ngx_http_process_unique_header_line },

        { ngx_string("Keep-Alive"), offsetof(ngx_http_headers_in_t, keep_alive),
                     ngx_http_process_header_line },

    #if (NGX_HTTP_PROXY || NGX_HTTP_REALIP || NGX_HTTP_GEO)
        { ngx_string("X-Forwarded-For"),
                     offsetof(ngx_http_headers_in_t, x_forwarded_for),
                     ngx_http_process_header_line },
    #endif

    #if (NGX_HTTP_REALIP)
        { ngx_string("X-Real-IP"),
                     offsetof(ngx_http_headers_in_t, x_real_ip),
                     ngx_http_process_header_line },
    #endif

    #if (NGX_HTTP_HEADERS)
        { ngx_string("Accept"), offsetof(ngx_http_headers_in_t, accept),
                     ngx_http_process_header_line },

        { ngx_string("Accept-Language"),
                     offsetof(ngx_http_headers_in_t, accept_language),
                     ngx_http_process_header_line },
    #endif

    #if (NGX_HTTP_DAV)
        { ngx_string("Depth"), offsetof(ngx_http_headers_in_t, depth),
                     ngx_http_process_header_line },

        { ngx_string("Destination"), offsetof(ngx_http_headers_in_t, destination),
                     ngx_http_process_header_line },

        { ngx_string("Overwrite"), offsetof(ngx_http_headers_in_t, overwrite),
                     ngx_http_process_header_line },

        { ngx_string("Date"), offsetof(ngx_http_headers_in_t, date),
                     ngx_http_process_header_line },
    #endif

        { ngx_string("Cookie"), 0, ngx_http_process_cookie },

        { ngx_null_string, 0, NULL }
    };

The ngx_http_headers_in array currently contains 25 common request headers (counting the entries inside the conditional blocks), and each one has a handler. Some of them share a common handler; there are 2 common handlers, ngx_http_process_header_line and ngx_http_process_unique_header_line. First, the function pointer type of a handler:

    typedef ngx_int_t (*ngx_http_header_handler_pt)(ngx_http_request_t *r,
        ngx_table_elt_t *h, ngx_uint_t offset);

It takes 3 parameters: r is the request structure, h points to the element for this header in the headers_in.headers list, and offset is the offset of the header's field within ngx_http_headers_in_t. Now the ngx_http_process_header_line function:

    static ngx_int_t
    ngx_http_process_header_line(ngx_http_request_t *r, ngx_table_elt_t *h,
        ngx_uint_t offset)
    {
        ngx_table_elt_t  **ph;

        ph = (ngx_table_elt_t **) ((char *) &r->headers_in + offset);

        if (*ph == NULL) {
            *ph = h;
        }

        return NGX_OK;
    }

This function simply saves a reference to the header in ngx_http_headers_in_t. ngx_http_process_unique_header_line does much the same, except that it checks whether the header is a duplicate and, if it is, answers the request with a 400 error.

The remaining headers in ngx_http_headers_in have their own special handlers, which do header-specific work. The handler for the Host header, ngx_http_process_host, serves as an example:

    static ngx_int_t
    ngx_http_process_host(ngx_http_request_t *r, ngx_table_elt_t *h,
        ngx_uint_t offset)
    {
        u_char   *host;
        ssize_t   len;

        if (r->headers_in.host == NULL) {
            r->headers_in.host = h;
        }

        host = h->value.data;
        len = ngx_http_validate_host(r, &host, h->value.len, 0);

        if (len == 0) {
            ngx_log_error(NGX_LOG_INFO, r->connection->log, 0,
                          "client sent invalid host header");
            ngx_http_finalize_request(r, NGX_HTTP_BAD_REQUEST);
            return NGX_ERROR;
        }

        if (len < 0) {
            ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
            return NGX_ERROR;
        }

        if (r->headers_in.server.len) {
            return NGX_OK;
        }

        r->headers_in.server.len = len;
        r->headers_in.server.data = host;

        return NGX_OK;
    }

The purpose of this function is likewise to save a quick reference to the Host header. It also validates the Host value and extracts the host name from it, saving the result in headers_in.server. headers_in.server may in fact already have been set, while the request line was parsed, to the host found in the request line. According to the HTTP specification, if the URI in the request line carries a host, that host takes precedence, so the function checks whether headers_in.server is empty and does not assign it again if it is not. The special handlers of the other headers are not covered here; broadly, each one sets some attributes of the request according to the meaning the header has in HTTP and its value, for later use. That is roughly how a valid request header is processed.

2. NGX_AGAIN: the data received so far is not enough; the header line is not finished yet and another iteration of the loop is needed. In the next iteration, nginx first checks whether the request header buffer header_in is full. If it is, nginx calls ngx_http_alloc_large_header_buffer() to allocate more buffer space.

The ngx_http_alloc_large_header_buffer function is analyzed below:

    static ngx_int_t
    ngx_http_alloc_large_header_buffer(ngx_http_request_t *r,
        ngx_uint_t request_line)
    {
        u_char                    *old, *new;
        ngx_buf_t                 *b;
        ngx_http_connection_t     *hc;
        ngx_http_core_srv_conf_t  *cscf;

        ngx_log_debug0(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                       "http alloc large header buffer");

        /*
         * While the request line is being parsed, the client may fill the
         * buffer with CR/LF characters before sending the request line.
         * In that case nginx simply resets the buffer and drops the junk
         * data; no larger buffer needs to be allocated.
         */

        if (request_line && r->state == 0) {

            /* the client fills up the buffer with "\r\n" */

            r->request_length += r->header_in->end - r->header_in->start;

            r->header_in->pos = r->header_in->start;
            r->header_in->last = r->header_in->start;

            return NGX_OK;
        }

        /* remember where the request line or the header starts in the old buffer */
        old = request_line ? r->request_start : r->header_name_start;

        cscf = ngx_http_get_module_srv_conf(r, ngx_http_core_module);

        /* fail if even one large buffer cannot hold the request line or a single header */
        if (r->state != 0
            && (size_t) (r->header_in->pos - old)
                                         >= cscf->large_client_header_buffers.size)
        {
            return NGX_DECLINED;
        }

        hc = r->http_connection;

        /* first look for a free buffer in ngx_http_connection_t and take it if there is one */
        if (hc->nfree) {
            b = hc->free[--hc->nfree];

            ngx_log_debug2(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                           "http large header free: %p %uz",
                           b->pos, b->end - b->last);

        /* check whether the number of header buffers allocated for this request
           has reached the limit; the default maximum is 4 */
        } else if (hc->nbusy < cscf->large_client_header_buffers.num) {

            if (hc->busy == NULL) {
                hc->busy = ngx_palloc(r->connection->pool,
                      cscf->large_client_header_buffers.num * sizeof(ngx_buf_t *));
                if (hc->busy == NULL) {
                    return NGX_ERROR;
                }
            }

            /* the limit has not been reached yet, so allocate a new large buffer */
            b = ngx_create_temp_buf(r->connection->pool,
                                    cscf->large_client_header_buffers.size);
            if (b == NULL) {
                return NGX_ERROR;
            }

            ngx_log_debug2(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                           "http large header alloc: %p %uz",
                           b->pos, b->end - b->last);

        } else {
            /* the allocation limit has been reached: return an error */
            return NGX_DECLINED;
        }

        /* add the buffer taken from the free list, or newly allocated, to the busy list */
        hc->busy[hc->nbusy++] = b;

        /*
         * nginx stores every request header as a pair of pointers (start and
         * end address), so a complete header line must sit in one contiguous
         * block of memory. If the old buffer cannot hold the whole line, a new
         * buffer is allocated, the part of the header read so far is copied
         * from the old buffer, and all related pointers are then updated to
         * point into the new buffer.
         * state == 0 means the buffer was used up exactly at the end of a
         * parsed header line, so nothing needs to be copied.
         */

        if (r->state == 0) {
            /*
             * r->state == 0 means that a header line was parsed successfully
             * and we do not need to copy incomplete header line and
             * to relocate the parser header pointers
             */

            r->request_length += r->header_in->end - r->header_in->start;

            r->header_in = b;

            return NGX_OK;
        }

        ngx_log_debug1(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                       "http large header copy: %d", r->header_in->pos - old);

        r->request_length += old - r->header_in->start;

        new = b->start;

        /* copy the incomplete header from the old buffer */
        ngx_memcpy(new, old, r->header_in->pos - old);

        b->pos = new + (r->header_in->pos - old);
        b->last = new + (r->header_in->pos - old);

        /* update the related pointers to point into the new buffer */
        if (request_line) {
            r->request_start = new;

            if (r->request_end) {
                r->request_end = new + (r->request_end - old);
            }

            r->method_end = new + (r->method_end - old);

            r->uri_start = new + (r->uri_start - old);
            r->uri_end = new + (r->uri_end - old);

            if (r->schema_start) {
                r->schema_start = new + (r->schema_start - old);
                r->schema_end = new + (r->schema_end - old);
            }

            if (r->host_start) {
                r->host_start = new + (r->host_start - old);
                if (r->host_end) {
                    r->host_end = new + (r->host_end - old);
                }
            }

            if (r->port_start) {
                r->port_start = new + (r->port_start - old);
                r->port_end = new + (r->port_end - old);
            }

            if (r->uri_ext) {
                r->uri_ext = new + (r->uri_ext - old);
            }

            if (r->args_start) {
                r->args_start = new + (r->args_start - old);
            }

            if (r->http_protocol.data) {
                r->http_protocol.data = new + (r->http_protocol.data - old);
            }

        } else {
            r->header_name_start = new;
            r->header_name_end = new + (r->header_name_end - old);
            r->header_start = new + (r->header_start - old);
            r->header_end = new + (r->header_end - old);
        }

        r->header_in = b;

        return NGX_OK;
    }

When ngx_http_alloc_large_header_buffer() returns NGX_DECLINED, the client has sent a header line that is too large, or the request header block as a whole exceeds the limit, and nginx returns a 494 error. Note that before returning the 494 error nginx sets the request's lingering_close flag to 1, so that any further data sent by the client is discarded before the response is returned.

3. NGX_HTTP_PARSE_INVALID_HEADER: an error occurred while parsing the header, usually because the client sent a header that does not conform to the protocol. nginx returns a 400 error in this case.
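4. NGX_HTTP_PARSE_HEADER_DONE: all request headers have been parsed successfully. The request state is then set to NGX_HTTP_PROCESS_REQUEST_STATE, which means that the request-reading phase is over and the request-processing phase has formally begun. In reality the request may still carry a body. nginx does not read the body during the request-reading phase; that job is left to the modules of the request-processing phase. The reason is that nginx itself does not know whether the body will be needed. If later modules do not need it, reading it entirely into memory would waste a lot of memory, since bodies are usually large, while writing it to disk instead would cost time for disk I/O. Letting later modules decide whether to read or discard the body is therefore the most sensible approach.

After the headers have been read, nginx calls ngx_http_process_request_header(), which does two things: it calls ngx_http_find_virtual_server() to find the virtual server configuration, and it performs protocol checks on some headers. For example, nginx returns a 400 error for requests that use HTTP/1.1 but do not send a Host header. Also, the nginx version discussed here does not support chunked request bodies: if a request declares chunked input (it carries a Transfer-Encoding header with the value chunked), nginx returns a 411 error. Finally, ngx_http_process_request() is called to process the request. This completes the description of how nginx receives request headers.

### Request body reading (100%)

As the previous section noted, the nginx core does not read the request body on its own initiative; that job is left to the modules of the request-processing phase. The core does, however, provide the ngx_http_read_client_request_body() interface for reading the body, as well as ngx_http_discard_request_body() for discarding it. In any phase of request processing, a module that is interested in the body, or that wants to drop the body sent by the client, can call the corresponding interface. These two interfaces are the standard way the nginx core offers for handling request bodies. If the body-related directives in the configuration file (such as client_body_in_file_only and client_body_buffer_size) are to work as expected, and the built-in body-related variables (such as $request_body and $request_body_file) are to be usable, modules should generally go through these interfaces. A module that needs its own body-handling interface should stay as compatible as possible with nginx's default behavior.

#### Reading the request body

Request bodies are generally read in nginx content handlers. Some built-in modules, such as the proxy, fastcgi and uwsgi modules, must forward the client's request body (if there is one) in full to the backend process using the corresponding protocol, and all of them call ngx_http_read_client_request_body() to read the body. Note that these modules read the client's body completely before they start forwarding data to the backend. Because memory is limited, the body read by ngx_http_read_client_request_body() may be written partly or wholly to a temporary file. Depending on the size of the body and the relevant directives, the body may sit entirely in one contiguous block of memory, be split across two blocks of memory, be stored entirely in a temporary file, or be partly in memory with the rest in a temporary file. The directives that control these storage behaviors are:

<table class="docutils field-list" frame="void" rules="none" style="margin: 0px -0.5em; border: 0px;"><colgroup><col class="field-name"/><col class="field-body"/></colgroup><tbody valign="top"><tr class="field-odd field"><th class="field-name" colspan="2" style="padding: 1px 8px 1px 5px; border: 0px !important;">client_body_buffer_size:</th></tr><tr class="field-odd field"><td style="padding: 1px 8px 1px 5px; border: 0px !important;">&nbsp;</td><td class="field-body" style="padding: 1px 8px 1px 5px; border: 0px !important;">Sets the size of the buffer that holds the request body; the default is twice the system page size. When the body is larger than this, nginx writes it to a temporary file. Choose a size that suits the workload so that disk I/O is avoided where possible;</td></tr><tr class="field-even field"><th class="field-name" colspan="2" style="padding: 1px 8px 1px 5px; border: 0px !important;">client_body_in_single_buffer:</th></tr><tr class="field-even field"><td style="padding: 1px 8px 1px 5px; border: 0px !important;">&nbsp;</td><td class="field-body" style="padding: 1px 8px 1px 5px; border: 0px !important;">Indicates whether the whole body should be stored in one contiguous block of memory; the default is off. When set to on, nginx guarantees that a body no larger than client_body_buffer_size is stored in one contiguous block of memory; a larger body is written in its entirety to a temporary file;</td></tr><tr class="field-odd field"><th class="field-name" colspan="2" style="padding: 1px 8px 1px 5px; border: 0px !important;">client_body_in_file_only:</th></tr><tr class="field-odd field"><td style="padding: 1px 8px 1px 5px; border: 0px !important;">&nbsp;</td><td class="field-body" style="padding: 1px 8px 1px 5px; border: 0px !important;">Sets whether the body is always saved to a temporary file; the default is off. When set to on, nginx creates a temporary file for the request even if the client explicitly indicated a body length of 0.</td></tr></tbody></table>

Next comes the implementation of the ngx_http_read_client_request_body() interface. It is declared as follows:

    ngx_int_t
    ngx_http_read_client_request_body(ngx_http_request_t *r,
        ngx_http_client_body_handler_pt post_handler)

The interface takes 2 parameters: the first is a pointer to the request structure, and the second is a function pointer that is called once the body has been read completely. As noted above, with nginx's current behavior the module logic runs after the body has been read, so this callback is usually the module's main processing function. ngx_http_read_client_request_body() first increments the reference count of the main request that r belongs to. The reason has to do with the context in which the interface is called. Modules generally call it from a content handler, and a typical call looks like this:

    static ngx_int_t
    ngx_http_proxy_handler(ngx_http_request_t *r)
    {
        ...
        rc = ngx_http_read_client_request_body(r, ngx_http_upstream_init);

        if (rc >= NGX_HTTP_SPECIAL_RESPONSE) {
            return rc;
        }

        return NGX_DONE;
    }

The code above is from the proxy module's content handler, ngx_http_proxy_handler(), which calls ngx_http_read_client_request_body() and passes ngx_http_upstream_init() as the callback. The context in which nginx calls a module's content handler is:

    ngx_int_t
    ngx_http_core_content_phase(ngx_http_request_t *r,
        ngx_http_phase_handler_t *ph)
    {
        ...
        if (r->content_handler) {
            r->write_event_handler = ngx_http_request_empty_handler;
            ngx_http_finalize_request(r, r->content_handler(r));
            return NGX_OK;
        }
        ...
    }

In this code, the return value of the content handler is passed to ngx_http_finalize_request(). When the body has not been received completely, ngx_http_read_client_request_body() returns NGX_AGAIN. The content handler, ngx_http_proxy_handler() for example, then returns NGX_DONE, and passing NGX_DONE to ngx_http_finalize_request() decrements the main request's reference count by 1. This exactly cancels the increment made at the start of ngx_http_read_client_request_body().

Back in ngx_http_read_client_request_body(): the function checks whether the request's body has already been read or discarded. If so, it calls the callback directly and returns NGX_OK. This check exists mainly for subrequests. A subrequest is an nginx concept: within the current request, nginx can start one or more new subrequests that access other locations. Subrequests are analyzed in detail in a later chapter; in general, a subrequest does not need to read the request body itself.

The function then calls ngx_http_test_expect() to check whether the client sent an Expect: 100-continue header and, if so, replies to the client with "HTTP/1.1 100 Continue". Under HTTP/1.1, a client may send an Expect header to tell the server that it intends to send a request body. If the server allows this, it replies with "HTTP/1.1 100 Continue", and the client starts sending the body only after receiving that reply.

The preparation for receiving the body continues with the allocation of an ngx_http_request_body_t structure, which is saved in r->request_body. This structure holds the buffer references, the temporary file reference, the remaining body size and other information used while the body is read. It is defined as follows:

    typedef struct {
        ngx_temp_file_t                  *temp_file;
        ngx_chain_t                      *bufs;
        ngx_buf_t                        *buf;
        off_t                             rest;
        ngx_chain_t                      *to_write;
        ngx_http_client_body_handler_pt   post_handler;
    } ngx_http_request_body_t;

| Field | Description |
|-----|-----|
| temp_file: | pointer to the temporary file that stores the request body; |
| bufs: | head of the chain that holds the request body; |
| buf: | the memory buffer currently used to hold the request body; |
| rest: | size of the request body still to be read; |
| post_handler: | the callback passed to ngx_http_read_client_request_body(). |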
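The reference-count bookkeeping described in this section (one increment in ngx_http_read_client_request_body(), one decrement when the content handler's NGX_DONE reaches ngx_http_finalize_request()) can be traced with a small stand-alone model. This is only a sketch under stated assumptions: model_request_t, the model_* functions and the MODEL_* codes are hypothetical names made up for this example, and the functions reproduce only the counting, not the real nginx logic.

```c
#include <assert.h>

/* Hypothetical stand-ins for the nginx return codes mentioned in the text. */
enum { MODEL_OK = 0, MODEL_AGAIN = -2, MODEL_DONE = -4 };

typedef struct {
    int   count;           /* models r->main->count */
    long  rest;            /* body bytes still expected from the client */
    int   handler_called;  /* set once the post_handler callback has run */
} model_request_t;

/* Models ngx_http_read_client_request_body(): take a reference, then either
 * finish at once (no body left to read) or wait for further read events. */
static int
model_read_body(model_request_t *r)
{
    r->count++;

    if (r->rest == 0) {
        r->handler_called = 1;      /* callback runs immediately */
        return MODEL_OK;
    }

    return MODEL_AGAIN;             /* resumed later by a read event */
}

/* Models a content handler such as ngx_http_proxy_handler(). Error codes
 * would be returned unchanged; this model has none, so it returns DONE. */
static int
model_content_handler(model_request_t *r)
{
    (void) model_read_body(r);
    return MODEL_DONE;
}

/* Models the part of ngx_http_finalize_request() that matters here:
 * NGX_DONE only drops one reference. */
static void
model_finalize(model_request_t *r, int rc)
{
    if (rc == MODEL_DONE) {
        assert(r->count > 0);
        r->count--;
    }
}

/* Models the content phase: finalize with the handler's return value. */
static void
model_content_phase(model_request_t *r)
{
    model_finalize(r, model_content_handler(r));
}
```

Running model_content_phase() on a request whose count is 1 leaves the count at 1 whether or not the body is complete, which is the cancellation the text describes: the request stays alive while the body is still being read.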
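With the preparation done, the function checks whether the request has a Content-Length header. If the header is missing, or the client sent a Content-Length of 0, there is no request body, so the function just calls the callback and returns NGX_OK. If the client_body_in_file_only directive is on and the content length is 0, the function creates an empty temporary file before calling the callback.

Reaching the second half of the function means the client really has indicated that it will send a body. The function first checks whether part of the body was pre-read while the request headers were being read, by testing whether the buffer that holds the request headers (r->header_in) still contains unprocessed data. If there is pre-read data, the function allocates an ngx_buf_t structure and makes it refer to the pre-read data in r->header_in. If r->header_in also has free space left and that space can hold the rest of the body, the space is used for further reading and no new buffer is allocated. If the whole body has already been pre-read, nothing more needs to be done, and the function calls the callback and returns.

If there is no pre-read data, or the pre-read data is incomplete, the function allocates a new block of memory (unless r->header_in has enough free space left). In addition, if the client_body_in_single_buffer directive is on (the r->request_body_in_single_buf flag of the request), the pre-read data is copied into the newly allocated block. The actual reading of the body is done in ngx_http_do_read_client_request_body(), which reads the body in a loop and stores it in the buffer; when the buffer is full, its contents are written out to a temporary file and the buffer is emptied. The data may not all arrive in one read, in which case the function registers the read event and sets the read event handler to ngx_http_read_client_request_body_handler. The nginx core also applies a timeout between two successive read events on the body; the client_body_timeout directive sets this timeout, 60 seconds by default, and if the next read event times out, nginx returns 408 to the client.

Once the whole body has been read, ngx_http_do_read_client_request_body() moves the body to the expected location (memory or file) according to the configuration. In all cases the body can be obtained from the bufs chain of r->request_body. The chain has at most 2 links, each of which is a buffer, but the contents of a buffer may be in memory or in a file on disk. Also, the $request_body variable yields data only when the body has already been read and is held entirely in memory.

#### Discarding the request body

A module that wants to discard the request body sent by the client can call the ngx_http_discard_request_body() interface provided by the nginx core. There can be many reasons for doing so: the module's logic may simply not need the body, or the client may have sent a body that is too large. In addition, to stay compatible with pipelined HTTP/1.1 requests, a module is obliged to discard a body it does not need. In short, to remain compatible with clients, nginx must actively discard unused request bodies. The ngx_http_discard_request_body() function is analyzed below:

    ngx_int_t
    ngx_http_discard_request_body(ngx_http_request_t *r)
    {
        ssize_t       size;
        ngx_event_t  *rev;

        if (r != r->main || r->discard_body) {
            return NGX_OK;
        }

        if (ngx_http_test_expect(r) != NGX_OK) {
            return NGX_HTTP_INTERNAL_SERVER_ERROR;
        }

        rev = r->connection->read;

        ngx_log_debug0(NGX_LOG_DEBUG_HTTP, rev->log, 0, "http set discard body");

        if (rev->timer_set) {
            ngx_del_timer(rev);
        }

        if (r->headers_in.content_length_n <= 0 || r->request_body) {
            return NGX_OK;
        }

        size = r->header_in->last - r->header_in->pos;

        if (size) {
            if (r->headers_in.content_length_n > size) {
                r->header_in->pos += size;
                r->headers_in.content_length_n -= size;

            } else {
                r->header_in->pos += (size_t) r->headers_in.content_length_n;
                r->headers_in.content_length_n = 0;
                return NGX_OK;
            }
        }

        r->read_event_handler = ngx_http_discarded_request_body_handler;

        if (ngx_handle_read_event(rev, 0) != NGX_OK) {
            return NGX_HTTP_INTERNAL_SERVER_ERROR;
        }

        if (ngx_http_read_discarded_request_body(r) == NGX_OK) {
            r->lingering_close = 0;

        } else {
            r->count++;
            r->discard_body = 1;
        }

        return NGX_OK;
    }

The function is short, so it is listed here in full. It starts, as before, by handling the cases that need no further work: subrequests are not processed, and neither are requests for which this function has already been called. It then calls ngx_http_test_expect() to handle the HTTP/1.1 Expect case. Under the HTTP/1.1 Expect mechanism, if the client sends an Expect header and the server does not want to receive the body, the server must return a 417 (Expectation Failed) error. nginx does not do this; it simply lets the client send the body and then throws it away.

Next, the function removes the timer on the read event. The body is not wanted in the first place, so it does not matter how fast or slowly the client sends it. As explained later, once nginx has finished processing the request but the client has still not finished sending the unwanted body, nginx puts a timer on the read event again. A client that intends to send a body must send a Content-Length header, so the function checks the Content-Length header and also checks whether the body has already been read elsewhere. If there really is a body to deal with, the function looks at the data pre-read into the request header buffer. The pre-read data is simply dropped, and if the whole body was pre-read, the function returns right away.

If part of the body is still outstanding, the function calls ngx_handle_read_event() to register the read event with the event mechanism and sets the read handler to ngx_http_discarded_request_body_handler. With that in place, it finally calls ngx_http_read_discarded_request_body() to read the body from the client and discard it. If the client has not sent the whole body in one go, the function returns and the remaining data is handled by ngx_http_discarded_request_body_handler() when the next read event arrives. In that case the request's discard_body flag is set to 1 to mark the situation, and the request's reference count (count) is incremented by 1. The client may still not have sent the whole body by the time nginx has finished processing the request, and the extra reference prevents the nginx core from releasing the request's resources as soon as processing is done.

ngx_http_read_discarded_request_body() is very simple. It reads data from the connection in a loop and discards it until the receive buffer has been drained. When the whole body has been read, it sets the read event handler to ngx_http_block_reading, a function that merely removes a level-triggered read event so that the same event does not fire over and over. Finally, consider the read event handler ngx_http_discarded_request_body_handler, which is called every time a read event arrives. Its source is:

    void
    ngx_http_discarded_request_body_handler(ngx_http_request_t *r)
    {
        ...

        c = r->connection;
        rev = c->read;

        if (rev->timedout) {
            c->timedout = 1;
            c->error = 1;
            ngx_http_finalize_request(r, NGX_ERROR);
            return;
        }

        if (r->lingering_time) {
            timer = (ngx_msec_t) (r->lingering_time - ngx_time());

            if (timer <= 0) {
                r->discard_body = 0;
                r->lingering_close = 0;
                ngx_http_finalize_request(r, NGX_ERROR);
                return;
            }

        } else {
            timer = 0;
        }

        rc = ngx_http_read_discarded_request_body(r);

        if (rc == NGX_OK) {
            r->discard_body = 0;
            r->lingering_close = 0;
            ngx_http_finalize_request(r, NGX_DONE);
            return;
        }

        /* rc == NGX_AGAIN */

        if (ngx_handle_read_event(rev, 0) != NGX_OK) {
            c->error = 1;
            ngx_http_finalize_request(r, NGX_ERROR);
            return;
        }

        if (timer) {

            clcf = ngx_http_get_module_loc_conf(r, ngx_http_core_module);

            timer *= 1000;

            if (timer > clcf->lingering_timeout) {
                timer = clcf->lingering_timeout;
            }

            ngx_add_timer(rev, timer);
        }
    }

The function begins by handling a read event timeout. As mentioned above, ngx_http_discard_request_body() removes the timer on the read event, so when is a timer set? It is set when nginx has finished processing the request but has not yet discarded all of its body (the client may not have sent it yet). In ngx_http_finalize_connection(), if nginx finds that part of the body is still to be discarded, it adds a timer to the read event. The timer's length is given by the lingering_timeout directive, 5 seconds by default, but this is only the timeout between two read events. The total time spent waiting for the body is given by the lingering_time directive, 30 seconds by default. In this situation, if the function detects a timeout it returns at once and the connection is closed. Likewise, the total time spent discarding the body must not exceed lingering_time; if it does, the function also returns at once and the connection is closed. If the read event occurs before the request has been fully processed, there is no timeout to handle and no timer to set, and the function simply calls ngx_http_read_discarded_request_body() to read and discard the data.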
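The interplay of the two lingering limits can be shown with a small calculation. The sketch below assumes the case where the request has already been finalized, so a lingering deadline exists; discard_timer_ms() and its parameters are hypothetical names for this example, and signed arithmetic is used for clarity rather than nginx's ngx_msec_t.

```c
#include <assert.h>

/*
 * Returns the delay in milliseconds to arm on the read event while the rest
 * of an unwanted body is awaited, or -1 when the overall lingering_time
 * budget is already spent and the connection should be closed.
 *
 * deadline_sec          absolute time (seconds) at which lingering_time ends
 * now_sec               current time (seconds)
 * lingering_timeout_ms  maximum wait between two read events (milliseconds)
 */
static long
discard_timer_ms(long deadline_sec, long now_sec, long lingering_timeout_ms)
{
    long  remaining_sec, remaining_ms;

    assert(lingering_timeout_ms > 0);

    remaining_sec = deadline_sec - now_sec;
    if (remaining_sec <= 0) {
        return -1;                      /* total budget exhausted */
    }

    remaining_ms = remaining_sec * 1000;

    /* never wait longer than the per-read timeout, nor past the deadline */
    return remaining_ms > lingering_timeout_ms ? lingering_timeout_ms
                                               : remaining_ms;
}
```

With the default values from the text (lingering_time 30 seconds, lingering_timeout 5 seconds), a read event arriving early in the window is followed by a 5000 ms timer, while one arriving 27 seconds in gets only the 3000 ms that remain before the deadline.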