# How filters work in Apache 2.0
### Warning
This is a cut 'n paste job from an email (<022501c1c529$f63a9550$7f00000a@KOJ>) and only reformatted for better readability. It's not up to date but may be a good start for further research.
## Filter Types
There are three basic filter types (each of these is actually broken down into two categories, but that comes later).
`CONNECTION`
Filters of this type are valid for the lifetime of this connection. (`AP_FTYPE_CONNECTION`, `AP_FTYPE_NETWORK`)
`PROTOCOL`
Filters of this type are valid for the lifetime of this request from the point of view of the client, this means that the request is valid from the time that the request is sent until the time that the response is received. (`AP_FTYPE_PROTOCOL`, `AP_FTYPE_TRANSCODE`)
`RESOURCE`
Filters of this type are valid for the time that this content is used to satisfy a request. For simple requests, this is identical to `PROTOCOL`, but internal redirects and sub-requests can change the content without ending the request. (`AP_FTYPE_RESOURCE`, `AP_FTYPE_CONTENT_SET`)
It is important to make the distinction between a protocol and a resource filter. A resource filter is tied to a specific resource, it may also be tied to header information, but the main binding is to a resource. If you are writing a filter and you want to know if it is resource or protocol, the correct question to ask is: "Can this filter be removed if the request is redirected to a different resource?" If the answer is yes, then it is a resource filter. If it is no, then it is most likely a protocol or connection filter. I won't go into connection filters, because they seem to be well understood. With this definition, a few examples might help:
Byterange
We have coded it to be inserted for all requests, and it is removed if not used. Because this filter is active at the beginning of all requests, it can not be removed if it is redirected, so this is a protocol filter.
http_header
This filter actually writes the headers to the network. This is obviously a required filter (except in the asis case which is special and will be dealt with below) and so it is a protocol filter.
Deflate
The administrator configures this filter based on which file has been requested. If we do an internal redirect from an autoindex page to an index.html page, the deflate filter may be added or removed based on config, so this is a resource filter.
The further breakdown of each category into two more filter types is strictly for ordering. We could remove it, and only allow for one filter type, but the order would tend to be wrong, and we would need to hack things to make it work. Currently, the `RESOURCE` filters only have one filter type, but that should change.
## How are filters inserted?
This is actually rather simple in theory, but the code is complex. First of all, it is important that everybody realize that there are three filter lists for each request, but they are all concatenated together. So, the first list is `r->output_filters`, then `r->proto_output_filters`, and finally `r->connection->output_filters`. These correspond to the `RESOURCE`, `PROTOCOL`, and `CONNECTION` filters respectively. The problem previously, was that we used a singly linked list to create the filter stack, and we started from the "correct" location. This means that if I had a `RESOURCE` filter on the stack, and I added a `CONNECTION` filter, the `CONNECTION` filter would be ignored. This should make sense, because we would insert the connection filter at the top of the `c->output_filters` list, but the end of `r->output_filters` pointed to the filter that used to be at the front of `c->output_filters`. This is obviously wrong. The new insertion code uses a doubly linked list. This has the advantage that we never lose a filter that has been inserted. Unfortunately, it comes with a separate set of headaches.
The problem is that we have two different cases were we use subrequests. The first is to insert more data into a response. The second is to replace the existing response with an internal redirect. These are two different cases and need to be treated as such.
In the first case, we are creating the subrequest from within a handler or filter. This means that the next filter should be passed to `make_sub_request` function, and the last resource filter in the sub-request will point to the next filter in the main request. This makes sense, because the sub-request's data needs to flow through the same set of filters as the main request. A graphical representation might help:
```
Default_handler --> includes_filter --> byterange --> ...
```
If the includes filter creates a sub request, then we don't want the data from that sub-request to go through the includes filter, because it might not be SSI data. So, the subrequest adds the following:
```
Default_handler --> includes_filter -/-> byterange --> ...
/
Default_handler --> sub_request_core
```
What happens if the subrequest is SSI data? Well, that's easy, the `includes_filter` is a resource filter, so it will be added to the sub request in between the `Default_handler` and the `sub_request_core` filter.
The second case for sub-requests is when one sub-request is going to become the real request. This happens whenever a sub-request is created outside of a handler or filter, and NULL is passed as the next filter to the `make_sub_request` function.
In this case, the resource filters no longer make sense for the new request, because the resource has changed. So, instead of starting from scratch, we simply point the front of the resource filters for the sub-request to the front of the protocol filters for the old request. This means that we won't lose any of the protocol filters, neither will we try to send this data through a filter that shouldn't see it.
The problem is that we are using a doubly-linked list for our filter stacks now. But, you should notice that it is possible for two lists to intersect in this model. So, you do you handle the previous pointer? This is a very difficult question to answer, because there is no "right" answer, either method is equally valid. I looked at why we use the previous pointer. The only reason for it is to allow for easier addition of new servers. With that being said, the solution I chose was to make the previous pointer always stay on the original request.
This causes some more complex logic, but it works for all cases. My concern in having it move to the sub-request, is that for the more common case (where a sub-request is used to add data to a response), the main filter chain would be wrong. That didn't seem like a good idea to me.
## Asis
The final topic. :-) Mod_Asis is a bit of a hack, but the handler needs to remove all filters except for connection filters, and send the data. If you are using `mod_asis`, all other bets are off.
## Explanations
The absolutely last point is that the reason this code was so hard to get right, was because we had hacked so much to force it to work. I wrote most of the hacks originally, so I am very much to blame. However, now that the code is right, I have started to remove some hacks. Most people should have seen that the `reset_filters`和`add_required_filters` functions are gone. Those inserted protocol level filters for error conditions, in fact, both functions did the same thing, one after the other, it was really strange. Because we don't lose protocol filters for error cases any more, those hacks went away. The `HTTP_HEADER`, `Content-length`, and `Byterange` filters are all added in the `insert_filters` phase, because if they were added earlier, we had some interesting interactions. Now, those could all be moved to be inserted with the `HTTP_IN`, `CORE`, and `CORE_IN` filters. That would make the code easier to follow.
- Apache HTTP Server Version 2.2 文檔 [最后更新:2006年3月21日]
- 版本說明
- 從1.3升級到2.0
- 從2.0升級到2.2
- Apache 2.2 新特性概述
- Apache 2.0 新特性概述
- The Apache License, Version 2.0
- 參考手冊
- 編譯與安裝
- 啟動Apache
- 停止和重啟
- 配置文件
- 配置段(容器)
- 緩沖指南
- 服務器全局配置
- 日志文件
- 從URL到文件系統的映射
- 安全方面的提示
- 動態共享對象(DSO)支持
- 內容協商
- 自定義錯誤響應
- 地址和端口的綁定(Binding)
- 多路處理模塊
- Apache的環境變量
- Apache處理器的使用
- 過濾器(Filter)
- suEXEC支持
- 性能方面的提示
- URL重寫指南
- Apache虛擬主機文檔
- 基于主機名的虛擬主機
- 基于IP地址的虛擬主機
- 大批量虛擬主機的動態配置
- 虛擬主機示例
- 深入研究虛擬主機的匹配
- 文件描述符限制
- 關于DNS和Apache
- 常見問題
- 經常問到的問題
- Apache的SSL/TLS加密
- SSL/TLS高強度加密:緒論
- SSL/TLS高強度加密:兼容性
- SSL/TLS高強度加密:如何...?
- SSL/TLS Strong Encryption: FAQ
- 如何.../指南
- 認證、授權、訪問控制
- CGI動態頁面
- 服務器端包含入門
- .htaccess文件
- 用戶網站目錄
- 針對特定平臺的說明
- 在Microsoft Windows中使用Apache
- 在Microsoft Windows上編譯Apache
- Using Apache With Novell NetWare
- Running a High-Performance Web Server on HPUX
- The Apache EBCDIC Port
- 服務器和支持程序
- httpd - Apache超文本傳輸協議服務器
- ab - Apache HTTP服務器性能測試工具
- apachectl - Apache HTTP服務器控制接口
- apxs - Apache 擴展工具
- configure - 配置源代碼樹
- dbmmanage - 管理DBM格式的用戶認證文件
- htcacheclean - 清理磁盤緩沖區
- htdbm - 操作DBM密碼數據庫
- htdigest - 管理用于摘要認證的用戶文件
- httxt2dbm - 生成RewriteMap指令使用的dbm文件
- htpasswd - 管理用于基本認證的用戶文件
- logresolve - 解析Apache日志中的IP地址為主機名
- rotatelogs - 滾動Apache日志的管道日志程序
- suexec - 在執行外部程序之前切換用戶
- 其他程序
- 雜項文檔
- 與Apache相關的標準
- Apache模塊
- 描述模塊的術語
- 描述指令的術語
- Apache核心(Core)特性
- Apache MPM 公共指令
- Apache MPM beos
- Apache MPM event
- Apache MPM netware
- Apache MPM os2
- Apache MPM prefork
- Apache MPM winnt
- Apache MPM worker
- Apache模塊 mod_actions
- Apache模塊 mod_alias
- Apache模塊 mod_asis
- Apache模塊 mod_auth_basic
- Apache模塊 mod_auth_digest
- Apache模塊 mod_authn_alias
- Apache模塊 mod_authn_anon
- Apache模塊 mod_authn_dbd
- Apache模塊 mod_authn_dbm
- Apache模塊 mod_authn_default
- Apache模塊 mod_authn_file
- Apache模塊 mod_authnz_ldap
- Apache模塊 mod_authz_dbm
- Apache模塊 mod_authz_default
- Apache模塊 mod_authz_groupfile
- Apache模塊 mod_authz_host
- Apache模塊 mod_authz_owner
- Apache模塊 mod_authz_user
- Apache模塊 mod_autoindex
- Apache模塊 mod_cache
- Apache模塊 mod_cern_meta
- Apache模塊 mod_cgi
- Apache模塊 mod_cgid
- Apache模塊 mod_charset_lite
- Apache模塊 mod_dav
- Apache模塊 mod_dav_fs
- Apache模塊 mod_dav_lock
- Apache模塊 mod_dbd
- Apache模塊 mod_deflate
- Apache模塊 mod_dir
- Apache模塊 mod_disk_cache
- Apache模塊 mod_dumpio
- Apache模塊 mod_echo
- Apache模塊 mod_env
- Apache模塊 mod_example
- Apache模塊 mod_expires
- Apache模塊 mod_ext_filter
- Apache模塊 mod_file_cache
- Apache模塊 mod_filter
- Apache模塊 mod_headers
- Apache模塊 mod_ident
- Apache模塊 mod_imagemap
- Apache模塊 mod_include
- Apache模塊 mod_info
- Apache模塊 mod_isapi
- Apache模塊 mod_ldap
- Apache模塊 mod_log_config
- Apache模塊 mod_log_forensic
- Apache模塊 mod_logio
- Apache模塊 mod_mem_cache
- Apache模塊 mod_mime
- Apache模塊 mod_mime_magic
- Apache模塊 mod_negotiation
- Apache模塊 mod_nw_ssl
- Apache模塊 mod_proxy
- Apache模塊 mod_proxy_ajp
- Apache模塊 mod_proxy_balancer
- Apache模塊 mod_proxy_connect
- Apache模塊 mod_proxy_ftp
- Apache模塊 mod_proxy_http
- Apache模塊 mod_rewrite
- Apache模塊 mod_setenvif
- Apache模塊 mod_so
- Apache模塊 mod_speling
- Apache模塊 mod_ssl
- Apache模塊 mod_status
- Apache模塊 mod_suexec
- Apache模塊 mod_unique_id
- Apache模塊 mod_userdir
- Apache模塊 mod_usertrack
- Apache模塊 mod_version
- Apache模塊 mod_vhost_alias
- Developer Documentation for Apache 2.0
- Apache 1.3 API notes
- Debugging Memory Allocation in APR
- Documenting Apache 2.0
- Apache 2.0 Hook Functions
- Converting Modules from Apache 1.3 to Apache 2.0
- Request Processing in Apache 2.0
- How filters work in Apache 2.0
- Apache 2.0 Thread Safety Issues
- 詞匯和索引
- 詞匯表
- 指令索引
- 指令速查
- 模塊索引
- 站點導航