# Apache模塊 mod_unique_id
| [說明](#calibre_link-11) | 為每個請求生成唯一的標識以便跟蹤 |
| --- | --- |
| [狀態](#calibre_link-12) | 擴展(E) |
| [模塊名](#calibre_link-13) | unique_id_module |
| [源文件](#calibre_link-14) | mod_unique_id.c |
### 概述
This module provides a magic token for each request which is guaranteed to be unique across "all" requests under very specific conditions. The unique identifier is even unique across multiple machines in a properly configured cluster of machines. The environment variable `UNIQUE_ID` is set to the identifier for each request. Unique identifiers are useful for various reasons which are beyond the scope of this document.
## Theory
First a brief recap of how the Apache server works on Unix machines. This feature currently isn't supported on Windows NT. On Unix machines, Apache creates several children, the children process requests one at a time. Each child can serve multiple requests in its lifetime. For the purpose of this discussion, the children don't share any data with each other. We'll refer to the children as <dfn class="calibre27">httpd processes</dfn>.
Your website has one or more machines under your administrative control, together we'll call them a cluster of machines. Each machine can possibly run multiple instances of Apache. All of these collectively are considered "the universe", and with certain assumptions we'll show that in this universe we can generate unique identifiers for each request, without extensive communication between machines in the cluster.
The machines in your cluster should satisfy these requirements. (Even if you have only one machine you should synchronize its clock with NTP.)
* The machines' times are synchronized via NTP or other network time protocol.
* The machines' hostnames all differ, such that the module can do a hostname lookup on the hostname and receive a different IP address for each machine in the cluster.
As far as operating system assumptions go, we assume that pids (process ids) fit in 32-bits. If the operating system uses more than 32-bits for a pid, the fix is trivial but must be performed in the code.
Given those assumptions, at a single point in time we can identify any httpd process on any machine in the cluster from all other httpd processes. The machine's IP address and the pid of the httpd process are sufficient to do this. So in order to generate unique identifiers for requests we need only distinguish between different points in time.
To distinguish time we will use a Unix timestamp (seconds since January 1, 1970 UTC), and a 16-bit counter. The timestamp has only one second granularity, so the counter is used to represent up to 65536 values during a single second. The quadruple _( ip_addr, pid, time_stamp, counter )_ is sufficient to enumerate 65536 requests per second per httpd process. There are issues however with pid reuse over time, and the counter is used to alleviate this issue.
When an httpd child is created, the counter is initialized with ( current microseconds divided by 10 ) modulo 65536 (this formula was chosen to eliminate some variance problems with the low order bits of the microsecond timers on some systems). When a unique identifier is generated, the time stamp used is the time the request arrived at the web server. The counter is incremented every time an identifier is generated (and allowed to roll over).
The kernel generates a pid for each process as it forks the process, and pids are allowed to roll over (they're 16-bits on many Unixes, but newer systems have expanded to 32-bits). So over time the same pid will be reused. However unless it is reused within the same second, it does not destroy the uniqueness of our quadruple. That is, we assume the system does not spawn 65536 processes in a one second interval (it may even be 32768 processes on some Unixes, but even this isn't likely to happen).
Suppose that time repeats itself for some reason. That is, suppose that the system's clock is screwed up and it revisits a past time (or it is too far forward, is reset correctly, and then revisits the future time). In this case we can easily show that we can get pid and time stamp reuse. The choice of initializer for the counter is intended to help defeat this. Note that we really want a random number to initialize the counter, but there aren't any readily available numbers on most systems (_i.e._, you can't use rand() because you need to seed the generator, and can't seed it with the time because time, at least at one second resolution, has repeated itself). This is not a perfect defense.
How good a defense is it? Suppose that one of your machines serves at most 500 requests per second (which is a very reasonable upper bound at this writing, because systems generally do more than just shovel out static files). To do that it will require a number of children which depends on how many concurrent clients you have. But we'll be pessimistic and suppose that a single child is able to serve 500 requests per second. There are 1000 possible starting counter values such that two sequences of 500 requests overlap. So there is a 1.5% chance that if time (at one second resolution) repeats itself this child will repeat a counter value, and uniqueness will be broken. This was a very pessimistic example, and with real world values it's even less likely to occur. If your system is such that it's still likely to occur, then perhaps you should make the counter 32 bits (by editing the code).
You may be concerned about the clock being "set back" during summer daylight savings. However this isn't an issue because the times used here are UTC, which "always" go forward. Note that x86 based Unixes may need proper configuration for this to be true -- they should be configured to assume that the motherboard clock is on UTC and compensate appropriately. But even still, if you're running NTP then your UTC time will be correct very shortly after reboot.
`UNIQUE_ID` environment variable is constructed by encoding the 112-bit (32-bit IP address, 32 bit pid, 32 bit time stamp, 16 bit counter) quadruple using the alphabet `[A-Za-z0-9@-]` in a manner similar to MIME base64 encoding, producing 19 characters. The MIME base64 alphabet is actually `[A-Za-z0-9+/]` however `+`和`/` need to be specially encoded in URLs, which makes them less desirable. All values are encoded in network byte ordering so that the encoding is comparable across architectures of different byte ordering. The actual ordering of the encoding is: time stamp, IP address, pid, counter. This ordering has a purpose, but it should be emphasized that applications should not dissect the encoding. Applications should treat the entire encoded `UNIQUE_ID` as an opaque token, which can be compared against other `UNIQUE_ID`s for equality only.
The ordering was chosen such that it's possible to change the encoding in the future without worrying about collision with an existing database of `UNIQUE_ID`s. The new encodings should also keep the time stamp as the first element, and can otherwise use the same alphabet and bit length. Since the time stamps are essentially an increasing sequence, it's sufficient to have a _flag second_ in which all machines in the cluster stop serving and request, and stop using the old encoding format. Afterwards they can resume requests and begin issuing the new encodings.
This we believe is a relatively portable solution to this problem. It can be extended to multithreaded systems like Windows NT, and can grow with future needs. The identifiers generated have essentially an infinite life-time because future identifiers can be made longer as required. Essentially no communication is required between machines in the cluster (only NTP synchronization is required, which is low overhead), and no communication between httpd processes is required (the communication is implicit in the pid value assigned by the kernel). In very specific situations the identifier can be shortened, but more information needs to be assumed (for example the 32-bit IP address is overkill for any site, but there is no portable shorter replacement for it).
- Apache HTTP Server Version 2.2 文檔 [最后更新:2006年3月21日]
- 版本說明
- 從1.3升級到2.0
- 從2.0升級到2.2
- Apache 2.2 新特性概述
- Apache 2.0 新特性概述
- The Apache License, Version 2.0
- 參考手冊
- 編譯與安裝
- 啟動Apache
- 停止和重啟
- 配置文件
- 配置段(容器)
- 緩沖指南
- 服務器全局配置
- 日志文件
- 從URL到文件系統的映射
- 安全方面的提示
- 動態共享對象(DSO)支持
- 內容協商
- 自定義錯誤響應
- 地址和端口的綁定(Binding)
- 多路處理模塊
- Apache的環境變量
- Apache處理器的使用
- 過濾器(Filter)
- suEXEC支持
- 性能方面的提示
- URL重寫指南
- Apache虛擬主機文檔
- 基于主機名的虛擬主機
- 基于IP地址的虛擬主機
- 大批量虛擬主機的動態配置
- 虛擬主機示例
- 深入研究虛擬主機的匹配
- 文件描述符限制
- 關于DNS和Apache
- 常見問題
- 經常問到的問題
- Apache的SSL/TLS加密
- SSL/TLS高強度加密:緒論
- SSL/TLS高強度加密:兼容性
- SSL/TLS高強度加密:如何...?
- SSL/TLS Strong Encryption: FAQ
- 如何.../指南
- 認證、授權、訪問控制
- CGI動態頁面
- 服務器端包含入門
- .htaccess文件
- 用戶網站目錄
- 針對特定平臺的說明
- 在Microsoft Windows中使用Apache
- 在Microsoft Windows上編譯Apache
- Using Apache With Novell NetWare
- Running a High-Performance Web Server on HPUX
- The Apache EBCDIC Port
- 服務器和支持程序
- httpd - Apache超文本傳輸協議服務器
- ab - Apache HTTP服務器性能測試工具
- apachectl - Apache HTTP服務器控制接口
- apxs - Apache 擴展工具
- configure - 配置源代碼樹
- dbmmanage - 管理DBM格式的用戶認證文件
- htcacheclean - 清理磁盤緩沖區
- htdbm - 操作DBM密碼數據庫
- htdigest - 管理用于摘要認證的用戶文件
- httxt2dbm - 生成RewriteMap指令使用的dbm文件
- htpasswd - 管理用于基本認證的用戶文件
- logresolve - 解析Apache日志中的IP地址為主機名
- rotatelogs - 滾動Apache日志的管道日志程序
- suexec - 在執行外部程序之前切換用戶
- 其他程序
- 雜項文檔
- 與Apache相關的標準
- Apache模塊
- 描述模塊的術語
- 描述指令的術語
- Apache核心(Core)特性
- Apache MPM 公共指令
- Apache MPM beos
- Apache MPM event
- Apache MPM netware
- Apache MPM os2
- Apache MPM prefork
- Apache MPM winnt
- Apache MPM worker
- Apache模塊 mod_actions
- Apache模塊 mod_alias
- Apache模塊 mod_asis
- Apache模塊 mod_auth_basic
- Apache模塊 mod_auth_digest
- Apache模塊 mod_authn_alias
- Apache模塊 mod_authn_anon
- Apache模塊 mod_authn_dbd
- Apache模塊 mod_authn_dbm
- Apache模塊 mod_authn_default
- Apache模塊 mod_authn_file
- Apache模塊 mod_authnz_ldap
- Apache模塊 mod_authz_dbm
- Apache模塊 mod_authz_default
- Apache模塊 mod_authz_groupfile
- Apache模塊 mod_authz_host
- Apache模塊 mod_authz_owner
- Apache模塊 mod_authz_user
- Apache模塊 mod_autoindex
- Apache模塊 mod_cache
- Apache模塊 mod_cern_meta
- Apache模塊 mod_cgi
- Apache模塊 mod_cgid
- Apache模塊 mod_charset_lite
- Apache模塊 mod_dav
- Apache模塊 mod_dav_fs
- Apache模塊 mod_dav_lock
- Apache模塊 mod_dbd
- Apache模塊 mod_deflate
- Apache模塊 mod_dir
- Apache模塊 mod_disk_cache
- Apache模塊 mod_dumpio
- Apache模塊 mod_echo
- Apache模塊 mod_env
- Apache模塊 mod_example
- Apache模塊 mod_expires
- Apache模塊 mod_ext_filter
- Apache模塊 mod_file_cache
- Apache模塊 mod_filter
- Apache模塊 mod_headers
- Apache模塊 mod_ident
- Apache模塊 mod_imagemap
- Apache模塊 mod_include
- Apache模塊 mod_info
- Apache模塊 mod_isapi
- Apache模塊 mod_ldap
- Apache模塊 mod_log_config
- Apache模塊 mod_log_forensic
- Apache模塊 mod_logio
- Apache模塊 mod_mem_cache
- Apache模塊 mod_mime
- Apache模塊 mod_mime_magic
- Apache模塊 mod_negotiation
- Apache模塊 mod_nw_ssl
- Apache模塊 mod_proxy
- Apache模塊 mod_proxy_ajp
- Apache模塊 mod_proxy_balancer
- Apache模塊 mod_proxy_connect
- Apache模塊 mod_proxy_ftp
- Apache模塊 mod_proxy_http
- Apache模塊 mod_rewrite
- Apache模塊 mod_setenvif
- Apache模塊 mod_so
- Apache模塊 mod_speling
- Apache模塊 mod_ssl
- Apache模塊 mod_status
- Apache模塊 mod_suexec
- Apache模塊 mod_unique_id
- Apache模塊 mod_userdir
- Apache模塊 mod_usertrack
- Apache模塊 mod_version
- Apache模塊 mod_vhost_alias
- Developer Documentation for Apache 2.0
- Apache 1.3 API notes
- Debugging Memory Allocation in APR
- Documenting Apache 2.0
- Apache 2.0 Hook Functions
- Converting Modules from Apache 1.3 to Apache 2.0
- Request Processing in Apache 2.0
- How filters work in Apache 2.0
- Apache 2.0 Thread Safety Issues
- 詞匯和索引
- 詞匯表
- 指令索引
- 指令速查
- 模塊索引
- 站點導航