
## Configuration

---

Prometheus is configured through command-line flags and a configuration file. The command-line flags set system parameters (such as storage locations and the amount of data to keep on disk and in memory), while the configuration file defines everything related to scraping [jobs and their instances](https://prometheus.io/docs/concepts/jobs_instances/), as well as which [rule files](https://prometheus.io/docs/querying/rules/#configuring-rules) to load.

To view all available command-line flags, run `prometheus -h`.

Prometheus can reload its configuration at runtime. If the new configuration is not well-formed, the changes are not applied. A reload is triggered by sending `SIGHUP` to the Prometheus process or by sending an HTTP POST request to the `/-/reload` endpoint. This also reloads all configured rule files.

### Configuration file

Use the `-config.file` command-line flag to specify the configuration file that Prometheus loads at startup.

The file is written in [YAML](http://en.wikipedia.org/wiki/YAML) format, following the scheme described below. Brackets indicate that a parameter is optional. For non-list parameters, the value is set to the specified default.

Generic placeholders are defined as follows:

- `<boolean>`: a boolean that can take the values `true` or `false`
- `<duration>`: a duration matching the regular expression `[0-9]+(ms|[smhdwy])`
- `<labelname>`: a string matching the regular expression `[a-zA-Z_][a-zA-Z0-9_]*`
- `<labelvalue>`: a string of unicode characters
- `<filename>`: a valid path in the current working directory
- `<host>`: a valid string consisting of a hostname or IP address followed by an optional port number
- `<path>`: a valid URL path
- `<scheme>`: a string that can take the values `http` or `https`
- `<string>`: a regular string

The other placeholders are specified separately.

A valid example configuration file can be found [here](https://github.com/prometheus/prometheus/blob/master/config/testdata/conf.good.yml).

The global configuration specifies parameters that are valid in all other configuration contexts. They also serve as defaults for the other configuration sections.

```
global:
  # How frequently to scrape targets by default.
  [ scrape_interval: <duration> | default = 1m ]

  # How long until a scrape request times out.
  [ scrape_timeout: <duration> | default = 10s ]

  # How frequently to evaluate rules.
  [ evaluation_interval: <duration> | default = 1m ]

  # The labels to add to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    [ <labelname>: <labelvalue> ... ]

# Rule files specifies a list of file path globs. Rules and alerts are read
# from all matching files.
rule_files:
  [ - <filepath_glob> ... ]

# A list of scrape configurations.
scrape_configs:
  [ - <scrape_config> ... ]

# Alerting settings.
alerting:
  alert_relabel_configs:
    [ - <relabel_config> ... ]
  alertmanagers:
    [ - <alertmanager_config> ... ]

# Settings related to the experimental remote write feature.
remote_write:
  [ url: <string> ]
  [ remote_timeout: <duration> | default = 30s ]
  tls_config:
    [ <tls_config> ]
  [ proxy_url: <string> ]
  basic_auth:
    [ username: <string> ]
    [ password: <string> ]
  write_relabel_configs:
    [ - <relabel_config> ... ]
```

#### `<scrape_config>`

A `<scrape_config>` section specifies a set of targets and the parameters that describe how to scrape them. In the general case, one scrape configuration specifies a single job. In advanced configurations, this may change.

Targets may be statically configured via the `static_configs` parameter or dynamically discovered using one of the supported service discovery mechanisms.

Additionally, `relabel_configs` allow further modifications to any target and its labels before scraping.

```
# The job name assigned to scraped metrics by default.
job_name: <job_name>

# How frequently to scrape targets from this job.
[ scrape_interval: <duration> | default = <global_config.scrape_interval> ]

# Per-scrape timeout when scraping this job.
[ scrape_timeout: <duration> | default = <global_config.scrape_timeout> ]

# The HTTP resource path on which to fetch metrics from targets.
[ metrics_path: <path> | default = /metrics ]

# honor_labels controls how Prometheus handles conflicts between labels that are
# already present in scraped data and labels that Prometheus would attach
# server-side ("job" and "instance" labels, manually configured target
# labels, and labels generated by service discovery implementations).
#
# If honor_labels is set to "true", label conflicts are resolved by keeping label
# values from the scraped data and ignoring the conflicting server-side labels.
#
# If honor_labels is set to "false", label conflicts are resolved by renaming
# conflicting labels in the scraped data to "exported_<original-label>" (for
# example "exported_instance", "exported_job") and then attaching server-side
# labels. This is useful for use cases such as federation, where all labels
# specified in the target should be preserved.
#
# Note that any globally configured "external_labels" are unaffected by this
# setting. In communication with external systems, they are always applied only
# when a time series does not have a given label yet and are ignored otherwise.
[ honor_labels: <boolean> | default = false ]

# Configures the protocol scheme used for requests.
[ scheme: <scheme> | default = http ]

# Optional HTTP URL parameters.
params:
  [ <string>: [<string>, ...] ]

# Sets the `Authorization` header on every scrape request with the
# configured username and password.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]

# Sets the `Authorization` header on every scrape request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]

# Sets the `Authorization` header on every scrape request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the scrape request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# List of Azure service discovery configurations.
azure_sd_configs:
  [ - <azure_sd_config> ... ]

# List of Consul service discovery configurations.
consul_sd_configs:
  [ - <consul_sd_config> ... ]

# List of DNS service discovery configurations.
dns_sd_configs:
  [ - <dns_sd_config> ... ]

# List of EC2 service discovery configurations.
ec2_sd_configs:
  [ - <ec2_sd_config> ... ]

# List of file service discovery configurations.
file_sd_configs:
  [ - <file_sd_config> ... ]

# List of GCE service discovery configurations.
gce_sd_configs:
  [ - <gce_sd_config> ... ]

# List of Kubernetes service discovery configurations.
kubernetes_sd_configs:
  [ - <kubernetes_sd_config> ... ]

# List of Marathon service discovery configurations.
marathon_sd_configs:
  [ - <marathon_sd_config> ... ]

# List of AirBnB's Nerve service discovery configurations.
nerve_sd_configs:
  [ - <nerve_sd_config> ... ]

# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
  [ - <serverset_sd_config> ... ]

# List of Triton service discovery configurations.
triton_sd_configs:
  [ - <triton_sd_config> ... ]

# List of labeled statically configured targets for this job.
static_configs:
  [ - <static_config> ... ]

# List of target relabel configurations, applied before scraping.
relabel_configs:
  [ - <relabel_config> ... ]

# List of metric relabel configurations.
metric_relabel_configs:
  [ - <relabel_config> ... ]

# Per-scrape limit on number of scraped samples that will be accepted.
# If more than this number of samples are present after metric relabelling
# the entire scrape will be treated as failed. 0 means no limit.
[ sample_limit: <int> | default = 0 ]
```

Note that `<job_name>` must be unique across all scrape configurations.

#### `<tls_config>`

A `<tls_config>` allows configuring TLS connections.

```
# CA certificate to validate the server certificate with.
[ ca_file: <filename> ]

# Certificate and key files for client certificate authentication.
[ cert_file: <filename> ]
[ key_file: <filename> ]

# ServerName extension to indicate the name of the server.
# http://tools.ietf.org/html/rfc4366#section-3.1
[ server_name: <string> ]

# Disable validation of the server certificate.
[ insecure_skip_verify: <boolean> ]
```

#### `<azure_sd_config>`

**Azure SD is in beta: breaking changes to configuration are still likely in future releases.**

Azure SD configurations allow retrieving scrape targets from Azure VMs.

The following meta labels are available on targets during relabeling:

- `__meta_azure_machine_id`: the machine ID
- `__meta_azure_machine_location`: the location the machine runs in
- `__meta_azure_machine_name`: the machine name
- `__meta_azure_machine_private_ip`: the machine's private IP
- `__meta_azure_machine_resource_group`: the machine's resource group
- `__meta_azure_tag_<tagname>`: each tag value of the machine

See below for the configuration options for Azure discovery:

```
# The information to access the Azure API.
# The subscription ID.
subscription_id: <string>
# The tenant ID.
tenant_id: <string>
# The client ID.
client_id: <string>
# The client secret.
client_secret: <string>

# Refresh interval to re-read the instance list.
[ refresh_interval: <duration> | default = 300s ]

# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]
```

#### `<consul_sd_config>`

Consul SD configurations allow retrieving scrape targets from Consul's Catalog API.

The following meta labels are available on targets during relabeling:

- `__meta_consul_address`: the address of the target
- `__meta_consul_dc`: the datacenter name for the target
- `__meta_consul_node`: the node name defined for the target
- `__meta_consul_service_address`: the service address of the target
- `__meta_consul_service_id`: the service ID of the target
- `__meta_consul_service_port`: the service port of the target
- `__meta_consul_service`: the name of the service the target belongs to
- `__meta_consul_tags`: the list of tags of the target joined by the tag separator

```
# The information to access the Consul API.
server: <host>
[ token: <string> ]
[ datacenter: <string> ]
[ scheme: <string> ]
[ username: <string> ]
[ password: <string> ]

# A list of services for which targets are retrieved. If omitted, all services
# are scraped.
services:
  [ - <string> ]

# The string by which Consul tags are joined into the tag label.
[ tag_separator: <string> | default = , ]
```

Note that the IP address and port used to scrape the targets are assembled as `<__meta_consul_address>:<__meta_consul_service_port>`. However, in some Consul setups, the relevant address is in `__meta_consul_service_address`. In those cases, you can use the [relabel](https://prometheus.io/docs/operating/configuration/#relabel_config) feature to replace the special `__address__` label.

#### `<dns_sd_config>`

A DNS-based service discovery configuration allows specifying a set of DNS domain names which are periodically queried to discover a list of targets. The DNS servers to be contacted are read from `/etc/resolv.conf`.

This service discovery method only supports basic DNS A, AAAA and SRV record queries, but not the advanced DNS-SD approach specified in RFC6763.

During the [relabeling phase](https://prometheus.io/docs/operating/configuration/#relabel_config), the meta label `__meta_dns_name` is available on each target and is set to the record name that produced the discovered target.

```
# A list of DNS domain names to be queried.
names:
  [ - <domain_name> ]

# The type of DNS query to perform.
[ type: <query_type> | default = 'SRV' ]

# The port number used if the query type is not SRV.
[ port: <number> ]

# The time after which the provided names are refreshed.
[ refresh_interval: <duration> | default = 30s ]
```

`<domain_name>` must be a valid DNS domain name. `<query_type>` must be one of `SRV`, `A`, or `AAAA`.

#### `<ec2_sd_config>`

EC2 SD configurations allow retrieving scrape targets from AWS EC2 instances. The private IP address is used by default, but may be changed to the public IP address with relabeling.

The following meta labels are available on targets during relabeling:

- `__meta_ec2_availability_zone`: the availability zone in which the instance is running
- `__meta_ec2_instance_id`: the EC2 instance ID
- `__meta_ec2_instance_state`: the state of the EC2 instance
- `__meta_ec2_instance_type`: the type of the EC2 instance
- `__meta_ec2_private_ip`: the private IP address of the instance, if present
- `__meta_ec2_public_dns_name`: the public DNS name of the instance, if available
- `__meta_ec2_public_ip`: the public IP address of the instance, if available
- `__meta_ec2_subnet_id`: the list of subnet IDs in which the instance is running, if available
- `__meta_ec2_tag_<tagkey>`: each tag value of the instance
- `__meta_ec2_vpc_id`: the ID of the VPC in which the instance is running, if available

See below for the configuration options for EC2 discovery:

```
# The information to access the EC2 API.

# The AWS Region.
region: <string>

# The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
# and `AWS_SECRET_ACCESS_KEY` are used.
[ access_key: <string> ]
[ secret_key: <string> ]
# Named AWS profile used to connect to the API.
[ profile: <string> ]

# Refresh interval to re-read the instance list.
[ refresh_interval: <duration> | default = 60s ]

# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]
```

#### `<file_sd_config>`

File-based service discovery provides a more generic way to configure static targets and serves as an interface to plug in custom service discovery mechanisms.

It reads a set of files containing a list of zero or more `<static_config>`s. Changes to all defined files are detected via disk watches and applied immediately. Files may be provided in YAML or JSON format. Only changes resulting in well-formed target groups are applied.

The JSON file must contain a list of static configs, using this format:

```
[
  {
    "targets": [ "<host>", ... ],
    "labels": {
      "<labelname>": "<labelvalue>", ...
    }
  },
  ...
]
```

The file contents are also re-read periodically at the specified refresh interval.

During the relabeling phase, each target has a meta label `__meta_filepath`. Its value is set to the path of the file from which the target was extracted.

```
# Patterns for files from which target groups are extracted.
files:
  [ - <filename_pattern> ... ]

# Refresh interval to re-read the files.
[ refresh_interval: <duration> | default = 5m ]
```

`<filename_pattern>` may be a path ending in `.json`, `.yml` or `.yaml`. The last path segment may contain a single `*` that matches any character sequence, e.g. `my/path/tg_*.json`.

As of `v0.20`, `files:` is used in place of `names:`.

#### `<gce_sd_config>`

**GCE SD is in beta: breaking changes to configuration are still likely in future releases.**

GCE SD configurations allow retrieving scrape targets from GCP GCE instances. The private IP address is used by default, but may be changed to the public IP address with relabeling.

The following meta labels are available on targets during relabeling:

- `__meta_gce_instance_name`: the name of the instance
- `__meta_gce_metadata_<name>`: each metadata item of the instance
- `__meta_gce_network`: the network of the instance
- `__meta_gce_private_ip`: the private IP address of the instance
- `__meta_gce_project`: the GCP project in which the instance is running
- `__meta_gce_public_ip`: the public IP address of the instance, if present
- `__meta_gce_subnetwork`: the subnetwork of the instance
- `__meta_gce_tags`: the list of instance tags
- `__meta_gce_zone`: the GCE zone in which the instance is running

See below for the configuration options for GCE discovery:

```
# The information to access the GCE API.

# The GCP Project
project: <string>

# The zone of the scrape targets. If you need multiple zones use multiple
# gce_sd_configs.
zone: <string>

# Filter can be used optionally to filter the instance list by other criteria
[ filter: <string> ]

# Refresh interval to re-read the instance list
[ refresh_interval: <duration> | default = 60s ]

# The port to scrape metrics from. If using the public IP address, this must
# instead be specified in the relabeling rule.
[ port: <int> | default = 80 ]

# The tag separator is used to separate the tags on concatenation
[ tag_separator: <string> | default = , ]
```

Credentials are discovered by the Google Cloud SDK default client by looking in the following places, preferring the first location found:

1. a JSON file specified by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable
2. a JSON file in the well-known path `$HOME/.config/gcloud/application_default_credentials.json`
3. fetched from the GCE metadata server

If Prometheus is running within GCE, the service account associated with the instance it is running on should have at least read-only permissions to the compute resources. If running outside of GCE, make sure to create an appropriate service account and place the credential file in one of the expected locations.

#### `<kubernetes_sd_config>`

**Kubernetes SD is in beta: breaking changes to configuration are still likely in future releases.**

Kubernetes SD configurations allow retrieving scrape targets from Kubernetes' REST API and always staying synchronized with the cluster state.

One of the following `role` types can be configured to discover targets:

##### node

The `node` role discovers one target per cluster node, with the address defaulting to the Kubelet's HTTP port. The target address defaults to the first existing address of the Kubernetes node object in the address type order of `NodeInternalIP`, `NodeExternalIP`, `NodeLegacyHostIP`, and `NodeHostName`.

Available meta labels:

- `__meta_kubernetes_node_name`: the name of the node object
- `__meta_kubernetes_node_label_<labelname>`: each label from the node object
- `__meta_kubernetes_node_annotation_<annotationname>`: each annotation from the node object
- `__meta_kubernetes_node_address_<address_type>`: the first address for each node address type, if it exists

In addition, the `instance` label for the node is set to the node name as retrieved from the API server.

##### service

The `service` role discovers a target for each service port of each service. This is generally useful for blackbox monitoring of a service. The address is set to the Kubernetes DNS name of the service and the respective service port.

Available meta labels:

- `__meta_kubernetes_namespace`: the namespace of the service object
- `__meta_kubernetes_service_name`: the name of the service object
- `__meta_kubernetes_service_label_<labelname>`: the label of the service object
- `__meta_kubernetes_service_annotation_<annotationname>`: the annotation of the service object
- `__meta_kubernetes_service_port_name`: the name of the service port for the target
- `__meta_kubernetes_service_port_number`: the number of the service port for the target
- `__meta_kubernetes_service_port_protocol`: the protocol of the service port for the target

##### pod

The `pod` role discovers all pods and exposes their containers as targets. For each declared port of a container, a single target is generated. If a container has no specified ports, a port-free target per container is created so that a port can be added manually via relabeling.

Available meta labels:

- `__meta_kubernetes_namespace`: the namespace of the pod object
- `__meta_kubernetes_pod_name`: the name of the pod object
- `__meta_kubernetes_pod_ip`: the pod IP of the pod object
- `__meta_kubernetes_pod_label_<labelname>`: the label of the pod object
- `__meta_kubernetes_pod_annotation_<annotationname>`: the annotation of the pod object
- `__meta_kubernetes_pod_container_name`: the name of the container the target address points to
- `__meta_kubernetes_pod_container_port_name`: the name of the container port
- `__meta_kubernetes_pod_container_port_number`: the number of the container port
- `__meta_kubernetes_pod_container_port_protocol`: the protocol of the container port
- `__meta_kubernetes_pod_ready`: set to `true` or `false` for the pod's ready state
- `__meta_kubernetes_pod_node_name`: the name of the node the pod is scheduled onto
- `__meta_kubernetes_pod_host_ip`: the current host IP of the pod object

##### endpoints

The `endpoints` role discovers targets from the listed endpoints of a service. For each endpoint address, one target is discovered per port. If the endpoint is backed by a pod, all additional container ports of the pod that are not bound to an endpoint port are discovered as targets as well.

Available meta labels:

- `__meta_kubernetes_namespace`: the namespace of the endpoints object
- `__meta_kubernetes_endpoints_name`: the name of the endpoints object
- For all targets discovered directly from the endpoints list, the following labels are attached:
  - `__meta_kubernetes_endpoint_ready`: set to `true` or `false` for the endpoint's ready state
  - `__meta_kubernetes_endpoint_port_name`: the name of the endpoint port
  - `__meta_kubernetes_endpoint_port_protocol`: the protocol of the endpoint port
- If the endpoints belong to a service, all labels of the `role: service` discovery are attached.
- For all targets backed by a pod, all labels of the `role: pod` discovery are attached.

See below for the configuration options for Kubernetes discovery:

```
# The information to access the Kubernetes API.

# The API server addresses. If left empty, Prometheus is assumed to run inside
# of the cluster and will discover API servers automatically and use the pod's
# CA certificate and bearer token file at /var/run/secrets/kubernetes.io/serviceaccount/.
[ api_server: <host> ]

# The Kubernetes role of entities that should be discovered.
role: <role>

# Optional authentication information used to authenticate to the API server.
# Note that `basic_auth`, `bearer_token` and `bearer_token_file` options are
# mutually exclusive.

# Optional HTTP basic authentication information.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]

# Optional bearer token authentication information.
[ bearer_token: <string> ]

# Optional bearer token file authentication information.
[ bearer_token_file: <filename> ]

# TLS configuration.
tls_config:
  [ <tls_config> ]
```

`<role>` must be `endpoints`, `service`, `pod`, or `node`.

See [this example Prometheus configuration file](https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml) for a detailed example of configuring Prometheus for Kubernetes.

You may also wish to check out the third-party Prometheus Operator, which automates the Prometheus setup on top of Kubernetes.

#### `<marathon_sd_config>`

**Marathon SD is in beta: breaking changes to configuration are still likely in future releases.**

Marathon SD configurations allow retrieving scrape targets using the [Marathon](https://mesosphere.github.io/marathon/) REST API. Prometheus periodically checks the REST endpoint for currently running tasks and creates a target group for every app that has at least one healthy task.

The following meta labels are available on targets during relabeling:

- `__meta_marathon_app`: the name of the app
- `__meta_marathon_image`: the name of the Docker image used
- `__meta_marathon_task`: the ID of the Mesos task
- `__meta_marathon_app_label_<labelname>`: any Marathon labels attached to the app

See below for the configuration options for Marathon discovery:

```
# List of URLs to be used to contact Marathon servers.
# You need to provide at least one server URL, but should provide URLs for
# all masters you have running.
servers:
  - <string>

# Polling interval
[ refresh_interval: <duration> | default = 30s ]
```

By default, every app listed in Marathon is scraped by Prometheus. If not all of your services provide Prometheus metrics, you can use a Marathon label and Prometheus relabeling to control which instances are actually scraped. Also by default, all apps show up as a single job in Prometheus, which can likewise be changed using relabeling.

#### `<nerve_sd_config>`

Nerve SD configurations allow retrieving scrape targets from AirBnB's Nerve, which are stored in Zookeeper.

The following meta labels are available on targets during relabeling:

- `__meta_nerve_path`: the full path to the endpoint node in Zookeeper
- `__meta_nerve_endpoint_host`: the host of the endpoint
- `__meta_nerve_endpoint_port`: the port of the endpoint
- `__meta_nerve_endpoint_name`: the name of the endpoint

```
# The Zookeeper servers.
servers:
  - <host>
# Paths can point to a single service, or the root of a tree of services.
paths:
  - <string>
[ timeout: <duration> | default = 10s ]
```

#### `<serverset_sd_config>`

Serverset SD configurations allow retrieving scrape targets from Serversets, which are stored in Zookeeper. Serversets are commonly used by [Finagle](https://twitter.github.io/finagle/) and [Aurora](http://aurora.apache.org/).

The following meta labels are available on targets during relabeling:

- `__meta_serverset_path`: the full path to the serverset member node in Zookeeper
- `__meta_serverset_endpoint_host`: the host of the default endpoint
- `__meta_serverset_endpoint_port`: the port of the default endpoint
- `__meta_serverset_endpoint_host_<endpoint>`: the host of the given endpoint
- `__meta_serverset_endpoint_port_<endpoint>`: the port of the given endpoint
- `__meta_serverset_shard`: the shard number of the member
- `__meta_serverset_status`: the status of the member

```
# The Zookeeper servers.
servers:
  - <host>
# Paths can point to a single serverset, or the root of a tree of serversets.
paths:
  - <string>
[ timeout: <duration> | default = 10s ]
```

Serverset data must be in JSON format; the Thrift format is not currently supported.

#### `<triton_sd_config>`

**Triton SD is in beta: breaking changes to configuration are still likely in future releases.**

[Triton](https://github.com/joyent/triton) SD configurations allow retrieving scrape targets from Container Monitor discovery endpoints.

The following meta labels are available on targets during relabeling:

- `__meta_triton_machine_id`: the UUID of the target container
- `__meta_triton_machine_alias`: the alias of the target container
- `__meta_triton_machine_image`: the image type of the target container
- `__meta_triton_machine_server_id`: the server UUID of the target container

```
# The information to access the Triton discovery API.

# The account to use for discovering new target containers.
account: <string>

# The DNS suffix which should be applied to target containers.
dns_suffix: <string>

# The Triton discovery endpoint (e.g. 'cmon.us-east-3b.triton.zone'). This is
# often the same value as dns_suffix.
endpoint: <string>

# The port to use for discovery and metric scraping.
[ port: <int> | default = 9163 ]

# The interval which should be used for refreshing target containers.
[ refresh_interval: <duration> | default = 60s ]

# The Triton discovery API version.
[ version: <int> | default = 1 ]

# TLS configuration.
tls_config:
  [ <tls_config> ]
```

#### `<static_config>`

A `static_config` allows specifying a list of targets and a common label set for them. It is the canonical way to specify static targets in a scrape configuration.

```
# The targets specified by the static config.
targets:
  [ - '<host>' ]

# Labels assigned to all metrics scraped from the targets.
labels:
  [ <labelname>: <labelvalue> ... ]
```

#### `<relabel_config>`

Relabeling is a powerful tool to dynamically rewrite the label set of a target before it gets scraped. Multiple relabeling steps can be configured per scrape configuration. They are applied to the label set of each target in order of their appearance in the configuration file.

Initially, aside from the configured per-target labels, a target's `job` label is set to the `job_name` value of the respective scrape configuration. The `__address__` label is set to the `<host>:<port>` address of the target. After relabeling, the `instance` label is set to the value of `__address__` by default. The `__scheme__` and `__metrics_path__` labels are set to the scheme and metrics path of the target respectively. The `__param_<name>` label is set to the value of the first passed URL parameter called `<name>`.

Additional labels prefixed with `__meta_` may be available during the relabeling phase. They are set by the service discovery mechanism.

Labels starting with `__` are removed from the label set after relabeling is completed.

If a relabeling step needs to store a label value only temporarily (as the input to a subsequent relabeling step), use the `__tmp` label name prefix. This prefix is guaranteed to never be used by Prometheus itself.

```
# The source labels select values from existing labels. Their content is concatenated
# using the configured separator and matched against the configured regular expression
# for the replace, keep, and drop actions.
[ source_labels: '[' <labelname> [, ...] ']' ]

# Separator placed between concatenated source label values.
[ separator: <string> | default = ; ]

# Label to which the resulting value is written in a replace action.
# It is mandatory for replace actions.
[ target_label: <labelname> ]

# Regular expression against which the extracted value is matched.
[ regex: <regex> | default = (.*) ]

# Modulus to take of the hash of the source label values.
[ modulus: <uint64> ]

# Replacement value against which a regex replace is performed if the
# regular expression matches.
[ replacement: <string> | default = $1 ]

# Action to perform based on regex matching.
[ action: <relabel_action> | default = replace ]
```

`<regex>` is any valid regular expression. It is required for the `replace`, `keep`, `drop`, `labelmap`, `labeldrop` and `labelkeep` actions. The regex is anchored on both ends. To un-anchor the regex, use `.*<regex>.*`.

`<relabel_action>` determines the relabeling action to take:

- `replace`: Match `regex` against the concatenated `source_labels`. Then set `target_label` to `replacement`, with match group references (`${1}`, `${2}`, ...) in `replacement` substituted by their values. If `regex` does not match, no replacement takes place.
- `keep`: Drop targets for which `regex` does not match the concatenated `source_labels`.
- `drop`: Drop targets for which `regex` matches the concatenated `source_labels`.
- `hashmod`: Set `target_label` to the `modulus` of a hash of the concatenated `source_labels`.
- `labelmap`: Match `regex` against all label names. Then copy the values of the matching labels to label names given by `replacement`, with match group references (`${1}`, `${2}`, ...) in `replacement` substituted by their values.
- `labeldrop`: Match `regex` against all label names. Any label that matches is removed from the label set.
- `labelkeep`: Match `regex` against all label names. Any label that does not match is removed from the label set.

Care must be taken with `labeldrop` and `labelkeep` to ensure that metrics are still uniquely labeled once the labels are removed.

#### `<alert_relabel_configs>`

Alert relabeling is applied to alerts before they are sent to the Alertmanager. It has the same configuration format and actions as target relabeling. Alert relabeling is applied after external labels.

One use for this is ensuring that an HA pair of Prometheus servers with different external labels sends identical alerts.

#### `<alertmanager_config>`

**Dynamic discovery of Alertmanager instances is in alpha state. Breaking configuration changes may happen in future releases. Use static configuration via the `-alertmanager.url` flag as a stable alternative.**

An `alertmanager_config` section specifies the Alertmanager instances that the Prometheus server sends alerts to. It also provides parameters to configure how to communicate with these Alertmanagers.

Alertmanagers may be statically configured via the `static_configs` parameter or dynamically discovered using one of the supported service discovery mechanisms.

Additionally, `relabel_configs` allow selecting Alertmanagers from the discovered entities and modifying the API path that is used, which is exposed through the `__alerts_path__` label.

```
# Per-target Alertmanager timeout when pushing alerts.
[ timeout: <duration> | default = 10s ]

# Prefix for the HTTP path alerts are pushed to.
[ path_prefix: <path> | default = / ]

# Configures the protocol scheme used for requests.
[ scheme: <scheme> | default = http ]

# Sets the `Authorization` header on every request with the
# configured username and password.
basic_auth:
  [ username: <string> ]
  [ password: <string> ]

# Sets the `Authorization` header on every request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]

# Sets the `Authorization` header on every request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]

# Configures the scrape request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# List of Azure service discovery configurations.
azure_sd_configs:
  [ - <azure_sd_config> ... ]

# List of Consul service discovery configurations.
consul_sd_configs:
  [ - <consul_sd_config> ... ]

# List of DNS service discovery configurations.
dns_sd_configs:
  [ - <dns_sd_config> ... ]

# List of EC2 service discovery configurations.
ec2_sd_configs:
  [ - <ec2_sd_config> ... ]

# List of file service discovery configurations.
file_sd_configs:
  [ - <file_sd_config> ... ]

# List of GCE service discovery configurations.
gce_sd_configs:
  [ - <gce_sd_config> ... ]

# List of Kubernetes service discovery configurations.
kubernetes_sd_configs:
  [ - <kubernetes_sd_config> ... ]

# List of Marathon service discovery configurations.
marathon_sd_configs:
  [ - <marathon_sd_config> ... ]

# List of AirBnB's Nerve service discovery configurations.
nerve_sd_configs:
  [ - <nerve_sd_config> ... ]

# List of Zookeeper Serverset service discovery configurations.
serverset_sd_configs:
  [ - <serverset_sd_config> ... ]

# List of Triton service discovery configurations.
triton_sd_configs:
  [ - <triton_sd_config> ... ]

# List of labeled statically configured Alertmanagers.
static_configs:
  [ - <static_config> ... ]

# List of Alertmanager relabel configurations.
relabel_configs:
  [ - <relabel_config> ... ]
```

#### `<remote_write>`

**Remote write is experimental: breaking changes to configuration are likely in future releases.**

`url` is the URL of the endpoint to send samples to. `remote_timeout` specifies the timeout for requests to the URL. There are currently no retries.

`basic_auth`, `tls_config` and `proxy_url` have the same meanings as in a `scrape_config`.

`write_relabel_configs` is relabeling applied to samples before they are sent. Write relabeling is applied after external labels. This can be used to limit which samples are sent.

There is a [small demo](https://github.com/prometheus/prometheus/tree/master/documentation/examples/remote_storage) of how to use this functionality.
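
#### Example

To show how the sections described in this document fit together, here is a minimal sketch of a complete configuration file. It only uses fields documented above. The job name, target addresses, label names (`monitor`, `env`, `host`) and the rule file glob are hypothetical values chosen for illustration, not recommended settings.

```yaml
global:
  scrape_interval: 15s        # overrides the built-in default
  evaluation_interval: 1m
  external_labels:
    monitor: example          # hypothetical external label

rule_files:
  - "rules/*.rules"           # hypothetical glob, relative to the working directory

scrape_configs:
  - job_name: node            # must be unique across all scrape configurations
    metrics_path: /metrics
    static_configs:
      - targets: ['localhost:9100', 'localhost:9101']   # hypothetical targets
        labels:
          env: dev            # hypothetical label attached to every target in this group
    relabel_configs:
      # Copy the host part of __address__ into a hypothetical "host" label.
      # The regex is anchored on both ends, so it must match the whole value.
      - source_labels: [__address__]
        regex: '(.*):\d+'
        target_label: host
        replacement: '${1}'
        action: replace
```

Assuming this file is saved as `prometheus.yml`, it would be loaded with `prometheus -config.file=prometheus.yml`, and changes to it could be applied with `SIGHUP` or a POST request to `/-/reload` as described at the top of this page.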