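## **Mapping Parameters Explained**
Official documentation: [https://www.elastic.co/guide/en/elasticsearch/reference/6.x/mapping-params.html](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/mapping-params.html)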
**1. analyzer**
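Specifies the analyzer for a field (it is sometimes loosely called a tokenizer, but "analyzer" is the more accurate term). The `analyzer` setting applies at both index time and search time unless `search_analyzer` overrides it for searches. The example below configures the IK analyzers, which requires the IK analysis plugin to be installed.
(1) Define the index and its mapping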
```
PUT test
{
"mappings": {
"it":{
"properties":{
"name" : {
"type" : "text",
"analyzer" : "ik_smart",
"search_analyzer":"ik_max_word"
}
}
}
}
}
```
(2) Insert data
```
PUT test/it/1
{
"name" : "美國留給伊拉克的是個爛攤子"
}
PUT test/it/2
{
"name" : "中國駐洛杉磯領事館遭亞裔男子槍擊,嫌犯已自首"
}
PUT test/it/3
{
"name" : "中韓漁船沖突調查:韓警平均扣留一艘國漁船"
}
PUT test/it/4
{
"name" : "公安部:各地校車將享受最高路權"
}
```
(3) Query
```
POST test/it/_search
{
"query": {
"match": {
"name": "中國"
}
}
}
```
Query result:
```
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.65109104,
"hits": [
{
"_index": "test",
"_type": "it",
"_id": "2",
"_score": 0.65109104,
"_source": {
"name": "中國駐洛杉磯領事館遭亞裔男子槍擊,嫌犯已自首"
}
}
]
}
}
```
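To see how the two analyzers differ, it can help to inspect the tokens each one produces with the `_analyze` API. The requests below are a minimal sketch: they assume the IK analysis plugin is installed (as the mapping above already requires), and the sample text is a fragment of document 2.

```
# Tokens produced by the index-time analyzer
POST _analyze
{
  "analyzer": "ik_smart",
  "text": "中國駐洛杉磯領事館"
}
# Tokens produced by the search-time analyzer
POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "中國駐洛杉磯領事館"
}
```

`ik_smart` is the coarser-grained of the two and `ik_max_word` the more exhaustive, so the exact token lists depend on the plugin version and its dictionary.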
**2. normalizer**
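A `normalizer` standardizes the value of a `keyword` field before it is indexed, for example by converting all characters to lowercase.
(1) Create the index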
```
PUT my_index/
{
"settings": {
"analysis": {
"normalizer":{
"my_normalizer":{
"type":"custom",
"char_filter" : [],
"filter" : ["lowercase", "asciifolding"]
}
}
}
},
"mappings": {
"_doc" : {
"properties" : {
"foo" : {
"type": "keyword",
"normalizer": "my_normalizer"
}
}
}
}
}
```
(2) Insert data
```
PUT my_index/_doc/1
{
"foo": "BàR"
}
PUT my_index/_doc/2
{
"foo": "bar"
}
PUT my_index/_doc/3
{
"foo": "baz"
}
```
(3) Query data
```
GET my_index/_search
{
"query": {
"term": {
"foo": "BAR"
}
}
}
GET my_index/_search
{
"query": {
"match": {
"foo": "BAR"
}
}
}
```
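Result (both queries return the same hits, because the normalizer is also applied to the query term at search time):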
```
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.2876821,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "2",
"_score": 0.2876821,
"_source": {
"foo": "bar"
}
},
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"foo": "BàR"
}
}
]
}
}
```
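Because the normalizer runs before the value is indexed, aggregations on the field also see the normalized form. The request below is a sketch along the lines of the official normalizer example; the aggregation name `foo_terms` is arbitrary.

```
GET my_index/_search
{
  "size": 0,
  "aggs": {
    "foo_terms": {
      "terms": {
        "field": "foo"
      }
    }
  }
}
```

With the three documents above, the expected buckets are `bar` (2 documents) and `baz` (1 document), while `_source` still holds the original `BàR`.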
**3.boost**
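The `boost` value controls the relative weight of a field or query clause when relevance is scored; it defaults to 1. A boost greater than 1 increases that relative weight.
(1) Create the index and insert data: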
```
# Create the index
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"title": {
"type": "text",
"boost": 2
},
"content": {
"type": "text"
}
}
}
}
}
# Insert data
PUT my_index/_doc/1
{
"title" : "hello world",
"content" : "你好世界"
}
```
(2) Query:
```
# Query
POST my_index/_search
{
"query": {
"match" : {
"title": {
"query": "hello world"
}
}
}
}
# Result:
{
"took": 13,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.1507283,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 1.1507283,
"_source": {
"title": "hello world",
"content": "你好世界"
}
}
]
}
}
```
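The `boost` parameter increases the relative weight of a clause (when `boost` is greater than 1) or decreases it (when `boost` is between 0 and 1), but the change is not linear. In other words, setting `boost` to 2 is not guaranteed to double the final `_score`. Instead, the new `_score` is normalized after the boost is applied, and each query type has its own normalization algorithm. What can be said is that a higher `boost` value produces a higher `_score`.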
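Index-time boost in the mapping has been deprecated since 5.0, and the official documentation advises specifying the boost at query time instead, where it can be changed without reindexing. A minimal sketch, assuming the `title` field is mapped without `boost`:

```
POST my_index/_search
{
  "query": {
    "match": {
      "title": {
        "query": "hello world",
        "boost": 2
      }
    }
  }
}
```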
**4.coerce**
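The `coerce` parameter cleans up dirty data; it defaults to `true`. For example, the integer 5 might arrive as the string "5" or as the floating-point number 5.0. With coercion enabled:
* Strings are converted to numbers
* Floating-point values are truncated to integers for integer fields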
```
# Create the index (coercion disabled on age)
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"title": {
"type": "text"
},
"content": {
"type": "text"
},
"age" : {
"type" : "integer",
"coerce" : false
}
}
}
}
}
# First insert: age is the number 5
PUT my_index/_doc/1
{
"title" : "hello world",
"content" : "你好世界",
"age" : 5
}
# Response to the first insert
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
# Second insert: age is the string "5"
PUT my_index/_doc/1
{
"title" : "hello world",
"content" : "你好世界",
"age" : "5"
}
# Response to the second insert: rejected because coerce is false
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "failed to parse [age]"
}
],
"type": "mapper_parsing_exception",
"reason": "failed to parse [age]",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Integer value passed as String"
}
},
"status": 400
}
```
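Coercion can also be turned off for a whole index with the `index.mapping.coerce` setting and re-enabled per field. The sketch below follows the pattern in the official documentation; the field names `number_one` and `number_two` are placeholders.

```
PUT my_index
{
  "settings": {
    "index.mapping.coerce": false
  },
  "mappings": {
    "_doc": {
      "properties": {
        "number_one": {
          "type": "integer",
          "coerce": true
        },
        "number_two": {
          "type": "integer"
        }
      }
    }
  }
}
```

Here a string such as `"10"` should be accepted for `number_one`, because the field-level setting overrides the index-level one, and rejected for `number_two`.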
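**5.copy_to**
The `copy_to` parameter builds a custom `_all`-style field: the values of several fields are copied into one combined field. For example, `first_name` and `last_name` can be copied into a `full_name` field (the example below uses `first_name` and `second_name`). The combined field can be queried, but it does not appear in `_source`, as the result below shows.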
```
# Create the index
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"first_name":{
"type" : "text",
"copy_to" : "full_name"
},
"second_name" : {
"type" : "text" ,
"copy_to" : "full_name"
},
"full_name" : {
"type" : "text"
}
}
}
}
}
# Insert data
PUT my_index/_doc/1
{
"first_name" : "hello",
"second_name" : "world"
}
# Query
POST my_index/_search
{
"query": {
"match": {
"full_name": {
"query": "hello world",
"operator": "and"
}
}
}
}
# Result
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.5753642,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 0.5753642,
"_source": {
"first_name": "hello",
"second_name": "world"
}
}
]
}
}
```
**6.doc_values**
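`doc_values` speed up sorting and aggregations. When the inverted index is built, an additional column-oriented structure is stored alongside it, trading disk space for query time. Doc values are enabled by default on the field types that support them (analyzed `text` fields do not) and can be disabled for fields that will never be sorted or aggregated on.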
```
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"first_name":{
"type" : "text",
"copy_to" : "full_name"
},
"second_name" : {
"type" : "text" ,
"copy_to" : "full_name",
"doc_values" : false
},
"full_name" : {
"type" : "text"
}
}
}
}
}
```
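Since `text` fields never have doc values, the setting is easier to observe on a `keyword` field. The mapping below is a sketch modeled on the official example; `status_code` and `session_id` are illustrative field names.

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "status_code": {
          "type": "keyword"
        },
        "session_id": {
          "type": "keyword",
          "doc_values": false
        }
      }
    }
  }
}
```

`status_code` keeps the default and can be sorted or aggregated on; `session_id` can still be searched, but not sorted or aggregated on.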
**7.dynamic**
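The `dynamic` parameter controls how newly detected fields are handled (that is, fields in an inserted document that are not defined in the mapping). It accepts three values:
* true: newly detected fields are added to the mapping (default).
* false: newly detected fields are ignored. They are not indexed or searchable but still appear in `_source`; new fields must be added to the mapping explicitly.
* strict: if a new field is detected, an exception is thrown and the document is rejected.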
```
# Create the index
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"first_name":{
"type" : "text",
"copy_to" : "full_name"
},
"second_name" : {
"type" : "text" ,
"copy_to" : "full_name",
"doc_values" : false
},
"full_name" : {
"type" : "text"
}
}
}
}
}
# Add a document containing a field that is not in the mapping
PUT my_index/_doc/1
{
"first_name" : "hello",
"second_name" : "world",
"age" : 10
}
# Result
{
"error": {
"root_cause": [
{
"type": "strict_dynamic_mapping_exception",
"reason": "mapping set to strict, dynamic introduction of [age] within [_doc] is not allowed"
}
],
"type": "strict_dynamic_mapping_exception",
"reason": "mapping set to strict, dynamic introduction of [age] within [_doc] is not allowed"
},
"status": 400
}
```
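The `dynamic` setting can also be placed on inner objects, which inherit it from their parent unless they override it. A sketch modeled on the official example; `user` and `social_networks` are illustrative names:

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic": false,
      "properties": {
        "user": {
          "properties": {
            "name": {
              "type": "text"
            },
            "social_networks": {
              "dynamic": true,
              "properties": {}
            }
          }
        }
      }
    }
  }
}
```

New fields are ignored at the top level and under `user`, but are added to the mapping when they appear under `user.social_networks`.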
**8.enabled**
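Elasticsearch indexes all fields by default. For a field with `enabled` set to `false`, Elasticsearch skips parsing the field's contents entirely: the value can still be retrieved from `_source`, but it is not searchable. Because the contents are never parsed, the field can hold data of any type. The setting applies only to the mapping type and to `object` fields; a field declared with nothing but `enabled`, like `age` below, is treated as an object.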
```
# Create the index
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"first_name":{
"type" : "text",
"copy_to" : "full_name"
},
"second_name" : {
"type" : "text" ,
"copy_to" : "full_name",
"doc_values" : false
},
"full_name" : {
"type" : "text"
},
"age":{
"enabled": false
}
}
}
}
}
# Insert data
PUT my_index/_doc/1
{
"first_name" : "hello",
"second_name" : "world",
"age" : 10
}
# Query
POST my_index/_search
{
"query": {
"match": {
"age": {
"query": 10
}
}
}
}
# Result
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
```
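A typical use of `enabled: false` is an object whose contents should be stored but never searched. The sketch below is modeled on the official example; `user_id`, `session_data`, and the document contents are illustrative.

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "user_id": { "type": "keyword" },
        "session_data": { "enabled": false }
      }
    }
  }
}
# Any JSON can be placed under session_data; it is not parsed
PUT my_index/_doc/session_1
{
  "user_id": "kimchy",
  "session_data": {
    "arbitrary_object": {
      "some_array": ["foo", "bar", { "baz": 2 }]
    }
  }
}
# The stored contents are still returned in _source
GET my_index/_doc/session_1
```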
**9.format**
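When the field `type` is `date`, `format` specifies the date format(s) the field accepts. Besides the built-in formats, you can define a custom pattern such as `yyyy/MM/dd`. (Formats are covered in detail in a later section.)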
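As a minimal sketch, the mapping below declares a hypothetical `created` date field with a custom pattern. Note that the month is written `MM`; lowercase `mm` means minutes.

```
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "created": {
          "type": "date",
          "format": "yyyy/MM/dd"
        }
      }
    }
  }
}
# Matches the declared pattern
PUT my_index/_doc/1
{
  "created": "2018/07/15"
}
```

A value in a different layout, such as `2018-07-15`, should be rejected with a parsing error unless that format is also listed (multiple formats can be separated with `||`).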
**10.ignore_above**
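`ignore_above` sets the maximum string length that is indexed or stored for a field; strings longer than this limit are ignored. (It cannot be used on fields of type `text`.)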
```
# Create the index
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"keyword" : {
"type":"keyword",
"ignore_above" : 5
}
}
}
}
}
# Add the first document (no more than 5 characters)
PUT my_index/_doc/1
{
"keyword" : "hello"
}
# Add the second document (more than 5 characters)
PUT my_index/_doc/2
{
"keyword" : "hello world"
}
# Query the field
POST my_index/_search
{
"query": {
"match": {
"keyword": {
"query": "hello"
}
}
}
}
# Result: the value longer than 5 characters was ignored
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"keyword": "hello"
}
}
]
}
}
```
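The mapping sets `ignore_above` to 5. The value in the first document is no longer than 5 characters, so it is indexed; the value in the second document is longer than 5 characters, so it is not indexed.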
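Ignored values are skipped only in the index; the document itself is still stored. One way to confirm this, sketched below, is to fetch the second document directly and then search for its exact value.

```
# The document exists and _source still holds the full value
GET my_index/_doc/2
# But the value was never indexed, so an exact-term search finds nothing
POST my_index/_search
{
  "query": {
    "term": {
      "keyword": "hello world"
    }
  }
}
```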
**11.ignore_malformed**
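`ignore_malformed` tolerates malformed data. Take a `userid` field: some people may enter an integer, while others enter an email address. By default, indexing a value of the wrong data type into a field throws an exception and the whole document is rejected. When `ignore_malformed` is set to `true`, the exception is ignored: the malformed field is not indexed, and the other fields of the document are indexed normally.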
```
# Case 1: ignore_malformed is false
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"age" : {
"type":"integer",
"ignore_malformed" : false
}
}
}
}
}
# Insert data (an integer, sent as the string "10" and coerced)
PUT my_index/_doc/2
{
"age" : "10"
}
# Response: the insert succeeds
{
"_index": "my_index",
"_type": "_doc",
"_id": "2",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
# Insert data (not an integer)
PUT my_index/_doc/1
{
"age" : "hello"
}
# Response
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "failed to parse [age]"
}
],
"type": "mapper_parsing_exception",
"reason": "failed to parse [age]",
"caused_by": {
"type": "number_format_exception",
"reason": "For input string: \"hello\""
}
},
"status": 400
}
# Case 2: ignore_malformed is true (delete and recreate the index first)
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"age" : {
"type":"integer",
"ignore_malformed" : true
}
}
}
}
}
# Insert integer and non-integer data
PUT my_index/_doc/1
{
"age" : "hello"
}
PUT my_index/_doc/2
{
"age" : "10"
}
# Both inserts succeed
{
"_index": "my_index",
"_type": "_doc",
"_id": "2",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
```
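The behavior can also be enabled for a whole index with the `index.mapping.ignore_malformed` setting and overridden per field. The sketch below follows the pattern in the official documentation; `number_one` and `number_two` are placeholder field names.

```
PUT my_index
{
  "settings": {
    "index.mapping.ignore_malformed": true
  },
  "mappings": {
    "_doc": {
      "properties": {
        "number_one": {
          "type": "byte"
        },
        "number_two": {
          "type": "integer",
          "ignore_malformed": false
        }
      }
    }
  }
}
```

`number_one` inherits the index-level setting, while `number_two` still rejects malformed values.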
**12.index_options**
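Controls what information is recorded in the inverted index. There are four options:
* docs: only the document number is indexed.
* freqs: document numbers and term frequencies are indexed.
* positions: document numbers, term frequencies, and term positions are indexed (the default for analyzed `text` fields).
* offsets: document numbers, term frequencies, positions, and start and end character offsets are indexed.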
```
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"text": {
"type": "text",
"index_options": "offsets"
}
}
}
}
}
```
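Stored offsets are mainly useful for highlighting; the official documentation notes that the unified highlighter uses them to speed up highlighting. A minimal sketch against the mapping above (the sample sentence is illustrative):

```
PUT my_index/my_type/1
{
  "text": "Quick brown fox"
}
GET my_index/_search
{
  "query": {
    "match": {
      "text": "brown fox"
    }
  },
  "highlight": {
    "fields": {
      "text": {}
    }
  }
}
```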
**13.index**
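The `index` parameter specifies whether a field is indexed. A field that is not indexed is not searchable. It accepts `true` (the default) or `false`.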
```
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"name" : {
"type":"text",
"index" : false
},
"title" : {
"type" : "text"
}
}
}
}
}
```
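The effect can be checked by searching both fields of the mapping above. This is a sketch; the document values are illustrative, and the exact error returned for the second search may vary by version.

```
PUT my_index/_doc/1
{
  "name": "hello",
  "title": "world"
}
# title is indexed, so this search can match the document
POST my_index/_search
{
  "query": {
    "match": {
      "title": "world"
    }
  }
}
# name is not indexed, so this search is expected to be rejected with an error
POST my_index/_search
{
  "query": {
    "match": {
      "name": "hello"
    }
  }
}
```

The `name` value is still returned in `_source` when the document is retrieved.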
**14.null_value**
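Defines how a `null` value is handled. The default is `null`, meaning the field is treated as having no value and Elasticsearch ignores it, so it cannot be indexed or searched. Setting `null_value` substitutes the given value for explicit `null`s so that they can be indexed and searched. (It cannot be used on fields of type `text`.)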
```
PUT my_index
{
"mappings": {
"_doc": {
"dynamic":"strict",
"properties": {
"name" : {
"type":"text",
"index" : false
},
"title" : {
"type" : "keyword",
"null_value" : "null"
}
}
}
}
}
```
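With the mapping above, an explicit `null` in `title` is indexed as the string `null` and can be found with a `term` query. The sketch below follows the pattern of the official example.

```
# Explicit null: replaced by the null_value at index time
PUT my_index/_doc/1
{
  "title": null
}
# Empty array: contains no explicit null, so nothing is substituted
PUT my_index/_doc/2
{
  "title": []
}
GET my_index/_search
{
  "query": {
    "term": {
      "title": "null"
    }
  }
}
```

The search should match document 1 but not document 2. The substitution affects only what is indexed; `_source` is left unchanged.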
**15.fields**
`fields` (multi-fields) lets the same value be indexed in more than one way. For example, a string field can be mapped as `text` for full-text search and as `keyword` for sorting and aggregations.
```
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"city": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
}
}
PUT my_index/my_type/1
{
"city": "New York"
}
PUT my_index/my_type/2
{
"city": "York"
}
GET my_index/_search
{
"query": {
"match": {
"city": "york"
}
},
"sort": {
"city.raw": "asc"
},
"aggs": {
"Cities": {
"terms": {
"field": "city.raw"
}
}
}
}
# Result
{
"took": 31,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": null,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "1",
"_score": null,
"_source": {
"city": "New York"
},
"sort": [
"New York"
]
},
{
"_index": "my_index",
"_type": "my_type",
"_id": "2",
"_score": null,
"_source": {
"city": "York"
},
"sort": [
"York"
]
}
]
},
"aggregations": {
"Cities": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "New York",
"doc_count": 1
},
{
"key": "York",
"doc_count": 1
}
]
}
}
}
```