{"id":433,"date":"2021-01-05T23:30:46","date_gmt":"2021-01-05T15:30:46","guid":{"rendered":"https:\/\/blog.frost-s.tk\/?p=433"},"modified":"2021-12-17T11:41:23","modified_gmt":"2021-12-17T03:41:23","slug":"es%e8%b7%af%e5%be%84%e5%88%86%e8%af%8d%e5%99%a8","status":"publish","type":"post","link":"https:\/\/blog.frost-s.com\/index.php\/2021\/01\/05\/es%e8%b7%af%e5%be%84%e5%88%86%e8%af%8d%e5%99%a8\/","title":{"rendered":"ES Path Tokenizer"},"content":{"rendered":"\n<p>Note: Elasticsearch version 7.10<\/p>\n\n\n\n<p>The <strong>path_hierarchy tokenizer<\/strong> treats a hierarchical value as a file path, splits the text on the path separator, and emits a term for each node in the tree.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>POST _analyze\n{\n  \"tokenizer\": \"path_hierarchy\",\n  \"text\": \"\/one\/two\/three\"\n}<\/code><\/pre>\n\n\n\n<p>The output is:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91; \/one, \/one\/two, \/one\/two\/three ]<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Configuration<\/h3>\n\n\n\n<p>The&nbsp;<code>path_hierarchy<\/code>&nbsp;tokenizer accepts the following parameters:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><code>delimiter<\/code><\/td><td>The character to use as the path separator. Defaults to&nbsp;<code>\/<\/code>.<\/td><\/tr><tr><td><code>replacement<\/code><\/td><td>An optional replacement character to use for the delimiter. Defaults to the&nbsp;<code>delimiter<\/code>.<\/td><\/tr><tr><td><code>buffer_size<\/code><\/td><td>The number of characters read into the term buffer in a single pass. Defaults to&nbsp;<code>1024<\/code>. The term buffer will grow by this size until all the text has been consumed. 
It is advisable not to change this setting.<\/td><\/tr><tr><td><code>reverse<\/code><\/td><td>If set to&nbsp;<code>true<\/code>, emits the tokens in reverse order. Defaults to&nbsp;<code>false<\/code>.<\/td><\/tr><tr><td><code>skip<\/code><\/td><td>The number of initial tokens to skip. Defaults to&nbsp;<code>0<\/code>.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Example:<\/p>\n\n\n\n<p>In this example, we configure the&nbsp;<code>path_hierarchy<\/code>&nbsp;tokenizer to split on&nbsp;<code>-<\/code>&nbsp;characters, and to replace them with&nbsp;<code>\/<\/code>. The first two tokens are skipped:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>PUT my-index-000001\n{\n  \"settings\": {\n    \"analysis\": {\n      \"analyzer\": {\n        \"my_analyzer\": {\n          \"tokenizer\": \"my_tokenizer\"\n        }\n      },\n      \"tokenizer\": {\n        \"my_tokenizer\": {\n          \"type\": \"path_hierarchy\",\n          \"delimiter\": \"-\",\n          \"replacement\": \"\/\",\n          \"skip\": 2\n        }\n      }\n    }\n  }\n}\n\nPOST my-index-000001\/_analyze\n{\n  \"analyzer\": \"my_analyzer\",\n  \"text\": \"one-two-three-four-five\"\n}<\/code><\/pre>\n\n\n\n<p>The above example produces the following terms:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91; \/three, \/three\/four, \/three\/four\/five ]<\/code><\/pre>\n\n\n\n<p>If we were to set&nbsp;<code>reverse<\/code>&nbsp;to&nbsp;<code>true<\/code>, it would produce the following (reverse-order output):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91; one\/two\/three\/, two\/three\/, three\/ ]<\/code><\/pre>\n\n\n\n<p>Detailed example (path queries):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>PUT file-path-test\n{\n  \"settings\": {\n    \"analysis\": {\n      \"analyzer\": {\n        \"custom_path_tree\": {\n          \"tokenizer\": \"custom_hierarchy\"\n        },\n        \"custom_path_tree_reversed\": {\n         
 \"tokenizer\": \"custom_hierarchy_reversed\"\n        }\n      },\n      \"tokenizer\": {\n        \"custom_hierarchy\": {\n          \"type\": \"path_hierarchy\",\n          \"delimiter\": \"\/\"\n        },\n        \"custom_hierarchy_reversed\": {\n          \"type\": \"path_hierarchy\",\n          \"delimiter\": \"\/\",\n          \"reverse\": \"true\"\n        }\n      }\n    }\n  },\n  \"mappings\": {\n    \"properties\": {\n      \"file_path\": {\n        \"type\": \"text\",\n        \"fields\": {\n          \"tree\": {\n            \"type\": \"text\",\n            \"analyzer\": \"custom_path_tree\"\n          },\n          \"tree_reversed\": {\n            \"type\": \"text\",\n            \"analyzer\": \"custom_path_tree_reversed\"\n          }\n        }\n      }\n    }\n  }\n}\n\nPOST file-path-test\/_doc\/1\n{\n  \"file_path\": \"\/User\/alice\/photos\/2017\/05\/16\/my_photo1.jpg\"\n}\n\nPOST file-path-test\/_doc\/2\n{\n  \"file_path\": \"\/User\/alice\/photos\/2017\/05\/16\/my_photo2.jpg\"\n}\n\nPOST file-path-test\/_doc\/3\n{\n  \"file_path\": \"\/User\/alice\/photos\/2017\/05\/16\/my_photo3.jpg\"\n}\n\nPOST file-path-test\/_doc\/4\n{\n  \"file_path\": \"\/User\/alice\/photos\/2017\/05\/15\/my_photo1.jpg\"\n}\n\nPOST file-path-test\/_doc\/5\n{\n  \"file_path\": \"\/User\/bob\/photos\/2017\/05\/16\/my_photo1.jpg\"\n}<\/code><\/pre>\n\n\n\n<p>A match query ranks documents by relevance: even when no document contains the exact path, any document that shares a term with the query is still returned, ordered by score.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>GET file-path-test\/_search\n{\n  \"query\": {\n    \"match\": {\n      \"file_path\": \"\/User\/bob\/photos\/2017\/05\"\n    }\n  }\n}<\/code><\/pre>\n\n\n\n<p>For exact matching, use a term query on the path_hierarchy-analyzed field (here file_path.tree). This is the usual way to query by path.<\/p>\n\n\n\n<pre 
class=\"wp-block-code\"><code>GET file-path-test\/_search\n{\n  \"query\": {\n    \"term\": {\n      \"file_path.tree\": \"\/User\/alice\/photos\/2017\/05\/16\"\n    }\n  }\n}<\/code><\/pre>\n\n\n\n<p>Likewise, this query can be combined with other conditions in a compound query.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Note: Elasticsearch version 7.10 path_hierarchy tokenizer  [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":452,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[14,15],"_links":{"self":[{"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/posts\/433"}],"collection":[{"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/comments?post=433"}],"version-history":[{"count":9,"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/posts\/433\/revisions"}],"predecessor-version":[{"id":841,"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/posts\/433\/revisions\/841"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/media\/452"}],"wp:attachment":[{"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/media?parent=433"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/categories?post=433"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.frost-s.com\/index.php\/wp-json\/wp\/v2\/tags?post=433"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}