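What are query suggestions?
Query suggestions improve the user's search experience. They mainly cover two features: spell checking and automatic query suggestions (autocomplete).
Spell checking, as shown in the figure:
Automatic query suggestions (autocomplete):
Search suggestions
Search suggestions are implemented through the Suggester API.
It works by breaking the input text into tokens and then looking up similar terms in the dictionary to return as suggestions.
Elasticsearch provides four kinds of suggesters for different scenarios: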
Term Suggester
Phrase Suggester
Completion Suggester
Context Suggester
Term Suggester
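The term suggester analyzes the input text into tokens and runs a fuzzy lookup for each token to propose term suggestions. By default, tokens that already exist in the index get no suggestions; for tokens that do not exist, the fuzzy matches are ranked and a limited number of them are returned.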
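To make the idea concrete, here is a minimal Python sketch of the behaviour described above. It is not how Elasticsearch implements the term suggester: the whitespace tokenizer, the Levenshtein ranking, and the names `edit_distance`, `build_term_frequencies`, and `suggest_terms` are all assumptions made for illustration.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def build_term_frequencies(docs):
    """Lowercase, split on whitespace, count how many docs contain each term."""
    freq = Counter()
    for doc in docs:
        freq.update(set(doc.lower().split()))
    return freq


def suggest_terms(text, term_freq, max_edits=2, size=3):
    """Return {token: [suggestions]}; tokens already indexed get no suggestions."""
    result = {}
    for token in text.lower().split():
        if token in term_freq:
            result[token] = []
            continue
        candidates = []
        for term, freq in term_freq.items():
            dist = edit_distance(token, term)
            if dist <= max_edits:
                # Closer terms first, then more frequent terms first.
                candidates.append((dist, -freq, term))
        candidates.sort()
        result[token] = [term for _, _, term in candidates[:size]]
    return result


# Toy corpus: the same six sentences used in the article's example index.
DOCS = [
    "lucene is very cool",
    "Elasticsearch builds on top of lucene",
    "Elasticsearch rocks",
    "elastic is the company behind ELK stack",
    "Elk stack rocks",
    "elasticsearch is rock solid",
]
TERM_FREQ = build_term_frequencies(DOCS)
```

Calling `suggest_terms("elasticserch", TERM_FREQ)` proposes `elasticsearch` for the misspelled token, while a token that is already indexed, such as `lucene`, gets an empty list.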
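This is much like the Google search engine: I typed a misspelled word, elasticserch, and the engine helpfully offered a search suggestion.
Implementing this in Elasticsearch is simple.
Create an index and write some documents into it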
POST articles/_bulk
{ "index" : { } }
{ "body": "lucene is very cool"}
{ "index" : { } }
{ "body": "Elasticsearch builds on top of lucene"}
{ "index" : { } }
{ "body": "Elasticsearch rocks"}
{ "index" : { } }
{ "body": "elastic is the company behind ELK stack"}
{ "index" : { } }
{ "body": "Elk stack rocks"}
{ "index" : { } }
{ "body": "elasticsearch is rock solid"}
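Search the documents by calling the suggest API. There are three suggestion modes:
1. missing: suggest only for terms that are not already in the index
2. popular: suggest only terms that occur more frequently than the input term
3. always: suggest regardless of whether the term already exists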
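The three modes can be pictured as a filter over the candidate terms found for one input term. The sketch below is a simplified illustration, not Elasticsearch's code; the function name `apply_suggest_mode` and the `(term, freq)` tuple shape are assumptions.

```python
def apply_suggest_mode(mode, input_freq, candidates):
    """Filter (term, freq) candidates for a single input term.

    input_freq is the number of documents containing the input term
    (0 when the term is not in the index).
    """
    if mode == "missing":
        # Suggest only when the input term is absent from the index.
        return [] if input_freq > 0 else list(candidates)
    if mode == "popular":
        # Keep only candidates that are more frequent than the input term.
        return [(term, freq) for term, freq in candidates if freq > input_freq]
    if mode == "always":
        # Keep every candidate regardless of the input term's frequency.
        return list(candidates)
    raise ValueError(f"unknown suggest mode: {mode}")
```

For example, with an input term that appears in one document and a single candidate `("rocks", 2)`, `missing` returns nothing while `popular` and `always` both keep the candidate.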
POST /articles/_search
{
  "size": 1,
  "query": {
    "match": {
      "body": "elasticserch"
    }
  },
  "suggest": {
    "term-suggestion": {
      "text": "elasticserch",
      "term": {
        "suggest_mode": "missing",
        "field": "body"
      }
    }
  }
}
Response
{
  "took" : 130,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 0, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "term-suggestion" : [
      {
        "text" : "elasticserch",
        "offset" : 0,
        "length" : 12,
        "options" : [
          { "text" : "elasticsearch", "score" : 0.9166667, "freq" : 3 }
        ]
      }
    ]
  }
}
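On the client side, the suggestions usually need to be pulled out of the nested `suggest` section. A small Python sketch of that step follows; the helper name `extract_suggestions` is hypothetical, and `SAMPLE_RESPONSE` is a trimmed copy of the response shown above rather than a live result.

```python
def extract_suggestions(response, name):
    """Map each input token to the suggested texts under suggest[name]."""
    entries = response.get("suggest", {}).get(name, [])
    return {
        entry["text"]: [option["text"] for option in entry["options"]]
        for entry in entries
    }


# Trimmed copy of the term suggester response from the article.
SAMPLE_RESPONSE = {
    "hits": {"total": {"value": 0, "relation": "eq"}, "hits": []},
    "suggest": {
        "term-suggestion": [
            {
                "text": "elasticserch",
                "offset": 0,
                "length": 12,
                "options": [
                    {"text": "elasticsearch", "score": 0.9166667, "freq": 3}
                ],
            }
        ]
    },
}
```

Because a missing key yields an empty result, the same helper can be used whether or not the response contains the named suggestion.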
Phrase Suggester
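The phrase suggester builds on the term suggester and also considers the relationships among multiple terms, such as whether they co-occur in the indexed text, how close together they appear, and their frequencies.
Some of its parameters:
max_errors: the maximum number of terms that may be misspelled (a value of 1 or more is an absolute count; a value below 1 is a fraction of the query terms)
confidence: a factor applied to the input phrase's score to form a threshold; only candidates scoring above the threshold are returned, which limits the number of results. Defaults to 1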
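The effect of `confidence` can be sketched as a simple threshold filter. This is an illustration under stated assumptions: the scores below are made up, the helper name `filter_by_confidence` is hypothetical, and the real scores come from Elasticsearch's own phrase scoring.

```python
def filter_by_confidence(input_score, candidates, confidence=1.0):
    """Keep (text, score) candidates that score above confidence * input_score."""
    threshold = confidence * input_score
    kept = [(text, score) for text, score in candidates if score > threshold]
    # Best-scoring corrections first.
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical scores, for illustration only.
INPUT_SCORE = 0.1
CANDIDATES = [("phrase a", 0.3), ("phrase b", 0.15), ("phrase c", 0.05)]
```

With these numbers, a confidence of 1.0 keeps the two candidates that beat the input phrase, 2.0 keeps only the strongest one, and 0.0 lets every candidate through.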
POST /articles/_search
{
  "suggest": {
    "my-suggestion": {
      "text": "lucne and elasticsear rock hello world ",
      "phrase": {
        "field": "body",
        "max_errors": 2,
        "confidence": 2,
        "direct_generator": [
          {
            "field": "body",
            "suggest_mode": "missing"
          }
        ],
        "highlight": {
          "pre_tag": "<em>",
          "post_tag": "</em>"
        }
      }
    }
  }
}
The result is as follows:
{
  "took" : 288,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 0, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "my-suggestion" : [
      {
        "text" : "lucne and elasticsear rock hello world ",
        "offset" : 0,
        "length" : 39,
        "options" : [
          {
            "text" : "lucene and elasticsearch rock hello world",
            "highlighted" : "<em>lucene</em> and <em>elasticsearch</em> rock hello world",
            "score" : 1.5788074E-4
          }
        ]
      }
    ]
  }
}
Completion Suggester
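The completion suggester is designed for autocomplete. In this scenario, every character the user types triggers an immediate query to the backend, so fast typists place strict demands on backend response time. For this reason it uses a different data structure from the two suggesters above: instead of relying on an inverted index, it encodes the analyzed data into an FST (finite state transducer) that is stored alongside the index. For an open index, Elasticsearch loads the entire FST into memory, so prefix lookups are extremely fast. However, an FST can only serve prefix lookups, which is the main limitation of the completion suggester.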
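The prefix-only behaviour is easy to see with a toy in-memory structure. The sketch below uses a plain character trie as a simplified stand-in for an FST (a real FST is far more compact and is not what this code builds); the class name `PrefixIndex` and its methods are assumptions for illustration.

```python
class PrefixIndex:
    """A plain character trie; a simplified stand-in for an FST."""

    def __init__(self):
        self.root = {}

    def add(self, text):
        node = self.root
        for ch in text:
            node = node.setdefault(ch, {})
        node[""] = text  # the empty key marks the end of a stored entry

    def complete(self, prefix, size=5):
        # Walk down the trie along the prefix; stop early if it is unknown.
        node = self.root
        for ch in prefix:
            node = node.get(ch)
            if node is None:
                return []
        # Collect every entry stored below the node reached by the prefix.
        matches = []
        stack = [node]
        while stack:
            current = stack.pop()
            for key, child in current.items():
                if key == "":
                    matches.append(child)
                else:
                    stack.append(child)
        return sorted(matches)[:size]


index = PrefixIndex()
for title in ["php是什么", "php是世界上最好的语言", "php货币", "php面试题2019"]:
    index.add(title)
```

`index.complete("php")` returns all four titles, whereas `index.complete("货币")` returns nothing even though one title contains that text, because only prefixes are matched.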
This is similar to the suggestion dropdown on Baidu.
Implementing this in Elasticsearch is also simple.
1. Create the index
PUT titles
{
  "mappings": {
    "properties": {
      "title_completion": {
        "type": "completion"
      }
    }
  }
}
2. Write the documents
POST titles/_bulk
{ "index" : { } }
{ "title_completion": "php是什么"}
{ "index" : { } }
{ "title_completion": "php是世界上最好的语言"}
{ "index" : { } }
{ "title_completion": "php货币"}
{ "index" : { } }
{ "title_completion": "php面试题2019"}
3. Search the data
POST titles/_search?pretty
{
  "size": 0,
  "suggest": {
    "article-suggester": {
      "prefix": "php",
      "completion": {
        "field": "title_completion"
      }
    }
  }
}
4. Response
{
  "took" : 95,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 0, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "article-suggester" : [
      {
        "text" : "php",
        "offset" : 0,
        "length" : 3,
        "options" : [
          {
            "text" : "php是世界上最好的语言",
            "_index" : "titles",
            "_type" : "_doc",
            "_id" : "BhTVQXkBv4MlM6fI82Mo",
            "_score" : 1.0,
            "_source" : { "title_completion" : "php是世界上最好的语言" }
          },
          {
            "text" : "php是什么",
            "_index" : "titles",
            "_type" : "_doc",
            "_id" : "BRTVQXkBv4MlM6fI82Mo",
            "_score" : 1.0,
            "_source" : { "title_completion" : "php是什么" }
          },
          {
            "text" : "php货币",
            "_index" : "titles",
            "_type" : "_doc",
            "_id" : "BxTVQXkBv4MlM6fI82Mo",
            "_score" : 1.0,
            "_source" : { "title_completion" : "php货币" }
          },
          {
            "text" : "php面试题2019",
            "_index" : "titles",
            "_type" : "_doc",
            "_id" : "CBTVQXkBv4MlM6fI82Mo",
            "_score" : 1.0,
            "_source" : { "title_completion" : "php面试题2019" }
          }
        ]
      }
    ]
  }
}
Context Suggester
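The context suggester is an extension of the completion suggester that adds contextual information to the lookup.
For example:
In an electronics store, you type 苹果 (apple) and expect to find Apple laptops...
In a fruit store, you type 苹果 (apple) and expect to find red apples, green apples...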
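The store example can be sketched as a prefix lookup that is restricted to one category. This is a simplified illustration that uses a linear scan for brevity, not the data structure Elasticsearch uses; the class name `ContextPrefixIndex` and its methods are assumptions.

```python
from collections import defaultdict


class ContextPrefixIndex:
    """Prefix completion restricted to a category context (toy version)."""

    def __init__(self):
        self._by_category = defaultdict(list)

    def add(self, text, category):
        self._by_category[category].append(text)

    def complete(self, prefix, category, size=5):
        # Only entries stored under the requested category are considered.
        candidates = self._by_category.get(category, [])
        return sorted(t for t in candidates if t.startswith(prefix))[:size]


store = ContextPrefixIndex()
store.add("苹果电脑", "电器商城")  # electronics store entry
store.add("苹果", "水果商城")      # fruit store entry
```

The same prefix `苹` yields `苹果电脑` in the electronics context and `苹果` in the fruit context, so the context decides which completions are visible.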
1. Create the index with a custom mapping
PUT comments
{
  "mappings": {
    "properties": {
      "comment_autocomplete": {
        "type": "completion",
        "contexts": [
          {
            "type": "category",
            "name": "comment_category"
          }
        ]
      }
    }
  }
}
2. Add context information to each document
POST comments/_doc
{
  "comment": "苹果电脑",
  "comment_autocomplete": {
    "input": ["苹果电脑"],
    "contexts": {
      "comment_category": "电器商城"
    }
  }
}

POST comments/_doc
{
  "comment": "红红的冰糖心苹果",
  "comment_autocomplete": {
    "input": ["苹果"],
    "contexts": {
      "comment_category": "水果商城"
    }
  }
}
3. Run a suggestion query that includes the context
POST comments/_search
{
  "suggest": {
    "MY_SUGGESTION": {
      "prefix": "苹",
      "completion": {
        "field": "comment_autocomplete",
        "contexts": {
          "comment_category": "电器商城"
        }
      }
    }
  }
}
4. Response
{
  "took" : 586,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 0, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "MY_SUGGESTION" : [
      {
        "text" : "苹",
        "offset" : 0,
        "length" : 1,
        "options" : [
          {
            "text" : "苹果电脑",
            "_index" : "comments",
            "_type" : "_doc",
            "_id" : "CRTcQXkBv4MlM6fIDmPo",
            "_score" : 1.0,
            "_source" : {
              "comment" : "苹果电脑",
              "comment_autocomplete" : {
                "input" : [ "苹果电脑" ],
                "contexts" : { "comment_category" : "电器商城" }
              }
            },
            "contexts" : { "comment_category" : [ "电器商城" ] }
          }
        ]
      }
    ]
  }
}