java – Lucene Highlighter
Published: 2020-05-24 18:16:49  Category: Java  Source: Internet
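How does the Lucene 4.3.1 Highlighter work? I want to print search results from a document (the searched word plus the 8 words that follow it). How can I use the Highlighter class to do this? I have added complete txt, html, and xml documents to a file and added it to my index, and I now have search code to which I would like to add highlighting:

String index = "index";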
String field = "contents";
String queries = null;
int repeat = 1;
boolean raw = true; // not sure what raw really does???
String queryString = null; // keep null, prompt user later for it
int hitsPerPage = 10; // leave it at 10, go from there later
// need to add all files to same directory
index = "C:\\Users\\plib\\Documents\\index";
repeat = 4;
IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_43);
BufferedReader in = null;
if (queries != null) {
    in = new BufferedReader(new InputStreamReader(new FileInputStream(queries), "UTF-8"));
} else {
    in = new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
}
QueryParser parser = new QueryParser(Version.LUCENE_43, field, analyzer);
while (true) {
    if (queries == null && queryString == null) { // prompt the user
        System.out.println("Enter query. 'quit' = quit: ");
    }
    String line = queryString != null ? queryString : in.readLine();
    if (line == null) { // end of input; String.length() can never be -1
        break;
    }
    line = line.trim();
    if (line.length() == 0 || line.equalsIgnoreCase("quit")) {
        break;
    }
    Query query = parser.parse(line);
    System.out.println("Searching for: " + query.toString(field));
    if (repeat > 0) { // repeat & time as benchmark
        Date start = new Date();
        for (int i = 0; i < repeat; i++) {
            searcher.search(query, null, 100);
        }
        Date end = new Date();
        System.out.println("Time: " + (end.getTime() - start.getTime()) + "ms");
    }
    doPagingSearch(in, searcher, query, hitsPerPage, raw, queries == null && queryString == null);
    if (queryString != null) {
        break;
    }
}
reader.close();
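}

Solution

I had the same problem and eventually stumbled on this post: http://vnarcher.blogspot.ca/2012/04/highlighting-text-with-lucene.html

The key part is that, as you iterate over the results, you call getHighlightedField on each result value you want highlighted.

// Highlighter, QueryScorer, SimpleHTMLFormatter, SimpleSpanFragmenter, Formatter and
// InvalidTokenOffsetsException are in org.apache.lucene.search.highlight.
private String getHighlightedField(Query query, Analyzer analyzer, String fieldName, String fieldValue) throws IOException, InvalidTokenOffsetsException {
    Formatter formatter = new SimpleHTMLFormatter("<span class=\"MatchedText\">", "</span>");
    QueryScorer queryScorer = new QueryScorer(query);
    Highlighter highlighter = new Highlighter(formatter, queryScorer);
    // One fragment spanning the whole field value, so the full text comes back with matches marked.
    highlighter.setTextFragmenter(new SimpleSpanFragmenter(queryScorer, Integer.MAX_VALUE));
    highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);
    // Use the analyzer passed in; the result is null when nothing in fieldValue matches the query.
    return highlighter.getBestFragment(analyzer, fieldName, fieldValue);
}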
In this case it assumes the output will be HTML, and it simply wraps the highlighted text in a <span> with the CSS class MatchedText. You can then define a custom CSS rule to style the highlighted text however you like.
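
To make the desired output from the question concrete (the matched word plus the 8 words after it, with the match wrapped in the same span), here is a minimal plain-Java sketch. It does not use Lucene's Highlighter at all: the class name SnippetSketch, the method snippet, and the whitespace-based word splitting are all assumptions made for illustration, and a real analyzer tokenizes text differently. It also does no HTML escaping of the source text.

```java
final class SnippetSketch {

    /**
     * Returns the first word equal to term (ignoring case and surrounding
     * punctuation) wrapped in a MatchedText span, followed by up to
     * wordsAfter of the words that come after it. Returns null if no word matches.
     */
    static String snippet(String text, String term, int wordsAfter) {
        // Hypothetical tokenization: split on whitespace only.
        String[] words = text.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            // Strip leading/trailing non-word characters before comparing.
            String bare = words[i].replaceAll("^\\W+|\\W+$", "");
            if (bare.equalsIgnoreCase(term)) {
                StringBuilder sb = new StringBuilder();
                sb.append("<span class=\"MatchedText\">").append(words[i]).append("</span>");
                int end = Math.min(words.length, i + 1 + wordsAfter);
                for (int j = i + 1; j < end; j++) {
                    sb.append(' ').append(words[j]);
                }
                return sb.toString();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library that lets you index and query plain text quickly";
        // prints: <span class="MatchedText">search</span> library that lets you index and query plain
        System.out.println(snippet(text, "search", 8));
    }
}
```

The answer's method passes Integer.MAX_VALUE as the fragment size, so it returns the whole field value rather than a short window. A smaller fragment size given to SimpleSpanFragmenter would shorten the result, but that size is not a word count, so an exact "8 words after the match" window would likely need post-processing along the lines of the sketch above.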
