java.lang.Object

com.arangodb.entity.arangosearch.analyzer.TextAnalyzerProperties

public final class TextAnalyzerProperties extends Object

Author: Michele Rastelli

Constructor Summary

Constructors

TextAnalyzerProperties()

Method Summary

| Modifier and Type | Method |
| --- | --- |
| `boolean` | `equals(Object o)` |
| `SearchAnalyzerCase` | `getAnalyzerCase()` |
| `EdgeNgram` | `getEdgeNgram()` |
| `String` | `getLocale()` |
| `List<String>` | `getStopwords()` |
| `String` | `getStopwordsPath()` |
| `int` | `hashCode()` |
| `boolean` | `isAccent()` |
| `boolean` | `isStemming()` |
| `void` | `setAccent(boolean accent)` |
| `void` | `setAnalyzerCase(SearchAnalyzerCase analyzerCase)` |
| `void` | `setEdgeNgram(EdgeNgram edgeNgram)` |
| `void` | `setLocale(String locale)` |
| `void` | `setStemming(boolean stemming)` |
| `void` | `setStopwords(List<String> stopwords)` |
| `void` | `setStopwordsPath(String stopwordsPath)` |

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- TextAnalyzerProperties
  
  public TextAnalyzerProperties()

Method Details
- getLocale
  
  public String getLocale()
  Returns:
  
  a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `de.utf-8` or `en_US.utf-8`. Only UTF-8 encoding is meaningful in ArangoDB.
  
  See Also:
  
  Supported Languages
- setLocale
  
  public void setLocale(String locale)
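
The locale format described for `getLocale()` can be illustrated with a small parser. This is only a sketch: the class `LocaleParts` and its regular expression are hypothetical and not part of the driver, and the grammar assumes that each part simply runs until the next separator character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that splits language[_COUNTRY][.encoding][@variant]. */
final class LocaleParts {
    // Assumed grammar: every part extends up to the next '_', '.' or '@'.
    private static final Pattern FORMAT =
            Pattern.compile("([^_.@]+)(?:_([^.@]+))?(?:\\.([^@]+))?(?:@(.+))?");

    final String language;
    final String country;  // null when the optional part is absent
    final String encoding; // null when the optional part is absent
    final String variant;  // null when the optional part is absent

    private LocaleParts(String language, String country, String encoding, String variant) {
        this.language = language;
        this.country = country;
        this.encoding = encoding;
        this.variant = variant;
    }

    /** Parses a locale string or throws if it does not match the assumed grammar. */
    static LocaleParts parse(String locale) {
        Matcher m = FORMAT.matcher(locale);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a valid locale: " + locale);
        }
        return new LocaleParts(m.group(1), m.group(2), m.group(3), m.group(4));
    }
}
```

For example, `LocaleParts.parse("en_US.utf-8")` yields the language `en`, the country `US` and the encoding `utf-8`, while `LocaleParts.parse("de.utf-8")` has no country part.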
- isAccent
  
  public boolean isAccent()
  
  Returns:
  
  `true` to preserve accented characters (default); `false` to convert accented characters to their base characters
- setAccent
  
  public void setAccent(boolean accent)
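
To make "convert accented characters to their base characters" concrete, the sketch below folds accents with the JDK's `java.text.Normalizer`. It only illustrates the idea: the class `AccentFolding` is hypothetical, and the server's own conversion may treat some characters differently (for instance, letters that have no canonical decomposition are left unchanged here).

```java
import java.text.Normalizer;

/** Hypothetical illustration of accent folding; not the server's implementation. */
final class AccentFolding {
    /** Decomposes characters (NFD) and drops the combining marks that result. */
    static String stripAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

With `accent` set to `false`, a token such as `café` would be treated like `cafe`, which is what `AccentFolding.stripAccents("caf\u00e9")` returns.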
- getAnalyzerCase
  
  public SearchAnalyzerCase getAnalyzerCase()
- setAnalyzerCase
  
  public void setAnalyzerCase(SearchAnalyzerCase analyzerCase)
  
  Parameters:
  
  analyzerCase - defaults to SearchAnalyzerCase.lower
- isStemming
  
  public boolean isStemming()
  
  Returns:
  
  `true` to apply stemming on returned words (default); `false` to leave the tokenized words as-is
- setStemming
  
  public void setStemming(boolean stemming)
- getEdgeNgram
  
  public EdgeNgram getEdgeNgram()
  
  Returns:
  
  if present, then edge n-grams are generated for each token (word). That is, the start of the n-gram is anchored to the beginning of the token, whereas the ngram Analyzer would produce all possible substrings from a single input token (within the defined length restrictions). Edge n-grams can be used to cover word-based auto-completion queries with an index, for which you should set the following other options:
  - accent: false
  - case: SearchAnalyzerCase.lower
  - stemming: false
- setEdgeNgram
  
  public void setEdgeNgram(EdgeNgram edgeNgram)
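
The difference between edge n-grams and plain n-grams can be sketched in a few lines. The class `NgramSketch` and its `min`/`max` parameters are hypothetical names chosen for this illustration; the sketch assumes `min >= 1` and works on UTF-16 code units, so it is not a substitute for the server's tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of edge n-grams versus all n-grams of one token. */
final class NgramSketch {
    /** Prefixes of the token with lengths min..max: every n-gram starts at index 0. */
    static List<String> edgeNgrams(String token, int min, int max) {
        List<String> out = new ArrayList<>();
        for (int len = min; len <= Math.min(max, token.length()); len++) {
            out.add(token.substring(0, len));
        }
        return out;
    }

    /** All substrings with lengths min..max, starting at every position. */
    static List<String> allNgrams(String token, int min, int max) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < token.length(); start++) {
            for (int len = min; len <= max && start + len <= token.length(); len++) {
                out.add(token.substring(start, start + len));
            }
        }
        return out;
    }
}
```

`NgramSketch.edgeNgrams("quick", 1, 3)` produces `q`, `qu`, `qui`, which are exactly the prefixes a user types during auto-completion, whereas `NgramSketch.allNgrams("abc", 2, 2)` produces `ab` and `bc`.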
- getStopwords
  
  public List<String> getStopwords()
  
  Returns:
  
  a list of words to omit from the result. Default: load words from stopwordsPath. To disable stop-word filtering, provide an empty list. If both stopwords and stopwordsPath are provided, both word sources are combined.
- setStopwords
  
  public void setStopwords(List<String> stopwords)
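
The interplay between `stopwords` and `stopwordsPath` can be summarized as a small decision function. This is a sketch of the rules as documented here, not server code: the class `StopwordSources`, the `defaultPath` parameter and the `loadFromPath` callback are hypothetical stand-ins for how words would be read from files.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical illustration of how the two stop-word sources combine. */
final class StopwordSources {
    static Set<String> effectiveStopwords(List<String> stopwords,
                                          String stopwordsPath,
                                          String defaultPath,
                                          Function<String, Set<String>> loadFromPath) {
        Set<String> result = new LinkedHashSet<>();
        if (stopwords == null) {
            // No inline list: load from files, using the default location if no path is set.
            result.addAll(loadFromPath.apply(stopwordsPath != null ? stopwordsPath : defaultPath));
            return result;
        }
        // An inline list was given; an empty list therefore disables filtering...
        result.addAll(stopwords);
        if (stopwordsPath != null) {
            // ...unless an explicit path is also provided, in which case both are combined.
            result.addAll(loadFromPath.apply(stopwordsPath));
        }
        return result;
    }
}
```

Under these rules, an empty list without a `stopwordsPath` yields no stop words at all, while a `null` list falls back to whatever the files at the configured or default path contain.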
- getStopwordsPath
  
  public String getStopwordsPath()
  
  Returns:
  
  path with a language sub-directory (e.g. en for a locale en_US.utf-8) containing files with words to omit. Each word has to be on a separate line. Everything after the first whitespace character on a line will be ignored and can be used for comments. The files can be named arbitrarily and have any file extension (or none).
  Default: if no path is provided then the value of the environment variable IRESEARCH_TEXT_STOPWORD_PATH is used to determine the path, or if it is undefined then the current working directory is assumed. If the stopwords attribute is provided then no stop-words are loaded from files, unless an explicit stopwordsPath is also provided.
  Note that if the stopwordsPath cannot be accessed, is missing language sub-directories, or has no files for a language required by an Analyzer, then the creation of a new Analyzer is refused. If such an issue is discovered for an existing Analyzer during startup, the server will abort with a fatal error.
- setStopwordsPath
  
  public void setStopwordsPath(String stopwordsPath)
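
The stop-word file format and the path fallback described for `getStopwordsPath()` can be sketched as follows. The class `StopwordFiles` and its method names are hypothetical; skipping blank lines is an assumption, and the environment is passed in as a map so the lookup order can be exercised without touching the real process environment.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of stop-word path resolution and file parsing. */
final class StopwordFiles {
    static final String ENV_VAR = "IRESEARCH_TEXT_STOPWORD_PATH";

    /** Explicit path first, then the environment variable, then the working directory. */
    static Path languageDirectory(String stopwordsPath, Map<String, String> env, String language) {
        String base = stopwordsPath;
        if (base == null) {
            base = env.get(ENV_VAR);
        }
        if (base == null) {
            base = System.getProperty("user.dir");
        }
        // Files are expected in a language sub-directory, e.g. "en" for en_US.utf-8.
        return Paths.get(base).resolve(language);
    }

    /** One word per line; everything after the first whitespace character is ignored. */
    static List<String> parseLines(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            String word = line.split("\\s", 2)[0];
            if (!word.isEmpty()) { // assumption: blank lines carry no stop word
                words.add(word);
            }
        }
        return words;
    }
}
```

A call such as `StopwordFiles.languageDirectory(null, System.getenv(), "en")` would then point at the `en` sub-directory of the environment-provided path, or of the current working directory if the variable is undefined.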
- equals
  
  public boolean equals(Object o)
  
  Overrides:
  
  equals in class Object
- hashCode
  
  public int hashCode()
  
  Overrides:
  
  hashCode in class Object
