Package org.apache.lucene.analysis.ko
Class KoreanAnalyzer
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.ko.KoreanAnalyzer
- All Implemented Interfaces:
Closeable, AutoCloseable
Analyzer for Korean that uses morphological analysis.
- Since:
- 7.4.0
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
Field Summary
Fields
Modifier and Type, Field
private final KoreanTokenizer.DecompoundMode mode
private final boolean outputUnknownUnigrams
private final Set<POS.Tag> stopTags
private final UserDictionary userDict
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
Constructor Summary
Constructors
KoreanAnalyzer()
Creates a new KoreanAnalyzer.
KoreanAnalyzer(UserDictionary userDict, KoreanTokenizer.DecompoundMode mode, Set<POS.Tag> stopTags, boolean outputUnknownUnigrams)
Creates a new KoreanAnalyzer.
-
Method Summary
Modifier and Type, Method, Description
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
protected TokenStream normalize(String fieldName, TokenStream in)
Wraps the given TokenStream in order to apply normalization filters.
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, initReader, initReaderForNormalization, normalize, tokenStream, tokenStream
-
Field Details
-
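userDict
private final UserDictionary userDict
-
mode
private final KoreanTokenizer.DecompoundMode mode
-
stopTags
private final Set<POS.Tag> stopTags
-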
outputUnknownUnigrams
private final boolean outputUnknownUnigrams
-
-
Constructor Details
-
KoreanAnalyzer
public KoreanAnalyzer()
Creates a new KoreanAnalyzer.
-
KoreanAnalyzer
public KoreanAnalyzer(UserDictionary userDict, KoreanTokenizer.DecompoundMode mode, Set<POS.Tag> stopTags, boolean outputUnknownUnigrams)
Creates a new KoreanAnalyzer.
- Parameters:
userDict - Optional user dictionary; may be null.
mode - Decompound mode.
stopTags - The set of part-of-speech tags that should be filtered out.
outputUnknownUnigrams - If true, outputs unigrams for unknown words.
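The effect of the stopTags and outputUnknownUnigrams parameters can be pictured with a small standalone model. This is only a sketch, not Lucene code: StopTagSketch, its Tag enum, and its Token class are hypothetical stand-ins for POS.Tag and the tokenizer output. It also assumes that with outputUnknownUnigrams set to false an unknown word is kept as a single token.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Standalone model, NOT Lucene code: Tag and Token are hypothetical
// stand-ins for POS.Tag and the tokens produced by the tokenizer.
final class StopTagSketch {
    enum Tag { NOUN, PARTICLE, VERB, ENDING }

    static final class Token {
        final String surface;
        final Tag tag;

        Token(String surface, Tag tag) {
            this.surface = surface;
            this.tag = tag;
        }
    }

    // Keep only the tokens whose part-of-speech tag is absent from stopTags.
    static List<String> filter(List<Token> tokens, Set<Tag> stopTags) {
        List<String> kept = new ArrayList<>();
        for (Token t : tokens) {
            if (!stopTags.contains(t.tag)) {
                kept.add(t.surface);
            }
        }
        return kept;
    }

    // Model of outputUnknownUnigrams: when true, an unknown word is emitted
    // one character (code point) at a time. The "false" branch, one token
    // for the whole word, is an assumption of this sketch.
    static List<String> unknownWord(String word, boolean outputUnknownUnigrams) {
        List<String> out = new ArrayList<>();
        if (!outputUnknownUnigrams) {
            out.add(word);
            return out;
        }
        word.codePoints().forEach(cp -> out.add(new String(Character.toChars(cp))));
        return out;
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
            new Token("학교", Tag.NOUN),
            new Token("에", Tag.PARTICLE),
            new Token("가", Tag.VERB),
            new Token("다", Tag.ENDING));
        // Particles and endings are filtered; nouns and verbs survive.
        System.out.println(filter(tokens, EnumSet.of(Tag.PARTICLE, Tag.ENDING)));
        System.out.println(unknownWord("xyz", true));
    }
}
```

In the real analyzer the filtering is driven by the Set<POS.Tag> passed to the constructor; the model above only mirrors the parameter descriptions.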
-
-
Method Details
-
createComponents
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
Description copied from class: Analyzer
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
- Specified by:
createComponents in class Analyzer
- Parameters:
fieldName - the name of the field whose content is passed to the Analyzer.TokenStreamComponents sink as a reader
- Returns:
the Analyzer.TokenStreamComponents for this analyzer.
-
normalize
protected TokenStream normalize(String fieldName, TokenStream in)
Description copied from class: Analyzer
Wraps the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).
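The wrapping pattern described for normalize can be sketched without Lucene on the classpath. In this illustrative model an Iterator<String> stands in for TokenStream, and NormalizeSketch and its methods are hypothetical names. Lowercasing is used here only as an assumed example of a normalization filter; this page does not state which filters KoreanAnalyzer applies.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Standalone model, NOT Lucene code: Iterator<String> stands in for TokenStream.
final class NormalizeSketch {
    // Default behaviour as documented: the stream is returned as-is.
    static Iterator<String> normalizeDefault(Iterator<String> in) {
        return in;
    }

    // An overriding analyzer wraps the incoming stream in a filter.
    // Lowercasing is an assumed example of such a filter.
    static Iterator<String> normalizeLowercase(Iterator<String> in) {
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return in.hasNext();
            }

            @Override
            public String next() {
                return in.next().toLowerCase(Locale.ROOT);
            }
        };
    }

    public static void main(String[] args) {
        Iterator<String> it = normalizeLowercase(List.of("Lucene", "NORI").iterator());
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

The wrapper never consumes the underlying stream eagerly; each call to next() pulls one token from the wrapped source and transforms it, which is the same decorator shape a chain of token filters follows.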
-