All Packages

Package Summary
Text analysis.
Analyzer for Arabic.
Analyzer for Bulgarian.
Analyzer for Bengali.
Provides various convenience classes for creating boosts on Tokens.
Analyzer for Brazilian Portuguese.
Analyzer for Catalan.
Normalization of text before the tokenizer.
Analyzer for Chinese, Japanese, and Korean, which indexes bigrams.
Analyzer for Sorani Kurdish.
Fast, general-purpose grammar-based tokenizers.
Analyzer for Simplified Chinese, which indexes words.
SmartChineseAnalyzer Hidden Markov Model package.
Construct n-grams for frequently occurring terms and phrases.
A filter that decomposes compound words found in many Germanic languages into their word parts.
Hyphenation code for the CompoundWordTokenFilter.
Basic, general-purpose analysis components.
A general-purpose Analyzer that can be created with a builder-style API.
Analyzer for Czech.
Analyzer for Danish.
Analyzer for German.
Analyzer for Greek.
Fast, general-purpose tokenizers for URLs and email addresses.
Analyzer for English.
Analyzer for Spanish.
Analyzer for Estonian.
Analyzer for Basque.
Analyzer for Persian.
Analyzer for Finnish.
Analyzer for French.
Analyzer for Irish.
Analyzer for Galician.
Analyzer for Hindi.
Analyzer for Hungarian.
A Java implementation of Hunspell stemming and spell-checking algorithms (Hunspell), and a stemming TokenFilter (HunspellStemFilter) based on it.
Analyzer for Armenian.
Analysis components based on ICU.
Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm.
Additional ICU-specific Attributes for text analysis.
Analyzer for Indonesian.
Analyzer for Indian languages.
Analyzer for Italian.
Analyzer for Japanese.
Kuromoji dictionary implementation.
Additional Kuromoji-specific Attributes for text analysis.
Kuromoji utility classes.
Analyzer for Korean.
Korean dictionary implementation.
Additional Korean-specific Attributes for text analysis.
Nori utility classes.
Analyzer for Lithuanian.
Analyzer for Latvian.
MinHash filtering (for LSH).
Miscellaneous TokenStreams.
Analyzer for Nepali.
Character n-gram tokenizers and filters.
Analyzer for Dutch.
Analyzer for Norwegian.
Analysis components for path-like strings such as filenames.
Set of components for pattern-based (regex) analysis.
Provides various convenience classes for creating payloads on Tokens.
Analysis components for phonetic search.
Analyzer for Polish.
Analyzer for Portuguese.
Automatically filter high-frequency stopwords.
Filter to reverse token text.
Analyzer for Romanian.
Analyzer for Russian.
Word n-gram filters.
TokenFilter and Analyzer implementations that use a modified version of Snowball stemmers.
Analyzer for Serbian.
Fast, general-purpose grammar-based tokenizer: StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.
Stempel: Algorithmic Stemmer.
Analyzer for Swedish.
Analysis components for Synonyms.
Analyzer for Tamil.
Analyzer for Telugu.
Analyzer for Thai.
General-purpose attributes for text analysis.
Analyzer for Turkish.
Utility functions for text analysis.
Tokenizer that is aware of Wikipedia syntax.
Common APIs for use by backwards compatibility codecs.
Compressing helper classes.
BlockTree terms dictionary.
Lucene 5.0 file format.
Lucene 5.0 compressing format.
Lucene 6.0 file format.
Components from the Lucene 7.0 index format.
Components from the Lucene 8.0 index format.
Lucene 8.4 file format.
Lucene 8.6 file format.
Lucene 8.7 file format.
Lucene 9.0 file format.
Lucene 9.1 file format.
Lucene 9.2 file format.
Lucene 9.4 file format.
Legacy PackedInts methods.
Store helper classes.
Uses already seen data (the indexed documents) to classify an input (which can be simple text or a structured document).
Uses already seen data (the indexed documents) to classify new documents.
Utilities for evaluation, data preparation, etc.
Codecs API: API for customization of the encoding and structure of the index.
Pluggable term index / block terms dictionary implementations.
Same postings format as Lucene50, except the terms dictionary also supports ords, i.e. looking up a term by its ordinal.
Codec PostingsFormat for fast access to low-frequency terms such as primary key fields.
Compressing helper classes.
Lucene 9.0 file format.
BlockTree terms dictionary.
Lucene 9.0 compressing format.
Lucene 9.4 file format.
Lucene 9.5 file format.
Term dictionary, DocValues or Postings formats that are read entirely into memory.
Postings format that can delegate to different formats per-field.
SimpleText codec: writes human-readable postings.
Pluggable term index / block terms dictionary implementations.
Pluggable term index / block terms dictionary implementations.
Unicode collation support.
Custom AttributeImpl for indexing collation keys as index terms.
The logical representation of a Document for indexing and searching.
Expressions.
Javascript expressions.
Geospatial utility implementations for Lucene Core.
Code to maintain and access indices.
High-performance single-document main memory Apache Lucene fulltext search index.
Internal bridges to package-private internals, for use by the lucene test framework only.
Miscellaneous Lucene utilities that don't really fit anywhere else.
Misc extensions of the Document/Field API.
Misc index tools and index support.
Misc search implementations.
Misc Directory implementations.
Memory tracker interface that allows defining custom collector-level memory trackers.
Misc FST classes.
Monitoring framework.
Experimental classes for interacting with payloads.
Filters and Queries that add to core Lucene.
Queries that compute score based upon a function.
FunctionValues for different data types.
A variety of functions to use with FunctionQuery.
Interval queries.
Document similarity query generators.
The payloads package provides Query mechanisms for finding and using payloads.
The calculus of spans.
This package contains reusable parts for javacc-generated grammars (query parsers).
A simple query parser implemented with JavaCC.
QueryParser which permits complex phrase query syntax, e.g. "(john jon jonathan~) peters*".
Extendable QueryParser provides a simple and flexible extension mechanism by overloading query field names.
Core classes of the flexible query parser framework.
Necessary classes to implement query builders.
Base classes used to configure the query processing.
Messages usually used by query parser implementations.
Query nodes commonly used by query parser implementations.
Necessary interfaces to implement text parsers.
Interfaces and implementations used by query node processors.
Utility classes to be used with the Query Parser.
For Native Language Support (NLS), a system of software internationalization.
Precedence Query Parser Implementation.
Lucene Precedence Query Parser Processors.
Lucene Flexible Query Parser Implementation.
Standard Lucene Query Node Builders.
Standard Lucene Query Configuration.
Standard Lucene Query Nodes.
This package contains classes that implement interval function support for the standard syntax parser.
Lucene Query Parser.
Lucene Query Node Processors.
A simple query parser for human-entered queries.
This package contains the QueryParser.jj source file for the Surround parser.
This package contains SrndQuery and its subclasses.
Parser that produces Lucene Query objects from XML streams.
XML Parser factories for different Lucene Query/Filters.
A primary-key postings format that associates a version (long) with each term and can provide fail-fast lookups by ID and version.
This package contains several point types: BigIntegerPoint for 128-bit integers and LatLonPoint for latitude/longitude geospatial points.
Experimental index-related classes.
Additional queries (some may have caveats or limitations).
This package contains a flexible graph-based proximity query, TermAutomatonQuery, and geospatial queries.
Code to search indices.
Comparators, used to compare hits so as to determine their sort order when collecting the top results with TopFieldCollector.
Grouping.
Highlighting search terms.
Support for index-time and query-time joins.
This package contains several components useful to build a highlighter on top of the Matches API.
This package contains the various ranking models that can be used in Lucene.
Suggest alternate spellings for words.
Support for Autocomplete/Autosuggest.
Analyzer-based autosuggest.
Support for document suggestion.
Finite-state based autosuggest.
Ternary Search Tree based autosuggest.
The UnifiedHighlighter -- a flexible highlighter that can get offsets from postings, term vectors, or analysis.
Another highlighter implementation based on term vectors.
Lucene field & query support for the spatial geometry implemented in org.apache.lucene.spatial3d.geom.
Shapes implemented using 3D planar geometry.
Binary I/O API, used for all index data.
Some utility classes.
Finite-state automaton for regular expressions.
Block KD-tree, implementing the generic spatial data structure described in the Bkd-Tree paper.
Compression utilities.
Finite state transducers.
Utility classes for working with token streams as graphs.
Navigable Small-World graph, nominally hierarchical but currently implemented with only a single layer.
Package holding HPPC-related classes.
Comparable object wrappers.
Packed integer arrays and streams.
Egothor stemmer API.
Snowball stemmer API.
Autogenerated snowball stemmer implementations.
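
The entries above are one-line summaries only. As a concrete illustration of how the core analysis packages fit together, here is a minimal sketch (not taken from the Lucene documentation) that runs StandardAnalyzer over a short string and prints each token. It assumes a recent Lucene release where StandardAnalyzer lives in lucene-core; the field name "body" and the sample text are arbitrary.

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public class AnalysisExample {
      public static void main(String[] args) throws Exception {
        // An Analyzer builds a TokenStream; per-token data is exposed via attributes.
        try (Analyzer analyzer = new StandardAnalyzer();
             TokenStream ts = analyzer.tokenStream("body", "Lucene is a search library")) {
          CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
          ts.reset();                        // required before the first incrementToken()
          while (ts.incrementToken()) {
            System.out.println(term.toString());
          }
          ts.end();                          // records end-of-stream state
        }
      }
    }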
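The document, index, store, and search packages combine along similar lines. The following hedged sketch indexes a single document into an in-memory ByteBuffersDirectory and runs a TermQuery against it; the field name and text are invented for illustration, and error handling is omitted.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.ByteBuffersDirectory;
    import org.apache.lucene.store.Directory;

    public class IndexAndSearchExample {
      public static void main(String[] args) throws Exception {
        Directory dir = new ByteBuffersDirectory();      // in-memory Directory

        // Index one document with a single analyzed text field.
        try (IndexWriter writer =
                 new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
          Document doc = new Document();
          doc.add(new TextField("body", "Lucene is a full-text search library", Field.Store.YES));
          writer.addDocument(doc);
        }

        // Open a reader and search for a single term.
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
          IndexSearcher searcher = new IndexSearcher(reader);
          TopDocs hits = searcher.search(new TermQuery(new Term("body", "lucene")), 10);
          System.out.println("matches: " + hits.totalHits);
        }
      }
    }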
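Finally, the classic query parser (the package summarized above as "A simple query parser implemented with JavaCC.") turns a human-entered query string into a Query that can be handed to IndexSearcher.search. A minimal sketch, assuming the same analyzer that was used at index time:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.Query;

    public class QueryParserExample {
      public static void main(String[] args) throws Exception {
        // Parse a query string against a default field.
        QueryParser parser = new QueryParser("body", new StandardAnalyzer());
        Query query = parser.parse("search AND (library OR engine)");
        System.out.println(query);   // prints the parsed query, e.g. +body:search +(body:library body:engine)
      }
    }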