Class IDVersionPostingsFormat
java.lang.Object
org.apache.lucene.codecs.PostingsFormat
org.apache.lucene.sandbox.codecs.idversion.IDVersionPostingsFormat
- All Implemented Interfaces:
NamedSPILoader.NamedSPI
A PostingsFormat optimized for primary-key (ID) fields that also record a version (long) for each
ID, delivered as a payload created by longToBytes(long, org.apache.lucene.util.BytesRef) during
indexing. At search time, the TermsEnum implementation IDVersionSegmentTermsEnum enables fast
lookup (using only the terms index when possible) of whether a given ID was previously indexed
with version > N (see IDVersionSegmentTermsEnum.seekExact(BytesRef,long)).
This is most effective if the application assigns a monotonically increasing global version to
each indexed document. Then, during indexing, use IDVersionSegmentTermsEnum.seekExact(BytesRef,long)
(along with LiveFieldValues) to decide whether the document you are about to index was already
indexed with a higher version, and skip it if so.
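The skip-if-newer decision described above can be sketched without any Lucene dependency. In the sketch below, the class VersionGateSketch, its offer method, and the in-memory map are all hypothetical stand-ins: in a real index the lookup would go through IDVersionSegmentTermsEnum.seekExact(BytesRef,long) together with LiveFieldValues, not a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the "skip if a higher version exists" check. */
public class VersionGateSketch {
    // Assumption: ID -> highest version accepted so far. In a real index this
    // role is played by the terms dictionary plus LiveFieldValues.
    private final Map<String, Long> latest = new HashMap<>();

    /** Returns true if the document should be indexed, false if it should be skipped. */
    public boolean offer(String id, long version) {
        Long seen = latest.get(id);
        if (seen != null && seen > version) {
            return false; // already indexed with a higher version: skip
        }
        latest.put(id, version); // remember the version we are about to index
        return true;
    }
}
```

With monotonically increasing global versions, a late-arriving update for an ID that carries an older version is rejected by offer, while newer versions replace the remembered one.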
The field is effectively indexed as DOCS_ONLY and the docID is pulsed into the terms dictionary, but the user must feed in the version as a payload on the first token.
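To make the payload requirement concrete, here is a minimal sketch of packing a long version into a fixed-size byte array and reading it back. The 8-byte, most-significant-byte-first layout and the names VersionPayloadSketch, toBytes, and fromBytes are assumptions made for illustration; in real code, use longToBytes(long, BytesRef) and bytesToLong(BytesRef) from this class rather than a hand-rolled encoder.

```java
/** Hypothetical round trip between a long version and an 8-byte payload. */
public class VersionPayloadSketch {
    /** Writes v into 8 bytes, most significant byte first (assumed layout). */
    public static byte[] toBytes(long v) {
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            out[i] = (byte) (v >>> (56 - 8 * i));
        }
        return out;
    }

    /** Reads back a long written by toBytes. */
    public static long fromBytes(byte[] in) {
        if (in.length != 8) {
            throw new IllegalArgumentException("expected 8 bytes, got " + in.length);
        }
        long v = 0;
        for (byte b : in) {
            v = (v << 8) | (b & 0xFFL); // mask to avoid sign extension
        }
        return v;
    }
}
```

The resulting bytes would be attached as the payload of the first token produced for the ID field.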
NOTE: term vectors cannot be indexed with this field (not that you should really ever want to do this).
-
Field Summary
Fields
Modifier and Type     Field              Description
static final long     MAX_VERSION        version must be <= this, because we encode with ZigZag.
private final int     maxTermsInBlock
static final long     MIN_VERSION        version must be >= this.
private final int     minTermsInBlock
Fields inherited from class org.apache.lucene.codecs.PostingsFormat
EMPTY
-
Constructor Summary
Constructors
IDVersionPostingsFormat()
IDVersionPostingsFormat(int minTermsInBlock, int maxTermsInBlock)
-
Method Summary
Modifier and Type     Method                                      Description
static long           bytesToLong(BytesRef bytes)
FieldsConsumer        fieldsConsumer(SegmentWriteState state)     Writes a new segment
FieldsProducer        fieldsProducer(SegmentReadState state)      Reads a segment.
static void           longToBytes(long v, BytesRef bytes)
Methods inherited from class org.apache.lucene.codecs.PostingsFormat
availablePostingsFormats, forName, getName, reloadPostingsFormats, toString
-
Field Details
-
MIN_VERSION
public static final long MIN_VERSION
version must be >= this.
-
MAX_VERSION
public static final long MAX_VERSION
version must be <= this, because we encode with ZigZag.
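For readers unfamiliar with ZigZag, the following is a minimal sketch of the generic transform on 64-bit values. It only illustrates the standard encoding; how this class applies it internally is not described on this page, and the names ZigZagSketch, encode, and decode are hypothetical.

```java
/** Generic ZigZag transform on longs (illustrative, not this class's internals). */
public class ZigZagSketch {
    /** Maps 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... */
    public static long encode(long n) {
        return (n << 1) ^ (n >> 63); // arithmetic shift spreads the sign bit
    }

    /** Inverse of encode. */
    public static long decode(long z) {
        return (z >>> 1) ^ -(z & 1); // low bit carries the original sign
    }
}
```

Because encoding shifts the value left by one bit, a non-negative input roughly doubles in magnitude, which is the usual reason an upper bound is placed on values stored this way.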
-
minTermsInBlock
private final int minTermsInBlock -
maxTermsInBlock
private final int maxTermsInBlock
-
-
Constructor Details
-
IDVersionPostingsFormat
public IDVersionPostingsFormat() -
IDVersionPostingsFormat
public IDVersionPostingsFormat(int minTermsInBlock, int maxTermsInBlock)
-
-
Method Details
-
fieldsConsumer
Description copied from class: PostingsFormat
Writes a new segment
- Specified by:
fieldsConsumer in class PostingsFormat
- Throws:
IOException
-
fieldsProducer
Description copied from class: PostingsFormat
Reads a segment. NOTE: by the time this call returns, it must hold open any files it will need to use; else, those files may be deleted. Additionally, required files may be deleted during the execution of this call before there is a chance to open them. Under these circumstances an IOException should be thrown by the implementation. IOExceptions are expected and will automatically cause a retry of the segment opening logic with the newly revised segments.
- Specified by:
fieldsProducer in class PostingsFormat
- Throws:
IOException
-
bytesToLong
-
longToBytes
-