Class BlockReader
- All Implemented Interfaces:
Accountable, BytesRefIterator
- Direct Known Subclasses:
IntersectBlockReader, STBlockReader
Fully reads the block into blockReadBuffer, then scans the block terms in memory. The details region is lazily decoded with termStatesReadBuffer, which shares the same byte array as blockReadBuffer. See BlockWriter and BlockLine for the block format.
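The idea of two read buffers sharing one byte array can be illustrated with a minimal sketch. It uses java.nio.ByteBuffer as a stand-in for ByteArrayDataInput, and the class name, method name and detailsOffset parameter are hypothetical; this is not the actual BlockReader code.

```java
import java.nio.ByteBuffer;

/** Sketch only: two independent read cursors over one byte array. */
final class SharedBlockBufferSketch {

  /** Returns {first term-region byte, first details-region byte}. */
  static int[] readBothRegions(byte[] block, int detailsOffset) {
    // Stand-in for blockReadBuffer: scans the term lines from the block start.
    ByteBuffer termsCursor = ByteBuffer.wrap(block);
    // Stand-in for termStatesReadBuffer: same backing array, its own position.
    ByteBuffer detailsCursor = termsCursor.duplicate();
    detailsCursor.position(detailsOffset);

    int firstTermByte = termsCursor.get();
    int firstDetailsByte = detailsCursor.get();
    return new int[] {firstTermByte, firstDetailsByte};
  }

  public static void main(String[] args) {
    byte[] block = {10, 11, 12, 90, 91}; // hypothetical layout: 3 term bytes, 2 details bytes
    int[] result = readBothRegions(block, 3);
    System.out.println(result[0] + " " + result[1]);
  }
}
```

Neither cursor copies the block: moving one position does not affect the other, which is what allows the details region to be decoded only when it is needed.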
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.index.TermsEnum
TermsEnum.SeekStatus
-
Field Summary
Modifier and Type | Field | Description
private static final long | BASE_RAM_USAGE |
protected final BlockDecoder | blockDecoder |
protected int | blockFirstLineStart | Offset of the start of the first line of the current block (just after the header), relative to the block start.
protected BlockHeader | blockHeader | Current block header.
protected BlockHeader.Serializer | blockHeaderReader |
protected IndexInput | blockInput | IndexInput on the block file.
protected BlockLine | blockLine | Current block line.
protected BlockLine.Serializer | blockLineReader |
protected ByteArrayDataInput | blockReadBuffer | In-memory read buffer for the current block.
protected long | blockStartFP | Current block start file pointer, absolute in the block file.
protected IndexDictionary.Browser | dictionaryBrowser | Holds the IndexDictionary.Browser once loaded.
protected final IndexDictionary.BrowserSupplier | dictionaryBrowserSupplier | IndexDictionary.Browser supplier for lazy loading.
protected final FieldMetadata | fieldMetadata |
protected BytesRefBuilder | forcedTerm | Set when seekExact(BytesRef, TermState) is called.
protected int | lineIndexInBlock | Current line index in the block.
protected final PostingsReaderBase | postingsReader |
protected BytesRef | scratchBlockBytes |
protected BlockLine | scratchBlockLine |
protected final BlockTermState | scratchTermState |
protected BlockTermState | termState | Current block line details.
protected boolean | termStateForced | Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
protected DeltaBaseTermStateSerializer | termStateSerializer |
protected ByteArrayDataInput | termStatesReadBuffer | In-memory read buffer for the details region of the current block.
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
Constructor Summary
Modifier | Constructor | Description
protected | BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) |
-
Method Summary
Modifier and Type | Method | Description
protected void | clearTermState() |
protected int | compareToMiddleAndJump(BytesRef searchedTerm) | Compares the searched term to the middle term of the block.
protected BlockHeader.Serializer | createBlockHeaderSerializer() |
protected BlockLine.Serializer | createBlockLineSerializer() |
protected DeltaBaseTermStateSerializer | createDeltaBaseTermStateSerializer() |
protected BytesRef | decodeBlockBytesIfNeeded(int numBlockBytes) |
int | docFreq() | Returns the number of documents containing the current term.
protected IndexDictionary.Browser | getOrCreateDictionaryBrowser() |
ImpactsEnum | impacts(int flags) | Returns an ImpactsEnum.
protected void | initializeBlockReadLazily() |
protected void | initializeHeader(BytesRef searchedTerm, long targetBlockStartFP) | Reads and sets blockHeader.
protected boolean | isBeyondLastTerm(BytesRef searchedTerm, long blockStartFP) | Indicates whether the searched term is beyond the last term of the field.
protected boolean | isCurrentTerm(BytesRef searchedTerm) |
protected CorruptIndexException | newCorruptIndexException(String msg, Long fp) |
BytesRef | next() | Increments the iteration to the next BytesRef in the iterator.
protected BytesRef | nextTerm() | Moves to the next term line and reads it; it may be in the next block.
long | ord() | Returns ordinal position for current term.
PostingsEnum | postings(PostingsEnum reuse, int flags) | Get PostingsEnum for the current term, with control over whether freqs, positions, offsets or payloads are required.
long | ramBytesUsed() | Return the memory usage of this object in bytes.
protected BlockHeader | readHeader() | Reads the block header.
protected BlockLine | readLineInBlock() | Reads the current block line.
protected BlockTermState | readTermState() | Reads the BlockTermState on the current line.
protected BlockTermState | readTermStateIfNotRead() | Reads the BlockTermState if it is not already set.
TermsEnum.SeekStatus | seekCeil(BytesRef) | Seeks to the specified term, if it exists, or to the next (ceiling) term.
void | seekExact(long ord) | Not supported.
boolean | seekExact(BytesRef) | Attempts to seek to the exact term, returning true if the term is found.
void | seekExact(BytesRef term, TermState state) | Positions this BlockReader without re-seeking the term dictionary.
protected TermsEnum.SeekStatus | seekInBlock(BytesRef searchedTerm) | Seeks to the provided term in this block.
protected TermsEnum.SeekStatus | seekInBlock(BytesRef searchedTerm, long blockStartFP) | Seeks to the provided term in the block starting at the provided file pointer.
BytesRef | term() | Returns current term.
TermState | termState() | Expert: Returns the TermsEnums internal state to position the TermsEnum without re-seeking the term dictionary.
long | totalTermFreq() | Returns the total number of occurrences of this term across all documents (the sum of the freq() for each doc that has this term).
Methods inherited from class org.apache.lucene.index.BaseTermsEnum
attributes
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
Field Details
-
BASE_RAM_USAGE
private static final long BASE_RAM_USAGE
-
blockInput
IndexInput on the block file.
-
postingsReader
-
fieldMetadata
-
blockDecoder
-
blockHeaderReader
-
blockLineReader
-
blockReadBuffer
In-memory read buffer for the current block.
-
termStatesReadBuffer
In-memory read buffer for the details region of the current block. It shares the same byte array as blockReadBuffer, with a different position.
-
termStateSerializer
-
dictionaryBrowserSupplier
IndexDictionary.Browser supplier for lazy loading.
-
dictionaryBrowser
Holds the IndexDictionary.Browser once loaded.
-
blockStartFP
protected long blockStartFP
Current block start file pointer, absolute in the block file.
-
blockHeader
Current block header.
-
blockLine
Current block line.
-
termState
Current block line details.
-
blockFirstLineStart
protected int blockFirstLineStart
Offset of the start of the first line of the current block (just after the header), relative to the block start.
-
lineIndexInBlock
protected int lineIndexInBlock
Current line index in the block.
-
termStateForced
protected boolean termStateForced
Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
-
forcedTerm
Set when seekExact(BytesRef, TermState) is called.
This optimizes the use case where the caller first calls seekExact(BytesRef, TermState) and then postings(PostingsEnum, int). In this case we do not access the terms block file (we do not seek) but go directly to the postings file, because we already have the TermState with the file pointers to the postings file.
-
scratchBlockBytes
-
scratchTermState
-
scratchBlockLine
-
-
Constructor Details
-
BlockReader
protected BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) throws IOException
- Parameters:
dictionaryBrowserSupplier - to load the IndexDictionary.Browser lazily in seekCeil(BytesRef).
blockDecoder - Optional block decoder, may be null if none. It can be used for decompression or decryption.
- Throws:
IOException
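The "optional decoder, may be null" contract of blockDecoder can be sketched as follows. This is only an illustration: UnaryOperator<byte[]> stands in for BlockDecoder, whose real interface is not shown on this page, and the XOR transform is a toy placeholder, not real decryption.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

/** Sketch only: apply a decoder to the block bytes when one is configured. */
final class OptionalDecoderSketch {

  /** Returns the raw bytes untouched when decoder is null, else the decoded bytes. */
  static byte[] decodeIfNeeded(byte[] raw, UnaryOperator<byte[]> decoder) {
    return decoder == null ? raw : decoder.apply(raw);
  }

  /** Toy reversible transform (XOR with a fixed key); hypothetical, for illustration only. */
  static byte[] xorTransform(byte[] in) {
    byte[] out = new byte[in.length];
    for (int i = 0; i < in.length; i++) {
      out[i] = (byte) (in[i] ^ 0x5A);
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] plain = {1, 2, 3};
    byte[] encoded = xorTransform(plain);
    System.out.println(Arrays.toString(decodeIfNeeded(plain, null)));
    System.out.println(Arrays.toString(decodeIfNeeded(encoded, OptionalDecoderSketch::xorTransform)));
  }
}
```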
-
-
Method Details
-
seekCeil
Description copied from class: TermsEnum
Seeks to the specified term, if it exists, or to the next (ceiling) term. Returns SeekStatus to indicate whether exact term was found, a different term was found, or EOF was hit. The target term may be before or after the current term. If this returns SeekStatus.END, the enum is unpositioned.
- Specified by:
seekCeil in class TermsEnum
- Throws:
IOException
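The three possible outcomes of a ceiling seek can be illustrated with a small in-memory model. This is a sketch under assumptions: it uses a TreeSet of strings instead of a block file, its own Status enum that mirrors the names FOUND, NOT_FOUND and END, and String ordering rather than byte ordering.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Sketch only: ceiling-seek semantics over an in-memory sorted set of terms. */
final class SeekCeilSketch {

  /** Mirrors the three outcomes described above; not the Lucene enum itself. */
  enum Status { FOUND, NOT_FOUND, END }

  private final TreeSet<String> terms = new TreeSet<>();
  private String current; // null means "unpositioned"

  SeekCeilSketch(String... terms) {
    this.terms.addAll(Arrays.asList(terms));
  }

  Status seekCeil(String target) {
    // Smallest term greater than or equal to the target, or null if none.
    current = terms.ceiling(target);
    if (current == null) {
      return Status.END;
    }
    return current.equals(target) ? Status.FOUND : Status.NOT_FOUND;
  }

  String term() {
    return current;
  }

  public static void main(String[] args) {
    SeekCeilSketch e = new SeekCeilSketch("apple", "cherry");
    System.out.println(e.seekCeil("apple") + " " + e.term());
    System.out.println(e.seekCeil("banana") + " " + e.term());
    System.out.println(e.seekCeil("zebra") + " " + e.term());
  }
}
```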
-
seekExact
Description copied from class: TermsEnum
Attempts to seek to the exact term, returning true if the term is found. If this returns false, the enum is unpositioned. For some codecs, seekExact may be substantially faster than TermsEnum.seekCeil(org.apache.lucene.util.BytesRef).
- Overrides:
seekExact in class BaseTermsEnum
- Returns:
true if the term is found; false if the enum is unpositioned.
- Throws:
IOException
-
isCurrentTerm
-
isBeyondLastTerm
Indicates whether the searched term is beyond the last term of the field.
- Parameters:
blockStartFP - The current block start file pointer.
-
seekInBlock
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm, long blockStartFP) throws IOException
Seeks to the provided term in the block starting at the provided file pointer. Does not exceed the block.
- Throws:
IOException
-
seekInBlock
Seeks to the provided term in this block.
Does not exceed this block; TermsEnum.SeekStatus.END is returned if it follows the block.
Compares the line terms with the searchedTerm, taking advantage of the incremental encoding properties.
Scans the terms linearly. Updates the current block line with the current term.
- Throws:
IOException
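A linear scan over incrementally encoded lines can be sketched as below. The encoding shown (each line stores the length of the prefix shared with the previous term plus a suffix) is an assumption made for illustration; the real layout is defined by BlockLine and BlockWriter, and the real method returns a SeekStatus and uses comparison shortcuts that this sketch does not reproduce.

```java
/** Sketch only: linear scan of prefix-compressed terms within one block. */
final class IncrementalScanSketch {

  /**
   * Returns the index of the first term greater than or equal to searched,
   * or -1 if searched follows every term of the block.
   * Line i is described by sharedPrefixLengths[i] and suffixes[i] (hypothetical layout).
   */
  static int seekInBlock(int[] sharedPrefixLengths, String[] suffixes, String searched) {
    StringBuilder term = new StringBuilder();
    for (int i = 0; i < suffixes.length; i++) {
      // Rebuild the term in place: keep the prefix shared with the previous term,
      // then append the suffix stored on this line.
      term.setLength(sharedPrefixLengths[i]);
      term.append(suffixes[i]);
      if (term.toString().compareTo(searched) >= 0) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Encodes the sorted terms: car, card, care, cat.
    int[] prefixes = {0, 3, 3, 2};
    String[] suffixes = {"car", "d", "e", "t"};
    System.out.println(seekInBlock(prefixes, suffixes, "care"));
    System.out.println(seekInBlock(prefixes, suffixes, "dog"));
  }
}
```

Only one term buffer is kept for the whole scan, which is the main saving incremental encoding offers here.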
-
compareToMiddleAndJump
Compares the searched term to the middle term of the block. If the searched term is lexicographically equal to or after the middle term, jumps directly to the second half of the block.
- Returns:
The comparison between the searched term and the middle term.
- Throws:
IOException
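The effect of the middle-term comparison can be shown on an in-memory array. This is a simplified sketch, assuming a non-empty sorted String array and a middle index of length / 2; in the actual block the middle line position comes from the block header, which is not modeled here.

```java
/** Sketch only: compare to the middle term, then scan linearly from the chosen half. */
final class MiddleJumpSketch {

  /** Returns the index where the linear scan should start. Assumes a non-empty array. */
  static int scanStart(String[] sortedTerms, int middleIndex, String searched) {
    int comparison = searched.compareTo(sortedTerms[middleIndex]);
    // Equal to or after the middle term: skip the first half entirely.
    return comparison >= 0 ? middleIndex : 0;
  }

  /** Returns the index of the first term >= searched, or sortedTerms.length if none. */
  static int seek(String[] sortedTerms, String searched) {
    int i = scanStart(sortedTerms, sortedTerms.length / 2, searched);
    while (i < sortedTerms.length && sortedTerms[i].compareTo(searched) < 0) {
      i++;
    }
    return i;
  }

  public static void main(String[] args) {
    String[] terms = {"a", "c", "e", "g", "i", "k"};
    System.out.println(scanStart(terms, 3, "h")); // starts at the middle term "g"
    System.out.println(seek(terms, "h"));
  }
}
```

A single comparison halves the worst-case length of the linear scan without requiring random access to every line.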
-
readLineInBlock
Reads the current block line. Sets blockLine and increments lineIndexInBlock.
- Returns:
The BlockLine; or null if there are no more lines in the block.
- Throws:
IOException
-
seekExact
Positions this BlockReader without re-seeking the term dictionary.
The block containing the term is not read by this method. It will be read lazily only if needed, for example if next() is called. Calling postings(org.apache.lucene.index.PostingsEnum, int) after this method does not require the block to be read.
- Overrides:
seekExact in class BaseTermsEnum
- Parameters:
term - the term the TermState corresponds to
state - the TermState
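The forcedTerm field description states that calling postings(PostingsEnum, int) right after seekExact(BytesRef, TermState) goes directly to the postings file, while the block is only read if iteration continues. The following sketch models that behavior with a counter. All names are hypothetical, a long stands in for the TermState, and it assumes the forced term is present in the sorted terms and that seekExact is called before next().

```java
import java.util.Arrays;

/** Sketch only: a forced state defers the block read until next() needs it. */
final class ForcedStateSketch {

  private final String[] sortedTerms; // stand-in for the terms of the block file
  int blockReads = 0;                 // counts simulated reads of the block file
  private String forcedTerm;
  private long forcedPostingsPointer; // stand-in for the TermState file pointers
  private int position = -1;          // -1 until the block has been read

  ForcedStateSketch(String... sortedTerms) {
    this.sortedTerms = sortedTerms;
  }

  /** Records the term and its state without touching the block. */
  void seekExact(String term, long postingsPointer) {
    forcedTerm = term;
    forcedPostingsPointer = postingsPointer;
    position = -1;
  }

  /** Uses the forced state directly; no block read is needed. */
  long postings() {
    return forcedPostingsPointer;
  }

  /** Reads the block lazily (once), then moves to the following term. */
  String next() {
    if (position < 0) {
      blockReads++;
      position = Arrays.binarySearch(sortedTerms, forcedTerm);
    }
    position++;
    return position < sortedTerms.length ? sortedTerms[position] : null;
  }

  public static void main(String[] args) {
    ForcedStateSketch reader = new ForcedStateSketch("a", "b", "c");
    reader.seekExact("b", 42L);
    System.out.println(reader.postings() + " after " + reader.blockReads + " block reads");
    System.out.println(reader.next() + " after " + reader.blockReads + " block reads");
  }
}
```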
-
seekExact
public void seekExact(long ord)
Not supported.
-
next
Description copied from interface: BytesRefIterator
Increments the iteration to the next BytesRef in the iterator. Returns the resulting BytesRef or null if the end of the iterator is reached. The returned BytesRef may be re-used across calls to next. After this method returns null, do not call it again: the results are undefined.
- Specified by:
next in interface BytesRefIterator
- Returns:
the next BytesRef in the iterator or null if the end of the iterator is reached.
- Throws:
IOException - If there is a low-level I/O error.
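Because the returned object may be re-used across calls, a caller that keeps terms beyond the next call should copy them. The sketch below shows the usual loop shape with a hypothetical ReusingIterator that returns the same StringBuilder each time; it stands in for the BytesRef re-use described above and is not Lucene code.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: iterate until null and copy each value, since the buffer is re-used. */
final class ReusedBufferIterationSketch {

  /** Hypothetical iterator that hands back the same mutable buffer on every call. */
  static final class ReusingIterator {
    private final String[] terms;
    private final StringBuilder shared = new StringBuilder();
    private int index = 0;

    ReusingIterator(String... terms) {
      this.terms = terms;
    }

    StringBuilder next() {
      if (index == terms.length) {
        return null; // end of iteration
      }
      shared.setLength(0);
      shared.append(terms[index++]);
      return shared;
    }
  }

  static List<String> collect(ReusingIterator iterator) {
    List<String> copies = new ArrayList<>();
    for (StringBuilder term = iterator.next(); term != null; term = iterator.next()) {
      copies.add(term.toString()); // copy now: the buffer changes on the next call
    }
    return copies;
  }

  public static void main(String[] args) {
    System.out.println(collect(new ReusingIterator("a", "b", "c")));
  }
}
```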
-
nextTerm
Moves to the next term line and reads it; it may be in the next block. The term details are not read yet. They will be read only when needed with readTermStateIfNotRead().
- Returns:
The read term bytes; or null if there are no more terms for the field.
- Throws:
IOException
-
initializeHeader
Reads and sets blockHeader. Sets null if there is no block for the field anymore.
- Parameters:
searchedTerm - The searched term; or null if none.
targetBlockStartFP - The file pointer of the block to read.
- Throws:
IOException
-
initializeBlockReadLazily
- Throws:
IOException
-
createBlockHeaderSerializer
-
createBlockLineSerializer
-
createDeltaBaseTermStateSerializer
-
readHeader
Reads the block header. Sets blockHeader.
- Returns:
The block header; or null if there is no block for the field anymore.
- Throws:
IOException
-
decodeBlockBytesIfNeeded
- Throws:
IOException
-
readTermStateIfNotRead
Reads the BlockTermState if it is not already set. Sets termState.
- Throws:
IOException
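The "read only if not already set" pattern amounts to caching the decoded details of the current line and clearing the cache when the line changes. A minimal sketch, assuming integer details and hypothetical names (nextLine, decodeCount); the assumption that moving to another line clears the cached state is illustrative and not taken from this page.

```java
/** Sketch only: decode the details of the current line at most once. */
final class LazyDetailsSketch {

  private final int[] encodedDetails; // stand-in for the details region of a block
  private int lineIndex = 0;
  private Integer details;            // null until decoded; stand-in for termState
  int decodeCount = 0;                // counts actual decodes, for illustration

  LazyDetailsSketch(int[] encodedDetails) {
    this.encodedDetails = encodedDetails;
  }

  /** Decodes the details only on the first call for the current line. */
  int readDetailsIfNotRead() {
    if (details == null) {
      decodeCount++;
      details = encodedDetails[lineIndex];
    }
    return details;
  }

  /** Moving to another line invalidates the cached details. */
  void nextLine() {
    lineIndex++;
    details = null;
  }

  public static void main(String[] args) {
    LazyDetailsSketch reader = new LazyDetailsSketch(new int[] {5, 7});
    reader.readDetailsIfNotRead();
    reader.readDetailsIfNotRead();
    System.out.println("decodes so far: " + reader.decodeCount);
  }
}
```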
-
readTermState
Reads the BlockTermState on the current line. Sets termState.
An overriding method may return null if there is no BlockTermState (in this case the extending class must support a null termState).
- Returns:
The BlockTermState; or null if none.
- Throws:
IOException
-
term
Description copied from class: TermsEnum
Returns current term. Do not call this when the enum is unpositioned.
-
ord
public long ord()
Description copied from class: TermsEnum
Returns ordinal position for current term. This is an optional method (the codec may throw UnsupportedOperationException). Do not call this when the enum is unpositioned.
-
docFreq
Description copied from class: TermsEnum
Returns the number of documents containing the current term. Do not call this when the enum is unpositioned (see TermsEnum.SeekStatus.END).
- Specified by:
docFreq in class TermsEnum
- Throws:
IOException
-
totalTermFreq
Description copied from class: TermsEnum
Returns the total number of occurrences of this term across all documents (the sum of the freq() for each doc that has this term). Note that, like other term measures, this measure does not take deleted documents into account.
- Specified by:
totalTermFreq in class TermsEnum
- Throws:
IOException
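The relationship between the per-document frequency, docFreq() and totalTermFreq() can be made concrete with a small worked example. The array of per-document frequencies is hypothetical input; the sketch only restates the definitions given above.

```java
/** Sketch only: docFreq and totalTermFreq derived from per-document frequencies. */
final class TotalTermFreqSketch {

  /** Number of documents containing the term: one array entry per such document. */
  static int docFreq(int[] freqPerDoc) {
    return freqPerDoc.length;
  }

  /** Sum of the per-document frequencies of the term. */
  static long totalTermFreq(int[] freqPerDoc) {
    long total = 0;
    for (int freq : freqPerDoc) {
      total += freq;
    }
    return total;
  }

  public static void main(String[] args) {
    int[] freqPerDoc = {3, 1, 2}; // the term occurs 3, 1 and 2 times in three documents
    System.out.println(docFreq(freqPerDoc) + " docs, " + totalTermFreq(freqPerDoc) + " occurrences");
  }
}
```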
-
termState
Description copied from class: TermsEnum
Expert: Returns the TermsEnums internal state to position the TermsEnum without re-seeking the term dictionary.
NOTE: A seek by TermState might not capture the AttributeSource's state. Callers must maintain the AttributeSource states separately.
- Overrides:
termState in class BaseTermsEnum
- Throws:
IOException
-
postings
Description copied from class: TermsEnum
Get PostingsEnum for the current term, with control over whether freqs, positions, offsets or payloads are required. Do not call this when the enum is unpositioned. This method will not return null.
NOTE: the returned iterator may return deleted documents, so deleted documents have to be checked on top of the PostingsEnum.
- Specified by:
postings in class TermsEnum
- Parameters:
reuse - pass a prior PostingsEnum for possible reuse
flags - specifies which optional per-document values you require; see PostingsEnum.FREQS
- Throws:
IOException
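The note about deleted documents means the caller filters the returned document IDs against a set of live documents. The sketch below shows that filtering step in isolation, using java.util.BitSet as a stand-in for the live-docs bits and a plain int array as a stand-in for the iterated postings; both are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Sketch only: keep the postings document IDs that are still live. */
final class LiveDocsFilterSketch {

  /** A set bit in liveDocs means the document has not been deleted. */
  static List<Integer> liveOnly(int[] postingsDocIds, BitSet liveDocs) {
    List<Integer> live = new ArrayList<>();
    for (int docId : postingsDocIds) {
      if (liveDocs.get(docId)) {
        live.add(docId);
      }
    }
    return live;
  }

  public static void main(String[] args) {
    BitSet liveDocs = new BitSet();
    liveDocs.set(0);
    liveDocs.set(2); // document 1 is treated as deleted
    System.out.println(liveOnly(new int[] {0, 1, 2}, liveDocs));
  }
}
```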
-
impacts
Description copied from class: TermsEnum
Returns an ImpactsEnum.
- Specified by:
impacts in class TermsEnum
- Throws:
IOException
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
- Specified by:
ramBytesUsed in interface Accountable
-
getOrCreateDictionaryBrowser
- Throws:
IOException
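The dictionaryBrowserSupplier and dictionaryBrowser fields describe a load-once pattern: the supplier is invoked on first use and the result is kept. A generic sketch of that pattern follows. It uses java.util.function.Supplier, which cannot throw the IOException declared here, so it is a simplification and not the signature of IndexDictionary.BrowserSupplier.

```java
import java.util.function.Supplier;

/** Sketch only: create the value on first access and re-use it afterwards. */
final class LazyBrowserSketch<T> {

  private final Supplier<T> supplier; // stand-in for dictionaryBrowserSupplier
  private T browser;                  // stand-in for dictionaryBrowser; null until loaded

  LazyBrowserSketch(Supplier<T> supplier) {
    this.supplier = supplier;
  }

  /** Loads the value the first time it is requested (assumes the supplier never returns null). */
  T getOrCreate() {
    if (browser == null) {
      browser = supplier.get();
    }
    return browser;
  }

  public static void main(String[] args) {
    int[] loads = {0};
    LazyBrowserSketch<String> lazy = new LazyBrowserSketch<>(() -> {
      loads[0]++;
      return "browser";
    });
    lazy.getOrCreate();
    lazy.getOrCreate();
    System.out.println("loads: " + loads[0]);
  }
}
```

A reader that never seeks in the dictionary, for example one used only through seekExact(BytesRef, TermState), would never pay the loading cost under this pattern.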
-
clearTermState
protected void clearTermState()
-
newCorruptIndexException
-