Class FSTCompletion

java.lang.Object
org.apache.lucene.search.suggest.fst.FSTCompletion

public class FSTCompletion extends Object
Finite state automata based implementation of "autocomplete" functionality.
See Also:
  • Field Details

    • DEFAULT_BUCKETS

      public static final int DEFAULT_BUCKETS
      Default number of buckets.
      See Also:
    • EMPTY_RESULT

      private static final ArrayList<FSTCompletion.Completion> EMPTY_RESULT
      An empty result. Keep this an ArrayList to keep all the returned lists of single type (monomorphic calls).
    • automaton

      private final FST<Object> automaton
      Finite state automaton encoding all the lookup terms. See class notes for details.
    • rootArcs

      private final FST.Arc<Object>[] rootArcs
      An array of arcs leaving the root automaton state and encoding weights of all completions in their sub-trees.
    • exactFirst

      private boolean exactFirst
      See Also:
    • higherWeightsFirst

      private boolean higherWeightsFirst
      See Also:
  • Constructor Details

    • FSTCompletion

      public FSTCompletion(FST<Object> automaton, boolean higherWeightsFirst, boolean exactFirst)
      Constructs an FSTCompletion, specifying higherWeightsFirst and exactFirst.
      Parameters:
      automaton - Automaton with completions. See FSTCompletionBuilder.
      higherWeightsFirst - Return most popular suggestions first. This is the default behavior for this implementation. Setting it to false has no effect (use constant term weights to sort alphabetically only).
      exactFirst - Find and push an exact match to the first position of the result list if found.
    • FSTCompletion

      public FSTCompletion(FST<Object> automaton)
      Defaults to higher weights first and exact first.
      See Also:
  • Method Details

    • cacheRootArcs

      private static FST.Arc<Object>[] cacheRootArcs(FST<Object> automaton)
      Cache the root node's output arcs starting with completions with the highest weights.
    • getExactMatchStartingFromRootArc

      private int getExactMatchStartingFromRootArc(int rootArcIndex, BytesRef utf8)
      Returns the first exact match by traversing root arcs, starting from the arc rootArcIndex .
      Parameters:
      rootArcIndex - The first root arc index in rootArcs to consider when matching.
      utf8 - The sequence of utf8 bytes to follow.
      Returns:
      Returns the bucket number of the match or -1 if no match was found.
    • lookup

      public List<FSTCompletion.Completion> lookup(CharSequence key, int num)
      Lookup suggestions to key.
      Parameters:
      key - The prefix to which suggestions should be sought.
      num - At most this number of suggestions will be returned.
      Returns:
      Returns the suggestions, sorted by their approximated weight first (decreasing) and then alphabetically (UTF-8 codepoint order).
    • lookup

      Lookup suggestions to key and return a stream of matching completions. The stream fetches completions dynamically - it can be filtered and limited to acquire the desired number of completions without collecting all of them.
      Parameters:
      key - The prefix to which suggestions should be sought.
      Returns:
      Returns the suggestions
    • lookupSortedByWeight

      private Stream<FSTCompletion.Completion> lookupSortedByWeight(BytesRef key) throws IOException
      Lookup suggestions sorted by weight (descending order).
      Throws:
      IOException
    • completionStream

      private Stream<? extends FSTCompletion.Completion> completionStream(BytesRef output, int bucket, FST.Arc<Object> fromArc) throws IOException
      Return a stream of all completions starting from the provided arc.
      Throws:
      IOException
    • descendWithPrefix

      private boolean descendWithPrefix(FST.Arc<Object> arc, BytesRef utf8) throws IOException
      Descend along the path starting at arc and going through bytes in the argument.
      Parameters:
      arc - The starting arc. This argument is modified in-place.
      utf8 - The term to descend along.
      Returns:
      If true, arc will be set to the arc matching last byte of term. false is returned if no such prefix exists.
      Throws:
      IOException
    • getBucketCount

      public int getBucketCount()
      Returns the bucket count (discretization thresholds).
    • getBucket

      public int getBucket(CharSequence key)
      Returns the bucket assigned to a given key (if found) or -1 if no exact match exists.
    • getFST

      public FST<Object> getFST()
      Returns the internal automaton.