Class SymbolTable
- Direct Known Subclasses:
ShadowedSymbolTable
,SoftReferenceSymbolTable
,SynchronizedSymbolTable
addSymbol
will always return the same string
reference.
The symbol table performs the same task as String.intern()
with the following differences:
- A new string object does not need to be created in order to retrieve a unique reference. Symbols can be added by using a series of characters in a character array.
- Users of the symbol table can provide their own symbol hashing implementation. For example, a simple string hashing algorithm may fail to produce a balanced set of hashcodes for symbols that are mostly unique. Strings with similar leading characters are especially prone to this poor hashing behavior.
SymbolTable
has two parameters that affect its
performance: initial capacity and load factor. The
capacity is the number of buckets in the SymbolTable, and the
initial capacity is simply the capacity at the time the SymbolTable
is created. Note that the SymbolTable is open: in the case of a "hash
collision", a single bucket stores multiple entries, which must be searched
sequentially. The load factor is a measure of how full the SymbolTable
is allowed to get before its capacity is automatically increased.
When the number of entries in the SymbolTable exceeds the product of the load
factor and the current capacity, the capacity is increased by calling the
rehash
method.Generally, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the time cost to look up an entry (which is reflected in most SymbolTable operations, including addSymbol and containsSymbol).
The initial capacity controls a tradeoff between wasted space and the
need for rehash
operations, which are time-consuming.
No rehash
operations will ever occur if the initial
capacity is greater than the maximum number of entries the
Hashtable will contain divided by its load factor. However,
setting the initial capacity too high can waste space.
If many entries are to be made into a SymbolTable
,
creating it with a sufficiently large capacity may allow the
entries to be inserted more efficiently than letting it perform
automatic rehashing as needed to grow the table.
- Version:
- $Id: SymbolTable.java 1358350 2012-07-06 19:04:35Z mrglavas $
- Author:
- Andy Clark, John Kim, IBM
- See Also:
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionprotected static final class
This class is a symbol table entry. -
Field Summary
FieldsModifier and TypeFieldDescriptionprotected SymbolTable.Entry[]
Buckets.protected final int
A new hash function is selected and the table is rehashed when the number of keys in the bucket exceeds this threshold.protected int
The total number of entries in the hash table.protected int[]
Array of randomly selected hash function multipliers ornull
if the default String.hashCode() function should be used.protected float
The load factor for the SymbolTable.protected int
actual table sizeprotected int
The table is rehashed when its size exceeds this threshold.protected static final int
Maximum hash collisions per bucket for a table with load factor == 1.protected static final int
protected static final int
protected static final int
Default table size. -
Constructor Summary
ConstructorsConstructorDescriptionConstructs a new, empty SymbolTable with a default initial capacity (101) and load factor, which is 0.75.SymbolTable
(int initialCapacity) Constructs a new, empty SymbolTable with the specified initial capacity and default load factor, which is 0.75.SymbolTable
(int initialCapacity, float loadFactor) Constructs a new, empty SymbolTable with the specified initial capacity and the specified load factor. -
Method Summary
Modifier and TypeMethodDescriptionaddSymbol
(char[] buffer, int offset, int length) Adds the specified symbol to the symbol table and returns a reference to the unique symbol.Adds the specified symbol to the symbol table and returns a reference to the unique symbol.boolean
containsSymbol
(char[] buffer, int offset, int length) Returns true if the symbol table already contains the specified symbol.boolean
containsSymbol
(String symbol) Returns true if the symbol table already contains the specified symbol.int
hash
(char[] buffer, int offset, int length) Returns a hashcode value for the specified symbol information.int
Returns a hashcode value for the specified symbol.protected void
Randomly selects a new hash function and reorganizes this SymbolTable in order to more evenly distribute its entries across the table.protected void
rehash()
Increases the capacity of and internally reorganizes this SymbolTable, in order to accommodate and access its entries more efficiently.
-
Field Details
-
TABLE_SIZE
protected static final int TABLE_SIZEDefault table size.- See Also:
-
MAX_HASH_COLLISIONS
protected static final int MAX_HASH_COLLISIONSMaximum hash collisions per bucket for a table with load factor == 1.- See Also:
-
MULTIPLIERS_SIZE
protected static final int MULTIPLIERS_SIZE- See Also:
-
MULTIPLIERS_MASK
protected static final int MULTIPLIERS_MASK- See Also:
-
fBuckets
Buckets. -
fTableSize
protected int fTableSizeactual table size -
fCount
protected transient int fCountThe total number of entries in the hash table. -
fThreshold
protected int fThresholdThe table is rehashed when its size exceeds this threshold. (The value of this field is (int)(capacity * loadFactor).) -
fLoadFactor
protected float fLoadFactorThe load factor for the SymbolTable. -
fCollisionThreshold
protected final int fCollisionThresholdA new hash function is selected and the table is rehashed when the number of keys in the bucket exceeds this threshold. -
fHashMultipliers
protected int[] fHashMultipliersArray of randomly selected hash function multipliers ornull
if the default String.hashCode() function should be used.
-
-
Constructor Details
-
SymbolTable
public SymbolTable(int initialCapacity, float loadFactor) Constructs a new, empty SymbolTable with the specified initial capacity and the specified load factor.- Parameters:
initialCapacity
- the initial capacity of the SymbolTable.loadFactor
- the load factor of the SymbolTable.- Throws:
IllegalArgumentException
- if the initial capacity is less than zero, or if the load factor is nonpositive.
-
SymbolTable
public SymbolTable(int initialCapacity) Constructs a new, empty SymbolTable with the specified initial capacity and default load factor, which is 0.75.- Parameters:
initialCapacity
- the initial capacity of the hashtable.- Throws:
IllegalArgumentException
- if the initial capacity is less than zero.
-
SymbolTable
public SymbolTable()Constructs a new, empty SymbolTable with a default initial capacity (101) and load factor, which is 0.75.
-
-
Method Details
-
addSymbol
Adds the specified symbol to the symbol table and returns a reference to the unique symbol. If the symbol already exists, the previous symbol reference is returned instead, in order guarantee that symbol references remain unique.- Parameters:
symbol
- The new symbol.
-
addSymbol
Adds the specified symbol to the symbol table and returns a reference to the unique symbol. If the symbol already exists, the previous symbol reference is returned instead, in order guarantee that symbol references remain unique.- Parameters:
buffer
- The buffer containing the new symbol.offset
- The offset into the buffer of the new symbol.length
- The length of the new symbol in the buffer.
-
hash
Returns a hashcode value for the specified symbol. The value returned by this method must be identical to the value returned by thehash(char[],int,int)
method when called with the character array that comprises the symbol string.- Parameters:
symbol
- The symbol to hash.
-
hash
public int hash(char[] buffer, int offset, int length) Returns a hashcode value for the specified symbol information. The value returned by this method must be identical to the value returned by thehash(String)
method when called with the string object created from the symbol information.- Parameters:
buffer
- The character buffer containing the symbol.offset
- The offset into the character buffer of the start of the symbol.length
- The length of the symbol.
-
rehash
protected void rehash()Increases the capacity of and internally reorganizes this SymbolTable, in order to accommodate and access its entries more efficiently. This method is called automatically when the number of keys in the SymbolTable exceeds this hashtable's capacity and load factor. -
rebalance
protected void rebalance()Randomly selects a new hash function and reorganizes this SymbolTable in order to more evenly distribute its entries across the table. This method is called automatically when the number keys in one of the SymbolTable's buckets exceeds the given collision threshold. -
containsSymbol
Returns true if the symbol table already contains the specified symbol.- Parameters:
symbol
- The symbol to look for.
-
containsSymbol
public boolean containsSymbol(char[] buffer, int offset, int length) Returns true if the symbol table already contains the specified symbol.- Parameters:
buffer
- The buffer containing the symbol to look for.offset
- The offset into the buffer.length
- The length of the symbol in the buffer.
-