Class ColumnCardinalityCache


  • public class ColumnCardinalityCache
    extends Object
    This class is an indexing utility that caches the cardinality of column values for every table. Each table has its own cache, independent of every other table's, and each column also has its own Guava cache. Using this utility can substantially speed up retrieving the cardinality of many columns, because it avoids unnecessary accesses to the metrics table in Accumulo for a cardinality that won't change much.
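
    The per-table, per-column layout described above can be sketched in plain Java. This is only an illustration under stated assumptions: PerColumnCacheSketch and MetricsLookup are hypothetical names, a ConcurrentHashMap stands in for the Guava cache (so there is no eviction or expiry), and the lookup callback stands in for a scan of the Accumulo metrics table.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the actual ColumnCardinalityCache implementation.
final class PerColumnCacheSketch {
    // Assumed stand-in for a scan of the Accumulo metrics table.
    interface MetricsLookup {
        long fetch(String table, String column, String value);
    }

    // table -> column -> (column value -> cached cardinality)
    private final Map<String, Map<String, Map<String, Long>>> caches = new ConcurrentHashMap<>();
    private final MetricsLookup lookup;
    private final AtomicInteger lookups = new AtomicInteger();

    PerColumnCacheSketch(MetricsLookup lookup) {
        this.lookup = lookup;
    }

    long get(String table, String column, String value) {
        // Each table, and each column within it, gets an independent cache.
        Map<String, Long> columnCache = caches
                .computeIfAbsent(table, t -> new ConcurrentHashMap<>())
                .computeIfAbsent(column, c -> new ConcurrentHashMap<>());
        // Only a cache miss reaches the (simulated) metrics table.
        return columnCache.computeIfAbsent(value, v -> {
            lookups.incrementAndGet();
            return lookup.fetch(table, column, v);
        });
    }

    // Number of simulated metrics-table accesses so far.
    int lookupCount() {
        return lookups.get();
    }
}
```

    In this sketch, asking twice for the same table, column and value triggers a single lookup, while the same column name in a different table is cached separately.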
    • Constructor Detail

      • ColumnCardinalityCache

        @Inject
        public ColumnCardinalityCache(org.apache.accumulo.core.client.Connector connector,
                                      AccumuloConfig config)
    • Method Detail

      • shutdown

        @PreDestroy
        public void shutdown()
      • getCardinalities

        public com.google.common.collect.Multimap<Long,AccumuloColumnConstraint> getCardinalities(String schema,
                                                                                                  String table,
                                                                                                  org.apache.accumulo.core.security.Authorizations auths,
                                                                                                  com.google.common.collect.Multimap<AccumuloColumnConstraint,org.apache.accumulo.core.data.Range> idxConstraintRangePairs,
                                                                                                  long earlyReturnThreshold,
                                                                                                  io.airlift.units.Duration pollingDuration)
        Gets the cardinality for each AccumuloColumnConstraint. The given constraints are expected to be indexed; behavior for non-indexed constraints is not defined.
        Parameters:
        schema - Schema name
        table - Table name
        auths - Scan authorizations
        idxConstraintRangePairs - Mapping of all ranges for a given constraint
        earlyReturnThreshold - Smallest acceptable cardinality to return early while other tasks complete
        pollingDuration - Duration for polling the cardinality completion service
        Returns:
        An immutable multimap of cardinality to column constraint, sorted by cardinality from smallest to largest
        Throws:
        org.apache.accumulo.core.client.TableNotFoundException - If the metrics table does not exist
        ExecutionException - If another error occurs while retrieving the cardinalities
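
        The interplay of earlyReturnThreshold and pollingDuration can be illustrated with a completion service from java.util.concurrent. This is a sketch under assumptions, not the method's actual implementation: EarlyReturnSketch is a hypothetical name, constraints are represented by plain strings, a TreeMap of lists stands in for the sorted multimap, and the early-return rule is assumed to be "stop once the smallest cardinality seen so far is at or below the threshold".

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and the early-return rule are assumptions.
final class EarlyReturnSketch {
    // tasks maps a constraint name to the computation of its cardinality.
    static TreeMap<Long, List<String>> collect(Map<String, Callable<Long>> tasks,
                                               long earlyReturnThreshold,
                                               long pollingMillis)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        CompletionService<Map.Entry<String, Long>> service = new ExecutorCompletionService<>(pool);
        try {
            for (Map.Entry<String, Callable<Long>> task : tasks.entrySet()) {
                service.submit(() -> new SimpleImmutableEntry<String, Long>(
                        task.getKey(), task.getValue().call()));
            }
            // Keys are kept sorted, so the smallest cardinality comes first.
            TreeMap<Long, List<String>> result = new TreeMap<>();
            int completed = 0;
            while (completed < tasks.size()) {
                // Wait up to the polling duration for the next finished task.
                Future<Map.Entry<String, Long>> done =
                        service.poll(pollingMillis, TimeUnit.MILLISECONDS);
                if (done == null) {
                    continue;
                }
                Map.Entry<String, Long> entry = done.get();
                result.computeIfAbsent(entry.getValue(), k -> new ArrayList<>())
                        .add(entry.getKey());
                completed++;
                // Return early once a small enough cardinality has been found.
                if (result.firstKey() <= earlyReturnThreshold) {
                    break;
                }
            }
            return result;
        } finally {
            // Abandon any tasks that are still running.
            pool.shutdownNow();
        }
    }
}
```

        With a threshold that no cardinality satisfies, the sketch waits for every task and returns all results in ascending order. With a generous threshold, it may return only the results that had completed by the time a qualifying cardinality was seen.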
      • getColumnCardinality

        public long getColumnCardinality(String schema,
                                         String table,
                                         org.apache.accumulo.core.security.Authorizations auths,
                                         String family,
                                         String qualifier,
                                         Collection<org.apache.accumulo.core.data.Range> colValues)
                                  throws ExecutionException
        Gets the column cardinality for all of the given range values. May reach out to the metrics table in Accumulo to retrieve new cache elements.
        Parameters:
        schema - Schema name
        table - Table name
        auths - Scan authorizations
        family - Accumulo column family
        qualifier - Accumulo column qualifier
        colValues - All range values to summarize for the cardinality
        Returns:
        The cardinality of the column
        Throws:
        ExecutionException
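
        One way to read "all range values to summarize" is as a sum of the per-value cardinalities, with only the values missing from the cache fetched from the metrics table. The sketch below illustrates that reading and nothing more: CardinalitySumSketch is a hypothetical name, plain strings stand in for exact-value Range objects, the metricsLookup callback stands in for the Accumulo scan, and summation is an assumption rather than documented behavior.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch; summing per-value cardinalities is an assumption.
final class CardinalitySumSketch {
    private final Map<String, Long> cached = new HashMap<>();
    // Assumed stand-in for a scan of the Accumulo metrics table.
    private final ToLongFunction<String> metricsLookup;

    CardinalitySumSketch(ToLongFunction<String> metricsLookup) {
        this.metricsLookup = metricsLookup;
    }

    long columnCardinality(Collection<String> colValues) {
        long sum = 0;
        for (String value : colValues) {
            Long cardinality = cached.get(value);
            if (cardinality == null) {
                // Cache miss: fetch the value once and remember it.
                cardinality = metricsLookup.applyAsLong(value);
                cached.put(value, cardinality);
            }
            sum += cardinality;
        }
        return sum;
    }
}
```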