Class Recognizer

  • Direct Known Subclasses:
    DefaultRecognizer, ExplicitRecognizer

    abstract class Recognizer
    extends java.lang.Object
    Abstract Recognizer class used to determine whether a candidate aggregate table has the required column categories: a "fact_count" column, measure columns, and foreign key and level columns.

    Derived classes use either the default or explicit column descriptions in matching column categories. The basic matching algorithm is in this class while some specific column category matching and column building must be specified in derived classes.

    A Recognizer is created per candidate aggregate table. The table's columns are then categorized. All errors and warnings are added to a MessageRecorder.

    This class is less about defining a type and more about code sharing.

    Author:
    Richard M. Emberson
    • Method Detail

      • check

        public boolean check()
        Return true if the candidate aggregate table was successfully mapped into the fact table. This is the top-level checking method.

        It first checks the ignore columns.

        Next, the existence of a fact count column is checked.

        Then the measures are checked. First, the specified (defined, explicit) measures are all determined. There must be at least one such measure. This is followed by checking for implied measures (e.g., if the base fact table has both the sum and the average of a column and the aggregate has a sum measure, then there is an implied average measure in the aggregate).

        Now the levels are checked. This is in two parts. First, foreign keys are checked followed by level columns (for collapsed dimension aggregates).

        If everything checks out, returns true.
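
        The order of checks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Mondrian's implementation: CheckFlowSketch, its fields, and its error messages are hypothetical, and the sketch assumes that problems are collected rather than aborting at the first failure.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a recognizer; not Mondrian's Recognizer. */
final class CheckFlowSketch {
    private final int factCountColumns; // columns matched as fact count
    private final int explicitMeasures; // explicitly defined measures found
    final List<String> errors = new ArrayList<>();

    CheckFlowSketch(int factCountColumns, int explicitMeasures) {
        this.factCountColumns = factCountColumns;
        this.explicitMeasures = explicitMeasures;
    }

    /** Runs the checks in the documented order and reports overall success. */
    boolean check() {
        // 1. Ignore columns would be marked here (omitted in this sketch).

        // 2. Fact count column (assumption: exactly one is required).
        if (factCountColumns != 1) {
            errors.add("expected one fact count column, found " + factCountColumns);
        }

        // 3. Explicit measures: at least one is required.
        if (explicitMeasures < 1) {
            errors.add("no measure columns found");
        }

        // 4. Implied measures, foreign keys, and level columns would
        //    follow here, in that order (omitted in this sketch).

        return errors.isEmpty();
    }
}
```

        For example, new CheckFlowSketch(1, 2).check() succeeds, while a table with no measures fails and leaves one entry in errors.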

      • getIgnoreMatcher

        protected abstract Recognizer.Matcher getIgnoreMatcher()
        Return the ignore column Matcher.
      • checkIgnores

        protected void checkIgnores()
        Check all columns to be marked as ignore.
      • makeIgnore

        protected void makeIgnore​(JdbcSchema.Table.Column aggColumn)
        Create an ignore usage for the aggColumn.
      • getFactCountMatcher

        protected abstract Recognizer.Matcher getFactCountMatcher()
        Return the fact count column Matcher.
      • checkFactCount

        protected void checkFactCount()
        Make sure that the aggregate table has one fact count column and that its type is numeric.
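
        A rough sketch of such a check is shown below. It is an illustration only: FactCountCheckSketch and its methods are hypothetical, the set of type codes treated as numeric is an assumption, and "one" is read here as "exactly one". Only the java.sql.Types constants are real JDK API.

```java
import java.sql.Types;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; not Mondrian's implementation. */
final class FactCountCheckSketch {

    /** True for the java.sql.Types codes this sketch treats as numeric. */
    static boolean isNumeric(int sqlType) {
        switch (sqlType) {
            case Types.TINYINT:
            case Types.SMALLINT:
            case Types.INTEGER:
            case Types.BIGINT:
            case Types.FLOAT:
            case Types.REAL:
            case Types.DOUBLE:
            case Types.NUMERIC:
            case Types.DECIMAL:
                return true;
            default:
                return false;
        }
    }

    /**
     * Takes the columns matched as fact count (name to java.sql.Types code)
     * and returns a problem description, or empty if the check passes.
     */
    static Optional<String> checkFactCount(Map<String, Integer> matched) {
        if (matched.size() != 1) {
            return Optional.of(
                "expected one fact count column, found " + matched.size());
        }
        Map.Entry<String, Integer> only = matched.entrySet().iterator().next();
        if (!isNumeric(only.getValue())) {
            return Optional.of(
                "fact count column " + only.getKey() + " is not numeric");
        }
        return Optional.empty();
    }
}
```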
      • checkMeasures

        protected abstract int checkMeasures()
        Check all measure columns, returning the number of measure columns found.
      • makeFactCount

        protected void makeFactCount​(JdbcSchema.Table.Column aggColumn)
        Create a fact count usage for the aggColumn.
      • checkNosMeasures

        protected void checkNosMeasures​(int nosMeasures)
        Make sure there was at least one measure column identified.
      • generateImpliedMeasures

        protected void generateImpliedMeasures()
        An implied measure arises when the base fact table has both a sum and an average measure on a column and the aggregate table has only one of the two. The other measure is implied and can be generated from the existing measure and the fact_count column.

        For each column in the fact table, get its measure usages. If there is both an average and a sum aggregator associated with the column, then iterate over all of the column usages of type measure in the aggregate table. If only one aggregate column usage measure is found and its RolapStar.Measure instance variable is the same as the fact table usage's instance variable, then the other measure is implied and that measure is created for the aggregate table.
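
        The arithmetic behind an implied measure is simple: for one aggregate row, the average is the sum divided by fact_count, and the sum is the average multiplied by fact_count. The sketch below illustrates only that relationship; ImpliedMeasureSketch and its method names are hypothetical and do not reflect how Mondrian generates SQL for these measures.

```java
/** Hypothetical sketch of the sum/average relationship for one aggregate row. */
final class ImpliedMeasureSketch {

    /** Derives the average from a stored sum and the row's fact count. */
    static double impliedAvg(double sum, long factCount) {
        requirePositive(factCount);
        return sum / factCount;
    }

    /** Derives the sum from a stored average and the row's fact count. */
    static double impliedSum(double avg, long factCount) {
        requirePositive(factCount);
        return avg * factCount;
    }

    private static void requirePositive(long factCount) {
        if (factCount <= 0) {
            throw new IllegalArgumentException(
                "fact count must be positive: " + factCount);
        }
    }
}
```

        For instance, an aggregate row with a sum of 30.0 over 3 fact rows has an implied average of 10.0.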

      • makeMeasure

        protected void makeMeasure​(JdbcSchema.Table.Column.Usage factUsage,
                                   JdbcSchema.Table.Column.Usage aggSiblingUsage)
        Here we have the fact usage of either sum or avg and an aggregate usage of the opposite type. We wish to make a new aggregate usage based on the existing usage's column of the same type as the fact usage.
        Parameters:
        factUsage - fact usage
        aggSiblingUsage - existing sibling usage
      • matchForeignKey

        protected abstract int matchForeignKey​(JdbcSchema.Table.Column.Usage factUsage)
        This method determines how many aggregate table columns match the fact table foreign key column, returning the number matched. For each matching column, a foreign key usage is created.
      • checkForeignKeys

        protected java.util.List<JdbcSchema.Table.Column.Usage> checkForeignKeys()
        This method checks the foreign key columns.

        For each foreign key column usage in the fact table, determine how many aggregate table columns match that column usage. If there is more than one match, then that is an error. If there were no matches, then the foreign key usage is added to the list of fact column foreign keys that were not in the aggregate table. This list is returned by this method.

        This matches foreign keys that were not "lost" or "collapsed".

        Returns:
        list of not seen foreign key column usages
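
        The counting logic described for this method can be sketched as below. This is an illustration under stated assumptions: ForeignKeyCheckSketch is hypothetical, columns are represented as plain strings, and matching is done by case-insensitive name equality, whereas the real matching is delegated to matchForeignKey in the derived classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; columns are modeled as plain names. */
final class ForeignKeyCheckSketch {
    final List<String> errors = new ArrayList<>();

    /** Returns the fact table foreign keys that have no aggregate column. */
    List<String> checkForeignKeys(List<String> factForeignKeys,
                                  List<String> aggColumns) {
        List<String> notSeen = new ArrayList<>();
        for (String foreignKey : factForeignKeys) {
            // Assumption: a match is a case-insensitive name match.
            int matches = 0;
            for (String aggColumn : aggColumns) {
                if (aggColumn.equalsIgnoreCase(foreignKey)) {
                    matches++;
                }
            }
            if (matches > 1) {
                errors.add("foreign key " + foreignKey
                    + " matched " + matches + " aggregate columns");
            } else if (matches == 0) {
                // Lost or collapsed: handed on to the level check.
                notSeen.add(foreignKey);
            }
        }
        return notSeen;
    }
}
```

        With fact table foreign keys time_id, product_id and store_id and aggregate columns time_id, store_id and fact_count, the sketch returns a list containing only product_id.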
      • checkLevels

        protected void checkLevels​(java.util.List<JdbcSchema.Table.Column.Usage> notSeenForeignKeys)
        This method identifies those columns in the aggregate table that match "collapsed" dimension columns. Remember that a collapsed dimension is one where the higher levels of some hierarchy are columns in the aggregate table (and all of the lower levels are missing - it has aggregated up to the first existing level).

        Here, we do not start from the fact table; we iterate over each cube. For each of the cube's dimensions, the dimension's hierarchies are iterated over. In turn, each hierarchy's usage is iterated over. If the hierarchy usage's foreign key is not in the list of not seen foreign keys (the notSeenForeignKeys parameter), then that hierarchy is not considered. If the hierarchy usage's foreign key is in the not seen list, then, starting with the hierarchy's top level, it is determined whether the combination of hierarchy, hierarchy usage, and level matches an aggregate table column. If so, a level usage is created for that column and the hierarchy's next level is considered, and so on, until a level is reached for which no aggregate table column matches. Then we continue iterating over the hierarchy usages.

        This check is different. The others mine the fact table usages. This one looks through the dimensions, hierarchies, hierarchy usages, and levels of the fact table's cubes to match up symbolic names for levels. The other checks match on "physical" characteristics (the column name); this one matches on "logical" characteristics.

        Note: Levels should not be created for foreign keys that WERE seen. Currently, this is NOT checked explicitly. For the explicit rules, any extra columns MUST be declared ignored or one gets an error.
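
        The top-down walk over levels can be sketched as follows. This is a simplified illustration, not Mondrian's code: CollapsedLevelSketch is hypothetical, each hierarchy usage is reduced to a foreign key name mapped to its level column names ordered from the top level down, and a level "matches" when a column of the same name exists in the aggregate table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; hierarchies are modeled as ordered level names. */
final class CollapsedLevelSketch {

    /**
     * @param levelsByForeignKey foreign key name to level columns, top level first
     * @param notSeenForeignKeys foreign keys missing from the aggregate table
     * @param aggColumns         column names of the aggregate table
     * @return foreign key name to the levels matched in the aggregate table
     */
    static Map<String, List<String>> checkLevels(
            Map<String, List<String>> levelsByForeignKey,
            List<String> notSeenForeignKeys,
            Set<String> aggColumns) {
        Map<String, List<String>> levelUsages = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> usage
                : levelsByForeignKey.entrySet()) {
            // Only hierarchies whose foreign key was not seen are considered.
            if (!notSeenForeignKeys.contains(usage.getKey())) {
                continue;
            }
            List<String> matched = new ArrayList<>();
            for (String level : usage.getValue()) {
                if (!aggColumns.contains(level)) {
                    break; // stop at the first level without a column
                }
                matched.add(level);
            }
            if (!matched.isEmpty()) {
                levelUsages.put(usage.getKey(), matched);
            }
        }
        return levelUsages;
    }
}
```

        Because the walk stops at the first missing level, a column for a lower level is not picked up when a higher level is absent. The sketch also skips hierarchies whose foreign key was seen, even when a level column of the same name exists in the aggregate table.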

      • inNotSeenForeignKeys

        boolean inNotSeenForeignKeys​(java.lang.String foreignKey,
                                     java.util.List<JdbcSchema.Table.Column.Usage> notSeenForeignKeys)
        Return true if the foreignKey column name is in the list of not seen foreign keys.
      • makeForeignKey

        protected void makeForeignKey​(JdbcSchema.Table.Column.Usage factUsage,
                                      JdbcSchema.Table.Column aggColumn,
                                      java.lang.String rightJoinConditionColumnName)
        Here a foreign key usage is created and the right join condition is explicitly supplied. This is needed when the aggregate table's column names may not match those found in the RolapStar.
      • matchLevels

        protected abstract void matchLevels​(Hierarchy hierarchy,
                                            HierarchyUsage hierarchyUsage)
        Match an aggregate table column given the hierarchy and hierarchy usage.
      • makeLevelColumnUsage

        protected void makeLevelColumnUsage​(JdbcSchema.Table.Column aggColumn,
                                            Hierarchy hierarchy,
                                            HierarchyUsage hierarchyUsage,
                                            java.lang.String factColumnName,
                                            java.lang.String levelColumnName,
                                            java.lang.String symbolicName,
                                            boolean isCollapsed,
                                            RolapLevel rLevel,
                                            JdbcSchema.Table.Column ordinalColumn,
                                            JdbcSchema.Table.Column captionColumn,
                                            java.util.Map<java.lang.String,​JdbcSchema.Table.Column> properties)
        Make a level column usage.

        Note that there is a check in this code. If a given aggregate table column already has a level usage, then that usage must refer to the same hierarchy usage join table and column name as the usage that this call was to create. If there is an existing level usage for the column and it matches something else, then it is an error.
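
        The consistency rule in the note above can be sketched as a small registry. This is an illustration only: LevelUsageSketch and its register method are hypothetical, and a level usage is reduced to the pair of join table name and level column name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the "same join table and column" rule. */
final class LevelUsageSketch {
    // Aggregate column name to "joinTable.levelColumn" of its level usage.
    private final Map<String, String> usageByAggColumn = new HashMap<>();
    final List<String> errors = new ArrayList<>();

    /**
     * Records a level usage for aggColumn. A repeated registration is
     * accepted only if it refers to the same join table and column.
     */
    boolean register(String aggColumn, String joinTable, String levelColumn) {
        String target = joinTable + "." + levelColumn;
        String existing = usageByAggColumn.putIfAbsent(aggColumn, target);
        if (existing != null && !existing.equals(target)) {
            errors.add("column " + aggColumn + " already maps to "
                + existing + ", cannot also map to " + target);
            return false;
        }
        return true;
    }
}
```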

      • checkUnusedColumns

        protected void checkUnusedColumns()
        If everything is ok, issue a warning for each aggTable column that has not been identified as a FACT_COLUMN, MEASURE_COLUMN or LEVEL_COLUMN.
      • convertAggregator

        protected RolapAggregator convertAggregator​(JdbcSchema.Table.Column.Usage aggUsage,
                                                    RolapAggregator factAgg)
        Figure out what aggregator should be associated with a column usage. Generally, this aggregator is simply the RolapAggregator returned by calling the getRollup() method of the fact table column's RolapAggregator. But in the case that the fact table column's RolapAggregator is the "Avg" aggregator, then the special RolapAggregator.AvgFromSum is used.

        Note: this code assumes that the aggregate table does not have an explicit average aggregation column.

      • convertAggregator

        protected RolapAggregator convertAggregator​(JdbcSchema.Table.Column.Usage aggUsage,
                                                    RolapAggregator factAgg,
                                                    RolapAggregator siblingAgg)
        The method chooses a special aggregator for the aggregate table column's usage.
         If the fact table column's aggregator was "Avg":
           then if the sibling aggregator was "Avg":
              the new aggregator is RolapAggregator.AvgFromAvg
           else if the sibling aggregator was "Sum":
              the new aggregator is RolapAggregator.AvgFromSum
         else if the fact table column's aggregator was "Sum":
           if the sibling aggregator was "Avg":
              the new aggregator is RolapAggregator.SumFromAvg
         
        Note that there is no SumFromSum since that is not a special case requiring a special aggregator.

        If no new aggregator was selected, then the rollup aggregator of the fact table's aggregator is used.
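
        The decision table above can be sketched in Java as follows. This is a minimal illustration, not the actual method: AggregatorChoiceSketch and its Agg enum are hypothetical, and the result is returned as a descriptive string naming the RolapAggregator variant from the text rather than a real RolapAggregator instance.

```java
/** Hypothetical sketch of the aggregator selection rules. */
final class AggregatorChoiceSketch {

    /** Simplified stand-in for the kinds of aggregators involved. */
    enum Agg { SUM, AVG, COUNT, MIN, MAX }

    /** Names the aggregator chosen for the new aggregate column usage. */
    static String choose(Agg factAgg, Agg siblingAgg) {
        if (factAgg == Agg.AVG) {
            if (siblingAgg == Agg.AVG) {
                return "AvgFromAvg";
            } else if (siblingAgg == Agg.SUM) {
                return "AvgFromSum";
            }
        } else if (factAgg == Agg.SUM) {
            if (siblingAgg == Agg.AVG) {
                return "SumFromAvg";
            }
        }
        // No special case applies: fall back to the fact aggregator's rollup.
        return "rollup(" + factAgg + ")";
    }
}
```

        Here choose(Agg.SUM, Agg.SUM) falls through to the rollup case, which mirrors the remark that no SumFromSum aggregator is needed.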

      • findCubes

        protected java.util.List<RolapCube> findCubes()
        Finds all cubes that use this fact table.