Class SegmentCacheManager


  • public class SegmentCacheManager
    extends java.lang.Object
    Active object that maintains the "global cache" (in JVM, but shared between connections using a particular schema) and "external cache" (as implemented by a SegmentCache).

    Segment states

    State   Meaning
    Local   Initial state of a segment

    Decisions to be reviewed

    1. Create variant of actor that processes all requests synchronously, and does not need a thread. This would be a more 'embedded' mode of operation (albeit with worse scale-out).

    2. Move functionality into AggregationManager?

    3. Delete RolapStar.lookupOrCreateAggregation(mondrian.rolap.agg.AggregationKey) and RolapStar.lookupSegment(mondrian.rolap.agg.AggregationKey) and RolapStar.lookupAggregationShared (formerly RolapStar.lookupAggregation).

    Moved methods

    (Keeping track of where methods came from will make it easier to merge to the mondrian-4 code line.)

    1. RolapStar.getCellFromCache(mondrian.rolap.agg.CellRequest, mondrian.rolap.RolapAggregationManager.PinSet) moved from Aggregation.getCellValue

    Done

    1. Obsolete CountingAggregationManager, and property mondrian.rolap.agg.enableCacheHitCounters.

    2. AggregationManager becomes non-singleton.

    3. SegmentCacheWorker methods and segmentCache field become non-static. initCache() is called on construction. SegmentCache is passed into constructor (therefore move ServiceDiscovery into client). AggregationManager (or maybe MondrianServer) is another constructor parameter.

    5. Move SegmentHeader, SegmentBody, ConstrainedColumn into mondrian.spi. Leave behind dependencies on mondrian.rolap.agg. In particular, put code that converts Segment + SegmentWithData to and from SegmentHeader + SegmentBody (e.g. SegmentHeader#forSegment) into a utility class. (Do this as CLEANUP, after functionality is complete?)

    6. Move functionality from Aggregation to Segment. Long-term, Aggregation should not be used as a 'gatekeeper' to Segment. Remove Aggregation fields columns and axes.

    9. Obsolete RolapStar.cacheAggregations. Similar effect will be achieved by removing the 'jvm cache' from the chain of caches.

    10. Rename Aggregation.Axis to SegmentAxis.

    11. Remove Segment.setData and instead split out subclass SegmentWithData. Now segment is immutable. You don't have to wait for its state to change. You wait for a Future<SegmentWithData> to become ready.

    12. Remove methods: RolapCube.checkAggregateModifications, RolapStar.checkAggregateModifications, RolapSchema.checkAggregateModifications, RolapStar.pushAggregateModificationsToGlobalCache, RolapSchema.pushAggregateModificationsToGlobalCache, RolapCube.pushAggregateModificationsToGlobalCache.

    13. Add new implementations of Future: CompletedFuture and SlotFuture.

    14. Remove methods:

    • Remove SegmentLoader.loadSegmentsFromCache - creates a SegmentHeader that has PRECISELY same specification as the requested segment, very unlikely to have a hit
    • Remove SegmentLoader.loadSegmentFromCacheRollup
    • Break up SegmentLoader.cacheSegmentData, and place code that is called after a segment has arrived

    13. Fix flush. Obsolete Aggregation.flush, and RolapStar.flush, which called it.

    18. SegmentCacheManager#locateHeaderBody (and maybe other methods) call SegmentCacheWorker.get(mondrian.spi.SegmentHeader), and that's a slow blocking call. Waits for segment futures should be made from a worker or client, not an agent.
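
    Items 11 and 13 above replace a mutable segment with an immutable one that callers obtain through a Future. The following is a minimal sketch of that idea, not Mondrian's actual code: LoadedSegment and DoneFuture are hypothetical stand-ins for SegmentWithData and CompletedFuture, whose real signatures are not shown on this page.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for SegmentWithData: all state is fixed at construction. */
final class LoadedSegment {
    private final double[] values;

    LoadedSegment(double[] values) {
        this.values = values.clone(); // defensive copy keeps the object immutable
    }

    int cellCount() {
        return values.length;
    }
}

/** Hypothetical future that is already complete when it is created. */
final class DoneFuture<V> implements Future<V> {
    private final V value;
    private final ExecutionException failure;

    private DoneFuture(V value, ExecutionException failure) {
        this.value = value;
        this.failure = failure;
    }

    /** Wraps a value that is already available, e.g. a segment found in cache. */
    static <V> DoneFuture<V> success(V value) {
        return new DoneFuture<>(value, null);
    }

    /** Wraps an error that has already occurred. */
    static <V> DoneFuture<V> failed(Throwable cause) {
        return new DoneFuture<>(null, new ExecutionException(cause));
    }

    @Override public boolean cancel(boolean mayInterruptIfRunning) { return false; }
    @Override public boolean isCancelled() { return false; }
    @Override public boolean isDone() { return true; }

    @Override public V get() throws ExecutionException {
        if (failure != null) {
            throw failure;
        }
        return value;
    }

    @Override public V get(long timeout, TimeUnit unit) throws ExecutionException {
        return get(); // never blocks, so the timeout is irrelevant
    }
}
```

    A caller that receives a Future<LoadedSegment> does not need to know whether the segment was already cached or is still loading; it calls get() in both cases and never observes a half-populated segment.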

    Ideas and tasks

    7. RolapStar.localAggregations and .sharedAggregations. Obsolete sharedAggregations.

    8. Longer term. Move RolapStar.Bar.segmentRefs to Execution. Would it still be thread-local?

    10. Call DataSourceChangeListener.isAggregationChanged(mondrian.rolap.agg.AggregationKey). Previously called from RolapStar.checkAggregateModifications, now never called.

    12. We can quickly identify segments affected by a flush using SegmentCacheIndex.intersectRegion(java.lang.String, mondrian.util.ByteString, java.lang.String, java.lang.String, java.lang.String, mondrian.spi.SegmentColumn[]). But then what? Options:

    1. Option #1. Pull them in, trim them, write them out? But: causes a lot of I/O, and we may never use these segments. Easiest.
    2. Option #2. Mark the segments in the index as needing to be trimmed; trim them when read, and write out again. But: doesn't propagate to other nodes.
    3. Option #3. (Best?) Write a mapping SegmentHeader->Restrictions into the cache. Less I/O than #1. Method "SegmentCache.addRestriction(SegmentHeader, CacheRegion)"

    14. Move AggregationManager.getCellFromCache(mondrian.rolap.agg.CellRequest) somewhere else. It's concerned with local segments, not the global/external cache.

    15. Method to convert SegmentHeader + SegmentBody to Segment + SegmentWithData is imperfect. Cannot parse predicates, compound predicates. Need mapping in star to do it properly and efficiently? SegmentBuilder.SegmentConverter is a hack that can be removed when this is fixed. See SegmentBuilder.toSegment(mondrian.spi.SegmentHeader, mondrian.rolap.RolapStar, mondrian.rolap.BitKey, mondrian.rolap.RolapStar.Column[], mondrian.rolap.RolapStar.Measure, java.util.List<mondrian.rolap.StarPredicate>). Also see #20.

    17. Revisit the strategy for finding segments that can be copied from global and external cache into local cache. The strategy of sending N CellRequests at a time, then executing SQL to fill in the gaps, is flawed. We need to maximize N in order to reduce segment fragmentation, but if N is too high, we blow memory. BasicQueryTest.testAnalysis is an example of this. Instead, we should send cell-requests in batches (is ~1000 the right size?), identify those that can be answered from global or external cache, return those segments, but not execute SQL until the end of the phase. If so, CellRequestQuantumExceededException can be obsoleted.

    19. Tracing. a. Remove or re-purpose FastBatchingCellReader.pendingCount; b. Add counter to measure requests satisfied by calling peek(mondrian.rolap.agg.CellRequest).

    20. Obsolete SegmentDataset and its implementing classes. SegmentWithData can use SegmentBody instead. Will save copying.

    21. Obsolete CombiningGenerator.

    22. SegmentHeader.constrain(mondrian.spi.SegmentColumn[]) is broken for N-dimensional regions where N > 1. Each call currently creates N more 1-dimensional regions, but should create 1 more N-dimensional region. SegmentHeader.excludedRegions should be a list of SegmentColumn arrays.

    23. All code that calls Future.get() should probably handle CancellationException.

    24. Obsolete handler. Indirection doesn't win anything.
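
    As an illustration of item 23 above, here is one way a caller of Future.get() might handle CancellationException alongside the two checked exceptions. This is a sketch only: FutureGetDemo and getOrDefault are hypothetical names, and returning a fallback value on cancellation is an assumed policy rather than anything this class prescribes.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

final class FutureGetDemo {
    /**
     * Waits for the future and returns its value, or the fallback if the
     * task was cancelled or the waiting thread was interrupted.
     */
    static <V> V getOrDefault(Future<V> future, V fallback) {
        try {
            return future.get();
        } catch (CancellationException e) {
            // Unchecked, so the compiler does not force callers to handle it.
            return fallback;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so code further up can react to it.
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            // The task itself failed; surface the underlying cause.
            throw new IllegalStateException("segment load failed", e.getCause());
        }
    }
}
```

    Because CancellationException is a RuntimeException, a call site that catches only InterruptedException and ExecutionException compiles cleanly and then fails at run time when a load is cancelled.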

    Author:
    jhyde
    • Field Detail

      • thread

        public final java.lang.Thread thread
      • cacheExecutor

        public final java.util.concurrent.ExecutorService cacheExecutor
        Executor with which to send requests to external caches.
      • sqlExecutor

        public final java.util.concurrent.ExecutorService sqlExecutor
        Executor with which to execute SQL requests.

        TODO: create using factory and/or configuration parameters. Executor should be shared within MondrianServer or target JDBC database.

      • segmentCacheWorkers

        public final java.util.List<SegmentCacheWorker> segmentCacheWorkers
    • Constructor Detail

      • SegmentCacheManager

        public SegmentCacheManager(MondrianServer server)
    • Method Detail

      • loadSucceeded

        public void loadSucceeded(RolapStar star,
                                  SegmentHeader header,
                                  SegmentBody body)
        Adds a segment to segment index.

        Called when a SQL statement has finished loading a segment.

        Does not add the segment to the external cache. That is a potentially long-duration operation, better carried out by a worker.

        Parameters:
        header - segment header
        body - segment body
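
        The split described above (update the index immediately, leave the slow external-cache write to a worker) can be sketched as follows. All names here are hypothetical: the two maps stand in for the segment index and a SegmentCache, strings and byte arrays stand in for SegmentHeader and SegmentBody, and the single-threaded executor is an assumption, not the configuration Mondrian uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class DeferredCacheWriteDemo {
    // Stand-in for the in-JVM segment index.
    final Map<String, byte[]> index = new ConcurrentHashMap<>();
    // Stand-in for an external cache that may be slow to write to.
    final Map<String, byte[]> externalCache = new ConcurrentHashMap<>();
    // Worker thread for external-cache traffic.
    private final ExecutorService cacheExecutor = Executors.newSingleThreadExecutor();

    /** Fast path: only the index is touched. */
    void loadSucceeded(String header, byte[] body) {
        index.put(header, body);
    }

    /** Slow path: runs on the worker; the caller decides whether to wait. */
    Future<?> writeToExternalCache(String header) {
        byte[] body = index.get(header);
        return cacheExecutor.submit(() -> {
            externalCache.put(header, body);
        });
    }

    void shutdown() {
        cacheExecutor.shutdown();
    }
}
```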
      • loadFailed

        public void loadFailed(RolapStar star,
                               SegmentHeader header,
                               java.lang.Throwable throwable)
        Informs cache manager that a segment load failed.

        Called when a SQL statement receives an error while loading a segment.

        Parameters:
        header - segment header
        throwable - Error
      • remove

        public void remove(RolapStar star,
                           SegmentHeader header)
        Removes a segment from segment index.

        Call is asynchronous. It comes back immediately.

        Does not remove it from the external cache.

        Parameters:
        header - segment header
      • externalSegmentCreated

        public void externalSegmentCreated(SegmentHeader header,
                                           MondrianServer server)
        Tells the cache that a segment is newly available in an external cache.
      • externalSegmentDeleted

        public void externalSegmentDeleted(SegmentHeader header,
                                           MondrianServer server)
        Tells the cache that a segment is no longer available in an external cache.
      • shutdown

        public void shutdown()
        Shuts down this cache manager and all active threads and indexes.
      • peek

        public SegmentWithData peek(CellRequest request)
        Makes a quick request to the aggregation manager to see whether the cell value required by a particular cell request is in external cache.

        'Quick' is relative. It is an asynchronous request (due to the aggregation manager being an actor) and therefore somewhat slow. If the segment is in cache, the call saves batching up future requests and re-executing the query. The win should be particularly noticeable for queries running on a populated cache. Without this feature, every query would require at least two iterations.

        Request does not issue SQL to populate the segment. Nor does it try to find existing segments for rollup. Those operations can wait until next phase.

        Client is responsible for adding the segment to its private cache.

        Parameters:
        request - Cell request
        Returns:
        Segment with data, or null if not in cache
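
        The contract above (a null return means "not in cache", and the client stores any hit in its own private cache) suggests a calling pattern along these lines. This is a hedged sketch: PeekDemo, its fields, and the Function used in place of peek are hypothetical, and strings stand in for CellRequest and SegmentWithData.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class PeekDemo {
    // Client-side private cache, keyed by request.
    final Map<String, String> privateCache = new HashMap<>();
    // Requests that must wait for the next phase (SQL or rollup).
    final List<String> pendingRequests = new ArrayList<>();

    /**
     * Tries the cache first. On a hit the client keeps the segment itself;
     * on a miss the request is queued for the next phase.
     *
     * @param peek stand-in for a peek call; returns null on a miss
     * @return true if the request was satisfied from cache
     */
    boolean resolve(String request, Function<String, String> peek) {
        String segment = peek.apply(request);
        if (segment != null) {
            privateCache.put(request, segment);
            return true;
        }
        pendingRequests.add(request);
        return false;
    }
}
```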