While modeling a data warehouse with an Analysis Services OLAP cube for a customer I ran into a problem with a "special" fact table. The customer wanted to do the following tasks with his cost controlling system:
1. Forecasting fulltime equivalents, head counts and costs of employees for the next 12 months on a monthly basis.
2. Displaying the forecast history on a daily granularity.
Full time equivalent (FTE) means the amount of work an employee can do. If he has a working capacity of 100% he does 1.0 FTE. If he only works the half day, he does only 0.5 FTE.
Head counts are simply the amount of employees the customer has, not regarding the working capacity.
And costs are monthly salaries and further costs paid for an employee.
Because forecasts are a fact to change, we should actualize the forecasts stored in the fact table. But regarding the second requirement, we cannot simply change or delete the records in our fact table, because if we do that, we will lose our history. So there is a need to implement this fact table as a kind of “slowly changing fact” table, in which old forecasts will stay in the table but marked as “old”.
To design an OLAP cube that is as fast as possible we should use facts that can be aggregated and as less as possible calculated measures. With our “slowly changing fact” table we must have selection logic to only select the needed members, because we cannot simply summarize or count old and current records. This can only be done implementing this logic in a calculated measure.
To avoid a performance bottleneck and using pre-aggregated measures, I used a special fact table design strategy called “Compensation Record Strategy”. This strategy allows to easily pre-aggregate all measures of this fact table, and to provide access to historical forecasts.
The compensation record strategy works with the principle to eliminate a whole fact record without deleting it before adding the current one. The following table shows a sample of that:
| Employee | Submission Date | FTE | Head Count | Cost |
| Jon | Jan 01, 2007 | 1.0 | 1 | 3000 |
| Jon | Jan 06, 2007 | -1.0 | -1 | -3000 |
| Jon | Jan 06, 2007 | 0.5 | 1 | 1500 |
| Jon | Jan 15, 2007 | -0.5 | -1 | -1500 |
| Jon | Jan 15, 2007 | 0.8 | 1 | 2400 |
The first record says at January the first, that Jon works with 1.0 FTE for a salary of 3000. On January the 6th a correction was done, that says that Jon now only works for the half day. To do that, we just eliminate the record from January the first doing a kind of contra entry, which leads to zero summarizing the first two records. Immediately after that, we will just enter a new record with the new facts. Summarizing all three records from January 1st and January 6th leads to the result of 0.5.
On January 15th Jon’s employer decided to let him work for a working capacity of 80%. We compensated the old facts using a further compensation record and adding the new fact record to the table.
Conclusion
Implementing a changing fact table which allows access to historical data isn’t as easy as it looks like, if you keep performance requirements in mind. To address that you can use the described compensation record strategy which allows using the full performance of pre-aggregated measures in OLAP cubes. If you decide to use that modeling strategy you have to consider that the ETL logic will be more complicated and needs more development effort.
Technorati Tags: Data Warehouse, Data Modelling, Database, Analysis Services, OLAP