Dates as Data Warehouse Facts - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

Dates as Data Warehouse Facts

Author

Topic

indy2005
Starting Member

12 Posts

Posted - 2007-07-19 : 16:23:46

Hi,

I am looking at designing a data warehouse to manage clinical trial data. This will include obvious facts such as patient recruitment, but also project management facts such as forecast cost and resource, as well as actual cost and resource. While these are additive facts so should be fairly simple to model, we also deal with timelines. i.e. each project and each clinical trial has key milestone dates along its timeline. These are obviously not additive, but they are critical facts within each trial and include planned, baselined and actual dates. Has anyone had any experience modelling dates as facts - or on building a data warehouse on project management / schedule type information. Any advice appreciated.

Regards

Indy

nr
SQLTeam MVY

12543 Posts

Posted - 2007-07-24 : 12:25:04

What do you want the fact to be? Just the date or the date and other properties.
Nothing to stop you putting a date in a fact table - it's often added as a key to a date dimension but it depends what you want to do with it. The date dimension will give you attributes for the date (year, quarter, financial month, ...) but if you don't need any of those you can just add as a date to the fact table.
You can also have a fact table which just records state chages (again possible using the date dimension) and nothing else.
Depends on what the datamart is for.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

indy2005
Starting Member

12 Posts

Posted - 2007-07-28 : 17:29:33

We record milestone dates for clinical trials across 1000s of studies.
We have milestones such as First Subject In, Last Subject In, Last Subject Last Visit, Clinical Study Report Created, Clinical Protocol Created. The main dimension will be the "study", and we need to be able to:

1) report milestone dates
2) archive off the milestones, so we can see how the milestones have changed over time
3) determine the intervals (in days and working days) between various milestones

The problem with dates as facts is they are not additive measures. Event the intervals are not additive. We may want to find median interval times (to see the median time it takes across the portfolio to setup studies, treat patients, create reports etc.).

In all my reading of data warehouse schemas, dates are always a dimension, but I work in project management, where dates are often facts - i.e. milestones, schedule slippages etc.

Regards

Indy

nr
SQLTeam MVY

12543 Posts

Posted - 2007-07-28 : 19:12:09

>> The problem with dates as facts is they are not additive measures.
What's the problem with that?

It sounds like you might need to build a different datamart as you seem to be trying to use one for two different reporting purposes. Maybe you need one to record the changes in these dates.

>> dates are always a dimension
There might be a date dimension but that doesn't stop you having a date on a fact table (think of it as a degenerate dimension in komball terms). But why can't you have a date dimension key?

1) Is this the predicted dates or those that have been attained - the formed sounds like a dimension, the latter a factless fact table.

2) Could be a slowly changing dimension or could be a factless fact table if you consider the change of a milestone date an event.

3) A client calculation?

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

indy2005
Starting Member

12 Posts

Posted - 2007-07-29 : 17:57:26

Hi,

Thanks for the response. The dates will be planned dates, actual dates when they become available and even baseline dates.

The dates themselves need to be reported, and I agree that this table should drive a separate set of reporting tables. If you imagine we archive off the dates monthly and need to perform the following types of calculations:

1) How has the forecast for this date changed over time?

2) How did the actual date differ from the planned date (i.e. how well did we forecast originally for this date)

3) What are the intervals between key milestones, in days and working days

Things are further complicated due to data quality issues. We dont always have all the dates available and so make assumptions. When calculating intervals we use actuals dates where available as they are more "real", but if an actual date isnt present we use a planned date. If there is no date available, we may find an alternative milestone we can use. This can all happen in the ETL layer I guess.

If you had to know how a data point has changed over time, how best to do it?

I plan to create an archive each month which is the entire data set, labelled with the date the archive was taken. i.e. to add an Extract dimension. So the same study, same milestone will exist many times in the fact table....once for each point in time a snapshot was taken. The dates may be the same in each extract, but where there hass been a change...some form of variance can be calculated.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources