Monday, 19 January 2015

understanding dimensions

Understanding Dimensions


Dimensions give context to measures. (measures are the facts or measureable values in the fact tables).
They give specific meaning or context to the facts.
The term dimension refers to the tables.

For example: a fact table contain numbers of “product sold”, a DateDimension Table contain the following Dimensions of date pertaining to the number of product sold

1.       Data and time(12/4/14 11:12:20)
2.       Quarter(4)
3.       DayofYear(321)
4.       Weekday(Thursday)
5.       Week(44)

Each individual column in a dimension Table is and attribute. Attribute usually compress or expand data details. Data can be ‘discretized’ into smaller, summarized group.

By Discretized we mean that if we have a daysOfYear attribure and we were needed to draw a graph based on this data it would be difficult, so we summarized this data into number of weeks of number of month so our graph would decrease from 365 to 12 graphs.
Data can also be ‘drilled into’ for more detailed information like hours of the day, minutes of the hour, seconds etc.


Dimension Table Types;


A Dimension table usually stores more than just attributes, it store data that is not in fact table.
A dimension table can have at least 5 column types
·         Attribute column
o   It gives context to the measures
o   Used by tools to create pivot tables, drill down etc.
o   E.g.  day of the week, hour of the day,
·         Name column
o   Used to make the report data easier to read
o   Provide human readable name to the entities like costumer, product, order etc.
·         Key column
o   Used to uniquely identify entities and establish relationships
·         Member Property column
o   Data included to descriptive use on reports, etc.
o   Like addresses, phone no, descriptions etc.
·         Lineage column
o   Used to store auditing, source info

Slowly Changing Dimensions


Dimension data may change from time to time. For example we have a customer last name, for female , and she got married so her last name got changed. Or a costumer who is living in A city and has been buying products from us for two years and now she went somewhere else, B city for example. With that simple change our data about a costumer can completely change, now we will say that the costumer buy products from B city because we have made changes to the table and this can cause problem for us.

Now for Slowly Changing Dimensions (SCD), we have to solutions.
1.       Type 1
2.       Type 2

Type 1

We use type 1 when we don’t want the past data to be used or accessed. We simply override the data in dimension table. That is the most simplest solution for SCD. It depend upon the business you are dealing with.


Type 2


In this type we use some extra columns in order to separate the two rows in the columns. That is, when we have similar keys for a costumer and we want to maintain them we use a Surrogate key also known as Data Warehouse key in the table. We use when we need to maintain the past data and current data in the data warehouse.

No comments:

Post a Comment