Data Warehouse definition- What is Data Warehouse?  

Data Warehouse Overview

This chapter is setting the scene. It provides the end to end high level landscape of Data Warehouse. Much of this chapter talks on how Data Warehouse is different from Transaction systems, and what are we in for. Lets Look at What & why of Data Warehouse, its components & framework and what are the challenges for a typical DW project.\n

Data Warehouse definition- What is Data Warehouse?

Data Warehouse is repository of Data picked from Transaction systems, and filtered and transformed to make it available for data analysis reporting.

Data-Warehouse is a repository of Data, which can provide most OR all of the Data and information requirements of an enterprise. It means that it pulls data from all the production and other sources. Once the data is pulled onto an offline staging area, it is cleansed, transformed and loaded into a sanitized, uniform and well-organized manner so that you can run queries, reports and all kind of analysis on the data.

What is not Data Warehouse?

Data Warehouse typically does not provide the online information, as mostly the data Extraction and consolidation happens in the end of day batch-processing. For online, transaction based queries, the OLTP (Online Transaction Processing systems) is used.

Data Warehouse is not business intelligence OR OLAP. It is a repository of sanitized and consolidated data, which can be used for any purpose including Business Intelligence. The usage can be for transaction reports, data mining, Data Analysis, Statistical forecasting, valuation systems OR so on. Technically speaking, we can have a good data warehouse without good business intelligence (which is a combination of Data Analysis +Data Mining+ Performance Management Reporting..).

Data Warehouse vs. Operational Data Store (ODS)- They are different

Data Warehouse is an 'offline' integration of data, whereas Operational Data Store is an 'online' integration of data. ODS is used, when the data at a transaction (processing as well as querying) level is dispersed across various systems, and one needs to bring it together on online basis.

For example- Let us say that you want to have a single view of customer to be used by customer service, whereby they can also update the data in that single view online basis. However, the data on the customer (OPD Records,Hospitalization records, diagnostic records, pharmaceutical purchase records..) is lying in different databases. ODS could be a good choice.

The above-said concept of ODS is an ideal one. Another option for an Operational Data Store is to be used for online queries, but the information, which it provides is not real-time, but pertaining to last End of Day. For exampleyou have a single customer view, but it does not include the transactions which the customer has done today. For this kind of need, some times even Data Warehouse repository can also used for this purpose.

Data Warehouse vs. Data Mart in Business Intelligence

Data Warehouse is at an enterprise level repository, which is having a combination of various data marts. Data Warehouse carries all the dimensions and measures required and it ensures the integrity of same dimensions and measures across different data marts. Data mart is a limited set of dimensions and measures used for specific business theme. They are populated out of the Data-Warehouse Data Sets.

Typically an organization's business intelligence agenda starts with few data marts, before maturing to a full-blown data warehouse. However,most of the design & development concept apply equally to a Data-mart.

Please refer De-Normalized Data Warehouse/Data Mart for detailed comparison between a Data Warehouse and Data Marts.