jump to navigation

What is the difference b/w OLAP & OLTP, why do we keep warehouse on different server then of OLAP? April 24, 2007

Posted by Muhammad Habib in Dataware Housing.

OLTP databases provide real time accesses to its data which is being updated by other transactions. OLTP databases contrasts earlier databases that allowed updation in batch modes as all updations were provided in one go.(database Replication can be considered so).

In case of OLTP there is no such boundation and all updations can be done anytime and other users will automatically see your changes.

OLAP provides analytical processing capabilities on an OLTP database. This provides a real time analytical view of the database. Historically OLAP suffered since it would slow down OLTP database considerably. However with faster h/w and better algorithms, it’s now practical.

Data Warehouses contain historic & non operational data that is not required for day to day working of the operations. For example you purchased a pair of Shoes from a branded store 10 years ago. Chances that you will come back reclaiming your money are very less ;-).

Such data is transferred to special database known as Warehouse. If it would have being in the OLTP database then it would have consumed storage and WORSE will make transaction processing very very slow. Since best database algo’s are log(n) where n is the amount of data. As ‘n’ grows so is the time complexity.

Hence it makes sense to keep such historic data separate for analysis purposes.


ETL Concepts March 20, 2007

Posted by Muhammad Habib in Dataware Housing.

Extraction, transformation, and loading. ETL refers to the methods involved in accessing and manipulating source data and loading it into target database.

The first step in ETL process is mapping the data between source systems and target database(data warehouse or data mart). The second step is cleansing of source data in staging area. The third step is transforming cleansed source data and then loading into the target system.

Note that ETT (extraction, transformation, transportation) and ETM (extraction, transformation, move) are sometimes used instead of ETL.

Glossary of ETL (Reference:www. Oracle.com)

Source System
A database, application, file, or other storage facility from which the data in a data warehouse is derived.

The definition of the relationship and data flow between source and target objects.

Data that describes data and other structures, such as objects, business rules, and processes. For example, the schema design of a data warehouse is typically stored in a repository as metadata, which is used to generate scripts used to build and populate the data warehouse. A repository contains metadata.

Staging Area
A place where data is processed before entering the warehouse.

The process of resolving inconsistencies and fixing the anomalies in source data, typically as part of the ETL process.

The process of manipulating data. Any manipulation beyond copying is a transformation. Examples include cleansing, aggregating, and integrating data from multiple sources.

The process of moving copied or transformed data from a source to a data warehouse.

Target System
A database, application, file, or other storage facility to which the “transformed source data” is loaded in a data warehouse.