Understand Data Warehouse concept at High-Level in 4 mins.
Every organization needs to take decision on the basis of data. It would be easy to start taking decisions if there would be a pictorial representation of the data in a more meaningful way. They derive the meaningful information from all the data that company have will require creating a report on top of the data. One cannot take these business decisions if the data is not sorted/cleaned/concised/stayed relevant.
Consider a scenario in a company that is going to launch a new product on the basis of the popularity of its existing products and also to attract the investors on their new product.
- How many products did we sell last quarter?
- Did our latest product meet the expectations?
- What is the growth of company in each region?
- Did we over produce?
To answer all these in a better way, we need to have technology in place to keep the data relevant and upto date for the business. Many organisations have their own Data Warehouse on premise or on cloud which keeps track of all the historical data and fresh data and helps them take business decisions on the basis of the fact very quick and concise.
Data warehouse = database used and optimized for analytical purposes.
https://sapient.udemy.com/course/data-warehouse-the-ultimate-guide
Sources for data warehouse:
Data Warehouse brings the relevant data from different system or tool including database/excel/document/file system through ETL (Extract, transform and Load) process which basically makes all of these data into one supported formated.
Data Warehouse acts as a centralised place for the data and it is ready to consume by any of the reporting tools or analytical systems. Usually the data is kept ready for the other business users or teams to analyse or use. We create data warehouse for business intelligence
What is business Intelligence?
Business intelligence is a Strategy or technology that use to produce meaningful information from the transformed data that is stored or curated in Data Warehouse(centralized location). We can then use the data to do data analysis such as data mining , data visualization or reporting and predictive analysis.
Data Warehouse vs Data Lake?
It is true that they both act as a centralized location of the data. But there are differences between a Data Warehouse and Data Lake.
Data WareHouse :
- Stores processed/transformed data.
- Uses Databases store data in tables.
- Structured.
- Specific use case and ready to use.
Data Lake:
- Usually store raw data from sources with/without transformed it.
- Uses big data technologies.
- Unstructured.
- Use case not defined yet. immediate use or future use of data.
Data Warehouse Architecture
Data Warehouse consists of several layers. We know that we have our data sources using ETL(Extract, Load and transformation) process to load the data from data sources with some transformation into our Data Warehouse. But lets look at closer to better understand the Data Warehouse.
Layers :
- Data sources — Difference sources system that contains company data.
- Staging Area ( Not much of transformation done here expect for keeping all the data in single format that is in tabular format)
- Core Layer/Data Warehouse layer (ETL process to load the data from staging Area post transformations) — Most of the time, this is the area from which the data is pulled and used for the analytics.
- Specific Mart layer ( data mart top of data warehouse that are relevant for one specific use case), Example Sales Mart, Production Mart and Inventory Mart.
- User Area — business intelligence tools and reporting systems that consumes the data and produce meaningful reports.
Thank you for reading this article. I made it very simple and understandable for anyone to gain high level knowledge of how a typical Data Warehouse work. I will be adding more articles that will help you to understand the Data Warehouse indepth.
Please follow and like. Cheers!!!!