Data warehousing and its concepts

Introduction
Organizations, businesses, and companies hold many kinds of assets that they regard as valuable. Advances in technology have led these institutions to appreciate information as one of their greatest assets. Because information is so valuable, the underlying data must be stored well and be readily available when needed. Availability has been a problem because the sheer volume of data makes it difficult to extract the important information. There is a big difference between data and information. The response was therefore to design a system that can capture the massive volume of electronic data that has been generated and stored over the years and put it to use in achieving the goals of the institution. That system is the data warehouse.
Data warehousing overview
Corporate and other institutional databases store data on the various tasks carried out within the enterprise. The data is massive, and analyzing it can be challenging, which creates the need for a dedicated architecture that supports analysis beyond what a normal database offers and so improves decision making. Data warehousing complements the ordinary database, with the core goal of supporting the decision-making process. It is regarded as a decision support system model, and research communities and industry bodies have given priority to decision support systems, and to data warehousing in particular (Golfarelli & Rizzi, 2009).
Data warehousing can be defined as a collection of approaches, methods, techniques, and tools that support senior staff (managers, directors, senior managers, and analysts) in analyzing the collected data, which facilitates decision making and improves the organization's information resources. A data warehouse can therefore be described as a subject-oriented, integrated, time-variant, and non-volatile collection of data that helps analysts make informed and sound decisions for the institution. This description makes it easy to derive the features of the concept (Ponniah, 2011).
Understanding Data Warehousing
Describing the characteristics of a data warehouse requires understanding why it is kept separate from the ordinary operational databases. A data warehouse is a database in its own right that is kept separate from the operational database, for reasons given in this discussion. It does not require frequent updating, which makes it easier for the personnel concerned to manage. Its content consolidates the organization's historical data, which helps in analyzing business operations and helps executives organize and understand their business in order to make sound, strategic decisions (DataFactZ, 2014).
Scholars and researchers may ask why the data warehouse is always kept separate from the operational database. One reason is that operational databases are designed for simple, well-known tasks and workloads, whereas a data warehouse is designed for querying, which is a complex task that spans data in many forms. In addition, operational databases support the concurrent processing of transactions and therefore need concurrency control and recovery mechanisms to keep the database consistent and robust. A data warehouse does not require transaction processing, recovery, or concurrency control mechanisms. Another basic reason is that queries against an operational database both read and modify data, while a data warehouse is accessed in a read-only manner. Finally, operational databases hold current data, while a data warehouse maintains historical data (DataFactZ, 2014).
Characteristics of Data Warehousing
The features of a data warehouse follow from its definition and can be summarized as subject-oriented, integrated, time-variant, non-volatile, and data granularity. A data warehouse is subject-oriented because it concentrates on providing information about a specific subject rather than about the organization's operations in general. Unlike an operational database, it does not focus on the ongoing operations of the enterprise but on modeling and analyzing data for decision making. A subject can be sales, products, supplies, and many others. A data warehouse is integrated because it is built by integrating data from various heterogeneous sources, such as flat files and relational databases, to facilitate effective data analysis (Sravani, 2017).
A data warehouse is time-variant because the data it collects is associated with a specific time period. The warehouse therefore provides information that reflects the history of the organization. The data is also kept permanently: existing data in the warehouse is not erased when new data is added, which makes the warehouse non-volatile. Data granularity refers to the level of detail at which data is held. The warehouse keeps data at levels of detail chosen according to the type of data and the queries the system is expected to serve, which makes the data easy to access (Sravani, 2017).
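The time-variant and non-volatile properties can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `SalesFact` and `AppendOnlyWarehouse` names and the sample figures are hypothetical, and a real warehouse would sit on a database rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class SalesFact:
    """One warehouse row (hypothetical); frozen so a loaded row cannot be modified."""
    period: date   # time-variant: every row is tied to a specific period
    product: str
    amount: float


class AppendOnlyWarehouse:
    """Hypothetical in-memory store that only supports adding and reading rows."""

    def __init__(self) -> None:
        self._rows: List[SalesFact] = []

    def load(self, rows: List[SalesFact]) -> None:
        # Non-volatile: new data is appended, existing rows are never erased.
        self._rows.extend(rows)

    def history(self, product: str) -> List[SalesFact]:
        # Read-only access to the full history of one product, oldest first.
        return sorted(
            (r for r in self._rows if r.product == product),
            key=lambda r: r.period,
        )


warehouse = AppendOnlyWarehouse()
warehouse.load([SalesFact(date(2017, 1, 1), "widget", 150.0)])
warehouse.load([SalesFact(date(2016, 1, 1), "widget", 120.0)])
history = warehouse.history("widget")
```

Each load adds rows without touching earlier ones, so `history` returns both periods in chronological order. There is deliberately no update or delete method.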
Concepts and terminologies
Data warehousing describes the entire process of designing and using a data warehouse. A data warehouse integrates data from heterogeneous sources in order to support analytical reporting, ad hoc querying, and decision making. The process involves three critical actions: data cleaning, data integration, and data consolidation. Two approaches are used to integrate heterogeneous databases: the query-driven approach and the update-driven approach. The query-driven approach is the traditional one. When a query is issued at the client site, a metadata dictionary is used to translate it into queries for the individual sources; these are sent to the local query processors, and their results are integrated into a global answer set. This approach is inefficient and expensive. Current data warehouse systems follow the update-driven approach, in which information from the multiple heterogeneous sources is integrated in advance and stored in the warehouse, so that it is available for direct querying and analysis (Rahm & Do, 2000).
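The update-driven approach can be sketched with Python's standard `csv` and `sqlite3` modules. This is a minimal illustration, not a reference design: the two sources, the table and column names, and the sample rows are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a flat file exported by one department.
FLAT_FILE = "customer,amount\nalice,10.5\nbob,4.0\n"

# Source 2 (hypothetical): a relational operational database.
operational = sqlite3.connect(":memory:")
operational.execute("CREATE TABLE orders (cust_name TEXT, total REAL)")
operational.executemany(
    "INSERT INTO orders VALUES (?, ?)", [("alice", 20.0), ("carol", 7.5)]
)

# The warehouse: a separate database with one integrated schema.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (customer TEXT, amount REAL, source TEXT)")


def integrate_in_advance() -> None:
    """Copy both sources into the warehouse before any analyst query runs."""
    flat_rows = [
        (row["customer"], float(row["amount"]), "flat_file")
        for row in csv.DictReader(io.StringIO(FLAT_FILE))
    ]
    db_rows = [
        (name, total, "orders_db")
        for name, total in operational.execute(
            "SELECT cust_name, total FROM orders"
        )
    ]
    warehouse.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)", flat_rows + db_rows
    )
    warehouse.commit()


integrate_in_advance()

# Analysis now queries the warehouse directly; the sources are not contacted.
totals = dict(
    warehouse.execute(
        "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
    ).fetchall()
)
```

A query-driven design would instead translate each analyst query and send it to both sources at query time, which is the cost the update-driven approach avoids.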
A data warehouse performs several functions, which are carried out with different tools and utilities. Data extraction gathers data from multiple different sources. Data cleaning examines the data and detects and corrects any errors found. Data transformation converts data from the legacy format to the warehouse format. Data loading involves sorting, summarizing, checking data integrity, consolidating, and building indices and partitions, while refreshing propagates updates from the data sources to the warehouse.
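The functions described above can be sketched as a small Python pipeline. The record layout, field names, and sample values are hypothetical, and two simplifications are assumed: cleaning drops invalid records instead of correcting them, and loading only appends and sorts.

```python
from typing import Dict, List

# Hypothetical legacy records: amounts are strings, names are inconsistent.
SOURCE_A = [{"cust": " Alice ", "amt": "10.50"}, {"cust": "bob", "amt": "n/a"}]
SOURCE_B = [{"cust": "CAROL", "amt": "7"}]


def extract(*sources: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Extraction: gather records from several sources into one list."""
    return [record for source in sources for record in source]


def clean(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Cleaning (simplified): detect and drop records with a non-numeric amount."""
    cleaned = []
    for record in records:
        try:
            float(record["amt"])
        except ValueError:
            continue  # a fuller pipeline might correct or log the record
        cleaned.append(record)
    return cleaned


def transform(records: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Transformation: convert the legacy layout to the warehouse layout."""
    return [
        {"customer": r["cust"].strip().lower(), "amount": float(r["amt"])}
        for r in records
    ]


def load(warehouse: List[Dict[str, object]], records: List[Dict[str, object]]) -> None:
    """Loading (simplified): append the new rows and keep them sorted by customer."""
    warehouse.extend(records)
    warehouse.sort(key=lambda r: r["customer"])


warehouse: List[Dict[str, object]] = []
load(warehouse, transform(clean(extract(SOURCE_A, SOURCE_B))))
```

Refreshing would amount to running the same pipeline again on records that changed in the sources since the last load.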
Several terms are essential to understanding data warehousing. Metadata, in a data warehouse, is the data that describes the warehouse objects; it acts as a directory that helps the decision support system locate the contents of the warehouse.
A data cube represents data in multiple dimensions and is defined by dimensions and facts. A dimension is an entity with respect to which the organization keeps records. A data mart is a subset of the organization's overall data that relates to a specific group of people in the organization. A data mart therefore contains only the data that reflects the concerns of that group. A marketing data mart, for example, may hold data describing customers, items, and sales, so a data mart is confined to specific subjects.
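As a rough illustration of these terms, the Python sketch below treats each row as a fact with several dimensions, aggregates the fact along chosen dimensions as a data cube would, and derives a marketing data mart as a subset. The rows, the `department` field, and the function names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical rows: dimensions (item, customer, department) and one fact (amount).
FACTS = [
    {"item": "pen", "customer": "alice", "department": "marketing", "amount": 3.0},
    {"item": "pen", "customer": "bob", "department": "marketing", "amount": 2.0},
    {"item": "ink", "customer": "alice", "department": "marketing", "amount": 5.0},
    {"item": "desk", "customer": "carol", "department": "facilities", "amount": 40.0},
]


def aggregate(facts: List[dict], dimensions: List[str]) -> Dict[Tuple[str, ...], float]:
    """Sum the amount for every combination of values of the chosen dimensions."""
    cells: Dict[Tuple[str, ...], float] = defaultdict(float)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        cells[key] += row["amount"]
    return dict(cells)


def data_mart(facts: List[dict], department: str) -> List[dict]:
    """Return the subset of the data that concerns one group in the organization."""
    return [row for row in facts if row["department"] == department]


marketing_mart = data_mart(FACTS, "marketing")
by_item = aggregate(marketing_mart, ["item"])
by_item_and_customer = aggregate(FACTS, ["item", "customer"])
```

Choosing fewer dimensions gives a coarser summary (`by_item`), while choosing more keeps finer detail (`by_item_and_customer`).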

References
DataFactZ. (2014, August 16). Why Data Warehouse Separated from Operational Databases.

Golfarelli, M., & Rizzi, S. (2009). Data warehouse design: Modern principles and methodologies. McGraw-Hill, Inc.

Ponniah, P. (2011). Data warehousing fundamentals for IT professionals. John Wiley & Sons.

Rahm, E., & Do, H. H. (2000). Data cleaning: Problems and current approaches. IEEE Data Eng. Bull., 23(4), 3-13.

Sravani, S. (2017, June 19). Data Warehousing: Characteristics, Functions, Pros & Cons.

Carolyn Morgan is the author of this paper and a senior editor at MeldaResearch.Com.
