Year: 2019, Volume: 12, Issue: 47, Pages: 1-9

Original Article

Restructuring Loosely Structured Databases for Generating Local Statistics


Objective: Local-level statistics are generated from different data sources for planning and decision-making. A strategy is needed to restructure existing databases so that an integrated database can be constructed for generating these statistics.
Methods/analysis: To achieve this objective, the data structures of existing nation-wide surveys and censuses were studied, and data warehousing techniques useful for combining these data sources into an integrated database were identified. The steps undertaken to build the integrated database were consolidated to formulate the needed strategy. To illustrate the validity of the strategy, Philippine nation-wide survey and census databases were combined, and local-level statistics were generated for exploratory data analysis.
Findings: The study identified the features of existing databases that were used for restructuring the databases and building an integrated database. It also identified the lowest-level unit of observation to which data could be disaggregated, which also serves as the basis for combining different data sources. From the lessons learned in using some of the Philippine nation-wide surveys and census, the study developed a strategy to access and combine different nation-wide databases that are physically separate but related to each other in terms of their data architecture. This led to an integrated database that can serve as a source of statistics useful in exploratory data analysis, the results of which are useful for planning and decision-making. Furthermore, the study identified ways to warehouse the generated statistics across time and space for future data analysis; these were also incorporated into the developed strategy.
Novelty/improvement: The formulated strategy, which combines different databases into an integrated database as a source for generating local statistics, can be scaled to additional databases to yield a richer set of generated statistics.
Keywords: Statistical Computing, Data Warehousing, Statistical Integrated Database, Combined Data Sources, Local-level Statistics
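
The core idea in the abstract, combining separate sources on a shared lowest-level unit of observation and then generating local statistics, can be illustrated with a minimal sketch. It is not the study's implementation: the field names (`area_code`, `population`, `household_income`), the sample values, and the `integrate` helper are all hypothetical, chosen only to show the shape of the operation.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only
# and are not taken from the Philippine surveys or census.
census_rows = [
    {"area_code": "A01", "population": 1200},
    {"area_code": "A02", "population": 800},
    {"area_code": "A03", "population": 500},
]
survey_rows = [
    {"area_code": "A01", "household_income": 100.0},
    {"area_code": "A01", "household_income": 300.0},
    {"area_code": "A02", "household_income": 250.0},
]


def integrate(census, survey, key="area_code"):
    """Combine two sources on a shared unit-of-observation key (assumed)."""
    # Group survey values by the shared key.
    incomes = defaultdict(list)
    for row in survey:
        incomes[row[key]].append(row["household_income"])

    # Attach survey-derived statistics to each census unit.
    integrated = {}
    for row in census:
        values = incomes.get(row[key], [])
        integrated[row[key]] = {
            "population": row["population"],
            "surveyed_households": len(values),
            # None marks units that the survey did not cover.
            "mean_household_income": sum(values) / len(values) if values else None,
        }
    return integrated


local_stats = integrate(census_rows, survey_rows)
```

Keying every source on the same unit identifier is what lets further databases be added later: each new source only needs to be grouped by that key before its statistics are attached.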


