Mustafa Man, W. Aezwani W.A. Bakar, Noraida Hj. Ali and Masita Abd. Jalil
Department of Computer Science
School of Informatics and Applied Mathematics
Universiti Malaysia Terengganu
21030 Kuala Terengganu, Terengganu.
ABSTRACT. Data integration remains an open problem, especially when unstructured data of multiple types and formats must be combined. This paper introduces a new model for integrating multiple types of heterogeneous data, applied to a case study of mud crabs in Setiu Wetland (SW). The Hybrid Federated Data Warehouse (HyFeDWare) model combines two approaches: the data warehouse and the federated database. Simulation results show that the processing time for integrating unstructured biodiversity data on mud crabs is less than 2 seconds for 12 rows of 7 MB data. In general, the model could be used to integrate data of any type and format in a distributed environment.
KEYWORDS. Data Integration, Data Warehouse, Federated Database, Distributed Environment.