International Journal of Computer Science and Business Informatics
IJCSBI.ORG
An Efficient Connection between
Statistical Software and Database
Management System
Sunghae Jun
Department of Statistics, Cheongju University
Chungbuk 360-764 Korea
ABSTRACT
In the big data era, we need to manipulate and analyze big data. As a first step in big data manipulation, we can consider a traditional database management system. To discover novel knowledge from a big data environment, we must analyze the data. Many statistical methods have been applied to big data analysis, and most statistical analysis depends on statistical software such as SAS, SPSS, or the R project. In addition, a considerable portion of big data is stored in diverse database systems. However, the data types of general statistical software differ from those of database systems such as Oracle or MySQL. Therefore, many approaches to connecting statistical software to a database management system (DBMS) have been introduced. In this paper, we study an efficient connection between statistical software and a DBMS. To demonstrate its performance, we carry out a case study using a real application.
Keywords
Statistical software, Database management system, Big data analysis, Database connection, MySQL, R project.
1. INTRODUCTION
Every day, huge amounts of data are created in diverse fields and stored in computer systems. These big data are extremely large and complex [1], so they are very difficult to manage and analyze. Nevertheless, big data analysis is an important issue in many fields such as marketing, finance, technology, and medicine.
Big data analysis is based on statistics and machine learning algorithms. In addition, data analysis depends on statistical software, and the data are stored in database systems. So, for big data analysis, we should manage statistical software and