Premium Essay

Bigdata Etl

In:

Submitted By somasekhar
Words 6174
Pages 25
White Paper
Big Data Analytics

Extract, Transform, and Load
Big Data with Apache Hadoop*
ABSTRACT
Over the last few years, organizations across public and private sectors have made a strategic decision to turn big data into competitive advantage. The challenge of extracting value from big data is similar in many ways to the age-old problem of distilling business intelligence from transactional data. At the heart of this challenge is the process used to extract data from multiple sources, transform it to fit your analytical needs, and load it into a data warehouse for subsequent analysis, a process known as “Extract, Transform & Load” (ETL). The nature of big data requires that the infrastructure for this process can scale cost-effectively. Apache Hadoop* has emerged as the de facto standard for managing big data. This whitepaper examines some of the platform hardware and software considerations in using Hadoop for ETL.
–  e plan to publish other white papers that show how a platform based on Apache Hadoop can be extended to
W
support interactive queries and real-time predictive analytics. When complete, these white papers will be available at http://hadoop.intel.com.

Abstract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

The ETL Bottleneck in Big Data Analytics

The ETL Bottleneck in Big Data Analytics. . . . . . . . . . . . . . . . . . . . . . 1

Big Data refers to the large amounts, at least terabytes, of poly-structured data that flows continuously through and around organizations, including video, text, sensor logs, and transactional records. The business benefits of analyzing this data can be significant. According to a recent study by the MIT Sloan School of Management, organizations that use analytics are twice as likely to be top performers in their industry as those that don’t.1

Apache Hadoop for

Similar Documents

Premium Essay

Cisco Case Study

...Cisco IT Case Study – August 2013 Big Data Analytics How Cisco IT Built Big Data Platform to Transform Data Management EXECUTIVE SUMMARY CHALLENGE ● Unlock the business value of large data sets, including structured and unstructured information ● Provide service-level agreements (SLAs) for internal customers using big data analytics services ● Support multiple internal users on same platform SOLUTION ● Implemented enterprise Hadoop platform on Cisco UCS CPA for Big Data - a complete infrastructure solution including compute, storage, connectivity and unified management ● Automated job scheduling and process orchestration using Cisco Tidal Enterprise Scheduler as alternative to Oozie RESULTS ● Analyzed service sales opportunities in one-tenth the time, at one-tenth the cost ● $40 million in incremental service bookings in the current fiscal year as a result of this initiative ● Implemented a multi-tenant enterprise platform while delivering immediate business value LESSONS LEARNED ● Cisco UCS can reduce complexity, improves agility, and radically improves cost of ownership for Hadoop based applications ● Library of Hive and Pig user-defined functions (UDF) increases developer productivity. ● Cisco TES simplifies job scheduling and process orchestration ● Build internal Hadoop skills ● Educate internal users about opportunities to use big data analytics to improve data processing and decision making NEXT STEPS ● Enable NoSQL Database and advanced...

Words: 3053 - Pages: 13

Premium Essay

Bpcl

...SPECIAL ISSUE: BUSINESS INTELLIGENCE RESEARCH BUSINESS INTELLIGENCE AND ANALYTICS: FROM BIG DATA TO BIG IMPACT Hsinchun Chen Eller College of Management, University of Arizona, Tucson, AZ 85721 U.S.A. {hchen@eller.arizona.edu} Roger H. L. Chiang Carl H. Lindner College of Business, University of Cincinnati, Cincinnati, OH 45221-0211 U.S.A. {chianghl@ucmail.uc.edu} Veda C. Storey J. Mack Robinson College of Business, Georgia State University, Atlanta, GA 30302-4015 U.S.A. {vstorey@gsu.edu} Business intelligence and analytics (BI&A) has emerged as an important area of study for both practitioners and researchers, reflecting the magnitude and impact of data-related problems to be solved in contemporary business organizations. This introduction to the MIS Quarterly Special Issue on Business Intelligence Research first provides a framework that identifies the evolution, applications, and emerging research areas of BI&A. BI&A 1.0, BI&A 2.0, and BI&A 3.0 are defined and described in terms of their key characteristics and capabilities. Current research in BI&A is analyzed and challenges and opportunities associated with BI&A research and education are identified. We also report a bibliometric study of critical BI&A publications, researchers, and research topics based on more than a decade of related academic and industry publications. Finally, the six articles that comprise this special issue are introduced and characterized in terms of the proposed BI&A research...

Words: 16335 - Pages: 66

Premium Essay

Business Intelligence

...SPECIAL ISSUE: BUSINESS INTELLIGENCE RESEARCH BUSINESS INTELLIGENCE AND ANALYTICS: FROM BIG DATA TO BIG IMPACT Hsinchun Chen Eller College of Management, University of Arizona, Tucson, AZ 85721 U.S.A. {hchen@eller.arizona.edu} Roger H. L. Chiang Carl H. Lindner College of Business, University of Cincinnati, Cincinnati, OH 45221-0211 U.S.A. {chianghl@ucmail.uc.edu} Veda C. Storey J. Mack Robinson College of Business, Georgia State University, Atlanta, GA 30302-4015 U.S.A. {vstorey@gsu.edu} Business intelligence and analytics (BI&A) has emerged as an important area of study for both practitioners and researchers, reflecting the magnitude and impact of data-related problems to be solved in contemporary business organizations. This introduction to the MIS Quarterly Special Issue on Business Intelligence Research first provides a framework that identifies the evolution, applications, and emerging research areas of BI&A. BI&A 1.0, BI&A 2.0, and BI&A 3.0 are defined and described in terms of their key characteristics and capabilities. Current research in BI&A is analyzed and challenges and opportunities associated with BI&A research and education are identified. We also report a bibliometric study of critical BI&A publications, researchers, and research topics based on more than a decade of related academic and industry publications. Finally, the six articles that comprise this special issue are introduced and characterized in terms of the proposed BI&A research framework. Keywords:...

Words: 16335 - Pages: 66

Free Essay

Big Data

...McKinsey Global Institute June 2011 Big data: The next frontier for innovation, competition, and productivity The McKinsey Global Institute The McKinsey Global Institute (MGI), established in 1990, is McKinsey & Company’s business and economics research arm. MGI’s mission is to help leaders in the commercial, public, and social sectors develop a deeper understanding of the evolution of the global economy and to provide a fact base that contributes to decision making on critical management and policy issues. MGI research combines two disciplines: economics and management. Economists often have limited access to the practical problems facing senior managers, while senior managers often lack the time and incentive to look beyond their own industry to the larger issues of the global economy. By integrating these perspectives, MGI is able to gain insights into the microeconomic underpinnings of the long-term macroeconomic trends affecting business strategy and policy making. For nearly two decades, MGI has utilized this “micro-to-macro” approach in research covering more than 20 countries and 30 industry sectors. MGI’s current research agenda focuses on three broad areas: productivity, competitiveness, and growth; the evolution of global financial markets; and the economic impact of technology. Recent research has examined a program of reform to bolster growth and renewal in Europe and the United States through accelerated productivity growth; Africa’s economic potential;...

Words: 60035 - Pages: 241

Free Essay

Mcda Analysis

...McKinsey Global Institute June 2011 Big data: The next frontier for innovation, competition, and productivity The McKinsey Global Institute The McKinsey Global Institute (MGI), established in 1990, is McKinsey & Company’s business and economics research arm. MGI’s mission is to help leaders in the commercial, public, and social sectors develop a deeper understanding of the evolution of the global economy and to provide a fact base that contributes to decision making on critical management and policy issues. MGI research combines two disciplines: economics and management. Economists often have limited access to the practical problems facing senior managers, while senior managers often lack the time and incentive to look beyond their own industry to the larger issues of the global economy. By integrating these perspectives, MGI is able to gain insights into the microeconomic underpinnings of the long-term macroeconomic trends affecting business strategy and policy making. For nearly two decades, MGI has utilized this “micro-to-macro” approach in research covering more than 20 countries and 30 industry sectors. MGI’s current research agenda focuses on three broad areas: productivity, competitiveness, and growth; the evolution of global financial markets; and the economic impact of technology. Recent research has examined a program of reform to bolster growth and renewal in Europe and the United States through accelerated productivity growth; Africa’s economic potential;...

Words: 60035 - Pages: 241