Application of Big Data

Submitted By amogh12
Summary of application
Big data refers to data sets so large that they exceed the ability of commonly used software tools to capture, curate, manage, and process them within a tolerable elapsed time (Big Data, n.d.). Big data is a constantly moving target: the size of the data keeps growing. Typical sizes range from terabytes to a few petabytes.
Storing big data is possible thanks to advances in storage, memory, and network technologies. Memory and disks are now available in gigabyte and terabyte sizes.
Big data has no single source. It can be collected from many sources, and every incoming piece of data has become important to organizations. The IT industry treats big data as a gold mine: appropriate analysis turns data into knowledge, which in turn can be used to serve the needs of a business or organization.
Online auction giant eBay benefits from big data analysis. eBay is the world’s largest online marketplace, a platform that enables buying and selling of products (DataStax, 2014). It is an American multinational e-commerce company headquartered in San Jose, California, founded by Pierre Omidyar in 1995.
The e-commerce giant operates in over 30 countries and has over 100 million registered users. The latest count of sellers listed by eBay Inc. is above 1.5 million. The everyday activity on the online portal produces a lot of data and, eventually, information (Ferguson, 2013). eBay seeks to achieve the highest buying price for every listed item because it takes a commission on each sale. eBay’s principal architect Tom Fastner explained that the company has acquired firms that were using the open-source Hadoop framework, which now allows eBay to perform advanced analytics on items that are available for sale (Passingham, 2013).
Why did eBay pursue a big data initiative?
eBay wants to mimic the small-shop experience on the web for its customers. Small shops have an advantage: they can engage customers, help with their search, and make recommendations. This makes the customer experience very personal. To achieve similar personalization, the company adopted analytics to understand its customers better.
The auction site generates a huge amount of web analytics data, known as “the customer journey data”. This data helps the company identify how customers use the ebay.com web portal (Saran, 2014).
David Stephenson, head of global business analytics at eBay, says the biggest challenge is web analytics. He compares it to having a video camera mounted on every customer’s head, recording every activity the customer performs. This creates 100 million hours of customer interaction data, a highly unmanageable volume (Saran, 2014).
eBay’s effort aims to understand customers and to apply data science techniques that allow it to collect more data and new types of data.
Why do you think they selected Hadoop framework?
Apache Hadoop is a free, open-source framework. It is also written in Java, the second most popular language on the open-source platform GitHub. These factors make it easy for any organization to shortlist Hadoop for its big data problem.
In the IT industry, it is a trend to acquire startups doing similar research. eBay Inc. has acquired firms that were using the Hadoop framework (Passingham, 2013), which gives eBay leverage to use Hadoop infrastructure.

How was Hadoop used?

Figure 1. Ecosystem of Hadoop framework implementation. From “Hadoop - The Power of the Elephant,” http://www.ebaytechblog.com/2010/10/29/hadoop-the-power-of-the-elephant/#.VSR3p_nF-So by Anil Madan.
Starting from the bottom, the Core is the Hadoop Distributed File System (HDFS), which is responsible for optimized reading and writing of big blocks of data. MapReduce provides APIs and components to develop and execute jobs using the Java and Scala programming languages. Data Access comprises HBase, Pig, and Hive for data manipulation. UC4 is a scheduler used by eBay to automate data loading from multiple sources. The topmost layer, Monitoring and Alerting, watches for and reports key events such as an unreachable server or a full disk (Madan, 2010).
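To make the MapReduce layer more concrete, the following is a minimal single-process sketch of the map, shuffle, and reduce phases, written in Python for brevity. It is not eBay's code and does not use the Hadoop API (which is Java-based); the sample listing titles and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample input: listing titles (not real eBay data).
LISTINGS = [
    "vintage camera lens",
    "camera bag",
    "vintage watch",
]


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word, 1) for word in record.split()]


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence on a list of records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(LISTINGS)
print(counts)
```

In a real cluster the map and reduce functions would run in parallel on many nodes, with HDFS supplying the input blocks; here everything runs in one process so the data flow is easy to follow.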
What techniques did they employ?
To manage massive data effectively, eBay explored new technologies such as Apache Cassandra, a popular NoSQL database, and DataStax Enterprise. Apache Cassandra gives eBay high-velocity data capability; alongside NoSQL, the company has integrated the real-time analytics bundled with DataStax (DataStax, 2014).
The company has split its data analytics needs across three platforms. One of them is an enterprise data warehouse from Teradata. According to Stephenson, it is the critical component, since it is a core transactional system that cannot go down (Saran, 2014).
The company kept only 1% of its data and threw away the rest because of cost. It faced a dilemma: impose structure on the huge data set by discarding data, or keep everything collected and be unable to process it. To address this, it started a second data initiative in which all customer data was stored (Saran, 2014).
The company has worked with Teradata to store data cheaply, using commodity hardware and proprietary software to process “the customer journey data”. Using Hadoop, the auction site has built two 20,000-node clusters with 80 petabytes of capacity (Saran, 2014).
Was it hosted locally or via the cloud?
According to the article “HDFS Storage Efficiency Using Tiered Storage”, data is hosted internally on eBay’s clusters (Antony, 2015). Other blog posts suggest eBay also relies on cloud hosting provided by IBM and Teradata. The conclusion is that eBay relies on a hybrid data storage approach.

From a staffing perspective, did they use internal resources, external contractors, or a mixture of both?
eBay Inc. used both internal and external resources for its big data efforts. The head of global business analytics and a data architect were primarily involved, along with the big data solution provider DataStax.
How successful was the big data effort? What were the outcomes?
According to Stephenson, the big data effort is proving its value in A/B testing (Saran, 2014). eBay has the unique ability to deliver an optimal user experience and drive purchases (DataStax, 2014). According to DataStax (2014), eBay's system:
* stores vast amounts of data (250 TB)
* handles 6 billion writes per day and 5 billion reads per day
* provides 100% uptime, even during peak periods
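To put the daily read and write figures in perspective, the short Python sketch below converts them to average per-second rates. It assumes load is spread evenly across the day, which is a simplification introduced here; actual peak rates would be higher, and the source does not report them.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(events_per_day):
    """Average events per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


# Daily figures as reported in the essay (DataStax, 2014).
writes_per_second = per_second(6_000_000_000)
reads_per_second = per_second(5_000_000_000)

print(f"{writes_per_second:,.0f} writes/s, {reads_per_second:,.0f} reads/s")
# prints "69,444 writes/s, 57,870 reads/s"
```

On average, then, the system absorbs roughly 69,000 writes and 58,000 reads every second.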

References
Antony, B. (2015, January 12). HDFS Storage Efficiency Using Tiered Storage. Retrieved from ebaytechblog: http://www.ebaytechblog.com/2015/01/12/hdfs-storage-efficiency-using-tiered-storage/#.VSSCxvnF-So
Big Data. (n.d.). Retrieved from Wikipedia: http://en.wikipedia.org/wiki/Big_data
DataStax. (2014). Retrieved from DataStax: http://www.datastax.com/wp-content/uploads/2012/12/DataStax-CS-eBay.pdf
Ferguson, R. B. (2013, June 25). How eBay Uses Data and Analytics to Get Closer to Its (Massive) Customer Base. Retrieved from sloanreview: http://sloanreview.mit.edu/article/how-ebay-uses-data-and-analytics-to-get-closer-to-its-massive-customer-base/
Madan, A. (2010, October 29). Hadoop - The Power of the Elephant. Retrieved from ebaytechblog: http://www.ebaytechblog.com/2010/10/29/hadoop-the-power-of-the-elephant/#.VSR3p_nF-So
Passingham, M. (2013, October 22). eBay using big data analytics to drive up price listings. Retrieved from v3: http://www.v3.co.uk/v3-uk/news/2302017/ebay-using-big-data-analytics-to-drive-up-price-listings
Saran, C. (2014, April 29). Case study: How big data powers the eBay customer journey. Retrieved from computerweekly: http://www.computerweekly.com/news/2240219736/Case-Study-How-big-data-powers-the-eBay-customer-journey
