BIG Data

February 8, 2015
Srinivas gogineni
SAI SRAVAN KOLUKULA

Introduction
Big data burst upon the scene in the first decade of the 21st century, and the first organizations to embrace it were online and startup firms. Firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning. Like many new information technologies, big data can bring about dramatic cost reductions, substantial improvements in the time required to perform a computing task, or new product and service offerings. Davenport.T (2013). Big data is emerging from the realm of science projects at Web companies to help firms like telecommunication giants understand exactly which customers are unhappy with service, what processes caused the dissatisfaction, and which customers are likely to change carriers. To obtain this information, billions of loosely structured bytes of data in different locations need to be processed until the needle in the haystack is found. The analysis enables executive management to fix faulty processes or people, and perhaps to reach out and retain the at-risk customers. The real business impact is that big data technologies can do this in weeks or months, four or more times faster than traditional data warehousing approaches. Floyer.D (2015).
Literature Review
The IT techniques and tools used to execute big data processing are new, important, and exciting. Big data is data that is too large to process using traditional methods. It originated with Web search companies, who had the problem of querying very large, distributed aggregations of loosely structured data. Google developed MapReduce to support distributed computing on large data sets across computer clusters. Inspired by Google's MapReduce and Google File System (GFS) papers, Doug Cutting created Hadoop while he was at Yahoo!, naming it after his son's stuffed elephant. Floyer.D (2015). Hadoop is an Apache project, written in Java and built and used by a global community of contributors. Yahoo! has been the largest contributor to the project and uses Hadoop extensively across its businesses on 38,000 nodes. Floyer.D (2015).
Doug Cutting, meanwhile, joined Cloudera, a commercial Hadoop company that develops, packages, supports and distributes Hadoop (similar to the Red Hat model for Linux), making it accessible to Enterprise IT. Floyer.D (2015).
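The MapReduce model Google introduced can be illustrated with a minimal single-machine sketch in Python. This is purely illustrative: the three phases below run in memory, whereas a real Hadoop job distributes the map and reduce tasks across a cluster and performs the shuffle itself.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["big data is big", "data is data"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts)  # {'big': 2, 'data': 3, 'is': 2}
```

The appeal of the model is that the map and reduce functions contain no distribution logic at all; Hadoop supplies the parallelism, fault tolerance, and data movement around them.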

Discussion
Walmart handles more than 1 million customer transactions every hour. Facebook handles 40 billion photos from its user base. Decoding the human genome originally took 10 years; now it can be achieved in one week. Rishav. S, (2014).
Definition
‘Big data’ is similar to ‘small data’, but bigger in its characteristics, with the aim of solving new problems, or old problems in a better way. Big data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques. The three defining characteristics of big data are ‘volume’, ‘velocity’, and ‘variety’.
Volume:
A typical PC might have had 10 gigabytes of storage in 2000. Today, Facebook ingests 500 terabytes of new data every day, and a Boeing 737 generates 240 terabytes of flight data during a single flight across the US. Smartphones, the data they create and consume, and sensors embedded into everyday objects will soon result in billions of new, constantly updated data feeds containing environmental, location, and other information, including video.
Velocity:
Clickstreams and ad impressions capture user behavior at millions of events per second; high-frequency stock-trading algorithms reflect market changes within microseconds; machine-to-machine processes exchange data between billions of devices; infrastructure and sensors generate massive log data in real time; and online gaming systems support millions of concurrent users, each producing multiple inputs per second.
Variety:
Big Data isn't just numbers, dates, and strings. Big Data is also geospatial data, 3D data, audio and video, and unstructured text, including log files and social media.
Traditional database systems were designed to address smaller volumes of structured data, with fewer updates and a predictable, consistent data structure. Big data analysis, by contrast, must handle many different types of data.
Why Big Data
The growth of big data is driven by 1) increases in storage capacity, 2) increases in processing power, and 3) the availability of data of many different types. Every day we create 2.5 quintillion bytes of data; 90% of the data in the world today has been created in the last two years alone.
Analyzing your data's characteristics involves 1) selecting data sources for analysis, 2) eliminating redundant data, and 3) establishing the role of NoSQL.
An overview of big data stores includes 1) data models (key-value, graph, document, column-family), 2) the Hadoop Distributed File System, 3) HBase, and 4) Hive.
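The four data models listed above can be sketched with plain Python structures. This is illustrative only (real stores such as HBase, MongoDB, or a graph database add distribution, indexing, and persistence), but it shows how the same information is shaped differently in each model:

```python
# One customer record expressed in four NoSQL data models.

# Key-value: an opaque value looked up by a single key; the store
# cannot query inside the value.
kv_store = {"customer:42": '{"name": "Ada", "city": "London"}'}

# Document: the value is a structured, nestable, queryable document.
doc_store = {"customer:42": {"name": "Ada", "city": "London",
                             "orders": [{"id": 1, "total": 9.99}]}}

# Column-family: values grouped into families, addressed by
# (row key, family, column) -- the HBase/Cassandra layout.
cf_store = {"customer:42": {"profile": {"name": "Ada", "city": "London"},
                            "activity": {"last_login": "2015-02-08"}}}

# Graph: nodes and edges; relationships are first-class data.
graph = {"nodes": {"customer:42": {"name": "Ada"},
                   "product:7": {"name": "Book"}},
         "edges": [("customer:42", "BOUGHT", "product:7")]}

print(cf_store["customer:42"]["profile"]["name"])  # Ada
```

The key names and record shapes here are hypothetical; the point is only that the choice of model determines which questions the store can answer efficiently.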
The Structure Of Big Data
Structured: most traditional data sources, such as database files, .CSV files, and .XLS files.
Semi-structured: many big data sources, such as e-mail, Word files, and log files.
Unstructured: video, audio, and image data.
Figure 1 Structure of Big Data by N.Hussain, adapted from http://image.slidesharecdn.com/bigdatappt-140225061440-phpapp01/95/big-data-ppt-13-638.jpg?cb=1393330529

Integrating disparate data stores involves mapping data to the programming framework, connecting to and extracting data from storage, transforming data for processing, and subdividing data in preparation for Hadoop MapReduce.
Employing Hadoop MapReduce involves:
* Creating the components of Hadoop MapReduce jobs
* Distributing data processing across server farms
* Executing Hadoop MapReduce jobs
* Monitoring the progress of job flows
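In practice, MapReduce components written in languages other than Java are often wired together with Hadoop Streaming, in which the mapper and reducer are separate scripts exchanging tab-separated key-value lines (in a real job they read stdin and write stdout). A hedged sketch of the two components, written as functions over line iterators so they can be run without a cluster, using the classic max-temperature-per-sensor job:

```python
import itertools

def mapper(lines):
    # Mapper component: parse "sensor_id,temperature" records and
    # emit key<TAB>value lines, as Hadoop Streaming expects.
    for line in lines:
        sensor, temp = line.strip().split(",")
        yield f"{sensor}\t{temp}"

def reducer(sorted_lines):
    # Reducer component: Hadoop's shuffle delivers mapper output
    # sorted by key, so consecutive lines can be grouped per sensor
    # and the maximum reading kept.
    parsed = (line.split("\t") for line in sorted_lines)
    for sensor, group in itertools.groupby(parsed, key=lambda kv: kv[0]):
        yield f"{sensor}\t{max(float(t) for _, t in group)}"

records = ["s1,21.5", "s2,19.0", "s1,23.1", "s2,18.4"]
mapped = sorted(mapper(records))  # stands in for the shuffle/sort phase
print(list(reducer(mapped)))  # ['s1\t23.1', 's2\t19.0']
```

The sensor record format is an assumption for illustration; the tab-separated line protocol and the sorted-by-key reducer input are the parts that mirror how Hadoop Streaming actually behaves.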
Components Of Big-Data Processing
Big data projects have a number of different layers of abstraction, from abstraction of the data through to running analytics against the abstracted data. Figure 2 shows the common components of analytical big data and their relationship to each other. The higher-level components help make big data projects easier and more productive. Hadoop is often at the center of big data projects, but it is not a prerequisite. Floyer.D, (2015).

Figure 2 Analytical Big Data Components Source: wikibon 2015
Packaging and support of Hadoop comes from organizations such as Cloudera, and includes MapReduce, essentially the compute layer of big data. File systems such as the Hadoop Distributed File System (HDFS) manage the retrieval and storing of data and the metadata required for computation. Other file systems or databases such as HBase (a NoSQL tabular store) or Cassandra (a NoSQL eventually-consistent key-value store) can also be used. Instead of writing in Java, higher-level languages such as Pig (part of Hadoop) can be used, simplifying the writing of computations. Hive is a data warehouse layer built on top of Hadoop, developed by Facebook programmers. Cascading is a thin Java library that sits on top of Hadoop and allows suites of MapReduce jobs to be run and managed as a unit; it is widely used to develop special tools. Semi-automated modeling tools such as CR-X allow models to be developed interactively at great speed, and can help set up the database that will run the analytics. Specialized scale-out analytic databases such as Greenplum or Netezza, with very fast loading, load and reload the data for the analytic models.
ISV big data analytical packages such as ClickFox and Merced run against the database to help address the business issues (e.g., the customer satisfaction issues mentioned in the introduction). Senthil, (2014).
How Is Big Data Different
The information management big data and analytics capabilities include:
Data Management & Warehouse
Gain industry-leading database performance across multiple workloads while lowering administration, storage, development and server costs. Realize extreme speed with capabilities optimized for analytics workloads, such as deep analytics, and benefit from workload-optimized systems that can be up and running in hours.
Hadoop System
Bring the power of Apache Hadoop to the enterprise with application accelerators, analytics, visualization, development tools, and performance and security features.
Stream Computing
Efficiently deliver real-time analytic processing on constantly changing data in motion, and enable descriptive and predictive analytics to support real-time decisions. Capture and analyze all data, all the time, just in time. With stream computing, store less, analyze more and make better decisions faster.
Content Management
Enable comprehensive content lifecycle and document management with cost-effective control of existing and new types of content, with scale, security and stability.
Information Integration & Governance
Build confidence in big data with the ability to integrate, understand, manage and govern data appropriately across its lifecycle. Senthil, (2014).
Big Data Sources
* Automatically generated by a machine (e.g. a sensor embedded in an engine)
* Typically an entirely new source of data (e.g. use of the internet)
* Not designed to be friendly (e.g. text streams)
* May not have much value, so the focus must be on the important part

Figure 3 Sources of Big Data
Figure 4 Big Data Statistics
Data Generation Points: Examples
* Mobile devices
* Microphones
* Scanners
* Readers
* Science facilities
* Software/programs
* Social media
* Cameras

Big Data Analytics
* Examining large amounts of data
* Extracting appropriate information
* Identifying hidden patterns and unknown correlations
* Gaining competitive advantage
* Making better strategic and operational business decisions
* More effective marketing, greater customer satisfaction, and increased revenue
Applications Of Big Data
* Health care
* Homeland security
* Traffic control
* Trading analytics
* Telecom
* Manufacturing
* Multi-channel sales
* Search quality
Figure 5 Applications of Big Data.
Leading Technology Vendors
* IBM – Netezza
* EMC – Greenplum
* Oracle – Exadata
* Google – BigQuery, etc.
How Big Data Impacts IT
Big data is a disruptive force, presenting opportunities along with challenges to IT organizations. By 2015 there will be 4.4 million IT jobs in big data, 1.9 million of them in the US alone. India will require a minimum of 100,000 (1 lakh) data scientists in the next couple of years, in addition to data analysts and data managers, to support the big data space. Nassir Hussian, (2014).
Benefits Of Big Data
Real-time big data isn’t just a process for storing petabytes or exabytes of data in a data warehouse; it’s about the ability to make better decisions and take meaningful actions at the right time. Increase visibility across the enterprise: big data can help the entire enterprise work as one functional unit. There is no longer any need for data silos for different functions such as marketing, finance, and logistics; big data techniques allow everyone to work from the same data set and pull out what they need. Focus on the core: as Mark Cleverley, Global Director of Public Safety at IBM, puts it, “We are moving into an era where we can be roughly right more frequently and precisely wrong less. Narrowing down possibilities helps us to make the data to be actionable.”
1. Cost reduction: Big data technologies like Hadoop and cloud-based analytics can provide substantial cost advantages. While comparisons between big data technology and traditional architectures (data warehouses and marts in particular) are difficult because of differences in functionality, a price comparison alone can suggest order-of-magnitude improvements. Virtually every large company I interviewed, however, is employing big data technologies not to replace existing architectures, but to augment them. Rather than processing and storing vast quantities of new data in a data warehouse, for example, companies are using Hadoop clusters for that purpose, and moving data to enterprise warehouses as needed for production analytical applications. Well-established firms like Citi, Wells Fargo and USAA all have substantial Hadoop projects underway that exist alongside existing storage and processing capabilities for analytics. While the long-term role of these technologies in an enterprise architecture is unclear, it’s likely that they will play a permanent and important role in helping companies manage big data.
2. Faster, better decision making: Analytics has always involved attempts to improve decision making, and big data doesn’t change that. Large organizations are seeking both faster and better decisions with big data, and they’re finding them. Driven by the speed of Hadoop and in-memory analytics, several companies I researched were focused on speeding up existing decisions. For example, Caesars, a leading gaming company that has long embraced analytics, is now embracing big data analytics for faster decisions. The company has data about its customers from its Total Rewards loyalty program, web clickstreams, and real-time play in slot machines. It has traditionally used all those data sources to understand customers, but it has been difficult to integrate and act on them in real time, while the customer is still playing at a slot machine or in the resort. Caesars has found that if a new customer to its loyalty program has a run of bad luck at the slots, it’s likely that customer will never come back. But if it can present, say, a free meal coupon to that customer while he’s still at the slot machine, he is much more likely to return to the casino later. The key, however, is to do the necessary analysis in real time and present the offer before the customer turns away in disgust with his luck and the machines at which he’s been playing. In pursuit of this objective, Caesars has acquired Hadoop clusters and commercial analytics software. It has also added some data scientists to its analytics group. Some firms are more focused on making better decisions by analyzing new sources of data. For example, health insurance giant United Healthcare is using “natural language processing” tools from SAS to better understand customer satisfaction and when to intervene to improve it. It starts by converting records of customer voice calls to its call center into text and searching for indications that the customer is dissatisfied.
The company has already found that the text analysis improves its predictive capability for customer attrition models.
3. New products and services: Perhaps the most interesting use of big data analytics is to create new products and services for customers. Online companies have done this for a decade or so, but now predominantly offline firms are doing it too. GE, for example, has made a major investment in new service models for its industrial products using big data analytics. Verizon Wireless is also pursuing new offerings based on its extensive mobile device data. In a business unit called Precision Market Insights, Verizon is selling information about how often mobile phone users are in certain locations, their activities and backgrounds. Customers thus far have included malls, stadium owners and billboard firms. For the Phoenix Suns, an NBA basketball team, Verizon’s Precision Market Insights offered information on where people attending the team’s games live, what percentage of game attendees are from out of town, and how often game attendees combine a basketball game with a baseball spring training game or a visit to a fast food chain. Such insights are obviously valuable to the Suns in targeting advertising and promotions. Big data can provide benefits to your organisation by ensuring that you have as much data as possible before making important business decisions, enabling you to feel more confident when making business choices. The wealth of data available through big data also enables marketing strategies to be improved and more accurately targeted. This could help you to greatly increase your customer base and push your organisation ahead of the competition. Improvements in these areas of business can ultimately lead to an increase in revenue, as your organisation is able to both cut costs and attract more customers.
In summary, the benefits of big data include:
* More accurate data
* Improved business decisions
* Improved marketing strategy and targeting
* Increased revenue, due to an increased customer base and decreased costs. Tom D, Director of Research and Faculty Leader (2014).
Problems Of Big Data
The first thing to note is that although big data is very good at detecting correlations, especially subtle correlations that an analysis of smaller data sets might miss, it never tells us which correlations are meaningful. A big data analysis might reveal, for instance, that from 2006 to 2011 the United States murder rate was well correlated with the market share of Internet Explorer: Both went down sharply. But it’s hard to imagine there is any causal relationship between the two. Likewise, from 1998 to 2007 the number of new cases of autism diagnosed was extremely well correlated with sales of organic food (both went up sharply), but identifying the correlation won’t by itself tell us whether diet has anything to do with autism. Second, big data can work well as an adjunct to scientific inquiry but rarely succeeds as a wholesale replacement. Molecular biologists, for example, would very much like to be able to infer the three-dimensional structure of proteins from their underlying DNA sequence, and scientists working on the problem use big data as one tool among many. But no scientist thinks you can solve this problem by crunching data alone, no matter how powerful the statistical analysis; you will always need to start with an analysis that relies on an understanding of physics and biochemistry. Third, many tools that are based on big data can be easily gamed. For example, big data programs for grading student essays often rely on measures like sentence length and word sophistication, which are found to correlate well with the scores given by human graders. But once students figure out how such a program works, they start writing long sentences and using obscure words, rather than learning how to actually formulate and write clear, coherent text.
Even Google’s celebrated search engine, rightly seen as a big data success story, is not immune to “Google bombing” and “spamdexing,” wily techniques for artificially elevating website search placement. Fourth, even when the results of a big data analysis aren’t intentionally gamed, they often turn out to be less robust than they initially seem. Consider Google Flu Trends, once the poster child for big data. In 2009, Google reported — to considerable fanfare — that by analyzing flu-related search queries, it had been able to detect the spread of the flu as accurately and more quickly than the Centers for Disease Control and Prevention. A few years later, though, Google Flu Trends began to falter; for the last two years it has made more bad predictions than good ones. As a recent article in the journal Science explained, one major contributing cause of the failures of Google Flu Trends may have been that the Google search engine itself constantly changes, such that patterns in data collected at one time do not necessarily apply to data collected at another time. As the statistician Kaiser Fung has noted, collections of big data that rely on web hits often merge data that was collected in different ways and with different purposes — sometimes to ill effect. It can be risky to draw conclusions from data sets of this kind. A fifth concern might be called the echo-chamber effect, which also stems from the fact that much of big data comes from the web. Whenever the source of information for a big data analysis is itself a product of big data, opportunities for vicious cycles abound. Consider translation programs like Google Translate, which draw on many pairs of parallel texts from different languages — for example, the same Wikipedia entry in two different languages — to discern the patterns of translation between those languages. 
This is a perfectly reasonable strategy, except for the fact that with some of the less common languages, many of the Wikipedia articles themselves may have been written using Google Translate. In those cases, any initial errors in Google Translate infect Wikipedia, which is fed back into Google Translate, reinforcing the error. A sixth worry is the risk of too many correlations. If you look 100 times for correlations between two variables, you risk finding, purely by chance, about five bogus correlations that appear statistically significant — even though there is no actual meaningful connection between the variables. Absent careful supervision, the magnitudes of big data can greatly amplify such errors. Seventh, big data is prone to giving scientific-sounding solutions to hopelessly imprecise questions. In the past few months, for instance, there have been two separate attempts to rank people in terms of their “historical importance” or “cultural contributions,” based on data drawn from Wikipedia. One is the book “Who’s Bigger? Where Historical Figures Really Rank,” by the computer scientist Steven Skiena and the engineer Charles Ward. The other is an M.I.T. Media Lab project called Pantheon.
Both efforts get many things right — Jesus, Lincoln and Shakespeare were surely important people — but both also make some egregious errors. “Who’s Bigger?” claims that Francis Scott Key was the 19th most important poet in history; Pantheon has claimed that Nostradamus was the 20th most important writer in history, well ahead of Jane Austen (78th) and George Eliot (380th). Worse, both projects suggest a misleading degree of scientific precision with evaluations that are inherently vague, or even meaningless. Big data can reduce anything to a single number, but you shouldn’t be fooled by the appearance of exactitude. Marcus.G and Davis.E, (2014).
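The sixth worry above, the risk of too many correlations, is easy to demonstrate: generate pairs of completely unrelated random series and test each pair, and roughly 5% will clear the conventional significance bar by chance alone. A minimal simulation (the cutoff 0.361 is the approximate two-tailed p < 0.05 critical value of Pearson's r for samples of 30; the exact count varies with the random seed):

```python
import random

def pearson(x, y):
    # Pearson correlation coefficient, computed from first principles.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

random.seed(1)
n, trials = 30, 100
critical_r = 0.361  # approx. two-tailed p < 0.05 cutoff for n = 30

bogus = 0
for _ in range(trials):
    x = [random.gauss(0, 1) for _ in range(n)]
    y = [random.gauss(0, 1) for _ in range(n)]  # independent of x by construction
    if abs(pearson(x, y)) > critical_r:
        bogus += 1  # looks "significant" despite no real relationship

print(f"{bogus} of {trials} unrelated pairs appear significant")
```

Every "significant" pair found here is spurious by construction, which is exactly the trap when a big data analysis screens thousands of variable pairs without correcting for multiple comparisons.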
1. It incentivizes more collection of data and longer retention of it. If any and all data sets might turn out to prove useful for discovering some obscure but valuable correlation, you might as well collect them and hold on to them. In the long run, the more useful big data proves to be, the stronger this incentivizing effect will be; but in the short run it almost doesn’t matter, because the current buzz over the idea is enough to do the trick.
2. When you combine someone’s personal information with vast external data sets, you can infer new facts about that person (such as the fact that they’re pregnant, or are showing early signs of Parkinson’s disease, or are unconsciously drawn toward products that are colored red or purple). And when it comes to such facts, a person a) might not want the data owner to know b) might not want anyone to know c) might not even know themselves. The fact is, humans like to control what other people do and do not know about them—that’s the core of what privacy is, and data mining threatens to violate that principle.
3. Many (perhaps most) people are not aware of how much information is being collected (for example, that stores are tracking their purchases over time), let alone how it is being used (scrutinized for insights into their lives). The fact that Target goes to considerable trouble to hide its knowledge from its customers tells you all you need to know on that front.
4. Big data can further tilt the playing field toward big institutions and away from individuals. In economic terms, it accentuates the information asymmetries of big companies over other economic actors and allows for people to be manipulated. If a store can gain insight into just how badly I want to buy something, just how much I can afford to pay for it, just how knowledgeable I am about the marketplace, or the best way to scare me into buying it, it can extract the maximum profit from me.
5. It holds the potential to accentuate power differentials among individuals in society by amplifying existing advantages and disadvantages. Those who are savvy and well educated may get improved treatment from companies and government – while those who are poor, underprivileged, and perhaps already have some strikes against them in life (such as a criminal record) will be easily identified, and treated worse. In that way data mining may increase social stratification.
6. Data mining can be used for so-called “risk analysis” in ways that treat people unfairly and often capriciously—for example, by insurance companies or banks to approve or deny applications. Credit card companies sometimes lower a customer’s credit limit based on the repayment history of the other customers of stores where a person shops. Such “behavioral scoring” is a form of economic guilt-by-association based on making statistical inferences about a person that go far beyond anything that person can control or be aware of.
7. Its use by law enforcement raises even sharper issues—and when our national security agencies start using it to try to spot terrorists, those stakes can get even more serious. We know too little about how our security agencies are using Big Data, but such approaches have been discussed since the days of the Total Information Awareness program and before—and there is strong evidence that it’s being used by the NSA to sift through the vast volumes of communications that agency collects. The threat here is that people will be tagged and suffer adverse consequences without due process, the ability to fight back, or even knowledge that they have been discriminated against. The threat of bad effects is magnified by the fact that data mining is so ineffective at spotting true terrorists.
8. Over time such consequences will lead to chilling effects, as people become more reluctant to engage in any behaviors that will put them under the macroscope. Stanley.J (2014).
Solutions
Steps Taken To Solve Problems In Big Data
Scale
With big data you want to be able to scale very rapidly and elastically. Whenever and wherever you want. Across multiple data centers and the cloud if need be. You can scale up to the heavens or shard till the cows come home with your father’s relational database systems and never get there. And most NoSQL solutions like MongoDB or HBase have their own scaling limitations.
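Sharding, the horizontal partitioning alluded to above, can be sketched in a few lines: hash each record's key to choose the node that stores it, so capacity grows by adding nodes. This is a minimal illustration only; production systems layer replication, rebalancing, and consistent hashing on top so that adding a node does not reshuffle most keys. The node names and keys are hypothetical.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]

def shard_for(key, nodes=NODES):
    # Hash the key and map it to one of the nodes. A stable hash
    # (not Python's per-process randomized hash()) keeps placement
    # reproducible across clients and restarts.
    digest = hashlib.md5(key.encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

# Every client computes the same placement independently,
# so no central lookup service is needed for reads or writes.
placement = {k: shard_for(k) for k in ["user:1", "user:2", "user:3", "user:4"]}
print(placement)
```

The weakness this sketch shares with naive modulo sharding is visible in the last line of `shard_for`: changing `len(nodes)` remaps almost every key, which is the rebalancing problem consistent hashing exists to solve.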
Performance
In an online world where nanosecond delays can cost you sales, big data must move at extremely high velocities no matter how much you scale or what workloads your database must perform. The data handling hoops of RDBMS and most NoSQL solutions put a serious drag on performance.
Continuous Availability
When you rely on big data to feed your essential, revenue-generating 24/7 business applications, even high availability is not high enough. Your data can never go down. A certain amount of downtime is built into RDBMS and other NoSQL systems.
Workload Diversity
Big data comes in all shapes, colors and sizes. Rigid schemas have no place here; instead you need a more flexible design. You want your technology to fit your data, not the other way around. And you want to be able to do more with all of that data: perform transactions in real time, run analytics just as fast, and find anything you want in an instant from oceans of data, no matter what form that data may take.
Data Security
Big data carries some big risks when it contains credit card data, personal ID information and other sensitive assets. Most NoSQL big data platforms have few if any security mechanisms in place to safeguard your big data.
Manageability
Staying ahead of big data using RDBMS technology is a costly, time-consuming and often futile endeavor. And most NoSQL solutions are plagued by operational complexity and arcane configurations.
Cost
Meeting even one of the challenges presented here with RDBMS or even most NoSQL solutions can cost a pretty penny. Doing big data the right way doesn’t have to break the bank. (DataStax, 2014)
Conclusions
Finally, big data is at its best when analyzing things that are extremely common, but often falls short when analyzing things that are less common. For instance, programs that use big data to deal with text, such as search engines and translation programs, often rely heavily on something called trigrams: sequences of three words in a row (like “in a row”). Reliable statistical information can be compiled about common trigrams, precisely because they appear frequently. But no existing body of data will ever be large enough to include all the trigrams that people might use, because of the continuing inventiveness of language.
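Extracting trigrams is itself trivial; the illustrative sketch below also shows why novelty defeats the technique: a trigram never seen in the corpus simply has a count of zero, no matter how large the corpus grows.

```python
from collections import Counter

def trigrams(text):
    # Slide a window of three consecutive words across the text.
    words = text.lower().split()
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]

# A toy corpus; a search engine's corpus is petabytes, but the
# zero-count problem for unseen phrases is the same.
corpus = "the cat sat in a row and the cat sat still"
counts = Counter(trigrams(corpus))

print(counts["the cat sat"])  # 2: seen twice, so statistics exist
print(counts["dumbed-down escapist fare"])  # 0: never seen, so no statistics
```

No amount of extra data fixes the second lookup, because language keeps producing three-word sequences that have never occurred before.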
To select an example more or less at random, a book review that the actor Rob Lowe recently wrote for The New York Times contained nine trigrams, such as “dumbed-down escapist fare”, that had never before appeared anywhere in all the petabytes of text indexed by Google. To witness the limitations that big data can have with novelty, Google-translate “dumbed-down escapist fare” into German and then back into English: out comes the incoherent “scaled-flight fare.” That is a long way from what Mr. Lowe intended — and from big data’s aspirations for translation. Marcus.G and Davis.E, (2014).
References
Brown, E. (2015). Evolutionary approaches to big data problems. MIT Industrial Liaison Program. Retrieved from https://newsoffice.mit.edu/2015/una-may-oreilly-evolutionary-approaches-big-data-problems-0114
Floyer, D. (2015). Enterprise Big Data. Retrieved from http://wikibon.org/wiki/v/Enterprise_Big-data
Davenport, T. (2013). Big Data for big companies. Retrieved from http://www.sas.com/en_us/insights/big-data/what-is-big-data.html
Shaw, R. (2014). A Comparitive And Analytical Study On Big Data. Retrieved from http://rspublication.com/ijca/2014/dec14/10.pdf
DataStax. (2014). Big data challenges. Retrieved from http://www.datastax.com/big-data-challenges
Hussain, N. (2014). Big Data. Retrieved from http://www.slideshare.net/nasrinhussain1/big-data-ppt-31616290
Ravinamboori. (2014). BigData.
Senthil. (2014). Big data. Retrieved from http://www.slideshare.net/bobosenthil/big-data-what-why-where-when-and-how
Marcus, G. and Davis, E. (2014). Problems with big data. Retrieved from http://www.nytimes.com/2014/04/17/opinion/eight-no-nine-problems-with-big-data.html
Stanley, J. (2014). Retrieved from https://www.aclu.org/blog/technology-and-liberty/eight-problems-big-data
Books
Mayer-Schonberger, V. Big Data. Retrieved from https://blog.transparency.org/2014/11/07/can-big-data-solve-the-worlds-problems-including-corruption/


Words: 22222 - Pages: 89

Premium Essay

Bigdata

...BIG DATA - THE MANAGEMENT REVOLUTION Summary The general theme of the article is to proof how data-driven decisions are better for businesses as data enables the managers to base their decision on evidence rather than intuition. The idea behind big data is to collect all kinds of data from various sources and to effectively utilize this data to improve the financial and operational aspects of the business. Companies that operate on digital platforms like Amazon are already experts at big data and are using the predictions based on data in a deft manner. The practice of big data should not be confined to companies that operate digitally but should be implied by other businesses as well, as big data is a revolutionary practice that provides data in larger volumes with higher velocity with which data is complied and entailed and the variety in which data is available. Furthermore, modifying a company to be data-driven is not only technologically challenging but poses a copious amount of managerial challenges as well. The decision is usually based on the senior manager who has to know how to answer questions and how to embrace evidence based decision. For a company to re-organize itself to become data-driven, the manager should concentrate on improvising five areas that include, leadership, talent management, technology, decision making and company culture. The author cities instances of big data using examples of airports where PASSUR Aerospace provided a service called...

Words: 3006 - Pages: 13

Free Essay

Bigdata

...Big Data Analytics (IB9CS) Mining, processing, analysing, and visualising large data sets. Week 6 Measuring happiness Suzy Moat Tobias Preis Suzy.Moat@wbs.ac.uk Tobias.Preis@wbs.ac.uk What we’ve covered Measuring Predicting What we’ve covered Measuring Economics Health Predicting What we’ve covered Measuring Predicting Economics Economics Health Crime What we’ve covered Measuring Predicting Economics Economics Health Crime Happiness Social networks http://www.ted.com/talks/nicholas_christakis_the_hidden_influence_of_social_networks.html Twitter and happiness Positive affect Negative affect Golder and Macy (2011, Science) Twitter and happiness Positive affect Negative affect Golder and Macy (2011, Science) Facebook and happiness Own updates % positive words % negative words Kramer et al. (2014, PNAS) Negativity reduced Positivity reduced News feed Facebook and happiness Own updates More positive % positive words % negative words More negative Kramer et al. (2014, PNAS) Negativity reduced Positivity reduced News feed Facebook and happiness Own updates % positive words % negative words Kramer et al. (2014, PNAS) Negativity reduced Positivity reduced News feed Facebook and happiness Own updates % positive words % negative words Kramer et al. (2014, PNAS) Negativity reduced Positivity ...

Words: 331 - Pages: 2

Premium Essay

Bigdata Etl

...White Paper Big Data Analytics Extract, Transform, and Load Big Data with Apache Hadoop* ABSTRACT Over the last few years, organizations across public and private sectors have made a strategic decision to turn big data into competitive advantage. The challenge of extracting value from big data is similar in many ways to the age-old problem of distilling business intelligence from transactional data. At the heart of this challenge is the process used to extract data from multiple sources, transform it to fit your analytical needs, and load it into a data warehouse for subsequent analysis, a process known as “Extract, Transform & Load” (ETL). The nature of big data requires that the infrastructure for this process can scale cost-effectively. Apache Hadoop* has emerged as the de facto standard for managing big data. This whitepaper examines some of the platform hardware and software considerations in using Hadoop for ETL. –  e plan to publish other white papers that show how a platform based on Apache Hadoop can be extended to W support interactive queries and real-time predictive analytics. When complete, these white papers will be available at http://hadoop.intel.com. Abstract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 The ETL Bottleneck in Big Data Analytics The ETL Bottleneck in Big Data Analytics. . . . . . . . . . . . . . . . . . . . . . 1 Big Data refers to the large amounts, at least terabytes, of poly-structured...

Words: 6174 - Pages: 25

Premium Essay

My Name

...CLOUD TECHNOLOGIES TO APPLY IN COUNTRYPAL INTRODUCTION CLOUD/WEB PLATFORM Cloud computing offers business owners like countrypal an opportunity to take their businesses to the next level without having to invest heavily. The entire foundation of cloud computing is based on Iaas or Infrastructure as a Service, which extensively deals with providing the hardware in a cloud. This includes servers, routers, network equipment, firewall etc. However, all this hardware is of no use without a platform to help developers develop their applications, based on which the businesses can actually operate. Cloud computing platforms, which are otherwise known as Platform as a Service (PaaS), are therefore the middle level of cloud computing, which is extremely important for businesses to utilize the infrastructure and software that are available in the cloud. In order to make the most out of cloud computing platforms, irrespective of whether you are a business owner or a developer, it is important to understand the various aspects of this middle layer. Here is a brief overview of this aspect of cloud computing. When a business requires an application to be developed for its use, usually developers invest a lot of money into gathering the requisite infrastructure for its onsite development. But with cloud computing platforms, they can be delivered on the web. So, you can easily develop your applications without investing heavy amounts of money into buying software and other necessary tools for...

Words: 1611 - Pages: 7

Free Essay

Abc Ia S Aresume

...De-Identified Personal Health Care System Using Hadoop The use of medical Big Data is increasingly popular in health care services and clinical research. The biggest challenges in health care centers are the huge amount of data flows into the systems daily. Crunching this BigData and de-identifying it in a traditional data mining tools had problems. Therefore to provide solution to the de-identifying personal health information, Map Reduce application uses jar files which contain a combination of MR code and PIG queries. This application also uses advanced mechanism of using UDF (User Data File) which is used to protect the health care dataset. Responsibilities: Moved all personal health care data from database to HDFS for further processing. Developed the Sqoop scripts in order to make the interaction between Hive and MySQL Database Wrote MapReduce code for DE-Identifying data. Loaded the processed results into Hive tables. Generated test cases using MRunit. Best-Buy – Rehosting of Web Intelligence project The purpose of the project is to store terabytes of log information generated by the ecommerce website and extract meaning information out of it. The solution is based on the open source Big Data s/w Hadoop .The data will be stored in Hadoop file system and processed using PIG scripts. Which intern includes getting the raw html data from the websites, Process the html to obtain product and pricing information, Extract various reports out of the product pricing...

Words: 500 - Pages: 2

Free Essay

Consequences of the Changing Norms of Visas

...The proposed changes in the norms of the visas will cause a lot of stress in the indian it scenario specially where they are highly dependent on their visas. The companies having higher workforce under the category of H1-B visa will suffe the same, in this category unfortunately, most of Indian it companies will fall. The amount of the fee these companies will have to pay would be much higher. Any employer which have more than 50 employees and have more than 30% but less than 50% will fall under category H1B or L1. The employer needs to pay $5000 for employee who is an extra for either of the standard. The employer which has more than 50% of the employees with H1B or L1 will have to pay $10000 fee. This is going to effect companies like TCS, WIPRO and Infosys. But will be useful in case of IBM, Dell etc which are US based. The proposed reform to the H1B visa standards were under the Senate’s comprehensive bill. It has the potential to cast a profound impact on the outstanding industry.the US government has accused Indian it companies of using the visas unfairly to send the employees from india at a lower cost, which impacts job creation in US. Most IT companies have said this would impact margin by 20 to 30 basis points and they would counter it by increasing hiring in US. Also it clears the opportunity of the local US players to get sub contracting from the bigger players. The impact on the revenues of the big companies such as Cognizant, TCS will give birth to theneed...

Words: 406 - Pages: 2

Premium Essay

Yunnan Luck Air Case Study

...1. What are Yunnan Lucky Air’s best options? Luck Air had a great business model, and that was to follow the same model as Southwest Airlines in the United States. Because Luck Air is considered a domestic airline in China they operate on a small scale compared to major competitors and so it made economical sense to offer low-cost, high-efficiency to their customers. In 2007 Lucky Air was able to more than double the amount of passengers from the year before by using a low-cost tactic. However other airlines have also caught on to offering low-cost fares for domestic routes to their passengers. With more competitors Lucky Air has decided to look at the possibility of taking a risk and to focus on e-commerce. Backed by their parent company Hainan Airlines, Luck Air has access to one of the most advanced web portals in the Chinese airline industry. But is e-commerce a good and viable option to compete with other airlines? Taking a look at what exactly e-commerce is important to grasp the understanding of how a business can operate, stay competitive and obtain a profit in a digital world. E-commerce involves digitally enabled transactions between and among organizations and individuals. Digitally enabled transactions include all those mediated by digital technology such as over the Internet, the Web and or via mobile apps. (Laudon & Traver, 2013, p. 55). Currently Airlines in China have high distribution costs which are fees paid to travel agents, the cost of staffing...

Words: 1737 - Pages: 7

Premium Essay

Paper

...Industry size Supermarkets & Grocery Stores Market Research Report | NAICS 44511 | Jan 2015 Shopping smart: Increasing premium brand sales and healthy eating trends will spur growth The Supermarkets & Grocery Stores market research report provides key industry analysis and industry statistics, measures market size, analyzes current and future industry trends and shows market share for the industry’s largest companies. IBISWorld publishes the largest collection of industry reports so you can see an industry’s supply chain, economic drivers and key buyers and markets. Market Share of Companies Kroger  Publix Super Markets Inc.  Safeway  Industry Statistics & Market Size Revenue $584bn | Annual Growth 10-15 1.3% | Annual Growth 15-20 | | Profit | Employment 2,489,995 | Businesses 42,036 | Industry Analysis & Industry Trends The Supermarkets and Grocery Stores industry has grown over the past five years, benefiting from a strengthening domestic economy. As per capita disposable income has grown over this period, some consumers traded up to premium, organic and all-natural brands, helping lift industry revenue. Over the next five years, the industry is anticipated to grow as a result of rising discretionary income, albeit at a more conservative rate than in the previous five-year period. As health concerns intensify, more consumers will seek all-natural and organic products, which are priced at a premium... purchase to read more Industry Report...

Words: 2318 - Pages: 10

Free Essay

Volvo Car Corporation

...Volvo Car Corporation The Volvo Car Corporation is a multi-million dollar operation; therefore, among the business comes finances, customers, employees, and various challenges; in which, all has to be manageable. With such a large company and with all these things taken into aspect, you have to wonder how the company manages to keep it all together and remain successful. With this evolving innovative world of technology corporations such as Volvo are able to implement new programs to improve the companies’ infrastructure. Volvo Car Corporation has done just that. In response the companies has integrated the cloud infrastructure into its networks by compiling data via internet and operates globally. It is now known that through the Volvo Corporation “data is now being captured for use within the vehicle itself, and also, increasingly, for transmission via the cloud back to the manufacturer” (“converting data into," 2011). In a recent case study it was stated that “Volvo is deploying a pilot solution based on Microsoft SQL Server 2012 Business Intelligence data management software and related BI technologies, including Microsoft SharePoint Server 2010 and Microsoft Office 2010” ("Volvo car corporation," 2012). The new advancement will provide clarity and efficiency, whereas, images and reports will be provided. The Power View, which will be included, creates a visual image for data and answers unexpected questions. In addition, SharePoint will enable employees and customers...

Words: 561 - Pages: 3

Premium Essay

Epileptic Seizure Detection Research Paper

...Seizure detection with Bigdata / Specific Problem, Gap Different technologies are available for neuroimaging e.g. magnetic resonance imaging (MRI), functional magnetic resonance imaging (FMRI) and electroencephalography (EEG) etc. The epileptic patients are normally monitored in the neurophysiological clinics using EEG, a non-invasive, multichannel technology for recording brain’s activity. Commonly used approach for epileptic seizure detection is the analysis of scalp EEG [3]. The technology used for scalp EEG is getting better rapidly. The scalp EEG used in clinics are capable of producing data at sampling rate of 2Khz. Furthermore in some studies; the number of channels used increased from tens to thousands [4]. To have an idea of the amount of data, a continuous EEG monitoring of a patient at 256 Hz with 24 channels can approximately generate 1GB data per day. With higher sampling rate and increased number of channels, EEG can produce far more data, e.g. 500GB per day [1]. All these characteristics make processing of EEG a compute intensive and data intensive task. Real time seizure detection...

Words: 840 - Pages: 4

Premium Essay

Role of Promotion Within the Marketing Mix for Mcdonald’s.

...In this task, I will be explaining the role of promotion within the marketing mix for McDonald’s. Product- The product is at the heart of the whole marketing process. A business must have the right types of products that meet the needs of the market, live up to customer’s expectations and deliver its said benefits. A quality product is one that meets both the wants and needs of its consumers. Example: “The Big Tasty, served with bacon, features a 100% beef patty with square cut lettuce, onions, two tomato slices, Big Tasty sauce and three slices of cheese, made with Emmental, all in a sesame-topped bun.” http://www.mcdonalds.co.uk/ukhome/product_nutrition.beef.204.big-tasty-with-bacon.html Price- Price is the next thing that is important within a company’s “marketing mix.” A promotional activity should inform customers of the price of the product being promoted. Although price may be an important deciding factor of the product being promoted, it also carries implications for quality and value. Pricing has a lot to do with how a product is perceived. If a product is priced too expensive for its perceived benefits, it will most likely not sell in significant volumes. However, if a product is priced too low, then it can easily be considered as somewhat inferior to its competition. In order for the company to find out how they can best fit into the competitive market, they must first take a good look at their competitor’s products and positioning. Example: Products such as...

Words: 981 - Pages: 4

Premium Essay

Essay On Rural Life

...basic need for the digital life. Transport – now-a-days the transport is also in the way of digital i.e., online reservation using different banking systems but the people who are living the rural areas do not know about such techniques. 8. Digital rural life in various countries RURAL URBAN Education Less than required level To required level Culture Diversified Diversified Equipment To required level More than required level Infrastructure To required level To required level Communication More than required level More than required level Transport As required As required Connectivity High High Banking To required level To required level Table3: urban and rural comparison 9. Effect of big data To provide the digital life in rural places, BIGDATA is the basic factor to store the information that was required to fulfill, as there will be lot of information about the people who are the residents of the rural places for the implementing of the big data application its is required to give the lot of the support to the people in the form the education, knowledge finance, man power , etc..., there are various several factors those which will come on the screen while the implementation is in process. As per the survey information even the big companies like IBM, Google, etc will be not able to store the information that will be collected from the rural places in the globe if they want to store the information they need to make their wings for than three times larger to achieve this goal...

Words: 1096 - Pages: 5