...A Statistical Perspective on Data Mining Ranjan Maitra Abstract Technological advances have led to new and automated data collection methods. Datasets once at a premium are often plentiful nowadays and sometimes indeed massive. A new breed of challenges is thus presented – primary among them is the need for methodology to analyze such masses of data with a view to understanding complex phenomena and relationships. Such capability is provided by data mining which combines core statistical techniques with those from machine intelligence. This article reviews the current state of the discipline from a statistician’s perspective, illustrates issues with real-life examples, discusses the connections with statistics, the differences, the failings and the challenges ahead. 1 Introduction The information age has been matched by an explosion of data. This surfeit has been a result of modern, improved and, in many cases, automated methods for both data collection and storage. For instance, many stores tag their items with a product-specific bar code, which is scanned in when the corresponding item is bought. This automatically creates a gigantic repository of information on products and product combinations sold. Similar databases are also created by automated book-keeping, digital communication tools or by remote sensing satellites, and aided by the availability of affordable and effective storage mechanisms – magnetic tapes, data warehouses and so on. This has created a situation...
Words: 22784 - Pages: 92
...Active Learning with Support Vector Machines Kim Steenstrup Pedersen Department of Computer Science University of Copenhagen 2200 Copenhagen, Denmark kimstp@di.ku.dk Jan Kremer Department of Computer Science University of Copenhagen 2200 Copenhagen, Denmark jan.kremer@di.ku.dk Christian Igel Department of Computer Science University of Copenhagen 2200 Copenhagen, Denmark igel@di.ku.dk Abstract In machine learning, active learning refers to algorithms that autonomously select the data points from which they will learn. There are many data mining applications in which large amounts of unlabeled data are readily available, but labels (e.g., human annotations or results from complex experiments) are costly to obtain. In such scenarios, an active learning algorithm aims at identifying data points that, if labeled and used for training, would most improve the learned model. Labels are then obtained only for the most promising data points. This speeds up learning and reduces labeling costs. Support vector machine (SVM) classifiers are particularly well-suited for active learning due to their convenient mathematical properties. They perform linear classification, typically in a kernel-induced feature space, which makes measuring the distance of a data point from the decision boundary straightforward. Furthermore, heuristics can efficiently estimate how strongly learning from a data point influences the current model. This information can be used to actively...
Words: 9180 - Pages: 37
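The active-learning excerpt above notes that, for an SVM, measuring the distance of a data point from the decision boundary is straightforward. A minimal sketch of how that distance could drive label selection follows. It assumes a linear decision function w.x + b and assumes the common heuristic of querying the unlabeled point closest to the boundary; the function names, the weight vector and the data pool are hypothetical and are not taken from the paper.

```python
import math

def distance_to_boundary(w, b, x):
    """Geometric distance of point x from the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(score) / norm

def pick_query(w, b, unlabeled):
    """Index of the unlabeled point closest to the current decision boundary."""
    return min(range(len(unlabeled)),
               key=lambda k: distance_to_boundary(w, b, unlabeled[k]))

# Hypothetical linear model and unlabeled pool, for illustration only.
w, b = [1.0, 0.0], 0.0
pool = [[3.0, 1.0], [0.2, 5.0], [-2.0, 0.0]]
query_index = pick_query(w, b, pool)  # the point whose label would be requested next
```

In a kernel-induced feature space the same idea would use the kernel decision value in place of w.x + b; the linear form is used here only to keep the sketch self-contained.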
...What is Love? A Conceptual Analysis of "Love", focusing on the Love Theories of Plato, St. Augustine and Freud Nico Nuyens GRIPh Working Papers No. 0901 This paper can be downloaded without charge from the GRIPh Working Paper Series website: http//www.rug.nl/filosofie/GRIPh/workingpapers CONTENTS: INTRODUCTION; 1. FORMAL ANALYSIS OF LOVE; 2. SEMANTIC ANALYSIS OF LOVE; 3. HISTORICAL ANALYSIS OF LOVE; 3.1 ANCIENT GREEK PHILOSOPHY: PLATO; 3.2 CHRISTIAN PHILOSOPHY: SAINT AUGUSTINE; 3.3 MODERN PHILOSOPHY: FREUD; 4. COMPARATIVE EVALUATION; CONCLUSIONS; REFERENCES...
Words: 19634 - Pages: 79
...Digital Image Processing: PIKS Inside, Third Edition. William K. Pratt Copyright © 2001 John Wiley & Sons, Inc. ISBNs: 0-471-37407-5 (Hardback); 0-471-22132-5 (Electronic) DIGITAL IMAGE PROCESSING PIKS Inside Third Edition WILLIAM K. PRATT PixelSoft, Inc. Los Altos, California A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York • Chichester • Weinheim • Brisbane • Singapore • Toronto Designations used by companies to distinguish their products are often claimed as trademarks. In all instances where John Wiley & Sons, Inc., is aware of a claim, the product names appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and registration. Copyright 2001 by John Wiley and Sons, Inc., New York. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic or mechanical, including uploading, downloading, printing, decompiling, recording or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012, (212) 850-6011, fax (212) 850-6008, E-Mail: PERMREQ@WILEY.COM. This publication is designed...
Words: 173795 - Pages: 696
...Employers, job seekers, and puzzle lovers everywhere delight in William Poundstone's HOW WOULD YOU MOVE MOUNT FUJI? "Combines how-to with be-smart for an audience of job seekers, interviewers, Wired-style cognitive science hobbyists, and the onlooking curious. . . . How Would You Move Mount Fuji? gallops down entertaining sidepaths about the history of intelligence testing, the origins of Silicon Valley, and the brain-jockey heroics of Microsoft culture." — Michael Erard, Austin Chronicle "A charming Trojan Horse of a book. While this slim book is ostensibly a guide to cracking the cult of the puzzle in Microsoft's hiring practices, Poundstone manages to sneak in a wealth of material on the crucial issue of how to hire in today's knowledge-based economy. How Would You Move Mount Fuji? delivers on the promise of revealing the tricks to Microsoft's notorious hiring challenges. But, more important, Poundstone, an accomplished science journalist, shows how puzzles can — and cannot — identify the potential stars of a competitive company.... Poundstone gives smart advice to candidates on how to 'pass' the puzzle game.... Of course, let's not forget the real fun of the book: the puzzles themselves." — Tom Ehrenfeld, Boston Globe "A dead-serious book about recruiting practices and abstract reasoning — presented as a puzzle game.... Very, very valuable to some job applicants — the concepts being more important than the answers. It would have usefulness as well to interviewers with...
Words: 78201 - Pages: 313
...482 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 42, NO. 2, APRIL 2012 An Adaptive Differential Evolution Algorithm With Novel Mutation and Crossover Strategies for Global Numerical Optimization Sk. Minhazul Islam, Swagatam Das, Member, IEEE, Saurav Ghosh, Subhrajit Roy, and Ponnuthurai Nagaratnam Suganthan, Senior Member, IEEE Abstract—Differential evolution (DE) is one of the most powerful stochastic real parameter optimizers of current interest. In this paper, we propose a new mutation strategy, a fitness-induced parent selection scheme for the binomial crossover of DE, and a simple but effective scheme of adapting two of its most important control parameters with an objective of achieving improved performance. The new mutation operator, which we call DE/current-to-gr_best/1, is a variant of the classical DE/current-to-best/1 scheme. It uses the best of a group (whose size is q% of the population size) of randomly selected solutions from current generation to perturb the parent (target) vector, unlike DE/current-to-best/1 that always picks the best vector of the entire population to perturb the target vector. In our modified framework of recombination, a biased parent selection scheme has been incorporated by letting each mutant undergo the usual binomial crossover with one of the p top-ranked individuals from the current population and not with the target vector with the same index as used in all variants of DE. A DE variant obtained...
Words: 11062 - Pages: 45
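The mutation operator described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the classical DE/current-to-best/1 form, target + F * (best - target) + F * (x_r1 - x_r2), and swaps the population-wide best for the best member of a randomly drawn group. Minimization is assumed, as is drawing r1 and r2 distinct from the target index. The function name and the default values of F and q are hypothetical and do not come from the paper.

```python
import random

def current_to_gr_best_mutant(population, fitness, i, F=0.8, q=0.15, rng=random):
    """Build one mutant vector for target index i (lower fitness is better)."""
    n = len(population)
    # Draw a random group whose size is a fraction q of the population.
    group_size = max(1, round(q * n))
    group = rng.sample(range(n), group_size)
    # Best member of that group, not of the whole population.
    gr_best = min(group, key=lambda k: fitness[k])
    # Two distinct random individuals, both different from the target.
    r1, r2 = rng.sample([k for k in range(n) if k != i], 2)
    target, best = population[i], population[gr_best]
    return [t + F * (g - t) + F * (a - c)
            for t, g, a, c in zip(target, best, population[r1], population[r2])]

# Hypothetical usage: 20 random 3-dimensional vectors scored by the sphere function.
rng = random.Random(0)
pop = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(20)]
fit = [sum(v * v for v in x) for x in pop]
mutant = current_to_gr_best_mutant(pop, fit, i=4, rng=rng)
```

Setting q to 1.0 makes the group the whole population, which recovers the classical DE/current-to-best/1 behaviour that the abstract contrasts against.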
...Operations Research Copyright © 2007, 2005 New Age International (P) Ltd., Publishers Published by New Age International (P) Ltd., Publishers All rights reserved. No part of this ebook may be reproduced in any form, by photostat, microfilm, xerography, or any other means, or incorporated into any information retrieval system, electronic or mechanical, without the written permission of the publisher. All inquiries should be emailed to rights@newagepublishers.com ISBN (13) : 978-81-224-2944-2 PUBLISHING FOR ONE WORLD NEW AGE INTERNATIONAL (P) LIMITED, PUBLISHERS 4835/24, Ansari Road, Daryaganj, New Delhi - 110002 Visit us at www.newagepublishers.com PREFACE I started my teaching career in the year 1964. I taught Production Engineering subjects till 1972. In the year 1972 I registered for the Industrial Engineering examination at National Institution of Industrial Engineering, Bombay. Since then, I have shifted my field of interest to Industrial Engineering subjects and started teaching related subjects. One such subject is OPERATIONS RESEARCH. After teaching these subjects till my retirement in the year 2002, it is my responsibility to help the students with a book on Operations Research. The first volume of the book is LINEAR PROGRAMMING MODELS. This was published in the year 2003. Now I am giving this book OPERATIONS RESEARCH, with other chapters to students, with a hope that it will help them to understand...
Words: 242596 - Pages: 971
...Guidelines on Credit Risk Management Rating Models and Validation These guidelines were prepared by the Oesterreichische Nationalbank (OeNB) in cooperation with the Financial Market Authority (FMA) Published by: Oesterreichische Nationalbank (OeNB) Otto Wagner Platz 3, 1090 Vienna, Austria Austrian Financial Market Authority (FMA) Praterstrasse 23, 1020 Vienna, Austria Produced by: Oesterreichische Nationalbank Editor in chief: Gunther Thonabauer, Secretariat of the Governing Board and Public Relations (OeNB) Barbara Nosslinger, Staff Department for Executive Board Affairs and Public Relations (FMA) Editorial processing: Doris Datschetzky, Yi-Der Kuo, Alexander Tscherteu, (all OeNB) Thomas Hudetz, Ursula Hauser-Rethaller (all FMA) Design: Peter Buchegger, Secretariat of the Governing Board and Public Relations (OeNB) Typesetting, printing, and production: OeNB Printing Office Published and produced at: Otto Wagner Platz 3, 1090 Vienna, Austria Inquiries: Oesterreichische Nationalbank Secretariat of the Governing Board and Public Relations Otto Wagner Platz 3, 1090 Vienna, Austria Postal address: PO Box 61, 1011 Vienna, Austria Phone: (+43-1) 404 20-6666 Fax: (+43-1) 404 20-6696 Orders: Oesterreichische Nationalbank Documentation Management and Communication Systems Otto Wagner Platz 3, 1090 Vienna, Austria Postal address: PO Box 61, 1011 Vienna, Austria Phone: (+43-1) 404 20-2345 Fax: (+43-1) 404 20-2398 Internet: ...
Words: 60860 - Pages: 244
...NATIONAL INSTITUTE OF TECHNOLOGY SILCHAR Bachelor of Technology Programmes Syllabi and Regulations for Undergraduate PROGRAMME OF STUDY (wef 2012 entry batch) Course Structure for B.Tech (4 years, 8 Semester Course) Civil Engineering (to be applicable from 2012 entry batch onwards)
Semester-1 (Course No, Course Name, L, T, P, C):
CH-1101/PH-1101 Chemistry/Physics 3 1 0 8
EE-1101 Basic Electrical Engineering 3 0 0 6
MA-1101 Mathematics-I 3 1 0 8
CE-1101 Engineering Graphics 1 0 3 5
HS-1101 Communication Skills 3 0 0 6
CH-1111/PH-1111 Chemistry/Physics Laboratory 0 0 2 2
ME-1111 Workshop 0 0 3 3
Physical Training-I 0 0 2 0
NCC/NSO/NSS 0 0 2 0
Total 13 2 8 38
Semester-2: EC-1101 Basic Electronics; CS-1101 Introduction to Computing; MA-1102 Mathematics-II; ME-1101 Engineering Mechanics; PH-1101/CH-1101 Physics/Chemistry; CS-1111 Computing Laboratory; EE-1111 Electrical Science Laboratory; PH-1111/CH-1111 Physics/Chemistry Laboratory; Physical Training-II; NCC/NSO/NSS
Semester-4: Structural Analysis-I; Hydraulics; Environmental Engg-I; Structural Design-I; Managerial Economics; Engg. Geology Laboratory; Hydraulics Laboratory; Physical Training-IV; NCC/NSO/NSS
Semester-6: Structural Design-II; Structural Analysis-III; Foundation Engineering; Transportation Engineering-II; Hydrology & Flood...
Words: 126345 - Pages: 506
...9-802-003 REV: OCTOBER 25, 2004 LYNDA M. APPLEGATE NEO BOON SIONG NANCY BARTLETT DOLLY CHANG-LEOW PSA: The World’s Port of Call Shakkei is a Japanese landscaping strategy. It means “borrowed scenery.” If you can integrate the distant scenery into the landscape of your garden, a beautiful garden can be created . . . A good landscaper is able to bring about this kind of integration. This same philosophy is true within Singapore today. If we want to realize the full potential of Singapore as a global business hub, we must leverage global resources to overcome our constraints and limitations . . . A small country is no longer small. This is our strategy to transform Singapore for the 21st century and beyond.1 Corporatised on October 1, 1997, after 33 years as the Port of Singapore Authority (PSA), the mission of PSA was to be the “World’s Port of Call.” A favorite lunch stop for many PSA visitors was the Prima Revolving Restaurant, located just outside the Brani Gate entrance to the port. From this lofty perch, Singapore harbor, port facilities, and operations could be viewed. On a sunny day in late 2000, PSA group president and former chief executive officer for Singapore’s urban redevelopment agency, Khoo Teng Chye, was entertaining a group of visitors. The panoramic view of the sea was dotted with container ships of all sizes, flying flags of many nations. As the restaurant rotated, the massive port infrastructure came into view with its many berths...
Words: 11892 - Pages: 48
...DATABASE MODELING AND DESIGN The Morgan Kaufmann Series in Data Management Systems (Selected Titles) Joe Celko’s Data, Measurements and Standards in SQL Joe Celko Information Modeling and Relational Databases, 2nd Edition Terry Halpin, Tony Morgan Joe Celko’s Thinking in Sets Joe Celko Business Metadata Bill Inmon, Bonnie O’Neil, Lowell Fryman Unleashing Web 2.0 Gottfried Vossen, Stephan Hagemann Enterprise Knowledge Management David Loshin Business Process Change, 2nd Edition Paul Harmon IT Manager’s Handbook, 2nd Edition Bill Holtsnider & Brian Jaffe Joe Celko’s Puzzles and Answers, 2nd Joe Celko Location-Based Services Jochen Schiller and Agnes Voisard Managing Time in Relational Databases: How to Design, Update and Query Temporal Data Tom Johnston and Randall Weis Database Modeling with Microsoft® Visio for Enterprise Architects Terry Halpin, Ken Evans, Patrick Hallock, Bill Maclean Designing Data-Intensive Web Applications Stephano Ceri, Piero Fraternali, Aldo Bongio, Marco Brambilla, Sara Comai, Maristella Matera Mining the Web: Discovering Knowledge from Hypertext Data Soumen Chakrabarti Advanced SQL: 1999—Understanding Object-Relational and Other Advanced Features Jim Melton Database Tuning: Principles, Experiments, and Troubleshooting Techniques Dennis Shasha, Philippe Bonnet SQL: 1999—Understanding Relational Language Components Jim Melton, Alan R. Simon Information Visualization in Data Mining and Knowledge Discovery Edited by Usama Fayyad, Georges G. Grinstein...
Words: 89336 - Pages: 358
...ETHICS IN INFORMATION TECHNOLOGY Third Edition George W. Reynolds Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States Ethics in Information Technology, Third Edition by George W. Reynolds VP/Editorial Director: Jack Calhoun Publisher: Joe Sabatino Senior Acquisitions Editor: Charles McCormick Jr. Senior Product Manager: Kate Hennessy Mason Development Editor: Mary Pat Shaffer Editorial Assistant: Nora Heink Marketing Manager: Bryant Chrzan Marketing Coordinator: Suellen Ruttkay Content Product Manager: Jennifer Feltri Senior Art Director: Stacy Jenkins Shirley Cover Designer: Itzhack Shelomi Cover Image: iStock Images Technology Project Manager: Chris Valentine Manufacturing Coordinator: Julio Esperas Copyeditor: Green Pen Quality Assurance Proofreader: Suzanne Huizenga Indexer: Alexandra Nickerson Composition: Pre-Press PMG © 2010 Course Technology, Cengage Learning ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission...
Words: 204343 - Pages: 818
...PART 2 The Global Marketing Environment CHAPTER 2 The Global Economic Environment Case 2-1 The Global Economic Crisis In his 1997 book One World, Ready or Not, William Greider described the United States as “the buyer of last resort.” Greider explained that, for many years, the United States was the only nation that was willing to absorb production surpluses exported by companies in Europe, Asia, and Latin America. Greider asked: “Who will buy the surpluses when the United States cannot?” The conventional wisdom has long held that strong spending by consumers in other nations would keep the world economy humming. However, by 2008, Greider’s question was taking on a new urgency and the conventional wisdom was being tested. An economic crisis that had its roots in lax subprime mortgage lending practices began to spread around the globe. In the United States, where the crisis began, economic misery was widespread: The housing market collapsed, real estate values plummeted, credit tightened, and job growth slowed (see Exhibit 2-1). As the price of oil passed the $100 per barrel benchmark, the average price of a gallon of gasoline rose to $4. American consumers were, indeed, less willing and less able to buy. However, the crisis was not confined to the United States alone. Consumer-goods exporters in Asia, which Exhibit 2-1: The bursting of the global real estate bubble was only one aspect of the worst recession in decades. The ripple effects from the economic...
Words: 24814 - Pages: 100
...Answers to Conceptual Integrated Science End-of-Chapter Questions Chapter 1: About Science Answers to Chapter 1 Review Questions 1 The era of modern science in the 16th century was launched when Galileo Galilei revived the Copernican view of the heliocentric universe, using experiments to study nature’s behavior. 2 In Conceptual Integrated Science, we believe that focusing on math too early is a poor substitute for concepts. 3 We mean that it must be capable of being proved wrong. 4 Nonscientific hypotheses may be perfectly reasonable; they are nonscientific only because they are not falsifiable—there is no test for possible wrongness. 5 Galileo showed the falseness of Aristotle’s claim with a single experiment—dropping heavy and light objects from the Leaning Tower of Pisa. 6 A scientific fact is something that competent observers can observe and agree to be true; a hypothesis is an explanation or answer that is capable of being proved wrong; a law is a hypothesis that has been tested over and over and not contradicted; a theory is a synthesis of facts and well-tested hypotheses. 7 In everyday speech, a theory is the same as a hypothesis—a statement that hasn’t been tested. 8 Theories grow stronger and more precise as they evolve to include new information. 9 The term supernatural literally means “above nature.” Science works within nature, not above it. 10 They rely on subjective personal experience and do not lead to testable hypotheses. They lie outside...
Words: 81827 - Pages: 328
...Infosys Annual Report 2014-15 A tribute to our founders Narayana Murthy, Nandan M. Nilekani, S. Gopalakrishnan, K. Dinesh The year 2014 was a milestone in our Company's history, when we bid farewell to three of our founders who held executive positions in the Company during the year – Narayana Murthy, S. Gopalakrishnan and S. D. Shibulal. Narayana Murthy stepped down as the Chairman of the Board on October 10, 2014. His vision, leadership and guidance have been an inspiration to Infosys, the Indian IT industry and an entire generation of technology entrepreneurs. He propelled the Company into accomplishing many firsts and in setting industry benchmarks on several fronts. He espoused the highest level of corporate governance standards that have defined Infosys over the years and made us a globally respected corporation. Between June 2013 and October 2014, he guided the Company through a period of stabilization and leadership transition. S. Gopalakrishnan stepped down as Vice Chairman of the Board on October 10, 2014. Kris, as he is popularly known, served the Company in several capacities over the last 33 years. As the Chief Executive Officer between 2007 and 2011, he steered the Company at a time when the world was faced with economic crises. Ranked as a global thought leader, Kris has led the technological evolution of the Company. S. D. Shibulal stepped down as the Company's Chief Executive Officer on July...
Words: 136409 - Pages: 546