...example, while basic aggregation operations like SUM and AVG are part of SQL, there is no support for other commonly used operations like variance and co-variance. Such computations, as well as more advanced ones like regression and principal component analysis, are usually performed using statistical packages and libraries, such as SAS [1] and SPSS [2]. From the end user’s perspective, whether the statistical calculations are being performed in the database or in a statistical package can be quite transparent, especially from a functionality viewpoint. However, once the datasets to be analyzed grow beyond a certain size, the statistical package approach becomes infeasible, either due to its inability to handle large volumes of data, or the unacceptable computation times which make interactive analysis impossible. With the increasing sophistication of data collection instrumentation, and the cheap availability of large volume and high speed storage devices, most applications are today collecting data at unprecedented rates. In addition, an increasing number of applications today want the ability to perform interactive and on-line analysis of this data in real time, such as “what-if” analysis in forecasting. The emergence of multiple gigabyte corporate data...
Words: 11702 - Pages: 47
...Data Mining Practical Machine Learning Tools and Techniques The Morgan Kaufmann Series in Data Management Systems Series Editor: Jim Gray, Microsoft Research Data Mining: Practical Machine Learning Tools and Techniques, Second Edition Ian H. Witten and Eibe Frank Fuzzy Modeling and Genetic Algorithms for Data Mining and Exploration Earl Cox Data Modeling Essentials, Third Edition Graeme C. Simsion and Graham C. Witt Location-Based Services Jochen Schiller and Agnès Voisard Database Modeling with Microsoft® Visio for Enterprise Architects Terry Halpin, Ken Evans, Patrick Hallock, and Bill Maclean Designing Data-Intensive Web Applications Stefano Ceri, Piero Fraternali, Aldo Bongio, Marco Brambilla, Sara Comai, and Maristella Matera Mining the Web: Discovering Knowledge from Hypertext Data Soumen Chakrabarti Understanding SQL and Java Together: A Guide to SQLJ, JDBC, and Related Technologies Jim Melton and Andrew Eisenberg Database: Principles, Programming, and Performance, Second Edition Patrick O’Neil and Elizabeth O’Neil The Object Data Standard: ODMG 3.0 Edited by R. G. G. Cattell, Douglas K. Barry, Mark Berler, Jeff Eastman, David Jordan, Craig Russell, Olaf Schadow, Torsten Stanienda, and Fernando Velez Data on the Web: From Relations to Semistructured Data and XML Serge Abiteboul, Peter Buneman, and Dan Suciu Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations Ian H. Witten and Eibe Frank ...
Words: 191947 - Pages: 768
...Mining Third Edition This page intentionally left blank Data Mining Practical Machine Learning Tools and Techniques Third Edition Ian H. Witten Eibe Frank Mark A. Hall AMSTERDAM • BOSTON • HEIDELBERG • LONDON NEW YORK • OXFORD • PARIS • SAN DIEGO SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO Morgan Kaufmann Publishers is an imprint of Elsevier Morgan Kaufmann Publishers is an imprint of Elsevier 30 Corporate Drive, Suite 400, Burlington, MA 01803, USA This book is printed on acid-free paper. Copyright © 2011 Elsevier Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions. This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein). Notices Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely...
Words: 194698 - Pages: 779
...Semester Based Credit and Grading System) University of Mumbai, Information Technology (semester V and VI) (Rev-2012) Page 1 Preamble To meet the challenge of ensuring excellence in engineering education, the issue of quality needs to be addressed, debated and taken forward in a systematic manner. Accreditation is the principal means of quality assurance in higher education. The major emphasis of accreditation process is to measure the outcomes of the program that is being accredited. In line with this Faculty of Technology of University of Mumbai has taken a lead in incorporating philosophy of outcome based education in the process of curriculum development. Faculty of Technology, University of Mumbai, in one of its meeting unanimously resolved that, each Board of Studies shall prepare some Program Educational Objectives (PEO‟s) and give freedom to affiliated Institutes to add few (PEO‟s) and course objectives and course outcomes to be clearly defined for each course, so that all faculty members in affiliated institutes understand the depth and approach of course to be taught, which will enhance learner‟s learning process. It was also resolved that, maximum senior faculty from colleges and experts from industry to be involved while revising the curriculum. I am happy to state that, each Board of studies has adhered to the resolutions passed by Faculty of Technology, and developed curriculum accordingly. In addition to outcome based education, semester based credit...
Words: 10444 - Pages: 42
...solution. Indeed, we believe that this architecture will stimulate research, and more importantly organizations, to invest in Analytics and Statistical Fraud-Scoring to be used in conjunction with the already in-place preventive techniques. Therefore, in this research we explore different strategies to build a Streambased Fraud Detection solution, using advanced Data Mining Algorithms and Statistical Analysis, and show how they lead to increased accuracy in the detection of fraud by at least 78% in our reference dataset. We also discuss how a combination of these strategies can be embedded in a Stream-based application to detect fraud in real-time. From this perspective, our experiments lead to an average processing time of 111,702ms per transaction, while strategies to further improve the performance are discussed. Keywords: Fraud Detection, Stream Computing, Real-Time Analysis, Fraud, Data Mining, Retail Banking Industry, Data Preprocessing, Data Classification, Behavior-based Models, Supervised Analysis, Semi-supervised Analysis Sammanfattning Privatbankerna har drabbats hårt av bedrägerier de senaste åren. Bedragare har lyckats kringgå forskning och tillgängliga system och lura bankerna och deras kunder. Därför vill vi införa en ny, polyvalent...
Words: 56858 - Pages: 228
...solution. Indeed, we believe that this architecture will stimulate research, and more importantly organizations, to invest in Analytics and Statistical Fraud-Scoring to be used in conjunction with the already in-place preventive techniques. Therefore, in this research we explore different strategies to build a Streambased Fraud Detection solution, using advanced Data Mining Algorithms and Statistical Analysis, and show how they lead to increased accuracy in the detection of fraud by at least 78% in our reference dataset. We also discuss how a combination of these strategies can be embedded in a Stream-based application to detect fraud in real-time. From this perspective, our experiments lead to an average processing time of 111,702ms per transaction, while strategies to further improve the performance are discussed. Keywords: Fraud Detection, Stream Computing, Real-Time Analysis, Fraud, Data Mining, Retail Banking Industry, Data Preprocessing, Data Classification, Behavior-based Models, Supervised Analysis, Semi-supervised Analysis Sammanfattning Privatbankerna har drabbats hårt av bedrägerier de senaste åren. Bedragare har lyckats kringgå forskning och tillgängliga system och lura bankerna och deras kunder. Därför vill vi införa en ny, polyvalent...
Words: 56858 - Pages: 228
...Fundamentals of Database Systems Preface....................................................................................................................................................12 Contents of This Edition.....................................................................................................................13 Guidelines for Using This Book.........................................................................................................14 Acknowledgments ..............................................................................................................................15 Contents of This Edition.........................................................................................................................17 Guidelines for Using This Book.............................................................................................................19 Acknowledgments ..................................................................................................................................21 About the Authors ..................................................................................................................................22 Part 1: Basic Concepts............................................................................................................................23 Chapter 1: Databases and Database Users..........................................................................................23 ...
Words: 229471 - Pages: 918
...® OCA Oracle Database 11g: SQL Fundamentals I Exam Guide (Exam 1Z0-051) ABOUT THE AUTHORS John Watson (Oxford, UK) works for BPLC Management Consultants, teaching and consulting throughout Europe and Africa. He was with Oracle University for several years in South Africa, and before that worked for a number of companies, government departments, and NGOs in England and Europe. He is OCP qualified in both database and Application Server administration. John is the author of several books and numerous articles on technology and has 25 years of experience in IT. Roopesh Ramklass (South Africa), OCP, is an independent Oracle specialist with over 10 years of experience in a wide variety of IT environments. These include software design and development, systems analysis, courseware development, and lecturing. He has worked for Oracle Support and taught at Oracle University in South Africa for several years. Roopesh is experienced in managing and executing IT development projects, including infrastructure systems provisioning, software development, and systems integration. About the Technical Editor Bruce Swart (South Africa) works for 2Cana Solutions and has over 14 years of experience in IT. Whilst maintaining a keen interest for teaching others, he has performed several roles including developer, analyst, team leader, administrator, project manager, consultant, and lecturer. He is OCP qualified in both database and developer roles. He has taught at Oracle University...
Words: 150089 - Pages: 601
...Oracle® Database Concepts 10g Release 2 (10.2) B14220-02 October 2005 Oracle Database Concepts, 10g Release 2 (10.2) B14220-02 Copyright © 1993, 2005, Oracle. All rights reserved. Primary Author: Michele Cyran Contributing Author: Paul Lane, JP Polk Contributor: Omar Alonso, Penny Avril, Hermann Baer, Sandeepan Banerjee, Mark Bauer, Bill Bridge, Sandra Cheevers, Carol Colrain, Vira Goorah, Mike Hartstein, John Haydu, Wei Hu, Ramkumar Krishnan, Vasudha Krishnaswamy, Bill Lee, Bryn Llewellyn, Rich Long, Diana Lorentz, Paul Manning, Valarie Moore, Mughees Minhas, Gopal Mulagund, Muthu Olagappan, Jennifer Polk, Kathy Rich, John Russell, Viv Schupmann, Bob Thome, Randy Urbano, Michael Verheij, Ron Weiss, Steve Wertheimer The Programs (which include both the software and documentation) contain proprietary information; they are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright, patent, and other intellectual and industrial property laws. Reverse engineering, disassembly, or decompilation of the Programs, except to the extent required to obtain interoperability with other independently created software or as specified by law, is prohibited. The information contained in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. This document is not warranted to be error-free. Except as may be expressly permitted in your license agreement...
Words: 199783 - Pages: 800
...sources and reproduced, with permission, in this textbook appear on appropriate page within text. Microsoft® and Windows® are registered trademarks of the Microsoft Corporation in the U.S.A. and other countries. Screen shots and icons reprinted with permission from the Microsoft Corporation. This book is not sponsored or endorsed by or affiliated with the Microsoft Corporation. Copyright © 2011, 2009, 2007, 2005, 2002 Pearson Education, Inc., publishing as Prentice Hall, One Lake Street, Upper Saddle River, New Jersey 07458. All rights reserved. Manufactured in the United States of America. This publication is protected by Copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means,...
Words: 193467 - Pages: 774
...DATABASE MODELING AND DESIGN The Morgan Kaufmann Series in Data Management Systems (Selected Titles) Joe Celko’s Data, Measurements and Standards in SQL Joe Celko Information Modeling and Relational Databases, 2nd Edition Terry Halpin, Tony Morgan Joe Celko’s Thinking in Sets Joe Celko Business Metadata Bill Inmon, Bonnie O’Neil, Lowell Fryman Unleashing Web 2.0 Gottfried Vossen, Stephan Hagemann Enterprise Knowledge Management David Loshin Business Process Change, 2nd Edition Paul Harmon IT Manager’s Handbook, 2nd Edition Bill Holtsnider & Brian Jaffe Joe Celko’s Puzzles and Answers, 2 Joe Celko nd Location-Based Services ` Jochen Schiller and Agnes Voisard Managing Time in Relational Databases: How to Design, Update and Query Temporal Data Tom Johnston and Randall Weis Database Modeling with MicrosoftW Visio for Enterprise Architects Terry Halpin, Ken Evans, Patrick Hallock, Bill Maclean Designing Data-Intensive Web Applications Stephano Ceri, Piero Fraternali, Aldo Bongio, Marco Brambilla, Sara Comai, Maristella Matera Mining the Web: Discovering Knowledge from Hypertext Data Soumen Chakrabarti Advanced SQL: 1999—Understanding Object-Relational and Other Advanced Features Jim Melton Database Tuning: Principles, Experiments, and Troubleshooting Techniques Dennis Shasha, Philippe Bonnet SQL: 1999—Understanding Relational Language Components Jim Melton, Alan R. Simon Information Visualization in Data Mining and Knowledge Discovery Edited by Usama Fayyad, Georges G. Grinstein...
Words: 89336 - Pages: 358
...1. An IS auditor is reviewing access to an application to determine whether the 10 most recent "new user" forms were correctly authorized. This is an example of: A. variable sampling. B. substantive testing. C. compliance testing. D. stop-or-go sampling. The correct answer is: C. compliance testing. Explanation: Compliance testing determines whether controls are being applied in compliance with policy. This includes tests to determine whether new accounts were appropriately authorized. Variable sampling is used to estimate numerical values, such as dollar values. Substantive testing substantiates the integrity of actual processing, such as balances on financial statements. The development of substantive tests is often dependent on the outcome of compliance tests. If compliance tests indicate that there are adequate internal controls, then substantive tests can be minimized. Stop-or-go sampling allows a test to be stopped as early as possible and is not appropriate for checking whether procedures have been followed. 2. The decisions and actions of an IS auditor are MOST likely to affect which of the following risks? A. Inherent B. Detection C. Control D. Business The correct answer is: B. Detection Explanation: Detection risks are directly affected by the auditor's selection of audit procedures and techniques. Inherent risks usually are not affected by the IS auditor. Control risks are controlled by the actions of the company's management. Business...
Words: 97238 - Pages: 389
...Shipp Marketing Coordinator: Suellen Ruttkay Content Product Manager: Matthew Hutchinson Senior Art Director: Stacy Jenkins Shirley Cover Designer: Itzhack Shelomi Cover Image: iStock Images Media Editor: Chris Valentine Manufacturing Coordinator: Julio Esperas Copyeditor: Andrea Schein Proofreader: Foxxe Editorial Indexer: Elizabeth Cunningham Composition: GEX Publishing Services © 2011 Cengage Learning ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher. For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706 For permission to use material from this text or product, submit all requests online at www.cengage.com/permissions Further permissions...
Words: 189848 - Pages: 760
...Front cover WebSphere Service Registry and Repository Handbook Best practices Sample integration scenarios SOA governance Chris Dudley Laurent Rieu Martin Smithson Tapan Verma Byron Braswell ibm.com/redbooks International Technical Support Organization WebSphere Service Registry and Repository Handbook March 2007 SG24-7386-00 Note: Before using this information and the product it supports, read the information in “Notices” on page xv. First Edition (March 2007) This edition applies to Version 6, Release 0, Modification 0.1 of IBM WebSphere Service Registry and Repository (product number 5724-N72). © Copyright International Business Machines Corporation 2007. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii The team that wrote this redbook. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii Become a published author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . ...
Words: 163740 - Pages: 655