...approximately equal to the variance of the population divided by each sample's size. This result is useful when examining returns for a given stock or index because it simplifies many analysis procedures. An appropriate sample size depends on the data available, but generally speaking a sample of at least 50 observations is sufficient, and because financial data are easy to generate, much larger samples are often available. • Null Hypothesis: states the (numerical) assumption to be tested, for example "The average number of TV sets in U.S. homes is at least three" (H0: μ ≥ 3). It is always about a population parameter, never a sample statistic (✓ H0: μ ≥ 3; ✗ H0: x̄ ≥ 3). Testing always begins with the assumption that the null hypothesis is true, much like the presumption of innocence until proven guilty. The null hypothesis refers to the status quo, always contains an "=", "≤" or "≥" sign, and may or may not be rejected. • The Alternative Hypothesis: is the opposite of the null hypothesis, e.g. "The average number of TV sets in U.S. homes is less than three" (HA: μ < 3). It challenges the status quo...
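A minimal sketch of how the TV-sets hypothesis above could be tested, assuming a small made-up sample and SciPy's one-sample t-test (the one-sided alternative keyword requires SciPy 1.6 or later):

```python
# Sketch of the TV-sets example: H0: mu >= 3 vs HA: mu < 3.
# The sample values below are hypothetical, purely for illustration.
from scipy import stats

tv_sets = [2, 3, 1, 3, 2, 4, 2, 3, 1, 2, 3, 2]  # hypothetical survey data

# One-sample t-test; alternative="less" gives the one-sided p-value for HA: mu < 3.
t_stat, p_value = stats.ttest_1samp(tv_sets, popmean=3, alternative="less")

alpha = 0.05
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
if p_value < alpha:
    print("Reject H0: evidence that the mean is below 3.")
else:
    print("Fail to reject H0: insufficient evidence that the mean is below 3.")
```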
Words: 1168 - Pages: 5
...Reporting Statistics in APA Style Dr. Jeffrey Kahn, Illinois State University The following examples illustrate how to report statistics in the text of a research report. You will note that significance levels in journal articles--especially in tables--are often reported as either "p > .05," "p < .05," "p < .01," or "p < .001." APA style dictates reporting the exact p value within the text of a manuscript (unless the p value is less than .001). Please pay attention to issues of italics and spacing. APA style is very precise about these. Also, with the exception of some p values, most statistics should be rounded to two decimal places. Mean and Standard Deviation are most clearly presented in parentheses: The sample as a whole was relatively young (M = 19.22, SD = 3.45). The average age of students was 19.22 years (SD = 3.45). Percentages are also most clearly displayed in parentheses with no decimal places: Nearly half (49%) of the sample was married. Chi-Square statistics are reported with degrees of freedom and sample size in parentheses, the Pearson chi-square value (rounded to two decimal places), and the significance level: The percentage of participants that were married did not differ by gender, χ2(1, N = 90) = 0.89, p = .35. T Tests are reported like chi-squares, but only the degrees of freedom are in parentheses. Following that, report the t statistic (rounded to two decimal places) and the significance level. There was a significant effect for...
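As a rough illustration (not part of the APA guide itself), the reported strings above can be produced programmatically; the numbers simply echo the examples in the text, and the small helper function is our own:

```python
# Sketch: formatting results in APA style from raw statistics.
# Numeric values echo the examples in the text; apa_p is a hypothetical helper.
def apa_p(p):
    # APA: report the exact p value (no leading zero), but "p < .001" below that.
    return "p < .001" if p < .001 else f"p = {p:.2f}".replace("0.", ".")

mean, sd = 19.22, 3.45
print(f"The sample as a whole was relatively young (M = {mean:.2f}, SD = {sd:.2f}).")

chi2, df, n, p = 0.89, 1, 90, 0.35
print(f"χ2({df}, N = {n}) = {chi2:.2f}, {apa_p(p)}")
```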
Words: 570 - Pages: 3
...TERM END EXAMINATIONS, MARCH-2013 BACHELOR OF COMMERCE, YEAR III ELEMENTARY STATISTICS Time: 3 hours M.Marks: 60
SECTION A
Note: Attempt any 4 questions. All questions carry equal marks. (4 X 5) The answer should be limited to 200 words.
1) What is statistics? Explain the nature and limitations of statistics.
2) What is a frequency distribution? What are the different types of frequency distribution?
3) What is a frequency curve? Explain the cumulative frequency curve with an example.
4) Suppose the mean of a series of 5 items is 30. Four values are, respectively, 10, 15, 30 and 35. Estimate the missing 5th value of the series.
ANSWER: Mean = (10 + 15 + 30 + 35 + x)/5 = 30, therefore x = (30 × 5) − (10 + 15 + 30 + 35) = 150 − 90, hence x = 60.
5) Calculate the median of the following distribution of data.
Class interval | 0-5 | 5-10 | 10-20 | 20-30 | 30-50 | 50-70 | 70-100
Frequency      | 12  | 15   | 25    | 40    | 42    | 14    | 8
ANSWER: n = 12 + 15 + 25 + 40 + 42 + 14 + 8 = 156, so the median lies at the average of the n/2-th and (n/2 + 1)-th positions, i.e. the 78th and 79th observations.
Class interval       | 0-5 | 5-10 | 10-20 | 20-30 | 30-50 | 50-70 | 70-100
Frequency            | 12  | 15   | 25    | 40    | 42    | 14    | 8
Cumulative frequency | 12  | 27   | 52    | 92    | 134   | 148   | 156
Both positions fall in the 20-30 class, so that is the median class. 6) Calculate the coefficient of correlation...
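A short sketch checking questions 4 and 5 in Python, using the standard grouped-data interpolation formula for the median (the exam answer itself stops at locating the 78th/79th positions):

```python
# Q4: missing value from the mean; Q5: grouped median via L + ((n/2 - CF) / f) * h.

# Q4: mean of 5 items is 30, four items are 10, 15, 30, 35.
known = [10, 15, 30, 35]
missing = 30 * 5 - sum(known)        # 150 - 90 = 60
print("Missing item:", missing)

# Q5: grouped median.
classes = [(0, 5), (5, 10), (10, 20), (20, 30), (30, 50), (50, 70), (70, 100)]
freqs   = [12, 15, 25, 40, 42, 14, 8]

n = sum(freqs)                        # 156, so n/2 = 78
half, cum = n / 2, 0
for (low, high), f in zip(classes, freqs):
    if cum + f >= half:               # median class found (20-30 here)
        median = low + (half - cum) / f * (high - low)
        break
    cum += f
print("Median:", median)              # 20 + (78 - 52)/40 * 10 = 26.5
```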
Words: 1424 - Pages: 6
...Introduction This essay focuses on hypothesis testing in order to provide sufficient statistical support for decisions about location, promotion, and pricing strategy. More specifically, several hypothesis testing methods are used in the following essay, including the one-sample t-test, independent-samples t-test, paired-samples t-test, ANOVA and the chi-square test, to achieve those goals. How much are potential patrons willing to pay for the entrées? Is the $18 amount from the forecasting model correct?
Graph 1: The frequency of the price of an evening meal entrée item alone
Based on the data screening process, I believe that 60 observations are logically inconsistent: these 60 responses for the price customers would be willing to pay for an evening meal entrée are recorded as 999 dollars, which is implausible in reality, so they should not be included in the analysis. After excluding them, I fail to reject the null hypothesis and conclude that the average price of an evening meal entrée is not significantly different from 18 dollars.
H0: μ = 18   H1: μ ≠ 18
One-Sample Statistics
                                                      | N   | Mean     | Std. Deviation | Std. Error Mean
What would you expect an average evening meal entrée | 340 | $18.8353 | $9.82784       | $.53299
item alone...
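A hedged sketch of the screening-then-test workflow described above; the `prices` array is hypothetical survey data, with 999 standing in for the implausible responses mentioned in the text:

```python
# Drop the 999 placeholder responses, then run a one-sample t-test against
# the forecast price of $18. Data are hypothetical, not the essay's survey file.
import numpy as np
from scipy import stats

prices = np.array([15.0, 22.5, 18.0, 999.0, 20.0, 17.5, 999.0, 19.0, 16.5, 21.0])

clean = prices[prices != 999.0]              # data screening step
t_stat, p_value = stats.ttest_1samp(clean, popmean=18)

print(f"n = {clean.size}, mean = {clean.mean():.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A large p-value supports the conclusion in the text: the average entrée
# price is not significantly different from $18.
```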
Words: 884 - Pages: 4
...CarZuma: Car Insurance Claim Case Study
Problem Statement
A car insurance company, CarZuma, has collected historical data on its customers, covering both the customers' attributes and the attributes of the vehicles they insured. The company has heard a great deal about data analytics and is convinced that its competitors are already using such techniques to gain a competitive edge, so it needs to focus sharply on its target segment. Please help the company answer some basic questions:
1. What kinds of segments generate better leads?
2. What would be a good target segment, and which segment is a "bad" one?
Solution
For a car insurance company, it is important to identify different customer segments based on profitability in order to generate better leads. In our analysis we have therefore tried to separate the customers who should be targeted for profitability from those who can be considered "bad", i.e., not profitable for the company.
Reasons analytics has been used to address the problem:
* Lots of data but little conclusive information
* Analytical capability is a key competitive advantage
* Information should be actionable for the business
Our Approach: Steps Followed
1. Identification of appropriate data
2. Performing the required statistical analysis
3. Segmentation of customers based on the desired criterion (one possible approach is sketched after this list)
4. Characteristics identification...
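The case study excerpt does not state which technique it used for the segmentation step, so the following is only a hedged sketch of one common option, k-means clustering on scaled customer attributes; the column names and values are hypothetical:

```python
# Hedged sketch of one way the segmentation step could be done: k-means on
# scaled customer/vehicle attributes. Columns and values are hypothetical.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

customers = pd.DataFrame({
    "age":            [25, 40, 35, 52, 29, 61],
    "vehicle_value":  [8000, 25000, 15000, 32000, 9000, 27000],
    "claims_last_3y": [2, 0, 1, 0, 3, 1],
})

X = StandardScaler().fit_transform(customers)        # put attributes on one scale
customers["segment"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Profile each segment to judge which one generates better leads.
print(customers.groupby("segment").mean())
```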
Words: 5777 - Pages: 24
...Introduction Statistics is the art and science of collecting, analyzing, presenting and interpreting data to make more effective decisions; a collection of numerical information is also called statistics. Statistical Data: According to Horace Secrist, "By statistics we mean aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a pre-determined purpose and placed in relation to each other."
Features of This Definition Are:
* Statistics are aggregates of facts.
* Statistics are affected to a marked extent by a multiplicity of causes.
* Statistics are numerically expressed.
* Statistics are enumerated or estimated according to reasonable standards of accuracy.
* Statistics are collected in a systematic manner.
* Statistics are collected for a pre-determined purpose.
* Statistics should be placed in relation to each other.
Functions of Statistics:
* It presents facts in definite form.
* It simplifies a mass of figures.
* It facilitates comparison.
* It helps in formulating and testing hypotheses.
* It helps in prediction.
* It helps in the formulation of suitable policies.
Limitations of Statistics:
* It does not deal with isolated measurements.
* It deals only with quantitative characteristics.
* Its results are true only on average.
* It is only a means.
* It can be misused. ...
Words: 1589 - Pages: 7
...Research, Statistics, and Psychology Psychology is the scientific investigation of mental processes and behavior (Kowalski & Westen, 2007, p. 3). During the late 19th century, psychology became an actual science because of the fascination with human behavior. Psychologists use observation to measure human behavior in order to better understand mental and biological processes, motives, and personality traits. Human behavior may be understood through applied and academic science (Psychology Majors, 2011). For this reason, research using the scientific method is necessary for statistical psychology. Early research and use of the scientific method in psychology included the work of Edward Titchener, who used structuralism to explore aspects of the mind. Research through this method focused on introspection, or individual conscious experience. Titchener used a table method, similar to a chemistry periodic table, to study human behavior, and he believed experimentation was the only scientific method to use for the study of psychology (Northern Illinois University, 2003). A paradigm in psychology is a set of theoretical assertions that provide a model, abstract picture, or object of study (Kowalski & Westen, 2007, p. 11); it is a set of shared metaphors through which any object of study can be compared and investigated. Many modern psychologists use innovative approaches to studying human behavior that support traditional methods of psychology through research using the scientific method...
Words: 977 - Pages: 4
...Commerce b. Economics c. Technology d. Sciences
5) A study of the scope of statistics includes: a. Nature of statistics b. Subject matter of statistics c. Limitations of statistics d. All of the above
6) ………………. are numerical statements of facts in any department of enquiry placed in relation to each other. a. Sciences b. Statistics c. Commerce d. Mathematics
7) Whenever some definite connection exists between two or more groups, classes, series or data, there is said to be …………….. a. Index number b. Correlation c. Mean d. Median
8) Calculate the range and the coefficient of range where H = 18 and L = 10. a. 8, 0.28 b. 9, 0.8 c. 0, 0.23 d. 2, 0.32
9) ……………… is the difference between the highest value and the lowest value in a series. a. Range b. Coefficient of range c. Individual series d. Mid value
10) Coefficient of range = …………….. a. H − L b. H + L c. (H − L)/(H + L) d. None of these
11) The value of the variable which occurs most frequently in a distribution is called the …………… a. Mean b. Median c. Mode d. Quartile
12) Q1 = size of the (N + 1)/4-th item of the series, where in an individual series N = ……… a. Number of items b. Number of frequencies c. Sum of frequencies d. None of these
13) 60 students of section A of class XI obtained 40 mean marks in statistics; 40 students of section B obtained 35 mean marks in statistics. Find the mean marks in statistics for...
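A quick worked check of question 8, using the coefficient-of-range formula given in question 10:

```python
# Question 8: range = H - L, coefficient of range = (H - L) / (H + L).
H, L = 18, 10
rng = H - L                    # 8
coeff = (H - L) / (H + L)      # 8 / 28 ≈ 0.286, shown as 0.28 in option (a)
print(rng, round(coeff, 2))
```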
Words: 1155 - Pages: 5
...TERM END EXAMINATIONS, MARCH-2013 BACHELOR OF COMMERCE, YEAR III ELEMENTARY STATISTICS Time: 3 hours M.Marks: 60
SECTION A
Note: Attempt any 4 questions. All questions carry equal marks. (4 X 5) The answer should be limited to 200 words.
1) What is statistics? Explain the nature and limitations of statistics.
2) What is a frequency distribution? What are the different types of frequency distribution?
3) What is a frequency curve? Explain the cumulative frequency curve with an example.
4) Suppose the mean of a series of 5 items is 30. Four values are, respectively, 10, 15, 30 and 35. Estimate the missing 5th value of the series.
5) Calculate the median of the following distribution of data.
Class interval | 0-5 | 5-10 | 10-20 | 20-30 | 30-50 | 50-70 | 70-100
Frequency      | 12  | 15   | 25    | 40    | 42    | 14    | 8
6) Calculate the coefficient of correlation between the ages of husbands and wives:
Age of husband (yrs) | 21 | 22 | 28 | 32 | 35 | 36
Age of wife (yrs)    | 18 | 20 | 25 | 30 | 31 | 32
SECTION B
Note: All questions are compulsory. Each question carries equal marks. (40 X 1)
1) If a statistical series is divided into four equal parts, the end value of each part is called a ……… a. Quartile b. Deciles c. Percentiles d. Range
2) ………………divide...
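A short sketch answering question 6 (Pearson correlation of the husbands' and wives' ages) with SciPy:

```python
# Question 6: Pearson correlation between husbands' and wives' ages.
from scipy import stats

husband = [21, 22, 28, 32, 35, 36]
wife    = [18, 20, 25, 30, 31, 32]

r, p_value = stats.pearsonr(husband, wife)
print(f"r = {r:.3f}")   # close to 1, i.e. a very strong positive correlation
```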
Words: 1339 - Pages: 6
...TRIDENT UNIVERSITY INTERNATIONAL Falesha R. Vonner Module 5 Case MAT 201 Dr. Lall 20 OCTOBER 2014
HYPOTHESIS TESTING AND TYPE ERRORS
Answer the following problems, showing your work and explaining (or analyzing) your results.
1. Explain Type I and Type II errors. Use an example if needed.
A Type I error, also known as an error of the first kind, is the rejection of a true null hypothesis; it is the equivalent of a false positive. If the null hypothesis is rejected, we claim that the factor being tested does in fact have some effect, but if the null hypothesis is actually true, in reality that factor has no visible effect at all. Type I errors can be controlled: the significance level alpha chosen for the test has a direct bearing on them, because alpha is the maximum probability of committing a Type I error. If the value of alpha is 0.05, this equates to a 95% confidence level, meaning there is a 5% probability that a true null hypothesis will be rejected. In the long run, one out of every twenty hypothesis tests performed at this level will result in a Type I error (www.statistics.about.com, 2014). A Type II error, also known as a "false negative", is the error of not rejecting a null hypothesis when the alternative hypothesis is the true state of nature. In other words, this is the error...
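A small simulation, assuming normally distributed data, that illustrates the "one out of every twenty" claim: when the null hypothesis is true and alpha = 0.05, roughly 5% of repeated tests still reject it.

```python
# Simulated Type I error rate: H0 (mu = 3) is true by construction, yet about
# 5% of tests at alpha = 0.05 will reject it.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, rejections, trials = 0.05, 0, 10_000

for _ in range(trials):
    sample = rng.normal(loc=3.0, scale=1.0, size=30)   # H0 is true here
    _, p = stats.ttest_1samp(sample, popmean=3.0)
    if p < alpha:
        rejections += 1

print(f"Empirical Type I error rate: {rejections / trials:.3f}")  # about 0.05
```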
Words: 1449 - Pages: 6
...OPTIMIZING THE MEAN AND STANDARD DEVIATION The principal objective of this article is to propose an alternative way to optimize the standard deviation of the quality characteristic of a product or process, enabling the practitioner to assign distinct weights to the process mean and standard deviation and to find trade-off solutions between them, taking their relative magnitudes into account. Various approaches have been proposed in this area, e.g. response surface methodology (RSM) and robust parameter design (RPD), which are used to achieve low variability, low cost, high quality, and high reliability in the process and product. The priority-based, mean-squared-error-based, and goal-programming-based approaches within the RSM framework suffer from a number of drawbacks, which makes alternative methods necessary. The dual response optimization approach may be difficult for practitioners who lack sufficient mathematical background, and such methods mostly appear in journals that are hard to find. RSM framework approaches have been proposed by Vining and Myers (1990), Copeland and Nelson (1996), and Lin and Tu (1995); other applicable data-driven approaches include those of Ding et al. (2004), Jeong et al. (2005), and Shaibu and Cho (2009). Although the weighted-sum-of-squares method overcomes drawbacks of the priority-based methods, it still fails to capture some solutions. The proposed method uses less complicated techniques, and it is assumed that an optimal solution for a dual...
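The article's own formulation is not reproduced in this excerpt, so the following is only a hedged sketch of the general idea it describes: minimize a weighted combination of squared mean deviation and variance over fitted response surfaces. Here mu_hat and sigma_hat are made-up quadratics, not the article's models, and the weights are arbitrary.

```python
# Hedged sketch: trade off process mean and standard deviation with distinct
# weights. The fitted surfaces below are hypothetical, purely for illustration.
import numpy as np
from scipy.optimize import minimize

target = 50.0
w_mean, w_std = 0.7, 0.3          # practitioner-chosen weights

def mu_hat(x):                    # hypothetical fitted surface for the mean
    return 40.0 + 8.0 * x[0] - 1.5 * x[0] ** 2

def sigma_hat(x):                 # hypothetical fitted surface for the std dev
    return 3.0 - 1.2 * x[0] + 0.4 * x[0] ** 2

def objective(x):
    return w_mean * (mu_hat(x) - target) ** 2 + w_std * sigma_hat(x) ** 2

result = minimize(objective, x0=np.array([1.0]), bounds=[(0.0, 3.0)])
x_opt = result.x
print(f"x* = {x_opt[0]:.3f}, mean = {mu_hat(x_opt):.2f}, std dev = {sigma_hat(x_opt):.2f}")
```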
Words: 1451 - Pages: 6
...Business Analytics Department Nick Fabrio 2013
Executive Summary: The following report was prepared by the Business Analytics Department (BAD) to assist with market research. BAD have assessed whether CCResort have positioned themselves as an "upmarket" complex with the ability to appeal to families. In order to assess the outcome of the business plan, BAD have focused on researching the type of customer attracted to CCResort, as well as how much customers are spending over and above their accommodation costs, in addition to evaluating whether specific key performance indicators have been met.
Approach: The Business Analytics Department used the following statistical tools to examine CCResort:
1. Descriptive statistics (mean, mode, range, median, skewness and graphical interpretations)
2. Correlation
3. Confidence intervals
4. Hypothesis testing
Significant Findings:
* CCResort typically hosts families in a higher income bracket, over $100,000 (69% of guests).
* With 95% certainty, between 44.07% and 57.93% of guests stay seven nights.
* With 95% confidence, average daily expenditure per guest is between $227.61 and $240.95.
* A significant majority of bookings are for either 2 or 7 nights (88.5%).
* The modal age of guests booking accommodation is 50, with a mean of 47.33; bookings most frequently occur in groups of 4.
* The data suggest that couples...
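A sketch of how a 95% confidence interval like the "$227.61 to $240.95" figure is typically computed (a t-based interval around the sample mean); the `daily_spend` values are simulated, not CCResort's actual booking data:

```python
# t-based 95% confidence interval for mean daily expenditure per guest.
# Data are simulated for illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
daily_spend = rng.normal(loc=234, scale=40, size=150)

mean = daily_spend.mean()
sem = stats.sem(daily_spend)                  # standard error of the mean
low, high = stats.t.interval(0.95, daily_spend.size - 1, loc=mean, scale=sem)
print(f"95% CI for mean daily spend: ${low:.2f} to ${high:.2f}")
```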
Words: 2041 - Pages: 9
...Journal of Economic Literature, Vol. XXXIV (March 1996), pp. 97-114
The Standard Error of Regressions
By Deirdre N. McCloskey and Stephen T. Ziliak, University of Iowa
Suggestions by two anonymous and patient referees greatly improved the paper. Our thanks also to seminars at Clark, Iowa State, Harvard, Houston, Indiana, and Kansas State universities, at Williams College, and at the universities of Virginia and Iowa. A colleague at Iowa, Calvin Siebert, was materially helpful.
The idea of statistical significance is old, as old as Cicero writing on forecasts (Cicero, De Divinatione, 1. xiii. 23). In 1773 Laplace used it to test whether comets came from outside the solar system (Elizabeth Scott 1953, p. 20). The first use of the very word "significance" in a statistical context seems to be John Venn's, in 1888, speaking of differences expressed in units of probable error. … A result can be significant for science or policy and yet be insignificant statistically, ignored by the less thoughtful researchers. In the 1930s Jerzy Neyman and Egon S. Pearson, and then more explicitly Abraham Wald, argued that actual investigations should depend on substantive, not merely statistical, significance. In 1933 Neyman and Pearson wrote of Type I and Type II errors: "Is it more serious to convict an innocent man or to acquit a guilty? That will depend on the consequences of the error; is the punishment death or fine; what is the danger to the community of released...
Words: 10019 - Pages: 41
...Copyright © 2001 CANdiensten. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (CANdiensten, Nieuwpoortkade 25, 1055 RX Amsterdam, The Netherlands). ISBN 90-804652-2-4
Preface
In September 1997 Insightful (formerly known as MathSoft) released S-PLUS 4 for Windows, which added a completely new graphical user interface to the existing S-PLUS programming environment. It allowed non-programming-minded users to access the advanced visualization techniques and modern analysis methods of S-PLUS. As a result, S-PLUS has gained enormous popularity over the past years among applied statisticians and data analysts. Over the last three years the content of this book has been used for introductory courses on S-PLUS. This book is aimed at people who are (completely) new to S-PLUS. It covers the graphical user interface of S-PLUS and introduces the underlying S language. The main goal of this book is to get you started, and the best way to do that is to use this book interactively during an S-PLUS session. After reading this book you should be able to import data into the system and do some data manipulation and data cleaning. Furthermore, you should be able to visualize your data, apply statistical functions to your data, enter basic S-PLUS commands and write functions. The first edition of this book appeared in 1999; it was aimed at users of S-PLUS 2000. This edition is updated for S-PLUS 6 (see Appendix C...
Words: 10252 - Pages: 42
...Statistics and Computing
Series Editors: J. Chambers, D. Hand, W. Härdle
Statistics and Computing
Brusco/Stahl: Branch and Bound Applications in Combinatorial Data Analysis
Chambers: Software for Data Analysis: Programming with R
Dalgaard: Introductory Statistics with R, 2nd ed.
Gentle: Elements of Computational Statistics
Gentle: Numerical Linear Algebra for Applications in Statistics
Gentle: Random Number Generation and Monte Carlo Methods, 2nd ed.
Härdle/Klinke/Turlach: XploRe: An Interactive Statistical Computing Environment
Hörmann/Leydold/Derflinger: Automatic Nonuniform Random Variate Generation
Krause/Olson: The Basics of S-PLUS, 4th ed.
Lange: Numerical Analysis for Statisticians
Lemmon/Schafer: Developing Statistical Software in Fortran 95
Loader: Local Regression and Likelihood
Marasinghe/Kennedy: SAS for Data Analysis: Intermediate Statistical Methods
Ó Ruanaidh/Fitzgerald: Numerical Bayesian Methods Applied to Signal Processing
Pannatier: VARIOWIN: Software for Spatial Data Analysis in 2D
Pinheiro/Bates: Mixed-Effects Models in S and S-PLUS
Unwin/Theus/Hofmann: Graphics of Large Datasets: Visualizing a Million
Venables/Ripley: Modern Applied Statistics with S, 4th ed.
Venables/Ripley: S Programming
Wilkinson: The Grammar of Graphics, 2nd ed.
Peter Dalgaard, Introductory Statistics with R, Second Edition
Peter Dalgaard, Department of Biostatistics, University of Copenhagen, Denmark, p.dalgaard@biostat.ku.dk
ISSN: 1431-8784 ISBN: 978-0-387-79053-4 DOI:...
Words: 104817 - Pages: 420