...CHI-SQUARE TEST Adapted by Anne F. Maben from "Statistics for the Social Sciences" by Vicki Sharp The chi-square (χ²) test is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories. Does the number of individuals or objects that fall in each category differ significantly from the number you would expect? Is this difference between the expected and observed due to sampling error, or is it a real difference? Chi-Square Test Requirements 1. Quantitative data. 2. One or more categories. 3. Independent observations. 4. Adequate sample size (at least 10). 5. Simple random sample. 6. Data in frequency form. 7. All observations must be used. Expected Frequencies When you find the value for chi square, you determine whether the observed frequencies differ significantly from the expected frequencies. You find the expected frequencies for chi square in three ways: 1. You hypothesize that all the frequencies are equal in each category. For example, you might expect that half of the entering freshman class of 200 at Tech College will be identified as women and half as men. You figure the expected frequency by dividing the number in the sample by the number of categories. In this example, where there are 200 entering freshmen and two categories, male and female, you divide your sample of 200 by 2, the number of categories, to get 100 (expected frequencies) in each category. 2. You determine the expected...
Words: 1536 - Pages: 7
...Pearson Chi-Square significance value is less than .05, which means income is related to the probability that a person will eat at Hobbit’s Choice. Respondents who make between $100,000 and 149,999 a year are far more likely to be probable patrons (93%) than non-probable patrons (7%).
Income | Probable Patron | Non-Probable Patron
<$15,000 | 0% | 100%
$15,000 to 24,999 | 0% | 100%
$25,000 to 49,999 | 0% | 100%
$50,000 to 74,999 | 3% | 97%
$75,000 to 99,999 | 62.5% | 32.5%
$100,000 to 149,999 | 93% | 7%
$150,000+ | 84.8% | 15.2%
*Please see Appendix ____ for SPSS Output *
The Pearson Chi-Square significance value is less than .05, which means that educational level is related to the probability that a person will be a patron of Hobbit’s Choice. In other words, level of education differentiates patrons from non-patrons. Respondents with a Doctorate degree are more likely to be probable patrons (77.8%) than non-patrons (22.2%). In fact, all probable patrons have more than some college: 0% of survey respondents in the “Some College or Less” group are probable patrons.
Educational Level | Probable Patron | Non-Probable Patron
Some College or Less | 0% | 100%
Associate Degree | 21.4% | 78.6%
Bachelor’s Degree | 27.7% | 72.3%
Master’s Degree | 39.5% | 60.5%
Doctorate Degree | 77.8% | 22.2%
*Please see Appendix ____ for SPSS Output *
Gender does not differentiate patrons from non-patrons because its Pearson Chi-Square significance...
Words: 2269 - Pages: 10
...ANSWERS Process of Science (9.21) How Is the Chi-Square Test Used in Genetic Analysis? Lab Notebook
Chi-Square test for Case 1
Phenotype | Observed No. (o) | Expected No. (e) | (o-e) | (o-e)² | (o-e)²/e
Red eyes | 31 | 33 | -2 | 4 | 0.1212
Sepia eyes | 13 | 11 | 2 | 4 | 0.3636
χ² (to the nearest ten-thousandth) = 0.4848
Questions
1. Why is it important to remove the adults in the parental generation? It is important to keep the generations separate so that you know you are crossing only F1 flies.
2. What generation will their offspring be? The new offspring are the F2 generation.
3. Based on the data obtained, is the cross in Case 1 monohybrid or dihybrid? Explain. The cross is monohybrid because only one trait (eye color) is involved...
Words: 449 - Pages: 2
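The Case 1 arithmetic in the lab notebook above can be checked with a few lines of Python. This is only a minimal sketch and not part of the original lab: the function name `chi_square_statistic` is hypothetical, and the counts are the observed and expected numbers from the Case 1 table.

```python
def chi_square_statistic(observed, expected):
    """Return the per-category terms (o - e)^2 / e and their sum."""
    terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return terms, sum(terms)

# Counts from the Case 1 table, in the order: red eyes, sepia eyes.
observed = [31, 13]
expected = [33, 11]

terms, chi_sq = chi_square_statistic(observed, expected)
for label, term in zip(["Red eyes", "Sepia eyes"], terms):
    print(f"{label}: {term:.4f}")
print(f"chi-square = {chi_sq:.4f}")
```

The printed terms should match the last column of the table, and their sum is the chi-square value reported to the nearest ten-thousandth.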
...The goal of this exercise is to assess your understanding of the chi square test. Please read the problem carefully and answer the question. Good luck! Assume you have the data below, which displays the number of students who elect different undergraduate majors.
Number of Students Selecting Different Majors
Pre-Med | Computer Sciences | English Literature | Education | Engineering | Total
50 | 85 | 25 | 60 | 80 | 300
We want to know whether those numbers differ due to chance. In other words, at the 0.01 level of significance, are some majors selected more often than others, or is the selection pattern essentially random? The null hypothesis is that the programs are equally preferred. Create a table that shows the computation of the Chi Square statistic [6 POINTS]. Use a decision rule to determine whether the null hypothesis is rejected or not [4 POINTS]. Solution: Ho: The majors are equally preferred (probability of liking each major = 1/5). HA: The majors are not equally preferred. Use the Chi Square statistic to evaluate how well the hypothesis fits the data: χ² = Σ (Oi − Ei)² / Ei, where Oi is actual frequency observed in cell i ...
Words: 344 - Pages: 2
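The goodness-of-fit computation that this exercise asks for can be sketched as follows. This is an illustrative sketch under stated assumptions, not the exercise's official solution: the helper `chi_square_sf_even_df` is a hypothetical name, it uses a closed-form upper-tail probability that is valid only for even degrees of freedom (which holds here, since df = 5 - 1 = 4), and the decision rule compares the p-value with the 0.01 level named in the problem.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

# Observed counts for the five majors (they total 300).
observed = [50, 85, 25, 60, 80]

# Under the null hypothesis every major is equally preferred,
# so each expected count is 300 / 5 = 60.
expected_each = sum(observed) / len(observed)

chi_sq = sum((o - expected_each) ** 2 / expected_each for o in observed)
df = len(observed) - 1
p_value = chi_square_sf_even_df(chi_sq, df)

ALPHA = 0.01  # significance level stated in the exercise
reject_null = p_value < ALPHA

print(f"chi-square = {chi_sq:.2f}, df = {df}")
print("reject H0" if reject_null else "fail to reject H0")
```

Comparing the p-value with 0.01 is equivalent to comparing the statistic with the tabled critical value for 4 degrees of freedom; either form of the decision rule leads to the same conclusion.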
...Abstract: The purpose of my project is to find out two things about students at my school: 1. Is hair color related to eye color? 2. Is favorite color related to favorite ice cream flavor? I took a survey of students, and used the chi square (χ²) statistic to see whether the variables are related. The χ² statistic showed that hair color and eye color are related, but favorite color and favorite ice cream flavor are not related. Purpose: To use statistics to find out two things about students at my school: 1. Is hair color related to eye color? 2. Is favorite color related to favorite ice cream flavor? Research: I chose this project because I wanted to learn more about probability and statistics. I can use statistics to answer a question about students at my school. χ² is used to compare sets of descriptive data. Descriptive data are things like colors, flavors, names, and other things that cannot be described by just a number, like height or weight. I picked hair color and eye color because I thought they would be related. I wanted to test this. I picked favorite color and favorite ice cream flavor because I didn’t think they would be related. I wanted to test this also. Hypotheses: First Hypothesis: Eye color and hair color will be related. In statistical terms: Null Hypothesis (H0): There is no relationship between eye color and hair color. Alternative Hypothesis (HA): There...
Words: 1501 - Pages: 7
...Crosstabulation & Chi Square Robert S Michael Chi-square as an Index of Association After examining the distribution of each of the variables, the researcher’s next task is to look for relationships among two or more of the variables. Some of the tools that may be used include correlation and regression, or derivatives such as the t-test, analysis of variance, and contingency table (crosstabulation) analysis. The type of analysis chosen depends on the research design, characteristics of the variables, shape of the distributions, level of measurement, and whether the assumptions required for a particular statistical test are met. A crosstabulation is a joint frequency distribution of cases based on two or more categorical variables. Displaying a distribution of cases by their values on two or more variables is known as contingency table analysis and is one of the more commonly used analytic methods in the social sciences. The joint frequency distribution can be analyzed with the chi-square statistic (χ²) to determine whether the variables are statistically independent or associated. If a dependency between variables does exist, then other indicators of association, such as Cramer’s V, gamma, Somers’ d, and so forth, can be used to describe the degree to which the values of one variable predict or vary with those of the other variable. More advanced techniques such as log-linear models and multinomial regression can be used to clarify the relationships contained...
Words: 3702 - Pages: 15
...Case 14.1 1. Correlations
Variable | Statistic | Prefer Drive Less than 30 Minutes | Prefer Unusual Desserts | Prefer Large Variety of Entrees | Prefer Unusual Entrees
Prefer Drive Less than 30 Minutes | Pearson Correlation | 1 | .768** | .806** | .765**
 | Sig. (2-tailed) | | .000 | .000 | .000
 | N | 400 | 400 | 400 | 400
Prefer Unusual Desserts | Pearson Correlation | .768** | 1 | .823** | .868**
 | Sig. (2-tailed) | .000 | | .000 | .000
 | N | 400 | 400 | 400 | 400
Prefer Large Variety of Entrees | Pearson Correlation | .806** | .823** | 1 | .831**
 | Sig. (2-tailed) | .000 | .000 | | .000
 | N | 400 | 400 | 400 | 400
Prefer Unusual Entrees | Pearson Correlation | .765** | .868** | .831** | 1
 | Sig. (2-tailed) | .000 | .000 | .000 |
 | N | 400 | 400 | 400 | 400
**. Correlation is significant at the 0.01 level (2-tailed).
Null Hypothesis: There is no relation between preference to drive 30 minutes or less and preference of menu items. Alternative Hypothesis: There is a relation between the preference to drive 30 minutes or less and preference of menu items. Interpretation: All of the correlations are significantly different from zero (every Sig. value is .000), so we reject the null hypothesis. The correlations are positive and they are in the moderate range. As the preference to drive 30 minutes or less increases, so do preferences for unusual desserts, large variety of entrees, and unusual entrees.
Correlations | | Prefer Drive Less than 30 Minutes | Prefer...
Words: 3383 - Pages: 14
...ONE WAY ANOVA One-way analysis of variance (abbreviated one-way ANOVA) is a technique used to compare means of two or more samples (using the F distribution). This technique can be used only for numerical data. The ANOVA tests the null hypothesis that samples in two or more groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. These estimates rely on various assumptions. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples. If the group means are drawn from populations with the same mean values, the variance between the group means should be lower than the variance of the samples, following the central limit theorem. A higher ratio therefore implies that the samples were drawn from populations with different mean values.
Descriptives
Variable | Group | N | Mean | Std. Deviation | Std. Error | 95% Confidence Interval for Mean, Lower Bound | 95% Confidence Interval for Mean, Upper Bound | Minimum | Maximum
QUALITY | 1 | 19 | 3.89 | .809 | .186 | 3.50 | 4.28 | 2 | 5
 | 2 | 12 | 3.83 | .937 | .271 | 3.24 | 4.43 | 1 | 5
 | Total | 31 | 3.87 | .846 | .152 | 3.56 | 4.18 | 1 | 5
PRICE | 1 | 19 | 2.95 | .911 | .209 | 2.51 | 3.39 | 1 | 5
 | 2 | 12 | 2.75 | 1.055 | .305 | 2.08 | 3.42 | 1 | 5
 | Total | 31 | 2.87 | .957 | .172 | 2.52 | 3.22 | 1 | 5
BRAND | 1 | 19 | 4.11 | .809 | .186 | 3.72 | 4.50 | 3 | 5
 | 2 | 12 | 4.17 | .577 | .167...
Words: 1377 - Pages: 6
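The F-statistic described above (variance among the group means divided by variance within the samples) can be approximated directly from a Descriptives table. The sketch below is an illustration only, not the SPSS output that the excerpt goes on to show: the function name `one_way_anova_from_summary` is hypothetical, and because it works from the rounded means and standard deviations of the QUALITY rows, the result is approximate.

```python
def one_way_anova_from_summary(groups):
    """Compute a one-way ANOVA F-statistic from summary statistics.

    groups: list of (n, mean, standard_deviation) tuples, one per group.
    Returns (f_statistic, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n

    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares: rebuilt from each group's sample variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within

# QUALITY rows of the Descriptives table: (N, Mean, Std. Deviation).
quality = [(19, 3.89, 0.809), (12, 3.83, 0.937)]

f_stat, df_b, df_w = one_way_anova_from_summary(quality)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

An F value far below 1, as here, indicates that the two group means differ by much less than the scatter within the groups would produce by chance.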
...CROSSTABS VARIABLES ANALYZED
Row Variable ->> Do you use Friendly Market regularly
Column Variable ->> I always pay cash.
Observed Frequencies
 | Disagree | Neutral | Agree | Grand Total
No | 9 | 62 | 19 | 90
Yes | 16 | 39 | 17 | 72
Grand Total | 25 | 101 | 36 | 162
Statistical Values
Chi Sq | df | Sig
5.38 | 2 | 0.07
There is NO significant association between these two variables. (95% level of confidence)
A total of 162 respondents answered both questions. Of the 36 who agreed with the statement, 17 use Friendly Market regularly and 19 do not; of the 101 who gave a neutral answer, 39 use it regularly and 62 do not. The chi-square value is 5.38 with 2 degrees of freedom, and its significance is about 0.07, which is greater than 0.05, so the null hypothesis of no association is not rejected. Recommendation: The customer does not care about the mode of payment.
Cross tabulation Analysis CROSSTABS VARIABLES ANALYZED Row Variable...
Words: 1190 - Pages: 5
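The chi-square value in the crosstab above can be reproduced from the observed frequencies alone. The following is a minimal sketch, assuming the standard test of independence in which each expected count is (row total × column total) / grand total; the function name `chi_square_independence` is hypothetical, and the p-value line uses the fact that for 2 degrees of freedom the upper-tail probability is exactly exp(-χ²/2).

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df

# Rows: No, Yes (use Friendly Market regularly).
# Columns: Disagree, Neutral, Agree (I always pay cash).
observed = [[9, 62, 19],
            [16, 39, 17]]

chi_sq, df = chi_square_independence(observed)
p_value = math.exp(-chi_sq / 2)  # exact only because df == 2

print(f"chi-square = {chi_sq:.2f}, df = {df}, p = {p_value:.3f}")
```

Because the p-value is above 0.05, the sketch agrees with the table's conclusion that there is no significant association at the 95% level of confidence.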
...300 | 100.0% | 0 | .0% | 300 | 100.0%
Sex * Stock Trading | 300 | 100.0% | 0 | .0% | 300 | 100.0%
Sex * Chatting | 300 | 100.0% | 0 | .0% | 300 | 100.0%
Sex * News/Weather | 300 | 100.0% | 0 | .0% | 300 | 100.0%
Sex * Music Crosstab
Sex | Statistic | Music = 0 | Music = 1 | Total
1 | Count | 91 | 97 | 188
 | % within Sex | 48.4% | 51.6% | 100.0%
 | % within Music | 65.0% | 60.6% | 62.7%
2 | Count | 49 | 63 | 112
 | % within Sex | 43.8% | 56.3% | 100.0%
 | % within Music | 35.0% | 39.4% | 37.3%
Total | Count | 140 | 160 | 300
 | % within Sex | 46.7% | 53.3% | 100.0%
 | % within Music | 100.0% | 100.0% | 100.0%
Chi-Square Tests
Test | Value | df | Asymp. Sig. (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided)
Pearson Chi-Square | .611a | 1 | .434 | |
Continuity Correctionb | .438 | 1 | .508 | |
Likelihood Ratio | .612 | 1 | .434 | |
Fisher's Exact Test | | | | .474 | .254
Linear-by-Linear...
Words: 4191 - Pages: 17
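The first two rows of the Chi-Square Tests table for Sex * Music can be checked from the four cell counts. This is a hedged sketch rather than a reproduction of SPSS internals: the function name `chi_square_2x2` is hypothetical, the continuity correction is assumed to be the usual Yates adjustment (subtract 0.5 from each absolute difference), and the p-value uses the identity that for 1 degree of freedom the upper-tail probability equals erfc(sqrt(χ²/2)).

```python
import math

def chi_square_2x2(table, yates=False):
    """Chi-square statistic and p-value (df = 1) for a 2 x 2 table.

    With yates=True, each |observed - expected| is reduced by 0.5
    before squaring (Yates continuity correction).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            chi_sq += diff ** 2 / expected

    # For df = 1 the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return chi_sq, p_value

# Rows: Sex = 1, Sex = 2. Columns: Music = 0, Music = 1.
sex_by_music = [[91, 97],
                [49, 63]]

pearson, p_pearson = chi_square_2x2(sex_by_music)
corrected, p_corrected = chi_square_2x2(sex_by_music, yates=True)

print(f"Pearson chi-square    = {pearson:.3f}, p = {p_pearson:.3f}")
print(f"Continuity correction = {corrected:.3f}, p = {p_corrected:.3f}")
```

Both p-values are well above .05, which is consistent with the table showing no significant association between sex and music use.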
...maximum of 10 rows and 10 columns. f) T F Frequency graphs can determine the mode, Box & Whiskers does not. g) T F The birth data from the Anaheim Ducks and Los Angeles Kings proved Outliers was correct. h) T F For the Hypergeometric distribution the value of p changes each time an object is selected. i) T F Heights of adult males is a good example of the Poisson distribution. j) T F When children give their age, it’s continuous; for adults it’s integer. k) T F If a LUMAT template cell is colored, you can enter data or labels. l) T F The Box and Whiskers template gives indicators of data being normal, uniform or exponential. m) T F Goodness of Fit templates use the Chi-square distribution to give the probability of a fit. n) LUMAT stands for: Learning to Use Managerial Analysis Templates. o) The name of our Excel Training program is ExcelEverest. p) If the pieces of a pie chart in Excel add up to only...
Words: 1158 - Pages: 5
...Experimental Design and Analysis of Variance Review: chi square = we want to know whether a data set fits a certain distribution/independence model. We use the chi-square distribution, then we check how far the test statistic is from 0. As the data set moves farther from what you expect to get, you get larger differences between the expected model and the actual data (you get a larger test statistic). Components of ANOVA: Factor: the independent variable. We want this variable to be qualitative. The classifications of the factor are called the treatments. (Ex. color of the light vs. the response variable, i.e., height of the plant. Light is qualitative; the treatments are the kinds of light, such as red, white, violet, green. In ANOVA, the response variable must be quantitative. If it is not quantitative, go back to the chi-square test.) When we design an experiment, the factors are controlled by you. But sometimes some factors are difficult to control, and if we want to do an experiment on them we will have to just look at observational data. An example of this kind of factor is the weather. Regardless, to test whether a certain factor has an effect on a response variable, we usually do replication: we replicate the experiment on more units. The more the better. If we find differences between the growths (in the mongo seeds), we do not know whether this is true for the whole population, so the more elements the sample has, the better. Gasoline Mileage Case: Factor: Gas Type. Treatments:...
Words: 630 - Pages: 3
...No-show rates range between 15% to 30% in an ambulatory setting and lead to wasted resources, increased financial burdens and inaccurate or missed diagnoses of patients (Goldman et al., 1982). Previous studies have shown that various patient factors can predict future no-show behavior. For example, the type of appointment scheduled for a patient can predict patient absenteeism (Zeber, Pearson, & Smith, 2009). Zeber et al. found that colonoscopy appointments are the most commonly missed appointments (Zeber et al., 2009). Furthermore, previous missed appointments is one of the most significant predictors of no-show appointments (Dove & Schneider, 1981). Studies have also shown that patients’ various psychosocial diagnoses are indicators of missed appointments (Goldman et al., 1982). Patients diagnosed with at least one psychological diagnosis, including mood disorders, such as depression and bipolar disease, anxiety disorders, such as panic attacks and posttraumatic stress disorder, and thought disorders, such as schizophrenia and personality disorders, were more likely to miss appointments compared to patients without psychological diagnoses (Savageau et al., 2004). Finally, Perron et al. showed that patients with substance abuse disorders are more likely to miss appointments (Perron et al., 2010). In order to reduce no-show rates in a hospital gastrointestinal (GI) clinic this project analyzed potential indicators of missed appointments. Based on a conceptual model grouping...
Words: 1517 - Pages: 7
...4/7/2014 Basic Statistics: An Overview Basic Statistics: Review Descriptive Statistics Scatter graph Measures of central tendency Mean Median, quartile, deciles, percentile Mode Weighted mean GM HM Measures of dispersion Range, IQR Semi IQR Mean deviation Standard deviation Variance Coeff of variation Inferential Statistics Populations Sampling Estimation of Parameters Point Estimation Interval Estimation Unbiased Minimum Variance Consistency Efficiency Properties of Point Estimators Statistical Inference: Hypothesis Testing T test F test Chi square test Measures of shape of the curve Moments Skewness kurtosis Probability distributions Normal Distribution T-student Distribution Chi-Square Distribution F Distribution Index Number Etc. Correlational Statistics Covariance Correlations regressions Some Terminology Variables are things that we measure, control, or manipulate. They may be classified as: 1. Quantitative i.e. numerical Continuous: takes fractional values ex. height in cm Discrete: takes no fractional values ex. GDP Random Variable: If the value of a variable cannot be predicted in advance Non random: If the value of a variable can be predicted in advance Some Terminology 2. Qualitative i.e. non numerical 1. Nominal: Items are usually categorical and may have numbers...
Words: 1759 - Pages: 8
...a manuscript (unless the p value is less than .001). Please pay attention to issues of italics and spacing. APA style is very precise about these. Also, with the exception of some p values, most statistics should be rounded to two decimal places. Mean and Standard Deviation are most clearly presented in parentheses: The sample as a whole was relatively young (M = 19.22, SD = 3.45). The average age of students was 19.22 years (SD = 3.45). Percentages are also most clearly displayed in parentheses with no decimal places: Nearly half (49%) of the sample was married. Chi-Square statistics are reported with degrees of freedom and sample size in parentheses, the Pearson chi-square value (rounded to two decimal places), and the significance level: The percentage of participants that were married did not differ by gender, χ²(1, N = 90) = 0.89, p = .35. T Tests are reported like chi-squares, but only the degrees of freedom are in parentheses. Following that, report the t statistic (rounded to two decimal places) and the significance level. There was a significant effect for gender, t(54) = 5.43, p < .001, with men receiving higher scores than women. ANOVAs (both one-way and two-way) are reported like the t test, but there are two degrees-of-freedom numbers to report. First report the between-groups degrees of freedom, then report the within-groups degrees of freedom (separated by a comma). After that report the F statistic (rounded off to two decimal places)...
Words: 570 - Pages: 3
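The reporting patterns described above for chi-square and t statistics can be expressed as small string-formatting helpers. This is a sketch under stated assumptions rather than an official APA tool: the function names `format_p`, `apa_chi_square`, and `apa_t` are hypothetical, and the choice to show three decimals for p values between .001 and .01 is an assumption that goes beyond the excerpt, which only says that p values below .001 are reported as "p < .001".

```python
def format_p(p):
    """Format a p value without a leading zero; tiny values become '< .001'."""
    if p < 0.001:
        return "p < .001"
    # Assumption: keep a third decimal for small p values so they do not print as .00.
    decimals = 3 if p < 0.01 else 2
    return "p = " + f"{p:.{decimals}f}".lstrip("0")

def apa_chi_square(chi_sq, df, n, p):
    """Chi-square: degrees of freedom and sample size go in the parentheses."""
    return f"χ²({df}, N = {n}) = {chi_sq:.2f}, {format_p(p)}"

def apa_t(t, df, p):
    """t test: only the degrees of freedom go in the parentheses."""
    return f"t({df}) = {t:.2f}, {format_p(p)}"

# The two example results quoted in the text above.
print(apa_chi_square(0.89, 1, 90, 0.35))
print(apa_t(5.43, 54, 0.0001))
```

Italics, which the excerpt says APA style is precise about, cannot be represented in a plain string and would need to be applied in the manuscript itself.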