Crosstabulation & Chi Square

Submitted By sachinrst
Robert S Michael

Chi-square as an Index of Association
After examining the distribution of each variable, the researcher’s next task is to look for relationships among two or more of the variables. Tools that may be used include correlation and regression, or derivatives such as the t-test, analysis of variance, and contingency table (crosstabulation) analysis. The type of analysis chosen depends on the research design, the characteristics of the variables, the shape of the distributions, the level of measurement, and whether the assumptions required for a particular statistical test are met.

A crosstabulation is a joint frequency distribution of cases based on two or more categorical variables. Displaying a distribution of cases by their values on two or more variables is known as contingency table analysis and is one of the more commonly used analytic methods in the social sciences. The joint frequency distribution can be analyzed with the chi-square statistic (χ²) to determine whether the variables are statistically independent or associated. If a dependency between variables does exist, then other indicators of association, such as Cramér’s V, gamma, Somers’ d, and so forth, can be used to describe the degree to which the values of one variable predict or vary with those of the other variable. More advanced techniques such as log-linear models and multinomial regression can be used to clarify the relationships contained in contingency tables.
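To make the test of independence concrete, the following is a minimal sketch in plain Python, not a definitive implementation. It computes the Pearson chi-square statistic for a two-way table by comparing each observed count with the count expected under independence (row total × column total ÷ N), and then derives Cramér’s V from that statistic. The function names `chi_square` and `cramers_v` and the 2 × 2 counts are illustrative assumptions, not taken from the essay.

```python
import math


def chi_square(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def cramers_v(table):
    """Return Cramér's V, an index of association scaled from 0 to 1."""
    stat, _ = chi_square(table)
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))  # the smaller of the two table dimensions
    return math.sqrt(stat / (n * (k - 1)))


# Illustrative (hypothetical) 2 x 2 table: rows are groups, columns are responses.
observed = [[91, 97],
            [49, 63]]

stat, df = chi_square(observed)
v = cramers_v(observed)
print(f"chi-square = {stat:.3f}, df = {df}, Cramér's V = {v:.3f}")
```

For these counts the statistic is small relative to its single degree of freedom, so the sketch illustrates a table with little evidence of association; a statistical package would additionally report the significance level.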

Considerations:

Type of variables. Are the variables of interest continuous or discrete (e.g., categorical)? Categorical variables contain integer values that indicate membership in one of several possible categories. The range of possible values for such variables is limited, and whenever the range of possible values is relatively circumscribed, the distribution is
