Multivariate Discriminant Analysis
Priyanshi Gupta

An Overview
• MDA is a statistical technique used to classify an observation into one of several a priori groups on the basis of the observation's individual characteristics. It is used primarily to classify or make predictions in problems where the dependent variable is qualitative, for example male or female, bankrupt or non-bankrupt.
• The first step is to establish explicit group classifications. Given observations from k groups, we look for the function that best discriminates between observations coming from different groups.

• Once such a function is in place, we move to classification: assigning a new observation to the appropriate population using the discriminant function.
• Typically we start with a set of data (called the LEARNING set) whose observations, possibly coming from different populations, are pre-classified with known group memberships. From these previously classified data we create a discriminant function and, after proper calibration, use it to classify a new observation as coming from one of the groups.
• Discriminant analysis is used when groups are known a priori.

Types of DA Problems

• 2-Group Problems: regression can be used
• k-Group Problems (where k >= 2): regression cannot be used if k > 2

Example of a 2-Group DA Problem: ACME Manufacturing
• All employees of ACME Manufacturing are given a pre-employment test measuring mechanical and verbal aptitude.
• Each current employee has also been classified into one of two groups: satisfactory or unsatisfactory.
• We want to determine whether the two groups of employees differ with respect to their test scores.
• If so, we want to develop a rule for predicting whether new applicants will be satisfactory or unsatisfactory.
Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, A Practical Introduction to Management Science 5th edition

Graph of Data for Current Employees
[Figure: scatter plot of Verbal Aptitude (vertical axis, 25 to 45) against Mechanical Aptitude (horizontal axis, 25 to 50) for satisfactory and unsatisfactory employees, with the Group 1 centroid (C1) and the Group 2 centroid (C2) marked.]
Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, A Practical Introduction to Management Science 5th edition

Discriminant analysis
• Discriminant analysis is used to analyze relationships between a non-metric dependent variable and metric or dichotomous independent variables.
• Discriminant analysis attempts to use the independent variables to distinguish among the groups or categories of the dependent variable.
• Discriminant analysis creates an equation that minimizes the possibility of misclassifying cases into their respective groups or categories.
• The usefulness of a discriminant model is based upon its accuracy rate, or ability to predict the known group memberships in the categories of the dependent variable. For this we make use of the HOLD-OUT sample.

Objectives of Discriminant Analysis
• Determining which of the independent variables account most for the differences in the average score profiles of the two or more groups.
• Establishing procedures for classifying statistical units (individuals or objects) into groups on the basis of their scores on a set of independent variables.
• Establishing the number and composition of the dimensions of discrimination between groups formed from the set of independent variables.

Assumptions of Discriminant Analysis
• The observations are a random sample.
• Multivariate normality: the independent variables are normally distributed within each level of the grouping variable.
• Homogeneity of variance/covariance (homoscedasticity): the variances and covariances of the predictors are the same across levels of the grouping variable. This can be tested with Box's M statistic.
• Multicollinearity: predictive power can decrease as the correlation between predictor variables increases.
• Independence: participants are assumed to be randomly sampled, and a participant's score on one variable is assumed to be independent of the scores on that variable for all other participants.

Discriminant functions
• It is similar to regression analysis.
• A discriminant score can be calculated as a weighted combination of the independent variables:

$$D_i = a + b_1 x_{1i} + b_2 x_{2i} + \dots + b_n x_{ni}$$

• D is the predicted score (discriminant score), x is a predictor and b is a discriminant coefficient.
• MDA sets the variate's weights to maximize between-group variance relative to within-group variance.
• If the group sizes are equal, the cut-off is the mean of the group mean scores.
• If the group sizes are not equal, the cut-off is calculated from weighted means.
• Conceptually, we can think of the discriminant function or equation as defining the boundary between groups.
• Discriminant scores are standardized: if the score falls on one side of the boundary (standard score less than zero), the case is predicted to be a member of one group, and if it falls on the other side (positive standard score), it is predicted to be a member of the other group.
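As a rough illustration of how weights that maximize between-group relative to within-group variance can be obtained in the two-group, two-predictor case, the sketch below uses Fisher's classical solution (weights equal to the inverse pooled covariance matrix times the difference in group means). The scores are hypothetical, not the ACME data, the intercept a is omitted, and all function names are illustrative assumptions.

```python
# Hypothetical (mechanical, verbal) test scores; NOT the ACME data.
satisfactory = [(40, 42), (44, 40), (42, 45), (46, 43)]
unsatisfactory = [(32, 34), (35, 30), (30, 33), (34, 36)]


def mean_vector(rows):
    """Column means of a list of 2-D points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_covariance(g1, g2):
    """Pooled within-group covariance matrix (2 x 2) for two groups."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (g1, g2):
        m = mean_vector(rows)
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    dof = len(g1) + len(g2) - 2
    return [[s[a][b] / dof for b in range(2)] for a in range(2)]


def fisher_weights(g1, g2):
    """Weights b = S_pooled^-1 (mean_1 - mean_2), via an explicit 2 x 2 inverse."""
    m1, m2 = mean_vector(g1), mean_vector(g2)
    s = pooled_covariance(g1, g2)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = (m1[0] - m2[0], m1[1] - m2[1])
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def score(weights, x):
    """Discriminant score without intercept: b1*x1 + b2*x2."""
    return weights[0] * x[0] + weights[1] * x[1]


weights = fisher_weights(satisfactory, unsatisfactory)
mean_score_1 = sum(score(weights, x) for x in satisfactory) / len(satisfactory)
mean_score_2 = sum(score(weights, x) for x in unsatisfactory) / len(unsatisfactory)
cutoff = (mean_score_1 + mean_score_2) / 2  # equal group sizes: midpoint of mean scores
print(weights, cutoff)
```

With these made-up scores every satisfactory case lands above the cutoff and every unsatisfactory case below it; real data will usually overlap.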

Number of functions
• If the dependent variable defines two groups, one statistically significant discriminant function is required to distinguish the groups; if the dependent variable defines three groups, two statistically significant discriminant functions are required to distinguish among the three groups; etc.
• If a discriminant function is able to distinguish among groups, it must have a strong relationship to at least one of the independent variables.
• The number of possible discriminant functions in an analysis is limited to the smaller of the number of independent variables or one less than the number of groups defined by the dependent variable.
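The limit on the number of functions can be written as a one-line rule. The helper name below is an illustrative assumption.

```python
def max_discriminant_functions(n_groups, n_predictors):
    """Smaller of (number of groups - 1) and the number of independent variables."""
    return min(n_groups - 1, n_predictors)


# Three groups and four predictors allow at most two functions.
print(max_discriminant_functions(3, 4))
```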

Discriminant scores
• The aim of the statistical analysis in DA is to combine (weight) the variable scores in some way so that a single new composite variable, the discriminant score, is produced. We can obtain a discriminant score for each observation.
• One way of thinking about this is in terms of a food recipe, where changing the proportions (weights) of the ingredients will change the characteristics of the finished cakes. Hopefully the weighted combinations of ingredients will produce two different types of cake.
• Discriminant analysis works by creating a new variable called the discriminant function score, which is used to predict to which group a case belongs.
• Discriminant function scores are computed similarly to factor scores, i.e. using eigenvalues. The computations find the coefficients for the independent variables that maximize the measure of distance between the groups defined by the dependent variable.
• The discriminant function is similar to a regression equation in which the independent variables are multiplied by coefficients and summed to produce a score.

Wilks’ Lambda
• In the first step, an F-test (Wilks' Lambda) is used to test whether the discriminant model as a whole is significant. A significant lambda means one can reject the null hypothesis that the two groups have the same mean discriminant function scores and conclude that the model is discriminating.
• Wilks' lambda is a statistic used as a measure of the separation of the class centres. It is used for testing the equality of the population means.
• Therefore, Wilks' lambda plays the same role in the multivariate domain as Fisher's F does for (univariate) ANOVA.

• Definition of Wilks' lambda: Wilks' lambda is defined as the proportion of the total variance that is not explained by the grouping variable (the variable that identifies the classes) in the classical scheme of variance decomposition. It is therefore the ratio of the intra-class variance to the total variance. Note the difference from ANOVA's F statistic, which is based on the ratio of the Explained Sum of Squares to the Residual Sum of Squares, each divided by its degrees of freedom.

• Wilks' lambda is therefore a number between 0 and 1. If only a small fraction of the total inertia is not explained by the existence of groups, then these groups are well separated, and their means are significantly different. Hence:
• A small (close to 0) value of Wilks' lambda means that the groups are well separated.
• A large (close to 1) value of Wilks' lambda means that the groups are poorly separated.
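For a single variable, the ratio described above reduces to the within-group sum of squares divided by the total sum of squares. The sketch below computes that univariate version on made-up numbers; with several variables the same idea is applied to the determinants of the within-group and total sums-of-squares matrices, which is not shown here. The function name is an illustrative assumption.

```python
def wilks_lambda_univariate(groups):
    """Within-group sum of squares divided by total sum of squares (one variable)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    within_ss = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        within_ss += sum((v - group_mean) ** 2 for v in g)
    return within_ss / total_ss


# Hypothetical data: well-separated groups give a value near 0,
# heavily overlapping groups give a value near 1.
well_separated = wilks_lambda_univariate([[1, 2, 3], [11, 12, 13]])
overlapping = wilks_lambda_univariate([[1, 2, 3], [2, 3, 4]])
print(well_separated, overlapping)
```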

Wilks’ Lambda
• Wilks test: the distribution of Wilks' lambda is known under the following assumptions:
• All variables are normally distributed,
• Classes have identical covariance matrices,
• Classes have identical means (the null hypothesis being tested).
• Software often displays only the p-value of the test statistic rather than the value of Wilks' lambda. Software sometimes displays the value of Wilks' lambda for each individual independent variable. These values may then be regarded as measuring the discriminant power of the corresponding variable.
• Wilks' test and variable selection: Wilks' lambda may also be used for variable selection in Discriminant Analysis. It is possible to build a statistic that is approximately F distributed, and which is a function of the Wilks' lambdas pertaining to:
• A given subset of variables,
• And that same subset to which a new variable has been added.
• An F test is then used for identifying which new variable will most increase the group separation. This variable is then added to the model.

Good and Poor Distributions
• Since we have assumed that the independent variables are normally distributed, it is hoped that at the end of the DA process each group will have a normal distribution of discriminant scores.
• The degree of overlap between the discriminant score distributions can then be used as a measure of the success of the technique.
• The top two distributions in the figure overlap too much and do not discriminate well compared to the bottom set. Misclassification will be minimal in the lower pair, whereas many cases will be misclassified in the top pair.

Discriminant analysis and classification
• Discriminant analysis consists of two stages: in the first stage, the discriminant functions are derived; in the second stage, the discriminant functions are used to classify the cases.
• While discriminant analysis does compute correlation measures to estimate the strength of the relationship, these correlations measure the relationship between the independent variables and the discriminant scores.
• A more useful measure to assess the utility of a discriminant model is classification accuracy, which compares predicted group membership based on the discriminant model to the actual, known group membership, which is the value of the dependent variable.

A Classification Rule
• Compute the distance from the point in question to the centroid of each group. Assign it to the closest group.
• If an observation's discriminant score is less than or equal to some cutoff value, then assign it to group 1; otherwise assign it to group 2.
• What should the cutoff value be?
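The first rule (assign to the closest centroid) can be sketched as follows. This version uses plain Euclidean distance on hypothetical two-variable data; a distance that accounts for the covariance of the predictors, such as the Mahalanobis distance, is a common alternative that is not shown. Function names are illustrative assumptions.

```python
import math


def centroid(rows):
    """Mean point of a list of equal-length tuples."""
    n = len(rows)
    return tuple(sum(r[j] for r in rows) / n for j in range(len(rows[0])))


def nearest_group(point, centroids):
    """Label of the group whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


# Hypothetical (mechanical, verbal) scores for two pre-classified groups.
centroids = {
    "satisfactory": centroid([(40, 42), (44, 40), (42, 45), (46, 43)]),
    "unsatisfactory": centroid([(32, 34), (35, 30), (30, 33), (34, 36)]),
}
print(nearest_group((41, 39), centroids))
```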

Cut-Off Point
• The choice of cut-off point depends on:
• Importance of correct classification,
• Cost of misclassification,
• Prevalence (the lower the prevalence, the higher the proportion of false positives among the positive results).

• The accuracy of classification largely depends upon the selection of the "optimal" cut-off point. Traditionally the cut-off point used in studies was arbitrary, for example 0.5. This lacked theoretical justification.
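The effect of prevalence can be checked with a little arithmetic. The sketch below assumes a classifier with a fixed sensitivity and specificity (both 0.9 here, chosen purely for illustration) and computes the share of positive results that are false positives at two prevalence levels.

```python
def false_positive_share(prevalence, sensitivity, specificity):
    """Proportion of positive results that are false positives."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return false_positives / (true_positives + false_positives)


# Hypothetical classifier: 90% sensitivity and 90% specificity.
share_common = false_positive_share(0.20, 0.9, 0.9)  # condition present in 20% of cases
share_rare = false_positive_share(0.01, 0.9, 0.9)    # condition present in 1% of cases
print(round(share_common, 3), round(share_rare, 3))
```

With the same classifier, the rarer condition yields a much larger share of false positives among the positive results.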

Cutoff Value
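• For data that is multivariate-normal with equal covariances, the optimal cutoff value is:

$$\text{Cutoff Value} = \frac{\bar{Z}_1 + \bar{Z}_2}{2}$$

where $\bar{Z}_1$ and $\bar{Z}_2$ are the mean discriminant scores of group 1 and group 2.
• Even when the data is not multivariate-normal, this cutoff value tends to give good results.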

A Refined Cutoff Value

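• Costs of misclassification may differ.
• Probabilities of group membership may differ.
• The following refined cutoff value accounts for these considerations:

$$\text{Cutoff Value} = \frac{\bar{Z}_1 + \bar{Z}_2}{2} + \frac{S_p^2}{\bar{Z}_1 - \bar{Z}_2}\,\mathrm{LN}\!\left(\frac{p_2\,C(1 \mid 2)}{p_1\,C(2 \mid 1)}\right)$$

where $p_i$ is the prior probability of membership in group i, $C(i \mid j)$ is the cost of classifying an observation into group i when it belongs to group j, and $S_p^2$ is the pooled variance of the discriminant scores.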

Cliff T. Ragsdale, Spreadsheet Modeling & Decision Analysis, A Practical Introduction to Management Science 5th edition
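A small numeric sketch of the two cutoff rules follows. It assumes the refined cutoff has the form: midpoint plus S_p^2 / (Z1_mean - Z2_mean) times LN(p2 C(1|2) / (p1 C(2|1))), with p the prior probabilities, C(i|j) the cost of assigning a group-j case to group i, and S_p^2 the pooled variance of the discriminant scores. All numbers and function names are hypothetical.

```python
import math


def midpoint_cutoff(z1_mean, z2_mean):
    """Cutoff halfway between the two group mean discriminant scores."""
    return (z1_mean + z2_mean) / 2


def refined_cutoff(z1_mean, z2_mean, pooled_var, p1, p2,
                   cost_1_given_2, cost_2_given_1):
    """Midpoint shifted for unequal priors and misclassification costs (assumed form)."""
    adjustment = (pooled_var / (z1_mean - z2_mean)) * math.log(
        (p2 * cost_1_given_2) / (p1 * cost_2_given_1)
    )
    return midpoint_cutoff(z1_mean, z2_mean) + adjustment


# Hypothetical group mean scores, with group 1 scoring lower than group 2.
equal_case = refined_cutoff(-1.0, 1.0, 1.0, 0.5, 0.5, 1.0, 1.0)
unequal_case = refined_cutoff(-1.0, 1.0, 1.0, 0.3, 0.7, 1.0, 1.0)
print(equal_case, unequal_case)
```

With equal priors and costs the logarithm is zero and the refined value equals the midpoint. When group 2 is more likely a priori, the cutoff moves toward the group 1 mean, so fewer cases are assigned to group 1 under the rule "score less than or equal to the cutoff goes to group 1".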

Problem of Misclassification
• Classification models can err in two ways. In the example of bankrupt and non-bankrupt firms:
• First, the model could indicate a low probability of bankruptcy when, in fact, the risk is high. This is referred to as a Type I error. The cost of this error to a creditor would be the loss of interest and principal through default. In addition, creditors could incur recovery costs in a bankruptcy proceeding.
• On the flip side, a model could assign a high risk of bankruptcy to a low-risk firm. The resulting Type II error cost includes the opportunity cost of not lending to a good credit and lost profits. For investors in the firm, the error cost may include the premature sale of securities at distressed prices.
• Altman et al. (1977) provide evidence of an asymmetric cost structure, with an estimated Type I error cost that is higher than the Type II error cost. The cost of a Type I error was estimated from the loan loss experience of banks, and the cost of a Type II error was the opportunity cost of not lending to a non-bankrupt firm because it was predicted to become bankrupt.
• Altman et al. (1977) proposed the ZETA model, which achieves a lower Type I error than Altman's (1968) MDA formulation.

Efforts to calculate “Optimal Cut-Off Point” for bankruptcy models
• Joy and Tollefson (1975), Altman and Eisenbeis (1978) and Altman et al. (1977) calculated the optimal cut-off point using the ZETA model. In selecting the optimal cut-off score of the estimated model, two things should be considered:
• the prior probabilities of belonging to the failing or non-failing group (i.e. population), and
• the costs of a Type I and a Type II error.
• Later, Maddala (1983) developed another optimal cut-off point equation:

• There have been many more attempts to calculate the optimal cut-off score; however, a fixed cut-off probability that can be used in all kinds of institutional arrangements, in different countries and at all times, does not exist.
Source: Kuo H, Lee C, Lin L, Piesse J. Chapter 22, Encyclopedia of Finance (2006), Springer Science+ Business Media Inc.

Some Real Life Examples
• Loan classification problem: we want to classify a new application into one of the potential risk classes to decide whether or not to grant a loan to that individual. Based on past experience with the types of applications, we build a discriminant function. We then decide whether it is a potentially low-risk or high-risk application.
• Warning or alert systems for financial crises or for extreme events: bankruptcy prediction, credit card fraud, currency crises. Looking at the present state of a firm, one tries to classify that state into a particular class of distress.
• Medical diagnostics: constant monitoring of patients on certain health parameters, followed by classification of the patient's health condition as critical or non-critical.
• Predicting success or failure of a new product.

Conducting Discriminant Analysis
1) Formulate the Model
2) Estimate the Discriminant Function Coefficients
3) Determine the Significance of the Discriminant Function
4) Interpret the Results
5) Assess Validity of Discriminant Analysis
Source: Malhotra, N. (2007). Marketing Research: An Applied Orientation. Prentice Hall

Overall test of relationship
• The overall test of the relationship among the independent variables and the groups defined by the dependent variable is a series of tests that each of the functions needed to distinguish among the groups is statistically significant.
• In some analyses, we might discover that two or more of the groups defined by the dependent variable cannot be distinguished using the available independent variables. While it is reasonable to interpret a solution in which there are fewer significant discriminant functions than the maximum number possible, our problems will require that all of the possible discriminant functions be significant.

SPSS Activity: Discriminant Analysis
• A large international air carrier has collected data on employees in three different job classifications: 1) customer service personnel, 2) mechanics and 3) dispatchers. The director of Human Resources wants to know if these three job classifications appeal to different personality types. Each employee is administered a battery of psychological tests which includes measures of interest in outdoor activity, sociability and conservativeness.
• SOURCE: www.ats.ucla.edu

Steps in SPSS
• Analyse >> Classify >> Discriminant
• Select 'JOB' as your grouping variable and enter it into the Grouping Variable box.
• Click the Define Range button and enter the lowest and highest codes for your groups (here 1 and 3).
• Click Continue.
• Select your predictor variables (PVs), enter them into the Independents box and select Enter Independents Together. If you planned a stepwise analysis, you would instead select Use Stepwise Method at this point.
** If the set of predictor variables (PVs) is small, or the objective is simply to determine the discriminating capability of the entire set of PVs with no regard to the impact of an individual PV, then the simultaneous approach (Enter Independents Together) is used. **
** When you have a lot of predictors, the stepwise method can be useful by automatically selecting the "best" variables to use in the model. **
• Click the Statistics button and select Means, Univariate ANOVAs, Box's M, Unstandardized and Within-Groups Correlation.

Specifying statistical output
First, mark the Means checkbox on the Descriptives panel. We will use the group means in our interpretation.

Second, mark the Univariate ANOVAs checkbox on the Descriptives panel. Perusing these tests suggests which variables might be useful discriminators.

Third, mark the Box’s M checkbox. Box’s M statistic evaluates conformity to the assumption of homogeneity of group variances.

Fourth, click on the Continue button to close the dialog box.

Details for classification - 1
First, mark the option button to Compute from group sizes on the Prior Probabilities panel. This incorporates the size of the groups defined by the dependent variable into the classification of cases using the discriminant functions.

Second, mark the Casewise results checkbox on the Display panel to include classification details for each case in the output.

Third, mark the Summary table checkbox to include summary tables comparing actual and predicted classification.

Details for classification - 2

Fourth, mark the Leave-one-out classification checkbox to request SPSS to include a cross-validated classification in the output. This option produces a less biased estimate of classification accuracy by sequentially holding each case out of the calculations for the discriminant functions, and using the derived functions to classify the case held out.

Details for classification - 3

Fifth, accept the default Within-groups option button on the Use Covariance Matrix panel. The covariance matrices are the measure of dispersion in the groups defined by the dependent variable. If we fail the homogeneity of group variances test (Box's M), our option is to use the Separate-groups covariance in classification.

Sixth, mark the Combined-groups checkbox on the Plots panel to obtain a visual plot of the relationship between functions and groups defined by the dependent variable.

Seventh, click on the Continue button to close the dialog box.

Groups, functions, and variables
• To interpret the relationship between an independent variable and the dependent variable, we must first identify how the discriminant functions separate the groups, and then what the role of the independent variable is for each function.
• SPSS provides a table called "Functions at Group Centroids" (multivariate means) that indicates which groups are separated by which functions.
• SPSS provides another table called the "Structure Matrix" which, like its counterpart in factor analysis, identifies the loading, or correlation, between each independent variable and each function. This tells us which variables to interpret for each function. Each variable is interpreted on the function that it loads most highly on.

Which independent variables to interpret
• In a simultaneous discriminant analysis, in which all independent variables are entered together, we only interpret the relationships for independent variables that have a loading of 0.30 or higher on one or more discriminant functions. A variable can have a high loading on more than one function, which complicates the interpretation. We will interpret the variable for the function on which it has the highest loading.
• In a stepwise discriminant analysis, we limit the interpretation of relationships between independent variables and groups defined by the dependent variable to those independent variables that met the statistical test for inclusion in the analysis.

Comparing accuracy rates
• To characterize our model as useful, we compare the cross-validated accuracy rate produced by SPSS to 25% more than the proportional by chance accuracy.
• The cross-validated accuracy rate is a one-at-a-time hold-out method that classifies each case based on a discriminant solution for all of the other cases in the analysis. It is a more realistic estimate of the accuracy rate we should expect in the population because discriminant analysis inflates accuracy rates when the cases classified are the same cases used to derive the discriminant functions.
• Cross-validated accuracy rates are not produced by SPSS when separate covariance matrices are used in the classification.
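As a sketch of the benchmark described above, the proportional by chance accuracy is commonly computed as the sum of the squared group proportions; the usefulness threshold is then 1.25 times that value. The group sizes below (56, 49 and 32) are the ones listed in the prior probabilities table later in this document, and the function name is an illustrative assumption.

```python
def proportional_chance_accuracy(group_sizes):
    """Sum of squared group proportions."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)


group_sizes = [56, 49, 32]  # cases per group used in the analysis
chance = proportional_chance_accuracy(group_sizes)
threshold = 1.25 * chance  # "25% more than" the by-chance accuracy
print(round(chance, 3), round(threshold, 3))
```

A cross-validated accuracy rate above the threshold would characterize the model as useful under this criterion.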

Table 1
Analysis Case Processing Summary

Unweighted Cases                                               N    Percent
Valid                                                        138      51.1
Excluded  Missing or out-of-range group codes                  7       2.6
          At least one missing discriminating variable       115      42.6
          Both missing or out-of-range group codes and
          at least one missing discriminating variable        10       3.7
          Total                                              132      48.9
Total                                                        270     100.0

The minimum ratio of valid cases to independent variables for discriminant analysis is 5 to 1, with a preferred ratio of 20 to 1. In this analysis, there are 138 valid cases and 4 independent variables. The ratio of cases to independent variables is 34.5 to 1, which satisfies the minimum requirement. In addition, the ratio of 34.5 to 1 satisfies the preferred ratio of 20 to 1.

Assumption of equal dispersion for dependent variable groups
In discriminant analysis, the best measure of overall fit is classification accuracy. The appropriateness of using the pooled covariance matrix in computing classifications is evaluated by the Box's M statistic.

We examine the probability of the Box's M statistic to determine whether or not we meet the assumption of equal dispersion, or covariance, matrices (the multivariate measure of variance). This test is very sensitive, so we should select a conservative alpha value of 0.01. At that alpha level, we fail to reject the null hypothesis for this analysis.
Had we failed this test, our remedy would be to re-run the discriminant analysis requesting the use of separate covariance matrices in classification.

Table 2

Prior Probabilities for Groups

                            Cases Used in Analysis
WELFARE           Prior     Unweighted    Weighted
1 TOO LITTLE       .409         56          56.000
2 ABOUT RIGHT      .358         49          49.000
3 TOO MUCH         .234         32          32.000
Total             1.000        137         137.000

In addition to the requirement for the ratio of cases to independent variables, discriminant analysis requires that there be a minimum number of cases in the smallest group defined by the dependent variable. The number of cases in the smallest group must be larger than the number of independent variables, and preferably contain 20 or more cases. The number of cases in the smallest group in this problem is 32, which is larger than the number of independent variables (4), satisfying the minimum requirement. In addition, the number of cases in the smallest group satisfies the preferred minimum of 20 cases.
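The two sample-size checks described in this section (ratio of valid cases to independent variables, and size of the smallest group) can be sketched as a small helper. The thresholds are the ones stated in the text; the function and key names are illustrative assumptions.

```python
def check_sample_size(n_valid, n_predictors, smallest_group):
    """Apply the minimum and preferred sample-size rules stated in the text."""
    ratio = n_valid / n_predictors
    return {
        "ratio": ratio,
        "meets_minimum_ratio": ratio >= 5,       # at least 5 cases per variable
        "meets_preferred_ratio": ratio >= 20,    # preferably 20 cases per variable
        "meets_minimum_group": smallest_group > n_predictors,
        "meets_preferred_group": smallest_group >= 20,
    }


# Values from this analysis: 138 valid cases, 4 independent variables,
# and 32 cases in the smallest group.
result = check_sample_size(138, 4, 32)
print(result)
```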

NUMBER OF DISCRIMINANT FUNCTIONS - 1
The maximum possible number of discriminant functions is the smaller of one less than the number of groups defined by the dependent variable and the number of independent variables. In this analysis there were 3 groups defined by opinion about spending on welfare and 4 independent variables, so the maximum possible number of discriminant functions was 2.

NUMBER OF DISCRIMINANT FUNCTIONS - 2
In the table of Wilks' Lambda, which tested functions for statistical significance, the stepwise analysis identified 2 discriminant functions that were statistically significant. The Wilks' lambda statistic for the test of functions 1 through 2 (chi-square=21.853) had a probability of 0.001, which was less than or equal to the level of significance of 0.05.

After removing function 1, the Wilks' lambda statistic for the test of function 2 (chi-square=7.074) had a probability of 0.029, which was less than or equal to the level of significance of 0.05. The significance of the maximum possible number of discriminant functions supports the interpretation of a solution using 2 discriminant functions.
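The reported probabilities can be reproduced approximately from the chi-square values. The degrees of freedom are not given in the text, so the sketch below assumes them: 6 for the test of functions 1 through 2 and 2 for the test of function 2, which is what the usual rule (p - m)(k - m - 1) gives for p = 3 entered predictors, k = 3 groups and m functions removed. For even degrees of freedom the upper-tail probability has a closed form, so no statistics library is needed. The function name is an illustrative assumption.

```python
import math


def chi_square_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


# ASSUMED degrees of freedom (not stated in the source): 6 and 2.
p_functions_1_to_2 = chi_square_upper_tail_even_df(21.853, 6)
p_function_2 = chi_square_upper_tail_even_df(7.074, 2)
print(round(p_functions_1_to_2, 3), round(p_function_2, 3))
```

Under these assumed degrees of freedom the results round to the probabilities quoted above.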

Independent variables and group membership: relationship of functions to groups
In order to specify the role that each independent variable plays in predicting group membership on the dependent variable, we must link together the relationship between the discriminant functions and the groups defined by the dependent variable, the role of the significant independent variables in the discriminant functions, and the differences in group means for each of the variables.

Functions at Group Centroids

WELFARE      Function 1    Function 2
1              -.220          .235
2               .446         -.031
3              -.311         -.362

Unstandardized canonical discriminant functions evaluated at group means

Function 1 separates survey respondents who thought we spend about the right amount of money on welfare (positive value of 0.446) from survey respondents who thought we spend too much (negative value of -0.311) or too little money (negative value of -0.220) on welfare.

Function 2 separates survey respondents who thought we spend too little money on welfare (positive value of 0.235) from survey respondents who thought we spend too much money (negative value of -0.362) on welfare. We ignore the second group (-0.031) in this comparison because it was distinguished from the other two groups by function 1.

Independent variables and group membership: which predictors to interpret

Variables Entered/Removed (a,b,c,d)

                                               Min. D Squared               Exact F
Step  Entered                                  Statistic  Between Groups    Statistic  df1  df2      Sig.
1     NUMBER OF HOURS WORKED LAST WEEK           .023     1 and 3             .475     1    135.000  .492
2     R SELF-EMP OR WORKS FOR SOMEBODY           .251     1 and 2            3.289     2    134.000  .040
3     HIGHEST YEAR OF SCHOOL COMPLETED           .364     1 and 3            2.433     3    133.000  .068

At each step, the variable that maximizes the Mahalanobis distance between the two closest groups is entered.
a. Maximum number of steps is 8.
b. Maximum significance of F to enter is .05.
c. Minimum significance of F to remove is .10.
d.

When we use the stepwise method of variable inclusion, we limit our interpretation of independent variable predictors to those listed as statistically significant in the table of Variables Entered/Removed. We will interpret the impact on membership in groups defined by the dependent variable of the following independent variables:
• number of hours worked in the past week
• self-employment
• highest year of school completed
Had we used simultaneous entry of all variables, we would not have imposed this limitation.

Independent variables and group membership: predictor loadings on functions
We do not interpret loadings in the structure matrix unless they are 0.30 or higher.
Structure Matrix

                                         Function 1    Function 2
HIGHEST YEAR OF SCHOOL COMPLETED            .687*         .136
NUMBER OF HOURS WORKED LAST WEEK           -.582*         .345
R SELF-EMP OR WORKS FOR SOMEBODY            .223          .889*
RESPONDENTS INCOME (a)                      .101          .292*

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and any discriminant function
a. This variable not used in the analysis.

Based on the structure matrix, the predictor variables strongly associated with discriminant function 1, which distinguished between survey respondents who thought we spend about the right amount of money on welfare and survey respondents who thought we spend too much or too little money on welfare, were number of hours worked in the past week (r=-0.582) and highest year of school completed (r=0.687).

Based on the structure matrix, the predictor variable strongly associated with discriminant function 2, which distinguished between survey respondents who thought we spend too little money on welfare and survey respondents who thought we spend too much money on welfare, was self-employment (r=0.889).

Independent variables and group membership: predictors associated with first function - 1
Group Statistics

                                                                            Valid N (listwise)
WELFARE         Variable                            Mean   Std. Deviation   Unweighted  Weighted
1 TOO LITTLE    NUMBER OF HOURS WORKED LAST WEEK    43.96      13.240           56        56.000
                HIGHEST YEAR OF SCHOOL COMPLETED    13.73       2.401           56        56.000
                R SELF-EMP OR WORKS FOR SOMEBODY     1.93        .260           56        56.000
                RESPONDENTS INCOME                  13.70       5.034           56        56.000
2 ABOUT RIGHT   NUMBER OF HOURS WORKED LAST WEEK    37.90      13.235           50        50.000
                HIGHEST YEAR OF SCHOOL COMPLETED    14.78       2.558           50        50.000
                R SELF-EMP OR WORKS FOR SOMEBODY     1.90        .303           50        50.000
                RESPONDENTS INCOME                  14.00       5.503           50        50.000
3 TOO MUCH      NUMBER OF HOURS WORKED LAST WEEK    42.03      10.456           32        32.000
                HIGHEST YEAR OF SCHOOL COMPLETED    13.38       2.524           32        32.000
                R SELF-EMP OR WORKS FOR SOMEBODY     1.75        .440           32        32.000
                RESPONDENTS INCOME                  14.75       5.304           32        32.000
Total           NUMBER OF HOURS WORKED LAST WEEK    41.32      12.846          138       138.000
                HIGHEST YEAR OF SCHOOL COMPLETED    14.03       2.537          138       138.000

The average number of hours worked in the past week for survey respondents who thought we spend about the right amount of money on welfare (mean=37.90) was lower than the average number of hours worked in the past week for survey respondents who thought we spend too little money on welfare (mean=43.96) and survey respondents who thought we spend too much money on welfare (mean=42.03).

This supports the relationship that "survey respondents who thought we spend about the right amount of money on welfare worked fewer hours in the past week than survey respondents who thought we spend too little or too much money on welfare."

Independent variables and group membership: predictors associated with first function - 2
The average highest year of school completed for survey respondents who thought we spend about the right amount of money on welfare (mean=14.78) was higher than the average highest year of school completed for survey respondents who thought we spend too little money on welfare (mean=13.73) and survey respondents who thought we spend too much money on welfare (mean=13.38).

This supports the relationship that "survey respondents who thought we spend about the right amount of money on welfare had completed more years of school than survey respondents who thought we spend too little or too much money on welfare."

Independent variables and group membership: predictors associated with second function

Since self-employment is a dichotomous variable, the mean is not directly interpretable. Its interpretation must take into account the coding by which 1 corresponds to self-employed and 2 corresponds to working for someone else.

The lower mean for survey respondents who thought we spend too much money on welfare (mean=1.75), when compared to the mean for survey respondents who thought we spend too little money on welfare (mean=1.93), implies that the group contained more survey respondents who were self-employed and fewer survey respondents who were working for someone else.

This supports the relationship that "survey respondents who thought we spend too much money on welfare were more likely to be self-employed than survey respondents who thought we spend too little money on welfare."
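The reading of the dichotomous mean can be made concrete with a little arithmetic. The sketch below assumes every case is coded either 1 (self-employed) or 2 (works for someone else) with no other codes, so that mean = 1·p + 2·(1 − p) and therefore p = 2 − mean. The function name `share_self_employed` is illustrative and not part of SPSS.

```python
def share_self_employed(mean_code: float) -> float:
    """Proportion coded 1 implied by the mean of a variable coded only 1 or 2.

    Assumption: mean = 1*p + 2*(1 - p) = 2 - p, hence p = 2 - mean.
    """
    return 2.0 - mean_code


# Group means for R SELF-EMP OR WORKS FOR SOMEBODY, as reported in the slides.
group_means = {"TOO LITTLE": 1.93, "ABOUT RIGHT": 1.90, "TOO MUCH": 1.75}

shares = {group: round(share_self_employed(m), 2) for group, m in group_means.items()}
for group, share in shares.items():
    print(f"{group}: about {share:.0%} self-employed")
```

Under that coding assumption, roughly a quarter of the "too much" group would be self-employed, against well under a tenth of the "too little" group, which is the direction the interpretation above describes.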

CLASSIFICATION USING THE DISCRIMINANT MODEL: by chance accuracy rate
The independent variables can be characterized as useful predictors of membership in the groups defined by the dependent variable if the cross-validated classification accuracy rate is significantly higher than the accuracy attainable by chance alone. Operationally, the cross-validated classification accuracy rate should be at least 25% higher than the proportional by chance accuracy rate. The proportional by chance accuracy rate was computed by squaring and summing the proportion of cases in each group from the table of prior probabilities for groups (0.406² + 0.362² + 0.232² = 0.350).

Prior Probabilities for Groups

WELFARE          Prior    Cases Used in Analysis
                          Unweighted   Weighted
1 TOO LITTLE     .406     56           56.000
2 ABOUT RIGHT    .362     50           50.000
3 TOO MUCH       .232     32           32.000
Total            1.000    138          138.000
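As a check on the arithmetic, the proportional by chance accuracy rate and the 25% criterion can be recomputed from the unweighted group counts in the prior probabilities table. This is a minimal sketch; the variable names are illustrative.

```python
# Unweighted cases used in the analysis, from the prior probabilities table.
counts = {"1 TOO LITTLE": 56, "2 ABOUT RIGHT": 50, "3 TOO MUCH": 32}

total = sum(counts.values())
priors = {group: n / total for group, n in counts.items()}

# Proportional by chance accuracy: sum of the squared group proportions.
by_chance = sum(p ** 2 for p in priors.values())

# Criterion used in the slides: 25% higher than the by chance rate.
criterion = 1.25 * by_chance

print({group: round(p, 3) for group, p in priors.items()})
print(f"by chance = {by_chance:.3f}, criterion = {criterion:.3f}")
```

Working from the counts rather than the rounded priors gives the same figures as the text to three decimals.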

CLASSIFICATION USING THE DISCRIMINANT MODEL: criteria for classification accuracy

The cross-validated accuracy rate computed by SPSS was 50.0%, which was greater than or equal to the proportional by chance accuracy criterion of 43.7% (1.25 x 35.0% = 43.7%). The criterion for classification accuracy is satisfied.

Classification Results (b, c)

                                              Predicted Group Membership
                            WELFARE           1 TOO LITTLE   2 ABOUT RIGHT   3 TOO MUCH   Total
Original             Count  1 TOO LITTLE      43             15              6            64
                            2 ABOUT RIGHT     26             30              6            62
                            3 TOO MUCH        17             10              9            36
                            Ungrouped cases   3              3               2            8
                     %      1 TOO LITTLE      67.2           23.4            9.4          100.0
                            2 ABOUT RIGHT     41.9           48.4            9.7          100.0
                            3 TOO MUCH        47.2           27.8            25.0         100.0
                            Ungrouped cases   37.5           37.5            25.0         100.0
Cross-validated (a)  Count  1 TOO LITTLE      43             15              6            64
                            2 ABOUT RIGHT     26             30              6            62
                            3 TOO MUCH        17             11              8            36
                     %      1 TOO LITTLE      67.2           23.4            9.4          100.0
                            2 ABOUT RIGHT     41.9           48.4            9.7          100.0
                            3 TOO MUCH        47.2           30.6            22.2         100.0

a. Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the functions derived from all cases other than that case.
b. 50.6% of original grouped cases correctly classified.
c. 50.0% of cross-validated grouped cases correctly classified.
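The two accuracy rates in the footnotes can be recovered from the classification counts: the hit rate is the sum of the diagonal (correctly classified cases) divided by the number of grouped cases. The sketch below uses the counts from the table and the 43.7% criterion stated in the text; the helper name `hit_rate` is illustrative.

```python
def hit_rate(matrix):
    """Share of cases on the diagonal of an actual-by-predicted count matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Rows: actual group (1 TOO LITTLE, 2 ABOUT RIGHT, 3 TOO MUCH).
# Columns: predicted group, in the same order. Ungrouped cases are excluded.
original = [[43, 15, 6], [26, 30, 6], [17, 10, 9]]
cross_validated = [[43, 15, 6], [26, 30, 6], [17, 11, 8]]

criterion = 0.437  # 1.25 x proportional by chance accuracy, as given in the text

original_rate = hit_rate(original)
cv_rate = hit_rate(cross_validated)
criterion_met = cv_rate >= criterion

print(f"original: {original_rate:.1%}, cross-validated: {cv_rate:.1%}")
print("criterion satisfied:", criterion_met)
```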

Stepwise Discriminant Analysis


Stepwise discriminant analysis is analogous to stepwise multiple regression in that the predictors are entered sequentially based on their ability to discriminate between the groups.
An F ratio is calculated for each predictor by conducting a univariate analysis of variance in which the groups are treated as the categorical variable and the predictor as the criterion variable. The predictor with the highest F ratio is the first to be selected for inclusion in the discriminant function, if it meets certain significance and tolerance criteria.







A second predictor is added based on the highest adjusted or partial F ratio, taking into account the predictor already selected.


Each predictor selected is tested for retention based on its association with other predictors selected.
The process of selection and retention is continued until all predictors meeting the significance criteria for inclusion and retention have been entered in the discriminant function. The order in which the variables were selected also indicates their importance in discriminating between the groups.
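The first step of the procedure can be sketched as follows: compute a univariate one-way ANOVA F ratio for each predictor and pick the largest. The data below are hypothetical (they are not the survey data), the predictor names are placeholders, and the sketch leaves out the significance and tolerance checks and the later partial F steps that a full stepwise run applies.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F ratio; `groups` is a list of lists of predictor values."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of cases
    grand = mean([v for g in groups for v in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical scores for three groups of three cases each.
predictors = {
    "hours": [[44, 46, 42], [38, 36, 40], [42, 41, 43]],
    "educ": [[13, 14, 15], [14, 15, 16], [13, 14, 12]],
}

f_values = {name: f_ratio(groups) for name, groups in predictors.items()}
first_entered = max(f_values, key=f_values.get)

print(f_values)
print("first predictor considered for entry:", first_entered)
```

With these made-up numbers, "hours" separates the groups more strongly than "educ", so it would be the first candidate for entry.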





Stepwise Discriminant Analysis: Methods
 Wilks' lambda. A variable selection method for stepwise discriminant analysis that chooses variables for entry into the equation on the basis of how much they lower Wilks' lambda. At each step, the variable that minimizes the overall Wilks' lambda is entered.

 Unexplained variance. At each step, the variable that minimizes the sum of the unexplained variation between groups is entered.
 Mahalanobis distance. A measure of how much a case's values on the independent variables differ from the average of all cases. A large Mahalanobis distance identifies a case as having extreme values on one or more of the independent variables.

 Smallest F ratio. A method of variable selection in stepwise analysis based on maximizing an F ratio computed from the Mahalanobis distance between groups.
 Rao's V. A measure of the differences between group means. Also called the Lawley-Hotelling trace. At each step, the variable that maximizes the increase in Rao's V is entered. In SPSS, after selecting this option, enter the minimum value a variable must have to enter the analysis.
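To illustrate the Wilks' lambda criterion, the sketch below computes lambda as det(W) / det(T), where W is the pooled within-groups sums of squares and cross-products matrix and T is the total one. It is limited to two predictors so the determinant can be written by hand, and the data are hypothetical. With a single predictor the ratio reduces to within-groups over total sum of squares.

```python
def sscp(rows):
    """2x2 sums of squares and cross-products matrix about the mean of `rows`."""
    n = len(rows)
    m0 = sum(r[0] for r in rows) / n
    m1 = sum(r[1] for r in rows) / n
    s00 = sum((r[0] - m0) ** 2 for r in rows)
    s11 = sum((r[1] - m1) ** 2 for r in rows)
    s01 = sum((r[0] - m0) * (r[1] - m1) for r in rows)
    return [[s00, s01], [s01, s11]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Hypothetical (hours, educ) pairs for three groups; not the survey data.
groups = {
    "A": [(44, 13), (46, 14), (42, 15)],
    "B": [(38, 14), (36, 15), (40, 16)],
    "C": [(42, 13), (41, 14), (43, 12)],
}

all_rows = [row for rows in groups.values() for row in rows]
T = sscp(all_rows)                      # total SSCP
W = [[0.0, 0.0], [0.0, 0.0]]            # pooled within-groups SSCP
for rows in groups.values():
    S = sscp(rows)
    for i in range(2):
        for j in range(2):
            W[i][j] += S[i][j]

lambda_hours = W[0][0] / T[0][0]        # hours alone
lambda_educ = W[1][1] / T[1][1]         # educ alone
lambda_both = det2(W) / det2(T)         # both predictors together

print(round(lambda_hours, 4), round(lambda_educ, 4), round(lambda_both, 4))
```

In this toy example "hours" gives the smaller single-variable lambda, so it would enter first, and adding "educ" lowers the overall lambda further.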

Mahalanobis D
 The Mahalanobis distance is a rule for calculating the distance between two points. It plays an important role in two usual cases:
 Distance of a point to the mean of a distribution,
 And distance between the means of two distributions.
 Better than Euclidean distance in certain cases: in this image, the two points A and B are equally distant from the centre µ of the distribution.
 Yet it seems inappropriate to say that they occupy "equivalent" positions with respect to µ, as:
 A is in a low density (probability) region,
 While B is in a high density (probability) region.
 So, in a situation like this one, the usual Euclidean distance $d^2(A, \mu) = \sum_i (a_i - \mu_i)^2$, where $a_i$ are the coordinates of A, does not seem to be the right tool for measuring the "distance" of a point to the centre of the distribution.

 We would instead consider two points with the same probability density as points "equally distant from the mean", as this would make them equally probable when drawing observations from the distribution.
 So we use the Mahalanobis distance instead of the Euclidean distance: $D^2 = (x - \mu)' \Sigma^{-1} (x - \mu)$, with $\Sigma$ the covariance matrix of the distribution. D is called the Mahalanobis distance of the point x to the mean µ of the distribution.
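The formula can be tried on a small two-dimensional case. In the sketch below the mean, the covariance matrix and the points A and B are all made-up values chosen so that A and B are equally far from the mean in the Euclidean sense while lying in regions of different density. The 2x2 inverse is written out by hand to keep the example free of external libraries.

```python
def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance D^2 = (x - mu)' cov^{-1} (x - mu) in 2-D."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    return (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))


def euclidean_sq(x, mu):
    """Squared Euclidean distance between two 2-D points."""
    return (x[0] - mu[0]) ** 2 + (x[1] - mu[1]) ** 2


mu = (0.0, 0.0)
cov = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical: positively correlated variables

A = (1.0, -1.0)                  # across the correlation: low density region
B = (1.0, 1.0)                   # along the correlation: high density region

print("Euclidean d^2:  ", euclidean_sq(A, mu), euclidean_sq(B, mu))
print("Mahalanobis D^2:", mahalanobis_sq(A, mu, cov), mahalanobis_sq(B, mu, cov))
```

Both points have the same squared Euclidean distance from the mean, but A has the larger Mahalanobis distance, which matches the idea that A sits in a lower density region than B.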

Mahalanobis distance and Discriminant Analysis
 Suppose you want to discriminate between two equally extended spherical classes with equal a priori probabilities. Then the best classification rule is simply to assign an observation x to the class whose centre (mean) is closer to x in the sense of the ordinary Euclidean distance.
 This no longer holds if the classes are not spherical. We should then assign x to the class to which it has the larger probability of belonging, that is, the class with the largest probability density at x (because of the equal a priori probabilities), and therefore the class with the lower value of the Mahalanobis distance of x to the class mean. For example, in the lower image of the above illustration, x should be assigned to class C1 although it is in "C2 territory" from a Euclidean point of view.
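The contrast between the two rules can be shown with a toy case. The sketch below assumes two classes with equal prior probabilities and, as a simplification not stated in the slides, one shared covariance matrix; the class means, the covariance and the observation x are invented for illustration.

```python
def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance in 2-D, with the 2x2 inverse written out."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    return (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))


def euclidean_sq(x, mu):
    """Squared Euclidean distance between two 2-D points."""
    return (x[0] - mu[0]) ** 2 + (x[1] - mu[1]) ** 2


# Hypothetical class centres; both classes are stretched along the first axis.
means = {"C1": (0.0, 0.0), "C2": (2.0, 2.0)}
shared_cov = [[4.0, 0.0], [0.0, 0.25]]   # assumption: common covariance matrix

x = (2.0, 0.5)                           # observation to classify

by_euclidean = min(means, key=lambda c: euclidean_sq(x, means[c]))
by_mahalanobis = min(means, key=lambda c: mahalanobis_sq(x, means[c], shared_cov))

print("nearest centre (Euclidean):  ", by_euclidean)
print("nearest centre (Mahalanobis):", by_mahalanobis)
```

Here x is closer to the centre of C2 in the Euclidean sense, yet its Mahalanobis distance to C1 is smaller, so the second rule assigns it to C1.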

