Monday, June 15, 2020

What is factor analysis?

1.0 DEFINITION OF FACTOR ANALYSIS

Factor analysis (FA) is a latent structure approach used to analyze the interrelationships among a large number of variables by explaining them in terms of underlying unobservable variables (latent variables), known as factors, that are reflected in the observed variables (manifest variables). With FA, the researcher can first identify the separate dimensions of the structure and then determine the extent to which each variable is explained by each dimension. Once these dimensions and the explanation of each variable are determined, summarization and reduction of the data can be achieved.

In summarizing the data, FA describes the underlying dimensions of the data in a much smaller number of items than the original variables by examining the pattern of correlations (or covariances) between the observed measures. Data reduction is achieved by calculating scores for each underlying dimension and substituting them for the original variables. FA is an interdependence technique in which variates (factors) are formed to maximize their explanation of the entire variable set. These groups of variables represent dimensions within the data, which the researcher then needs to label.

Basically, there are two types of FA: exploratory and confirmatory. The first is used to discover the nature of the constructs that influence a set of responses; the latter tests whether a specified set of constructs influences responses in a predicted way.

Data summarization. The goal of data summarization is achieved by defining a small number of factors that adequately represent the original set of variables.

Data reduction. Data reduction is achieved by identifying representative variables from a much larger set for use in subsequent multivariate analyses, or by creating an entirely new set of variables whilst retaining the nature and character of the original variables.
Data reduction relies on the factor loadings, using them as a basis either for identifying variables for subsequent analysis with other techniques or for making estimates of the factors themselves (factor scores or summated scales), which then replace the original variables in subsequent analysis. Factor analytic techniques are run according to their purpose, from either an exploratory or a confirmatory perspective. Many researchers use Exploratory Factor Analysis (EFA) when searching for structure among a set of variables or as a data reduction technique. Unlike Confirmatory Factor Analysis (CFA), EFA does not set any a priori constraints on the estimation of the components or on the number of components to be extracted. CFA is used to confirm what is expected on the basis of pre-established theory.

2.0 PURPOSE OF FACTOR ANALYSIS

The primary purpose of FA is to discover simple patterns in the relationships among variables by defining the underlying structure in a data matrix. This is done through data summarization and data reduction.

3.0 HISTORY OF FACTOR ANALYSIS

FA was pioneered in 1904 by the psychologist Charles Spearman, who hypothesized that the enormous variety of tests of mental ability (measures of mathematical skill, vocabulary, verbal skills and others) could be explained by one underlying factor of general intelligence, which he called g. FA was developed to analyze these test scores so as to determine whether g is made up of a single underlying general factor or of several more limited factors measuring attributes such as mathematical ability. Raymond Cattell expanded on Spearman's work by using a multi-factor theory to explain intelligence. He also developed several methods, such as the scree test and the coefficient of profile similarity. His statistical methods led to improved versions of factor analysis by later statisticians.
4.0 PRINCIPAL COMPONENT ANALYSIS (PCA) VERSUS FACTOR ANALYSIS (FA)

There are many debates among statisticians on the differences between principal component analysis and FA. A distinct difference is that FA assumes the measured responses are caused by underlying factors, whereas principal components are defined directly from the measured responses. PCA is used when the objective is to summarize most of the original information (variance) in a minimum number of factors for prediction purposes. In contrast, FA is used primarily to identify underlying factors or dimensions that reflect what the variables share in common. Principal components are defined as linear combinations of the measurements and contain small proportions of unique variance and, in some instances, error variance, whilst FA considers only the common or shared variance, assuming that both the unique and the error variance are not of interest in defining the structure of the variables. PCA produces an orthogonal transformation of the variables without taking an underlying model into consideration, whilst FA is based on a proper statistical model and is more concerned with explaining the covariance structure of the variables than with explaining the variance (Chatfield, 1980). The calculation of PC scores is straightforward, whilst the calculation of factor scores is more complex and a variety of methods can be used. From a practical perspective, PCA is most appropriate when the primary concern is data reduction, focusing on the minimum number of factors needed to account for the maximum portion of the total variance represented in the original set of variables. FA is most appropriate when the primary objective is to identify the latent dimensions or constructs represented in the original variables.

5.0 STEPS IN FACTOR ANALYSIS

5.1 TEST ASSUMPTIONS

5.1.1 FA is robust to assumptions of normality

If the variables are normally distributed, the solution is enhanced. Normality can be checked by inspecting skewness and kurtosis values or by applying a formal test such as the Shapiro-Wilk test.
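As a minimal sketch of the normality check described above, the Shapiro-Wilk test together with skewness and kurtosis can be computed with scipy; the data here are hypothetical simulated item scores, not from any real study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
scores = rng.normal(loc=50, scale=10, size=200)  # hypothetical item scores

# Shapiro-Wilk test: the null hypothesis is that the data are normal,
# so a large p-value means no evidence against normality
stat, p_value = stats.shapiro(scores)

# Skewness and excess kurtosis near 0 also suggest approximate normality
skew = stats.skew(scores)
kurt = stats.kurtosis(scores)  # Fisher definition: normal -> 0

print(f"Shapiro-Wilk W = {stat:.3f}, p = {p_value:.3f}")
print(f"skewness = {skew:.3f}, excess kurtosis = {kurt:.3f}")
```

In practice this check is run separately for each variable entering the analysis.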
5.1.2 Measure the sampling adequacy of sample size

There are many proposed sample sizes for FA. Guilford (1954) recommended that the sample size should be at least 200, whilst Hair, Black, Babin and Anderson (2010) stated that the minimum is to have at least five times as many observations as the number of variables to be analyzed, and that a more acceptable size would be a 10:1 ratio. Comrey and Lee (1992) provided the following guidance in determining the adequacy of sample size:

Table 1: Determining the Adequacy of Sample Size

Sample Size      Indication
100              Poor
200              Fair
300              Good
500              Very good
1,000 or more    Excellent

5.1.3 All variables must be suitable for correlational analysis

The sample must be homogeneous with respect to the underlying factor structure. It is inappropriate in FA to pool subsets of cases known to differ, such as by gender, as this will misrepresent the unique structure of each group. There are various ways to quantify the degree of intercorrelation among the variables, such as the Measure of Sampling Adequacy (MSA). The index ranges from 0 to 1, reaching 1 when each variable is perfectly predicted without error by the other variables. If the MSA value falls below 0.50, the researcher should identify variables for deletion in order to achieve an overall value of at least 0.50. According to Hair et al. (2010), it can be interpreted as follows:

Table 2: Measure of Sampling Adequacy (MSA)

Measure of Sampling Adequacy    Indication
0.80 or above                   Meritorious
0.70 or above                   Middling
0.60 or above                   Mediocre
0.50 or above                   Miserable
Below 0.50                      Unacceptable

Other methods of determining the appropriateness of FA are Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure, statistical tests for the presence of correlations among the variables that indicate whether the correlation matrix contains significant correlations among at least some of the variables. The KMO should be greater than 0.5.
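The overall KMO index can be computed from the correlation matrix and the partial (anti-image) correlations derived from its inverse. The sketch below is a minimal from-scratch illustration with hypothetical data; statistical packages report the same quantity (plus per-variable MSA values):

```python
import numpy as np

def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy.

    KMO = sum of squared correlations /
          (sum of squared correlations + sum of squared partial correlations)
    """
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial (anti-image) correlations from the inverse correlation matrix
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d
    np.fill_diagonal(corr, 0.0)      # exclude the diagonal from both sums
    np.fill_diagonal(partial, 0.0)
    r2, p2 = np.sum(corr**2), np.sum(partial**2)
    return r2 / (r2 + p2)

# Hypothetical example: four items driven by two correlated latent factors
rng = np.random.default_rng(0)
factors = rng.normal(size=(300, 2))
loadings = np.array([[.8, .1], [.7, .2], [.1, .8], [.2, .7]])
data = factors @ loadings.T + 0.5 * rng.normal(size=(300, 4))

k = kmo(data)
print(f"KMO = {k:.3f}")  # values above 0.5 support running FA
```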
The factor analyst must ensure that the data matrix has sufficient correlations to justify the application of FA. The anti-image correlation matrix can be used to indicate whether the data matrix is suitable for FA. It is based on the correlations of the variables left unpredicted when each variable is regressed on the others. FA should not be performed when the anti-image (MSA) value is less than 0.5, due to a lack of sufficient correlation with the other variables.

6.0 SELECT TYPE OF ANALYSIS

6.2.1 EXTRACTION

In FA, researchers group variables by their correlations, such that the variables in a group (factor) have high correlations with each other. It is important to understand how much of a variable's variance is shared with the other variables in that factor versus how much cannot be shared. The total variance of any variable is composed of its common, unique and error variances. As a variable becomes more highly correlated with one or more other variables, the common variance, known as communality, increases.

6.2.2 ROTATION

This important tool refers to the movement of the reference axes of the factors from the origin to some other position. The ultimate effect of rotating the factor matrix is to redistribute the variance from earlier factors to later ones in order to achieve a simpler, theoretically more meaningful pattern. There are two types of rotation: orthogonal factor rotation and oblique factor rotation. In orthogonal factor rotation the axes are maintained at 90 degrees, whereas in oblique rotation they are not. The major orthogonal approaches are Varimax, Quartimax and Equimax. The Varimax method encourages the detection of factors, each of which is related to few variables; Quartimax, on the other hand, seeks to maximize the variance of the squared loadings for each variable and tends to produce factors with high loadings for all variables. Equimax is a compromise between Varimax and Quartimax.
For oblique factor rotation, methods such as Oblimin, Promax, Orthoblique, Dquart and Doblimin have been developed. Oblimin allows factors to covary and correlate with each other. The researcher needs to choose either orthogonal or oblique factor rotation based on the particular needs of a given research problem. Hair et al. (2010) suggested that orthogonal rotation methods are preferred when the research goal is data reduction to either a smaller number of variables or a set of uncorrelated measures for subsequent use in other multivariate techniques, whereas oblique rotation methods are best suited to the goal of obtaining several theoretically meaningful factors or constructs.

The Significance of Factor Loadings

Factor loadings indicate how strongly a measured variable is correlated with a factor. A 0.30 loading translates to approximately 10 percent explanation, and a 0.50 loading indicates that 25 percent of the variance is accounted for by the factor. Using the practical significance of factor loadings, Hair et al.
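The orthogonal Varimax rotation described above can be sketched in a few lines of numpy. This is a minimal SVD-based version of Kaiser's criterion; the loading matrix is hypothetical, and real packages add refinements such as Kaiser normalization:

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Varimax rotation of a factor loading matrix (minimal sketch)."""
    L = loadings.copy()
    p, k = L.shape
    R = np.eye(k)          # accumulated orthogonal rotation matrix
    d_old = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Gradient of the varimax criterion (variance of squared loadings)
        B = L.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(B)
        R = u @ vt
        d_new = s.sum()
        if d_new < d_old * (1 + tol):
            break
        d_old = d_new
    return L @ R

# Hypothetical unrotated loadings for six variables on two factors
L = np.array([[.6, .6], [.7, .5], [.6, .5],
              [.6, -.5], [.7, -.6], [.5, -.6]])
Lr = varimax(L)
print(np.round(Lr, 2))
```

Because the rotation matrix is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the distribution of variance across factors is simplified.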
(2010) proposed the following (for sample sizes of 100 or above):

Table 3: Significance of Factor Loadings

Factor Loadings       Indication
±0.30 to ±0.49        Meets the minimal level for interpretation of structure
±0.50 or greater      Practically significant
±0.70 or greater      Indicative of well-defined structure

Comrey and Lee (1992) also proposed practical significance levels for factor loadings:

Table 4: Significance of Factor Loadings

Factor Loadings       Indication
0.71 or above         Excellent
0.63 to 0.70          Very good
0.55 to 0.62          Good
0.45 to 0.54          Fair
0.32 to 0.44          Poor

In relation to the tables above, Hair et al. (2010) provide guidelines for identifying significant factor loadings based on sample size:

Table 5: Guidelines for Identifying Significant Factor Loadings Based on Sample Size

Factor Loading    Sample Size Needed for Significance*
0.30              350
0.35              250
0.40              200
0.45              150
0.50              120
0.55              100
0.60              85
0.65              70
0.70              60
0.75              50

* Significance is based on a 0.05 significance level (α), a power level of 80 percent, and standard errors assumed to be twice those of conventional correlation coefficients. Source: computation made with SOLO Power Analysis, BMDP Statistical Software, Inc., 1993.

Assess the Communalities of Variables

Communality measures the percentage of variance in a given variable explained by all the factors jointly, and may be interpreted as the reliability of the indicator. Communalities are used to flag variables that are not adequately accounted for by the factor solution. Variables with communalities less than 0.50 are considered not to have an acceptable level of explanation, and researchers may then need to extract more factors to explain the variance.

6.3 DETERMINE NUMBER OF FACTORS

There are a number of methods to determine the optimal number of factors.

Latent Root Criterion/Kaiser Criterion.
The latent root criterion, also known as the Kaiser criterion, states that factors whose latent roots (eigenvalues of the correlation matrix) are greater than 1 are considered significant. An eigenvalue is the amount of variance accounted for by a factor. Hair et al. (2010) suggested that using the eigenvalue for establishing a cutoff is most reliable when the number of variables is between 20 and 50.

Scree Test Criterion. The Cattell scree test is derived by plotting the latent roots against the number of factors in their order of extraction, and the shape of the resulting curve is used to evaluate the cutoff point. In a scree plot, as one moves to the right toward later components, the eigenvalues drop. The Cattell scree test says to drop all components after the one at the start of the elbow (the point after which the remaining eigenvalues decline in an approximately linear fashion).

Variance Criterion. The variance criterion is an approach to ensuring practical significance for the derived factors, in which the cumulative percentage of variance extracted by successive factors serves as the cutoff. Hair et al. (2010) noted that it is not uncommon to accept a solution that accounts for 60 percent of the total variance as satisfactory.

6.4 NAME AND DEFINE FACTORS

As the correlated variables group together, the researcher needs to label each group with a name that represents its variables as accurately as possible.

6.5 ANALYSE INTERNAL RELIABILITY

Reliability here refers to internal consistency: the rationale is that the individual items or indicators of the scale should all measure the same construct and therefore be highly correlated. There are two diagnostic approaches to reliability: examining the item-to-total and inter-item correlations, or computing the reliability coefficient.
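The reliability coefficient most often reported is Cronbach's coefficient alpha. A minimal sketch of its computation follows, using hypothetical item responses generated from a single underlying trait (the data and parameters are illustrative only):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's coefficient alpha for a (respondents x items) matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the scale total
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: five items all driven by one underlying trait
rng = np.random.default_rng(1)
trait = rng.normal(size=(200, 1))
items = trait + 0.8 * rng.normal(size=(200, 5))

alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Because the simulated items share a common trait, alpha comes out high; unrelated items would drive it toward zero.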
If the researcher chooses the first approach, the item-to-total correlations should exceed 0.50 and the inter-item correlations should exceed 0.30. For the reliability coefficient, Zikmund, Babin, Carr and Griffin (2010) provide the following guideline:

Table 6: Coefficient Alpha (α) to Determine Reliability

Coefficient alpha (α)    Indication
0.80 to 0.95             Very good
0.70 to 0.80             Good
0.60 to 0.70             Fair
Below 0.60               Poor

7.0 EXPLORATORY FACTOR ANALYSIS USING THE STATISTICAL PACKAGE FOR THE SOCIAL SCIENCES (SPSS)

The original post reproduced the full SPSS output for a worked example with 16 attitude items (att1 to att16): the 16 x 16 correlation matrix, the anti-image covariance and correlation matrices, the KMO and Bartlett's test table, the communalities, the total variance explained, and the unrotated and rotated factor matrices. The key results were:

- KMO measure of sampling adequacy = .914 (meritorious); Bartlett's test of sphericity: approx. chi-square = 2491.010, df = 120, sig. = .000. Both indicate that the data are appropriate for FA.
- Extraction method: Principal Axis Factoring (3 factors extracted, 18 iterations required). Extraction communalities ranged from .164 (att8) to .682 (att16).
- Three factors had initial eigenvalues greater than 1 (6.452, 1.340 and 1.062), together accounting for 55.3 percent of the total variance (46.1 percent after extraction).
- Varimax rotation with Kaiser normalization converged in 6 iterations. In the rotated factor matrix, att9, att16, att1, att2, att10, att6, att15, att7, att8 and att12 load most strongly on Factor 1; att3, att13, att14 and att4 on Factor 2; and att5 and att11 on Factor 3.
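The latent root (Kaiser) and variance criteria used to choose three factors in the SPSS example can be illustrated directly from eigenvalues of a correlation matrix. The small matrix below is hypothetical (two clusters of three items), not the 16-item example:

```python
import numpy as np

# Hypothetical correlation matrix: items 1-3 and items 4-6 form two clusters
R = np.array([
    [1.0, 0.6, 0.5, 0.1, 0.1, 0.1],
    [0.6, 1.0, 0.6, 0.1, 0.2, 0.1],
    [0.5, 0.6, 1.0, 0.2, 0.1, 0.1],
    [0.1, 0.1, 0.2, 1.0, 0.6, 0.5],
    [0.1, 0.2, 0.1, 0.6, 1.0, 0.6],
    [0.1, 0.1, 0.1, 0.5, 0.6, 1.0],
])

eigenvalues = np.linalg.eigvalsh(R)[::-1]             # sorted descending
n_kaiser = int(np.sum(eigenvalues > 1))               # latent root criterion
cum_var = np.cumsum(eigenvalues) / eigenvalues.sum()  # variance criterion

print("eigenvalues:", np.round(eigenvalues, 3))
print("factors to retain (Kaiser):", n_kaiser)
print("cumulative proportion of variance:", np.round(cum_var, 3))
```

For a scree test, the same eigenvalues would simply be plotted against their extraction order and the elbow inspected.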
