2. AMOS clipboard error 521

3. Multiple group analysis

4. Fitting a latent growth curve model

5. Handling missing data using AMOS

6. Nested model comparisons

7. Handling non-normal data using AMOS

### AMOS Availability

#### Question:

I would like to perform a path analysis, a confirmatory factor analysis, or a structural equation model. I am most familiar with SPSS for Windows. How can I use SPSS for Windows to perform these analyses?

#### Answer:

As of this writing, SPSS for Windows does not include modules to perform the analyses you describe. Of the software products we support, SAS (to find information in the online guide, under "Search", type "structural equations"), LISREL, and AMOS perform these analyses. Of these packages, AMOS is the easiest to use in the Windows environment. AMOS is licensed for use by SPSS, Inc.; it installs itself as a menu item in SPSS for Windows and launches when the SPSS user selects it from the Analyze menu.

You may access AMOS in one of two ways:

- License a copy from SPSS, Inc. for your own personal computer.
- Download the free student version of AMOS from the AMOS Development Site for your own personal computer.

If your models of interest are small, the free student version may be sufficient to meet your needs. For larger models, you will need to purchase your own copy of AMOS.

An accompanying self-paced short-course is available to download as well.

If you have further questions, email stat.admin@austin.utexas.edu.

### AMOS Clipboard Error 521

#### Question:

I am running AMOS Graphics on the Windows NT Terminal Server. Every time I run a model, I get a box that pops up on my screen that mentions an error 521: Cannot paste to clipboard. What should I do about this error?

#### Answer:

This error occurs at the very end of a model fitting run by AMOS Graphics. We are currently working with Smallwaters Corporation's technical support group to correct the error. The error does not appear to influence model results or model fitting activities in any way beyond the annoyance of having to dismiss the extra dialog box at the end of the model fitting run. Once you dismiss the error dialog box, you may view your model fitting results as usual.

### Multiple Group Analysis

#### Question:

I have data from two different groups of research participants. One group of participants took my survey in the fall semester while the second group took my survey in the spring semester. I've come up with what I think is a good confirmatory factor analysis model based on the fall data and I want to see if that model holds in the spring data. I believe this is called a "multiple group" analysis. How can I perform this analysis using AMOS?

#### Answer:

This FAQ assumes that you know how to run and interpret a single group analysis using AMOS. If you do not, see our online AMOS tutorial.

This FAQ also assumes that you know how to use AMOS to perform nested model comparisons and that you understand the assumptions and principles underlying nested model comparisons. See AMOS FAQ: Nested Model Comparisons for details.

Multiple group analysis in structural equation modeling is very useful because it allows you to compare multiple samples across the same measurement instrument or multiple population groups (e.g., males vs. females) for any identified structural equation model. AMOS allows you to test whether your groups meet the assumption that they are equal by examining whether different sets of path coefficients are invariant. In other words, you will be testing whether path coefficients in your model are equal for your groups. You can test the equalities of variables' variances, means, and intercepts, as well as the covariances between variables, and the equalities of path coefficients across two or more groups.

Before you begin testing invariance across groups, you should assess carefully your overall sample size and the equality of sample sizes across groups. Since the multiple group analysis estimates more parameters than a single group analysis, you will need proportionally more cases for a multiple group analysis to ensure stable parameter estimates and replicable results. For instance, if you had 150 cases for a single group analysis, you would want at least 300 cases for an analysis that used two groups. Furthermore, your analysis should ideally have equal numbers of cases in each group. Little is known about the impact of sharply unequal group sizes on results obtained from a multiple group SEM analysis, except that larger groups will exert more influence on the results than smaller groups. This property of multiple group analysis is not especially problematic if the group sizes mirror the proportion of individuals' group membership in the population from which the sample was drawn. On the other hand, if the sample sizes are not proportional to population sizes, errors of inference may be more likely to occur.

Different assumptions of group equality can be tested and they are often tested in a particular order (Bollen, 1989). For illustrative purposes, this example will consider one assumption of group equality. An example of sequential equality constraints in the confirmatory factor analysis context can be found in Bollen (1989); the example is discussed at length (with illustrative computer output) in LISREL FAQ: Comparing Groups using LISREL.

Consider example 20-2r from the AMOS program example set. This example illustrates a confirmatory factor analysis model in which two types of learning achievement, F1 (visual learning) and F2 (verbal learning), are each indicated by three measures. Are the factor structures the same for boys and girls? You can use a multiple group analysis in AMOS to address this question. This example program and the accompanying database are available with all versions of AMOS, including the student version.

Suppose you want to test the equality of the factor loadings for two separate groups of school children, girls and boys. To fit the models in AMOS, first draw the model for a single group and fit it for that group's sample data to ensure that the model is properly identified and that no minimization or other unexpected problems arise during the model fitting process. Then fit the same model to the second group's data. Assuming that both models converged correctly and no unusual problems were encountered during the model fitting process, you are now ready to perform the multiple group analysis. Follow these steps to perform the analysis:

1. Select Manage Groups... from the Analyze menu. Name the first group Girls. Next, click on the New button to add a second group to the analysis. Name this group Boys. Click the New button successively to add additional groups as needed. Click the Close button when you are done creating and naming the multiple groups.

2. Each newly created group is represented by its own path diagram. Select the first group's path diagram by clicking on the group's label on the left-hand side of the path diagram window. The snapshot shown below features the Girls group highlighted.

Now you will name the variances, covariances, and regression weights in both the Girls and Boys models. Since the Girls model is highlighted, you will name the parameters in this model first. To do this, use the Object Properties dialog box. In the path diagram, right-click on a latent variable. This brings up a menu list. Select “Object Properties”. Click on the “Parameters” tab. The dialog box will look like:

Uncheck the box for “All groups”, so you can give the variances different names in the two groups. Name the variances for the Girls model “V1” and “V2”. Now that the Object Properties window is open, you can simply click on the arrows from the latent to the observed variables to change the names of the regression weights and click on the covariance arrows to change the covariance names. Name the regression weights “W1”, “W2”, “W3”, and “W4”. Name the covariance “C”. Click on the error parameters to name the variance of the residuals; name them "V3" through "V8".

To name these parameters for the Boys model, highlight “boys” and go through the same procedure as before, except use the names “Va1” through “Va8”, “Wa1”, “Wa2”, “Wa3”, “Wa4”, and “Ca” to distinguish between the Boys and Girls models.

3. Select File, Data Files... to launch the Data Files dialog box. For each group, specify the relevant data file name. For this example, choose the Grant_fem SPSS database for the girls' group; choose the Grant_mal SPSS database for the boys' group. If the groups' information appears in separate data files for each group, as is the case in this example, you need to locate and specify the relevant data file for each group in the analysis by selecting the group, clicking on the File Name button, and locating the appropriate data file. Repeat this process for each group in the analysis. When you finish assigning data files to the groups, the Data Files dialog box will look like this:

If you have a single database that contains multiple groups, AMOS can also select groups of cases from a larger data file based upon a known grouping variable through the use of the Grouping Variable and Group Value buttons: Select the Grouping Variable button to identify the relevant grouping variable within a database and then use the Group Value button to select which value of the grouping variable represents the group of interest. Repeat this process for each group in the analysis.

4. Double-click on the Default Model label shown on the left side of the path diagram window (see the first figure shown above in section 2). This action launches the Manage Models window, shown below (you can also reach the Manage Models window from the Analyze menu).

5. Rename the original model to be more meaningful. In this example, unrestricted loadings is a good choice because the starting model allows different factor loadings for boys and girls. You are now ready to define a second model that imposes a set of equality constraints on the unrestricted loadings model such that the unstandardized factor loadings are equal across the boys' and girls' groups. To set up this model, first type the name of the original model, unrestricted loadings, on the first row of the Parameter Constraints section of the Manage Models dialog box. Referring to the unrestricted loadings model here lets AMOS know that you want to impose the constraints that follow in addition to the assumptions or constraints already implied by the unrestricted loadings model.

Next, identify the four pairs of factor loadings of interest: w1 through w4 in the girls group, and wa1 through wa4 in the boys group. When you double-click on w1 and then double-click on wa1, AMOS inserts the appropriate equality constraint in the Parameter Constraints section of the Manage Models dialog box. Perform the same operation for w2 and wa2, w3 and wa3, and w4 and wa4. When you finish, your Manage Models window looks like this:

Click the Close button to dismiss the Manage Models window. You are now ready to run the analysis.
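The constraint list built in step 5 can be sketched in a few lines of Python. This is only an illustration of the pairing logic: the parameter names follow the W1 through W4 and Wa1 through Wa4 labels used in this FAQ, and the printed layout is an assumption rather than a reproduction of the Manage Models window.

```python
# Pair each girls' loading with the matching boys' loading.
# Names follow the labels assigned earlier in this FAQ (W1..W4, Wa1..Wa4).
girls_loadings = [f"W{i}" for i in range(1, 5)]
boys_loadings = [f"Wa{i}" for i in range(1, 5)]

# One equality constraint per pair of loadings.
constraints = [f"{g} = {b}" for g, b in zip(girls_loadings, boys_loadings)]

# The first row names the model whose constraints are inherited.
print("unrestricted loadings")
for line in constraints:
    print(line)
```

Four constraints are produced, which is why the nested comparison between the two models has four degrees of freedom.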

Hint: The original AMOS example program enabled bootstrapping. The example will run more quickly with bootstrapping disabled. To disable bootstrapping, select View, then Analysis Properties.... Click on the Bootstrap tab, then make sure the Perform bootstrap check box is unchecked. For more information about bootstrapping, see AMOS FAQ: Handling non-normal data using AMOS.

After you have run the analysis, you can examine the model fit statistics for each model side by side. This information appears below:

These results show that both models fit the data exceptionally well. Interestingly, the more restricted equal loadings model fits the data better than the original model in which the factor loadings are allowed to vary across the girls and boys.

AMOS outputs the model comparison tests below the global fit statistics:

The results from this model comparison (Chi-square = 1.812 with 4 DF, p = .77) suggest that imposing the additional restrictions of four equal factor loadings across the two sexes of school children did not result in a statistically significant worsening of overall model fit. Notice that the nested model comparison heading mentions that the nested tests assume the baseline model is true. Be sure to begin the multiple group comparison process with a well-fitting default or starting model.
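If you want to check a reported p-value by hand, the upper-tail probability of a chi-square statistic has a simple closed form whenever the degrees of freedom are even. The sketch below uses only the Python standard library; the function name `chi2_sf_even_df` is our own and is not part of AMOS.

```python
import math

def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(chi-square > x); valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1, built term by term.
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Nested model comparison reported above: chi-square = 1.812 with 4 DF.
p_equal_loadings = chi2_sf_even_df(1.812, 4)
print(f"p = {p_equal_loadings:.2f}")
```

For odd degrees of freedom this shortcut does not apply, and a statistics package or a chi-square table is the safer route.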

Once you have selected a final model that fits the data well, you may interpret the parameter estimates for the model. In the preceding figure, you can select the appropriate model from the left-hand side of the results table window and you can also select the group results to examine. For instance, you could select the Equal Loadings model and then examine the girls' and boys' regression weights and variances, as well as the covariance between the two factors. The path diagram mirrors this information.

Notice that while the unstandardized regression weights (factor loadings) will be the same for girls and boys in this model, the covariance between the factors and the residual variance estimates will vary across the two groups because you have not imposed any equality constraints on these quantities. You might choose to follow up the analysis above by specifying a third model that imposes further equality constraints on the factor covariance or the residual variance estimates, or both. See Bollen (1989) and LISREL FAQ: Comparing Groups using LISREL for more details on this approach to test factorial invariance.

For more information about multiple group SEM, see the following references:

Arbuckle, J., & Wothke, W. (1999). AMOS 4.0 User's Guide. Chicago: Smallwaters Corporation, Inc.

Bollen, K.A. (1989). Structural equations with latent variables. New York: John Wiley & Sons.

Jaccard, J. & Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression. Thousand Oaks, CA: Sage Publications.

Joreskog, K.G., & Sorbom, D. (1993). LISREL 8 user’s reference guide. Chicago: Scientific Software International, Inc.

### Fitting a Latent Growth Curve Model

#### Question:

I have data from a seven-wave panel study of family relationships. My variable of interest is the amount of parental affection exhibited towards children during adolescence. I have measures of parental affection towards children at ages 9, 10, 11, 12, 13, 14, and 15 years. I would like to fit a latent growth curve model to these data using AMOS. How do I do it?

#### Answer:

This answer has been developed by Professor Edward Anderson in the Department of Human Ecology. We are grateful to him for allowing us to reproduce this answer.

Prior to setting up the model, you should consider several issues. First, how many time points do you have? In this example, you have seven time points. To ensure a properly identified and stable solution, your analysis should have four or more time points, though it is possible to fit some growth models with as few as three time points. If you have a three time point database, you may want to meet with a consultant to discuss the particulars of your model.

Second, as shown in the diagram below, you must connect each observed variable at each time point to the latent intercept and slope variables. The path coefficients from the intercept variable to the observed variables are fixed to 1.00. The slope coefficient values are allowed to be freely estimated in the initial model, with the exception of the coefficients for the first and last time points. In the example shown below, the first time point's slope parameter is set to zero, and the last time point's slope parameter is set to 1.00. This coding frames the data so we can conceptualize the growth as being 0% complete at time 1, and 100% complete at the final time point. Thus, with this coding, the first panel of data collection is treated as the starting point for the growth curve.

However, there are several other ways to think about growth. For instance, in this example you are studying children every year from age 9 to age 15. You could choose to fix the slope parameters to be 9, 10, 11, 12, 13, 14, and 15. This coding would assume straight-line growth across each year of the child's life, so that the estimated intercept of the data ("time point 0") would be the year of the child's birth. This issue of coding the slope coefficients is not trivial because the interpretation of the estimated slope coefficients as well as the means and standard deviations depend upon what input coding was specified by the data analyst for the slope parameters. For more discussion of this issue, see Stoolmiller (1995).
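To see why the choice of coding matters, consider a single straight-line trajectory described under three different time codings. The sketch below is purely illustrative: the two affection scores are hypothetical values chosen for easy arithmetic, not estimates from the example data.

```python
def line_through(t0, y0, t1, y1):
    """Intercept and slope of the straight line through (t0, y0) and (t1, y1)."""
    slope = (y1 - y0) / (t1 - t0)
    intercept = y0 - slope * t0
    return intercept, slope

# Hypothetical scores at the first wave (age 9) and the last wave (age 15).
y_age9, y_age15 = 60.0, 51.0

# Each coding assigns a time value to the first and last waves.
codings = {
    "proportion (0 to 1)": (0.0, 1.0),
    "years since first wave (0 to 6)": (0.0, 6.0),
    "age in years (9 to 15)": (9.0, 15.0),
}

results = {
    name: line_through(t_first, y_age9, t_last, y_age15)
    for name, (t_first, t_last) in codings.items()
}

for name, (intercept, slope) in results.items():
    print(f"{name}: intercept = {intercept:.2f}, slope = {slope:.2f}")
```

The trajectory is identical in all three cases, yet the intercept is 60 under the first two codings and 73.5 under the age coding (the extrapolated score at age 0), while the slope is -9 per unit under the proportion coding and -1.5 per year under the other two.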

Although there are a number of ways to set up latent growth curve models, experience suggests that the following method usually obtains model convergence. The model is designed to facilitate easy interpretation of the results. The steps to fit the model are as follows:

1. Launch AMOS.

2. Choose Plugins ---> Growth Curve Model. Enter the number of measures when you are prompted for the number of time points. In this example, you would enter 7 for the number of time points because you have data for parental affection at seven distinct points in time.

3. Choose View ---> Analysis Properties ---> Estimation tab. Check the Estimate Means and Intercepts check box.

4. Right-click on each of the seven observed variable boxes one at a time and select Object Properties. Click on the Text tab to name each of the variables to correspond to the relevant variable names in your external data file. Click on the Parameters tab to fix the intercept parameter; fix each observed variable's intercept value to 0.

5. Right-click on the latent variable circles (labeled ICEPT and SLOPE by AMOS) and select Object Properties. Remove the 0 constraints on the means. Fix the variance to zero for ICEPT and SLOPE.

6. Right-click on each of the seven residual variance circles and select Object Properties. Fix their mean values to 0 and set their variance values to 1.00.

7. For each of the paths connecting the residual variance circles to the observed variables' rectangles, right-click on the path arrow and select Object Properties, choose the Parameters tab, and remove the default 1.00 fixed value. Label each path with a unique name (e.g., EV1 for the first variable, EV2 for the second variable, etc.) by replacing the original value of 1.00 with the new parameter name.

8. Add two new residual latent variables to the ICEPT and SLOPE latent variables by clicking on the button labeled "Add a Unique Variable to an Existing Variable" on the AMOS toolbar. This button resembles a rectangle with a circle above it, with the rectangle and circle connected by a vertical line. Once this tool is selected, click on ICEPT and then SLOPE. This action should create latent variable residuals for ICEPT and SLOPE. Label these latent variable residuals Dev ICEPT and Dev Slope, respectively.

9. For the newly created Dev ICEPT and Dev Slope latent variable residuals, fix their mean values to 0 and their variance values to 1.00. Next, replace the 1.00 values on the path arrows connecting Dev ICEPT to ICEPT and Dev Slope to SLOPE with parameter names. Name the newly freed parameters SD_ICEPT and SD_SLOPE, respectively.

10. Move the covariance double-headed arrow between ICEPT and SLOPE from those original latent variables and instead have it connect Dev ICEPT and Dev Slope. Label it Cov_Icept_Slope.

11. Fix each of the parameter values for the arrows leading from ICEPT to the observed variables to the value 1.00.

12. Label each of the parameter values for the arrows leading from SLOPE to the observed variables. In the diagram shown below the parameter values are labeled b1 through b7.

13. Some growth models may require that you try starting values other than the AMOS default in order to fit the model in a reasonable number of iterations. Symptoms of the need for user-specified start values may include failures to converge within the AMOS iteration limit (typically 500 iterations) or reports by the software of a non-positive definite fitted matrix, negative residual variances, or an otherwise inadmissible solution. To specify your own starting value for a parameter, name the parameter followed by a colon and the starting value. For instance, if you want the starting value for parameter b3 to be .75, you would label the parameter b3:.75. The example shown below specifies starting values of .50 for each slope weight except for the first weight and the last weight, which are fixed to zero and 1.00, respectively. Important Note: Setting start values is optional; you should try AMOS's own default start values first and resort to user-specified start values only if AMOS cannot converge to a proper solution in a reasonable number of iterations.

14. Double-click on the Default Model label on the left-hand side of the AMOS Graphics window area. This action launches the Manage Models window. In the Model Name section of the Manage Models window, rename the model Full LGM. In the Parameter Constraints section of the window, type b1 = 0 and b7 = 1. These constraints force AMOS to consider the first time point to be zero units. The last time point is fixed to a value of 1.00, so the intermediary slope parameter values can be interpreted as percentages of growth as a function of time, as described above. Click the Close button to return to the AMOS graphics drawing interface.

15. Select File ---> Data Files. Select the external data file. Click OK.

16. Save your work by choosing File ---> Save As and save your model file to an appropriate location on one of your computer's disk drives.

Your model diagram should appear as follows.

Each of the EV parameters represents the standard deviation of the residual for one observed variable. SD_ICEPT and SD_SLOPE refer to the standard deviations of the intercepts and slopes, respectively. Mean_ICEPT and MEAN_SLOPE are the mean values of the cases' intercepts and slopes. COV_ICEPT_SLOPE refers to the correlation between the slopes and intercepts; because the variances of Dev ICEPT and Dev Slope are fixed to 1.00, the covariance between them is on a correlation scale.

If the model fit is successful, you should see results such as these appear in your AMOS Graphics output window.

Before interpreting the results on the model diagram, you should first verify that the model fits the data well on an overall basis. This is in fact the case, as the chi-square test of overall model fit was not statistically significant (chi-square = 14.655 with 18 DF, p = .685).
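The 18 degrees of freedom can be reconstructed by counting. The tally below is a sketch that assumes the free parameters are exactly those named in the steps above (EV1 through EV7, the two factor means, SD_ICEPT, SD_SLOPE, the intercept-slope covariance, and the five unfixed slope weights); if your diagram differs, adjust the counts accordingly.

```python
n_waves = 7

# Known inputs: 7 means plus 28 distinct variances and covariances.
sample_moments = n_waves + n_waves * (n_waves + 1) // 2

# Free parameters in the Full LGM, as assumed from the setup steps.
full_lgm_free = {
    "residual SDs (EV1..EV7)": 7,
    "ICEPT and SLOPE means": 2,
    "SD_ICEPT and SD_SLOPE": 2,
    "intercept-slope covariance": 1,
    "slope weights b2..b6 (b1 and b7 are fixed)": 5,
}

n_free = sum(full_lgm_free.values())
df_full = sample_moments - n_free
print(f"{sample_moments} moments - {n_free} free parameters = {df_full} DF")
```

Under these assumptions the count comes to 35 moments and 17 free parameters, which agrees with the 18 DF reported by AMOS.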

After you determine that the model fit the data acceptably, you may interpret the parameter estimates shown in the path diagram above. The mean intercept value of 60.93 indicates that the average starting amount of parental affection towards adolescents was 60.93 units. The standard deviation was 17.42. The mean slope value was -8.91, and the standard deviation of the slope was 11.82. The correlation between the intercepts and the slopes was -.07. Although the means and standard deviations were statistically significant when tested with the null hypothesis that their true values are zero in the population from which this sample was drawn, the same cannot be said for the correlation between the slopes and the intercepts: the r of -.07 was not different from zero.

Substantively, the finding that the standard deviation of the intercepts is statistically significant suggests that there is non-trivial variation in the amount of parental affection received by different children at the initial age when the affection measures were taken. Furthermore, the amount of parental affection appears to lessen over time for all adolescents as shown by the mean slope value of -8.91. This latter finding is qualified, however, by the significant variation in slope values, indicating that individual adolescents' experiences of affection amounts may be quite varied over time. Interestingly, the amount of affection shown an adolescent at the initial time of measurement appeared to be unrelated to changes in affection over time, as illustrated by the non-significant correlation of -.07 between the slopes and intercepts.

The parameter estimates for the slope-variable path coefficients are also of interest. These values track the pattern of the growth curve from the starting point of zero to the ending point of 1.00 over the intervening time points. Examination of the estimated values suggests a slower than expected growth curve through the first four time points with an accelerated growth component at time points five and six. If growth were constant across time, one would expect a value of .50 (50%, or the halfway point between 0 and 1.00) at the middle time point instead of the observed value of .39. In fact, one can compare the expected value under a linear or constant change model with the observed values, as shown in the table below.

| Time Point | Expected Value | Observed Value |
|---|---|---|
| 1 | 1/6 = .167 | .26 |
| 2 | 2/6 = .333 | .31 |
| 3 | 3/6 = .500 | .39 |
| 4 | 4/6 = .667 | .69 |
| 5 | 5/6 = .833 | .82 |
| 6 | 6/6 = 1.000 | 1.00 (fixed) |

Appearances, however, can be deceiving. Although there is a growth spurt between the third and fourth time points, the remaining time points appear to provide a reasonable approximation of a linear or constant change function. It is possible to use AMOS's nested model comparison features to test whether a linear growth function fits the data in the context of the latent growth curve model. To perform this test, double-click on the Full LGM model label on the left-hand side of the AMOS Graphics Diagram window to launch the Manage Models window.

In the Manage Models window, click on New. Name the new model Linear Change. In the Parameter Constraints segment of the window, enter the expected values for the linear constraints, as shown in the figure below.

Each of the intermediate time points' parameter estimates is set equal to its expected value under the linear change function. Including the Full LGM model name before the constraints allows AMOS to fit the new Linear Change model subject to the constraints of the original model; this allows AMOS to directly compare the two models using a nested chi-square test. If the test is not statistically significant, one can conclude that the more parsimonious linear change model fits the data as well as the more complex Full LGM model. The AMOS test for this hypothesis was not statistically significant (chi-square = 8.338 with 5 DF, p = .139), so it is reasonable to conclude that a linear or constant rate of change is present in the population of adolescents from which this sample was drawn.
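The linear constraint values can be generated rather than typed by hand. The sketch below assumes the slope weights are labeled b1 through b7 as in the setup steps, with b1 and b7 already fixed in the Full LGM; the four-decimal rounding is our choice, not an AMOS requirement.

```python
n_waves = 7
labels = [f"b{i}" for i in range(1, n_waves + 1)]

# Under constant change, wave i sits at (i - 1) / (n_waves - 1) of total growth.
linear_values = [i / (n_waves - 1) for i in range(n_waves)]

# b1 = 0 and b7 = 1 are inherited from Full LGM; constrain only b2..b6.
constraints = [
    f"{label} = {value:.4f}"
    for label, value in zip(labels[1:-1], linear_values[1:-1])
]

for line in constraints:
    print(line)
```

Five slope weights are fixed, which accounts for the 5 DF in the nested comparison.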

There are many possible models that can be fit within the latent growth model framework.

**For more information about latent growth analysis, see the following references:**

McArdle, J. J. & Epstein, D. (1987). Latent growth curves within developmental structural equation models. Child Development, 58, 110-133.

Stoolmiller, M. (1995). Using latent growth curve models to study developmental processes. In J.M. Gottman (Ed.) Analysis of Developmental Change. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

### Handling Missing Data using AMOS

#### Question:

I am using AMOS to perform a structural equation modeling analysis of a database that has missing data. How does AMOS deal with missing data? Can I manually replicate the way AMOS handles missing data?

#### Answer:

This FAQ assumes that you understand the assumptions of structural equation models (SEM) and can specify and test SEMs using AMOS. If not, see our AMOS tutorial.

AMOS uses a procedure known as Full Information Maximum Likelihood (FIML, also known as "Raw Maximum Likelihood") to handle missing data. A number of investigators have shown that FIML outperforms most common methods of handling missing data, including listwise and pairwise data deletion, mean substitution, and the Similar Response Pattern Imputation (SRPI) procedure implemented in LISREL 8.30 and higher (Joreskog & Sorbom, 1993). A recent article by Enders and Bandalos (2001) compares various methods of handling missing data in the structural equation modeling context. More general discussion on missing data handling methods can be found in General FAQ: Handling missing or incomplete data. This FAQ assumes you are familiar with the contents of this General FAQ, including the assumptions underlying FIML, most notably that model residuals are normally distributed, that the fitted model is correct, and that data are missing at random (MAR). This FAQ also assumes that you are familiar with basic model specification in AMOS and know how to perform a multiple group analysis using AMOS. If you do not know how to perform a multiple group analysis using AMOS, see AMOS FAQ: Multiple group analysis.

Data analysts typically ask two types of questions when they fit structural equation models to data. The first question is: "Does the specified model fit the data on a global basis?" This question is usually addressed using a chi-square test of overall model fit. The second question is a follow-up to the first question, assuming the model cannot be rejected by the chi-square test of overall model fit: "What are the parameter estimates?"

To address the first question, that of overall model fit to the data, AMOS minimizes a discrepancy function built from casewise log likelihoods: for each case, a log likelihood value is computed using only the data points that are available for that case. These individual log likelihood values are then summed to form an overall log likelihood value for the whole sample. The formula for this function, along with a brief description of the minimization process, can be found online at: http://www.smallwaters.com/whitepapers/longmiss/Longitudinal and multi-group modeling with missing data.pdf.

AMOS performs this operation for two distinct models: the first model is a saturated model in which the number of estimated parameters is equal to the number of known inputs: means, variances, and covariances. For example, if you had a database containing four measured variables, it would have four variances, four means, and six covariances - 14 inputs. The saturated model would estimate the values of each of the aforementioned quantities.
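The count of 14 inputs can be verified by listing them. In this sketch the variable names V1 through V4 are placeholders for whatever four measured variables your database contains.

```python
from itertools import combinations

variables = ["V1", "V2", "V3", "V4"]  # hypothetical variable names

means = [f"mean({v})" for v in variables]
variances = [f"var({v})" for v in variables]
# One covariance for every unordered pair of distinct variables.
covariances = [f"cov({a}, {b})" for a, b in combinations(variables, 2)]

n_inputs = len(means) + len(variances) + len(covariances)
print(f"{len(means)} means + {len(variances)} variances + "
      f"{len(covariances)} covariances = {n_inputs} inputs")
```

In general, p measured variables supply p(p + 3)/2 inputs when means are included.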

The second model is the model you have specified, your structural equation model of interest. AMOS then computes the difference between the log likelihood values for the two models (expressed on the -2 log likelihood scale, as is usual for likelihood ratio tests). This difference can be interpreted as a chi-square test with degrees of freedom equal to the difference between the two models' degrees of freedom. Since AMOS computes the log likelihood values for the two models, the difference between the log likelihood values, and the appropriate degrees of freedom, the chi-square test output by AMOS properly tests the overall goodness of fit of your proposed model.

To compute appropriate parameter estimates and standard errors, AMOS uses a variant of the Muthén, Hollis, & Kaplan (1987) multiple group structural equation modeling approach in which each distinct pattern of missing data is treated as a separate group in a multiple group structural equation model. By establishing equality constraints such that the estimates of the variances, covariances, means, and intercepts are the same across the different patterns of missing data, AMOS can estimate appropriate parameter estimates and standard errors.
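The idea of treating each missing data pattern as its own group can be illustrated outside AMOS. The sketch below uses a small made-up data set (the variable names and values are hypothetical) and simply sorts cases by which variables they have observed; it does not fit any model.

```python
from collections import defaultdict

VARIABLES = ("X", "Y", "Z")  # hypothetical variable names

# Hypothetical cases; None marks a missing value.
cases = [
    {"X": 4.0, "Y": 2.5, "Z": 7.0},
    {"X": 3.0, "Y": None, "Z": 6.0},
    {"X": 5.0, "Y": 1.5, "Z": None},
    {"X": 2.0, "Y": None, "Z": 8.0},
    {"X": 6.0, "Y": 3.5, "Z": 9.0},
]

def missing_pattern(case):
    """Return the tuple of variables that are observed for this case."""
    return tuple(v for v in VARIABLES if case[v] is not None)

# Each distinct pattern becomes one "group" in the multiple group setup.
groups = defaultdict(list)
for case in cases:
    groups[missing_pattern(case)].append(case)

for pattern, members in groups.items():
    print(f"observed {pattern}: n = {len(members)}")
```

Here five cases fall into three patterns, so the equivalent multiple group model would have three groups with parameters constrained equal across them.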

An example may help to clarify how this process works. Suppose you have two variables, X and Y, with the sample data shown below.

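| X | Y |
|---|---|
| 100.00 | 93.00 |
| 98.00 | 89.00 |
| 90.00 | 75.00 |
| 88.00 | 66.00 |
| 86.00 | 54.00 |
| 84.00 | 35.00 |
| 83.00 | 80.00 |
| 78.00 | 55.00 |
| 77.00 | 70.00 |
| 75.00 | 53.00 |
| 70.00 | 25.00 |
| 65.00 | 55.00 |
| 64.00 | 78.00 |
| 62.00 | 65.00 |
| 55.00 | 88.00 |
| 51.00 | 35.00 |
| 49.00 | 40.00 |
| 49.00 | 20.00 |
| 48.00 | 35.00 |
| 47.00 | 88.00 |
| 38.00 | 5.00 |
| 35.00 | 25.00 |
| 33.00 | 50.00 |
| 21.00 | 70.00 |
| 19.00 | 12.00 |
| 17.00 | 6.00 |
| 12.00 | 35.00 |
| 7.00 | 64.00 |
| 6.00 | 20.00 |
| 5.00 | 8.00 |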

The mean for X is 53.73; its variance is 830.60. The mean for Y is 49.80 and its variance is 711.23. The covariance between X and Y is 448.41 (Note: All computations in this example assume division by N rather than N-1 because the methods described here are intended for computation of asymptotic statistics).

Suppose you now delete the Y values for the first 10 cases and fit the model using AMOS with FIML missing data handling activated. Under this scenario, the statistics for X remain unchanged, but the mean of Y becomes 49.00 and its variance becomes 790.50. The covariance between X and Y is now 402.79. This model is saturated: the number of estimated parameters equals the number of known inputs to the analysis, so the chi-square test of model fit output by AMOS is 0.
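For this particular pattern (X fully observed, Y missing for some cases), the maximum likelihood estimates under bivariate normality have a closed form, so the saturated-model numbers can be reproduced without an iterative optimizer. The sketch below uses that factored-likelihood shortcut. It is not AMOS's FIML algorithm, which handles arbitrary missing-data patterns, and the helper names are ours.

```python
X = [100, 98, 90, 88, 86, 84, 83, 78, 77, 75,
     70, 65, 64, 62, 55, 51, 49, 49, 48, 47,
     38, 35, 33, 21, 19, 17, 12, 7, 6, 5]
# Y is deleted (None) for the first 10 cases.
Y = [None] * 10 + [25, 55, 78, 65, 88, 35, 40, 20, 35, 88,
                   5, 25, 50, 70, 12, 6, 35, 64, 20, 8]

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Covariance with division by N; cov(a, a) is the variance."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

# Complete cases only (the last 20 rows).
xc = [x for x, y in zip(X, Y) if y is not None]
yc = [y for y in Y if y is not None]

# X is fully observed, so its estimates use all 30 cases.
mu_x, var_x = mean(X), cov(X, X)

# Regression of Y on X, estimated from the complete cases.
slope = cov(xc, yc) / cov(xc, xc)
resid_var = cov(yc, yc) - slope * cov(xc, yc)

# Combine the two pieces to recover the moments of Y.
mu_y = mean(yc) + slope * (mu_x - mean(xc))
cov_xy = slope * var_x
var_y = resid_var + slope ** 2 * var_x

print(round(mu_y, 2), round(var_y, 2), round(cov_xy, 2))  # 49.0 790.5 402.79
```

The estimates for Y differ from the complete-case values because the regression step borrows information from the ten X values whose Y partners are missing.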

You can set up a second model that imposes restrictions on the model described above to obtain a non-saturated model. Suppose you want to test simultaneously whether the mean of X equals the mean of Y and whether the variance of X equals the variance of Y. Because two constraints are imposed on the saturated model, the resulting chi-square goodness of fit test for the second model has two degrees of freedom. The chi-square value output by AMOS is .439 with a p-value of .803. The common mean of X and Y is 52.43 and the common variance is 849.65. Their covariance is now 471.08.

So far, the numbers obtained above were generated using AMOS's FIML algorithm. For this example, it is possible to obtain the same numbers using the approach documented by Muthén et al. (1987). To do this, perform the following steps:

1. Split the original data file shown above into two distinct data files. The first data file contains the last twenty cases from the original sample that have complete data for both X and Y. The second data file contains the first ten cases with observed values for X and missing data points for Y. Note that both databases have X and Y variables so that AMOS recognizes that Y has missing data for the first ten cases.

2. In AMOS, set up a multiple group analysis. For the first group (n = 20), draw the model as usual: Include a rectangle for X and a rectangle for Y, allow their means and variances to be freely estimated, and draw a double-headed covariance arrow connecting the X and Y rectangles. Before you define the second group, select View, then Interface Properties, then click on the Misc tab. Select the radio button labeled Allow different path diagrams for different groups. When you click the OK button, AMOS will display a warning message notifying you that once you make this change you cannot delete a second group. Accept this warning.

3. Define a second group, then draw a single rectangle and name it X. Link each group to its respective database and give each mean and variance a unique name. For example, for group 1, the mean value for X might be named "mean_x1" and the variance value could be named "var_x1" whereas group 2's mean for X could be "mean_x2" and its variance could be called "var_x2".

4. Define a second model that is nested under the model you just defined. In this new model, constrain the mean of X for group 1 to be equal to the mean of X for group 2. Similarly, constrain the variance of X for group 1 to be equal to the variance of X for group 2. This model is equivalent to the saturated baseline model that AMOS fits using the FIML algorithm. Notice that when you fit this model, you obtain the same estimates for the means and variances of X and Y as you did under the saturated FIML model: The mean of X is 53.73 while the variance is 830.60. The mean of Y is 49.00 and its variance is 790.50. The covariance of X and Y is 402.79. Unlike the FIML model, however, AMOS outputs a chi-square value of 35.888 with 2 degrees of freedom for this model.

5. Define a third model that is nested under the second model described in the previous step. For this model, set the mean of X equal to the mean of Y, and also set the variance of X equal to the variance of Y in the first group. The common mean value of X and Y for this model is 52.43 while the shared variance value is 849.65. The covariance between X and Y is estimated at 471.08. The chi-square model fit statistic for this model is 36.312 with 4 degrees of freedom.
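Step 1 above, splitting the file by missing-data pattern, can be sketched in Python as follows. The function name `split_by_pattern` and the row layout are illustrative assumptions; in practice you would write each group to its own data file for AMOS.

```python
from collections import defaultdict

X = [100, 98, 90, 88, 86, 84, 83, 78, 77, 75,
     70, 65, 64, 62, 55, 51, 49, 49, 48, 47,
     38, 35, 33, 21, 19, 17, 12, 7, 6, 5]
Y = [None] * 10 + [25, 55, 78, 65, 88, 35, 40, 20, 35, 88,
                   5, 25, 50, 70, 12, 6, 35, 64, 20, 8]

rows = [{"X": x, "Y": y} for x, y in zip(X, Y)]

def split_by_pattern(rows):
    """Group cases by the set of variables that are observed."""
    groups = defaultdict(list)
    for row in rows:
        pattern = tuple(name for name, value in row.items() if value is not None)
        groups[pattern].append(row)
    return dict(groups)

groups = split_by_pattern(rows)
for pattern, cases in groups.items():
    print(pattern, len(cases))
# ('X',) 10
# ('X', 'Y') 20
```

Each distinct pattern becomes one group in the multiple group analysis, which is why the number of groups grows quickly when many variables have scattered missing values.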

The nested model comparison of the third model to the second model tests simultaneously whether the mean of X equals the mean of Y and whether the variance of X equals the variance of Y. The chi-square difference value is .424 with 2 degrees of freedom. Although the parameter estimates from the Muthén et al. (1987) method are identical to those produced by AMOS automatically under FIML missing data handling, the chi-square values differ slightly because of how they are computed. The FIML chi-square value is .439 rather than .424. Multiplying the Muthén et al. (1987) method chi-square value by (N/(N-1)) yields a rough approximation of the FIML chi-square: (30/(30-1))*.424 approximates .439.
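The arithmetic behind this comparison is short enough to check directly. The sketch below uses only the chi-square values and degrees of freedom reported in steps 4 and 5; the variable names are ours.

```python
# Fit statistics reported for the multiple group models.
chisq_model2, df_model2 = 35.888, 2   # equivalent of the saturated FIML model
chisq_model3, df_model3 = 36.312, 4   # adds equal means and equal variances
n = 30                                # total sample size

chisq_diff = chisq_model3 - chisq_model2
df_diff = df_model3 - df_model2
rescaled = chisq_diff * n / (n - 1)   # rough approximation of the FIML value

print(round(chisq_diff, 3), df_diff, round(rescaled, 3))  # 0.424 2 0.439
```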

**For more information on missing data handling methods in the structural equation modeling context, see the following publications:**

Arbuckle, J., & Wothke, W. (1999). AMOS 4.0 User's Guide. Chicago: Smallwaters Corporation.

Enders, C.K., & Bandalos, D.L. (2001). The relative performance of full information maximum likelihood estimates for missing data in structural equation models. Structural Equation Modeling.

Joreskog, K.G., & Sorbom, D.L. (1993). PRELIS 2 user's reference guide. Chicago: Scientific Software International.

Muthén, B., Kaplan, D., & Hollis, M. (1987). On structural equation modeling with data that are not missing completely at random. Psychometrika, 52, 431-462.

### Nested model comparisons

#### Question:

I want to compare the equality of factor loadings with a confirmatory factor analysis. How can I do this using AMOS?

#### Answer:

This FAQ assumes that you understand the assumptions of structural equation models (SEM) and can specify and test SEMs using AMOS. If not, see our AMOS tutorial.

You can use a method known as nested model comparisons to address your question. Note that you can also compare path coefficients (regression weights), means, intercepts, and variances within path analytic and structural equation models using this approach.

Nested model comparisons work by imposing a constraint or set of constraints on a starting, less restricted model to obtain a more restricted final model. Loehlin (1997) provides a nice discussion, with illustrated diagrams, of models that may be considered nested versus those that are not nested. The method described here is only appropriate for comparing nested models. Non-nested models may be compared descriptively by examining descriptive model fit statistics such as Akaike's Information Criterion (AIC) or the Bayesian Information Criterion (BIC); other things being equal, models with smaller AIC and BIC values fit better.

Also, while nested model comparisons provide a powerful tool to test competing structural equation (and related) models, the technique can be overused. As is the case with any test statistic, employing too many nested model comparisons on the same set of data can inflate the Type I error rate (i.e., the chance of obtaining a seemingly significant hypothesis test that does not reflect an underlying population difference but is instead due to chance alone). Jaccard & Wan (1996) address this issue in the context of nested model comparisons; they offer a suggested remedy, documented in General FAQ: Adjusted Bonferroni Comparisons.
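To illustrate how a multiple-comparison adjustment works in general, here is a sketch of one widely used sequential Bonferroni variant, Holm's procedure. This is an assumption on our part: the FAQ referenced above may describe a different adjustment. The p-values in the example are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where Holm's procedure rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest p first
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # every larger p-value is also retained
    return reject

# Hypothetical p-values from three nested model comparisons.
print(holm_reject([0.002, 0.030, 0.040]))  # [True, False, False]
```

Without any adjustment, all three hypothetical tests would be declared significant at .05; with the sequential adjustment, only the first is.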

Consider example 20-2r from the AMOS program example set. This example illustrates a confirmatory factor analysis model in which two types of learning achievement, F1 (visual learning) and F2 (verbal learning), are each indicated by three measures. This example program and the accompanying database are available in the Examples directory within the AMOS directory with all versions of AMOS, including the student version.

Suppose you wanted to test the equality of the visperc factor loading and the paragraph factor loading, as well as the equality of the cubes factor loading and the sentence factor loading. Follow the steps shown below to obtain the appropriate nested model comparison.

The first step is to name the parameters you want to constrain. In this example, you can name the visperc factor loading by right-clicking on the single-headed arrow pointing from F1 to the visperc observed variable and selecting Object Properties to launch the Object Properties window. Click on the Parameters tab and enter a suitable name in the dialog box labeled Regression weight. For instance, you might call this factor loading visperc-loading. The Object Properties dialog box will then resemble the following figure.

Close the Object Properties dialog box. Next, assign similar names to the other three unconstrained factor loadings using the same method. Naming the factor loadings allows you to refer to them when you establish equality constraints as part of the nested model comparison, as shown below.

Next, double-click on the section of the AMOS diagram window labeled Default Model. This action launches the Manage Models window. You can also reach this window by selecting Manage Models... from the Analyze menu item.

Change the name of Default Model to Unrestricted Loadings.

Next, click the New button to create a new model. Assign the model a meaningful name, e.g., Equal Loadings. In the first section of the Parameter Constraints segment of the window, type Unrestricted Loadings. Referring to the Unrestricted Loadings model here tells AMOS that the constraints that follow are imposed in addition to the assumptions or constraints already implied by that model. Next, double-click on the visperc-loading weight. Then double-click on the paragraph-loading weight. AMOS automatically inserts the two weights separated by an equal sign into the Parameter Constraints area of the Manage Models window. It is also possible to manually type the constraint(s) in the window. For example, try typing cubes-loading = sentence-loading in the window. When you are done, your Manage Models window should look like this:

Click the Close button to close the window, and then save your work by choosing Save As from the File menu and saving your new model file under a unique file name. Run the model and examine the overall fit measures.

Hint: The original AMOS example program enabled bootstrapping. The example will run more quickly with bootstrapping disabled. To disable bootstrapping, select View, then Analysis Properties.... Click on the Bootstrap tab and make sure the Perform bootstrap check box is unchecked. For more information about bootstrapping, see AMOS FAQ: Handling non-normal data using AMOS.

AMOS produces the fit statistics table for the original model that does not assume equal factor loadings and it also produces a second set of fit statistics for the more restricted model that assumes equal factor loadings. You can visually compare the inferential and descriptive fit statistics for each model. Below the fit measures, AMOS displays a section of output titled Model Comparisons. Select this output.

Although the overall fit statistics show that both models are acceptable, the nested model comparison that assesses the worsening of overall fit due to imposing the two restrictions on the original model shows a statistically significant chi-square value of 12.795 with two degrees of freedom, resulting in a probability value of .002. This finding suggests that the parsimony you achieve with the more restricted equal factor loadings model comes at too high a cost: the fact that the two models differ indicates that constraining the parameters in the Unrestricted Loadings model to obtain the Equal Loadings model results in a substantial worsening of overall model fit. Therefore, in most circumstances you would reject the equal factor loadings model in favor of the original model.
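The probability value attached to a chi-square difference can be checked outside AMOS. The sketch below implements the upper-tail probability of a chi-square distribution with integer degrees of freedom from its standard finite-series forms, using only the Python standard library. The function name `chi2_sf` is ours; a statistics package would normally be used instead.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(chi-square with df degrees of freedom > x)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over j < df/2 of (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in odd half-powers of x.
    term, total = math.sqrt(x), 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    tail = math.erfc(math.sqrt(half))
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-half) * total

# Nested model comparison reported in the text: 12.795 with 2 df.
print(round(chi2_sf(12.795, 2), 3))  # 0.002
```

The degrees of freedom for the comparison equal the number of constraints imposed, here the two equality constraints on the factor loadings.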

Unstandardized and standardized regression coefficients for both models are available in the AMOS output. To examine the parameter estimates for a particular model, first click on the Estimates output section. Then select the model's name from the list of models shown on the left-hand side of the AMOS tabular output (shown immediately above). Notice in this example that the unstandardized path coefficients from F1 to W1 (visperc) and F2 to W4 (paragraph) are equal, as are the unstandardized path coefficients from F1 to W2 (cubes), and F2 to W3 (sentence). This is as it should be because the Equal Loadings model contains these constraints. By contrast, the unstandardized regression coefficients for the first model are free to vary across the factors.

With AMOS version 4.0 and later versions, you can also perform model comparisons where you fix a parameter or set of parameters to a specific value. Zero is the most common value used, but other numeric values are permissible. At the time of this writing, AMOS does not support addition, subtraction, multiplication, or exponentiation in establishing model equality constraints.

**References**

For more information about nested model comparisons in the context of SEM, see the following references:

Arbuckle, J., & Wothke, W. (1999). AMOS 4.0 User's Guide. Chicago: Smallwaters Corporation, Inc.

Jaccard, J. & Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression. Thousand Oaks, CA: Sage Publications.

Loehlin, J. C. (1997). Latent variable models. Mahwah, NJ: Lawrence Erlbaum.

### Handling non-normal data using AMOS

#### Question:

I am using AMOS to fit a model to my data. I am concerned that I may have non-normal input data. How can I check the normality of my data and, if necessary, make adjustments for it when using AMOS?

#### Answer:

This FAQ assumes that you understand the assumptions of structural equation models (SEM) and can specify and test SEMs using AMOS. If not, see our AMOS tutorial. This FAQ also assumes you have read our FAQ on why non-normal input data are a problem for SEMs and the various methods that are used to deal with non-normal data in popular SEM software programs; see General FAQ: Handling non-normal data in structural equation modeling (SEM).

There are three steps you can take when you believe your data are not normally distributed and you are using AMOS:

1. Verify that your variables do not follow a joint multivariate normal distribution

2. Assess overall model fit using the Bollen-Stine corrected p-value

3. Use the bootstrap to generate parameter estimates, standard errors of parameter estimates, and significance tests for individual parameters

Each of these steps is explained in more detail below.

**Diagnosing non-normality**

The first step in dealing with non-normal sample data is to verify that it is non-normal. Consider the cars.sav example database provided as part of the SPSS program. The database contains the following variables of interest:

* Engine displacement in cubic inches (Engine)

* Horsepower (Horse in the SPSS database; Horsepwr in the example shown below)

* Curb weight in pounds (Weight)

* Year of manufacture (Year)

* Acceleration from 0 to 60 miles per hour in seconds (Accel)

* Miles per gallon of gasoline consumed (MPG)

If you eliminate the cases with incomplete (i.e., missing) data, the remaining database contains 392 cases. For teaching purposes, we have kept the observation that has a value of 0 for Year, even though this is technically a missing value. Suppose you plan to fit the following model to the cars database.

AMOS can assess the univariate skewness and kurtosis of each variable contained in the model, as well as the joint multivariate kurtosis. To request that these statistics be included in the AMOS output, choose:

View

Analysis Properties

Click the Output tab and then check the Tests for normality and outliers check box. Also check the Standardized estimates and Squared multiple correlations check boxes.

Run the model by selecting Calculate Estimates from the Analyze menu. Next, examine the Normality portion of the output. For each observed variable, AMOS reports a minimum value, maximum value, skewness value, critical ratio for skewness, kurtosis value, and critical ratio for kurtosis. Critical ratios that exceed +2.00 or that are smaller than -2.00 indicate statistically significant degrees of non-normality. AMOS also reports the joint multivariate kurtosis value and its associated critical ratio at the bottom of the table in the row labeled Multivariate.

Practically, very small multivariate kurtosis values (e.g., less than 1.00) are considered negligible while values ranging from one to ten often indicate moderate non-normality. Values that exceed ten indicate severe non-normality. In this example, every variable departs significantly from normality according to the critical ratio criterion.
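The univariate critical ratios in the normality table can be approximated by hand. The sketch below assumes moments computed with division by N and the usual large-sample standard errors, sqrt(6/N) for skewness and sqrt(24/N) for kurtosis; AMOS's exact formulas may differ in detail. The function name and the sample values are hypothetical.

```python
import math

def normality_summary(values):
    """Skewness, excess kurtosis, and their approximate critical ratios."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0          # zero for a normal distribution
    return {
        "skew": skew,
        "cr_skew": skew / math.sqrt(6.0 / n),
        "kurtosis": kurt,
        "cr_kurtosis": kurt / math.sqrt(24.0 / n),
    }

# Hypothetical right-skewed sample.
sample = [1, 2, 2, 3, 3, 3, 4, 9, 15, 30]
for name, value in normality_summary(sample).items():
    print(f"{name}: {value:.3f}")
```

A critical ratio outside the range -2.00 to +2.00 would be flagged under the criterion described above.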

**The Bollen-Stine Bootstrap and associated test of overall model fit**

One method to correct for non-normality in the underlying database is to use the Bollen-Stine p-value rather than the usual maximum likelihood-based p-value to assess overall model fit. To obtain the Bollen-Stine test, choose:

View

Analysis Properties

then select the Bootstrap tab and check the Perform bootstrap and Bollen-Stine bootstrap check boxes. Specify the number of bootstrap samples you would like AMOS to draw for computing the Bollen-Stine p-value. The example shown here features 2000 drawn samples.

The output from the Bollen-Stine bootstrap is broken into three parts. The first section contains diagnostic information. If a solution is not found for a particular bootstrap sample or AMOS is unable to fit the model in a given bootstrap sample due to a singular covariance matrix, AMOS will draw a replacement sample to ensure that the final results are based upon the actual number of usable samples that the user initially requested.

In the bottom portion of the output shown, you can see that no samples were discarded due to inability to find a solution or due to a singular covariance matrix. However, if AMOS discards more than a few samples, you should double check your model specification and re-run the analysis.

AMOS is flexible in that it can use more than one method for minimization during the bootstrap process. According to the AMOS help system, Method 0 converges quickly for easy problems, but is slow for difficult problems. Method 0 is not yet available in AMOS, so the Method 0 column will always contain zero values for all rows. By contrast, Method 1 is a fast and generally reliable algorithm, so AMOS will first perform minimization using Method 1. If Method 1 minimization is too difficult for a particular bootstrap sample, AMOS will switch to Method 2, which is slower than Method 1 but more reliable. Each row of the table corresponds to a number of iterations, and each method's column lists the number of samples that converged successfully in that many iterations. For instance, in 13 of the 2000 bootstrap samples AMOS arrived at a successful solution using Method 1 in seven iterations. By contrast, 178 samples converged in just four iterations when AMOS switched to Method 2. The Total row shows that 1346 of the 2000 bootstrap samples converged successfully using Method 1 whereas the remaining 654 samples employed Method 2 successfully.

The second portion of the output displays the p-value for the hypothesis test of overall model fit.

Recall that you requested 2000 bootstrap samples from AMOS. In this example AMOS found that the model fit worse than expected in 187 of the 2000 samples, or 187/2000 = .094, which is the obtained p-value of overall model fit. Using a conventional significance level of .05, you would not reject this model; you would conclude that it fits the data well.
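The p-value calculation itself is a simple proportion. The sketch below shows only that final step, using a synthetic list of bootstrap chi-square values constructed to match the counts in the text. It does not reproduce the resampling or the model fitting that AMOS performs for each bootstrap sample, and the function name is ours.

```python
def bollen_stine_p(bootstrap_chisq, observed_chisq):
    """Proportion of bootstrap samples in which the model fit worse."""
    worse = sum(1 for value in bootstrap_chisq if value > observed_chisq)
    return worse / len(bootstrap_chisq)

# Synthetic stand-in for the 2000 bootstrap chi-square values:
# 187 exceed the observed chi-square of 10.061, the other 1813 do not.
bootstrap_chisq = [11.0] * 187 + [3.0] * 1813

p = bollen_stine_p(bootstrap_chisq, 10.061)
print(p)  # 0.0935
```

Because the p-value is a count divided by the number of bootstrap samples, its precision is limited by how many samples you request, which is one reason to draw a large number such as 2000.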

By contrast, consider the normal-theory maximum likelihood chi-square test of model fit. This is the familiar test statistic that you would ordinarily use to assess model fit. Since it assumes joint multivariate normality of the observed variables, and, as you saw above in the diagnostics segment of this FAQ, these variables are clearly non-normally distributed, it is little surprise that this test rejects the null hypothesis of overall model fit: chi-square = 10.061 with 3 DF, p = .018. In this instance, the Bollen-Stine bootstrap enables you to accept a model that you would otherwise reject using the maximum likelihood-based chi-square.

The final segment of the Bollen-Stine bootstrap output illustrates the distribution of the chi-square values obtained for the 2000 bootstrap samples.

The most notable features of this output are the mean chi-square value and the general shape or form of the distribution of chi-square values. Across the 2000 samples, the mean chi-square value of 4.77 is higher than the value expected under joint multivariate normality, which is the same as the model's degrees of freedom: 3. The bootstrap distribution of chi-square values, with its mean of 4.77, serves as the reference distribution against which the obtained chi-square of 10.061 is compared. When the obtained chi-square is compared to this bootstrap distribution, the p-value associated with that hypothesis test is .094 and is therefore not statistically significant. By contrast, when the obtained chi-square of 10.061 is compared to the chi-square distribution expected under joint multivariate normality of the observed variables, which has a mean of 3.00, the p-value is .018 and is therefore considered statistically significant at the usual .05 cutoff criterion.

Interestingly, the form of the distribution of chi-square values obtained from the bootstrap replications shows a number of values clustering near the multivariate normal expected value of 3.00, but there are also a substantial number of values that exceed 3.00 and even some values that are in double-figures. The distribution of the chi-square values is decidedly non-normal, but that is not a problem for the Bollen-Stine test statistic.

**Bootstrapped parameter estimates and standard errors**

After you obtain satisfactory overall model fit, the next questions you are likely to pose are: What path coefficients are statistically significant and what are their values? AMOS provides an array of bootstrapping options to address these questions. Unfortunately, you cannot obtain bootstrap parameter estimates and their associated standard errors at the same time as the Bollen-Stine p-value, so you must return to the Bootstrap tab in the Analysis Properties window.

In this analysis, you deselect the Bollen-Stine bootstrap check box and select the Percentile confidence intervals and Bias-corrected confidence intervals check boxes. Set the number of bootstrap samples at 250 based upon the recommendations of Nevitt and Hancock (1998); the authors found little improvement in the quality of bootstrap estimates due to larger numbers of bootstrap samples. If you plan to interpret probability values (also known as p-values) as shown below, you should use a larger number of bootstrap samples (e.g., 2000) to ensure stable probability estimates.

The relevant output from the analysis appears below. Bootstrap parameter estimates are computed for each parameter estimate in the model: regression (path) coefficients, variances, covariances, and means and intercepts (if these quantities are estimated). For presentation purposes, only selected output is shown here: the original normal theory maximum likelihood-based covariance estimates and their bootstrap-based counterparts. The first figure, shown immediately below, displays the normal theory maximum likelihood estimates of the covariances of the independent variables in the model.

The initial part of this output contains the familiar Estimate, S.E. (standard error), and C.R. (Critical Ratio, the estimate divided by its standard error) quantities that are computed assuming normal distribution of the observed variables. Notice that each covariance is statistically significant. (In the column labeled 'P', three stars, '***', means the p-value is less than .001.) In particular, take note of the hypothesis test that the WEIGHT with YEAR covariance is equal to zero in the population of cars from which this sample was drawn. The normal theory parameter estimate is -504.570 with an estimated standard error of 229.771. Dividing -504.570 by 229.771 returns a critical ratio of -2.196, which is statistically significant using the conventional .05 cutoff level for statistical significance (at alpha = .05, critical ratios that fall between -1.96 and +1.96 are not statistically significant). The p-value of .028 shown in the table above is the p-value from the normal theory test of the null hypothesis that the covariance between WEIGHT and YEAR is zero in the population of cars from which this sample was drawn. Next, consider the bootstrap output from the Bootstrap Standard Errors table found in the Estimates/Bootstrap section:

The Bootstrap section of the output contains the mean of the parameter estimates from the multiple bootstrap samples. The difference between the maximum likelihood-based estimate and the bootstrap-based estimate is shown in the Bias column. Large bias values, as is the case here, suggest a substantial discrepancy between the results of the bootstrap analysis and the original normal theory-based analysis.

You can use the bootstrap Mean and SE columns to compute critical ratio values based on the bootstrap results. For example, consider testing the null hypothesis that the covariance between WEIGHT and YEAR in the table shown above is zero. The mean parameter estimate value from the 250 bootstrap samples is -503.194 with an estimated standard error equal to 441.729. Notice that the estimated standard error across the bootstrap samples is almost twice as large as the normal theory standard error. This discrepancy has a profound impact on the significance test for the WEIGHT and YEAR covariance: When you divide the bootstrap parameter estimate by the estimated standard error (-503.194/441.729), the resulting critical ratio, -1.14, is not statistically significant. There is no p-value for this test reported in the AMOS output. Instead, consider referring to the bias-corrected and percentile confidence intervals and their associated p-values. These tables are found below the Bootstrap Standard Errors table. The bias-corrected confidence intervals and p-values are shown below.
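The two critical ratios for the WEIGHT with YEAR covariance can be recomputed from the estimates and standard errors quoted in the text. The sketch below also converts the normal theory critical ratio to a two-sided p-value using the standard normal distribution; the function names are ours.

```python
import math

def critical_ratio(estimate: float, std_error: float) -> float:
    return estimate / std_error

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Normal theory maximum likelihood estimate and standard error.
ml_cr = critical_ratio(-504.570, 229.771)
print(round(ml_cr, 3), round(two_sided_p(ml_cr), 3))  # -2.196 0.028

# Bootstrap mean and bootstrap standard error.
boot_cr = critical_ratio(-503.194, 441.729)
print(round(boot_cr, 2), abs(boot_cr) > 1.96)  # -1.14 False
```

The point estimates are nearly identical; the change in the conclusion comes entirely from the larger bootstrap standard error.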

The percentile confidence intervals and p-values are shown in the following table.

The results are largely consistent across the two methods, yielding the same substantive conclusions. Mooney and Duval (1993, p. 50) note that the various available bootstrap confidence interval techniques can and frequently do perform differently under different circumstances. Therefore, there is no one best method to use in all data analysis situations. A sensible recommendation offered by Mooney and Duval is to report multiple confidence interval types and allow your audience to draw appropriate conclusions from the results.

It is possible to have upper and lower bias-corrected confidence intervals that do not include zero, yet have a p-value that is not statistically significant. This is because the p-values are computed independently of the confidence intervals. If you return to the Bootstrap tab in the Analysis Properties window and change the default 90% confidence intervals to 95%, the upper and lower confidence interval values for the covariances will change, though the p-values for both the PC and BC confidence intervals remain unchanged. In this example, the 90% confidence intervals and p-values match.
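To make the mechanics of a percentile interval concrete, here is a small sketch of a nonparametric bootstrap for a covariance. It is an illustration under our own assumptions, not AMOS's implementation: the data pairs are hypothetical, the covariance divides by N, and the rule used to pick the interval endpoints is one of several conventions in use.

```python
import math
import random

def cov_n(pairs):
    """Covariance of (x, y) pairs with division by N."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    return sum((x - mx) * (y - my) for x, y in pairs) / n

def bootstrap_estimates(pairs, n_boot=2000, seed=1):
    """Resample cases with replacement and re-estimate the covariance."""
    rng = random.Random(seed)
    n = len(pairs)
    return [cov_n([pairs[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]

def percentile_interval(estimates, level=0.90):
    """Take the interval endpoints from the sorted bootstrap estimates."""
    ordered = sorted(estimates)
    alpha = (1.0 - level) / 2.0
    lo = ordered[math.floor(alpha * (len(ordered) - 1))]
    hi = ordered[math.ceil((1.0 - alpha) * (len(ordered) - 1))]
    return lo, hi

# Hypothetical data: 12 (x, y) cases.
pairs = [(2, 5), (4, 4), (5, 9), (7, 8), (8, 6), (9, 12),
         (11, 10), (12, 15), (14, 11), (15, 18), (17, 16), (20, 19)]

estimates = bootstrap_estimates(pairs)
boot_mean = sum(estimates) / len(estimates)
boot_se = math.sqrt(sum((e - boot_mean) ** 2 for e in estimates) / len(estimates))

print("bootstrap mean:", round(boot_mean, 3), "SE:", round(boot_se, 3))
print("90% interval:", percentile_interval(estimates, 0.90))
print("95% interval:", percentile_interval(estimates, 0.95))
```

Raising the confidence level from 90% to 95% widens the interval, while the underlying set of bootstrap estimates stays the same.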

AMOS will compute bootstrap test statistics for all requested output, including standardized coefficients, squared multiple correlations (r-square values), and total and indirect effects. This last feature is very useful - even if your data meet the assumption of multivariate normality, you may still want to explore bootstrapping to test significance of indirect effects, standardized coefficients, or squared multiple correlations.

**Cautions**

There are several cautionary notes to keep in mind when you use bootstrapping with AMOS. First, AMOS requires that the input database be complete for diagnosing sample data non-normality and for using any of its bootstrap features. In other words, if you have missing data, you must solve the missing data problem before you can use AMOS's non-normality diagnostic and bootstrap features. In this example, the total number of cases in the cars database was 406. Omitting cases with missing values resulted in a database containing 392 cases. Approximately 3.5% of the original cases were lost due to missingness. According to Roth (1994), with case losses of 5% or less, removal of entire cases (listwise data deletion) is a defensible strategy for handling the incomplete data problem. If removing cases results in a loss of data that exceeds 5%, however, other methods for handling missing data may be more appropriate. See General FAQ: Handling missing or incomplete data for details on missing data handling methods.

Second, your sample size should be sufficiently large to ensure trustworthy parameter estimates. Nevitt and Hancock (1998) suggest a minimum sample size of 200 for SEMs that contain latent variables. Finally, the bootstrap method requires the data analyst to set the scale of each latent variable by fixing one of its factor loadings to 1.00 rather than by fixing the factor's variance to 1.00, because under the latter scenario bootstrapped standard error estimates may be artificially inflated by positive and negative factor loadings switching signs across bootstrap samples (Hancock & Nevitt, 1999).

**References**

For more information about non-normal data handling in AMOS, see the following references:

Arbuckle, J., & Wothke, W. (1999). AMOS 4.0 User's Guide. Chicago, IL: Smallwaters Corporation.

Fouladi, R. T. (1998). Covariance structure analysis techniques under conditions of multivariate normality and nonnormality - Modified and bootstrap test statistics. Paper presented at the American Educational Research Association Annual Meeting, April 11-17, 1998, San Diego, CA.

Hancock, G. R., & Nevitt, J. (1999). Bootstrapping and the identification of exogenous latent variables within structural equation models. Structural Equation Modeling, 6(4), 394-399.

Mooney, C. Z., & Duval, R. D. (1993). Bootstrapping: A nonparametric approach to statistical inference. Newbury Park, CA: Sage Publications.

Nevitt, J., & Hancock, G. R. (1998). Relative performance of rescaling and resampling approaches to model chi-square and parameter standard error estimation in structural equation modeling. Paper presented at the American Educational Research Association Annual Meeting, April 11-17, 1998, San Diego, CA.

Roth, P. (1994). Missing data: A conceptual review for applied psychologists. Personnel Psychology, 47, 537-560.