Agresti, A. (2002). Categorical Data Analysis (Second ed.). New York: Wiley.
Alwin, D. F. and D. J. Jackson (1980). Measurement Models for Response Errors in Surveys: Issues and Applications, pp. 68–119. San Francisco: Jossey-Bass.
Andersen, E. B. (1980). Comparing latent distributions. Psychometrika 45, 121–134.
Andrich, D. (1988). Rasch Models for Measurement. Number 68 in Quantitative Applications in the Social Sciences. Newbury Park, CA: Sage.
Angoff, W. H. and S. F. Ford (1973). Item-race interaction on a test of scholastic aptitude. Journal of Educational Measurement 10, 95–106.
Baker, F. B. (1981). A criticism of Scheuneman’s item bias technique. Journal of Educational Measurement 18, 59–62.
Bandeen-Roche, K., D. L. Miglioretti, S. L. Zeger, and P. J. Rathouz (1997). Latent variable regression for multiple discrete outcomes. Journal of the American Statistical Association 92, 1375–1386.
Bartholomew, D., M. Knott, and I. Moustaki (2011). Latent Variable Models and Factor Analysis: A Unified Approach (Third ed.). Chichester: Wiley.
Bechtoldt, H. P. (1974). A confirmatory analysis of the factor stability hypothesis. Psychometrika 39, 319–326.
Bejar, I. I. (1980). Biased assessment of program impact due to psychometric artifacts. Psychological Bulletin 87, 513–524.
Bentler, P. M. (1980). Multivariate analysis with latent variables: Causal modeling. Annual Review of Psychology 31, 419–456.
Bieber, S. L. (1986). A hierarchical approach to multigroup factorial invariance. Journal of Classification 3, 113–134.
Bishop, G. F., A. J. Tuchfarber, and R. W. Oldendick (1986). Opinions on fictitious issues: The pressure to answer survey questions. Public Opinion Quarterly 50, 240–250.
Bollen, K. A. (1989). Structural Equations with Latent Variables. New York: Wiley.
Brown, T. A. (2003). Confirmatory factor analysis of the Penn State Worry Questionnaire: Multiple factors or methods effects? Behaviour Research and Therapy 41, 1411–1426.
Byrne, B. M., R. J. Shavelson, and B. Muthén (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin 105, 456–466.
Cardall, C. and W. E. Coffman (1964). A method for comparing the performance of different groups on the items in a test. Research Bulletin 64-10, Educational Testing Service, Princeton, NJ.
Carle, A. C. (2009a). Assessing the adequacy of self-reported alcohol abuse measurement across time and ethnicity: Cross-cultural equivalence across Hispanics and Caucasians 1992, non-equivalence in 2001–2002. BMC Public Health 9, Article number 60.
Carle, A. C. (2009b). Cross-cultural invalidity of alcohol dependence measurement across Hispanics and Caucasians in 2001 and 2002. Addictive Behaviors 34, 43–50.
Carle, A. C. (2010). Interpreting the results of studies using latent variable models to assess data quality: An empirical example using confirmatory factor analysis. Quality & Quantity 44, 483–497.
Carroll, R. J., D. Ruppert, L. A. Stefanski, and C. M. Crainiceanu (2006). Measurement Error in Nonlinear Models (Second ed.). Boca Raton, FL: Chapman & Hall/CRC.
Cheung, G. W. and R. B. Rensvold (1999). Testing factorial invariance across groups: a reconceptualization and proposed new method. Journal of Management 25, 1–27.
Cheung, G. W. and R. B. Rensvold (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling 9, 233–255.
Clogg, C. C. (1988). Latent class models for measuring. In R. Langeheine and J. Rost (Eds.), Latent Trait and Latent Class Models, pp. 173–205. New York: Plenum.
Clogg, C. C. and L. A. Goodman (1984). Latent structure analysis of a set of multidimensional contingency tables. Journal of the American Statistical Association 79, 762–771.
Clogg, C. C. and L. A. Goodman (1985). Simultaneous latent structure analysis in several groups. Sociological Methodology 15, 81–110.
Clogg, C. C. and L. A. Goodman (1986). On scaling models applied to data from several groups. Psychometrika 51, 123–135.
Cook, L. L. and A. P. Schmitt-Cascallar (2005). Establishing score comparability for tests given in different languages. In R. K. Hambleton, P. F. Merenda, and C. D. Spielberger (Eds.), Adapting Educational and Psychological Tests for Cross-Cultural Assessment, pp. 139–169. Mahwah, NJ: Lawrence Erlbaum.
Davidov, E. (2009). Measurement equivalence of nationalism and constructive patriotism in the ISSP: 34 countries in a comparative perspective. Political Analysis 17, 64–82.
Davidov, E., B. Meuleman, J. Billiet, and P. Schmidt (2008). Values and support for immigration: a cross-country comparison. European Sociological Review 24, 583–599.
Davidov, E., P. Schmidt, and J. Billiet (Eds.) (2011). Cross-Cultural Analysis: Methods and Applications. New York: Routledge.
Davidov, E., P. Schmidt, and S. H. Schwartz (2008). Bringing values back in: the adequacy of the European Social Survey to measure values in 20 countries. Public Opinion Quarterly 72, 420–445.
Dayton, C. M. and G. B. Macready (1988). Concomitant-variable latent class models. Journal of the American Statistical Association 83, 173–178.
De Beuckelaer, A. and G. Swinnen (2011). Biased latent variable mean comparisons due to measurement noninvariance: A simulation study. In E. Davidov, P. Schmidt, and J. Billiet (Eds.), Cross-Cultural Analysis: Methods and Applications, pp. 117–147. New York: Routledge.
De Jong, M. G., J.-B. E. M. Steenkamp, and J.-P. Fox (2007). Relaxing measurement invariance in cross-national consumer research using a hierarchical IRT model. Journal of Consumer Research 34, 260–278.
Drasgow, F. (1982). Biased test items and differential validity. Psychological Bulletin 92, 526–531.
Drasgow, F. and R. Kanfer (1985). Equivalence of psychological measurement in heterogeneous populations. Journal of Applied Psychology 70, 662–680.
Drasgow, F. and T. M. Probst (2005). The psychometrics of adaptation: Evaluating measurement equivalence across languages and cultures. In R. K. Hambleton, P. F. Merenda, and C. D. Spielberger (Eds.), Adapting Educational and Psychological Tests for Cross-Cultural Assessment, pp. 265–296. Mahwah, NJ: Lawrence Erlbaum.
Edelen, M. O. and B. B. Reeve (2007). Applying item response theory (IRT) modeling to questionnaire development, evaluation and refinement. Quality of Life Research 16, 5–18.
Ellis, B. B., B. Minsel, and P. Becker (1989). Evaluation of attitude survey translations: An investigation using item response theory. International Journal of Psychology 24, 665–684.
Fienberg, S. E. (1980). The Analysis of Cross-classified Categorical Data (Second ed.). Cambridge, MA: MIT Press. Reprinted in 2007 by Springer.
Formann, A. K. (2003). Latent class model diagnosis from a frequentist point of view. Biometrics 59, 189–196.
Fox, J.-P. (2010). Bayesian Item Response Modeling: Theory and Applications. New York: Springer.
Fox, J.-P. and J. Verhagen (2011). Random item effects modeling for cross-national survey data. In E. Davidov, P. Schmidt, and J. Billiet (Eds.), Cross-Cultural Analysis: Methods and Applications, pp. 461–482. New York: Routledge.
Ghorbani, N., M. N. Bing, P. J. Watson, H. K. Davison, and D. A. Mack (2002). Self-reported emotional intelligence: Construct similarity and functional dissimilarity of higher-order processing in Iran and the United States. International Journal of Psychology 37, 297–308.
Glas, C. A. W. (1998). Detection of differential item functioning using Lagrange multiplier tests. Statistica Sinica 8, 647–667.
Goodman, L. A. (2002). Latent class analysis: The empirical study of latent types, latent variables, and latent structures. In J. A. Hagenaars and A. L. McCutcheon (Eds.), Applied Latent Class Analysis, pp. 3–55. Cambridge: Cambridge University Press.
Hagenaars, J. A. and A. L. McCutcheon (Eds.) (2002). Applied Latent Class Analysis. Cambridge: Cambridge University Press.
Hambleton, R. K., P. F. Merenda, and C. D. Spielberger (Eds.) (2005). Adapting Educational and Psychological Tests for Cross-Cultural Assessment. Mahwah, NJ: Lawrence Erlbaum.
Harkness, J. A., F. J. R. van de Vijver, and P. P. Mohler (Eds.) (2003). Cross-cultural Survey Methods. Hoboken, NJ: Wiley.
Hogan, D. P., D. J. Eggebeen, and C. C. Clogg (1993). The structure of intergenerational exchanges in American families. The American Journal of Sociology 98, 1428–1458.
Hood, R. W., N. Ghorbani, P. J. Watson, A. F. Ghramaleki, M. N. Bing, H. K. Davison, R. J. Morris, and W. P. Williamson (2001). Dimensions of the mysticism scale: confirming the three-factor structure in the United States and Iran. Journal for the Scientific Study of Religion 40, 691–705.
Hui, C. H. and H. C. Triandis (1985). Measurement in cross-cultural psychology: A review and comparison of strategies. Journal of Cross-Cultural Psychology 16, 131–152.
Janssen, R. (2011). Using a Differential Item Functioning approach to investigate measurement invariance. In E. Davidov, P. Schmidt, and J. Billiet (Eds.), Cross-Cultural Analysis: Methods and Applications, pp. 359–432. New York: Routledge.
Johnson, M. (1969). Factorial invariance of African educational abilities and aptitudes. Research Bulletin 69-3, Educational Testing Service, Princeton, NJ.
Johnson, T. P. (1998). Approaches to equivalence in cross-cultural and cross-national survey research. In J. A. Harkness (Ed.), ZUMA-Nachrichten Spezial No. 3: Cross-Cultural Survey Equivalence, Mannheim. ZUMA.
Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika 36, 409–426.
Jowell, R., C. Roberts, R. Fitzgerald, and G. Eva (Eds.) (2007). Measuring Attitudes Cross-nationally: Lessons from the European Social Survey. London: Sage.
Kankaraš, M. and G. Moors (2009). Measurement equivalence in solidarity attitudes in Europe: Insights from a multiple-group latent-class factor approach. International Sociology 24, 557–579.
Kankaraš, M. and G. Moors (2011). Measurement equivalence and extreme response bias in the comparison of attitudes across Europe: A multigroup latent-class factor approach. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences 7, 68–80.
Kankaraš, M. and G. Moors (2012). Cross-national and cross-ethnic differences in attitudes: A case of Luxembourg. Cross-Cultural Research 46, 224–254.
Kankaraš, M., G. Moors, and J. K. Vermunt (2011). Testing for measurement invariance with latent class analysis. In E. Davidov, P. Schmidt, and J. Billiet (Eds.), Cross-Cultural Analysis: Methods and Applications, pp. 359–384. New York: Routledge.
Kankaraš, M., J. K. Vermunt, and G. Moors (2011). Measurement equivalence of ordinal items: A comparison of factor analytic, item response theory, and latent class approaches. Sociological Methods & Research 40, 279–310.
Kaplan, D. and R. George (1995). A study of the power associated with testing factor mean differences under violations of factorial invariance. Structural Equation Modeling 2, 101–118.
Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika 54, 681–697.
Kelderman, H. and G. B. Macready (1990). The use of loglinear models for assessing differential item functioning across manifest and latent examinee groups. Journal of Educational Measurement 27, 307–327.
Knight, A. (1978). Common factor analysis: Some recent developments in theory and practice. The Statistician 27, 27–42.
Krosnick, J. A., A. L. Holbrook, M. K. Berent, R. T. Carson, W. M. Hanemann, R. J. Kopp, R. C. Mitchell, S. Presser, P. A. Ruud, V. K. Smith, W. R. Moody, M. C. Green, and M. Conaway (2002). The impact of "no opinion" response options on data quality: Non-attitude reduction or an invitation to satisfice? Public Opinion Quarterly 66, 371–403.
Langeheine, R. and J. Rost (Eds.) (1988). Latent Trait and Latent Class Models. New York: Plenum Press.
Lee, S.-Y. and K.-L. Tsui (1982). Covariance structure analysis in several populations. Psychometrika 47, 297–308.
Liebler, C. A. and G. D. Sandefur (2002). Gender differences in the exchange of social support with friends, neighbors, and co-workers at midlife. Social Science Research 31, 364–391.
MacPherson, L. and M. G. Myers (2004). Invariance study of an adolescent survey-based smoking-related cognition scale: Examination across Hispanic and Caucasian groups. Preventive Medicine 39, 1026–1035.
Magidson, J. and J. K. Vermunt (2001). Latent class factor and cluster models, bi-plots and related graphical displays. Sociological Methodology 31, 223–264.
Mannetti, L., A. Pierro, A. Kruglanski, T. Taris, and P. Bezinovic (2002). A cross-cultural study of the Need for Cognitive Closure Scale: Comparing its structure in Croatia, Italy, USA and The Netherlands. British Journal of Social Psychology 41, 139–156.
Marsh, H. W. and D. Hocevar (1985). Application of confirmatory factor analysis to the study of self-concept: First- and second-order factor models and their invariance across groups. Psychological Bulletin 97, 562–582.
Mathisen, G. E., T. Torsheim, and S. Einarsen (2006). The team-level model of climate for innovation: A two-level confirmatory factor analysis. Journal of Occupational and Organizational Psychology 79, 23–35.
McClendon, M. J. and D. F. Alwin (1993). No-opinion filters and attitude measurement reliability. Sociological Methods & Research 21(4), 438–464.
McCutcheon, A. L. (1996). Multiple group association models with latent variables: An analysis of secular trends in abortion attitudes, 1972–1988. Sociological Methodology 26, 79–111.
McGaw, B. and K. G. Jöreskog (1971). Factorial invariance of ability measures in groups differing in intelligence and socioeconomic status. British Journal of Mathematical and Statistical Psychology 24, 154–168.
Mellenbergh, G. J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics 7, 105–118.
Mellenbergh, G. J. (1989). Item bias and item response theory. International Journal of Educational Research 13, 127–143.
Meredith, W. (1964a). Notes on factorial invariance. Psychometrika 29, 177–185.
Meredith, W. (1964b). Rotation to achieve factorial invariance. Psychometrika 29, 187–206.
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika 58, 525–543.
Meredith, W. and R. E. Millsap (1992). On the misuse of manifest variables in the detection of measurement bias. Psychometrika 57, 289–311.
Meuleman, B. and J. Billiet (2012). Measuring attitudes toward immigration in Europe: The cross-cultural validity of the ESS immigration scales. Ask: Research & Methods 21, 5–29.
Miller, J. D. (1998). The measurement of civic scientific literacy. Public Understanding of Science 7, 203–223.
Mills, M., G. G. van de Bunt, and J. de Bruijn (2006). Comparative research: Persistent problems and promising solutions. International Sociology 21, 619–631.
Millsap, R. E. (2007). Invariance in measurement and prediction revisited. Psychometrika 72, 461–473.
Millsap, R. E. and J. Yun-Tein (2004). Assessing factorial invariance in ordered-categorical measures. Multivariate Behavioral Research 39, 479–515.
Muthén, B. (1989). Latent variable modeling in heterogeneous populations. Psychometrika 54, 557–585.
Muthén, B. and T. Asparouhov (2012). Bayesian structural equation modeling: A more flexible representation of substantive theory. Psychological Methods 17, 313–335.
Muthén, B. and T. Asparouhov (2013). BSEM measurement invariance analysis. Mplus Web Notes: No. 17, http://www.statmodel.com.
Muthén, B. and A. Christoffersson (1981). Simultaneous factor analysis of dichotomous variables in several groups. Psychometrika 46, 407–419.
Muthén, B. and J. Lehman (1985). Multiple group IRT modeling: Applications to item bias analysis. Journal of Educational Statistics 10, 133–142.
Oberski, D. L. (2013). Evaluating sensitivity of parameters of interest to measurement invariance in latent variable models. To appear in Political Analysis.
Pardo, R., C. Midden, and J. D. Miller (2002). Attitudes towards biotechnology in the European Union. Journal of Biotechnology 98, 9–24.
Petersen, N. S. and M. R. Novick (1976). An evaluation of some models for culture-fair selection. Journal of Educational Measurement 13, 3–29.
Reiser, M. and Y. Lin (1999). A goodness-of-fit test for the latent class model when expected frequencies are small. Sociological Methodology 29, 81–111.
Rijmen, F., M. von Davier, and K. Yamamoto. Addressing item invariance across countries in large scale assessments studies. Technical report, ETS.
Rock, D. A. and N. E. Freeberg (1969). Factorial invariance of biographical factors. Multivariate Behavioral Research 4, 195–210.
Saris, W. E. and I. N. Gallhofer (2007). Design, Evaluation, and Analysis of Questionnaires for Survey Research. Hoboken, NJ: Wiley.
Scheuneman, J. (1979). A method of assessing bias in test items. Journal of Educational Measurement 16, 143–152.
Scheuneman, J. (1981). A response to Baker’s criticism. Journal of Educational Measurement 18, 63–66.
Siegers, P. (2011). Multiple group latent class analysis of religious orientations in Europe. In E. Davidov, P. Schmidt, and J. Billiet (Eds.), Cross-Cultural Analysis: Methods and Applications, pp. 385–412. New York: Routledge.
Sireci, S. G. (2005). Using bilinguals to evaluate the comparability of different language versions of a test. In R. K. Hambleton, P. F. Merenda, and C. D. Spielberger (Eds.), Adapting Educational and Psychological Tests for Cross-Cultural Assessment, pp. 117–138. Mahwah, NJ: Lawrence Erlbaum.
Sireci, S. G., L. Patsula, and R. K. Hambleton (2005). Statistical methods for identifying flaws in the test adaptation process. In R. K. Hambleton, P. F. Merenda, and C. D. Spielberger (Eds.), Adapting Educational and Psychological Tests for Cross-Cultural Assessment, pp. 93–115. Mahwah, NJ: Lawrence Erlbaum.
Skrondal, A. and S. Rabe-Hesketh (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models. Boca Raton, FL: Chapman & Hall/CRC.
Soares, T. M., F. B. Gonçalves, and D. Gamerman (2009). An integrated Bayesian model for DIF analysis. Journal of Educational and Behavioral Statistics 34, 348–377.
Sörbom, D. (1974). A general method for studying differences in factor means and factor structure between groups. British Journal of Mathematical and Statistical Psychology 27, 229–239.
Sörbom, D. (1978). An alternative to the methodology for analysis of covariance. Psychometrika 43, 381–396.
Steenkamp, J.-B. E. M. and H. Baumgartner (1998). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research 25, 78–90.
Stegmueller, D. (2011). Apples and oranges? The problem of equivalence in comparative research. Political Analysis 19, 471–487.
Stein, J. A., J. W. Lee, and P. S. Jones (2006). Assessing cross-cultural differences through use of multiple-group invariance analyses. Journal of Personality Assessment 87, 249–258.
Stocking, M. L. and F. M. Lord (1982). Developing a common metric in item response theory. Technical report, Educational Testing Service. Published (possibly in modified form) in Applied Psychological Measurement, 7, pp. 201–210 (1983).
van de Vijver, F. and K. Leung (1997). Methods and Data Analysis for Cross-cultural Research. Thousand Oaks, CA: Sage Publications.
van de Vijver, F. J. R. and Y. H. Poortinga (2005). Conceptual and methodological issues in adapting tests. In R. K. Hambleton, P. F. Merenda, and C. D. Spielberger (Eds.), Adapting Educational and Psychological Tests for Cross-Cultural Assessment, pp. 39–63. Mahwah, NJ: Lawrence Erlbaum.
van den Wittenboer, G., J. J. Hox, and E. D. de Leeuw (2000). Latent class analysis of respondent scalability. Quality & Quantity 34, 177–191.
Vandenberg, R. J. and C. E. Lance (2000). A review and synthesis of the measurement invariance literature: suggestions, practices, and recommendations for organizational research. Organizational Research Methods 3, 4–70.
Watkins, D. (1989). The role of confirmatory factor analysis in cross-cultural research. International Journal of Psychology 24, 685–701.
Werts, C. E., D. A. Rock, R. L. Linn, and K. G. Jöreskog (1976). Comparison of correlations, variances, covariances, and regression weights with or without measurement error. Psychological Bulletin 83, 1007–1013.
Wilson, K. L. (1981). On population comparisons using factor indexes or latent variables. Social Science Research 10, 301–313.
Wolfle, L. M. and D. Robertshaw (1983). Racial differences in measurement error in educational achievement models. Journal of Educational Measurement 20, 39–49.
Woods, C. M. (2009). Evaluation of MIMIC-model methods for DIF testing with comparison to two-group analysis. Multivariate Behavioral Research 44, 1–27.