I plan to take a PhD student this year. If you are interested in my research, please contact me via email.
Click here for the PhD application procedure in the Department of Statistics.

Contact Information

  • Office:
    Columbia House, Room 5.16
    Houghton Street, London, WC2A 2AE

  • Faculty webpage: Link

  • Google scholar page: Link

  • Email: y.chen186@lse.ac.uk


Educational Background

  • PhD, Statistics, Columbia University in the City of New York, 2016
    (My Node on Math Genealogy Tree: Click Link)

Research Interests

My research aims at developing new statistical and machine learning methods and theory for SOCIAL DATA SCIENCE. With the emergence of high-dimensional data in the social sciences, and given the high noise level in such data, we need more computationally efficient algorithms and statistical inference methods to produce more reliable and reproducible findings.

My current research focuses on three interrelated topics: (1) large-scale item response data analysis, (2) measurement and predictive modeling based on dynamic behavioral data, and (3) sequential design of dynamic systems. The first topic concerns large-scale item response data from educational and psychological testing, which are increasingly common. Because such data are high-dimensional and complex, traditional statistical models, estimation methods, and computational algorithms are no longer well suited to them. I have developed several new statistical models for applications including psychological and psychiatric measurement, detection of aberrant behavior in educational testing, and analysis of large-scale educational surveys. In addition, I have proposed new estimation methods and developed numerical and stochastic optimization algorithms that are better suited to large-scale item response data, together with supporting statistical theory.

The second topic looks at multivariate dynamic data, specifically intensive longitudinal data in psychology and problem-solving process data in education. Intensive longitudinal data, which consist of many measurements over time collected by smartphones, fitness trackers, and the Internet of Things, are becoming more important in social and behavioral research, especially in psychology. Problem-solving process data refer to the actions and time stamps recorded in computer log files when individuals solve a computer-simulated task. Both types of data contain a substantial amount of useful information, but extracting it is challenging because of their complex structures. Borrowing ideas from continuous-time Gaussian processes and counting processes, I have proposed statistical frameworks and models for analysing and making predictions based on such data, and developed computational algorithms to make these models applicable to real problems.

The third topic considers the stochastic control of dynamic systems and its applications in the social and behavioral sciences. Specifically, compound decision theory has been developed for sequential hypothesis testing, stochastic control, rank aggregation, and change point detection, with applications to online crowdsourcing, item pool quality control in educational testing, and technology-enhanced personalized learning.


  • 2018 NCME Brenda H. Loyd Outstanding Dissertation Award (Photo)

  • 2018 National Academy of Education/Spencer Postdoctoral Fellow (link)


  • Intern at Educational Testing Service, supervised by Dr. Matthias von Davier, June to July 2014

  • Visiting Scholar at Shanghai Center for Mathematical Sciences, May to July 2016

  • Assistant Professor in the Department of Psychology and the Institute for Quantitative Theory and Methods, Emory University, August 2016 - August 2018