Casper Albers



The list below is in reverse chronological order. Peer-reviewed publications are numbered. Non-reviewed publications are unnumbered.


Lieke Voncken, Casper Albers, Marieke Timmerman (2018). Improving confidence intervals for normed test scores: Include uncertainty due to sampling variability. Behavior Research Methods, accepted for publication.
Test publishers usually provide confidence intervals (CIs) for normed test scores that reflect the uncertainty due to the unreliability of the tests. The uncertainty due to sampling variability in the norming phase is ignored. To express uncertainty due to norming, we propose a flexible method that is applicable in continuous norming and allows for a variety of score distributions, using Generalized Additive Models for Location, Scale, and Shape (GAMLSS; Rigby & Stasinopoulos, 2005). We assessed the performance of this method in a simulation study, by examining the quality of the resulting CIs. We varied the population model, procedure of estimating the CI, confidence level, sample size, value of the predictor, extremity of the test score, and type of variance-covariance matrix. The results showed that good quality of the CIs could be achieved in most conditions. The method is illustrated using normative data of the SON-R 6-40 test. We recommend test developers to use this approach to arrive at CIs, and thus properly express the uncertainty due to norm sampling fluctuations, in the context of continuous norming. Adopting this approach will help (e.g., clinical) practitioners to obtain a fair picture of the person assessed.
Casper Albers (2018). The statistical approach known as Machine Learning. Nieuw Archief voor Wiskunde, 5/19(3): 215-217.

Rob Meijer, Anja Boeve, Jorge Tendeiro, Roel Bosker, Casper Albers (2018). Corrigendum: The Use of Subscores in Higher Education: When Is This Useful?. Frontiers in Psychology: Educational Psychology, 9:873.

Casper Albers, Henk Kiers, Don van Ravenzwaaij (2018). Credible Confidence: A Pragmatic View on the Frequentist vs Bayesian Debate. Collabra: Psychology, 4(1), 31.
The debate between Bayesians and frequentist statisticians has been going on for decades. Whilst there are fundamental theoretical and philosophical differences between both schools of thought, we argue that in the two most common situations the practical differences are negligible when off-the-shelf Bayesian analysis (i.e., using ‘objective’ priors) is used. We emphasize this reasoning by focusing on interval estimates: confidence intervals and credible intervals. We show that this is the case for the most common empirical situations in the social sciences, the estimation of a proportion of a binomial distribution and the estimation of the mean of a unimodal distribution. Numerical differences between both approaches are small, sometimes even smaller than those between two competing frequentist or two competing Bayesian approaches. We outline the ramifications of this for scientific practice.
Nitin Bhushan, Casper Albers, Linda Steg (2018). Studying the effects of intervention programmes on household energy saving behaviours using graphical causal models. Energy Research & Social Science, Accepted for publication.

Casper Albers (2018). De Moivre-Gauss-Laplace: extraordinarily normal. Nieuw Archief voor Wiskunde, 5/19(1): 37-38.

Casper Albers (2018). Hoe tel je geheimen?. Pythagoras, 57(4): 8.

Christien Slofstra, Maaike Nauta, Laura Bringmann, Nicola Klein, Casper Albers, Nicolas Batalas, Marieke Wichers (2018). Individual negative affective trajectories can be detected during different depressive relapse prevention strategies. Psychotherapy and Psychosomatics, 87:243-245.
Letter to the editor
Renske Bosman, Casper Albers, Jettie de Jong, Nicolas Batalas, Marije aan het Rot (2018). No menstrual cyclicity in mood and interpersonal behaviour in nine women with self-reported PMS. Psychopathology, (accepted for publication).
Background/Aims: Before diagnosing premenstrual dysphoric disorder (PMDD), 2 months of prospective assessment are required to confirm menstrual cyclicity in symptoms. For a diagnosis of premenstrual syndrome (PMS), this is not required. Women with PMDD and PMS often report that their symptoms interfere with mood and social functioning, and are said to show cyclical changes in interpersonal behaviour, but this has not been examined using a prospective approach. We sampled cyclicity in mood and interpersonal behaviour for 2 months in women with self-reported PMS. Methods: Participants met the criteria for PMS on the Premenstrual Symptoms Screening Tool (PSST), a retrospective questionnaire. For 2 menstrual cycles, after each social interaction, they used the online software TEMPEST to record on their smartphones how they felt and behaved. We examined within-person variability in negative affect, positive affect, quarrelsomeness, and agreeableness. Results: Participants evaluated TEMPEST as positive. However, we found no evidence for menstrual cyclicity in mood and interpersonal behaviour in any of the individual women (n = 9). Conclusion: Retrospective questionnaires such as the PSST may lead to oversampling of PMS. The diagnosis of PMS, like that of PMDD, might require 2 months of prospective assessment.
Jan Gerard Hoendervanger, Anja Ernst, Casper Albers, Mark Mobach, Nico van Yperen (2018). Individual differences in satisfaction with activity-based work environments. PLoS ONE, 13(3): e0193878.
Satisfaction with activity-based work environments (ABW environments) often falls short of expectations, with striking differences among individual workers. A better understanding of these differences may provide clues for optimising satisfaction with ABW environments and associated organisational outcomes. The current study was designed to examine how specific psychological needs, job characteristics, and demographic variables relate to satisfaction with ABW environments. Survey data collected at seven organizations in the Netherlands (N = 551) were examined using correlation and regression analyses. Significant correlates of satisfaction with ABW environments were found: need for relatedness (positive), need for privacy (negative), job autonomy (positive), social interaction (positive), internal mobility (positive), and age (negative). Need for privacy appeared to be a powerful predictor of individual differences in satisfaction with ABW environments. These findings underline the importance of providing work environments that allow for different work styles, in alignment with different psychological need strengths, job characteristics, and demographic variables. Improving privacy, especially for older workers and for workers high in need for privacy, seems key to optimizing satisfaction with ABW environments.
Claire Hill, Cathy Creswell, Sarah Vigerland, ..., Casper Albers, ..., Philip Kendall (2018). Navigating the development and dissemination of internet cognitive behavioral therapy (iCBT) for anxiety disorders in children and young people: A consensus statement with recommendations from the #iCBTLorentz Workshop Group. Internet Interventions, 12, 1-10.
Initial internet-based cognitive behavioral therapy (iCBT) programs for anxiety disorders in children and young people (CYP) have been developed and evaluated, however these have not yet been widely adopted in routine practice. The lack of guidance and formalized approaches to the development and dissemination of iCBT has arguably contributed to the difficulty in developing iCBT that is scalable and sustainable beyond academic evaluation and that can ultimately be adopted by healthcare providers. This paper presents a consensus statement and recommendations from a workshop of international experts in CYP anxiety and iCBT (#iCBTLorentz Workshop Group) on the development, evaluation, engagement and dissemination of iCBT for anxiety in CYP.
Casper Albers, John Gower and Henk Kiers (2018). Rank properties for centred three-way arrays. F. Mola, C. Conversano, M. Vichi (eds), Classification, (Big) Data Analysis and Statistical Learning, Studies in Classification, Data Analysis, and Knowledge Organization, Springer, pp. 69-76.
When analysing three-way arrays, it is common practice to centre the arrays. Depending on the context, centring is performed over one, two or three modes. In this paper, we outline how centring affects the rank of the array; both in terms of maximum rank and typical rank.
Casper Albers (2018). Column van het jaar. Nieuw Archief voor Wiskunde, 5/18(3): 247-248.

Daniel Lakens, Frederico Adolfi, Casper Albers, Farid Anvari, Matthew Apps, ... (66 others), Rolf Zwaan (2018). Justify Your Alpha. Nature Human Behaviour, 2(3), 168-171.
In response to recommendations to redefine statistical significance to p < .005, we propose that researchers should transparently report and justify all choices they make when designing a study, including the alpha level.
Casper Albers and Daniel Lakens (joint first author) (2018). When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias. Journal of Experimental Social Psychology, 74, 187-195.
When designing a study, the planned sample size is often based on power analyses. One way to choose an effect size for power analyses is by relying on pilot data. A-priori power analyses are only accurate when the effect size estimate is accurate. In this paper we highlight two sources of bias when performing a-priori power analyses for between-subject designs based on pilot data. First, we examine how the choice of the effect size index affects the sample size and power of the main study. Based on our observations, we recommend against the use of η² in a-priori power analyses. Second, we examine how the maximum sample size researchers are willing to collect in a main study (e.g. due to time or financial constraints) leads to overestimated effect size estimates in the studies that are performed. Determining the required sample size exclusively based on the effect size estimates from pilot data, and following up on pilot studies only when the sample size estimate for the main study is considered feasible, creates what we term follow-up bias. We explain how follow-up bias leads to underpowered main studies. Our simulations show that designing main studies based on effect sizes estimated from small pilot studies does not yield desired levels of power due to accuracy bias and follow-up bias, even when publication bias is not an issue. We urge researchers to consider alternative approaches to determining the sample size of their studies, and discuss several options.


Tanja Krone, Casper Albers, Peter Kuppens, Marieke Timmerman (2017). A multivariate statistical model for emotion dynamics. Emotion, 18(5), 739-754.
In emotion dynamic research one distinguishes various elementary emotion dynamic features, which are studied using intensive longitudinal data. Typically, each emotion dynamic feature is quantified separately, which hampers the study of relationships between various features. Further, the length of the observed time series in emotion research is limited, and often suffers from a high percentage of missing values. In this paper we propose a vector autoregressive Bayesian dynamic model, that is useful for emotion dynamic research. The model encompasses six elementary properties of emotions, and can be applied with relatively short time series, including missing data. The individual elementary properties covered are: within person variability, innovation variability, inertia, granularity, cross-lag regression and average intensity. The model can be applied to both univariate and multivariate time series, allowing to model the relationships between emotions. One may include external variables and non-Gaussian observed data. We illustrate the usefulness of the model on data involving 50 participants self-reporting on their experience of three emotions across the period of one week using experience sampling.
Casper Albers (2017). The statistician Alan Turing. Nieuw Archief voor Wiskunde, 5/18(3): 209-210.

Casper Albers (2017). Lies, damn lies en tijdreeksen. Nieuw Archief voor Wiskunde, 5/18(2): 99-101.

Christophe Sarran, Casper Albers, Patrick Sachon, Ybe Meesters (2017). Meteorological analysis of symptom data for people with seasonal affective disorder. Psychiatry Research, 257: 501-505.
It is thought that variation in natural light levels affects people with Seasonal Affective Disorder (SAD). Several meteorological factors related to luminance can be forecast but little is known about which factors are most indicative of worsening SAD symptoms. The aim of this meteorological analysis is to determine which factors are linked to SAD symptoms. The symptoms of 291 individuals with SAD in and near Groningen have been evaluated over the period 2003 to 2009. Meteorological factors linked to periods of low natural light (sunshine, global radiation, horizontal visibility, cloud cover and mist) and others (temperature, humidity and pressure) were obtained from weather observation stations. A Bayesian zero adjusted auto-correlated multilevel Poisson model was carried out to assess which variables influence the SAD symptom score BDI-II. The outcome of the study suggests that the variable sunshine duration, for both the current and previous week, and global radiation for the previous week, are significantly linked to SAD symptoms.
Nicholas Brown, Casper Albers & Stuart Ritchie (2017). Contesting the evidence for limited human lifespan. Nature, 546, E6-E7, BCA.
In their Letter, Dong et al. claimed that longitudinal mortality data indicate that human lifespan has a limit of around 115 years. We believe these authors' analyses, and, hence, their conclusions to be flawed. In this Comment, we outline four arguments to motivate our opinion.
Casper Albers and John Gower (2017). Visualising interactions in bi- and triadditive models for three-way tables. Chemometrics and Intelligent Laboratory Systems, 167: 238-247.
This paper concerns the visualisation of interaction in three-way arrays. It extends some standard ways of visualising biadditive modelling for two-way data to the case of three-way data. Three-way interaction is modelled by the Parafac method as applied to interaction arrays that have main effects and biadditive terms removed. These interactions are visualised in three and two dimensions. We introduce some ideas to reduce visual overload that can occur when the data array has many entries. Details are given on the interpretation of a novel way of representing rank-three interactions accurately in two dimensions. The discussion has implications regarding interpreting the concept of interaction in three-way arrays.
Casper Albers (2017). Slim tellen. Nieuw Archief voor Wiskunde, 5/18(1): 23-25.

Lieke Voncken, Casper Albers and Marieke Timmerman (2017). Model selection in continuous test norming with GAMLSS. Assessment, (accepted for publication).
To compute norms from reference group test scores, continuous norming is preferred over traditional norming. A suitable continuous norming approach for continuous data is the use of the Box–Cox Power Exponential model, which is found in the generalized additive models for location, scale, and shape. Applying the Box–Cox Power Exponential model for test norming requires model selection, but it is unknown how well this can be done with an automatic selection procedure. In a simulation study, we compared the performance of two stepwise model selection procedures combined with four model fit criteria (Akaike information criterion, Bayesian information criterion, generalized Akaike information criterion (3), cross-validation), varying data complexity, sampling design, and sample size in a fully crossed design. The new procedure combined with one of the generalized Akaike information criteria was the most efficient model selection procedure (i.e., required the smallest sample size). The advocated model selection procedure is illustrated with norming data of an intelligence test.
Anja Ernst & Casper Albers (2017). Regression assumptions in clinical psychology research practice - A systematic review of common misconceptions. PeerJ, 5:e3323.
Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking.
Mandy van der Gaag, Saskia Kunnen, Casper Albers (2017). Micro-level mechanisms of identity development: the role of emotional experiences in commitment development. Developmental Psychology, 53(11), 2205-2217.
Based on Marcia’s theory, many researchers consider exploration and commitment as the main processes in identity development. Although some identity theorists have hypothesized that emotional experience may also be an important part of the mechanisms of identity development, empirical research to investigate this claim has been lagging behind. In this study, we shed light on the role of emotional experiences in micro-level commitment dynamics, and compare this to the role of exploration. We take a within-individual approach, and particularly focus on educational commitment. We collected weekly measurements among 103 first year university students over several months, resulting in 22 to 30 measurements for each individual. Every week, the students reported an important experience and accompanying positive and negative emotions, their level of educational exploration and commitment. We generated linear growth models for each individual separately, using Dynamic Linear Modeling. These individual models generate regression weights that indicate how strong the impact is of exploration, positive and negative emotional experiences on changes in micro-level commitment for each individual. Our main finding is that both positive and negative emotional experiences are indeed related to changes in educational commitment. Positive experiences, but surprisingly, also negative experiences, are related to increases in educational commitment for the majority of individuals. Moreover, for the large majority of individuals, the impact of emotional experiences is larger than the impact of exploration. Therefore, we conclude that it is highly likely that emotional experiences are an essential part of the micro-level mechanisms of identity development.
Rob Meijer, Anja Boeve, Jorge Tendeiro, Roel Bosker, Casper Albers (2017). The Use of Subscores in Higher Education: When Is This Useful?. Frontiers in Psychology: Educational Psychology, 8(305).
Assessment in higher education is challenging because teachers face more students, with less contact time as compared to primary and secondary education. Therefore, teachers and management are often interested in efficient ways of giving students diagnostic feedback and providing information on the basis of subscores is one method that is often used in large-scale standardized testing. In this article we discuss some recent psychometric literature that warns against the use of subscores in addition to the use of total scores. We illustrate how the added value of subscores can be evaluated using two college exams: A multiple choice exam and a combined open-ended question and multiple choice exam; these formats are often used in higher education and represent cases in which using subscores may be informative. We discuss the implications of our findings for future classroom evaluation.
Anja Boeve, Rob Meijer, Roel Bosker, Jorien Vugteveen, Rink Hoekstra and Casper Albers (2017). Implementing the flipped classroom: An exploration of study behaviour and student performance. Higher Education, 74(6): 1015-1032.
The flipped classroom is becoming more popular as a means to support student learning in higher education by requiring students to prepare before lectures and actively engaging students during lectures. While some research has been conducted into student performance in the flipped classroom, students’ study behaviour throughout a flipped course has not been investigated. This study explored students’ study behaviour throughout a flipped and regular course by means of bi-weekly diaries. Furthermore, student references to their learning regulation were explored in course evaluations. Results from the diaries showed that students’ study behaviour in the flipped course did not appear to be very different from students in a regular course. Furthermore, study behaviour did not appear strongly related to student performance in both the flipped and regular course. Exploration of student references to their learning regulation in the course evaluations showed that some students experienced the flipped course design as intended to support their learning process. Other students, however, demonstrated resistance to changing their study behaviour even though changing study behaviour is expected in order to benefit from the flipped classroom. Further research on the relationship between students’ learning regulation and actual study behaviour and course results is necessary to understand when and why implementing the flipped classroom is successful. Recommendations that may help more effective flipped classroom implementation include considering the prior history between students and instructor(s), the broader curriculum context, and frequent expectation communication especially with large numbers of students and non-mandatory lecture attendance.


Casper Albers (2016). Er is er een jarig. Nieuw Archief voor Wiskunde, 5/17(4): 273-274.

Tanja Krone, Casper Albers and Marieke Timmerman (2016). Bayesian dynamic modelling to assess differential treatment effects on panic attack frequencies. Statistical Modelling, 16(5): 343-359.
To represent the complex structure of intensive longitudinal data of multiple individuals, we propose a hierarchical Bayesian Dynamic Model (BDM). This BDM is a generalized linear hierarchical model where the individual parameters do not necessarily follow a normal distribution. The model parameters can be estimated on the basis of relatively small sample sizes and in the presence of missing time points. We present the BDM and discuss the model identification, convergence and selection. The use of the BDM is illustrated using data from a randomized clinical trial to study the differential effects of three treatments for panic disorder. The data involves the number of panic attacks experienced weekly (73 individuals, 10–52 time points) during treatment. Presuming that the counts are Poisson distributed, the BDM considered involves a linear trend model with an exponential link function. The final model included a moving average parameter and an external variable (duration of symptoms pre-treatment). Our results show that cognitive behavioural therapy is less effective in reducing panic attacks than serotonin selective re-uptake inhibitors or a combination of both. Post hoc analyses revealed that males show a slightly higher number of panic attacks at the onset of treatment than females.
Casper Albers, Tom van der Meer (2016). Misplaatste angst voor bètaficering sociale wetenschappen. Vakwerk (members' magazine of Beter Onderwijs Nederland).

Casper Albers, Tom van der Meer (2016). Misplaatste angst voor bètaficering sociale wetenschappen. (blog post).

Tanja Krone, Casper Albers, Marieke Timmerman (2016). Comparison of Estimation Procedures for Multilevel AR(1) Models. Frontiers in Psychology, 7(486).
To estimate a time series model for multiple individuals, a multilevel model may be used. In this paper we compare two estimation methods for the autocorrelation in Multilevel AR(1) models, namely Maximum Likelihood Estimation (MLE) and Bayesian Markov Chain Monte Carlo. Furthermore, we examine the difference between modeling fixed and random individual parameters. To this end, we perform a simulation study with a fully crossed design, in which we vary the length of the time series (10 or 25), the number of individuals per sample (10 or 25), the mean of the autocorrelation (-0.6 to 0.6 inclusive, in steps of 0.3) and the standard deviation of the autocorrelation (0.25 or 0.40). We found that the random estimators of the population autocorrelation show less bias and higher power, compared to the fixed estimators. As expected, the random estimators profit strongly from a higher number of individuals, while this effect is small for the fixed estimators. The fixed estimators profit slightly more from a higher number of time points than the random estimators. When possible, random estimation is preferred to fixed estimation. The difference between MLE and Bayesian estimation is nearly negligible. The Bayesian estimation shows a smaller bias, but MLE shows a smaller variability (i.e., standard deviation of the parameter estimates). Finally, better results are found for a higher number of individuals and time points, and for a lower individual variability of the autocorrelation. The effect of the size of the autocorrelation differs between outcome measures.
Jan Gerard Hoendervanger, Iris de Been, Nico van Yperen, Mark Mobach and Casper Albers (2016). Flexibility in use: Switching behaviour and satisfaction in activity-based work environments. Journal of Corporate Real Estate, 18(1), 48-62.
Despite their growing popularity among organisations, satisfaction with activity-based work (ABW) environments is found to be below expectations. Research also suggests that workers typically do not switch frequently, or not at all, between different activity settings. Hence, the purpose of this study is to answer two main questions: Is switching behaviour related to satisfaction with ABW environments? Which factors may explain switching behaviour? Design/methodology/approach: Questionnaire data provided by users of ABW environments (n = 3,189) were used to carry out ANOVA and logistic regression analyses. Findings: Satisfaction ratings of the 4 per cent of the respondents who switched several times a day appeared to be significantly above average. Switching frequency was found to be positively related to heterogeneity of the activity profile, share of communication work and external mobility. Practical implications: Our findings suggest that satisfaction with ABW environments might be enhanced by stimulating workers to switch more frequently. However, as strong objections against switching were observed and switching frequently does not seem to be compatible with all work patterns, this will presumably not work for everyone. Many workers are likely to be more satisfied if provided with an assigned (multifunctional) workstation. Originality/value: In a large representative sample, clear evidence was found for relationships between behavioural aspects and appreciation of ABW environments that had not been studied previously.
Casper Albers, Otto Kardaun, Willem Schaafsma (2016). Assigning probabilities to hypotheses in the context of a binomial distribution. Brazilian Journal of Probability and Statistics, 30(1), 127-144.
Given is the outcome s of S ~ B(n,p) (n known, p fully unknown) and two numbers 0 < a <= b < 1. Required are probabilities alpha(<, s), alpha(0, s), and alpha(>, s) of the hypotheses H(<): p < a, H(0): a <= p <= b, H(>): p > b, such that their sum is equal to 1. The degenerate case a = b is of special interest. A method, optimal with respect to a class of functions, is derived under Neyman-Pearsonian restrictions, and applied to a case from medicine.
Casper Albers, Rob Meijer and Jorge Tendeiro (2016). Derivation and Applicability of Asymptotic Results for Multiple Subtests Person-Fit Statistics. Applied Psychological Measurement, 40(4), 274-288.
In high-stakes testing, it is important to check the validity of individual test scores. Although a test may, in general, result in valid test scores for most test takers, for some test takers test scores may not provide a good description of a test taker's proficiency level. Person-fit statistics have been proposed to check the validity of individual test scores. In this study we first discuss the theoretical asymptotic sampling distribution of two person-fit statistics that can be used for tests that consist of multiple subtests. Second, we conducted a simulation study to investigate the applicability of this asymptotic theory for tests of finite length, in which we varied the correlation between subtests and number of items in the subtests. We showed that these distributions provide reasonable approximations, even for tests consisting of subtests of only 10 items each. These results have practical value because researchers do not have to rely on extensive simulation studies to simulate sampling distributions.


Tanja Krone, Casper Albers and Marieke Timmerman (2015). A comparative simulation study of AR(1) estimators in short time series. Quality & Quantity, 1-21.
Various estimators of the autoregressive model exist. We compare their performance in estimating the autocorrelation in short time series. In Study 1, under correct model specification, we compare the frequentist r1 estimator, C-statistic, ordinary least squares estimator (OLS) and maximum likelihood estimator (MLE), and a Bayesian method, considering flat (Bf) and symmetrized reference (Bsr) priors. In a completely crossed experimental design we vary lengths of time series (i.e., T = 10, 25, 40, 50 and 100) and autocorrelation (from −0.90 to 0.90 with steps of 0.10). The results show a lowest bias for the Bsr, and a lowest variability for r1. The power in different conditions is highest for Bsr and OLS. For T = 10, the absolute performance of all measurements is poor, as expected. In Study 2, we study robustness of the methods through misspecification by generating the data according to an ARMA(1,1) model, but still analysing the data with an AR(1) model. We use the two methods with the lowest bias for this study, i.e., Bsr and MLE. The bias gets larger when the non-modelled moving average parameter becomes larger. Both the variability and power show dependency on the non-modelled parameter. The differences between the two estimation methods are negligible for all measurements.
Casper Albers (2015). Dutch research funding, gender bias and Simpson's paradox. Proceedings of the National Academy of Sciences, 112(50), E6828-E6829.
Based on, amongst others, three consecutive years of grant applications to the VENI programme of NWO, Van der Lee and Ellemers conclude that these data "provide compelling evidence of gender bias in personal grant applications to obtain research funding". This conclusion is based on the application of an inappropriate statistical procedure and therefore questionable, due to the so-called Simpson's paradox.
Anja Boevé, Rob Meijer, Casper Albers, Yta Beetsma, Roel Bosker (2015). Introducing Computer-Based Testing in High-Stakes Exams in Higher Education: Results of a Field Experiment. PLoS ONE, 10(12), e0143616.
The introduction of computer-based testing in high-stakes examining in higher education is developing rather slowly due to institutional barriers (the need for extra facilities, ensuring test security) and teacher and student acceptance. From the existing literature it is unclear whether computer-based exams will yield results similar to those of paper-based exams and whether student acceptance can change as a result of administering computer-based exams. In this study, we compared results from a computer-based and paper-based exam in a sample of psychology students and found no differences in total scores across the two modes. Furthermore, we investigated student acceptance and change in acceptance of computer-based examining. After taking the computer-based exam, fifty percent of the students preferred paper-and-pencil exams over computer-based exams and about a quarter preferred a computer-based exam. We conclude that computer-based exam total scores are similar to paper-based exam scores, but that for the acceptance of high-stakes computer-based exams it is important that students practice and get familiar with this new mode of test administration.
Casper Albers (2015). Het referendum van GeenPeil en het Prisoners dilemma. (blog post).
GeenPeil has managed to collect enough signatures to force an advisory referendum on the association agreement with Ukraine. The legal construction of the referendum contains a strange design flaw: for supporters of the treaty, it is not clear in advance whether it is wiser to vote or to abstain. In this blog post I explain how this works.
Casper Albers (2015). NWO, Gender bias and Simpson's Paradox. (blog post).
In a recent paper, it is outlined that there is significant gender inequality in the funding by the NWO Veni programme. In my blog post, I outline that the significance is lost if Simpson's paradox is taken into account.
Casper Albers (2015). NWO, Discriminatie en de Simpsonparadox. (blog post).
A recent paper describes how, according to its authors, there is significant gender inequality in the awarding of VENI grants by NWO. In this blog post I show that the significance disappears once Simpson's paradox is taken into account.
Casper Albers and John Gower (2015). (Interactive) Visualisation of Threeway Data. Francesco Mola, CUEC Editrice (eds.). CLADAG 2015: 10th Scientific Meeting of the Classification and Data Analysis Group of the Italian Statistical Society. Book of Abstracts. ISBN 978 88 8467 749 9, p. 74-77.

Casper Albers, Anja Boevé and Rob Meijer (2015). A critique to Akdemir and Oguz (2008): Methodological and statistical issues to consider when conducting educational experiments. Computers and Education, 87(September), 238-242.
In the paper "Computer-based testing: An alternative for the assessment of Turkish undergraduate students", Akdemir and Oguz (2008) discuss an experiment to compare student performance in paper-and-pencil tests with computer-based tests, and conclude that students taking computer-based tests do not underperform compared to students taking paper-and-pencil tests. In this letter, we indicate two severe methodological and statistical flaws in this paper. We show how, in general, such flaws can affect experimental research. Due to these flaws, the conclusions by Akdemir and Oguz are unfounded: one cannot reach these conclusions on the basis of this design and analysis. We provide a set of guidelines and advice to avoid methodological problems when setting up an educational experiment.
Marieke Schuppert, Casper Albers, Ruud Minderaa, Paul Emmelkamp and Maaike Nauta (2015). Severity of borderline personality symptoms in adolescence: relationship with maternal parenting stress, maternal psychopathology, and rearing styles. Journal of Personality Disorders, 29(3): 289-302.
The development of borderline personality symptomatology has been associated with environmental factors, including parenting styles and parental psychopathology. However, most studies were conducted retrospectively, and their results may be influenced by recall bias. Few studies have examined current parental rearing styles and parental psychopathology in relationship to borderline personality (BPD) symptoms in adolescents. Moreover, parenting stress has not been examined in this group. To address this, 101 adolescents (age 14-19 years) with BPD symptoms and their mothers were included in this study. Assessments were made on the severity of BPD symptoms, youth-perceived maternal rearing styles, and on psychopathology (including BPD symptoms) and parenting stress in mothers. Multiple regression analyses were used to examine potential predictors of borderline severity. Contrary to expectations, there was no correlation between the severity of BPD symptoms in adolescents and maternal parenting stress. Only youth-perceived maternal overprotection was significantly related to BPD severity. Further, the combination of perceived maternal rejection with cluster B personality traits in mothers was significantly related to BPD severity in adolescents. This study provides a contribution to the disentanglement of the developmental pathways that lead to this complex and invalidating disorder.


Casper Albers (2014). Using statistics for truly understanding psychological processes. Entry on the Heymans Institute Research Blog, 9 September.
The classical way of conducting experiments in empirical psychology is useful for understanding psychological constructs. However, using measurement-intensive longitudinal data is essential when you wish to truly understand psychological processes. For analyzing such data, new types of methods are required. Casper Albers explains the merits of these methods.
Mariska Barendse, Casper Albers, Frans Oort and Marieke Timmerman (2014). Measurement bias detection through Bayesian factor analysis. Frontiers in Psychology - Quantitative Psychology and Measurement, 5:1087.
Measurement bias has been defined as a violation of measurement invariance. Potential violators - variables that possibly violate measurement invariance - can be investigated through restricted factor analysis (RFA). The purpose of the present paper is to investigate a Bayesian approach to estimate RFA models with interaction effects, in order to detect uniform and nonuniform measurement bias. Because modeling nonuniform bias requires an interaction term, it is more complicated than modeling uniform bias. The Bayesian approach seems especially suited for such complex models. In a simulation study we vary the type of bias (uniform, nonuniform), the type of violator (observed continuous, observed dichotomous, latent continuous), and the correlation between the trait and the violator (0.0, 0.5). For each condition, 100 sets of data are generated and analyzed. We examine the accuracy of the parameter estimates and the performance of two bias detection procedures, based on the DIC fit statistic, in Bayesian RFA. Results show that the accuracy of the estimated parameters is satisfactory. Bias detection rates are high in all conditions with an observed violator, and still satisfactory in all other conditions.
Casper Albers (2014). Wie beter onderwijs wil, moet er meer geld voor over hebben. De Volkskrant, Opinie & Debat, 9 August.

Casper Albers and John Gower (2014). A Contribution to the Visualisation of Three-Way Arrays. Journal of Multivariate Analysis, 132: 1-8.
Visualisations of two-way arrays are well-understood. Here, a procedure, with geometric underpinning, is given for visualising rank-two three-way arrays in two-dimensions.
Casper Albers and John Gower (2014). Canonical Analysis: Ranks, Ratios and Fits. Journal of Classification, 31(1): 2-27.
Measurements of p variables for n samples are collected into an n by p matrix X, where the samples belong to one of k groups. The group means are separated by Mahalanobis distances. CVA optimally represents the group means of X in an r-dimensional space. This can be done by maximising a ratio criterion (basically one-dimensional) or, more flexibly, by minimising a rank-constrained least-squares fitting criterion (which is not confined to being one-dimensional but depends on defining an appropriate Mahalanobis metric). In modern n < p problems, where W is not of full rank, the ratio criterion is shown not to be coherent but the fit criterion, with attention to associated metrics, readily generalises. In this context we give a unified generalisation of CVA, introducing two metrics, one in the range space of W and the other in the null space of W, that have links with Mahalanobis distance. This generalisation is computationally efficient, since it requires only the spectral decomposition of an n by n matrix.


Casper Albers, Liesbet Heyse, Jacob Dijkstra, et al. (2013). Werkbelevingsonderzoek medewerkers Faculteit GMW. Available to GMW staff via link, to others on request.
The influence of the GMW work context on outcomes of the work process.
Casper Albers (2013). Rankings pimpen hoort niet. UK (Universiteitskrant Groningen), 24 September.
The university is guilty of 'questionable practices', says assistant professor Casper Albers, by 'pimping' favourable rankings and sweeping poor ones under the carpet.
Osvaldo Anacleto, Catriona Queen and Casper Albers (2013). Forecasting multivariate road traffic flows using Bayesian dynamic graphical models, splines and other traffic variables. Australian and New Zealand Journal of Statistics, 55(2): 69-86.
Traffic flow data are routinely collected for many networks worldwide. These invariably large data sets can be used as part of a traffic management system, for which good traffic flow forecasting models are crucial. The linear multiregression dynamic model (LMDM) has been shown to be promising for forecasting flows, accommodating multivariate flow time series, while being a computationally simple model to use. While statistical flow forecasting models usually base their forecasts on flow data alone, data for other traffic variables are also routinely collected. This paper shows how cubic splines can be used to incorporate extra variables into the LMDM in order to enhance flow forecasts. Cubic splines are also introduced into the LMDM to parsimoniously accommodate the daily cycle exhibited by traffic flows.
The proposed methodology allows the LMDM to provide more accurate forecasts when forecasting flows in a real high-dimensional traffic data set. The resulting extended LMDM can deal with some important traffic modelling issues not usually considered in flow forecasting models. Additionally the model can be implemented in a real-time environment, a crucial requirement for traffic management systems designed to support decisions and actions to alleviate congestion and keep traffic flowing.
Keywords: linear multiregression dynamic model, dynamic linear model, state space models, cubic splines, occupancy, headway, speed.
Osvaldo Anacleto, Catriona Queen and Casper Albers (2013). Multivariate forecasting of road traffic flows in the presence of heteroscedasticity and measurement errors. Journal of the Royal Statistical Society, Series C: Applied Statistics, 62(2): 251 - 270.
Linear multiregression dynamic models (LMDMs), which combine a graphical representation of a multivariate time series with a state space model, have been shown to be a promising class of models for forecasting of traffic flow data. Analysis of flows at a busy motorway intersection near Manchester, UK, highlights two important modelling issues: accommodating different levels of traffic variability depending on the time of day and accommodating measurement errors occurring due to data collection errors. This paper extends LMDMs to address these issues. Additionally, the paper investigates how close the approximate forecast limits usually used with the LMDM are to the true, but not so readily available, forecast limits.
Key words: data collection error; dynamic linear model; linear multiregression dynamic model; traffic modelling; variance law.


Casper Albers (2012). Nate Silver is not a witch, the frequentist says. Significance Online, Web Exclusive Article, 15 November.
A response to "Is Nate Silver a witch?" published on Monday 12 November by Linda Wijlaars. In her web article Linda Wijlaars makes fun of frequentist statisticians because, according to Wijlaars, they are not as cool as Bayesians. Here, I shall explain that by employing a proper frequentist approach, one can show that Bayesians and frequentists are just as cool. The flaw in Wijlaars' explanation is pointed out and a perfectly acceptable frequentist way to compute the probability of Nate Silver being a witch is presented. This method fully coincides with the Bayesian method, showing that Bayesians and frequentists are not as dissimilar as they often claim.
Marieke Schuppert, Casper Albers, Ruud Minderaa, Paul Emmelkamp and Maaike Nauta (2012). Parental rearing and psychopathology in mothers of adolescents with and without borderline personality symptoms. Child and Adolescent Psychiatry and Mental Health, 6: 29.
Background: A combination of multiple factors, including a strong genetic predisposition and environmental factors, are considered to contribute to the developmental pathways to borderline personality disorder (BPD). However, these factors have mostly been investigated retrospectively, and hardly in adolescents. The current study focuses on maternal factors in BPD features in adolescence.
Methods: Actual parenting was investigated in a group of referred adolescents with BPD features (N=101) and a healthy control group (N=44). Self-reports of perceived concurrent parenting were completed by the adolescents. Questionnaires on parental psychopathology (both Axis I and Axis II disorders) were completed by their mothers.
Results: Adolescents reported significantly less emotional warmth, more rejection and more overprotection from their mothers in the BPD group than in the control group. Mothers in the BPD group reported significantly more parenting stress compared to mothers in the control group. Also, these mothers showed significantly more general psychopathology and cluster C personality traits than mothers in the control group. Contrary to expectations, mothers of adolescents with BPD features reported the same level of cluster B personality traits, compared to mothers in the control group. Hierarchical logistic regression revealed that parental rearing styles (less emotional warmth, and more overprotection) and general psychopathology of the mother were the strongest factors differentiating between controls and adolescents with BPD symptoms.
Conclusions: Adolescents with BPD features experience less emotional warmth and more overprotection from their mothers, while the mothers themselves report more symptoms of anxiety and depression. Addition of family interventions to treatment programs for adolescents might increase the effectiveness of such early interventions, and prevent the adverse outcome that is often seen in adult BPD patients.

Keywords: Borderline personality disorder, adolescent, rearing styles, maternal psychopathology

Attained the certificate Highly Accessed.


John Gower and Casper Albers (2011). Between-Group Metrics. Journal of Classification, 28(3): 315-326.
In canonical analysis with more variables than samples, it is shown that, as well as the usual canonical means in the range-space of the within-groups dispersion matrix, canonical means may be defined in its null space. In the range space we have the usual Mahalanobis metric; in the null space explicit expressions are given and interpreted for a new metric.
Casper Albers, Frank Critchley and John Gower (2011). Applications of Quadratic Minimisation Problems in Statistics. Journal of Multivariate Analysis, 102 (3): 714 - 722.
Albers et al. (2011) showed that the problem min_x (x - t)'A(x - t) subject to x'Bx + 2b'x = k, where A is positive definite or positive semi-definite, has a unique computable solution. Here, several statistical applications of this problem are shown to generate special cases of the general problem that may all be handled within a general unifying methodology. These include non-trivial considerations that arise when (i) A and/or B are not of full rank and (ii) where B is indefinite. General canonical forms for A and B that underpin the minimisation methodology give insight into structure that informs understanding.
Casper Albers, Frank Critchley and John Gower (2011). Quadratic Minimisation Problems in Statistics. Journal of Multivariate Analysis, 102 (3): 698 - 713.
We consider the problem min_x (x - t)'A(x - t) subject to x'Bx + 2b'x = k, where A is positive definite or positive semi-definite. Variants of this problem are discussed within the framework of a general unifying methodology. These include non-trivial considerations that arise when (i) A and/or B are not of full rank and (ii) t takes special forms (especially t = 0 which, under further conditions, reduces to the well-known two-sided eigenvalue solution). Special emphasis is placed on insights provided by geometrical interpretations.
Animations can be found here.
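To give a feel for the geometry of the constrained problem in this abstract, here is a minimal Python sketch of its simplest special case, assuming A = I, B = I and b = 0. The constraint then describes a sphere of radius sqrt(k), and the minimiser is t rescaled onto that sphere. This is only an illustration of one special case under those assumptions, not the general algorithm of the paper; the function name is hypothetical.

```python
import math


def nearest_point_on_sphere(t, k):
    """Minimise (x - t)'(x - t) subject to x'x = k, i.e. A = I, B = I, b = 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    norm_t = math.sqrt(sum(v * v for v in t))
    if norm_t == 0:
        # With t = 0 every point on the sphere is equally close, so no unique answer.
        raise ValueError("t = 0: the minimiser is not unique in this special case")
    scale = math.sqrt(k) / norm_t
    return [scale * v for v in t]


# Hypothetical target point and constraint value.
t = [3.0, 4.0]
x = nearest_point_on_sphere(t, k=4.0)
print(x, sum(v * v for v in x))
```

The explicit error for t = 0 reflects that special forms of t need separate treatment, as the abstract notes.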


Casper Albers and John Gower (2010). A general approach to handling missing values in Procrustes analysis. Advances in Data Analysis and Classification, 4 (4): 223 - 237.
General Procrustes analysis is concerned with transforming a set of given configuration matrices to closest agreement. This paper introduces an approach useful for handling missing values in the configuration matrices in the context of general linear transformations. Centring and/or standardisation are allowed. Simplifications occur in the important case where the transformations are orthogonal. In the most general case, an interesting quadratic constrained optimisation problem appears.


Catriona Queen and Casper Albers (2009). Intervention and causality: forecasting traffic flows using a dynamic Bayesian network. Journal of the American Statistical Association, 104 (486): 669 - 681.
Real-time traffic flow data across entire networks can be used in a traffic management system to monitor current traffic flows so that traffic can be directed and managed efficiently. Reliable short-term forecasting models of traffic flows are crucial for the success of any traffic management system.
The model proposed in this paper for forecasting traffic flows is a multivariate Bayesian dynamic model called the multiregression dynamic model (MDM). This model is an example of a dynamic Bayesian network and is designed to preserve the conditional independences and causal drive exhibited by the traffic flow series. Sudden changes can occur in traffic flow series in response to such events as traffic accidents or roadworks. A traffic management system is particularly useful at such times of change. To ensure that the associated forecasting model continues to produce reliable forecasts, despite the change, the MDM uses the technique of external intervention. This paper will demonstrate how intervention works in the MDM and how it can improve forecast performance at times of change.
External intervention has also been used in the context of Bayesian networks to identify causal relationships between variables, and in dynamic Bayesian networks to identify lagged causal relationships between time series. This paper goes beyond the identification of lagged causal relationships previously addressed using intervention in dynamic Bayesian networks, to show how intervention in the MDM can be used to identify contemporaneous causal relationships between time series. The data used in this paper are available as supplemental material on the JASA website.


Catriona Queen and Casper Albers (2008). Forecasting traffic flows in road networks: A graphical dynamic model approach. In: International Institute of Forecasters (eds.), Proceedings of the 28th International Symposium of Forecasting.
Congestion on roads is a major problem worldwide. Many roads now have induction loops implanted into the road surface providing real-time traffic flow data. These data can be used in a traffic management system to monitor current traffic flows in a network so that traffic can be directed and managed efficiently. Reliable short-term forecasting and monitoring models of traffic flows are crucial for the success of any traffic management system.
Traffic flow data are invariably multivariate so that the flows of traffic upstream and downstream of a particular data collection site S in the network are very informative about the flows at site S. Despite this, most of the short-term forecasting models of traffic flows are univariate and consider the flow at site S in isolation. In this paper we use a Bayesian graphical dynamic model called the Linear Multiregression Dynamic Model (LMDM) for forecasting traffic flow. An LMDM is a multivariate model which uses a graph in which the nodes represent time series of flows at the various data collection sites, and the links between nodes represent the conditional independence and causal structure between flows at different sites. All computation in LMDMs is performed locally, so that model computation is always simple, even for arbitrarily complex road networks. This allows the model to work in real-time, as required by any traffic management system. LMDMs are also non-stationary and can readily accommodate changes in traffic flows. This is an essential property for any model for use with traffic management systems where series often exhibit temporary changes due to congestion or accidents, for example. Finally, LMDMs are often easily interpretable by non-statisticians, making them easy to use and understand.
The paper will focus on the problem of forecasting traffic flows in two separate motorway networks in the UK.
Casper Albers (2008). Some quadratic optimisation problems in psychometrics. In: K. Shigemasu, A. Okada, T.Imaizumi, and T. Hoshino (Eds.) New Trends in Psychometrics. Tokyo: Universal Academic Press, pp. 1 - 6.
In this paper, I discuss various examples arising from different areas in psychometrics. I will show that they have a common background and can be solved using a constrained quadratic optimisation algorithm developed by Albers et al. (2009).
Casper Albers and Willem Schaafsma (2008). Goodness of fit testing using a specific density estimate. Statistics and Decisions, 26 (1): 3 - 23.
To test the hypothesis H0: f = ψ that an unknown density f is equal to a specified one, ψ, an estimate f-hat of f is compared with ψ. The total variation distance || f-hat - ψ||1 is used as test statistic. The density estimate f-hat considered is a peculiar one. A table of critical values is provided; this table is applicable for arbitrary ψ. Relations with other methods, Neyman's smooth tests in particular, are discussed and power comparisons are performed. In certain situations, our test is recommendable. An example from practice is provided.
Tables with critical values can be found here
Catriona Queen, Ben Wright and Casper Albers (2008). Forecast covariances in the linear multiregression dynamic model. Journal of Forecasting, 27 (2): 175 - 191.
The linear multiregression dynamic model (LMDM) is a Bayesian dynamic model which preserves any conditional independence and causal structure across a multivariate time series. The conditional independence structure is used to model the multivariate series by separate (conditional) univariate dynamic linear models, where each series has contemporaneous variables as regressors in its model. Calculating the forecast covariance matrix (which is required for calculating forecast variances in the LMDM) is not always straightforward in its current formulation. In this paper we introduce a simple algebraic form for calculating LMDM forecast covariances. Calculation of the covariance between model regression components can also be useful and we shall present a simple algebraic method for calculating these component covariances. In the LMDM formulation, certain pairs of series are constrained to have zero forecast covariance. We shall also introduce a possible method to relax this restriction.


Casper Albers, Frank Critchley and John Gower (2007). Group Average Representations in Euclidean Distance Cones. Pp. 445 - 454 in: P. Brito, P. Bertrand, G. Cucumel, F. de Carvalho (eds.), "Selected Contributions in Data Analysis and Classification", Studies in Classification, Data Analysis, and Knowledge Organization-Series, Springer-Verlag.
The set of Euclidean distance matrices has a well-known representation as a convex cone. The problems of representing the group averages of K distance matrices are discussed, but not fully resolved, in the context of SMACOF, Generalized Orthogonal Procrustes Analysis and Individual Differences Scaling. The polar (or dual) cone representation, corresponding to inner-products around a centroid, is also discussed. Some new characterisations of distance cones in terms of circumhyperspheres are presented.
Catriona Queen, Ben Wright and Casper Albers (2007). Eliciting a directed acyclic graph for a multivariate time series of vehicle counts in a traffic network. Australian and New Zealand Journal of Statistics, 49 (3): 1 - 19.
The problem of modelling multivariate time series of vehicle counts in traffic networks is considered. It is proposed to use a model called the linear multiregression dynamic model (LMDM). The LMDM is a multivariate Bayesian dynamic model which uses any conditional independence and causal structure across the time series to break down the complex multivariate model into simpler univariate dynamic linear models.
The conditional independence and causal structure in the time series can be represented by a directed acyclic graph (DAG). The DAG not only gives a useful pictorial representation of the multivariate structure, but it is also used to build the LMDM. Therefore, eliciting a DAG which gives a realistic representation of the series is a crucial part of the modelling process.
A DAG is elicited for the multivariate time series of hourly vehicle counts at the junction of three major roads in the UK. A flow diagram is introduced to give a pictorial representation of the possible vehicle routes through the network. It is shown how this flow diagram, together with a map of the network, can suggest a DAG for the time series suitable for use with an LMDM.


Casper Albers, Ritsert Jansen, Jan Kok, Oscar Kuipers and Sacha van Hijum (2006). Supplementary Material to: SIMAGE: Simulation of DNA MicroArray Gene Expression data. Technical Report Groningen Bioinformatics Center.
In this report, we provide technical background to the paper "SIMAGE: Simulation of DNA MicroArray Gene Expression data".
Casper Albers, Ritsert Jansen, Jan Kok, Oscar Kuipers and Sacha van Hijum (2006). SIMAGE: SImulation of DNA-microarray Gene Expression data. BMC Bioinformatics, 7 (205).
Simulation of DNA-microarray data serves at least three purposes: (i) optimizing the design of an intended DNA microarray experiment, (ii) comparing existing pre-processing and processing methods for best analysis of a given DNA microarray experiment, (iii) educating students, lab-workers and other researchers by making them aware of the many factors influencing DNA microarray experiments.

Our model has multiple layers of factors influencing the experiment. The relative influence of such factors can differ significantly between labs, experiments within labs, etc. Therefore, we have added a module to roughly estimate their parameters from a given data set. This guarantees that our simulated data mimics real data as closely as possible.

We introduce a model for the simulation of dual-dye cDNA-microarray data closely resembling real data and coin the model and its software implementation "SIMAGE" which stands for simulation of microarray gene expression data. The software is freely accessible at:


Casper Albers, Otto Kardaun, Willem Schaafsma, Ton Steerneman, Alfred Stein (2005). Foundational Issues in Statistical Inference. Open University Statistics Group Technical Reports Series, #05/14.
Statistical inference is about using statistical data (x) to formulate an opinion about something that is defined well, but unknown (y). Testing a hypothesis H about y is one of the possibilities, the estimation or prediction of y is another one. We concentrate the attention on estimation or prediction in the sense that an opinion is required in the form of a probability distribution Q = Q(x) on the space Y of all theoretical possibilities. The data x being statistical, it is natural to incorporate probabilistic arguments in the context to let x speak about y. Assuming that (x; y) is the outcome of a pair (X; Y) of random variables (in the sense of probability theory), the 'true' distribution P of (X; Y) exists. It may be exactly known in simulations and in thought experiments, but it is only partially known in real-world investigations. That is why the context to let x speak about y will involve at least some specification of a family of theoretically possible P's. We assume that the probabilistic aspects of the situation are sufficiently convincing to aim at a probabilistic form of the opinion about y, given nature's message x and 'the context'. If a probability statement is needed about some hypothesis H with respect to y, then we construct an estimator or predictor α of the truth value of H and, if the estimator seems reasonable, we use α(x) as the (epistemic) probability of H. If a distributional inference is needed about a real-valued unknown y then, apart from using the Bayesian approach, we can construct an inference by defining its distribution function G_x such that, for any real z, G_x(z) is equal to α_z(x) where α_z is some estimator of the truth value of H.
Casper Albers, Gerlof de Roos and Willem Schaafsma (2005). Estimating a frequency unseen: an example from ornithology. Statistica Neerlandica, 59 (3): 397 - 413.
The second author is involved in a capture–mark–recapture study of some wader species. Part of his program deals with resight observations. On a particular day he visually inspects a fairly stable population to identify the ringed birds by reading their ring-number. Some ringed birds will be missed, so observations are repeated on other days. The issue of main interest is whether, after some repetitions, we can be sufficiently sure that all the ringed birds in the population have been identified or, equivalently, that the frequency of unseen birds is zero. Most current theory is concerned with an asymptotic setting. In our 'exact' context the emphasis is on the determination of the 'probability' that the frequency of unseen birds is zero. This issue is settled by considering the more general problem of 'estimating' the frequency of the unseen birds by providing a predictive inference in the form of a probability distribution. We develop methods of inference based on the assumption of a bird-independent probability p_i of identifying a ringed bird on day i, as well as without this assumption. In Section 5 we critically examine these approaches.
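A simplified calculation shows what the bird-independent assumption in this abstract implies. The sketch below additionally assumes that the daily identification probabilities and the number of ringed birds are known and that birds are observed independently of each other; the paper itself treats these quantities as unknown, so this is not its inference method. All names and numbers are hypothetical.

```python
from fractions import Fraction
from math import prod


def prob_bird_unseen(day_probs):
    """Probability that one ringed bird is missed on every observation day."""
    return prod(1 - Fraction(p) for p in day_probs)


def prob_all_seen(day_probs, n_ringed):
    """Probability that none of n_ringed birds stays unseen, assuming independent birds."""
    return (1 - prob_bird_unseen(day_probs)) ** n_ringed


# Hypothetical values: three observation days, two ringed birds.
day_probs = [Fraction(1, 2)] * 3
print(prob_bird_unseen(day_probs), prob_all_seen(day_probs, 2))
```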
Rudi Alberts, Jingyuan Fu, Morris Swertz, Casper Albers, Alrik Lubbers and Ritsert Jansen (2005). Combining microarrays and genetic analysis. Briefings in bioinformatics, 6 (2): 135 - 146.
Gene expression can be studied at a genome-wide scale with the aid of modern microarray technologies. Expression profiling of tens to hundreds of individuals in a genetic population can reveal the consequences of genetic variation. In this paper it is argued that the design and analysis of such a study is not a matter of simply applying the existing and more-or-less standard computational tools for microarrays to a new type of experimental data. It is shown how to fully exploit the power of genetics through optimal experimental design and analysis for two major microarray technologies, cDNA two-colour arrays and Affymetrix short oligonucleotide arrays.
Sacha van Hijum, Anne de Jong, Richard Baerends, Harma Karsens, Naomi Kramer, Rasmus Larsen, Chris den Hengst, Casper Albers, Jan Kok and Oscar Kuipers (2005). A generally applicable validation scheme for the assessment of factors involved in reproducibility and quality of DNA-microarray data. BMC Genomics, 6 (77).
In research laboratories using DNA-microarrays, usually a number of researchers perform experiments, each generating possible sources of error. There is a need for a quick and robust method to assess data quality and sources of errors in DNA-microarray experiments. To this end, a novel and cost-effective validation scheme was devised, implemented, and employed.

A number of validation experiments were performed on Lactococcus lactis IL1403 amplicon-based DNA-microarrays. Using the validation scheme and ANOVA, the factors contributing to the variance in normalized DNA-microarray data were estimated. Day-to-day as well as experimenter-dependent variances were shown to contribute strongly to the variance, while dye and culturing had a relatively modest contribution to the variance.

Even in cases where 90 % of the data were kept for analysis and the experiments were performed under challenging conditions (e.g. on different days), the CV was at an acceptable 25 %. Clustering experiments showed that trends can be reliably detected also from genes with very low expression levels. The validation scheme thus allows determining conditions that could be improved to yield even higher DNA-microarray data quality.
Casper Albers, Barteld Kooi and Willem Schaafsma (2005). Trying to resolve the two-envelope problem. Synthese, 145 (1): 89 - 109.
After explaining the well-known two-envelope 'paradox' by indicating the fallacy involved, we consider the two-envelope 'problem' of evaluating the 'factual' information provided to us in the form of the value contained by the envelope chosen first. We try to provide a synthesis of contributions from economy, psychology, logic, probability theory (in the form of Bayesian statistics), mathematical statistics (in the form of a decision-theoretic approach) and game theory. We conclude that the two-envelope problem does not allow a satisfactory solution. An interpretation is made for statistical science at large.


Casper Albers (2003). Distributional Inference: The Limits of Reason. PhD-Thesis, University of Groningen, The Netherlands.
Science advances by combining rational arguments and empirical information. In fields like philosophy and pure mathematics, emphasis is laid on the rational arguments, whilst in the applied sciences the collection and interpretation of data are the field of interest. In mathematical statistics one tries to combine these aspects. The primary goal is to make statistical inferences about something unknown. Such inferences can be of help in further discussion, e.g. in selecting a decision. The methods should not depend on 'the intentions that might be furthered by utilizing the knowledge inferred'. When the available data are too limited, then different procedures may yield different inferences. The statistician should refrain from providing a specific inference in case the differences are 'too large'. When such an inference can be given, this inference should be accompanied by a statement about the uncertainty of the inference. This could be done by providing a distributional inference, or by providing the results of different approaches.
An example is as follows. Ornithologist G.Th. de Roos is observing a population of Ruddy Turnstones (Arenaria interpres) on the Frisian island Vlieland. Some of these birds are ringed, but the ring-number is not always observable, e.g. because another bird is blocking the view. After how many days of observing is it safe to assume that all ringed birds in the population have been observed at least once? This question can be answered by constructing a distributional inference about the number of present, yet unseen, ringed birds, including a probability statement about the hypothesis that all ringed birds have been seen. Of course, the results depend in some way on the probabilistic assumptions one makes, and on the statistical principles one follows.
The first part of this thesis consists of 'finger exercises' illustrating that information about the unknown can only be of value if the mechanism generating the information is (sufficiently well) known. In probability theory, information is incorporated by conditioning on it. This generates difficulties in statistical practice, because unknown aspects are involved in the joint distribution of the random variables X and Y that are behind the observations x and the unknown y. Firstly, this is extensively exemplified by a die-rolling game. From the information 'the number of pips is even' one cannot conclude automatically: 'the probability that a six has been thrown equals one third'. The way in which the source of information operates should be incorporated in the statistical model. Secondly, a similar example, the two-envelope problem, is considered. Again, the difficulties involving the numerical specification of conditional probabilities are at the forefront.
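The die-rolling point can be made concrete with a small calculation. The sketch below assumes a hypothetical informant (not taken from the thesis) who, when a six is thrown, announces 'six' with probability q and otherwise reports only the parity. The parameter q and the function name are illustrative assumptions.

```python
from fractions import Fraction


def prob_six_given_even(q):
    """P(six | informant says 'even') for a fair die.

    q is the (assumed) probability that the informant announces 'six'
    instead of 'even' when a six has been thrown.
    """
    q = Fraction(q)
    face = Fraction(1, 6)  # each face of a fair die
    # 'even' is always reported for 2 and 4, and for 6 with probability 1 - q.
    says_even_and_six = face * (1 - q)
    says_even = 2 * face + says_even_and_six
    return says_even_and_six / says_even


if __name__ == "__main__":
    for q in (0, Fraction(1, 2), 1):
        print(q, prob_six_given_even(q))
```

Only when q = 0, that is, when the informant always reports parity and nothing else, does the answer equal one third. Any other reporting mechanism changes the conditional probability.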
The second and most important part deals with the situation where one has a random sample x_1, ..., x_n from a distribution with density f. The goal is to use the sample to form an estimate of f or, almost equivalently, to generate a distributional inference about y (= x_{n+1}). A new method is discussed to estimate the density f, where 'initial knowledge' of f is incorporated in the model. This is done by specifying a probability density psi as the 'initial guess' for f. Also the degree of confidence in this psi is quantified and incorporated in the method. By means of a multi-modal approach, incorporating aspects from both Classical and Bayesian statistics, and on the basis of the sample x, the 'initial guess' psi (and the degree of confidence in psi), an estimate of f is generated. When the initial guess psi is not unreasonable, this density estimate performs better, in general, than the generally used kernel methods. This is no surprise, since the kernel method makes no use of psi. It is at this point unclear how the comparison will turn out when psi is incorporated in the kernel method. To study the applicability of the developed method, an extensive data set about the pollution of Dutch waters is considered. Previous investigations showed that the different concentrations of pollutants can reasonably well be described by lognormal distributions. A complication is that the concentrations can only be measured when they are above a certain detection threshold. The density estimation theory of this thesis, adapted to the mentioned complication, is used to 'fine-tune' the 'initial guess' of lognormality to the data. The resulting density estimates are better than the density estimates obtained previously by fitting lognormal densities.
The density estimation theory of this thesis can usefully be applied to the goodness of fit context where a statement is required about the truth or falsity of the hypothesis H0: f = psi. The resulting goodness of fit tests have interesting relations with the well-known chi-squared test, Kolmogorov's test, and Neyman's 'smooth tests'. To emphasize the usefulness of distributional inference, an example from the interface of multivariate analysis and time-series analysis is discussed.


Casper Albers and Willem Schaafsma (2002). Estimating a density by adapting an initial guess. Computational Statistics and Data Analysis, 42 (1/2): 27 - 36.
De Bruin et al. (Comput. Statist. Data Anal. 30, 1999) provide a unique method to estimate the probability density f from a sample, given an initial guess psi of f. An advantage of their estimate fn is that an approximate standard error can be provided. A disadvantage is that fn is less accurate, on the average, than more usual kernel estimates. The reason is that fn is not sufficiently smooth. As an improvement, a smoothed analogue fn(m) is considered. The smoothing parameter m (the degree of a polynomial approximation) depends on the supposed quality of the initial guess psi of f. Under certain conditions, the resulting density estimate fn(m) has smaller L1-error, on the average, than kernel estimates with bandwidths based on likelihood cross-validation. The theory requires that the initial guess is made up a priori. In practice, some data peeping may be necessary. The estimates fn(m) provided look 'surprisingly accurate'. The main advantage of fn(m) over many other density estimators is its uniqueness (when the procedures developed in this article are followed); another is that an estimate is provided for the standard error of fn(m).


Casper Albers and Willem Schaafsma (2001). Details on the standard error of a special density estimate. Technical report IWI-2001-5-04, University of Groningen.
De Bruin et al. (Comput. Statist. Data Anal. 30, 1999) provide a unique nonparametric method to estimate the probability density f from a sample, given an initial guess psi of f. This report provides an approximation to the standard error of this estimate fn(x) of f(x). The smoothed analogues fn(m) of fn introduced in Albers et al. perform better with respect to the rate of convergence. Numerical experience is favorable, but a satisfactory theoretical analysis seems to be impossible.
The notation of Albers et al. is used; this report can be seen as an 'appendix'.
Casper Albers (2001). Presentatie van MWTL-data m.b.v. een speciale dichtheidsschatter. Technical report IWI-2001-5-03, University of Groningen.
Extension (in Dutch) of Chapter 5 of my PhD-thesis.
Casper Albers and Willem Schaafsma (2001). How To Assign Probabilities, if you must. Statistica Neerlandica, 55 (3): 346 - 357.
Empirical evidence can sometimes be incorporated in a probabilistic analysis by conditioning with respect to the observations. Usually, the underlying probability distribution and also the conditional distribution are not completely known. The assignment of probabilities will then require a compromise. The making of such a compromise goes beyond mathematical theory: a statistical discussion is needed. It depends on the context whether the result of such discussion is almost compelling, reasonable, or not really agreeable. This is illustrated by means of a simple example from the area of predictive distributional inference.


Casper Albers and Willem Schaafsma (2000). How To Assign Probabilities, if you must, addition on article. Technical report IWI-2000-5-04, University of Groningen.
In the article 'How to assign probabilities, if you must' (Albers & Schaafsma, 2001), several methods to assign probabilities, applied to a two-player die-rolling game, were discussed. The main focus was on methods using the logarithmic loss function as the choice of proper loss function. In this additional part, we will discuss the Brier and Epstein loss functions as alternatives. Furthermore, an extensive graphical display of the situation will be made.
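For readers unfamiliar with these loss functions, the sketch below evaluates a probability forecast for a die roll under each of them. It uses common textbook forms. In particular, it assumes that 'Epstein loss' refers to the ranked probability score, which compares cumulative distributions; the report's own definitions may differ in scaling or detail. All function names are illustrative.

```python
import math


def log_loss(p, outcome):
    """Logarithmic loss: minus the log of the probability given to the outcome."""
    return -math.log(p[outcome])


def brier_loss(p, outcome):
    """Brier loss: squared distance between the forecast and the outcome indicator."""
    return sum((pj - (1.0 if j == outcome else 0.0)) ** 2 for j, pj in enumerate(p))


def epstein_loss(p, outcome):
    """Ranked probability score (assumed form of the Epstein loss).

    Sums squared differences between the cumulative forecast and the
    cumulative outcome indicator, so near misses are penalised less.
    """
    total, cum_p = 0.0, 0.0
    for j, pj in enumerate(p):
        cum_p += pj
        cum_outcome = 1.0 if j >= outcome else 0.0
        total += (cum_p - cum_outcome) ** 2
    return total


if __name__ == "__main__":
    uniform = [1 / 6] * 6   # forecast for a fair die
    six = 5                 # index of the face 'six'
    for loss in (log_loss, brier_loss, epstein_loss):
        print(loss.__name__, loss(uniform, six))
```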


Casper Albers (1998). Estimating bivariate distributions assuming some form of dependence. Thesis for a MSc degree in Statistics, University of Groningen.
Let (X1, Y1), ..., (Xn, Yn) be an independent random sample from a bivariate population with distribution H. The stochastic variables X and Y are assumed to be (positively) associated in some way. To incorporate this assumption, various mathematical-statistical definitions can be used. We prefer the concept of (positive) quadrant dependence. This thesis contains various methods for estimating the distribution function H(x, y).
Two semiparametric methods are developed and a nonparametric method is discussed. The results are not very promising: though those of the semiparametric methods display various similarities, they are considerably different. This might suggest that samples of size 50 are too small to arrive at acceptable estimates, unless restrictive assumptions are imposed. A pdf of the thesis is available from: an 'official link', but in poorly scanned black-and-white quality, or an 'unofficial link', in readable full colour.
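Positive quadrant dependence is usually defined by the requirement H(x, y) >= F(x) G(y) for all x and y, where F and G are the marginal distribution functions. The sketch below checks this inequality for the empirical distribution functions of a sample. It illustrates the definition only and is not one of the estimation methods from the thesis; the function name is an assumption.

```python
def is_empirically_pqd(sample):
    """Check H_n(x, y) >= F_n(x) * G_n(y) on the grid of observed values.

    sample is a list of (x, y) pairs. The empirical distribution functions
    are step functions, so checking at the observed values is enough.
    Integer counts are compared to avoid rounding issues:
    n * #{X<=x, Y<=y} >= #{X<=x} * #{Y<=y}.
    """
    n = len(sample)
    xs = [a for a, _ in sample]
    ys = [b for _, b in sample]
    for x in xs:
        count_x = sum(1 for a in xs if a <= x)
        for y in ys:
            count_y = sum(1 for b in ys if b <= y)
            count_joint = sum(1 for a, b in sample if a <= x and b <= y)
            if n * count_joint < count_x * count_y:
                return False
    return True


if __name__ == "__main__":
    print(is_empirically_pqd([(1, 1), (2, 2), (3, 3)]))  # increasing together
    print(is_empirically_pqd([(1, 3), (2, 2), (3, 1)]))  # moving in opposite directions
```

For a random sample from a positively quadrant dependent population, the empirical check can still fail by chance, which is one reason estimation under this constraint is not trivial.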

Currently no preprints are available.