Scientific Reports | (2023) 13:6334 | https://doi.org/10.1038/s41598-023-33120-3
www.nature.com/scientificreports

Predicting work disability among people with chronic conditions: a prospective cohort study

Solja T. Nyberg 1,2*, Jaakko Airaksinen 3,4, Jaana Pentti 1,2,5,6, Jenni Ervasti 2, Markus Jokela 3, Jussi Vahtera 5,6,7, Marianna Virtanen 8,9, Marko Elovainio 3,10, G. David Batty 11 & Mika Kivimäki 1,11

Few risk prediction scores are available to identify people at increased risk of work disability, particularly for those with an existing morbidity. We examined the predictive performance of disability risk scores for employees with chronic disease. We used prospective data from 88,521 employed participants (mean age 43.1) in the Finnish Public Sector Study including people with chronic disorders: musculoskeletal disorder, depression, migraine, respiratory disease, hypertension, cancer, coronary heart disease, diabetes, and comorbid depression and cardiometabolic disease. A total of 105 predictors were assessed at baseline. During a mean follow-up of 8.6 years, 6836 (7.7%) participants were granted a disability pension. The C-statistic for the 8-item Finnish Institute of Occupational Health (FIOH) risk score, comprising age, self-rated health, number of sickness absences, socioeconomic position, number of chronic illnesses, sleep problems, BMI, and smoking at baseline, exceeded 0.72 for all disease groups and was 0.80 (95% CI 0.80–0.81) for participants with musculoskeletal disorders, 0.83 (0.82–0.84) for those with migraine, and 0.82 (0.81–0.83) for individuals with respiratory disease. Predictive performance was not significantly improved in models with re-estimated coefficients or a new set of predictors. These findings suggest that the 8-item FIOH work disability risk score may serve as a scalable screening tool in identifying individuals with increased risk for work disability.
1Clinicum, Faculty of Medicine, University of Helsinki, Tukholmankatu 8B, 00014 Helsinki, Finland. 2Finnish Institute of Occupational Health, Helsinki, Finland. 3Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland. 4Institute of Criminology and Legal Policy, University of Helsinki, Helsinki, Finland. 5Department of Public Health, University of Turku, Turku, Finland. 6Centre for Population Health Research, University of Turku, Turku, Finland. 7Turku University Hospital, Turku, Finland. 8School of Educational Sciences and Psychology, University of Eastern Finland, Joensuu, Finland. 9Division of Insurance Medicine, Karolinska Institutet, Stockholm, Sweden. 10Finnish Institute for Health and Welfare, Helsinki, Finland. 11Department of Epidemiology and Public Health, University College London, London, UK. *email: solja.nyberg@helsinki.fi

The probability of being in employment is strongly affected by the number of chronic diseases. According to OECD statistics, for example, 1 in 4 people aged 50–59 with no chronic diseases were not in employment, while this was the case for half of those with 2 or more chronic conditions1. The number of age-related chronic diseases in working populations is likely to increase in the future due to population ageing. This presents working life with new challenges, including a need to better prevent work disability in working populations that include a growing number of people with a chronic disease.

To enable timely interventions, numerous studies have investigated predictors of work disability in the general and working populations as well as in groups with specific diseases, or those that have undergone specific treatment procedures2–11. Fewer studies have examined prediction in high-risk individuals across multiple common chronic conditions that increase the likelihood of work disability. For example, there are no well-validated and easily administered prediction tools available to determine the risk among employees who have depression, musculoskeletal disorders, respiratory disease, hypertension or cardiometabolic multimorbidity. This is an important limitation which may hinder optimal targeting of interventions to those who would benefit most.

The Finnish Institute of Occupational Health (FIOH) has previously formulated a risk prediction model for work disability for use in the general working population2. This model had a C-index of 0.8 indicating high discriminative ability. In the present study, we examined whether the variables used also have the capacity to accurately determine work disability risk among employees with chronic conditions. We first evaluated the predictive performance of the 8-item FIOH risk score in nine common disease groups; then developed a modified version where we re-estimated the coefficients; and, lastly, built a model with a new set of predictors selected from a large pool of additional variables and ascertained whether these new models improved risk prediction. To evaluate the relevance of the risk models in clinical decision making, we dichotomized the score to distinguish test positives from test negatives, and examined detection rate and false positive rate for this measure12.

Methods

Study population. Participants were from the Finnish Public Sector study, a prospective cohort study of public sector employees from 10 municipalities and 21 hospitals in the same geographical areas in Finland. The eligible population represented 31.4% of all municipal employees in Finland at the time of the study, including all employees with a job contract in the 10 municipalities and 21 hospitals at the time of the surveys13. Participants responded to questionnaire surveys conducted in 2000–2002, 2004, 2008, or 2012.
In the present study, we used data from the full study population of respondents including employees with and without chronic conditions. Using self-reports of physician-diagnosed diseases, complemented with records from the cancer registry and drug reimbursement register (for respiratory disease, hypertension, coronary heart disease, musculoskeletal disorders, or diabetes), we categorized participants into subgroups with different chronic conditions including those living with musculoskeletal disorder, depression, migraine, respiratory disease, hypertension, cancer, coronary heart disease, diabetes, and comorbid depression and cardiometabolic disease (co-occurrence of mental and physical illnesses with major public health importance) at baseline. We considered the baseline for each participant with disease to be the survey at which the particular condition was first reported.

Ethical approval was obtained from the Helsinki-Uusimaa Hospital District Ethics Committee (HUS/1210/2016). All participants provided written informed consent. This study was conducted according to the guidelines of the Helsinki declaration. Details of the study design and participants have been previously described14.

Potential risk predictors. We used a pool of 105 variables which included those in the 8-item FIOH risk prediction model of work disability for the general working population: age category (< 35; 35–39; 40–44; 45–49; 50–54; 55+ years), BMI (< 18.5; 18.5 to < 25; 25–30; 30+ kg/m²), socioeconomic status (SES), smoking (yes; no), number of chronic diseases (0, 1, 2, 3+), self-rated health (range 1–5), difficulty falling asleep (range from 1 (never) to 6 (almost every night)) and number of sickness absences in the year before baseline (0, 1, 2, 3+)2.
Other available variables included sex, alcohol consumption, physical inactivity, psychological distress and working conditions (job control, job demands, job strain, effort, reward and effort-reward imbalance, relational justice, procedural justice, participatory safety, support for innovation, vision, task orientation, social capital at workplace, shift work and working night shifts). We measured SES based on occupational titles categorised according to the occupational classification of Statistics Finland, which is based on the World Health Organization/International Labor Organization International Standard Classification of Occupations (ISCO-88)15, into 7 occupational groups: ‘managers’ (ISCO class 1), ‘professionals’ (ISCO class 2, for example, physicians and teachers), ‘technicians and associate professionals’ (ISCO class 3, for example, registered nurses), ‘clerical support workers’ (ISCO class 4, for example, secretaries), ‘service workers’ (ISCO class 5, for example, cooks), ‘manual workers’ (ISCO classes 6–8, for example, maintenance workers) and ‘elementary occupations’ (ISCO class 9, for example, cleaners). The predictors are presented in Table 1 and a detailed list of individual items is provided in Appendix (p. 2–4).

Ascertainment of work disability. All study members were insured in some pension scheme. Records were obtained from the national register at the Finnish Centre of Pensions16, an organisation which has a statutory obligation to curate records of all pensions in Finland. Disability pension records included the start date and the diagnosis according to the World Health Organization International Classification of Diseases, version 10. The outcome was a full-time disability pension (temporary or permanent), which is defined as the capacity for work being impaired by at least 60% due to a disease, injury, or other disability. These records have been widely used in research contexts17–19.

Statistical methods.
We imputed missing data on predictors (3.7% of all observations) as follows: we first complemented missing responses on chronic diseases using data from the cancer registry and the drug reimbursement register, then set responses on chronic diseases that were still missing as ‘no’ answers, and height as the median value of all non-missing responses per individual. Other predictors were imputed using single imputation with predictive mean matching20. The follow-up ended in case of death, retirement (age-related or early retirement on other than health grounds), work disability, or a maximum follow-up time of 10 years, whichever came first.

The unadjusted bivariate associations between the individual predictor variables and work disability are shown with Manhattan plots.

We performed three steps to select the best model for each disease group. In the first, we examined whether the FIOH model was valid within the disease groups. According to the FIOH model, the work disability risk is estimated as follows2:

$$P(x) = \Phi\left[\frac{\ln(10) - \text{linear predictor}}{\text{scale}}\right]$$

where Φ is the standard cumulative normal distribution. We applied the model and the coefficients that were formulated by Airaksinen et al.2. Coefficients for the linear predictor in the FIOH-model were fitted with a lognormal distribution and are provided in the report by Airaksinen et al.2 and in the Appendix of this paper (page 5).

The second step was to examine whether the model for each disease group could be improved by re-estimating the coefficients of the FIOH-model, using the same set of explanatory variables as was used in the FIOH model. We call this the re-estimated FIOH model. For fitting the models, we used a similar method (including the assumption of a lognormal distribution) as Airaksinen et al.2.
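The risk equation used in the first step can be illustrated with a short Python sketch. It only evaluates the formula P(x) = Φ[(ln(10) − linear predictor)/scale]; the function names and the example values of the linear predictor and scale are hypothetical and are not the published FIOH coefficients, which are given by Airaksinen et al.2 and in the Appendix.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Phi(z): the standard cumulative normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def predicted_risk(linear_predictor: float, scale: float,
                   horizon_years: float = 10.0) -> float:
    """Evaluate P(x) = Phi[(ln(horizon) - linear predictor) / scale].

    With horizon_years = 10 this is the formula quoted in the text for a
    lognormal survival model.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    z = (math.log(horizon_years) - linear_predictor) / scale
    return standard_normal_cdf(z)


# Hypothetical inputs for illustration only (not the FIOH coefficients).
example_linear_predictor = 4.0
example_scale = 1.5
risk = predicted_risk(example_linear_predictor, example_scale)
print(f"Predicted 10-year risk: {risk:.3f}")
```

A larger linear predictor corresponds to a longer predicted time to work disability and therefore to a lower predicted 10-year risk.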
The third step was to examine whether the prediction could be improved by selecting a completely new set of predictors for each disease group. Following the conventional approach in developing prediction algorithms for survival data, we fitted the models with a Weibull distribution21,22. For each disease group we first ran a redundancy analysis to exclude variables that could be readily predicted using all other variables. We then specified a parametric survival model that included all the candidate variables as predictors (‘full model’). To obtain a more parsimonious algorithm, we derived the predicted work disability risks from the full model for each individual. We then used backward stepwise ordinary least squares regression to select eight predictors, using the risks derived from the full model as the outcome. The number of items in the new models was chosen to be the same as in the FIOH-model because a short questionnaire is easy to administer and quick to answer, maximising the response rate. If the selected eight predictors included any summary variables (for example job strain), we repeated the previous steps with the summary variable(s) broken down to individual items. The models obtained for each disease group as a result of the third step are referred to as the new model.

The performance of each prediction model was evaluated using Harrell’s C-index, which is the concordance between predicted and observed survival20. This index gives the probability that a randomly selected individual who experienced the outcome during the follow-up had a higher risk score than a randomly selected individual who did not experience the outcome. The C-index has a range from 0.5 (no predictive ability) to 1 (maximum predictive ability). A C-index under 0.7 represents poor, 0.7–0.8 good, and > 0.8 strong discrimination ability.
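The idea behind Harrell’s C-index can be illustrated with a minimal Python sketch for right-censored data. This is not the implementation used in the study (the analyses were run in R and SAS); the function name and the five-person toy dataset are hypothetical, and ties in follow-up time are ignored for simplicity.

```python
from typing import Sequence


def harrell_c_index(time: Sequence[float], event: Sequence[bool],
                    risk: Sequence[float]) -> float:
    """Concordance between predicted risk and observed survival.

    A pair (i, j) is comparable when i had the event and j was still
    under follow-up at that time (time[i] < time[j]). The pair is
    concordant when i also has the higher predicted risk; tied risks
    count as half. Pairs with tied times are skipped (simplification).
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored participant cannot anchor a pair
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in years, event indicator, predicted risk.
toy_time = [2, 4, 6, 8, 10]
toy_event = [True, True, False, True, False]
toy_risk = [0.9, 0.4, 0.5, 0.3, 0.1]
toy_c = harrell_c_index(toy_time, toy_event, toy_risk)
print(f"C-index: {toy_c:.3f}")
```

In the toy data, 7 of the 8 comparable pairs are ordered correctly by the predicted risk, so the index is 0.875.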
To examine whether the original findings from the general working population were replicable in our dataset, we computed C-statistics for the entire study population with and without chronic conditions and among those with no history of sickness absence one year prior to baseline (the low-risk population11). Calibration of the model, that is, how accurately the predicted absolute risks correspond to the observed absolute risks, was assessed using calibration plots. We additionally plotted observed and predicted events by deciles of 10-year risk for each model, excluding by age and baseline those with less than 10 years of potential follow-up time.

We compared the performance of the FIOH-model with both new models using the C-statistics with 95% confidence intervals. If the confidence intervals were overlapping, the FIOH-model was chosen.

To evaluate the final model for an individual, we dichotomized the score into ‘test positive’ versus ‘test negative’ using alternative thresholds of 5; 10; 20; 30; 40; 50 and 60% for a positive test result. For positive test cases, we calculated the false positive rate (the proportion of participants who did not experience work disability but were test positive), the detection rate (the proportion of work disability cases who were test positive) and the ratio of true to false positives. The formulas were as follows:

$$\text{False positive rate} = \frac{b}{b + d}$$

$$\text{Detection rate} = \frac{a}{a + c}$$

$$\text{Ratio of true to false positives} = 1 : (b/a)$$

where a, b, c and d represent different combinations of risk scores and work disability as defined below:

| Risk score | Work disability during the follow-up: Yes | Work disability during the follow-up: No |
| --- | --- | --- |
| Test positive | a | b |
| Test negative | c | d |

All analyses were performed using R 4.2.2 (packages: mice23, rms24, leaps25 and Hmisc26) and SAS 9.4.

Table 1. Potential predictors. *Items included in the FIOH-model.

| Domain | Potential predictors |
| --- | --- |
| Demographics (3 items) | Age*; Sex; Socioeconomic status* |
| Health (35 items) | BMI*; Jenkins sleep scale (scale + 4 items)*; Chronic diseases (sum + 12 items)*; Self-rated health*; No. of sickness absences in previous year*; GHQ (2 scales + 12 items) |
| Risk behavior (12 items) | Smoking*; Alcohol consumption (scale + 5 items); Inactivity (scale + 4 items) |
| Work characteristics (22 items) | Job strain (scale + 2 items); Job control (6 items); Job demand (3 items); Effort-Reward imbalance (scale + 2 items); Effort (1 item); Reward (3 items); Shift work; Night shift; Social capital at workplace (scale) |
| Team climate (18 items) | Participatory safety (scale + 4 items); Vision (scale + 4 items); Task orientation (scale + 3 items); Support for innovation (scale + 3 items) |
| Management (15 items) | Relational justice (scale + 6 items); Procedural justice (scale + 7 items) |

Ethics approval. Ethical approval was obtained from the Helsinki-Uusimaa Hospital District Ethics Committee (HUS/1210/2016).

Consent to participate. Informed consent was obtained from all individual participants included in the study.

Results

Baseline characteristics. Figure 1 shows sample selection. The target population was municipal employees in Finland during the survey years, on average 426,500 men and women. The eligible population in the 10 towns and 21 hospitals which participated in the FPS study represented 31.4% of all municipal employees. Of these, 78.7% responded to at least one of the four questionnaire surveys. Linked records from national health and work disability registers were available for 85.0% of the respondents, a total of 89,543 adults. We excluded from analyses people who were on disability pension or retired, aged 65 years or older, or with extreme values of BMI (< 15 or > 50) at baseline.

Table 2 shows characteristics of the resulting analytical sample of 88,521 participants. Of them, 70,805 were women, the mean age was 43.1 years, and the largest occupational groups were ‘professionals’ (for example, teachers) and ‘associate professionals’ (for example, registered nurses). A total of 73,996 had no history of short-term work disability (no sickness absences one year before baseline) and this group was denoted as the low-risk population. The number of participants with a specific prevalent condition at baseline varied between 1162 (comorbid depression and cardiometabolic disease) and 33,601 (musculoskeletal disorders).

Work disability during follow-up. During a mean follow-up of 8.6 years, 6836 (7.7%) participants were granted a disability pension. The incidence of work disability was 14.1% in those with musculoskeletal disorders, 9.9% for migraine, 15.1% for hypertension, 12.2% for respiratory disease (chronic bronchitis or asthma), 16.7% for depression, 16.4% for diabetes, 14.2% for cancer, 22.6% for coronary heart disease, and 27.7% for comorbid depression and cardiometabolic disease. The most common causes of work disability were diseases of the musculoskeletal system and connective tissue (44.2%), followed by mental, behavioural and neurodevelopmental disorders (23.7%), and these proportions varied by disease group (Fig. 2; for details, see Appendix p. 6).

Associations of potential risk predictors and work disability. The unadjusted bivariate associations between 105 predictor variables and work disability are shown in Fig. 3. All variables of the 8-item FIOH risk score were associated with work disability (p < 0.0001), the strongest associations being seen for age and self-rated health. Many other health-related variables were also strongly associated with work disability whereas the associations of the items related to risk behaviours, work characteristics, team climate and management were weaker. The pattern of findings was similar in all disease groups (Appendix, p. 7–15).

Selection of the model for each disease group. Development of the re-estimated and new models is reported in Appendix (p. 16–17). The predictors in the new models included 4 to 7 of the 8 predictors in the FIOH-model, depending on the participant group. Each of these models included age, socioeconomic status, self-rated health and the number of sickness absences in the previous year. Individual chronic diseases were also part of the models for every disease group. Work-related items were included in the models for musculoskeletal disorders, hypertension, respiratory disease, diabetes, cancer, and comorbid depression and cardiometabolic disease.

Table 3 shows that the C-statistic for the FIOH-model among all employees, that is, those with and without chronic conditions, was 0.84 (95% CI 0.84 to 0.85), similar in magnitude to that in the original study2. This result was little changed in a complete-case analysis without imputation (N = 78,479, C-statistic 0.84, 95% CI 0.84 to 0.85). Among those with no sickness absence at baseline the C-statistic was 0.82 (95% CI 0.81 to 0.83). The table also provides a comparison of the performance between the FIOH-model and the two alternatives. The FIOH-model performed well in all disease groups. The C-statistic was ≥ 0.80 in those with musculoskeletal disorders (0.80, 95% CI 0.80 to 0.81), migraine (0.83, 95% CI 0.82 to 0.84) and respiratory disease (0.82, 95% CI 0.81 to 0.83). For all other subgroups, including hypertension, depression, diabetes, cancer, coronary heart disease or comorbid depression and cardiometabolic disease, the C-statistic was ≥ 0.72 but less than 0.80. The C-statistics for the re-estimated FIOH-models and the new models were virtually the same as for the FIOH-model, suggesting no improvement in predictive performance.
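The selection rule stated in the Methods (if the confidence intervals overlap, the FIOH-model is chosen) can be checked against the 95% confidence intervals reported in Table 3 with a short Python sketch. The function and variable names are hypothetical, and treating the rule as "the FIOH interval overlaps the interval of each alternative" is an assumption about how the comparison was applied.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]


def intervals_overlap(first: Interval, second: Interval) -> bool:
    """True when two closed intervals share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


def fioh_model_chosen(fioh: Interval, re_estimated: Interval,
                      new_model: Interval) -> bool:
    """Assumed reading of the rule: keep the FIOH-model when its interval
    overlaps the interval of both alternative models."""
    return (intervals_overlap(fioh, re_estimated)
            and intervals_overlap(fioh, new_model))


# 95% confidence intervals of the C-index from Table 3:
# (FIOH model, re-estimated FIOH model, new model).
table3_intervals: Dict[str, Tuple[Interval, Interval, Interval]] = {
    "Musculoskeletal disease": ((0.80, 0.81), (0.80, 0.81), (0.79, 0.81)),
    "Migraine": ((0.82, 0.84), (0.83, 0.84), (0.83, 0.84)),
    "Hypertension": ((0.78, 0.80), (0.78, 0.80), (0.79, 0.80)),
    "Respiratory disease": ((0.81, 0.83), (0.81, 0.83), (0.81, 0.83)),
    "Depression": ((0.77, 0.78), (0.77, 0.79), (0.77, 0.78)),
    "Diabetes": ((0.76, 0.80), (0.77, 0.81), (0.77, 0.81)),
    "Cancer": ((0.70, 0.74), (0.71, 0.76), (0.72, 0.76)),
    "Coronary heart disease": ((0.73, 0.78), (0.74, 0.78), (0.74, 0.79)),
    "Comorbid depression and CMD": ((0.74, 0.79), (0.76, 0.80), (0.75, 0.80)),
}

decisions = {group: fioh_model_chosen(*cis)
             for group, cis in table3_intervals.items()}
for group, chosen in decisions.items():
    print(f"{group}: FIOH-model chosen = {chosen}")
```

Under this reading, the intervals overlap in every disease group, which is consistent with the FIOH-model being retained throughout.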
Calibration plots for the existing model indicated a high correspondence between the predicted and the observed risk in all disease groups (Fig. 4).

Predictive performance of the FIOH model. Table 4 shows the detection and false positive rates for the FIOH risk score dichotomised at various risk thresholds to indicate high versus low risk (for results of sensitivity, specificity and positive and negative predictive value, see Appendix, p. 18). With a high threshold for risk (≥ 50% predicted probability indicating high risk), the false positive rate ranged between 2.6% and 8.6% in most disease groups. The exceptions were participants with coronary heart disease (19.0%) and those with comorbid depression and cardiometabolic disease (20.1%), for whom the false positive rate was markedly higher. The detection rate varied between 18.9% (participants with cancer) and 52.8% (participants with comorbid depression and cardiometabolic disease) and the ratio of true-to-false positives ranged from 1:1.0 to 1:1.4. With a low threshold (5% predicted probability indicating high risk), the detection rate rose to between 92.0% and 99.2% (fewer than 1 in 10 disability cases were missed), but with a very high false positive rate (54.2% to 94.4% depending on the disease group). For both the 50% and 5% thresholds, the detection and false positive rates were slightly lower in the total working population, but the ratio of true-to-false positives was approximately the same as in the population with chronic diseases.

Figure 1. Flow diagram for total sample and disease groups.

Table 2. Baseline characteristics of all participants and subgroups of individuals with no history of sickness absence and those with a chronic condition at baseline. CHD, coronary heart disease; CMD, cardiometabolic disease (diabetes or CHD); MSD, musculoskeletal disorder. *Participants with and without chronic conditions at baseline. †Participants with no sickness absence at baseline. ‡Self-reported bronchial asthma, myocardial infarction, angina pectoris, cerebrovascular diseases, migraine, depression or diabetes.

| Baseline characteristic (%) | All* | Low risk population† | MSD | Migraine | Hypertension | Respiratory disease | Depression | Diabetes | Cancer | CHD | Depression & CMD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| N | 88,521 | 73,996 | 33,601 | 22,065 | 16,793 | 15,372 | 14,347 | 3829 | 3584 | 1761 | 1162 |
| Sex | | | | | | | | | | | |
| Women | 80.0 | 79.2 | 81.2 | 89.9 | 76.5 | 82.5 | 83.9 | 72.2 | 88.1 | 63.5 | 74.0 |
| Men | 20.0 | 20.8 | 18.8 | 10.1 | 23.5 | 17.5 | 16.1 | 27.8 | 11.9 | 36.5 | 26.0 |
| Age, y | | | | | | | | | | | |
| < 35 | 22.3 | 23.7 | 7.2 | 21.0 | 3.2 | 16.8 | 12.6 | 8.3 | 4.2 | 2.5 | 6.0 |
| 35–39 | 14.5 | 15.0 | 9.0 | 14.9 | 5.3 | 12.8 | 12.4 | 8.1 | 5.9 | 2.7 | 6.6 |
| 40–44 | 16.5 | 16.6 | 14.1 | 16.8 | 11.2 | 16.1 | 16.5 | 11.5 | 10.2 | 4.9 | 9.7 |
| 45–49 | 16.4 | 16.1 | 19.0 | 16.3 | 17.4 | 16.9 | 19.4 | 14.6 | 16.6 | 11.5 | 15.8 |
| 50–54 | 16.0 | 15.2 | 23.2 | 16.3 | 27.1 | 17.7 | 19.5 | 21.6 | 25.0 | 25.3 | 25.7 |
| 55+ | 14.4 | 13.4 | 27.6 | 14.7 | 35.9 | 19.8 | 19.7 | 36.1 | 38.2 | 53.1 | 36.2 |
| BMI, kg/m² | | | | | | | | | | | |
| < 18.5 | 1.3 | 1.3 | 0.7 | 1.5 | 0.4 | 1.2 | 1.1 | 0.4 | 0.7 | 0.6 | 0.4 |
| 18.5 to < 25 | 54.5 | 56.1 | 44.6 | 53.6 | 29.9 | 46.6 | 48.2 | 22.8 | 48.8 | 35.0 | 28.7 |
| 25–30 | 31.8 | 31.2 | 37.1 | 31.1 | 41.4 | 34.1 | 33.6 | 35.4 | 35.6 | 40.9 | 35.3 |
| 30+ | 12.5 | 11.4 | 17.6 | 13.9 | 28.4 | 18.2 | 17.0 | 41.4 | 14.9 | 23.5 | 35.6 |
| Socio-economic status | | | | | | | | | | | |
| Managers | 2.2 | 2.4 | 2.8 | 2.2 | 3.5 | 2.7 | 2.3 | 3.3 | 3.2 | 3.8 | 2.4 |
| Professionals | 28.9 | 30.9 | 25.7 | 29.1 | 24.9 | 30.4 | 30.2 | 24.4 | 30.9 | 21.1 | 24.3 |
| Technicians and associate professionals | 27.1 | 27.4 | 22.7 | 26.5 | 23.1 | 23.7 | 24.1 | 23.2 | 23.8 | 21.0 | 22.9 |
| Clerical support workers | 6.4 | 6.4 | 7.1 | 7.5 | 8.3 | 7.5 | 7.9 | 8.3 | 8.9 | 8.3 | 9.9 |
| Service workers | 21.5 | 20.1 | 25.1 | 23.0 | 22.4 | 22.5 | 22.5 | 21.0 | 21.5 | 19.8 | 21.3 |
| Manual workers | 4.2 | 4.0 | 5.2 | 2.7 | 6.1 | 3.9 | 3.6 | 7.3 | 3.0 | 10.3 | 8.3 |
| Elementary occupations | 9.7 | 8.8 | 11.4 | 8.9 | 11.6 | 9.1 | 9.5 | 12.5 | 8.8 | 15.8 | 11.0 |
| Smoking | | | | | | | | | | | |
| No | 81.0 | 81.8 | 80.1 | 82.9 | 82.5 | 79.6 | 76.6 | 78.8 | 85.3 | 81.2 | 75.8 |
| Yes | 19.0 | 18.2 | 19.9 | 17.1 | 17.5 | 20.4 | 23.4 | 21.2 | 14.7 | 18.8 | 24.2 |
| Chronic illness‡ | | | | | | | | | | | |
| 0 | 64.6 | 67.3 | 53.8 | 0.0 | 50.9 | 23.7 | 0.0 | 0.0 | 59.0 | 0.0 | 0.0 |
| 1 | 27.7 | 26.4 | 33.1 | 70.4 | 33.8 | 47.1 | 56.9 | 57.1 | 28.8 | 31.8 | 0.0 |
| 2 | 6.6 | 5.5 | 10.5 | 24.7 | 11.3 | 22.7 | 34.7 | 27.9 | 8.2 | 36.5 | 46.6 |
| 3+ | 1.2 | 0.9 | 2.6 | 4.9 | 4.0 | 6.5 | 8.4 | 15.0 | 4.0 | 31.7 | 53.4 |
| Self-rated health | | | | | | | | | | | |
| 1 (highest) | 41.5 | 45.3 | 22.1 | 32.9 | 17.0 | 25.8 | 19.0 | 14.5 | 22.7 | 9.9 | 10.1 |
| 2 | 35.3 | 35.5 | 38.3 | 37.6 | 37.3 | 36.7 | 35.2 | 34.5 | 37.8 | 27.3 | 23.1 |
| 3 | 19.1 | 16.8 | 31.1 | 23.6 | 35.5 | 28.9 | 33.2 | 37.2 | 29.8 | 40.9 | 37.6 |
| 4 | 3.7 | 2.3 | 7.8 | 5.4 | 9.2 | 7.8 | 11.3 | 12.5 | 8.4 | 18.8 | 25.4 |
| 5 (lowest) | 0.3 | 0.2 | 0.7 | 0.6 | 0.9 | 0.8 | 1.3 | 1.3 | 1.3 | 3.1 | 3.9 |
| Difficulty falling asleep | | | | | | | | | | | |
| 1 (never) | 46.2 | 47.8 | 40.5 | 41.1 | 41.0 | 39.5 | 29.7 | 44.0 | 42.1 | 38.4 | 29.9 |
| 2 | 28.2 | 28.8 | 26.9 | 27.8 | 26.0 | 27.3 | 24.6 | 24.9 | 26.7 | 23.1 | 20.6 |
| 3 | 12.7 | 12.4 | 14.4 | 14.2 | 14.4 | 15.0 | 16.3 | 13.3 | 12.9 | 15.8 | 16.0 |
| 4 | 9.0 | 8.0 | 11.8 | 11.5 | 11.6 | 11.8 | 17.2 | 11.7 | 11.7 | 12.8 | 17.0 |
| 5 | 1.5 | 1.2 | 2.2 | 1.9 | 2.2 | 2.3 | 4.1 | 1.9 | 2.4 | 3.2 | 4.7 |
| 6 (almost every night) | 2.5 | 1.8 | 4.2 | 3.6 | 4.8 | 4.2 | 8.0 | 4.3 | 4.3 | 6.8 | 12.0 |
| Number of sickness absences in previous year | | | | | | | | | | | |
| 0 | 83.6 | 100.0 | 74.3 | 79.6 | 76.2 | 76.7 | 66.7 | 76.2 | 64.6 | 67.0 | 64.0 |
| 1 | 13.4 | 0.0 | 19.9 | 16.3 | 18.2 | 18.1 | 25.4 | 18.1 | 28.6 | 23.9 | 24.6 |
| 2 | 2.5 | 0.0 | 4.8 | 3.5 | 4.6 | 4.2 | 6.5 | 4.7 | 6.0 | 7.0 | 8.3 |
| 3+ | 0.5 | 0.0 | 1.1 | 0.7 | 1.1 | 1.0 | 1.4 | 1.1 | 0.8 | 2.2 | 3.1 |

Discussion

Our study shows that a short, self-administered survey instrument has predictive utility for work disability in people with chronic conditions and comorbidity. This survey-based 8-item risk calculator includes age, self-rated health, number of sickness absences, socioeconomic position, the number of comorbidities, sleep problems, body mass index, and smoking habit. The algorithm can be used in many settings, including by members of the public who have web access, and the assessment does not require laboratory testing or other clinical measurements. The calculator might be used to identify working-age people with common chronic diseases who have an elevated risk of future work disability and so might facilitate early intervention.
Approximately one third of people at age 40 have a chronic condition and the proportion rises to 75% by age 6527. Despite this high prevalence and the urgent need for new measures to prevent their work disability, we are not aware of other large-scale studies on risk stratification algorithms for work disability in employees with chronic conditions. In our study, the C-index exceeded 0.72 in all disease groups and was 0.80 or greater for those with musculoskeletal disorders, migraine and respiratory disease. These results indicate good discrimination and are comparable to those reported for established risk prediction tools currently used in clinical practice. For example, the C-index is 0.72 for the Pooled Cohort Equations to predict the 10-year risk of cardiovascular disease events using 9 risk factors (age, sex, race, diabetes status, smoking status, antihypertensive medication use, total cholesterol, HDL cholesterol levels, and systolic blood pressure)28; between 0.74 and 0.77 for the FINDRISC model to predict the risk of type 2 diabetes using 8 characteristics (age, BMI, waist circumference, physical activity, diet, history of antihypertensive medication use, history of high blood glucose and family history)29,30; and 0.70 to 0.91 for QRISK3 to predict future cardiovascular disease based on a wide range of risk factors obtained from electronic health records (age, sex, ethnicity, socioeconomic deprivation, angina or heart attack in a 1st degree relative at age < 60, chronic kidney disease, migraine, corticosteroids, systemic lupus erythematosus, use of atypical antipsychotics, severe mental illness, steroid treatment, erectile dysfunction, total/HDL cholesterol ratio, BMI, systolic blood pressure variability)31.

In clinical decision making, dichotomised predictive scores are used to distinguish patients who should receive intervention or referrals for further assessments, although few studies have reported relevant performance metrics in this regard12.
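The performance metrics for a dichotomised score, as defined in the Methods from the cells a, b, c and d of the 2 × 2 table, can be sketched in Python as follows. The class and function names are hypothetical. The worked example back-calculates approximate cell counts for participants with musculoskeletal disorders at the 50% threshold from rounded figures reported in the paper (N = 33,601, incidence 14.1%, detection rate 21.1%, false positive rate 3.6%); these are approximations for illustration, not the study's actual cell counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    a: float  # test positive, work disability during follow-up
    b: float  # test positive, no work disability
    c: float  # test negative, work disability during follow-up
    d: float  # test negative, no work disability


def detection_rate(t: ScreeningCounts) -> float:
    """Proportion of work disability cases who were test positive."""
    return t.a / (t.a + t.c)


def false_positive_rate(t: ScreeningCounts) -> float:
    """Proportion of participants without work disability who were test positive."""
    return t.b / (t.b + t.d)


def false_positives_per_true_positive(t: ScreeningCounts) -> float:
    """The x in the ratio of true to false positives, 1 : x, with x = b / a."""
    return t.b / t.a


def counts_from_rates(n: float, incidence: float, detection: float,
                      false_positive: float) -> ScreeningCounts:
    """Approximate the 2 x 2 cells from rounded summary figures."""
    cases = n * incidence
    non_cases = n - cases
    a = detection * cases
    b = false_positive * non_cases
    return ScreeningCounts(a=a, b=b, c=cases - a, d=non_cases - b)


# Musculoskeletal disorders, 50% threshold (rounded inputs from the paper).
msd = counts_from_rates(n=33_601, incidence=0.141,
                        detection=0.211, false_positive=0.036)
print(f"Detection rate: {detection_rate(msd):.1%}")
print(f"False positive rate: {false_positive_rate(msd):.1%}")
print(f"Ratio of true to false positives: 1:{false_positives_per_true_positive(msd):.1f}")
```

With these rounded inputs the ratio of true to false positives comes out at about 1:1.0, in line with the value reported for this group in Table 4.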
While the performance of our dichotomized work disability risk score fell short of the best established clinical screening tests, such as mammography for breast cancer (detection rate 75% with a false positive rate of 8%)32,33 and faecal immunochemical test (FIT) for colon cancer (79%/6%)34, it was similar to those reported for widely used cardiovascular disease risk scores, such as QRISK2 (detection and false positive rates 40% and 13% for men and 26% and 6% for women)12 and thus appears to provide a useful tool to aid decisions on targeting preventive interventions. More specifically, our risk calculator had a relatively high false positive rate for test positive thresholds that allowed high detection rates. Conversely, the use of a threshold that provided a low (approximately 5%) false positive rate resulted in detection of only 20–25% of disability cases. This means that the score is useful in informing the targeting of interventions with no significant harm from overtreatment and in informing referrals to more detailed assessments for potential tailored preventive measures. By contrast, the score should not be used for expensive or new interventions with uncertain safety profiles as many people who will not benefit from the intervention will be targeted. The same applies to predictors of cardiovascular disease that are currently used in health care.

Figure 2. Causes of work disability at follow-up in all participants and those with prevalent diseases.

The present study benefits from the use of data which are from a country (Finland) where ascertainment of work disability pension was possible with linkage to comprehensive records from the national pension register with virtually full coverage for those in employment10,11.
Large sample size, contributing to higher precision and lower risk of type 2 error, and the relatively high response rate are additional strengths. The age and sex Figure 3. Bivariate unadjusted associations between predictors at baseline and work disability at follow-up in all participants with and without chronic disease at baseline. Bars represent -LOG10(p-value) and are cut at the maximum value of 300. Table 3. C-index for the FIOH, re-estimated and new prediction models in all participants and subgroups of individuals with no history of sickness absence and those with a chronic condition at baseline. *Participants with and without chronic conditions at baseline. † Participants with no sickness absence at baseline. ‡ CMD, cardiometabolic disease (diabetes or coronary heart disease). Population C-index (95% confidence intervals) FIOH model Re-estimated FIOH model New model All* 0.84 (0.84, 0.85) Low risk population† 0.82 (0.81, 0.83) Disease group at baseline  Musculoskeletal disease 0.80 (0.80, 0.81) 0.80 (0.80, 0.81) 0.80 (0.79, 0.81)  Migraine 0.83 (0.82, 0.84) 0.83 (0.83, 0.84) 0.83 (0.83, 0.84) Hypertension 0.79 (0.78, 0.80) 0.79 (0.78, 0.80) 0.79 (0.79, 0.80) Respiratory disease 0.82 (0.81, 0.83) 0.82 (0.81, 0.83) 0.82 (0.81, 0.83) Depression 0.78 (0.77, 0.78) 0.78 (0.77, 0.79) 0.78 (0.77, 0.78) Diabetes 0.78 (0.76, 0.80) 0.79 (0.77, 0.81) 0.79 (0.77, 0.81) Cancer 0.72 (0.70, 0.74) 0.74 (0.71, 0.76) 0.74 (0.72, 0.76) Coronary heart disease 0.76 (0.73, 0.78) 0.76 (0.74, 0.78) 0.77 (0.74, 0.79) Comorbid depression and CMD‡ 0.77 (0.74, 0.79) 0.78 (0.76, 0.80) 0.78 (0.75, 0.80) 9Vol.:(0123456789) Scientific Reports | (2023) 13:6334 | https://doi.org/10.1038/s41598-023-33120-3 www.nature.com/scientificreports/ distribution (mean age 43.1, 80% women) in the analytic sample corresponded to those in the eligible popula- tion of 133,966 municipal employees (mean age 45.7, 80% women)13. 
We defined disease groups using self-reported physician-diagnosed conditions. Validation studies support the accuracy of self-reports as a measure of prevalent chronic diseases35–44.

This study also has several limitations. Although work disability is defined by impairment, receipt of a disability pension may additionally depend on non-medical factors, such as disability pension regulations, the work environment, the nature of the job, and the extent to which a workplace is prepared to accommodate the disability. Our study largely comprised women, and all study participants were drawn from public sector workplaces, dominated by professionals such as health care workers and teachers, and thus deviated from the general population, in which the incidence of work disability is greater11. However, among the Finnish workforce during the years 1997–2006, the yearly number of men and women with incident work disability was 111 and 100 per 10,000 persons45. This is higher than, but still comparable with, the 89 per 10,000 person-years in our total population. The generalizability of the present findings should therefore be investigated in different settings, study populations and other countries with different disability pension policies. Predictive performance may vary by country because, in addition to the impairment, receipt of a disability pension depends on non-medical factors, such as disability pension regulations, the work environment, the nature of the job, and the extent to which the workplace is prepared to accommodate the disability46. Additionally, we followed the same parametric methodological approach as in the original FIOH study2, although a non-parametric approach (for example discrete-time models) would have avoided parametric assumptions.

Figure 4. Observed and predicted incidence of work disability by deciles of the work disability risk score.

Table 4. Detection rate, false positive rate and the ratio true-to-false positives for the FIOH prediction model in all participants and subgroups of individuals with no history of sickness absence and those with a chronic condition at baseline. Columns 5 to 60 are thresholds for a positive test result.

| Population | Predictive performance for a positive test | 5 | 10 | 20 | 30 | 40 | 50 | 60 |
|---|---|---|---|---|---|---|---|---|
| Musculoskeletal disorders | Detection rate (%) | 95.3 | 86.6 | 65.9 | 46.6 | 32.7 | 21.1 | 12.8 |
| | False positive rate (%) | 70.2 | 50.4 | 26.6 | 13.6 | 7.1 | 3.6 | 1.7 |
| | Ratio true to false positives | 1:4.5 | 1:3.5 | 1:2.5 | 1:1.8 | 1:1.3 | 1:1.0 | 1:0.8 |
| Migraine | Detection rate (%) | 92.0 | 83.1 | 63.6 | 47.6 | 33.3 | 22.1 | 14.3 |
| | False positive rate (%) | 54.2 | 37.0 | 19.3 | 10.0 | 5.2 | 2.6 | 1.2 |
| | Ratio true to false positives | 1:5.4 | 1:4.1 | 1:2.8 | 1:1.9 | 1:1.4 | 1:1.1 | 1:0.7 |
| Hypertension | Detection rate (%) | 96.4 | 89.6 | 71.2 | 53.2 | 37.3 | 24.8 | 15.2 |
| | False positive rate (%) | 80.5 | 61.7 | 34.4 | 17.9 | 9.4 | 4.9 | 2.2 |
| | Ratio true to false positives | 1:4.7 | 1:3.9 | 1:2.7 | 1:1.9 | 1:1.4 | 1:1.1 | 1:0.8 |
| Respiratory disease | Detection rate (%) | 93.0 | 84.8 | 66.0 | 49.9 | 35.8 | 24.1 | 15.1 |
| | False positive rate (%) | 59.9 | 42.9 | 23.2 | 12.9 | 7.2 | 3.8 | 1.8 |
| | Ratio true to false positives | 1:4.6 | 1:3.6 | 1:2.5 | 1:1.9 | 1:1.4 | 1:1.1 | 1:0.8 |
| Depression | Detection rate (%) | 93.4 | 86.2 | 69.7 | 53.7 | 40.1 | 27.4 | 18.0 |
| | False positive rate (%) | 72.3 | 54.3 | 31.7 | 18.5 | 10.7 | 6.1 | 3.0 |
| | Ratio true to false positives | 1:3.9 | 1:3.2 | 1:2.3 | 1:1.7 | 1:1.3 | 1:1.1 | 1:0.8 |
| Diabetes | Detection rate (%) | 97.0 | 91.6 | 77.7 | 62.4 | 46.3 | 32.8 | 20.7 |
| | False positive rate (%) | 80.8 | 68.0 | 46.3 | 27.6 | 16.1 | 8.6 | 4.3 |
| | Ratio true to false positives | 1:4.2 | 1:3.8 | 1:3.0 | 1:2.3 | 1:1.8 | 1:1.3 | 1:1.1 |
| Cancer | Detection rate (%) | 93.1 | 80.4 | 58.3 | 42.2 | 30.3 | 18.9 | 13.2 |
| | False positive rate (%) | 78.8 | 59.3 | 33.5 | 18.8 | 9.7 | 4.3 | 2.1 |
| | Ratio true to false positives | 1:5.1 | 1:4.5 | 1:3.5 | 1:2.7 | 1:1.9 | 1:1.4 | 1:1.0 |
| Coronary heart disease | Detection rate (%) | 99.2 | 97.2 | 88.9 | 76.1 | 62.8 | 47.2 | 31.7 |
| | False positive rate (%) | 94.4 | 87.5 | 66.4 | 45.5 | 30.3 | 19.0 | 10.3 |
| | Ratio true to false positives | 1:3.3 | 1:3.1 | 1:2.6 | 1:2.0 | 1:1.7 | 1:1.4 | 1:1.1 |
| Comorbid depression and cardiometabolic disease | Detection rate (%) | 98.1 | 96.3 | 90.7 | 79.8 | 67.7 | 52.8 | 38.5 |
| | False positive rate (%) | 89.0 | 79.9 | 61.9 | 45.2 | 32.3 | 20.1 | 11.8 |
| | Ratio true to false positives | 1:2.4 | 1:2.2 | 1:1.8 | 1:1.5 | 1:1.2 | 1:1.0 | 1:0.8 |
| Low-risk population | Detection rate (%) | 81.6 | 63.5 | 35.0 | 17.2 | 8.3 | 3.5 | 1.2 |
| | False positive rate (%) | 38.2 | 21.5 | 7.7 | 2.7 | 0.9 | 0.4 | 0.1 |
| | Ratio true to false positives | 1:8.1 | 1:5.9 | 1:3.8 | 1:2.7 | 1:2.0 | 1:1.9 | 1:1.6 |
| Total cohort | Detection rate (%) | 87.9 | 75.0 | 52.5 | 35.9 | 24.0 | 15.0 | 9.1 |
| | False positive rate (%) | 43.7 | 27.0 | 12.0 | 5.7 | 2.8 | 1.4 | 0.62 |
| | Ratio true to false positives | 1:5.9 | 1:4.3 | 1:2.7 | 1:1.9 | 1:1.4 | 1:1.1 | 1:0.8 |

Conclusion
Detection of individuals at high risk is a precondition for effective targeted interventions to prevent long-term work disability and a basis for developing cost-effective strategies to avoid early labour-market exit. The predictive performance of the simple and cost-free FIOH work disability risk score was comparable to that observed for established, widely used risk scores for other outcomes, such as cardiovascular diseases47. These findings suggest that it is possible to predict work disability in a working population with chronic disease with reasonable accuracy using a scalable internet-based tool, and thus to aid decisions on targeted interventions and referrals to detailed assessments for tailored interventions.

Data availability
Statistical syntax for the analysis of the present study is available in Appendix, pp. 19–21. Pseudonymised questionnaire data from the FPS study can be shared upon request to Dr Jenni Ervasti (jenni.ervasti@ttl.fi).
Linked health records require separate permission from the Findata, the Health and Social Data Permit Authority in Finland.

Received: 12 September 2022; Accepted: 7 April 2023

References
1. OECD/European Union. ‘The labour market impacts of ill-health’. In Health at a Glance: Europe 2016: State of Health in the EU Cycle (OECD Publishing, 2016). https://doi.org/10.1787/9789264265592-en
2. Airaksinen, J. et al. Development and validation of a risk prediction model for work disability: Multicohort study. Sci. Rep. 7, 13578. https://doi.org/10.1038/s41598-017-13892-1 (2017).
3. Foster, N. E. et al. Prevention and treatment of low back pain: Evidence, challenges, and promising directions. Lancet 391, 2368–2383. https://doi.org/10.1016/S0140-6736(18)30489-6 (2018).
4. Krause, N., Frank, J. W., Dasinger, L. K., Sullivan, T. J. & Sinclair, S. J. Determinants of duration of disability and return-to-work after work-related injury and illness: Challenges for future research. Am. J. Ind. Med. 40, 464–484. https://doi.org/10.1002/ajim.1116 (2001).
5. Cheadle, A. et al. Factors influencing the duration of work-related disability: A population-based study of Washington State workers’ compensation. Am. J. Public Health 84, 190–196. https://doi.org/10.2105/ajph.84.2.190 (1994).
6. Vles, W. J. et al. Prevalence and determinants of disabilities and return to work after major trauma. J. Trauma 58, 126–135. https://doi.org/10.1097/01.ta.0000112342.40296.1f (2005).
7. Turner, J. A., Franklin, G. & Turk, D. C. Predictors of chronic disability in injured workers: A systematic literature synthesis. Am. J. Ind. Med. 38, 707–722. https://doi.org/10.1002/1097-0274(200012)38:6%3c707::aid-ajim10%3e3.0.co;2-9 (2000).
8. den Bakker, C. M. et al. Prognostic factors for return to work and work disability among colorectal cancer survivors: A systematic review. PLoS ONE 13, e0200720. https://doi.org/10.1371/journal.pone.0200720 (2018).
9. Detaille, S. I., Heerkens, Y. F., Engels, J. A., van der Gulden, J. W. & van Dijk, F. J. Common prognostic factors of work disability among employees with a chronic somatic disease: A systematic review of cohort studies. Scand. J. Work Environ. Health 35, 261–281. https://doi.org/10.5271/sjweh.1337 (2009).
10. Laaksonen, M., Blomgren, J. & Gould, R. Sickness allowance trajectories preceding disability retirement: A register-based retrospective study. Eur. J. Public Health 26, 1050–1055. https://doi.org/10.1093/eurpub/ckw081 (2016).
11. Salonen, L., Blomgren, J., Laaksonen, M. & Niemela, M. Sickness absence as a predictor of disability retirement in different occupational classes: A register-based study of a working-age cohort in Finland in 2007–2014. BMJ Open 8, e020491. https://doi.org/10.1136/bmjopen-2017-020491 (2018).
12. Hingorani, A. et al. Polygenic scores in disease prediction: Evaluation using the relevant performance metrics. medRxiv https://doi.org/10.1101/2022.02.18.22271049 (2022).
13. https://www.kt.fi/en/municipal-sector-and-personnel. Accessed 1 March 2023.
14. Kivimäki, M. et al. Socioeconomic position, co-occurrence of behavior-related risk factors, and coronary heart disease: The Finnish public sector study. Am. J. Public Health 97, 874–879. https://doi.org/10.2105/ajph.2005.078691 (2007).
15. International Labour Organization. International Standard Classification of Occupations, ISCO-88 (2004).
16. ETK, https://www.etk.fi/en/. Accessed 1 March 2023.
17. Salonsalmi, A., Laaksonen, M., Lahelma, E. & Rahkonen, O. Drinking habits and disability retirement. Addiction 107, 2128–2136. https://doi.org/10.1111/j.1360-0443.2012.03976.x (2012).
18. von Bondorff, M. B. et al. Early life origins of all-cause and cause-specific disability pension: findings from the Helsinki Birth Cohort Study. PLoS ONE 10, e0122134. https://doi.org/10.1371/journal.pone.0122134 (2015).
19. Lahti, J., Holstila, A., Manty, M., Lahelma, E. & Rahkonen, O. Changes in leisure time physical activity and subsequent disability retirement: A register-linked cohort study. Int. J. Behav. Nutr. Phys. Act 13, 99. https://doi.org/10.1186/s12966-016-0426-2 (2016).
20. Harrell, F. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis (Springer, 2015).
21. Li, Y., Sperrin, M., Ashcroft, D. M. & van Staa, T. P. Consistency of variety of machine learning and statistical models in predicting clinical risks of individual patients: Longitudinal cohort study using cardiovascular disease as exemplar. BMJ 371, m3919. https://doi.org/10.1136/bmj.m3919 (2020).
22. Ng, R. et al. Development and validation of the chronic disease population risk tool (CDPoRT) to predict incidence of adult chronic disease. JAMA Netw. Open 3, e204669. https://doi.org/10.1001/jamanetworkopen.2020.4669 (2020).
23. van Buuren, S. & Groothuis-Oudshoorn, K. Mice: Multivariate Imputation by chained equations in R. J. Stat. Softw. 45, 1–67. https://doi.org/10.18637/jss.v045.i03 (2011).
24. Harrell Jr FE. rms: Regression Modeling Strategies. R package version 6.2–0, https://CRAN.R-project.org/package=rms
25. Thomas Lumley based on Fortran code by Alan Miller. leaps: Regression Subset Selection. R package version 3.1, https://CRAN.R-project.org/package=leaps
26. Harrell Jr F. Hmisc: Harrell Miscellaneous. R package version 4.6–0, https://CRAN.R-project.org/package=Hmisc
27. Barnett, K. et al. Epidemiology of multimorbidity and implications for health care, research, and medical education: A cross-sectional study. Lancet 380, 37–43. https://doi.org/10.1016/S0140-6736(12)60240-2 (2012).
28. Muntner, P. et al. Validation of the atherosclerotic cardiovascular disease Pooled Cohort risk equations. JAMA 311, 1406–1415. https://doi.org/10.1001/jama.2014.2630 (2014).
29. Salinero-Fort, M. A. et al. Performance of the Finnish diabetes risk score and a simplified Finnish diabetes risk score in a community-based, cross-sectional programme for screening of undiagnosed type 2 diabetes mellitus and dysglycaemia in Madrid, Spain: The SPREDIA-2 study. PLoS ONE 11, e0158489. https://doi.org/10.1371/journal.pone.0158489 (2016).
30. Jolle, A. et al. Validity of the FINDRISC as a prediction tool for diabetes in a contemporary Norwegian population: a 10-year follow-up of the HUNT study. BMJ Open Diabetes Res. Care 7, e000769. https://doi.org/10.1136/bmjdrc-2019-000769 (2019).
31. Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357, j2099. https://doi.org/10.1136/bmj.j2099 (2017).
32. Elmore, J. G., Armstrong, K., Lehman, C. D. & Fletcher, S. W. Screening for breast cancer. JAMA 293, 1245–1256. https://doi.org/10.1001/jama.293.10.1245 (2005).
33. Carney, P. A. et al. Individual and combined effects of age, breast density, and hormone replacement therapy use on the accuracy of screening mammography. Ann. Intern. Med. 138, 168–175. https://doi.org/10.7326/0003-4819-138-3-200302040-00008 (2003).
34. Tinmouth, J., Lansdorp-Vogelaar, I. & Allison, J. E. Faecal immunochemical tests versus guaiac faecal occult blood tests: What clinicians and colorectal cancer screening programme organisers need to know. Gut 64, 1327–1337. https://doi.org/10.1136/gutjnl-2014-308074 (2015).
35. Colditz, G. A. et al. Validation of questionnaire information on risk factors and disease outcomes in a prospective cohort study of women. Am. J. Epidemiol. 123, 894–900. https://doi.org/10.1093/oxfordjournals.aje.a114319 (1986).
36. Beckett, M., Weinstein, M., Goldman, N. & Yu-Hsuan, L. Do health interview surveys yield reliable data on chronic illness among older respondents? Am. J. Epidemiol. 151, 315–323. https://doi.org/10.1093/oxfordjournals.aje.a010208 (2000).
37. Paganini-Hill, A. & Chao, A. Accuracy of recall of hip fracture, heart attack, and cancer: A comparison of postal survey data and medical records. Am. J. Epidemiol. 138, 101–106. https://doi.org/10.1093/oxfordjournals.aje.a116832 (1993).
38. Haapanen, N., Miilunpalo, S., Pasanen, M., Oja, P. & Vuori, I. Agreement between questionnaire data and medical records of chronic diseases in middle-aged and elderly Finnish men and women. Am. J. Epidemiol. 145, 762–769. https://doi.org/10.1093/aje/145.8.762 (1997).
39. Bergmann, M. M., Jacobs, E. J., Hoffmann, K. & Boeing, H. Agreement of self-reported medical history: Comparison of an in-person interview with a self-administered questionnaire. Eur. J. Epidemiol. 19, 411–416. https://doi.org/10.1023/b:ejep.0000027350.85974.47 (2004).
40. Kehoe, R., Wu, S. Y., Leske, M. C. & Chylack, L. T. Jr. Comparing self-reported and physician-reported medical history. Am. J. Epidemiol. 139, 813–818. https://doi.org/10.1093/oxfordjournals.aje.a117078 (1994).
41. Okura, Y., Urban, L. H., Mahoney, D. W., Jacobsen, S. J. & Rodeheffer, R. J. Agreement between self-report questionnaires and medical record data was substantial for diabetes, hypertension, myocardial infarction and stroke but not for heart failure. J. Clin. Epidemiol. 57, 1096–1103. https://doi.org/10.1016/j.jclinepi.2004.04.005 (2004).
42. Harlow, S. D. & Linet, M. S. Agreement between questionnaire data and medical records. The evidence for accuracy of recall. Am. J. Epidemiol. 129, 233–248. https://doi.org/10.1093/oxfordjournals.aje.a115129 (1989).
43. Kriegsman, D. M., Penninx, B. W., van Eijk, J. T., Boeke, A. J. & Deeg, D. J. Self-reports and general practitioner information on the presence of chronic diseases in community dwelling elderly. A study on the accuracy of patients’ self-reports and on determinants of inaccuracy. J. Clin. Epidemiol. 49, 1407–1417. https://doi.org/10.1016/s0895-4356(96)00274-0 (1996).
44. Oksanen, T. et al. Self-report as an indicator of incident disease. Ann. Epidemiol. 20, 547–554. https://doi.org/10.1016/j.annepidem.2010.03.017 (2010).
45. Pensola, T., Gould, R. & Polvinen, A. Ammatit ja työkyvyttömyyseläkkeet: Masennukseen, muihin mielenterveyden häiriöihin sekä tuki ja liikuntaelinten sairauksiin perustuvat eläkkeet (2010).
46. Hytti, H. Why are Swedes sick but Finns unemployed? Int. J. Soc. Welf. 15, 131–141. https://doi.org/10.1111/j.1468-2397.2006.00412.x (2006).
47. Khanji, M. Y. et al. Cardiovascular risk assessment: A systematic review of guidelines. Ann. Intern. Med. 165, 713–722. https://doi.org/10.7326/M16-1110 (2016).

Author contributions
All authors participated in designing the study, generating hypotheses, interpreting the data, and critically reviewing the paper. S.T.N., with M.K., wrote the first draft of the report. S.T.N., with help from J.A. and J.P., performed data analysis. S.T.N., J.A., J.P., and M.K. had full access to all the data. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Funding
This study was supported by Finnish Work Environment Fund (190424) and Academy of Finland (329202).
STN was supported by NordForsk (75021) and Finnish Work Environment Fund (190424), JP was supported by the Academy of Finland (311492), JV by Academy of Finland (321409 and 329240), GDB by the MRC (MR/P023444/1) and NIA (1R56AG052519-01) and MK by the Wellcome Trust (221854/Z/20/Z), the UK Medical Research Council (MR/S011676/1), the US National Institute on Aging (R01AG056477), the Academy of Finland (329202, 350426), and the Finnish Work Environment Fund (190424).

Competing interests
The authors declare no competing interests.

Additional information
Supplementary Information The online version contains supplementary material available at https://doi.org/10.1038/s41598-023-33120-3.
Correspondence and requests for materials should be addressed to S.T.N.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2023