Date of Award
Spring 1-1-2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Public Health
First Advisor
Ma, Xiaomei
Abstract
Background: In the United States (US), the incidence of early-onset colorectal cancer (EOCRC), defined as colorectal cancer (CRC) diagnosed among individuals <50 years of age, has been increasing rapidly. Compared to late-onset CRC (i.e., CRC diagnosed among individuals aged 50 years and older), EOCRC is characterized by more aggressive pathology and distinct genetic profiles, indicating that the etiology of EOCRC may be different. Few studies have examined the risk factors of EOCRC, and none have focused on early life risk factors, which could be particularly relevant to early-onset diseases. In May 2021, the US Preventive Services Task Force (USPSTF) reduced the recommended age of CRC screening from 50 years to 45 years, and the COVID-19 pandemic influenced the provision of preventive services across the country. These events could impact the patterns of screening for EOCRC, but no study has systematically assessed such impacts in a large sample of the US population. Furthermore, the role of ecological factors in survival disparities among individuals with EOCRC remains understudied, and prior studies evaluating these factors have not accounted for potential spatial correlation. These knowledge gaps warrant a set of epidemiological studies to clarify the risk factors, screening patterns, and prognostic factors of EOCRC.
Methods: We conducted three innovative research studies to understand EOCRC etiology, screening, and survival using a range of study designs and analytical methods. In the first study, we included 1,221 cases diagnosed with EOCRC at the age of 0-39 years during 1988-2021 and 61,050 controls without cancer matched on year of birth from the California Linkage Study of Early-Onset Cancers, which is a nested case-control study within the California birth cohort. Odds ratios (OR) and 95% confidence intervals (CI) were estimated from multivariable logistic regression models, and subgroup analyses were conducted by sex, ethnicity, and age at CRC diagnosis. In the second study, we conducted a retrospective cohort study using deidentified claims data from beneficiaries insured by the largest commercial provider in the US and aged 45-49 years between 2017 and 2022. Absolute and relative changes in screening uptake were compared between a 20-month period preceding (May 2018 to December 2019) and a 20-month period following (May 2021 to December 2022) the USPSTF recommendation. Interrupted time-series analysis and autoregressive integrated moving average models were used to evaluate changes in screening uptake, adjusting for temporal autocorrelation and seasonality. In the third study, we obtained EOCRC survival data from the Surveillance, Epidemiology, and End Results Program from January 1, 2000 to December 31, 2019 and used a Bayesian analytic approach to identify county-level ecological factors associated with EOCRC survival. Specifically, we conducted principal component (PC) analysis to reduce the dimensionality of data on 36 county-level social, behavioral, and preventive factors from the Centers for Disease Control and Prevention. The association between the identified PCs and survival was evaluated using multivariable spatial generalized linear mixed models. Counties with residual low and high survival (i.e., unexplained by the PCs) were classified as hotspots and coldspots, respectively.
Results: In the first study (Chapter 2), we found that after adjusting for demographic, birth, and parental characteristics, males had 34% higher risk of EOCRC compared to females (OR = 1.34, 95% CI: 1.20, 1.51), and Hispanic individuals had 43% higher risk of EOCRC compared to non-Hispanic (NH) White individuals (OR = 1.34, 95% CI: 1.20, 1.51). Having a foreign-born mother was associated with lower risk of EOCRC (OR = 0.84, 95% CI: 0.73, 0.97). Among females, paternal age of 35 years and older was associated with higher risk of EOCRC compared to paternal age of 20-24 years (OR = 1.56, 95% CI: 1.08, 2.25), and every 500 g increase in birthweight was associated with a 10% increase in risk of EOCRC (OR = 1.10, 95% CI: 1.01, 1.21). Stratified analyses based on sex, ethnicity, and age at CRC diagnosis suggested heterogeneity in the etiology of EOCRC across the subgroups. In the second study (Chapter 3), among 10,221,114 distinct beneficiaries aged 45-49 years and eligible for CRC screening, we observed an increase in the mean bi-monthly CRC screening uptake from 0.50% (standard deviation [SD] = 0.02%) to 1.51% (SD = 0.59%) between the two 20-month periods preceding (May 2018 to December 2019) and following (May 2021 to December 2022) the USPSTF recommendation (p < 0.001). This represents an absolute change of 1.01 percentage points (95% CI: 0.62, 1.40) and a relative change of 202.51% (95% CI: -30.59%, 436.87%). Compared to average-risk beneficiaries residing in zip codes that were in the bottom 20% for socioeconomic status (SES), average-risk beneficiaries residing in zip codes that were in the top 20% for SES experienced larger absolute (1.25 vs. 0.74 percentage points) and relative (214.01% vs. 167.73%) changes in screening (p = 0.02). Since the recommendation was issued, screening uptake also increased fastest among average-risk beneficiaries residing in zip codes that were in the top 20% for SES (0.24 percentage points every two months, 95% CI: 0.23, 0.25) and in metropolitan areas (0.20 percentage points every two months, 95% CI: 0.19, 0.21). In the third study (Chapter 4), we identified four PCs that explained 67.70% of the spatial variability in 5-year survival among 75,215 individuals with EOCRC: PC1) poverty, chronic disease, health risk behaviors (β = -0.03, 95% credible interval [CrI]: -0.04, -0.03); PC2) younger age, chronic disease-free, minority status (β = -0.01, 95% CrI: -0.02, 0.00); PC3) urban environment, preventive services (β = 0.02, 95% CrI: 0.00, 0.03); and PC4) older age (β = -0.04, 95% CrI: -0.06, -0.02). Among individuals with distant malignancies, the residual spatial variability remained high for two US counties after adjustment for county-level factors: 1) residents of Salt Lake County, Utah, experienced 26.5% (95% CrI: 1.5%, 47.8%) lower odds of survival [hotspot], and 2) residents of Riverside County, California, experienced 37% (95% CrI: 7.97%, 78.8%) higher odds of survival [coldspot].
Conclusions: Findings from this dissertation indicate that demographic, birth, and parental characteristics play a role in the etiology of EOCRC, with differences across subgroups by sex, ethnicity, and age at diagnosis. Notably, maternal birthplace and paternal age may be novel risk factors for EOCRC and require future research to validate our observation and evaluate possible underlying mechanisms. Furthermore, among privately insured beneficiaries aged 45-49 years, CRC screening uptake increased after the USPSTF recommendation, with potential disparities based on SES and locality. Finally, county-level ecological factors are strongly associated with survival among individuals with EOCRC. Yet there is some evidence of survival disparities among individuals with distant malignancies that remain unexplained by the factors that we examined.
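The percentages quoted in the Results follow from simple arithmetic on the reported estimates. The sketch below is illustrative only and is not taken from the dissertation: the function names are hypothetical, and the inputs are the rounded values printed in the abstract, so the relative change comes out near 202% rather than the reported 202.51%, which was presumably computed from unrounded data.

```python
# Illustrative arithmetic only; function names are hypothetical and the
# inputs are the rounded figures quoted in the abstract.

def absolute_change(before_pct: float, after_pct: float) -> float:
    """Difference between two percentages, in percentage points."""
    return after_pct - before_pct


def relative_change(before_pct: float, after_pct: float) -> float:
    """Change relative to the baseline value, expressed in percent."""
    return (after_pct - before_pct) / before_pct * 100.0


def or_to_percent_difference(odds_ratio: float) -> float:
    """Percent difference in odds implied by an odds ratio.

    Reading this as a difference in risk assumes a rare outcome,
    where the odds ratio approximates the risk ratio.
    """
    return (odds_ratio - 1.0) * 100.0


if __name__ == "__main__":
    # Mean screening uptake before and after the USPSTF recommendation.
    print(round(absolute_change(0.50, 1.51), 2))       # percentage points
    print(round(relative_change(0.50, 1.51), 1))       # percent
    # Odds ratios reported for males vs. females and foreign-born mother.
    print(round(or_to_percent_difference(1.34), 1))
    print(round(or_to_percent_difference(0.84), 1))
```

A negative result from `or_to_percent_difference` corresponds to lower odds, as with the foreign-born mother estimate (OR = 0.84).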
Recommended Citation
Siddique, Sunny, "Examining Risk Factors, Screening, and Survival of Early-Onset Colorectal Cancer in the United States" (2025). Yale Graduate School of Arts and Sciences Dissertations. 1682.
https://elischolar.library.yale.edu/gsas_dissertations/1682