Last data update: Sep 23, 2024. (Total: 47723 publications since 2009)
Records 1-11 (of 11 Records) |
Query Trace: Lyles RH [original query] |
Prevalence of amyotrophic lateral sclerosis in the United States, 2018
Mehta P , Raymond J , Zhang Y , Punjani R , Han M , Larson T , Muravov O , Lyles RH , Horton DK . Amyotroph Lateral Scler Frontotemporal Degener 2023 1-7 OBJECTIVE: To estimate prevalent ALS cases in the United States for calendar year 2018. METHODS: The National ALS Registry (Registry) compiled data from national administrative databases (from the Centers for Medicare and Medicaid Services, the Veterans Health Administration, and the Veterans Benefits Administration) and enrollment data voluntarily submitted through a web portal (www.cdc.gov/als). We used log-linear capture-recapture (CRC) model-based methodology to estimate the number of cases not ascertained by the Registry. RESULTS: The Registry identified 21,655 cases of ALS in 2018, with an age-adjusted prevalence of 6.6 per 100,000 U.S. population. When CRC methods were used, an estimated 29,824 cases were identified, for an adjusted prevalence of 9.1 per 100,000 U.S. population. The demographics of cases of ALS did not change from previous years' reports. ALS continues to impact Whites, males, and persons over 50 years of age more than other comparison groups. The results from the present report suggest case ascertainment for the Registry has improved, with the estimate of missing prevalent cases decreasing from 44% in 2017 to 27% in 2018. DISCUSSION: Consistent with previous estimates that used CRC, ALS prevalence in the United States is approximately 29,824 cases. |
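The Registry's adjustment uses log-linear CRC models across multiple data sources; as a minimal sketch of the underlying idea, the two-source Chapman bias-corrected estimator combines two overlapping case lists. The counts below are hypothetical, not the Registry's actual data.

```python
def chapman_estimate(n1: int, n2: int, m: int) -> float:
    """Estimate total cases from two overlapping case lists using Chapman's
    bias-corrected two-source capture-recapture estimator:
    N_hat = (n1 + 1)(n2 + 1) / (m + 1) - 1, where m is the overlap."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# Hypothetical counts, not the Registry's actual data:
# 18,000 cases in administrative files, 9,000 via the web portal, 6,000 in both.
print(round(chapman_estimate(18000, 9000, 6000)))
```

The smaller the overlap relative to the list sizes, the larger the estimated number of cases missed by both sources.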
Sensitivity and Uncertainty Analysis for Two-Stream Capture-Recapture Methods in Disease Surveillance (preprint)
Zhang Y , Chen J , Ge L , Williamson JM , Waller LA , Lyles RH . medRxiv 2022 23 Capture-recapture methods are widely applied in estimating the number (N) of prevalent or cumulatively incident cases in disease surveillance. Here, we focus the bulk of our attention on the common case in which there are two data streams. We propose a sensitivity and uncertainty analysis framework grounded in multinomial distribution-based maximum likelihood, hinging on a key dependence parameter that is typically non-identifiable but is epidemiologically interpretable. Focusing on the epidemiologically meaningful parameter unlocks appealing data visualizations for sensitivity analysis and provides an intuitively accessible framework for uncertainty analysis designed to leverage the practicing epidemiologist's understanding of the implementation of the surveillance streams as the basis for assumptions driving estimation of N. By illustrating the proposed sensitivity analysis using publicly available HIV surveillance data, we emphasize both the need to admit the lack of information in the observed data and the appeal of incorporating expert opinion about the key dependence parameter. The proposed uncertainty analysis is an empirical Bayes-like approach designed to more realistically acknowledge variability in the estimated N associated with uncertainty in an expert's opinion about the non-identifiable parameter, together with the statistical uncertainty. We demonstrate how such an approach can also facilitate an appealing general interval estimation procedure to accompany capture-recapture methods. Simulation studies illustrate the reliable performance of the proposed approach for quantifying uncertainties in estimating N in various contexts. Finally, we demonstrate how the recommended paradigm has the potential to be directly extended for application to data from more than two surveillance streams. 
Sensitivity and uncertainty analysis for two-stream capture-recapture methods in disease surveillance
Zhang Y , Chen J , Ge L , Williamson JM , Waller LA , Lyles RH . Epidemiology 2023 34 (4) 601-610 Capture-recapture methods are widely applied in estimating the number (N) of prevalent or cumulatively incident cases in disease surveillance. Here, we focus the bulk of our attention on the common case in which there are two data streams. We propose a sensitivity and uncertainty analysis framework grounded in multinomial distribution-based maximum likelihood, hinging on a key dependence parameter that is typically non-identifiable but is epidemiologically interpretable. Focusing on the epidemiologically meaningful parameter unlocks appealing data visualizations for sensitivity analysis and provides an intuitively accessible framework for uncertainty analysis designed to leverage the practicing epidemiologist's understanding of the implementation of the surveillance streams as the basis for assumptions driving estimation of N. By illustrating the proposed sensitivity analysis using publicly available HIV surveillance data, we emphasize both the need to admit the lack of information in the observed data and the appeal of incorporating expert opinion about the key dependence parameter. The proposed uncertainty analysis is a simulation-based approach designed to more realistically acknowledge variability in the estimated N associated with uncertainty in an expert's opinion about the non-identifiable parameter, together with the statistical uncertainty. We demonstrate how such an approach can also facilitate an appealing general interval estimation procedure to accompany capture-recapture methods. Simulation studies illustrate the reliable performance of the proposed approach for quantifying uncertainties in estimating N in various contexts. Finally, we demonstrate how the recommended paradigm has the potential to be directly extended for application to data from more than two surveillance streams. |
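The sensitivity analysis the paper proposes varies a non-identifiable dependence parameter and traces its effect on the estimated N. A minimal sketch under one common parameterization, where the dependence parameter psi is the odds ratio between capture in the two streams (this specific parameterization and the counts are illustrative assumptions, not the paper's exact formulation):

```python
def n_hat(n11: int, n10: int, n01: int, psi: float) -> float:
    """Plug-in estimate of total cases N with two surveillance streams,
    given an assumed between-stream odds ratio psi; psi = 1 recovers the
    usual independence (Lincoln-Petersen-type) estimate."""
    n00 = psi * n10 * n01 / n11   # unobserved cell implied by psi
    return n11 + n10 + n01 + n00

# Sweep psi across a plausible range to see how sensitive N_hat is
# (cell counts are hypothetical, not the HIV surveillance data):
for psi in (0.5, 1.0, 2.0):
    print(psi, n_hat(n11=300, n10=200, n01=150, psi=psi))
```

Plotting N_hat against psi over an expert-elicited range is the kind of data visualization the sensitivity analysis supports.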
A censored quantile regression approach for relative survival analysis: Relative survival quantile regression
Williamson JM , Lin HM , Lyles RH . Biom J 2023 65 (5) e2200127 We propose a censored quantile regression model for the analysis of relative survival data. We create a hybrid data set consisting of the study observations and counterpart randomly sampled pseudopopulation observations imputed from population life tables that adjust for expected mortality. We then fit a censored quantile regression model to the hybrid data incorporating demographic variables (e.g., age, biologic sex, calendar time) corresponding to the population life tables of demographically-similar individuals, a population versus study covariate, and its interactions with the variables of interest. These latter variables can be interpreted as relative survival parameters that depict the differences in failure quantiles between the study participants and their population counterparts. |
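The hybrid-data construction can be sketched as pairing each study record with a pseudo-population counterpart flagged by a population indicator; the field names and the toy life-table sampler below are hypothetical stand-ins, not the paper's implementation.

```python
import random

def build_hybrid(study_rows, sample_population_time):
    """Stack observed study records with matched pseudo-population records.
    The 'population' indicator is the covariate whose interactions carry the
    relative-survival interpretation in the censored quantile regression."""
    hybrid = []
    for row in study_rows:
        hybrid.append({**row, "population": 0})                  # study record
        t = sample_population_time(row["age"], row["sex"])       # life-table draw
        hybrid.append({"age": row["age"], "sex": row["sex"],
                       "time": t, "event": 1, "population": 1})  # pseudo record
    return hybrid

# Toy stand-in for sampling survival time from a life table (hypothetical):
random.seed(1)
draw = lambda age, sex: random.expovariate(1.0 / (85 - age))
rows = [{"age": 60, "sex": "F", "time": 4.2, "event": 1}]
print(len(build_hybrid(rows, draw)))
```

A censored quantile regression fit to the stacked data would then include `population` and its interactions with the covariates of interest.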
Novel application of one-step pooled molecular testing and maximum likelihood approaches to estimate the prevalence of malaria parasitaemia among rapid diagnostic test negative samples in western Kenya
Shah MP , Chebore W , Lyles RH , Otieno K , Zhou Z , Plucinski M , Waller LA , Odongo W , Lindblade KA , Kariuki S , Samuels AM , Desai M , Mitchell RM , Shi YP . Malar J 2022 21 (1) 319 BACKGROUND: Detection of malaria parasitaemia in samples that are negative by rapid diagnostic tests (RDTs) requires resource-intensive molecular tools. While pooled testing using a two-step strategy provides a cost-saving alternative to the gold standard of individual sample testing, statistical adjustments are needed to improve accuracy of prevalence estimates for a single-step pooled testing strategy. METHODS: A random sample of 4670 malaria RDT-negative dried blood spot samples were selected from a mass testing and treatment trial in Asembo, Gem, and Karemo, western Kenya. Samples were tested for malaria individually and in 934 pools of five by one-step quantitative polymerase chain reaction (qPCR). Maximum likelihood approaches were used to estimate subpatent parasitaemia (RDT-negative, qPCR-positive) prevalence by pooling, assuming poolwise sensitivity and specificity were either 100% (strategy A) or imperfect (strategy B). To improve and illustrate the practicality of this estimation approach, a validation study was constructed from pools allocated at random into main (734 pools) and validation (200 pools) subsets. Prevalence was estimated using strategies A and B and an inverse-variance weighted estimator, and estimates were weighted to account for differential sampling rates by area. RESULTS: The prevalence of subpatent parasitaemia was 14.5% (95% CI 13.6-15.3%) by individual qPCR, 9.5% (95% CI 8.5-10.5%) by strategy A, and 13.9% (95% CI 12.6-15.2%) by strategy B. In the validation study, the prevalence by individual qPCR was 13.5% (95% CI 12.4-14.7%) in the main subset, 8.9% (95% CI 7.9-9.9%) by strategy A, 11.4% (95% CI 9.9-12.9%) by strategy B, and 12.8% (95% CI 11.2-14.3%) using the inverse-variance weighted estimator from poolwise validation.
Pooling, including a 20% validation subset, reduced costs by 52% compared to individual testing. CONCLUSIONS: Compared to individual testing, a one-step pooled testing strategy with an internal validation subset can provide accurate prevalence estimates of PCR-positivity among RDT-negatives at a lower cost. |
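Under the strategy-A assumption of perfect pool-level sensitivity and specificity, the prevalence MLE from one-step pooled testing has a closed form; a minimal sketch with hypothetical pool counts (not the study's actual positives):

```python
def pooled_prevalence_mle(positive_pools: int, total_pools: int, pool_size: int) -> float:
    """MLE of individual-level prevalence from one-step pooled testing,
    assuming perfect pool-level sensitivity and specificity (the strategy-A
    setting). A pool is positive iff it contains at least one positive
    sample, so P(pool +) = 1 - (1 - p)^k, and the MLE inverts that relation."""
    pool_rate = positive_pools / total_pools
    return 1.0 - (1.0 - pool_rate) ** (1.0 / pool_size)

# Hypothetical counts (not the study's data): 380 of 934 pools of 5 test positive.
print(round(pooled_prevalence_mle(380, 934, 5), 4))
```

Strategy B replaces this closed form with a likelihood that also carries pool-level sensitivity and specificity parameters, which is why it requires numerical maximization rather than a one-line inversion.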
Extrapolating sparse gold standard cause of death designations to characterize broader catchment areas
Lyles RH , Cunningham SA , Kundu S , Bassat Q , Mandomando I , Sacoor C , Akelo V , Onyango D , Zielinski-Gutierrez E , Taylor AW . Epidemiol Methods 2020 9 (1) The Child Health and Mortality Prevention Surveillance (CHAMPS) Network is designed to elucidate and track causes of under-5 child mortality and stillbirth in multiple sites in sub-Saharan Africa and South Asia using advanced surveillance, laboratory and pathology methods. Expert panels provide an arguable gold standard determination of underlying cause of death (CoD) on a subset of child deaths, in part through examining tissue obtained via minimally invasive tissue sampling (MITS) procedures. We consider estimating a population-level distribution of CoDs based on this sparse but precise data, in conjunction with data on subgrouping characteristics that are measured on the broader population of cases and are potentially associated with selection for MITS and with cause-specific mortality. We illustrate how estimation of each underlying CoD proportion using all available data can be addressed equivalently in terms of a Horvitz-Thompson adjustment or a direct standardization, uncovering insights relevant to the designation of appropriate subgroups to adjust for non-representative sampling. Taking advantage of the functional form of the result when expressed as a multinomial distribution-based maximum likelihood estimator, we propose small-sample adjustments to Bayesian credible intervals based on Jeffreys or related weakly informative Dirichlet prior distributions. Our analyses of early data from CHAMPS sites in Kenya and Mozambique and accompanying simulation studies demonstrate the validity of the adjustment approach under attendant assumptions, together with marked performance improvements associated with the proposed adjusted Bayesian credible intervals. 
Adjustment for non-representative sampling of those validated via gold standard diagnostic methods is a critical endeavor for epidemiologic studies like CHAMPS that seek extrapolation of CoD proportion estimates. |
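The Horvitz-Thompson view of the adjustment weights each gold-standard (MITS) death by the inverse of its subgroup's sampling fraction; a minimal sketch with hypothetical subgroups and causes (not CHAMPS data):

```python
from collections import Counter

def ht_cod_proportions(mits_cases, subgroup_totals):
    """Horvitz-Thompson-adjusted cause-of-death proportions.
    mits_cases: (subgroup, cause) pairs for deaths with a gold-standard CoD.
    subgroup_totals: total deaths per subgroup in the catchment area.
    Each MITS death gets weight N_g / n_g, the inverse sampling fraction."""
    n_sampled = Counter(g for g, _ in mits_cases)
    weighted = Counter()
    for g, cause in mits_cases:
        weighted[cause] += subgroup_totals[g] / n_sampled[g]
    total = sum(weighted.values())   # weights within each subgroup sum to N_g
    return {c: w / total for c, w in weighted.items()}

# Hypothetical data: community deaths under-sampled relative to facility deaths.
cases = ([("facility", "sepsis")] * 6 + [("facility", "malaria")] * 2
         + [("community", "malaria")] * 2)
props = ht_cod_proportions(cases, {"facility": 80, "community": 120})
print(props)
```

Equivalently, this is a direct standardization: the within-subgroup CoD proportions are averaged with weights proportional to each subgroup's share of all deaths, which is why the choice of subgrouping variables drives the validity of the adjustment.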
Regression analysis for differentially misclassified correlated binary outcomes
Tang L , Lyles RH , King CC , Hogan JW , Lo Y . J R Stat Soc Ser C Appl Stat 2015 64 (3) 433-449 In many epidemiological and clinical studies, misclassification may arise in one or several variables, resulting in potentially invalid analytic results (e.g. estimates of odds ratios of interest) when no correction is made. Here we consider the situation in which correlated binary response variables are subject to misclassification. Building on prior work, we provide an approach to adjust for potentially complex differential misclassification via internal validation sampling applied at multiple study time points. We seek to estimate the parameters of a primary generalized linear mixed model that accounts for baseline and/or time-dependent covariates. The misclassification process is modelled via a second generalized linear model that captures variations in sensitivity and specificity parameters according to time and a set of subject-specific covariates that may or may not overlap with those in the primary model. Simulation studies demonstrate the precision and validity of the method proposed. An application is presented based on longitudinal assessments of bacterial vaginosis conducted in the 'HIV epidemiology research' study. |
Binary regression with differentially misclassified response and exposure variables
Tang L , Lyles RH , King CC , Celentano DD , Lo Y . Stat Med 2015 34 (9) 1605-20 Misclassification is a long-standing statistical problem in epidemiology. In many real studies, either an exposure or a response variable or both may be misclassified. As such, potential threats to the validity of the analytic results (e.g., estimates of odds ratios) that stem from misclassification are widely discussed in the literature. Much of the discussion has been restricted to the nondifferential case, in which misclassification rates for a particular variable are assumed not to depend on other variables. However, complex differential misclassification patterns are common in practice, as we illustrate here using bacterial vaginosis and Trichomoniasis data from the HIV Epidemiology Research Study (HERS). Therefore, clear illustrations of valid and accessible methods that deal with complex misclassification are still in high demand. We formulate a maximum likelihood (ML) framework that allows flexible modeling of misclassification in both the response and a key binary exposure variable, while adjusting for other covariates via logistic regression. The approach emphasizes the use of internal validation data in order to evaluate the underlying misclassification mechanisms. Data-driven simulations show that the proposed ML analysis outperforms less flexible approaches that fail to appropriately account for complex misclassification patterns. The value and validity of the method are further demonstrated through a comprehensive analysis of the HERS example data. |
Validation data-based adjustments for outcome misclassification in logistic regression: an illustration
Lyles RH , Tang L , Superak HM , King CC , Celentano DD , Lo Y , Sobel JD . Epidemiology 2011 22 (4) 589-97 Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and we illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling. |
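The simplest special case of outcome-misclassification adjustment, with sensitivity and specificity constant across subjects, has a closed form (the Rogan-Gladen correction); the paper's likelihood machinery is what relaxes that constancy. A minimal sketch with hypothetical rates:

```python
def corrected_prevalence(p_star: float, se: float, sp: float) -> float:
    """Back out true outcome prevalence from the observed, misclassified one
    when sensitivity (se) and specificity (sp) are constant across subjects.
    Observed: p* = se * p + (1 - sp) * (1 - p); solving for p gives the
    Rogan-Gladen correction. The likelihood approach in the paper
    generalizes this by letting se and sp depend on subject covariates."""
    return (p_star + sp - 1.0) / (se + sp - 1.0)

# Hypothetical rates: 30% observed positive with se = 0.90, sp = 0.95.
print(round(corrected_prevalence(0.30, 0.90, 0.95), 4))
```

Internal validation data play the role of estimating se and sp (and their covariate dependence) rather than assuming them known, which is what the likelihood-based analysis formalizes.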
Likelihood-based methods for estimating the association between a health outcome and left- or interval-censored longitudinal exposure data
Wannemuehler KA , Lyles RH , Manatunga AK , Terrell ML , Marcus M . Stat Med 2010 29 (16) 1661-1672 The Michigan Female Health Study (MFHS) conducted research focusing on reproductive health outcomes among women exposed to polybrominated biphenyls (PBBs). In the work presented here, the available longitudinal serum PBB exposure measurements are used to obtain predictions of PBB exposure for specific time points of interest via random effects models. In a two-stage approach, a prediction of the PBB exposure is obtained and then used in a second-stage health outcome model. This paper illustrates how a unified approach, which links the exposure and outcome in a joint model, provides an efficient adjustment for covariate measurement error. We compare the use of empirical Bayes predictions in the two-stage approach with results from a joint modeling approach, with and without an adjustment for left- and interval-censored data. The unified approach with the adjustment for left- and interval-censored data resulted in little bias and near-nominal confidence interval coverage in both the logistic and linear model setting. Published in 2010 by John Wiley & Sons, Ltd. |
A conditional expectation approach for associating ambient air pollutant exposures with health outcomes
Wannemuehler KA , Lyles RH , Waller LA , Hoekstra RM , Klein M , Tolbert P . Environmetrics 2009 20 (7) 877-894 Our research focuses on the association between exposure to an airborne pollutant and counts of emergency department visits attributed to a specific chronic illness. The motivating example for this analysis of measurement error in time series studies of air pollution and acute health outcomes was a study of emergency department visits from a 20-county Atlanta metropolitan statistical area from 1993-1999. The research presented illustrates the impact of using various surrogates for unobserved measurements of ambient concentrations at the ZIP-code level. Simulation results indicate that the impact of measurement error on the association between pollutant exposure and a health outcome can be substantial. The proposed conditional expectation approach provided reliable estimates of the association and exhibited good confidence interval coverage for a variety of magnitudes of association. Use of a single, centrally located monitor, the arithmetic average, the nearest-neighbor monitor, or the inverse-distance weighted average as a surrogate resulted in biased estimates and poor coverage rates, especially for larger magnitudes of the association. A focus on obtaining reasonable exposure measurements within clearly defined subregions is important when the pollutant exposure of interest exhibits strong spatial variability. |
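One of the surrogates the paper compares, the inverse-distance weighted average, can be sketched as follows; the monitor locations and concentrations are hypothetical.

```python
import math

def idw_average(monitors, centroid, power: float = 2.0) -> float:
    """Inverse-distance weighted surrogate for the ambient concentration
    at a ZIP-code centroid, one of the surrogates the paper evaluates."""
    num = den = 0.0
    for (x, y), conc in monitors:
        d = math.hypot(x - centroid[0], y - centroid[1])  # assumes d > 0
        w = 1.0 / d ** power
        num += w * conc
        den += w
    return num / den

# Hypothetical monitors (coordinates in km, concentrations in ppb):
monitors = [((0.0, 0.0), 40.0), ((10.0, 0.0), 20.0)]
print(round(idw_average(monitors, (2.0, 0.0)), 2))
```

Surrogates like this ignore the uncertainty in the unobserved local concentration, which is the measurement-error problem the conditional expectation approach is designed to address.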
- Page last reviewed: Feb 1, 2024
- Page last updated: Sep 23, 2024
- Powered by CDC PHGKB Infrastructure