Last data update: Oct 07, 2024. (Total: 47845 publications since 2009)
Records 1-10 (of 10 Records) |
Query Trace: Schenker N[original query] |
---|
Small area estimation of cancer risk factors and screening behaviors in US counties by combining two large national health surveys
Liu B , Parsons V , Feuer EJ , Pan Q , Town M , Raghunathan TE , Schenker N , Xie D . Prev Chronic Dis 2019 16 E119 BACKGROUND: National health surveys, such as the National Health Interview Survey (NHIS) and the Behavioral Risk Factor Surveillance System (BRFSS), collect data on cancer screening and smoking-related measures in the US noninstitutionalized population. These surveys are designed to produce reliable estimates at the national and state levels. However, county-level data are often needed for cancer surveillance and related research. METHODS: To use the large sample sizes of BRFSS and the high response rates and better coverage of NHIS, we applied multilevel models that combined information from both surveys. We also used relevant sources such as census and administrative records. By using these methods, we generated estimates for several cancer risk factors and screening behaviors that are more precise than design-based estimates. RESULTS: We produced reliable, modeled estimates for 11 outcomes related to smoking and to screening for female breast cancer, cervical cancer, and colorectal cancer. The estimates were produced for 3,112 counties in the United States for the data period from 2008 through 2010. CONCLUSION: The modeled estimates corrected for potential noncoverage bias and nonresponse bias in the BRFSS and reduced the variability in NHIS estimates that is attributable to small sample size. The small area estimates produced in this study can serve as a useful resource to the cancer surveillance community. |
An American Thoracic Society/National Heart, Lung, and Blood Institute Workshop Report: Addressing respiratory health equality in the United States
Celedon JC , Burchard EG , Schraufnagel D , Castillo-Salgado C , Schenker M , Balmes J , Neptune E , Cummings KJ , Holguin F , Riekert KA , Wisnivesky JP , Garcia JGN , Roman J , Kittles R , Ortega VE , Redline S , Mathias R , Thomas A , Samet J , Ford JG . Ann Am Thorac Soc 2017 14 (5) 814-826 Health disparities related to race, ethnicity, and socioeconomic status persist and are commonly encountered by practitioners of pediatric and adult pulmonary, critical care, and sleep medicine in the United States. To address such disparities and thus progress toward equality in respiratory health, the American Thoracic Society and the National Heart, Lung, and Blood Institute convened a workshop in May of 2015. The workshop participants addressed health disparities by focusing on six topics, each of which concluded with a panel discussion that proposed recommendations for research on racial, ethnic, and socioeconomic disparities in pulmonary, critical care, and sleep medicine. Such recommendations address best practices to advance research on respiratory health disparities (e.g., characterize broad ethnic groups into subgroups known to differ with regard to a disease of interest), risk factors for respiratory health disparities (e.g., study the impact of new tobacco or nicotine products on respiratory diseases in minority populations), addressing equity in access to healthcare and quality of care (e.g., conduct longitudinal studies of the impact of the Affordable Care Act on respiratory and sleep disorders), the impact of personalized medicine on disparities research (e.g., implement large studies of pharmacogenetics in minority populations), improving design and methodology for research studies in respiratory health disparities (e.g., use study designs that reduce participants' burden and foster trust by engaging participants as decision-makers), and achieving equity in the pulmonary, critical care, and sleep medicine workforce (e.g., develop and maintain robust mentoring programs for junior faculty, including local and external mentors). Addressing these research needs should advance efforts to reduce, and potentially eliminate, respiratory, sleep, and critical care disparities in the United States. |
Multiple imputation of completely missing repeated measures data within persons from a complex sample: application to accelerometer data in the National Health and Nutrition Examination Survey
Liu B , Yu M , Graubard BI , Troiano RP , Schenker N . Stat Med 2016 35 (28) 5170-5188 The Physical Activity Monitor component was introduced into the 2003-2004 National Health and Nutrition Examination Survey (NHANES) to collect objective information on physical activity including both movement intensity counts and ambulatory steps. Because of an error in the accelerometer device initialization process, the steps data were missing for all participants in several primary sampling units, typically a single county or group of contiguous counties, who had intensity count data from their accelerometers. To avoid potential bias and loss in efficiency in estimation and inference involving the steps data, we considered methods to accurately impute the missing values for steps collected in the 2003-2004 NHANES. The objective was to come up with an efficient imputation method that minimized model-based assumptions. We adopted a multiple imputation approach based on additive regression, bootstrapping and predictive mean matching methods. This method fits alternating conditional expectation (ace) models, which use an automated procedure to estimate optimal transformations for both the predictor and response variables. This paper describes the approaches used in this imputation and evaluates the methods by comparing the distributions of the original and the imputed data. A simulation study using the observed data is also conducted as part of the model diagnostics. Finally, some real data analyses are performed to compare the before and after imputation results. |
A note on the effect of data clustering on the multiple-imputation variance estimator: a theoretical addendum to the Lewis et al. article in JOS 2014
He Y , Shimizu I , Schappert S , Xu J , Beresovsky V , Khan D , Valverde R , Schenker N . J Off Stat 2016 32 (1) 147-164 Multiple imputation is a popular approach to handling missing data. Although it was originally motivated by survey nonresponse problems, it has been readily applied to other data settings. However, its general behavior still remains unclear when applied to survey data with complex sample designs, including clustering. Recently, Lewis et al. (2014) compared single- and multiple-imputation analyses for certain incomplete variables in the 2008 National Ambulatory Medical Care Survey, which has a nationally representative, multistage, and clustered sampling design. Their study results suggested that the increase of the variance estimate due to multiple imputation compared with single imputation largely disappears for estimates with large design effects. We complement their empirical research by providing some theoretical reasoning. We consider data sampled from an equally weighted, single-stage cluster design and characterize the process using a balanced, one-way normal random-effects model. Assuming that the missingness is completely at random, we derive analytic expressions for the within- and between-multiple-imputation variance estimators for the mean estimator, and thus conveniently reveal the impact of design effects on these variance estimators. We propose approximations for the fraction of missing information in clustered samples, extending previous results for simple random samples. We discuss some generalizations of this research and its practical implications for data release by statistical agencies. © Statistics Sweden. |
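The within- and between-imputation variance estimators named in this abstract are the two ingredients of the standard multiple-imputation combining rules (Rubin's rules). The sketch below shows only those standard rules, not the clustered-design expressions derived in the paper; the function name `pool_rubin` and the input numbers are hypothetical, and the fraction of missing information uses the simple large-sample form.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m completed-data results with Rubin's rules.

    estimates: point estimate from each imputed data set
    variances: estimated variance of that estimate from each imputed data set
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between

    # Simple large-sample approximation to the fraction of missing information.
    fmi = (1.0 + 1.0 / m) * between / total

    return {
        "estimate": q_bar,
        "within": within,
        "between": between,
        "total": total,
        "fmi": fmi,
    }


if __name__ == "__main__":
    # Hypothetical results from m = 3 imputed data sets.
    print(pool_rubin([10.0, 12.0, 11.0], [4.0, 4.0, 4.0]))
```

When the within-imputation variance is large relative to the between-imputation variance, the multiple-imputation total differs little from a single-imputation variance, which is the pattern the abstract reports for estimates with large design effects.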
Identifying implausible gestational ages in preterm babies with Bayesian mixture models.
Zhang G , Schenker N , Parker JD , Liao D . Stat Med 2012 32 (12) 2097-113 Infant birth weight and gestational age are two important variables in obstetric research. The primary measure of gestational age used in US birth data is based on a mother's recall of her last menstrual period, which has been shown to introduce random or systematic errors. To mitigate some of those errors, Oja et al., Platt et al., and Tentoni et al. estimated the probabilities of gestational ages being misreported under the assumption that the distribution of infant birth weights for a true gestational age is approximately Gaussian. From this assumption, Oja et al. fitted a three-component mixture model, and Tentoni et al. and Platt et al. fitted two-component mixture models. We build on their methods and develop a Bayesian mixture model. We then extend our methods using reversible jump Markov chain Monte Carlo to incorporate the uncertainty in the number of components in the model. We conduct simulation studies and apply our methods to singleton births with reported gestational ages of 23-32 weeks using 2001-2008 US birth data. Results show that a three-component mixture model fits the birth data better for gestational ages reported as 25 weeks or less; and a two-component mixture model fits better for the higher gestational ages. Under the assumption that our Bayesian mixture models are appropriate for US birth data, our research provides useful statistical tools to identify records with implausible gestational ages, and the techniques can be used in part of a multiple-imputation procedure for missing and implausible gestational ages. (Published 2012. This article is a US Government work and is in the public domain in the USA.) |
Estimating standard errors for life expectancies based on complex survey data with mortality follow-up: a case study using the National Health Interview Survey Linked Mortality Files
Schenker N , Parsons VL , Lochner KA , Wheatcroft G , Pamuk ER . Stat Med 2011 30 (11) 1302-11 Life expectancy is an important measure for health research and policymaking. Linking individual survey records to mortality data can overcome limitations in vital statistics data used to examine differential mortality by permitting the construction of death rates based on information collected from respondents at the time of interview and facilitating estimation of life expectancies for subgroups of interest. However, use of complex survey data linked to mortality data can complicate the estimation of standard errors. This paper presents a case study of approaches to variance estimation for life expectancies based on life tables, using the National Health Interview Survey Linked Mortality Files. The approaches considered include application of Chiang's traditional method, which is straightforward but does not account for the complex design features of the data; balanced repeated replication (BRR), which is more complicated but accounts more fully for the design features; and compromise, 'hybrid' approaches, which can be less difficult to implement than BRR but still account partially for the design features. Two tentative conclusions are drawn. First, it is important to account for the effects of the complex sample design, at least within life-table age intervals. Second, accounting for the effects within age intervals but not across age intervals, as is done by the hybrid methods, can yield reasonably accurate estimates of standard errors, especially for subgroups of interest with more homogeneous characteristics among their members. Published in 2011 by John Wiley & Sons, Ltd. |
Multiple imputation of missing dual-energy X-ray absorptiometry data in the National Health and Nutrition Examination Survey
Schenker N , Borrud LG , Burt VL , Curtin LR , Flegal KM , Hughes J , Johnson CL , Looker AC , Mirel L . Stat Med 2010 30 (3) 260-76 In 1999, dual-energy x-ray absorptiometry (DXA) scans were added to the National Health and Nutrition Examination Survey (NHANES) to provide information on soft tissue composition and bone mineral content. However, in 1999-2004, DXA data were missing in whole or in part for about 21 per cent of the NHANES participants eligible for the DXA examination; and the missingness is associated with important characteristics such as body mass index and age. To handle this missing-data problem, multiple imputation of the missing DXA data was performed. Several features made the project interesting and challenging statistically, including the relationship between missingness on the DXA measures and the values of other variables; the highly multivariate nature of the variables being imputed; the need to transform the DXA variables during the imputation process; the desire to use a large number of non-DXA predictors, many of which had small amounts of missing data themselves, in the imputation models; the use of lower bounds in the imputation procedure; and relationships between the DXA variables and other variables, which helped both in creating and evaluating the imputations. This paper describes the imputation models, methods, and evaluations for this publicly available data resource and demonstrates properties of the imputations via examples of analyses of the data. The analyses suggest that imputation helps to correct biases that occur in estimates based on the data without imputation, and that it helps to increase the precision of estimates as well. Moreover, multiple imputation usually yields larger estimated standard errors than those obtained with single imputation. Published in 2010 by John Wiley & Sons, Ltd. |
State-based estimates of mammography screening rates based on information from two health surveys
Davis WW , Parsons VL , Xie D , Schenker N , Town M , Raghunathan TE , Feuer EJ . Public Health Rep 2010 125 (4) 567-78 OBJECTIVES: We compared national and state-based estimates for the prevalence of mammography screening from the National Health Interview Survey (NHIS), the Behavioral Risk Factor Surveillance System (BRFSS), and a model-based approach that combines information from the two surveys. METHODS: At the state and national levels, we compared the three estimates of prevalence for two time periods (1997-1999 and 2000-2003) and the estimated difference between the periods. We included state-level covariates in the model-based approach through principal components. RESULTS: The national mammography screening prevalence estimate based on the BRFSS was substantially larger than the NHIS estimate for both time periods. This difference may have been due to nonresponse and noncoverage biases, response mode (telephone vs. in-person) differences, or other factors. However, the estimated change between the two periods was similar for the two surveys. Consistent with the model assumptions, the model-based estimates were more similar to the NHIS estimates than to the BRFSS prevalence estimates. The state-level covariates (through the principal components) were shown to be related to the mammography prevalence with the expected positive relationship for socioeconomic status and urbanicity. In addition, several principal components were significantly related to the difference between NHIS and BRFSS telephone prevalence estimates. CONCLUSIONS: Model-based estimates, based on information from the two surveys, are useful tools in representing combined information about mammography prevalence estimates from the two surveys. The model-based approach adjusts for the possible nonresponse and noncoverage biases of the telephone survey while using the large BRFSS state sample size to increase precision. |
The use of covariates to identify records with implausible gestational ages using the birthweight distribution
Parker JD , Liao D , Schenker N , Branum A . Paediatr Perinat Epidemiol 2010 24 (5) 424-32 The objective of this study was to evaluate the usefulness of covariates in identifying birth records with implausible values of gestational age. Birthweight distributions for births with early reported gestational ages are markedly bimodal, suggesting a mixture of two distributions. Most births form a normal-shaped left-hand (primary) distribution and a smaller number form the right-hand (secondary) distribution. The births in the secondary distribution are thought to have gestational age mistakenly reported. Prior work has found that births in the secondary distribution are at higher risk of poor outcomes than those in the primary distribution. Using 2002 US Natality data for gestational ages 26-35 weeks, we fit normal mixture models to birthweight with and without covariates (maternal race, education, parity, age, region of the country, prenatal care initiation) by reported gestational age. Additional models were stratified by infant sex. This approach allowed for the relationship between the covariates and birthweight to differ between the components. Mixture models fit reasonably well for reported gestational ages <33 weeks, but not for later weeks. Counter to the hypothesis, results were similar for models with and without covariates or stratification or both, although stratified models without covariates predicted slightly more girls and slightly fewer boys in the secondary distribution than did the corresponding unstratified models. For reported gestational ages <33 weeks, predictions from the four sets of models were highly correlated and predictions were similar for subgroups defined by the clinical estimates of gestational age and other covariates. For births with reported gestational ages of 29 or more weeks, the proportion in the secondary distribution exceeded 30%, although this varied by maternal characteristics. The use of covariates and stratification complicated model fitting without materially improving identification of implausible gestational age values, supporting inferences from prior studies using data 'cleaned' without consideration of maternal or infant characteristics. |
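The covariate-free version of the two-component normal mixture described in this abstract can be sketched as follows. This is an illustration only: the abstract does not say how the models were fitted, so the EM algorithm here is an assumption, as are the function names and the simulated birthweights (component means, standard deviations, and mixing proportion are made-up values, not results from the study).

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def weighted_mean_sd(values, weights):
    """Weighted mean and standard deviation."""
    total = sum(weights)
    mu = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mu) ** 2 for v, w in zip(values, weights)) / total
    return mu, math.sqrt(var)


def fit_two_normal_mixture(x, n_iter=200):
    """EM for a two-component normal mixture without covariates.

    Returns (mu1, sd1), (mu2, sd2), the secondary-component proportion, and
    each record's posterior probability of belonging to the secondary
    (right-hand) component.
    """
    n = len(x)
    median = sorted(x)[n // 2]
    # Crude start: lower half is "primary", upper half is "secondary".
    resp = [1.0 if v > median else 0.0 for v in x]
    for _ in range(n_iter):
        # M-step: update component parameters from current responsibilities.
        mu1, sd1 = weighted_mean_sd(x, [1.0 - r for r in resp])
        mu2, sd2 = weighted_mean_sd(x, resp)
        p2 = sum(resp) / n
        # E-step: posterior probability of the secondary component.
        resp = []
        for v in x:
            d1 = (1.0 - p2) * normal_pdf(v, mu1, sd1)
            d2 = p2 * normal_pdf(v, mu2, sd2)
            resp.append(d2 / (d1 + d2))
    return (mu1, sd1), (mu2, sd2), p2, resp


def simulate_birthweights(n_primary=800, n_secondary=200, seed=0):
    """Hypothetical birthweights (grams) for one reported gestational age."""
    rng = random.Random(seed)
    primary = [rng.gauss(1000.0, 200.0) for _ in range(n_primary)]
    secondary = [rng.gauss(3200.0, 450.0) for _ in range(n_secondary)]
    return primary + secondary


if __name__ == "__main__":
    weights = simulate_birthweights()
    comp1, comp2, p2, resp = fit_two_normal_mixture(weights)
    flagged = sum(r > 0.5 for r in resp)
    print(comp1, comp2, p2, flagged)
```

Records with a high posterior probability of the secondary component are the ones such a model would flag as having a possibly misreported gestational age; the 0.5 cutoff in the usage lines is an arbitrary choice for the illustration.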
Improving on analyses of self-reported data in a large-scale health survey by using information from an examination-based survey
Schenker N , Raghunathan TE , Bondarenko I . Stat Med 2009 29 (5) 533-45 Common data sources for assessing the health of a population of interest include large-scale surveys based on interviews that often pose questions requiring a self-report, such as, 'Has a doctor or other health professional ever told you that you have [health condition of interest]?' or 'What is your height/weight?' Answers to such questions might not always reflect the true prevalences of health conditions (for example, if a respondent misreports height/weight or does not have access to a doctor or other health professional). Such 'measurement error' in health data could affect inferences about measures of health and health disparities. Drawing on two surveys conducted by the National Center for Health Statistics, this paper describes an imputation-based strategy for using clinical information from an examination-based health survey to improve on analyses of self-reported data in a larger interview-based health survey. Models predicting clinical values from self-reported values and covariates are fitted to data from the National Health and Nutrition Examination Survey (NHANES), which asks self-report questions during an interview component and also obtains clinical measurements during a physical examination component. The fitted models are used to multiply impute clinical values for the National Health Interview Survey (NHIS), a larger survey that obtains data solely via interviews. Illustrations involving hypertension, diabetes, and obesity suggest that estimates of health measures based on the multiply imputed clinical values are different from those based on the NHIS self-reported data alone and have smaller estimated standard errors than those based solely on the NHANES clinical data. The paper discusses the relationship of the methods used in the study to two-phase/two-stage/validation sampling and estimation, along with limitations, practical considerations, and areas for future research. Published in 2009 by John Wiley & Sons, Ltd. |
- Page last reviewed: Feb 1, 2024
- Page last updated: Oct 07, 2024