Last data update: Dec 02, 2024. (Total: 48272 publications since 2009)
Records 1-10 (of 10 Records) |
Query Trace: Mirel LB[original query] |
Proposed framework for adopting privacy-preserving record linkage for public health action
Pathak A , Serrer L , Bhalla M , King R , Mirel LB , Srinivasan A , Baier P , Zapata D , David-Ferdon C , Luxenberg S , Gundlapalli AV . J Public Health Manag Pract 2024 OBJECTIVES: To propose a framework for adoption of privacy-preserving record linkage (PPRL) for public health applications. METHODS: Twelve interviews with subject matter experts (SMEs) were conducted virtually and coded using an inductive approach. A collaborative session was conducted with SMEs to identify key steps in the PPRL project lifecycle which informed development of a PPRL implementation checklist. RESULTS: This framework has 2 decision-making levels: the organization level and the project or program level. Organization-level considerations include PPRL governance, the optimal choice among approved PPRL solutions, the need for longitudinal linkages, the potential issue of vendor lock-in, and costs. Program-level considerations include characteristics of the PPRL use case, linkage quality and accuracy, data privacy and use, security thresholds, compatibility with data owners' data architecture, and trade-offs between open-source and commercial PPRL solutions. A PPRL implementation checklist was developed to guide public health practitioners considering PPRL for data linkage. CONCLUSIONS: The framework may be considered by public health entities to guide adoption and implementation of PPRL in public health research and surveillance. Public health experts may refer to this framework and the PPRL implementation checklist when determining the appropriateness of PPRL for specific use cases and implementation planning. |
Privacy preserving record linkage for public health action: opportunities and challenges
Pathak A , Serrer L , Zapata D , King R , Mirel LB , Sukalac T , Srinivasan A , Baier P , Bhalla M , David-Ferdon C , Luxenberg S , Gundlapalli AV . J Am Med Inform Assoc 2024 OBJECTIVES: To understand the landscape of privacy preserving record linkage (PPRL) applications in public health, assess estimates of PPRL accuracy and privacy, and evaluate factors for PPRL adoption. MATERIALS AND METHODS: A literature scan examined the accuracy, data privacy, and scalability of PPRL in public health. Twelve interviews with subject matter experts were conducted and coded using an inductive approach to identify factors related to PPRL adoption. RESULTS: PPRL has a high level of linkage quality and accuracy. PPRL linkage quality was comparable to that of clear text linkage methods (requiring direct personally identifiable information [PII]) for linkage across various settings and research questions. Accuracy of PPRL depended on several components, such as PPRL technique, and the proportion of missingness and errors in underlying data. Strategies to increase adoption include increasing understanding of PPRL, improving data owner buy-in, establishing governance structure and oversight, and developing a public health implementation strategy for PPRL. DISCUSSION: PPRL protects privacy by eliminating the need to share PII for linkage, but the accuracy and linkage quality depend on factors including the choice of PPRL technique and specific PII used to create encrypted identifiers. Large-scale implementations of PPRL linking millions of observations, including PCORnet, the National Institutes of Health N3C, and the Centers for Disease Control and Prevention COVID-19 project, have demonstrated the scalability of PPRL for public health applications. CONCLUSIONS: Applications of PPRL in public health have demonstrated their value for the public health community. Although gaps must be addressed before wide implementation, PPRL is a promising solution to data linkage challenges faced by the public health ecosystem. |
Evaluating data quality for blended data using a data quality framework
Parker JD , Mirel LB , Lee P , Mintz R , Tungate A , Vaidyanathan A . Stat J IAOS 2024 40 (1) 125-136 In 2020 the U.S. Federal Committee on Statistical Methodology (FCSM) released 'A Framework for Data Quality', organized by 11 dimensions of data quality grouped among three domains of quality (utility, objectivity, integrity). This paper addresses the use of the FCSM Framework for data quality assessments of blended data. The FCSM Framework applies to all types of data, however best practices for implementation have not been documented. We applied the FCSM Framework for three health-research related case studies. For each case study, assessments of data quality dimensions were performed to identify threats to quality, possible mitigations of those threats, and trade-offs among them. From these assessments the authors concluded: 1) data quality assessments are more complex in practice than anticipated and expert guidance and documentation are important; 2) each dimension may not be equally important for different data uses; 3) data quality assessments can be subjective and having a quantitative tool could help explain the results, however, quantitative assessments may be closely tied to the intended use of the dataset; 4) there are common trade-offs and mitigations for some threats to quality among dimensions. This paper is one of the first to apply the FCSM Framework to specific use-cases and illustrates a process for similar data uses. © 2024 - IOS Press. All rights reserved. |
A methodological assessment of privacy preserving record linkage using survey and administrative data
Mirel LB , Resnick DM , Aram J , Cox CS . Stat J IAOS 2022 38 (2) 413-421 BACKGROUND: The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. However, prior to implementing PPRL techniques it is important to understand their effect on data quality. METHODS: Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed. RESULTS: The match rates for all approaches were similar, 5.1% for the gold standard, 5.4% for the initial PPRL and 5.0% for the refined PPRL approach. Precision ranged from 93.8% to 98.9% and recall ranged from 98.7% to 97.8%, depending on the selection of tokens from PPRL. The impact of PPRL on secondary data analysis was minimal. DISCUSSION: The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI. CONCLUSION: The results from this study are encouraging for first steps for a statistical agency in the implementation of PPRL approaches, however, future research is still needed. © 2022-IOS Press. All rights reserved. |
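The precision and recall evaluation described in the abstract above can be illustrated with a minimal sketch. It assumes links are represented as (survey_id, death_index_id) pairs and that the gold-standard set is complete; the function name, the identifiers, and the toy data are hypothetical and are not taken from the paper.

```python
def precision_recall(candidate_links, gold_links):
    """Compare a set of candidate links against gold-standard links.

    Both arguments are iterables of (left_id, right_id) pairs.
    Precision = share of candidate links that are in the gold standard.
    Recall    = share of gold-standard links that the candidate found.
    """
    candidate = set(candidate_links)
    gold = set(gold_links)
    true_positives = len(candidate & gold)
    precision = true_positives / len(candidate) if candidate else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: four gold-standard links, three PPRL links,
# one of which points at the wrong record.
gold = {("s1", "d1"), ("s2", "d2"), ("s3", "d3"), ("s4", "d4")}
pprl = {("s1", "d1"), ("s2", "d2"), ("s3", "d9")}

precision, recall = precision_recall(pprl, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Treating the established method as the gold standard means any error in that method is counted against PPRL, so these figures measure agreement with the reference rather than absolute truth.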
Using supervised machine learning to identify efficient blocking schemes for record linkage.
Campbell SR , Resnick DM , Cox CS , Mirel LB . Stat J IAOS 2021 37 (2) 673-680 Record linkage enables survey data to be integrated with other data sources, expanding the analytic potential of both sources. However, depending on the number of records being linked, the processing time can be prohibitive. This paper describes a case study using a supervised machine learning algorithm, known as the Sequential Coverage Algorithm (SCA). The SCA was used to develop the join strategy for two data sources, the National Center for Health Statistics' (NCHS) 2016 National Hospital Care Survey (NHCS) and the Center for Medicare & Medicaid Services (CMS) Enrollment Database (EDB), during record linkage. Due to the size of the CMS data, common record joining methods (i.e. blocking) were used to reduce the number of pairs that need to be evaluated to identify the vast majority of matches. NCHS conducted a case study examining how the SCA improved the efficiency of blocking. This paper describes how the SCA was used to design the blocking used in this linkage. © 2021-IOS Press. All rights reserved. |
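To show why blocking reduces processing time, here is a minimal sketch of blocking on a single key. It does not implement the Sequential Coverage Algorithm from the abstract above; the record fields, the choice of birth year as the blocking key, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def candidate_pairs(left, right, block_key):
    """Yield only the record pairs that share the same blocking key."""
    index = defaultdict(list)
    for record in right:
        index[block_key(record)].append(record)
    for record in left:
        # Records are compared only within their own block.
        for other in index.get(block_key(record), []):
            yield record, other


# Hypothetical toy records; real linkages would use many more fields.
left = [
    {"id": "a1", "birth_year": 1950},
    {"id": "a2", "birth_year": 1962},
    {"id": "a3", "birth_year": 1950},
]
right = [
    {"id": "b1", "birth_year": 1950},
    {"id": "b2", "birth_year": 1971},
    {"id": "b3", "birth_year": 1962},
    {"id": "b4", "birth_year": 1950},
]

all_pairs = len(left) * len(right)
blocked = list(candidate_pairs(left, right, lambda r: r["birth_year"]))
print(f"pairs without blocking: {all_pairs}, with blocking: {len(blocked)}")
```

The trade-off is that a true match whose blocking key disagrees (for example, a mistyped birth year) is never compared, which is why the choice of blocking scheme matters.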
Using synthetic data to replace linkage derived elements: A case study
Resnick DM , Cox CS , Mirel LB . Health Serv Outcomes Res Methodol 2021 21 389-406 While record linkage can expand analyses performable from survey microdata, it also incurs greater risk of privacy-encroaching disclosure. One way to mitigate this risk is to replace some of the information added through linkage with synthetic data elements. This paper describes a case study using the National Hospital Care Survey (NHCS), which collects patient records under a pledge of protecting patient privacy from a sample of U.S. hospitals for statistical analysis purposes. The NHCS data were linked to the National Death Index (NDI) to enhance the survey with mortality information. The added information from NDI linkage enables survival analyses related to hospitalization, but as the death information includes dates of death and detailed causes of death, having it joined with the patient records increases the risk of patient re-identification (albeit only for deceased persons). For this reason, an approach was tested to develop synthetic data that uses models from survival analysis to replace vital status and actual dates-of-death with synthetic values and uses classification tree analysis to replace actual causes of death with synthesized causes of death. The degree to which analyses performed on the synthetic data replicate results from analysis on the actual data is measured by comparing survival analysis parameter estimates from both data files. Because synthetic data only have value to the degree that they can be used to produce statistical estimates that are like those based on the actual data, this evaluation is an essential first step in assessing the potential utility of synthetic mortality data. |
Using linked survey paradata to improve sampling strategies in the Medical Expenditure Panel Survey
Mirel LB , Chowdhury SR . J Off Stat 2017 33 (2) 367-383 Using paradata from a prior survey that is linked to a new survey can help a survey organization develop more effective sampling strategies. One example of this type of linkage or subsampling is between the National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS). MEPS is a nationally representative sample of the U.S. civilian, noninstitutionalized population based on a complex multi-stage sample design. Each year a new sample is drawn as a subsample of households from the prior year’s NHIS. The main objective of this article is to examine how paradata from a prior survey can be used in developing a sampling scheme in a subsequent survey. A framework for optimal allocation of the sample in substrata formed for this purpose is presented and evaluated for the relative effectiveness of alternative substratification schemes. The framework is applied, using real MEPS data, to illustrate how utilizing paradata from the linked survey offers the possibility of making improvements to the sampling scheme for the subsequent survey. The improvements aim to reduce the data collection costs while maintaining or increasing effective responding sample sizes and response rates for a harder to reach population. |
The prevalence of using iodine-containing supplements is low among reproductive-age women, National Health and Nutrition Examination Survey 1999-2006
Gahche JJ , Bailey RL , Mirel LB , Dwyer JT . J Nutr 2013 143 (6) 872-7 During pregnancy, the iodine requirement rises to meet demands for neurological development and fetal growth. If these requirements are not met, irreversible pathological cognitive and behavioral changes to the fetus may ensue. This study estimated the prevalence of iodine-containing dietary supplement (DS) use and intakes of iodine from DSs among pregnant women and nonpregnant women of reproductive age (15-39 y) who were interviewed and examined in NHANES 1999-2006 (n = 6404). Although 77.5% of pregnant women reported taking one or more DSs in the past 30 d, only 22.3% consumed an iodine-containing supplement. Most pregnant women reported using one DS and reported taking this product daily. The vast majority of iodine-containing DSs reported by pregnant women claimed an iodine content of 150 mcg iodine/serving on the label. Pregnant women using at least one DS containing iodine had a mean daily iodine intake of 122 mcg/d from supplements; the median value was 144 mcg/d. Median urinary iodine concentrations (UICs) were similar for pregnant and nonpregnant women in the population aged 15-39 y. The median UIC was 148 mcg/L for pregnant women and 133 mcg/L for nonpregnant women. The WHO has established a cutoff for insufficient iodine intake at <150 mcg/L for pregnant women and <100 mcg/L for those who are not pregnant. This suggests that as a population, we may not be meeting adequate intakes of iodine for pregnant women. More research is needed on the iodine intakes of pregnant women and women of reproductive age on their total iodine intake from all sources, not just DSs. |
Serum soluble transferrin receptor concentrations in US preschool children and non-pregnant women of childbearing age from the National Health and Nutrition Examination Survey 2003-2010
Mei Z , Pfeiffer CM , Looker AC , Flores-Ayala RC , Lacher DA , Mirel LB , Grummer-Strawn LM . Clin Chim Acta 2012 413 1479-84 BACKGROUND: Serum soluble transferrin receptor (sTfR) is recommended as a sensitive and accurate measure of iron deficiency (ID) in populations when only a single indicator can be used. The lack of assay standardization and of representative data on the distribution of sTfR in at-risk populations currently limits its utility. METHODS: Using data from NHANES 2003-2010, we examined the distribution of sTfR and developed assay-specific cutoff values for defining elevated sTfR in 2 US populations groups: children aged 1-5 y (n=2820) and non-pregnant women aged 15-49 y (n=6575). RESULTS: On average, children had higher geometric mean sTfR concentrations (4.09mg/l; 95% CI: 4.04-4.14) than non-pregnant women (3.31mg/l; 95% CI: 3.26-3.35) (p<0.001). Among children, those aged 1-2 y (compared to those aged 3-5 y), boys (compared to girls), and non-Hispanic black (NHB) children (compared to non-Hispanic white (NHW) and Mexican-American (MA) children) had higher sTfR concentrations. Among non-pregnant women, adolescents (15-19 y) had higher sTfR concentrations than adults aged 20-34 y but not compared to adults aged 35-49 y; NHB women (compared to NHW and MA women) and multiparous women (compared to nulliparous women) had higher sTfR concentrations. The derived cutoff values (97.5th percentile in a defined healthy reference population) for defining elevated sTfR in the US were 6.00mg/l for children 1-5 y and 5.33mg/l for non-pregnant women 15-49 y. CONCLUSIONS: A different sTfR cutoff value may be needed in children and non-pregnant women to define ID. |
Levels of plasma trans-fatty acids in non-Hispanic white adults in the United States in 2000 and 2009
Vesper HW , Kuiper HC , Mirel LB , Johnson CL , Pirkle JL . JAMA 2012 307 (6) 562-3 Levels of trans-fatty acids (TFAs) in blood come from natural sources, such as milk, and industrial sources, such as partially hydrogenated vegetable oils. Dietary intake of TFAs increases low-density lipoprotein cholesterol (LDL-C) and has other adverse metabolic effects [1]. Changing to a diet low in TFAs may lower the LDL-C level and decrease the risk for cardiovascular disease. To assist consumers, the Food and Drug Administration amended its regulations in 2003 to require that TFA content be declared on the nutrition label of foods and dietary supplements [2]. Some community and state health departments have required restaurants to limit TFAs and reductions have been shown in supermarket and restaurant products. The public health impact of these changes on TFA blood levels in the population is unknown. A preliminary study was conducted to determine plasma concentrations of TFAs in a subset of non-Hispanic white adults in the National Health and Nutrition Examination Survey (NHANES) in 2000 and 2009. |
- Page last reviewed: Feb 1, 2024
- Page last updated: Dec 02, 2024