Monday, October 8, 2012
Present: Abby Crocker, Kairn Kelley, Ben Littenberg, Charlie MacLean, Connie van Eeghen
1. Start Up: Commentary on Atkins: The World is Fat. Interesting, but not novel.
2. Presentation: Ben: NHANES III exploration
a. NHANES III is a repeated, cross-sectional survey of the non-institutionalized, non-military U.S. population. The data sets and instructions on how to use them are on a CDC website. Ben has merged several data sets from this survey, including the adult, leptin, physical exam, and mortality (as recorded 18 years later) files, and shared the merged file with the group in Stata. The data include some children and, although some fields are missing, BMI (as an example) is present.
b. There are 33,000 subjects with 3,600 variables (some of which are administrative; some have been calculated by Ben). Note that the meaning of “missing” as a value depends on the field and needs to be interpreted based on that field and, possibly, related fields. Ben has calculated new variables to classify “missing” in some helpful ways.
c. There is an extensive list of fields describing the population (demographics).
d. There is a group of fields related to weights. Although this population represents the majority of the adult U.S. population, some subgroups were sampled disproportionately. The weights provide the multipliers to adjust for those differences in several ways: one multiplier is for a pseudo-stratum weight, one is for a pseudo-personal weight, and there are weights for specific surveys or groups of questions that were asked (e.g. allergy) and phases (years that the surveys were administered). The weights can be applied directly to individual field values (age, weight of person, lab values, etc.) so that the summarized values reflect the underlying population.
e. Ben opened a script file (.do) that provided examples of cleaning or generating data fields for queries related to cardiograms. The weight is selected according to the portion of the population being studied and is used in calculations by the “survey” commands. A similar set of commands sets up the data for time-to-event analysis (e.g. Kaplan-Meier).
f. We practiced exploring the data set using this script file, finding an apparent relationship between “cutting back on meals” and “mortality.” However, by adding other conceptually related variables (income class, age, sex) we found that mortality is better explained by age (oh, yeah) and income. Good exercise, and helpful preparation for similar future studies we may be able to do.
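As a rough illustration of points d and f, here is a minimal Python sketch (not Ben's .do file, and not real NHANES data). The cell counts, weights, and field names (age_group, cut_meals, died, weight) are all made up for the example. It applies a sample weight to each record, then compares the crude weighted mortality with the same estimate stratified by age.

```python
from collections import defaultdict

# Hypothetical toy cells, NOT NHANES values:
# (age_group, cut_meals, n_people, n_deaths, sample_weight)
CELLS = [
    ("young", False, 10, 1, 3000),
    ("young", True, 10, 1, 1000),
    ("old", False, 10, 5, 1000),
    ("old", True, 10, 5, 3000),
]


def expand(cells):
    """Turn cell counts into one record per respondent."""
    records = []
    for age_group, cut_meals, n, deaths, weight in cells:
        for i in range(n):
            records.append({
                "age_group": age_group,
                "cut_meals": cut_meals,
                "died": 1 if i < deaths else 0,
                "weight": weight,
            })
    return records


def weighted_mean(values, weights):
    """Mean of values where each record counts `weight` times."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


def mortality_by(records, keys):
    """Weighted proportion who died, within each combination of `keys`."""
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[k] for k in keys)].append(r)
    return {
        g: weighted_mean([r["died"] for r in rows], [r["weight"] for r in rows])
        for g, rows in groups.items()
    }


records = expand(CELLS)
crude = mortality_by(records, ["cut_meals"])
stratified = mortality_by(records, ["age_group", "cut_meals"])

print(crude)       # cut_meals=True is about 0.4, cut_meals=False about 0.2
print(stratified)  # within each age group the two rates are equal
```

In this toy data the crude difference disappears once age is taken into account, which is the pattern we saw in the exercise. The sketch gives point estimates only; valid standard errors also need the design variables (strata and sampling units), which is what the survey commands handle.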
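For the time-to-event setup mentioned in point e, the following is a small Python sketch of a Kaplan-Meier estimate that accepts optional sample weights. It is an illustration under stated assumptions, not the commands from the script file: the function name kaplan_meier and the follow-up times are hypothetical, and applying weights by simply scaling the deaths and the number at risk is one simple approach among several.

```python
def kaplan_meier(times, events, weights=None):
    """Return [(time, survival)] at each time where a death occurs.

    times   -- follow-up time for each subject
    events  -- 1 if the subject died at that time, 0 if censored
    weights -- optional sample weights (defaults to 1 per subject)
    """
    if weights is None:
        weights = [1.0] * len(times)
    data = sorted(zip(times, events, weights))
    at_risk = float(sum(weights))   # weighted number still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0.0
        leaving = 0.0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            _, event, w = data[i]
            leaving += w
            if event:
                deaths += w
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data: years observed, and whether death occurred.
years = [2, 3, 3, 5, 8]
died = [1, 1, 0, 1, 0]

for t, s in kaplan_meier(years, died):
    print(t, round(s, 3))   # survival steps: 0.8, 0.6, 0.3
```

Censored subjects (died = 0) lower the number at risk without lowering the survival estimate, which is the main thing the setup commands have to encode.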
3. Upcoming meetings:
a. Oct 11: Rodger and Connie: R03 resubmission draft
b. Oct 18: Christina Cruz, 3rd year FM resident with questionnaire for mild serotonin
c. Oct 25: Abby: update
d. Nov 1: Kairn update?
e. Future agenda to consider:
i. Kairn – review of draft article on IRR
ii. Ben: budgeting exercise for grant applications; NHANES – lower mortality among women taking birth control medications
iii. Rodger: Mixed methods article; article on Behavior’s Influence on Medical Conditions (unpublished); drug company funding. Also: discuss design for PCBH clinical and cost research. Also: Prezi demo.
iv. Amanda: presentation and interpretation of data in articles
Posted by Connie at 10/08/2012 09:15:00 AM