Present: Abby Crocker, Kairn Kelley, Amanda Kennedy, Rodger Kessler, Ben Littenberg, Charlie MacLean, Connie van Eeghen
Guest: Steve Kappel
1. Start Up: Introductions of mutually excited data hounds to each other. Some have the start of a research question (e.g. use of opiate medications); others have overlapping research interests and questions. Some also have an interest in using large data sets for education, QI interventions, and opportunities for prospective provider interventions.
2. Presentation: Steve Kappel: Understanding/using VHCURES: The Kingdom of Messy Data
a. We know the database is an excellent source for paid pharmaceutical claims from all payers. Some key considerations:
i. A claim is a small “chunk” of clinical info wrapped inside financial data: who received the service, the service, who paid for it, and some data about the patient.
ii. OnPoint is the vendor that assembles the data and tries to identify patients/providers consistently – they are good, not great, at this. The payers individually create the identifiers; this leads to a lot of variability among payers.
1. Refreshed quarterly and provided directly by OnPoint. Lag: paid claims are posted by the end of the quarter. In general, data are up to date as of 6 months prior to the request date. Right now, data are up to date for all of 2011 and the first quarter of 2012.
2. Provider names vary greatly; NPIs are pretty good, although there are many for organizations and individual providers. Should be clean in the next few months.
3. Patient data include date of birth; Charlie is requesting access through the IRB. Birth date, zip code, and gender generate very good matches everywhere except Burlington.
iii. Connecting claims across patients is good, not great. There are no actual names or SSNs; these fields are encrypted consistently, but the encrypted values will differ if the underlying values do not match exactly.
1. SSN is frequently missing; insurers are increasingly unwilling to use it (30% have no SSN).
2. Referring provider information is not carried into the claim.
3. Prescribing provider is available, but not the clinical reason for prescribing.
4. Database validation is needed: large-scale chart review is being planned using PRISM clinical data and FAHC claims data (electronic to electronic), which is a limited form of validation.
5. Babies and mothers should be linked through subscriber information, as well as related claims data.
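A minimal sketch of why linkage on encrypted fields is "good, not great," and of the birth date / zip code / gender fallback mentioned above. Everything here is an assumption for illustration: the actual encryption used for VHCURES was not described, so SHA-256 stands in for it, and the function names and sample values are hypothetical.

```python
import hashlib

def encrypt_field(value):
    """Stand-in for the vendor's field encryption (assumption: a SHA-256 hash)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def fallback_key(birth_date, zip_code, gender):
    """Composite match key from birth date, 5-digit zip code, and gender."""
    return (birth_date, zip_code.strip()[:5], gender.strip().upper())

# Identical input always yields the same encrypted value...
same = encrypt_field("123-45-6789") == encrypt_field("123-45-6789")
# ...but a formatting difference between payers yields a different one.
differs = encrypt_field("123-45-6789") == encrypt_field("123456789")

# The fallback key normalizes the fields before comparing (hypothetical values).
key_a = fallback_key("1950-03-14", "05602", "f")
key_b = fallback_key("1950-03-14", "05602-1234", "F ")
```

Here `same` is true and `differs` is false, while `key_a` and `key_b` match because the fields are normalized before comparison. Normalization is only possible on fields that arrive unencrypted, which is one reason the date of birth access request matters.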
b. Data history starts in 2007, for all claims from almost any insurer (85%), including TPAs (self-insured), Medicaid, and out-of-state payers for Vermont resident beneficiaries. Medicare is in the process of being included: they have released primary care medical home claims (not to be released to anyone else). Should be completely available in one year.
i. Medicare: 65 and over, disabled children, ALS, ESRD – these are absent.
ii. Dual eligibility: can be identified, but no Medicare claims.
iii. Non-Medicare: includes all covered expenses, except self-pay. Includes those covered by the deductible; does not include denied claims.
iv. Claims with very small dollar values are usually wrap-around (secondary) insurance coverage. Easy to flag the primary paid claim.
v. Claims with $0 value are those paid as part of deductibles.
vi. Claims with negative values are adjustments: complicated reworking of reversals and re-processing. These are separate transactions in BCBS; OnPoint bundles them together, which makes it hard to replicate data across time, as adjustments often occur in later quarters.
vii. Claims are also affected by what is covered: some diagnoses are paid more easily than others; this affects claims documentation.
1. Example: it is hard to find diabetes on a medical claim, because this doesn’t affect the reimbursement. But the existence of the diabetes diagnosis affects the medical claims generated for the patient. This diagnosis must be inferred from other patterns that are evident from claims data (meds, tests, and procedures).
2. Can be used for comparative analyses: patients that appear to have diabetes and those that don’t, with the resulting differences in utilization and cost.
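A small sketch of the replication problem with adjustments described above: the net paid amount for a claim depends on which quarter's data pull is used. The record layout, claim IDs, and amounts are hypothetical and are not the VHCURES file format.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical claim lines: (claim_id, posted_quarter, paid_amount).
claim_lines = [
    ("C1", "2011Q1", Decimal("120.00")),   # original payment
    ("C1", "2011Q3", Decimal("-120.00")),  # reversal posted in a later quarter
    ("C1", "2011Q3", Decimal("95.00")),    # re-processed payment
    ("C2", "2011Q2", Decimal("0.00")),     # paid as part of the deductible
    ("C3", "2011Q2", Decimal("40.00")),
]

def net_paid(lines, as_of_quarter):
    """Sum paid amounts per claim using only lines posted up to as_of_quarter."""
    totals = defaultdict(Decimal)
    for claim_id, quarter, amount in lines:
        # "YYYYQn" strings sort chronologically, so string comparison works.
        if quarter <= as_of_quarter:
            totals[claim_id] += amount
    return dict(totals)

early = net_paid(claim_lines, "2011Q2")  # pull before the adjustment posted
late = net_paid(claim_lines, "2011Q4")   # pull after the adjustment posted
```

With this sample data, claim C1 nets to 120.00 in the early pull and 95.00 in the later one, so an analysis rerun on refreshed data will not reproduce the earlier totals unless the as-of quarter is held fixed.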
c. Requests for data need to address these issues as “inclusion criteria,” with the additional requirement of a plan to link claims together.
i. Clean requests: $ spent for an easy-to-find diagnosis on claims
ii. Less clean: $ spent for a diagnosis recorded elsewhere
iii. Even less clean: providers connected to patients for a diagnosis
iv. Pharmaceuticals: can track the history of medication claims, although this is messy because a patient's insurance payer can change, and it excludes out-of-pocket expenditures.
1. Example: we can look for people with a pain-related problem (like hip replacement), remove the patients with previous long-term opiate use, and look forward to find subsequent use of opiates.
2. Another: we can look for presentation to the ED for musculoskeletal injury among patients not on narcotics in the previous 12 months; the question is “how often do people ‘get stuck’ on opiates from a cold start?” This is similar to studying the incidence (not the prevalence) of chronic pain managed by opiates.
v. Exclusions can be organized at the person level (not the claim level), in which markers from the claim identify the person (and all related claims) with that characteristic (e.g. diabetes identified by a specific medication). These are very explicit definitions (e.g. Boolean algorithms); the more the definition corresponds to the patient (rather than the claim), the cleaner.
vi. Through the eligibility file, which includes all subscribers and beneficiaries, it is possible to include patients in the insurer database who have not generated claims (there is also a separate provider database).
1. Every month of coverage is represented by a record for each patient in the eligibility database.
2. A break in the record indicates a change in coverage.
3. Markers for identifying changes: January and July of each year; milestone ages (65 and 26).
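The “cold start” example and the person-level exclusion idea above could be sketched as follows. This is only an illustration under stated assumptions: the record layout and person IDs are hypothetical, the 12-month lookback comes from the discussion, and the 90-day threshold for being “stuck” is a placeholder that was not defined at the meeting.

```python
from datetime import date, timedelta

# Hypothetical index events: person_id -> date of ED visit for the injury.
index_events = {
    "P1": date(2011, 6, 1),
    "P2": date(2011, 6, 1),
    "P3": date(2011, 6, 1),
}
# Hypothetical opiate fills taken from pharmacy claims: (person_id, fill_date).
opiate_fills = [
    ("P1", date(2011, 6, 2)),    # first fill right after the injury
    ("P1", date(2011, 10, 15)),  # still filling four months later
    ("P2", date(2011, 1, 10)),   # already on opiates before the injury
    ("P2", date(2011, 6, 3)),
    ("P3", date(2011, 6, 2)),    # single short course
]

LOOKBACK = timedelta(days=365)      # "previous 12 months"
PERSIST_AFTER = timedelta(days=90)  # assumed threshold for "stuck"

def cold_start_cohort(events, fills):
    """Return {person_id: persisted} for people with no opiate fill in the lookback."""
    fills_by_person = {}
    for pid, fill_date in fills:
        fills_by_person.setdefault(pid, []).append(fill_date)
    cohort = {}
    for pid, index_date in events.items():
        dates = fills_by_person.get(pid, [])
        # Person-level exclusion: one prior fill removes the person entirely.
        if any(index_date - LOOKBACK <= d < index_date for d in dates):
            continue
        cohort[pid] = any(d >= index_date + PERSIST_AFTER for d in dates)
    return cohort

cohort = cold_start_cohort(index_events, opiate_fills)
```

In this sample, P2 is excluded as a prior user, P1 is flagged as persisting, and P3 is not. A real definition would also need the eligibility file to confirm continuous coverage over the lookback window, since a gap in coverage looks the same as no fills.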
d. VHCURES studies have not been published yet; this makes for a good start to a FINER topic under any circumstances. Some caveats:
i. Cleaning the data will take a little more time. Good to start thinking about research questions now; requests could be planned as early as April 2013.
ii. IRB clearance is required
iii. Must bring a bag of cookies
e. A limited scope project to consider now: controlled substances prescribed by primary care providers could be used to look at new users of opiates (given that we don’t know how clean the patient MPI is). Next step: refresh the data set (a 97 second transaction).
i. Begin to analyze
ii. Run and compare with PRISM data – a source of validation, along with the FAHC warehouse
iii. Moms and babies: no methadone (given in clinics without a claim); covers all prescriptions; does not include medications provided during the hospital stay. However, if most moms are on Medicaid, DIVA might be a better source, or it would be good to compare the two as another method of validation.
f. Candidate questions:
i. Methodological: Can we link babies and moms?
1. Can we study their utilization?
ii. Methodological: Can we find hospitalizations and match them?
iii. Methodological: Can we find incidents leading to opiate use and track the natural history?
iv. Match to birth registry, DIVA, and DMV…
g. Next steps
i. Charlie to add everyone at CROW as key personnel on the study protocol with the IRB
ii. Charlie to get refreshed data for his data set from Steve soon
iii. CROW to work on this together – see below
h. Thank you Steve!
a. Feb 14: Abby: Breastfeeding manuscript (no Ben)
b. Feb 21: Kairn: F31 (no Amanda)
c. Feb 28: Rodger: PCORI (no Connie, no Kairn)
d. Mar 7: Connie: manuscript review (no Ben, no Kairn)
e. Mar 14: Charlie: VHCURES Opiate Data Mining (everyone will be here!)
f. Future agenda to consider:
i. Christina Cruz, 3rd year FM resident with questionnaire for mild serotonin withdrawal syndrome?
ii. Peter Callas or other faculty on multi-level modeling
iii. Charlie MacLean: demonstration of Tableau
Recorder: Connie van Eeghen