Modeling Concepts
Here are some notes and conversations about concept modeling that may help when confronting modeling issues.
Modeling diagnoses
Hamish wrote:
If I create a concept "malaria" (say, a boolean) and I also want to have "Malaria diagnosis date", how do I link these two items so that it is clear that they refer to the same item?
Also, as Darius and I discussed today, what if Carole, our epidemiologist friend, says she wants all the diagnoses for her data collection form to have true, false, unknown and "no data available"? Should we just create hundreds of coded concepts for this, or formalize it as a standard construct? If we formalize it, does that help us to handle such data efficiently in the same analysis and data extraction tools that normally take a boolean?
I would use DIAGNOSIS or PATIENT REPORTED DIAGNOSIS or similar generically and then record the diagnoses as answers (e.g. MALARIA) for most cases. That gives you the choice of recording the date of diagnosis into the obs date (probably simpler) or creating a DATE OF DIAGNOSIS observation to link dates to diagnoses through an obs_group_id.
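The two choices for recording the date can be sketched roughly as follows. This is a simplified in-memory illustration, not the actual obs table: the `Obs` class, its field names, and the `diagnosis_dates` helper are hypothetical, and `obs_group_id` is treated here as nothing more than a shared identifier.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Obs:
    """Hypothetical, stripped-down observation record."""
    obs_id: int
    concept: str
    obs_date: date
    value_coded: Optional[str] = None
    value_date: Optional[date] = None
    obs_group_id: Optional[int] = None


# Choice A: the obs date itself carries the date of diagnosis.
simple = [Obs(1, "DIAGNOSIS", date(2007, 3, 14), value_coded="MALARIA")]

# Choice B: a separate DATE OF DIAGNOSIS obs, linked by a shared obs_group_id.
visit = date(2007, 4, 2)
grouped = [
    Obs(11, "DIAGNOSIS", visit, value_coded="MALARIA", obs_group_id=10),
    Obs(12, "DATE OF DIAGNOSIS", visit,
        value_date=date(2007, 3, 14), obs_group_id=10),
]


def diagnosis_dates(observations):
    """Return (diagnosis, date) pairs, preferring an explicit grouped date."""
    explicit = {
        o.obs_group_id: o.value_date
        for o in observations
        if o.concept == "DATE OF DIAGNOSIS" and o.obs_group_id is not None
    }
    return [
        (o.value_coded, explicit.get(o.obs_group_id, o.obs_date))
        for o in observations
        if o.concept == "DIAGNOSIS"
    ]
```

Under these assumptions both recordings yield the same (MALARIA, 2007-03-14) pair; the grouped form only costs an extra observation and a lookup by group.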
If someone wants multiple choice answers for 1..n diagnoses, then you are collecting two observations in each case: a multiple choice answer and the diagnosis to which the answer applies. There are several ways to skin this.
1. Flat by diagnosis (add coded answers for each diagnosis)
MALARIA with coded answers TRUE, FALSE, UNKNOWN, NO DATA AVAILABLE
- PRO
- simple, MALARIA/etc could still be used as answers
- CON
- not scalable, ties you to one model of answers to diagnoses and these must be replicated/managed for *every* diagnosis in the system.
2. Fully abstract (one concept for diagnosis and a 2nd for answer to multiple choice question)
INQUIRED DIAGNOSIS as coded, answered by diagnosis
DIAGNOSIS STATUS as coded with coded answers TRUE, FALSE, UNKNOWN, NO DATA AVAILABLE
Tied together with obs_group_id
- PRO
- easily scalable
- CON
- needs tools/knowledge to convert back to one complex data point, diagnoses stored in a questionnaire-specific manner
3. Abstract, but use existing DIAGNOSIS concept
INQUIRED DIAGNOSIS as coded, answered by diagnosis
DIAGNOSIS STATUS as coded with coded answers FALSE, UNKNOWN, NO DATA AVAILABLE
Linked with obs_group_id
Store TRUE answers as DIAGNOSIS with coded answer MALARIA, ASTHMA, etc.
- PRO
- re-use of DIAGNOSIS, so you can still search DIAGNOSIS concept for all known diagnoses
- CON
- application must know to treat TRUE answers differently
4. Flat by answer (one concept per answer for questionnaire)
DIAGNOSIS
DIAGNOSIS DENIED
DIAGNOSIS STATUS UNKNOWN
DIAGNOSIS DATA NOT AVAILABLE
Each is coded and is answered with the diagnosis.
CAROLE1 QUESTIONNAIRE DIAGNOSES OPTIONS as a concept_set of the above concepts
- PRO
- scalable, could re-use DIAGNOSIS concept or have a separate concept for diagnosis = true on questionnaire
- CON
- requires search for multiple concepts (could be facilitated by concept_set)
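
To make the read-back cost of options 2 and 4 more concrete, here is a minimal Python sketch that converts each recording back into one data point per diagnosis. The tuples are stand-ins for obs rows, the function names are hypothetical, and the mapping from each option-4 concept to a status (for example DIAGNOSIS DENIED to FALSE) is an assumption made for illustration.

```python
from collections import defaultdict

# Option 2: (concept, coded answer, obs_group_id) rows.
option2_rows = [
    ("INQUIRED DIAGNOSIS", "MALARIA", 1),
    ("DIAGNOSIS STATUS", "TRUE", 1),
    ("INQUIRED DIAGNOSIS", "ASTHMA", 2),
    ("DIAGNOSIS STATUS", "UNKNOWN", 2),
]


def collapse_option2(rows):
    """Pair each inquired diagnosis with its status via obs_group_id."""
    groups = defaultdict(dict)
    for concept, answer, group_id in rows:
        groups[group_id][concept] = answer
    return {
        g["INQUIRED DIAGNOSIS"]: g["DIAGNOSIS STATUS"]
        for g in groups.values()
    }


# Option 4: one concept per answer, each answered with the diagnosis.
# Assumed mapping from the members of the concept set to a status.
CONCEPT_SET_STATUS = {
    "DIAGNOSIS": "TRUE",
    "DIAGNOSIS DENIED": "FALSE",
    "DIAGNOSIS STATUS UNKNOWN": "UNKNOWN",
    "DIAGNOSIS DATA NOT AVAILABLE": "NO DATA AVAILABLE",
}

option4_rows = [
    ("DIAGNOSIS", "MALARIA"),
    ("DIAGNOSIS STATUS UNKNOWN", "ASTHMA"),
]


def collapse_option4(rows):
    """Search every concept in the set and read the status off the concept."""
    return {
        answer: CONCEPT_SET_STATUS[concept]
        for concept, answer in rows
        if concept in CONCEPT_SET_STATUS
    }
```

In this sketch both functions produce the same diagnosis-to-status mapping. Option 2 needs the grouping step to reassemble each pair, while option 4 needs no grouping but has to know every concept in the set.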
