- |
- Discussion |
- Edit |
- History
MVP Concept Data Model Changes
The purpose of this document is to discuss the issue of how terminology is stored and maintained within OpenMRS and to propose some modifications. The current OpenMRS program does not properly address the “interface/reference†terminology model, nor does it save the actual description that is collected from the user interface. I think everyone agrees that is would be better if we could store and display the actual description or term selected by the user (whether or not it was the concept name or a synonym).
Definitions used in this paper:
DESCRIPTION is the "name" or text representation or label for a concept. It is just an identifying String. There can be many descriptions which describe a concept. It is also called a term.
DEFINITION is the textual definition for a concept. It is currently called Description in the OpenMRS database.
CONCEPT is an idea which is unambiguous and defined within the database and which may be recorded or viewed with different descriptions.
SYNONYMS are descriptions which in the current system are mapped to a concept and mean the same thing as the concept. They are not tracked independently.
WORDS are individual atoms of descriptions and are useful for indexing and finding the descriptions, but they are not involved in translating meaning.
The current system has Synonyms which are useful for assisting in searching for a concept, but these are not stored in the database. This is problematic for several reasons. The most important of which is for the purpose of display. We should display the same description which was chosen by the user. For example, if I chose "Lou Gehrig's Disease", I should not be shown "Amyotrophic Lateral Sclerosis" next time I view the information. Also, when translating concepts between users (such as language translations) we may also want to display a preferred patient-language term, or preferred clinician-language term.
There are two ways which we can now track these Descriptions. A fully-normalized approach would have a CONCEPT_DESCRIPTION table which would contain unique strings (the descriptions) and a CONCEPT_DESCRIPTION_MAP table would join these unique identifiers with the concept along with parameters such as language and location.
The CONCEPT_DESCRIPTION table would essentially contain text strings along with a unique identifier and creator information (modified date/user information could also be captured). The text can be stored in both “right case†and in mixed case format. This allows for different methods of display. Right case has all words which must be uppercased upper cased. When a term is selected in the form or through the website, both the CONCEPT_DESCRIPTION_MAP_ID and the CONCEPT_ID would be stored in the record.
A less technically pure method, but one which is a little simpler, eliminates the description table and combines the description with the mapping table. In this view, the description may still contain a unique identifier, but the descriptions are repeated over and over if the same description is mapped to a concept in more than one language or location. In this situation, the CONCEPT_DESCRIPTION_MAP_ID and CONCEPT_ID would still be stored in the record, but the advantage is that you can look up the actual textual description using the code by querying one table, and not having to use a view which combines the map table and the description table. My preference is to use the fully relational model as the MVP concept table will be quite large and we will be using many languages and this will lead to many duplicated descriptions in the mapping table.
For the examples in this paper, the following information is used for the CONCEPT_DESCRIPTION table:
In the new table architecture, the CONCEPT table would be essentially unchanged. For the example being used for this paper the following information is used (note that the names are kept in the concept_description table with maps that determine preferred language):
A CONCEPT_FLAG_MAP hangs off the mapping table and this provides a relational table to handle as many flags as we need. Originally, this would have three flags: PREF_LANGUAGE_FLAG, PREF_LOCALE_FLAG, and INDEX_ONLY_FLAG. The PREF_LANGUAGE_FLAG can be "1" for only one row for a given language, and the same applies to PREF_LOCALE_FLAG for any country. The INDEX_ONLY_FLAG is used for descriptions which include misspellings or other errors which can be used to create a word index, but should not be used for display purposes. In the future, this is where preferred patient term or preferred clinician term can be stored. The flag table is used by business logic described at the end of the paper. To make the system extensible, a mapping table is used to a FLAG table, rather than just adding columns to the CONCEPT_FLAG_MAP table. The FLAG table contains a lookup of names and definitions of the flags.
The following table denormalizes the mapping table, description table and flag table for example purposes:
The original CONCEPT_NAME table would be used only to capture the definition information and although these definitions can be in different languages, the purpose of this table is not to provide descriptions for searching or display. It probably should be renamed CONCEPT_DEFINITION and not include descriptions at all, only a definition.
How to Display a Concept ID that is stored in the patient record:
For the example tables provided above, we can define some display rules and demonstrate how the system might work. The display rules are the following:
1. Whenever displaying the description for a concept, always include the original stored description from the patient record (this is obtained by using the concept_description_map_id).
2. Display the preferred description for that concept (for the locale and language settings of the browser), followed by the original description in parentheses after the concept.
3. Do not display the original concept description if it is the same as the preferred concept description.
4. The preferred description is chosen based on the following additional rules:
a. Select all non-medication concept descriptions where both locale and language are the same, and where the PREF_LOCATION_FLAG=1.
b. If there are no matches for both location and language, match language only where PREF_LANGUAGE_FLAG=1.
c. For Medication concepts, select where both location and language are the same, and where the PREF_LOCATION_FLAG="1".
d. If there are not matches for both location and language, match location where the PREF_LOCATION_FLAG=1.
The following examples illustrate how the tables would work:
Rough Draft, Proposed Model Changes
Based on discussions over teleconference and our recent meeting in Cape Town, the following draft was developed:
Issues still remaining:
- higher level drug compounds will take advantage of this change, but specific compounds will not at this point. need to consider how the drug table can be included in this change
- mapping issues are incomplete, although it's our intent to map vocabularies to the concept and not the concept's name.
Comments from Andy: I've had a chance to look over the proposed concept tables in more detail. I have the following observations:
1) I think it would be good to have audit trail on the concept_names and mappings as well... a user_id changed and date/time changed would be good.
2) Concept_name_type is only present/absent. I had recommended that we have a value there so certain flags (such as patient term) could have a range of values (for example 0=never show to patient, 2=secondary term and 1=primary patient term.
3) We need to remember about directionality on the name-concept pair. For example, given a particular concept, we should be able to display the correct name based on flags (as described in the document). However, given a particular name selected during a search, we need to be able to choose one and only one concept. I think we began to discuss this in CT, but depending on whether the name selected is for convenience or not should determine which concept. In other words, I think that a name in a given locale should go to one an only one concept. (otherwise it is ambiguous and should be used for searching only, perhaps).
4) Concept_name_type has the description as an integer, probably needs to be text.
5) The concept_map table needs to be able to both describe whether a mapping is broader than/narrower than/same as... for example, but also needs to have a preferred flag (boolean). This would be in addition to the IS_TO_SOURCE column (also wouldn't it make sense to have this as an integer lookup for speed).
6) We would need audit trail mapping on the concept_map table as well in case people change the mapping (it should be tracked).
7) On more thought, I wonder whether we need a "version" column in here as well as effective dates. the map may apply to SNOMED CT, code 838384, July 2007, etc. It might not be needed, but for some sources (such as GPIs for drugs) they sometimes reuse codes. Also, if you are looking to implement with certain sources such as ICD, they roll over on a particular date. It is a major pain to have to update the table at exactly the time of the rollover. I f you have effective start and end dates, then you can simply check based on these columns and load maps in whenever you want. Do you think the date voided column can be the end date?
Andy, I've placed some responses in the forum. --Burke 23:52, 10 May 2007 (EDT)









