How the Document Analysis Application Works

Application Objective
Document Analysis is an application for retrieving clinical information from patient documents. This information is used to build CAC (Computer Assisted Coding), an Auto Query Alert System, Patient Admit Criteria Forms Automation, and many other health care applications.
Application Scope
Currently the application scope is limited to well-formatted text documents. Documents in other formats, such as PDF, text images, or OCR output, are not supported.
What types of clinical information are available in patient documents
Diseases, signs and symptoms, procedures, findings, laboratory tests, medications, vital signs, treatment plan, admit information, past medical history, etc.
Technical Stack and Knowledge Resources Used in the Application
Document Processing Using the GATE Workflow
Step 1: Embedding the GATE Framework into the Application
A. Initiating GATE
B. Load GATE Plug-ins
Step 3: Document Preprocessing
Documents from the same facility sometimes arrive in different formats. For example, a document may be Base64-encoded, or a physician report may lack proper line breaks. To analyze documents reliably we need a consistent format, so these documents are first converted to the intended format.
The preprocessing procedure decides how to handle a document by its Document Type ID, which is one of 3, 6, or 9:
3 or 6 denotes a physician document that needs line break replacement.
9 means the source document is in Base64 format and is converted to text.
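The dispatch on Document Type ID can be sketched as follows. This is a minimal illustration, not the application's code: the class and method names are hypothetical, UTF-8 is an assumed encoding for the decoded text, and "line break replacement" is assumed to mean unifying CRLF and CR line endings to LF.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; names are illustrative only.
final class DocumentPreprocessor {

    /** Returns the document text in the format expected by the analysis step. */
    static String preprocess(int documentTypeId, String raw) {
        switch (documentTypeId) {
            case 3:
            case 6:
                // Physician documents: repair line breaks.
                return normalizeLineBreaks(raw);
            case 9:
                // Base64 source: decode to text (UTF-8 is an assumption).
                // The MIME decoder tolerates line breaks inside the payload.
                byte[] decoded = Base64.getMimeDecoder().decode(raw);
                return new String(decoded, StandardCharsets.UTF_8);
            default:
                // Other document types are passed through unchanged.
                return raw;
        }
    }

    /** Assumed meaning of "line break replacement": unify CRLF and CR to LF. */
    static String normalizeLineBreaks(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String encoded = Base64.getEncoder()
                .encodeToString("Chest pain".getBytes(StandardCharsets.UTF_8));
        System.out.println(preprocess(9, encoded));
        System.out.println(preprocess(3, "HISTORY:\r\nNo fever"));
    }
}
```

Keeping the type check in one method means a new Document Type ID only needs one more `case`.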
Step 4: Retrieving Documents by Work Type
There are four types of analysis, depending on the document work type.
Pending documents are retrieved with a query filtered by work type.
Step 5: Building Document Corpus
A: Create Corpus
Step 4 returns all pending documents that need to be analyzed for a given work type. The next step is to build a corpus from this set of documents using the GATE gate.corpora.CorpusImpl class.
B: Set Running Status for Documents
After the corpus is built, the documents are ready for processing. To let front-end users know that these documents are being processed, their status is updated to 2; every document with status 2 is currently running.
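One way to mark a whole batch as running is a single parameterized update. The sketch below only builds the SQL text; the table name `Documents` and the columns `Status` and `DocId` are assumptions for illustration, since the real schema is not shown here. Only the status value 2 comes from the description above.

```java
import java.util.Collections;

// Hypothetical helper; table and column names are assumptions.
final class RunningStatusSql {

    /** Status value that marks a document as currently running. */
    static final int STATUS_RUNNING = 2;

    /** Builds an UPDATE with one "?" placeholder per document in the corpus. */
    static String buildUpdate(int documentCount) {
        if (documentCount < 1) {
            throw new IllegalArgumentException("at least one document is required");
        }
        String placeholders = String.join(", ", Collections.nCopies(documentCount, "?"));
        return "UPDATE Documents SET Status = " + STATUS_RUNNING
                + " WHERE DocId IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildUpdate(3));
    }
}
```

The resulting string would be passed to a JDBC `PreparedStatement`, with each document ID bound to one placeholder.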
C: Delete Modified Cached List File Gaz Bins
In this step, cached files whose list files have been modified are deleted, and the Gaz bins are recreated from the updated list files on the system.
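A common way to detect a modified list file is to compare modification times, as sketched below. This assumes a cached bin is stale when its list file has a newer timestamp; the application may detect changes differently. The class name and the `.lst` and `.bin` file names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical helper; the staleness rule is an assumption.
final class GazCacheCleaner {

    /** A cache is stale when the list file changed after the cache was written. */
    static boolean isStale(FileTime listModified, FileTime cacheModified) {
        return listModified.compareTo(cacheModified) > 0;
    }

    /** Deletes the cache file if it is stale; returns true when it was deleted. */
    static boolean deleteIfStale(Path listFile, Path cacheFile) throws IOException {
        if (!Files.exists(cacheFile)) {
            return false; // nothing cached yet
        }
        if (!isStale(Files.getLastModifiedTime(listFile),
                     Files.getLastModifiedTime(cacheFile))) {
            return false; // cache is still current
        }
        Files.delete(cacheFile);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path list = Files.createTempFile("Medication", ".lst");
        Path cache = Files.createTempFile("Medication", ".bin");
        // Simulate an edit to the list file after the cache was written.
        FileTime cacheTime = Files.getLastModifiedTime(cache);
        Files.setLastModifiedTime(list, FileTime.fromMillis(cacheTime.toMillis() + 60_000));
        System.out.println("cache deleted: " + deleteIfStale(list, cache));
        Files.deleteIfExists(list);
    }
}
```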
Step 6: Document Processing with GATE Default Resources & User-Defined List Files (Dictionaries)
A: Creation of GATE Controller
Note
Controllers are used to create GATE applications. A controller handles a set of Processing Resources (PRs) and can execute them following a particular strategy. GATE provides a series of serial controllers, i.e. controllers that run their PRs in sequence.
For GATE's default code for creating an ANNIE application and running it over a corpus, see the GATE site.
The Document Analysis application uses its own code for controller creation.
B: Add Processing Resources to the Controller and Run Them over the Corpus
In this section we add the GATE default processing resources to the controller and process the corpus with these resources.
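The idea of a serial controller, resources added in order and each one run over every document in the corpus, can be illustrated with a small stand-in written in plain Java. This is not GATE's API: the class below is hypothetical, and it models a processing resource as a function from text to text, whereas real GATE PRs add annotations to documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Stand-in for a serial controller; not a GATE class.
final class SerialPipelineSketch {

    private final List<UnaryOperator<String>> resources = new ArrayList<>();

    /** Registers a processing step; steps run in the order they are added. */
    SerialPipelineSketch add(UnaryOperator<String> resource) {
        resources.add(resource);
        return this;
    }

    /** Runs every registered step, in sequence, over each document. */
    List<String> execute(List<String> corpus) {
        List<String> processed = new ArrayList<>();
        for (String document : corpus) {
            String current = document;
            for (UnaryOperator<String> resource : resources) {
                current = resource.apply(current);
            }
            processed.add(current);
        }
        return processed;
    }

    public static void main(String[] args) {
        SerialPipelineSketch pipeline = new SerialPipelineSketch()
                .add(String::trim)
                .add(String::toUpperCase);
        System.out.println(pipeline.execute(Arrays.asList("  bp 120/80 ", "fever")));
    }
}
```

Because the order of `add` calls is the order of execution, a step that depends on earlier output (for example, a grammar that needs tokens) must be added after the step that produces it.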
C: User Defined Gazetteers (Clinical Dictionary)
The controller code in the section above calls the external Gazetteer processing methods. This section shows which user-defined list and dictionary files are added to the system.
Default dictionary files (loaded regardless of which modules are activated)
| Dictionary File Path | Dictionary File Name | Purpose |
|---|---|---|
| MedResource/Adjectives | Adjectives.def | Describe the purpose of the dictionary |
| MedResource/Adjectives | Condition.def | Describe the purpose of the dictionary |
| MedResource/BodyPart | BodyLocation.def | Describe the purpose of the dictionary |
| MedResource/BodyPart | BodyPart.def | Describe the purpose of the dictionary |
| MedResource/BodyPart | BodySpace.def | Describe the purpose of the dictionary |
| MedResource/BodyPart | BodySubstance.def | Describe the purpose of the dictionary |
| MedResource/BodyPart | BodySystem.def | Describe the purpose of the dictionary |
| MedResource/General | SectionHeadings.def | Describe the purpose of the dictionary |
| MedResource/General | PdocHeadings.def | Describe the purpose of the dictionary |
| MedResource/LabData | LaboratoryProcedure2.def | Describe the purpose of the dictionary |
| MedResource/Medications | Medication.def | Describe the purpose of the dictionary |
| MedResource/Procedure | DiagnosticProcedure.def | Describe the purpose of the dictionary |
| MedResource/Procedure | LaboratoryProcedure.def | Describe the purpose of the dictionary |
| MedResource/Procedure | PreventiveProcedure.def | Describe the purpose of the dictionary |
| MedResource/Subheadings | Subheadings.def | Describe the purpose of the dictionary |
| MedResource/DischargeSummary | DischargeCondition.def | Describe the purpose of the dictionary |
| DiseaseJape | list.def | Describe the purpose of the dictionary |
| DiseaseJape | prenegation.def | Describe the purpose of the dictionary |
| DiseaseJape | site.def | Describe the purpose of the dictionary |
| DiseaseJape | dlist1.def | Describe the purpose of the dictionary |
| DiseaseJape | dlist2.def | Describe the purpose of the dictionary |
| DiseaseJape | dlist3.def | Describe the purpose of the dictionary |
| DiseaseJape | dlist5.def | Describe the purpose of the dictionary |
| DiseaseJape/stage | stage.def | Describe the purpose of the dictionary |
| DiseaseJape | ESignReport.def | Describe the purpose of the dictionary |
| IDiseaseJape | IDisorder.def | Describe the purpose of the dictionary |
| IDiseaseJape | ISite.def | Describe the purpose of the dictionary |
| DiseaseJape | Habits.def | Describe the purpose of the dictionary |
| Cancer | cancer.def | Describe the purpose of the dictionary |
| Radiology | RadiologyHeadings.def | Describe the purpose of the dictionary |
| DiseaseJape/CoreMeasures | CoreMeasure.def | Describe the purpose of the dictionary |
| JAPE/Laboratory | ResLab.def | Describe the purpose of the dictionary |
Procedure Related Dictionary files
| Dictionary File Path | Dictionary File Name | Purpose |
|---|---|---|
| MedResource/ICD | PCS.def | Describe the purpose of the dictionary |
| MedResource/ICD/LST | PCSComb.def | Describe the purpose of the dictionary |
| MedResource/ICD/PCSGeneral | PCSGeneral.def | Describe the purpose of the dictionary |
| MedResource/ICD | PCS_BP.def | Describe the purpose of the dictionary |
| MedResource/ICD/Operations/OPList | DiscreteOperations.def | Describe the purpose of the dictionary |
| MedResource/ICD/Operations/OPList | DiscreteSites.def | Describe the purpose of the dictionary |
| MedResource/ICD/Operations/OPList | PTADb.def | Describe the purpose of the dictionary |
| MedResource/ICD/Operations/OPList | PortCath.def | Describe the purpose of the dictionary |
E&M Code Related Dictionary Files
| Dictionary File Path | Dictionary File Name | Purpose |
|---|---|---|
| MedResource/EMCode/History | EMCodeHeaders.def | Describe the purpose of the dictionary |
| MedResource/EMCode/History | HPIElements.def | Describe the purpose of the dictionary |
| MedResource/EMCode/History | HPISignorSymptom.def | Describe the purpose of the dictionary |
| MedResource/EMCode/Examination | ExamElements.def | Describe the purpose of the dictionary |
| MedResource/EMCode/MDMElements/DTOptions | DTOptions.def | Describe the purpose of the dictionary |
| MedResource/EMCode/MDMElements/AmountAndComplexity | ACDataReviewed.def | Describe the purpose of the dictionary |
| MedResource/EMCode/MDMElements/RiskLevel | MdmRiskElements.def | Describe the purpose of the dictionary |
SNOMEDCT Related Dictionary Files
| Dictionary File Path | Dictionary File Name | Purpose |
|---|---|---|
| DiseaseJape/SnomedCT | Snomedct.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | Snomedct1.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | DiseaseOrSyndrome.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | DiseaseOrSyndrome1.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | DiseaseOrSyndrome2.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | TherapeuticPreventiveProcedure.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | TherapeuticPreventiveProcedure1.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | Bacterium.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | Bacterium2.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | Finding.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | Finding1.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | Fungus.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | InjuryOrPoisoning.def | Describe the purpose of the dictionary |
| DiseaseJape/SnomedCT | NeoplasticProcess.def | Describe the purpose of the dictionary |
Query Alert Related Dictionary Files
| Dictionary File Path | Dictionary File Name | Purpose |
|---|---|---|
| DiseaseJape | queryword.def | Describe the purpose of the dictionary |
Interqual Related Dictionary Files
| Dictionary File Path | Dictionary File Name | Purpose |
|---|---|---|
| DiseaseJape/IQAnalysis | IQData.def | Describe the purpose of the dictionary |
Milliman Related Dictionary Files
| Dictionary File Path | Dictionary File Name | Purpose |
|---|---|---|
| DiseaseJape/MillimanAnalysis | MillimanData.def | Describe the purpose of the dictionary |
Compliance Audit Related Dictionary Files
| Dictionary File Path | Dictionary File Name | Purpose |
|---|---|---|
| ComplianceAudit | Compliance.def | Describe the purpose of the dictionary |
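Each `.def` file in the tables above is a gazetteer definition (index) file. In GATE's standard format, every line of such a file names a list file followed by colon-separated fields, starting with a major type and an optional minor type. The sketch below parses those first three fields. The class name and the sample entries are hypothetical, and it is an assumption that the application's `.def` files follow the standard layout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical parser for lines of the form "file.lst:majorType:minorType".
final class GazetteerDefParser {

    /** One line of a definition file. Missing fields are empty strings. */
    static final class Entry {
        final String listFile;
        final String majorType;
        final String minorType;

        Entry(String listFile, String majorType, String minorType) {
            this.listFile = listFile;
            this.majorType = majorType;
            this.minorType = minorType;
        }
    }

    static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split(":", -1);
            String major = parts.length > 1 ? parts[1] : "";
            String minor = parts.length > 2 ? parts[2] : "";
            entries.add(new Entry(parts[0], major, minor));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Sample entries are made up for illustration.
        List<Entry> entries = parse(Arrays.asList(
                "medication.lst:medication:generic",
                "bodypart.lst:bodypart"));
        for (Entry e : entries) {
            System.out.println(e.listFile + " -> " + e.majorType + "/" + e.minorType);
        }
    }
}
```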
Processing the Corpus with the Above Dictionary Files
**Processing dictionary files for some other modules**
D: User Defined JAPE for Clinical Information Pattern Recognition
JAPE is the Java Annotation Patterns Engine. It provides finite state transduction over annotations based on regular expressions. JAPE is a version of CPSL (Common Pattern Specification Language).
The application uses approximately 2453 unique JAPE files to retrieve clinical information from documents, according to the requirements of the different modules.
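To give a feel for what such a rule does, the following plain-Java sketch mimics one pattern: when a negation annotation is immediately followed by a disorder annotation, a new annotation covering both is produced. This is only an analogy for a JAPE rule's match and action parts, not JAPE syntax and not the application's code. The annotation type names `PreNegation`, `Disorder`, and `NegatedDisorder` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative analogy for one annotation pattern rule; not JAPE itself.
final class NegationRuleSketch {

    /** A typed span of text, identified by character offsets. */
    static final class Annotation {
        final String type;
        final int start;
        final int end;

        Annotation(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Scans annotations (sorted by start offset) for the sequence
     * PreNegation followed by Disorder and emits a NegatedDisorder
     * annotation spanning both.
     */
    static List<Annotation> apply(List<Annotation> sorted) {
        List<Annotation> matches = new ArrayList<>();
        for (int i = 0; i + 1 < sorted.size(); i++) {
            Annotation left = sorted.get(i);
            Annotation right = sorted.get(i + 1);
            if (left.type.equals("PreNegation") && right.type.equals("Disorder")) {
                matches.add(new Annotation("NegatedDisorder", left.start, right.end));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Offsets correspond to the text "no fever, cough".
        List<Annotation> input = Arrays.asList(
                new Annotation("PreNegation", 0, 2),
                new Annotation("Disorder", 3, 8),
                new Annotation("Disorder", 10, 15));
        for (Annotation a : apply(input)) {
            System.out.println(a.type + " [" + a.start + ", " + a.end + ")");
        }
    }
}
```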
The JAPE files are loaded as processing resources and run over the corpus.
Summary
This completes the analysis of the document using GATE and the user-defined JAPE rules. From here the application works according to module-specific requirements. For example, if ICD coding is the target module, it looks for ICD codes based on disease combination annotation features.
Reading Clinical Information from Corpus (Old Analysis)
A: Delete Previous Information from Tables by Job ID and Database
In this method the application deletes all information from the tables for the current analysis job ID.
Procedure Name: DeletedAllTableDataByJobId
Tables: NA (All Annotation related tables)
B: Reading Medications Information
All medication-related information from patient documents is read here.
Currently not in use
This is an old annotation; no information is currently inserted from this set. Please check the procedure.
Input Annotation Set: PatientDetails
Procedure Name: InsertMedInfo
Table: Med
C: Reading Patient Demographic Information
Here the application reads the patient's age, country, and gender from documents.
Input Annotation Set: PatientDetails
Procedure/Query Name: insert into PatientInfo (docId,siteId,Worktype,WorktypeId,AnnotId,Age,value1,Country,Gender,startId,endId) values(?,?,?,?,?,?,?,?,?,?,?)
Table: PatientInfo
D: Reading Review of Systems (ROS) Information
Here the application reads Review of Systems (ROS) information from documents.
Input Annotation Set: Review_Of_Systems1, Review_Of_Systems
Procedure/Query Name: InsertROSInfo (......)
Table: Reviewofsystems
E: Reading Review of Systems NEG (ROS) Information
Currently the application tries to insert ROS subheadings into the table, but it inserts only null values.
Null Insertion
Null subheadings are currently being inserted into the table.
Input Annotation Set: ROSNeg
Procedure/Query Name: insert into ROSNeg (docId,Subheading) values(?,?)
Table: Rosneg
E: Reading Physical Examination (PExam) Information
Reads the patient's physical examination information.
Input Annotation Set: PhysicalExamination
Procedure/Query Name: ----
Table: Pexamcontent, PExam
F: Reading Vital Signs Information
Reads the patient's vital signs information.
Input Annotation Set: PhysicalExamination2
Procedure/Query Name: ----
Table: Vsigns
G: Reading Physician Treatment Plan Information
Reads the patient's treatment plan information.
Input Annotation Set: Plan03
Procedure/Query Name: InsertPlanRecordInfo (-----)
Table: Plan4, planrecord
H: Reading Patient Admit Order Information
Reads patient admit orders, such as "2 day inpatient" or "admitted as inpatient".
Input Annotation Set: Admitdata2, Admitdata1, ESignFinal
Procedure/Query Name: InsertAdmitOrders (-----)
Table: Admitorders
I: Reading Patient Laboratory Data Information
Reads the patient's laboratory test information; the value is not separated out.
Input Annotation Set: LaboratoryData
Procedure/Query Name: InsertLabContentInfo (-----)
Table: LabContent
I: Reading Patient ICD9 Version Procedure Related Information
Reads the patient's procedure-related information for ICD 9.
Old version, specific to work type O
Reads procedure-related information for ICD 9.
Input Annotation Set: Procedure1
Procedure/Query Name: GetCPTCodeNew (-----)
Table: Procedure
J: Reading Patient ICD9 Version Diagnosis Related Information
Reads the patient's diagnosis-related information for ICD 9.
Old version, specific to work type O
Reads diagnosis-related information for ICD 9, but does not insert it.
Input Annotation Set: Diagnosis2
Procedure/Query Name: GetICDCode (-----). No procedure with this name exists in the database; there is another procedure named GetICDCode1.
Table: Disorder
K: Reading Patient Discharge Diagnosis Related Information
Reads the patient's discharge diagnosis information.
Old version, specific to work type D
Reads discharge diagnosis information.
Input Annotation Set: Discharge Diagnosis
Procedure/Query Name: GetICDCode (-----). No procedure with this name exists in the database; there is another procedure named GetICDCode1.
Table: Disorder
K: Reading Patient Discharge Condition / Diet / Activity Level Related Information
Reads the patient's discharge condition, activity level, and diet information.
Specific to work type D
Reads discharge condition, activity level, and diet information.
Input Annotation Set: DischargeCondition, Activity Level, DischargeDiet
Procedure/Query Name: InsertDischargeCOnditionInfo
Table: DischargeCondition
L: Unused Annotation Sets
The following annotation sets are not currently run in the application.
Not running currently
The following annotation sets are not currently processed.
Input Annotation Set: Allergies, Complaint, FHistory, Habits5, Impression, Past_Medical_History, SurgicalHistory, HPILocation, Severity, Timing, Present_Illness
Procedure/Query Name: NA
Table: NA
Reading Clinical Information from Corpus (New Analysis, Annotation Processor)
Analysis of the corpus from the AnnotationProcessor Java files
Process of All Modules and Corresponding Annotations
Compliance Audits
Method Name: getComplianceAudit
Input Annotation Set: ComplianceCE,ComplianceDST1,ComplianceTT1,ComplianceFinal
Procedure Name: GetEncphKeywordsList
Table Name: EncphkeyAnnotations
Work Type: All
Description: To Insert Compliance audit related keywords
SNOMEDCT Code
Method Name: getSnomedCtKeywords_TTY
Input Annotation Set: SnomedCT4, SnomedCT5, SnomedCT6, SnomedCT7, SnomedCT9, SnomedCT10, SnomedDiff
Procedure Name: GetSnomedCtwordsList
Table Name: SnomedTerms
Work Type: All
Description: To Insert SNOMEDCT Keywords with code
All Clinical Annotations
Method Name: getSignOrSymptomsInfo
Input Annotation Set: Alldis
Procedure Name: Direct Query
Table Name: signorsymptom
Work Type: All
Description: To insert all clinical words found in the document into the table
Radiology Report Annotations
Method Name: getRadSymptomsInfo
Input Annotation Set: RadDis
Procedure Name: Direct Query
Table Name: Radsymptoms
Work Type: All (Radiology)
Description: To Insert radiology clinical annotations
Medical Subject Headings
Method Name: getSectionHeaders
Input Annotation Set: SectionHeader
Procedure Name: Direct Query
Table Name: SectionHeaders
Work Type: All
Description: To Insert all section headings from documents
Vital signs
Method Name: getVitalSigns
Input Annotation Set: VitalsTest, VitalSigns, VitalSigns1
Procedure Name: InsertVitalSigns
Table Name: SectionHeaders
Work Type: All
Description: To Insert all section headings from documents
**The following is the complete list of modules implemented in the application**

    getSymptomAnnotations();
    insertMedicationDataIntoDB(); // Implemented to get Medications from Document to table
    insertHomeMedicationDataIntoDB();
    insertImmunizationDataIntoDB();
    insertAllergiesDataIntoDB();
    insertHabitsDataIntoDB();
    insertLabDataIntoDB(); // Implemented to get Labs from Document to table
    ICD10PregnancyInfo();
    ICD10DiabetesInfo(poa);
    ICD10ComplaintInfo();
    ICD10PresentIllnessInfo();
    ICD10PastMedicalHistoryInfo();
    ICD10AllergiesDiseaseInfo();
    ICD10SampleHabitsInfo();
    burnsInfo();
    DrugsInfo();
    ICD10BirthWeightInfo();
    if (neoNatal == 1) {
        getNewbornInfo(poa);
    }
    ICD10ProcedureSymptomsKeywords();
    ICD10ImpressionFromPMH(docNo, poa);
    ICD10BMI();
    ICD10CheckObesity();
    if (isPreg) {
        OBGMaternalDx();
        timeDuration("OBGMaternalDx");
        OBGDeliveryCount();
        timeDuration("OBGDeliveryCount");
        OBGHabitsInfo();
        timeDuration("OBGHabitsInfo");
    }
    DxDrugEffects();
    fetchFxTerms2();
    fetchFxTerms3();
    DXKeywordInfo();
    InsertLongTermDrugDiseases(accountNo, docNo, poa);
    ICD10DxComboChange(accountNo, docNo, poa);
    ICD10HtnAnalyser(accountNo);
    ICD10CHFAnalyser(accountNo);
    insertICD10(docNo, accountNo, worktype, poa);
    getWcodes(docNo, accountNo, worktype);
    getXcodes(docNo, accountNo, worktype);
    getVcodes(docNo, accountNo, worktype, poa);
    getYCodes(docNo, accountNo, worktype, poa);
    Icd10ExcludeCodes(accountNo);
    ICD10Codes(docNo, poa);

    // Procedure Coding Methods Here
    getProcedureDate();
    findDirectPCS();
    getAbsoluteAVSDBodyPart();
    getAbsoluteBodyPart();
    getAbsolutePTA();
    getAbsoluteTRMB();
    getHematomaOp();
    getDiverticulectomy();
    getBiopsyOperation();
    getSkinFlapsOp();
    getAmputation();
    getEctomies();
    getEctomyInspection();
    getInsertionPortCath();
    getAbsoluteEndarterectomy();
    getEndarterectomyPatchAngioplasty();
    getAbsoluteAtherectomy();
    getAbsoluteFulguration();
    getAbsoluteFractureRd2();
    getAbsoluteOperation();
    getPCS();
    getKeyFeatures();
    insertPCSInfo();

    ICD10POAUpdateByExemption();
    getDocBodyParts();
    getInterqualValue();
E & M Code Analysis
Evaluation and Management (E & M) coding has its own separate analysis methods.
Saving Corpus Data Store (Analysis Status)
With this method the application stores the corpus analysis state in the file system.
Clearing All
Here the application clears all data, documents, and the corpus from memory.
Conclusion
The primary intention in writing this application was to provide a computer-assisted medical coding feature to the end user. During development, however, a lot of other clinical information has also been analyzed and retrieved, such as Compliance Audits, Interqual, Milliman, Query Alerts, Admit Orders, and E & M Code.
The following is a brief workflow of the application:
- The end user or client defines the requirements for the data to be retrieved from clinical documents and provides a sample set of documents.
- The developer studies the context of the requirement using a large number of pre-existing documents.
- Next, the required clinical information is gathered as list or dictionary files.
- Using the GATE framework, the primary annotation set (NER) is generated.
- On top of that, the developer writes grammar (JAPE) rules according to the requirement.
- The new component is tested on new documents through continued trial and error on new data; it is then finalized and integrated into the existing system.
Thank You
Document By: Krishna Reddy 04/06/2018