Preparing Data for Custom Entities

The first step in creating custom entities is preparing your input file and your test file. The custom entities feature requires that each entity in those files be surrounded by the magicWord you specify in your training options file (which is discussed in the next topic).

Let's say you are extracting diagnoses from unstructured data in your input file and you have designated the magicWord DIAGNOSIS in your training options file. Every time the name of a disease or condition appears in the text, that name is enclosed by the magicWord, which is placed directly before and directly after it with no spaces, as follows:

The term diagnostic criteria designates the specific combination of
signs, symptoms, and test results that the clinician uses to attempt
to determine the correct diagnosis. Some examples of diagnostic 
criteria, also known as clinical case definitions, are: Amsterdam 
criteria for DIAGNOSIShereditary nonpolyposis colorectal cancerDIAGNOSIS
McDonald criteria for DIAGNOSISmultiple sclerosisDIAGNOSIS ACR criteria 
for DIAGNOSISsystemic lupus erythematosusDIAGNOSIS Centor criteria for 
DIAGNOSISstrep throatDIAGNOSIS.
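
If you already have a list of the terms you want to mark, the enclosing step can be scripted. The following is a minimal sketch in Python, not part of the custom entities feature itself: the function names tag_entities and is_balanced, the term list, and the sample sentence are all hypothetical, and the sketch assumes that matching whole words without regard to case is acceptable for your data.

```python
import re


def tag_entities(text, terms, magic_word="DIAGNOSIS"):
    """Enclose each whole-word occurrence of a term in magic_word (hypothetical helper)."""
    # Try longer terms first so a long phrase wins over a shorter one it contains.
    ordered = sorted(set(terms), key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in ordered) + r")\b",
        flags=re.IGNORECASE,
    )
    # Put the magic word directly before and after the match, with no spaces.
    return pattern.sub(lambda m: f"{magic_word}{m.group(1)}{magic_word}", text)


def is_balanced(text, magic_word="DIAGNOSIS"):
    """Return True if magic_word occurs an even number of times (every entity is closed)."""
    return text.count(magic_word) % 2 == 0


# Assumed sample data, for illustration only.
sample = "McDonald criteria for multiple sclerosis; Centor criteria for strep throat."
tagged = tag_entities(sample, ["multiple sclerosis", "strep throat"])
print(tagged)
print(is_balanced(tagged))
```

Automatic tagging of this kind can miss variant spellings or mark terms that are not entities in context, so review the result before using it as your input file or test file.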

For information about specifying the magicWord, see the next topic.