The main goal of this paper is to develop a spell checker module for clinical text in Russian. The described approach combines string distance algorithms with machine learning embedding methods. The overall precision is 0.86, lexical precision is 0.975, and error precision is 0.74. We develop the spell checker as part of a medical text mining tool that addresses the problems of misspelling, negation, experiencer, and temporality detection.
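To illustrate how a string distance measure and embeddings might be combined, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hypothetical lexicon with two-dimensional toy vectors, a hypothetical context vector, and a hypothetical weighting parameter `alpha`; the function names `levenshtein`, `cosine`, and `rank_candidates` are illustrative only.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def cosine(u, v) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_candidates(token, context_vector, lexicon, max_distance=2, alpha=0.5):
    """Rank lexicon words as corrections for `token`.

    Candidates farther than `max_distance` edits are dropped; the rest are
    scored by a weighted sum of string similarity and embedding similarity
    to the context. The weighting scheme is an assumption of this sketch.
    """
    scored = []
    for word, vector in lexicon.items():
        distance = levenshtein(token, word)
        if distance > max_distance:
            continue
        string_score = 1.0 - distance / max(len(token), len(word))
        semantic_score = cosine(context_vector, vector)
        scored.append((alpha * string_score + (1 - alpha) * semantic_score, word))
    return [word for _, word in sorted(scored, reverse=True)]


# Hypothetical toy data: "почка" (kidney) and "точка" (point) are both one
# edit away from the misspelling "рочка"; the context vector favours "почка".
lexicon = {"почка": [0.9, 0.1], "точка": [0.1, 0.9]}
context = [1.0, 0.0]
ranking = rank_candidates("рочка", context, lexicon)
print(ranking)  # ['почка', 'точка']
```

In this toy case both candidates are equally close by edit distance, so the embedding similarity alone decides the order, which is the situation where a purely string-based checker cannot choose.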
This article describes the results of a study of echocardiographic (ECHO) test data for 4P medicine applied to cardiovascular patients. Data from more than 145,000 echocardiographic tests were analyzed. One objective of the study is to identify patterns and relationships in patient characteristics so that procedures can be ordered more accurately, based on the history of the disease and the individual characteristics of the patient. This is achieved by using classification models based on machine learning methods. Early detection of disease risks and "accurate" ordering of diagnostic procedures make a significant contribution to value-based medicine. Moreover, it was possible to identify the classes and characteristics of patients for whom repeated diagnostic procedures are well founded. Calculating personal risks from empirical retrospective data helps to detect the disease at an early stage. Identifying patients at high risk of disease complications allows physicians to make the right decisions about timely treatment, which can significantly improve the quality of treatment, help to avoid disease complications, optimize costs, and improve the quality of medical care.
Developing predictive models in medicine requires additional features from unstructured clinical texts. In Russia, there are no natural language processing tools that cope with the problems of medical records. This paper is devoted to a negation detection module. A corpus-free machine learning method based on a gradient boosting classifier is used to detect whether a disease is denied, not mentioned, or present in the text. The detector classifies negations for five diseases and shows an average F-score from 0.81 to 0.93. The benefits of negation detection are demonstrated by predicting the presence of surgery for patients with acute coronary syndrome.
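The three-way decision (denied, not mentioned, present) can be illustrated with a small sketch. This is not the gradient boosting detector described above: it only shows the kind of window features such a classifier could consume, followed by a rule-based stand-in for the labeling step. The cue list `NEGATION_CUES`, the window size, and the function names are assumptions made for illustration.

```python
import re

# Hypothetical, deliberately short list of Russian negation cues.
NEGATION_CUES = {"нет", "не", "без", "отрицает", "отсутствует"}


def negation_features(text: str, disease_stem: str, window: int = 3) -> dict:
    """Extract simple features around the first mention of a disease.

    `disease_stem` is matched as a token prefix so that inflected forms
    (e.g. "диабет", "диабета") are all found.
    """
    tokens = re.findall(r"\w+", text.lower())
    positions = [i for i, t in enumerate(tokens) if t.startswith(disease_stem)]
    if not positions:
        return {"mentioned": 0, "cue_before": 0, "cue_after": 0}
    i = positions[0]
    before = tokens[max(0, i - window):i]
    after = tokens[i + 1:i + 1 + window]
    return {
        "mentioned": 1,
        "cue_before": int(any(t in NEGATION_CUES for t in before)),
        "cue_after": int(any(t in NEGATION_CUES for t in after)),
    }


def rule_label(features: dict) -> str:
    """Rule-based stand-in for the trained classifier (an assumption)."""
    if not features["mentioned"]:
        return "not mentioned"
    if features["cue_before"] or features["cue_after"]:
        return "denied"
    return "present"


# Toy sentences: "denies diabetes mellitus", "history of hypertension",
# "type 2 diabetes mellitus".
examples = [
    "Сахарный диабет отрицает.",
    "В анамнезе гипертоническая болезнь.",
    "Сахарный диабет 2 типа.",
]
labels = [rule_label(negation_features(s, "диабет")) for s in examples]
print(labels)  # ['denied', 'not mentioned', 'present']
```

In a learned setting, the feature dictionaries would be converted to numeric vectors and passed to a gradient boosting classifier trained on labeled examples, replacing `rule_label`.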