Using deep learning methods to process text data from thyroid ultrasound
Abstract
This article studies intelligent approaches to processing Russian-language medical text produced by thyroid ultrasound examinations. It addresses two tasks: classifying diseases according to the EU-TIRADS system and generating the physician's conclusion from the description of the disease. A machine learning pipeline was developed that includes data preprocessing and model training stages. The deep learning models are built on transformer and hybrid architectures. The paper proposes methods for preprocessing unstructured medical descriptions to adapt them to the format each task requires. The results show that, for classification, neural network architectures achieve stable, high results only when hyperparameters are selected carefully and their mutual influence is taken into account. For generating ultrasound physician reports, transformer architectures and large language models perform well on large volumes of data. The proposed solution, part of the “Intelligent Ultrasound Physician Assistant” software package, is intended to automate the physician's work and improve the quality of diagnosis.
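The point about the mutual influence of hyperparameters can be illustrated with a minimal sketch of a joint grid search, which scores every combination of values instead of tuning one hyperparameter at a time. The abstract does not say which hyperparameters were tuned or how the search was run, so everything here is an assumption: the names `learning_rate` and `dropout`, the grid values, and the function `toy_validation_score`, which is a hypothetical stand-in for training a model and measuring validation quality.

```python
from itertools import product


def toy_validation_score(learning_rate, dropout):
    """Hypothetical stand-in for a validation metric (not from the paper).

    The second term couples the two arguments: the best dropout value
    shifts with the learning rate, so the two cannot be tuned separately.
    """
    lr_penalty = (learning_rate * 100 - 1.0) ** 2
    coupled_penalty = (dropout - 0.2 - learning_rate * 10) ** 2
    return 1.0 - lr_penalty - coupled_penalty


def joint_grid_search(score_fn, grid):
    """Score every combination in `grid`; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Assumed grid values, chosen only for illustration.
grid = {
    "learning_rate": [0.001, 0.01, 0.03],
    "dropout": [0.1, 0.3, 0.5],
}
best_params, best_score = joint_grid_search(toy_validation_score, grid)
print(best_params, best_score)
```

In a real pipeline the scoring function would train and validate a classifier for each combination, which is far more expensive than this toy function, so the grid would normally be kept small or replaced by a random search.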
ISSN: 2307-8162