Project / product name: MediStream
Link to the project: http://
Team leader: Yurii Havrylko
Challenge: 7. Automated Medical Discharge Message Enhancement
Problem: Medical discharge reports consist of unstructured free text that is full of abbreviations and medical terminology and is written in inconsistent styles and formats. It is also difficult to distinguish complicated cases from routine ones. In addition, patient data is highly sensitive and should be processed and stored securely, preferably within IKEM, so that SaaS providers cannot store it or use it for training without authorization.
Solution: We approach this problem comprehensively. For the summarization part, we fine-tune the open-source LLM CzeGPT-2, which is already trained on Czech text for summarization. Because the base model lacks comprehensive medical knowledge, we build a corpus from the IKEM reporting DB and fine-tune on it. The resulting model can summarize findings and procedures into a single report, or rephrase an existing one.
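The corpus-building step described above can be sketched as follows. This is a minimal illustration and not the team's actual pipeline. The `ReportPair` type, the `TL;DR:` separator, the JSONL output format, and the use of GPT-2's `<|endoftext|>` token for CzeGPT-2 are all assumptions made for the example. The real schema of the IKEM reporting DB is not described here.

```python
import json
from dataclasses import dataclass
from typing import Iterable

# Assumption: a short prompt marks where the summary starts. "TL;DR:" is a
# common choice for GPT-2-style summarization, not a confirmed CzeGPT-2 setting.
SEPARATOR = "\nTL;DR: "
# Assumption: the tokenizer uses GPT-2's end-of-text token; check before use.
END_OF_TEXT = "<|endoftext|>"


@dataclass
class ReportPair:
    """Hypothetical record: one source report and its reference summary."""
    report: str   # findings, procedures, notes
    summary: str  # discharge summary written by a doctor


def normalize(text: str) -> str:
    """Collapse whitespace runs so layout differences do not leak into training."""
    return " ".join(text.split())


def to_training_text(pair: ReportPair) -> str:
    """Build one causal-LM training string: report, separator, summary, end token."""
    return normalize(pair.report) + SEPARATOR + normalize(pair.summary) + END_OF_TEXT


def to_prompt(report: str) -> str:
    """Build the inference prompt; the model is expected to continue with a summary."""
    return normalize(report) + SEPARATOR


def write_jsonl(pairs: Iterable[ReportPair], path: str) -> None:
    """Write one {"text": ...} object per line, keeping Czech characters unescaped."""
    with open(path, "w", encoding="utf-8") as f:
        for pair in pairs:
            record = {"text": to_training_text(pair)}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Using the same separator at training time and in `to_prompt` means the fine-tuned model sees the same context at inference that it learned to continue during training.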
Impact: Doctors work under heavy time pressure, partly because they spend time writing discharge summaries. With our solution they only need to check the generated reports, which reduces the time spent on this task. Doctors who read patient reports also save time, because important, complicated reports are marked for them.
Feasibility & financials: Our solution is highly feasible: IKEM does not need any external services, and our models can run on IKEM’s existing in-house servers.
What is new about your solution?: A MedCzeGPT-2 LLM trained for summarization of medical texts, and a classifier that distinguishes complex cases from uneventful ones.
What you have built at the hackathon - text explanation + code (e.g. GitHub link): https://github.com/YuraHavrylko/medi-computing-machine
What you had before the hackathon, please mention open source as well: We had only basic ideas about how to approach summarization.
What comes next and what you wish to achieve: Going forward, we can improve the quality of the summarizer and the classifier by training on a larger dataset and adding more comprehensive preprocessing.