In our Interviews with Innovators, we talked with Barry Canton from Ginkgo Bioworks about using large language models (LLMs) in bioengineering. It's a fascinating read:
Inspired by this interview, we compiled a list of medical LLMs. Medicine is one of the most crucial and rapidly advancing application areas for LLMs. We hope you'll find this list helpful.
LLaMA-based models
MEDITRON-70B: Built on Llama-2, pretrained on a medical corpus, achieving high performance on medical benchmarks. Paper, code
PMC-LLaMA: Adapted LLaMA using PubMed articles, significantly improving performance on medical benchmarks. Paper, code
Clinical Camel: Fine-tuned from LLaMA-2 using QLoRA and dialogue-based knowledge encoding, achieving strong performance on medical benchmarks (a QLoRA sketch follows this list). Paper
ChatDoctor: Fine-tuned from LLaMA, using 100,000 patient-doctor dialogues to enhance diagnostic accuracy. Paper, code
Radiology-Llama2: Specialized in radiology, trained on extensive radiology datasets, excelling in generating clinically useful impressions. Paper
Hippocrates: An open-source framework for developing medical LLMs, with models fine-tuned from Mistral and LLaMA 2. Paper, blog post
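Several of the models above, including Clinical Camel and ChatDoctor, adapt a LLaMA base with parameter-efficient fine-tuning on medical text or dialogues. The sketch below shows a minimal QLoRA setup with Hugging Face transformers and peft; the base checkpoint and hyperparameters are illustrative assumptions, not any paper's exact recipe.

```python
# Minimal QLoRA setup sketch for adapting a LLaMA-family model to medical text.
# The base checkpoint and LoRA hyperparameters are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Llama-2-7b-hf"   # assumed base checkpoint (gated on the Hub)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # 4-bit NF4 quantization: the "Q" in QLoRA
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(                 # small low-rank adapters on attention projections
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only the adapter weights are trainable

# From here, the quantized model plus adapters can be fine-tuned on
# doctor-patient dialogues with a standard Hugging Face Trainer loop.
```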
PaLM-based models
Med-PaLM: Builds on PaLM and Flan-PaLM for medical question answering, showing improved alignment with clinical standards. Paper, blog post
Med-PaLM 2: An advanced LLM for medical question answering, built on the PaLM 2 architecture with medical domain-specific fine-tuning. Paper, blog post
Mistral-based models
BioMistral: Fine-tuned from Mistral, trained on PubMed Central data, supporting multilingual evaluation. Paper, code
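BioMistral's weights are published openly, so trying it is a short script. The sketch below assumes the BioMistral/BioMistral-7B checkpoint ID on the Hugging Face Hub; treat the prompt and decoding settings as placeholders.

```python
# Quick generation sketch with BioMistral; the checkpoint ID is an assumption
# about the publicly hosted weights, so verify it before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BioMistral/BioMistral-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Question: What is the first-line treatment for uncomplicated hypertension?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```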
BERT, BART, and T5-based models
BioBART: Based on BART, pre-trained on PubMed abstracts, excelling in biomedical text generation tasks. Paper, code
ClinicalBERT: Pre-trained on clinical notes from MIMIC-III, significantly improving performance on clinical NLP tasks (see the masked-prediction sketch after this list). Paper, code
PubMedBERT: Pretrained from scratch on biomedical text, outperforming models pretrained on general-domain corpora. Paper
BioBERT: Adapted from BERT, pre-trained on biomedical corpora, excelling in named entity recognition, relation extraction, and question answering. Paper
SciBERT: Pretrained on scientific text, significantly outperforming BERT on various scientific NLP tasks. Paper, code
ClinicalT5: A T5-based model tailored for clinical text, pre-trained on clinical datasets, excelling in medical text generation and question answering. Paper, code
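The encoder models in this group are typically used as drop-in replacements for BERT on downstream biomedical tasks. As a minimal illustration, the sketch below runs masked-token prediction with a clinical BERT checkpoint; the emilyalsentzer/Bio_ClinicalBERT model ID is an assumption about the publicly hosted weights.

```python
# Masked-token prediction with a clinical BERT encoder; the model ID is an
# assumed public checkpoint, and the example sentence is purely illustrative.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="emilyalsentzer/Bio_ClinicalBERT")
for pred in fill_mask("The patient was prescribed [MASK] for type 2 diabetes."):
    print(f"{pred['token_str']:>15}  score={pred['score']:.3f}")
```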
GPT and Megatron-based models
GatorTronGPT: Uses a GPT-3-style architecture with 20 billion parameters, trained on clinical and general English text, improving performance on biomedical NLP tasks. Paper
GatorTron: A large-scale clinical language model with up to 8.9 billion parameters, trained on clinical notes and biomedical literature (see the embedding sketch after this list). Paper
BioMegatron: A Megatron-LM-based biomedical language model trained on a larger domain-specific corpus, showing consistent improvements across biomedical NLP benchmarks. Paper
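GatorTron is an encoder-style model despite the Megatron lineage, so a common use is extracting embeddings from clinical notes. The sketch below assumes the UFNLP/gatortron-base checkpoint is available on the Hugging Face Hub; the mean-pooling step is one simple choice, not the authors' prescribed method.

```python
# Embedding extraction from a clinical note with GatorTron; the checkpoint ID
# is an assumption, and mean pooling is just one simple pooling strategy.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "UFNLP/gatortron-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

note = "Patient presents with acute chest pain radiating to the left arm."
inputs = tokenizer(note, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state   # (1, seq_len, hidden_size)
note_embedding = hidden.mean(dim=1)              # simple mean-pooled note embedding
print(note_embedding.shape)
```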
Knowledge and contrastive-based models
DRAGON: Integrates text and knowledge graph data, combining masked language modeling and KG link prediction. Paper, code
KeBioLM: Integrates knowledge from UMLS with PubMed abstract entities, enhancing named entity recognition and relation extraction. Paper, code
MedCPT: Uses contrastive pre-training on PubMed search logs for zero-shot biomedical information retrieval. Paper, code
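To make the contrastive-retrieval idea concrete, the sketch below scores a query against candidate abstracts with separate query and article encoders. The ncbi/MedCPT-Query-Encoder and ncbi/MedCPT-Article-Encoder IDs are assumptions about the released checkpoints, and real usage would typically pass title/abstract pairs to the article encoder.

```python
# Zero-shot retrieval sketch in the style of MedCPT: encode the query and the
# candidate texts separately, then rank by dot-product similarity of the [CLS]
# embeddings. Model IDs and example texts are assumptions for illustration.
import torch
from transformers import AutoModel, AutoTokenizer

q_tok = AutoTokenizer.from_pretrained("ncbi/MedCPT-Query-Encoder")
q_enc = AutoModel.from_pretrained("ncbi/MedCPT-Query-Encoder")
a_tok = AutoTokenizer.from_pretrained("ncbi/MedCPT-Article-Encoder")
a_enc = AutoModel.from_pretrained("ncbi/MedCPT-Article-Encoder")

query = "treatment options for atrial fibrillation"
articles = [
    "Catheter ablation versus antiarrhythmic drugs for atrial fibrillation.",
    "Metformin as first-line therapy in type 2 diabetes mellitus.",
]

with torch.no_grad():
    q_emb = q_enc(**q_tok(query, return_tensors="pt")).last_hidden_state[:, 0]
    a_inputs = a_tok(articles, padding=True, truncation=True, return_tensors="pt")
    a_emb = a_enc(**a_inputs).last_hidden_state[:, 0]

scores = q_emb @ a_emb.T   # dot-product similarity learned via contrastive pre-training
print(scores)              # the cardiology abstract should score highest
```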
Multimodal and specialized models
OphGLM: Integrates fundus images and medical dialogue for ophthalmology, fine-tuned on medical instructions and dialogues. Paper
Med-Gemini: Optimized for 2D and 3D imaging and genomics data, excelling in various medical tasks. Paper, blog post
MedVersa: Uses a large language model to integrate visual and linguistic data, supporting multimodal inputs for versatile medical image analysis. Paper
PH-LLM: Fine-tuned from Gemini, processes personal health data from mobile and wearable devices for personalized insights. Paper