Automated radiology report generation offers an effective solution to alleviate radiologists' workload. However, most existing methods focus primarily on single or fixed-view images to model current disease conditions, which limits diagnostic accuracy and overlooks disease progression. Although some approaches utilize longitudinal data to track disease progression, they still rely on single images to analyze current visits. To address these issues, we propose enhanced contrastive learning with Multi-view Longitudinal data to facilitate chest X-ray Report Generation, named MLRG. Specifically, we introduce a multi-view longitudinal contrastive learning method that integrates spatial information from current multi-view images and temporal information from longitudinal data. This method also utilizes the inherent spatiotemporal information of radiology reports to supervise the pre-training of visual and textual representations. Subsequently, we present a tokenized absence encoding technique to flexibly handle missing patient-specific prior knowledge, allowing the model to produce more accurate radiology reports based on available prior knowledge. Extensive experiments on MIMIC-CXR, MIMIC-ABN, and Two-view CXR datasets demonstrate that our MLRG outperforms recent state-of-the-art methods, achieving a 2.3% BLEU-4 improvement on MIMIC-CXR, a 5.5% F1 score improvement on MIMIC-ABN, and a 2.7% F1 RadGraph improvement on Two-view CXR.
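For intuition, the sketch below shows a symmetric image-report InfoNCE objective of the kind used in contrastive pre-training. The tensor names and the idea of averaging current- and prior-view embeddings are illustrative assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE between pooled image and report embeddings.

    img_emb, txt_emb: (batch, dim) projections. In a multi-view
    longitudinal setting, img_emb could be, e.g., the average of the
    current-view and prior-view embeddings (an illustrative choice).
    """
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature  # pairwise cosine similarities
    targets = torch.arange(img_emb.size(0), device=img_emb.device)
    # Matched image-report pairs lie on the diagonal of the logits matrix.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```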
- 2025-03-16 Release checkpoints for MIMIC-ABN and Two-view CXR.
- 2025-03-01 Upload official code and checkpoints for MIMIC-CXR.
- 2025-03-01 Release `generated-radiology-reports`; the `labels` column corresponds to reference reports, while the `report` column contains generated reports.
- 2025-02-28 Release our paper on arXiv.
```
torch==2.3.1+cu118
transformers==4.43.3
torchvision==0.18.1+cu118
radgraph==0.09
python==3.9.0
```
- Please refer to `requirements.txt` for more details.
- As stated in Issue #4, the `transformers` version should be kept at 4.43.3 to prevent potential problems. Credit goes to @Andy for this clarification.
- Set up the same virtual environment as ours:
```bash
# create virtual environment
conda create -n mlrg python=3.9.0
# install packages
pip install -r requirements.txt
```
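After installation, the pinned versions can be confirmed from Python (a minimal sanity check; the expected values follow the list above):

```python
# Quick sanity check for the pinned dependency versions listed above.
import torch
import torchvision
import transformers

print(torch.__version__)         # expected: 2.3.1+cu118
print(torchvision.__version__)   # expected: 0.18.1+cu118
print(transformers.__version__)  # expected: 4.43.3
```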
- Checkpoints (pre-training and fine-tuning) and logs for the MIMIC-CXR dataset are available at Baidu Netdisk and Hugging Face 🤗.
- Checkpoints and logs for MIMIC-ABN
- Checkpoints and logs for Two-view CXR
- MIMIC-CXR and MIMIC-ABN are publicly accessible through PhysioNet, with images organized under the root directories `p10` through `p19`, consistent with MIMIC-CXR's default configuration.
- The IU X-ray dataset is publicly available at NIH; its root directory is `NLMCXR_png`.
- Two-view CXR dataset: `NLMCXR_png` + MIMIC-CXR images. Two-view CXR aggregates studies with two views from MIMIC-CXR and IU X-ray. For more details, please refer to arXiv.
- The file structure for all datasets is shown below:
```
files/
├── p10
│   └── p10000032
│       └── s50414267
│           ├── 02aa804e-bde0afdd-112c0b34-7bc16630-4e384014.jpg
│           └── 174413ec-4ec4c1f7-34ea26b7-c5f994f8-79ef1962.jpg
├── p11
├── p12
├── p13
├── p14
├── p15
├── p16
├── p17
├── p18
├── p19
└── NLMCXR_png
    ├── CXR1_1_IM-0001-3001.png
    ├── CXR1_1_IM-0001-4001.png
    └── CXR2_IM-0652-1001.png
```
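The layout can be verified with a small script (a hypothetical check, assuming the `files/` root shown above):

```python
from pathlib import Path

# Hypothetical sanity check that the expected dataset roots exist under files/.
root = Path('files')
expected = [f'p{i}' for i in range(10, 20)] + ['NLMCXR_png']
missing = [name for name in expected if not (root / name).is_dir()]
print('missing directories:', missing or 'none')
```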
- MIMIC-CXR and MIMIC-ABN: PhysioNet.
- Two-view CXR: Hugging Face 🤗.
- To simplify usage, we have organized the multi-view longitudinal data by `study_id` (a loading sketch follows this list). The processed datasets (MIMIC-CXR, MIMIC-ABN, and Two-view CXR) are available on Hugging Face 🤗 (PhysioNet authorization required). Note that the IU X-ray dataset (`NLMCXR_png`) does not include previous-visit data due to the absence of `study_id`.
- MIMIC-CXR: `five_work_mimic_cxr_annotation_v1.1.json`
- MIMIC-ABN: `mlrg_mimic_abn_annotation_v1.1.json`
- Two-view CXR: `mlrg_multiview_cxr_annotation_v1.1.json`
- View position for all datasets: `five_work_mimic_cxr_view_position_v1.1.json`
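The snippet below is a hedged illustration of how the annotation files might be inspected; the split names and record layout are assumptions, not a documented schema:

```python
import json

# Illustrative only: the actual schema of the annotation files may differ.
with open('five_work_mimic_cxr_annotation_v1.1.json') as f:
    ann = json.load(f)

# A common layout is {'train': [...], 'val': [...], 'test': [...]}, where each
# record is keyed by study_id and links the multi-view images of the current
# visit with the previous visit of the same patient when one exists.
for split in ('train', 'val', 'test'):
    print(split, len(ann.get(split, [])))
```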
```python
import pandas as pd

# compute_chexbert_details_scores is also available for per-pathology scores.
from tools.metrics.metrics import compute_all_scores, compute_chexbert_details_scores


def compute_performance_using_generated_reports():
    # Released generated reports for each dataset.
    mimic_cxr_generated_path = 'generated-radiology-reports/MIMIC-CXR/test_reports_epoch-1_20-10-2024_16-28-28.csv'
    mimic_abn_generated_path = 'generated-radiology-reports/MIMIC-ABN/test_reports_epoch-1_23-10-2024_10-25-20.csv'
    twoview_cxr_generated_path = 'generated-radiology-reports/Two-view CXR/test_reports_epoch-0_25-10-2024_11-38-35.csv'
    # Local checkpoints required by the CE metrics (CheXbert, BERT, RadGraph).
    args = {
        'chexbert_path': "/home/miao/data/dataset/checkpoints/chexbert.pth",
        'bert_path': "/home/miao/data/dataset/checkpoints/bert-base-uncased",
        'radgraph_path': "/home/miao/data/dataset/checkpoints/radgraph",
    }
    for generated_path in [mimic_cxr_generated_path, mimic_abn_generated_path, twoview_cxr_generated_path]:
        data = pd.read_csv(generated_path)
        # 'labels' holds the reference reports; 'report' holds the generated reports.
        gts, gens = data['labels'].tolist(), data['report'].tolist()
        scores = compute_all_scores(gts, gens, args)
        print(scores)
```
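A minimal entry point for the snippet above, assuming the CSV files and checkpoint paths exist locally:

```python
if __name__ == '__main__':
    # Prints NLG and CE scores for each of the three datasets.
    compute_performance_using_generated_reports()
```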
1. Download checkpoints for evaluation or initialization.
   - For CE metrics calculation: `chexbert.pth`, `radgraph`, and `bert-base-uncased`.
   - For model initialization: `microsoft/rad-dino` (image encoder), `microsoft/BiomedVLP-CXR-BERT-specialized` (text encoder), `distilbert/distilgpt2` (defines the text generator), and `cvt2distilgpt2` (initializes the text generator).
   - Checkpoint directory: place all checkpoints in a local directory (e.g., `/home/data/checkpoints`) and set the `--ckpt_zoo_dir /home/data/checkpoints` argument in the corresponding `script/**/**.sh` file (see the pre-flight check after the table below).
| Checkpoint | Variable name | Download |
|---|---|---|
| `chexbert.pth` | `chexbert_path` | stanfordmedicine |
| `bert-base-uncased` | `bert_path` | Hugging Face |
| `radgraph` | `radgraph_path` | PhysioNet |
| `microsoft/rad-dino` | `rad_dino_path` | Hugging Face |
| `microsoft/BiomedVLP-CXR-BERT-specialized` | `cxr_bert_path` | Hugging Face |
| `distilbert/distilgpt2` | `distilgpt2_path` | Hugging Face |
| `cvt2distilgpt2` | `cvt2distilgpt2_path` | GitHub |
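Before launching training, a quick pre-flight check (a hypothetical helper; the paths and names simply mirror the table and the `--ckpt_zoo_dir` argument above) can confirm every checkpoint is in place:

```python
from pathlib import Path

# Hypothetical pre-flight check; names follow the checkpoint table above and
# the directory passed via --ckpt_zoo_dir.
ckpt_zoo_dir = Path('/home/data/checkpoints')
required = [
    'chexbert.pth', 'bert-base-uncased', 'radgraph',
    'microsoft/rad-dino', 'microsoft/BiomedVLP-CXR-BERT-specialized',
    'distilbert/distilgpt2', 'cvt2distilgpt2',
]
for name in required:
    status = 'ok' if (ckpt_zoo_dir / name).exists() else 'MISSING'
    print(f'{status:>7}  {ckpt_zoo_dir / name}')
```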
2. Run Stages 1 and 2.

```bash
# Stage 1: Multi-view Longitudinal Contrastive Learning
cd script/MIMIC-CXR
bash run_cxr_pt_v0906_fs.sh

# Stage 2: Chest X-ray Report Generation based on Patient-specific Prior Knowledge
cd script/MIMIC-CXR
bash run_cxr_ft_mlrg_v1011.sh
```
If you use or extend our work, please cite our CVPR 2025 paper:
```
@misc{liu2025-mlrg,
      title={Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation},
      author={Kang Liu and Zhuoqi Ma and Xiaolu Kang and Yunan Li and Kun Xie and Zhicheng Jiao and Qiguang Miao},
      year={2025},
      eprint={2502.20056},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.20056},
}
```
- cvt2distilgpt2: some code is adapted from cvt2distilgpt2 [1], which builds on R2Gen.
- EVOKE: some code implementations are adapted from EVOKE, and the Two-view CXR dataset is sourced from this work.
[1] Nicolson, A., Dowling, J., & Koopman, B. (2023). Improving chest X-ray report generation by leveraging warm starting. Artificial Intelligence in Medicine, 144, 102633.
[2] Liu, K., Ma, Z., Xie, K., Jiao, Z., & Miao, Q. (2024). MCL: Multi-view Enhanced Contrastive Learning for Chest X-ray Report Generation. arXiv preprint arXiv:2411.10224.