Foundation Model Prompting for Medical Image Classification Challenge 2023


News

[2023.12.15] The NeurIPS 2023 Workshop was successfully held.  

[2023.10.16] We have created a new leaderboard for follow-on submissions; please give it a try!

[2023.09.01] We recommend that participants read the submission instructions on the submission page carefully before submitting final evaluation results. Each participant has only THREE chances to submit.

[2023.08.28] To make the evaluation phase fair, we ask that all participants follow the official few-shot split (data file lists can be found here), which is also detailed in the submission instructions.

[2023.08.25] The test phase is open, and the validation set annotations and test set images are released.

[2023.07.15] The online evaluation submission is open; the submission tutorial and example can be found here.

[2023.06.25] The validation phase submission pipeline is being debugged; we will release the submission interface, together with the corresponding tutorial and examples, as soon as possible.

[2023.05.26] The MedFM 2023 website is now fully open. Please check the timeline.

[2023.05.25] MedFM 2023 has been accepted as a NeurIPS'2023 Competition! More details will be announced soon.



Timeline

1. Release of training data and validation data: May 26th (12:00 AM GMT), 2023;

2. Validation phase submission opens: July 15th (12:00 AM GMT), 2023;

3. Evaluation phase submission opens: Aug. 25th (12:00 AM GMT), 2023;

4. Submission closing date: Sept. 15th (12:00 AM GMT), 2023;

5. Short paper and source code submission deadline: Sept. 15th (12:00 AM GMT), 2023;

6. NeurIPS'2023 workshop: Dec. 15th (12:00 AM GMT), 2023.



How to Participate

Note:

  • Participants are not allowed to use external medical image datasets during this challenge. Foundation models from other areas, such as natural images and natural language processing, are allowed.
  • In the setting of our tasks, the few-shot number is counted by the number of patients rather than the number of images. Participants may only use the corresponding few-shot samples from the training set; using the full training set is not allowed (see the sampling sketch after this list).
  • When making the final submission, participants should provide links to the pre-trained models they used in this challenge in their methodology paper.
  • Please do not upload your Docker image to Docker Hub during the challenge.
  • Note that some images from the same patients appear in both the training and validation sets; this is acceptable because our main task is few-shot learning. Validation submission results are important but are not counted in the final testing submission. The reserved testing set is disjoint from the public training and validation sets, so we urge all participants to treat validation submissions carefully.
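The patient-level counting rule above can be made concrete with a small sampling sketch. This is only an illustration of how an N-shot split could be drawn; the CSV column names (image_id, patient_id, label) are hypothetical, and the official few-shot split files released by the organizers must be used for actual submissions.

```python
# Illustrative N-shot sampling where N counts *patients*, not images.
# The CSV schema (image_id, patient_id, label) is hypothetical; use the
# official split files for real submissions.
import csv
import random
from collections import defaultdict

def sample_few_shot(label_file, n_shot, seed=0):
    # label -> patient_id -> list of image_ids
    by_class = defaultdict(lambda: defaultdict(list))
    with open(label_file) as f:
        for row in csv.DictReader(f):
            by_class[row["label"]][row["patient_id"]].append(row["image_id"])

    rng = random.Random(seed)
    selected = []
    for label, patients in by_class.items():
        # draw n_shot *patients* per class ...
        for pid in rng.sample(sorted(patients), k=min(n_shot, len(patients))):
            selected.extend(patients[pid])  # ... and keep all of their images
    return selected
```

Because every image of a selected patient is kept, the number of training images per class may exceed N; only the patient count is capped.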

Stage 1. Join the competition and download the data

  • Register on the website and verify your account.
  • Click the green 'Join' button to participate in the Challenge. Please make sure that your Grand Challenge profile is complete (e.g., Name, Institution, Department, and Location).

Stage 2. Develop your model and make validation submissions

  • We provide an official baseline on GitHub; you can follow it to go through the whole process of the MedFM Challenge.
  • In the validation submission phase, participants only need to submit CSV files containing the predictions on the validation set of each dataset under the 1/5/10-shot settings; a format sketch follows below. We offer one submission opportunity per day throughout the Validation Phase.
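As a rough illustration of what such a submission file might look like, the sketch below writes per-image class probabilities to a CSV. The column layout is an assumption made for illustration only; the exact required format is defined in the official submission instructions.

```python
# Hypothetical validation-submission writer: one row per image with the
# predicted probability of each class. The column layout is illustrative;
# follow the official submission instructions for the exact format.
import csv

def write_prediction_csv(image_ids, probs, class_names, out_path):
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image_id", *class_names])
        for img_id, p in zip(image_ids, probs):
            writer.writerow([img_id, *[f"{v:.6f}" for v in p]])

# e.g. one file per task and shot setting (file name is hypothetical):
# write_prediction_csv(ids, chest_probs, CHEST_CLASSES, "chestdr_10-shot.csv")
```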

Stage 3. Make testing submission

  • To avoid overfitting the reserved dataset, we offer only three successful submission opportunities in the Testing Phase. In this phase, participants train their models on the few-shot samples of the public MedFMC dataset and test performance on the reserved MedFMC testing dataset (see Important Dates). More details can be found here.


Awards

1. Monetary awards for the top-3 winners (best ranking across all three tasks): 1st place: $1,000; 2nd place: $600; 3rd place: $400.

2. Outstanding winners with groundbreaking solutions will be invited to submit their work to our special issue of the prestigious journal Medical Image Analysis.

3. The top-10 winners will be invited to contribute their solutions (as coauthors) to a summary paper.

4. Student participants on the winning teams will be considered for admission and scholarships at the organizers' institutes.



Overview

In the past few years, foundation models have become a major trend in deep learning, especially in computer vision and natural language processing. Many milestone works have been proposed, such as the Vision Transformer (ViT), the Generative Pretrained Transformer (GPT), and Contrastive Language-Image Pretraining (CLIP). These models aim to solve many downstream tasks by leveraging the robust representation learning and generalization abilities of foundation models.

However, the scarcity of publicly available, well-annotated data in medical imaging has been the bottleneck for training large-scale deep learning models for many downstream clinical applications. Repeatedly hand-labeling volumetric data remains a tedious and time-consuming job for medical professionals, whereas providing a few representative sample cases is far more feasible and mirrors the training process of medical residents. In this case, a foundation model, often trained with billions of images and possibly other modalities of data, could serve as the base for building applications from a single case or a few cases in the form of prompt learning.

Thus, we propose holding this challenge on foundation models for medical image analysis in conjunction with Grand Challenge. The challenge aims to reveal the power of foundation models in easing the effort of obtaining quality annotations and in improving the classification accuracy of tail classes. It aligns with the recent trend and success of the foundation models mentioned above across a variety of downstream applications. It is designed to promote foundation models and their application to frontline clinical problems in a setting where only one or a few sample cases (with quality annotations) are available during downstream task adaptation. It also fits the long-tailed classification scenario, where only a few cases of rare diseases are available for training. In short, our challenge aims to advance techniques for prompting large-scale pre-trained foundation models with a few data samples, as a new paradigm for medical image analysis, with the classification tasks proposed here as use cases; a minimal prompt-tuning sketch follows below.
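To make the prompting paradigm concrete, here is a minimal sketch of shallow visual prompt tuning on a frozen ViT backbone: only a handful of learnable prompt tokens and a small classification head are trained on the few-shot samples. The timm backbone name and the internal attribute names (patch_embed, cls_token, pos_embed, blocks, norm) are assumptions that hold for timm's VisionTransformer but may differ across versions; this is a sketch of the general technique, not the official baseline.

```python
# Sketch of shallow visual prompt tuning: freeze the foundation model,
# learn only prompt tokens plus a linear head on the few-shot split.
import torch
import torch.nn as nn
import timm  # assumed available; backbone name is an assumption


class PromptedViT(nn.Module):
    def __init__(self, num_classes, num_prompts=10,
                 backbone="vit_base_patch16_224"):
        super().__init__()
        self.vit = timm.create_model(backbone, pretrained=True)
        for p in self.vit.parameters():      # freeze the foundation model
            p.requires_grad = False
        dim = self.vit.embed_dim
        # learnable prompt tokens, inserted right after the [CLS] token
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)
        self.head = nn.Linear(dim, num_classes)  # task-specific head

    def forward(self, x):
        b = x.shape[0]
        x = self.vit.patch_embed(x)                          # (B, N, D)
        cls = self.vit.cls_token.expand(b, -1, -1)           # (B, 1, D)
        x = torch.cat([cls, x], dim=1) + self.vit.pos_embed  # add positions
        x = torch.cat([x[:, :1],                             # keep [CLS] first
                       self.prompts.expand(b, -1, -1),       # insert prompts
                       x[:, 1:]], dim=1)
        x = self.vit.norm(self.vit.blocks(x))
        return self.head(x[:, 0])                            # classify [CLS]


# Only the prompts and the head are optimized on the few-shot samples.
model = PromptedViT(num_classes=19)  # class count is illustrative
optim = torch.optim.AdamW(
    [model.prompts] + list(model.head.parameters()), lr=1e-3)
```

Because the backbone stays frozen, the number of trainable parameters is tiny, which is what makes adaptation from 1, 5, or 10 cases per class plausible in the first place.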

In the challenge, during the training phase, a small amount of our private data is used for initial training (a few selected samples) and for validation (the rest of the dataset). Participants are encouraged to achieve high performance on three application tasks: thoracic disease screening on radiological images, pathological tumor tissue classification on pathology images, and lesion classification in colonoscopy images. The final evaluation will be conducted in the same setting on the reserved private datasets, i.e., a random selection of a few samples for training and the rest for testing. The final metrics will be averaged over five individual runs of the same prompting/testing process, as sketched below.
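A minimal sketch of this averaging protocol, assuming a hypothetical run_fn(seed) that draws a seed-specific few-shot split, trains, and returns one metric value:

```python
# Repeat the same prompting/testing process with different random draws of
# the few-shot training cases and report the mean, mirroring the challenge's
# five-run evaluation protocol. run_fn is a hypothetical callable.
import statistics

def average_over_runs(run_fn, n_runs=5):
    scores = [run_fn(seed) for seed in range(n_runs)]
    return statistics.mean(scores), statistics.stdev(scores)
```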

Task 1: Thoracic Disease Screening (ChestDR)

Chest X-ray is a routinely adopted imaging technique in daily clinical practice. Many thoracic diseases are reported from these images, and further examinations are recommended for differential diagnosis. Given the large study volumes and fast reporting requirements of certain emergency facilities, swift screening and reporting of common thoracic diseases could greatly improve the efficiency of the clinical process. In this task, participants shall aim to provide an accurate thoracic disease screening approach by utilizing foundation models (limited to publicly accessible ones) under the one-shot/few-shot learning setting, e.g., with 1, 5, or 10 sample cases for each disease category.
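Since a single chest X-ray can show several findings at once, screening is naturally framed as multi-label classification. The training step below is a minimal sketch under that assumption (multi-hot labels, one sigmoid per disease); the formulation is an assumption for illustration, not the official specification.

```python
# Minimal multi-label training step: one independent sigmoid per disease,
# multi-hot label vectors. The multi-label framing is an assumption.
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

def train_step(model, optimizer, images, multi_hot_labels):
    logits = model(images)                       # (B, num_diseases)
    loss = criterion(logits, multi_hot_labels.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```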




Task 2: Pathological Tumor Tissue Classification (Colon)

Pathology examination supports the detection of early-stage cancer cells in small tissue slices. In their daily routine, pathologists are required to look over dozens of tissue slides, a tiresome and tedious job. In clinical diagnosis, quantifying cancer cells and regions is the primary goal for pathologists. Approaches for classifying pathological tissue patches are therefore desired to ease this process and help screen, in a sliding-window manner, whether regions of malignant cells exist anywhere in a slide. Similar to the setting of Task 1, participants shall aim to provide an accurate model adaptation approach based on foundation models, under the one-shot/few-shot learning setting, e.g., with 1, 5, or 10 sample cases for the lesion and non-lesion classes.
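The sliding-window screening described above can be sketched as follows, assuming the slide is already in memory as an (H, W, 3) uint8 array and `model` is a binary patch classifier producing one tumor logit; patch size, stride, and threshold are illustrative choices, not the official protocol.

```python
# Sliding-window tumor screening over a whole-slide image. Patch size,
# stride, threshold, and the single-logit model output are assumptions.
import torch

def screen_slide(model, slide, patch=224, stride=224, thresh=0.5):
    flagged = []                          # (row, col) of suspicious patches
    model.eval()
    with torch.no_grad():
        for r in range(0, slide.shape[0] - patch + 1, stride):
            for c in range(0, slide.shape[1] - patch + 1, stride):
                tile = slide[r:r + patch, c:c + patch]   # (patch, patch, 3)
                x = torch.from_numpy(tile).permute(2, 0, 1).float()[None] / 255
                prob = torch.sigmoid(model(x))[0, 0].item()  # tumor prob.
                if prob > thresh:
                    flagged.append((r, c))
    return flagged  # any hit flags the slide for pathologist review
```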




Task 3: Lesion Detection in Colonoscopy Images (Endo)

Colorectal cancer is one of the most common and most fatal cancers among men and women around the world. Abnormalities such as polyps and ulcers are precursors to colorectal cancer and are often found during colonoscopy screening of people aged over 50; the risk increases markedly with age. Colonoscopy is the gold standard for the detection and early diagnosis of such abnormalities, with biopsy performed on site when necessary, which can significantly improve the survival rate of colorectal cancer. Automatic detection of such lesions during the colonoscopy procedure could prevent missed lesions and reduce the workload of gastroenterologists. In this task, participants shall aim to provide accurate classification of four different lesion types by utilizing the provided foundation model under the one-shot/few-shot learning setting, e.g., with 1, 5, or 10 sample cases for each lesion category.





Sponsors