Adapting MiniGPT-4 (Vicuna 7B) for Tumor Detection with Bounding Box Prediction Using 4 MRI Modalities

In this post, we explain how to adapt MiniGPT-4 (Vicuna 7B) to predict bounding box coordinates for tumors across four MRI modalities. We'll walk through creating a custom dataloader, training the model, and evaluating it.


Fusion Model

The goal is to train a Large Language Model (LLM) that accepts four MRI image inputs and predicts bounding box coordinates for tumor locations. Each image corresponds to a different MRI modality.

For training, we will use prompts such as:

"<Img1><ImageHere></Img1><Img2><ImageHere></Img2><Img3><ImageHere></Img3><Img4><ImageHere></Img4> where is the tumor?"

We will introduce a dedicated special token so the model can predict the bounding box coordinates, which it outputs in the order [maxx, maxy, minx, miny].
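To make the target format concrete, here is a minimal sketch of how a training target string could be built from a ground-truth box. The <bbox> token name is an assumption for illustration; use whichever special token you actually register with the tokenizer.

def box_to_target(box):
    # box is [maxx, maxy, minx, miny], matching the order described above.
    # "<bbox>"/"</bbox>" are hypothetical special tokens; substitute the
    # token(s) you add to the tokenizer's vocabulary.
    maxx, maxy, minx, miny = box
    return f"<bbox>[{maxx}, {maxy}, {minx}, {miny}]</bbox>"

# box_to_target([96, 84, 68, 68]) -> "<bbox>[96, 84, 68, 68]</bbox>"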

Example ‘summary.jsonl’ entry:

{
  "t2f Sagittal_path": "BraTS-GLI-00000-000_2_0.jpg",
  "t2w Sagittal_path": "BraTS-GLI-00000-000_2_2.jpg",
  "t1n Sagittal_path": "BraTS-GLI-00000-000_2_3.jpg",
  "t1c Sagittal_path": "BraTS-GLI-00000-000_2_4.jpg",
  "box": [[96, 84, 68, 68]]
}

Code Implementation
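Below is a minimal sketch of the custom dataloader, assuming a summary.jsonl in the format shown above, images stored under a common root directory, and a vis_processor callable that mirrors MiniGPT-4's image preprocessing (the exact interface is an assumption). It reuses the box_to_target helper sketched earlier.

import json
import os

from PIL import Image
from torch.utils.data import Dataset

# Fixed modality order, using the keys from summary.jsonl.
MODALITY_KEYS = [
    "t2f Sagittal_path",
    "t2w Sagittal_path",
    "t1n Sagittal_path",
    "t1c Sagittal_path",
]

PROMPT = (
    "<Img1><ImageHere></Img1><Img2><ImageHere></Img2>"
    "<Img3><ImageHere></Img3><Img4><ImageHere></Img4> "
    "where is the tumor?"
)

class BraTSBoxDataset(Dataset):
    """Loads the four MRI modality slices and the tumor bounding box."""

    def __init__(self, jsonl_path, image_root, vis_processor):
        # One JSON object per line, as in the summary.jsonl example above.
        with open(jsonl_path) as f:
            self.entries = [json.loads(line) for line in f]
        self.image_root = image_root
        self.vis_processor = vis_processor  # image transform, e.g. resize + normalize

    def __len__(self):
        return len(self.entries)

    def __getitem__(self, idx):
        entry = self.entries[idx]
        # Load and preprocess one image per modality, in a fixed order.
        images = [
            self.vis_processor(
                Image.open(os.path.join(self.image_root, entry[key])).convert("RGB")
            )
            for key in MODALITY_KEYS
        ]
        # "box" holds a list of boxes; this sketch uses the first one.
        target = box_to_target(entry["box"][0])
        return {"images": images, "prompt": PROMPT, "target": target}

A standard torch.utils.data.DataLoader can batch these samples; each of the four image tensors is then fed into the slot marked by the corresponding <ImageHere> placeholder in the prompt.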



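For evaluation, a common choice is Intersection over Union (IoU) between the predicted and ground-truth boxes. The sketch below assumes the model emits its coordinates in the [maxx, maxy, minx, miny] format described above; the regular-expression parsing is illustrative only.

import re

def parse_box(text):
    # Pull the first four integers out of the model's output string,
    # interpreted as [maxx, maxy, minx, miny].
    nums = re.findall(r"-?\d+", text)
    return [int(n) for n in nums[:4]] if len(nums) >= 4 else None

def iou(box_a, box_b):
    # Both boxes are [maxx, maxy, minx, miny].
    maxx_a, maxy_a, minx_a, miny_a = box_a
    maxx_b, maxy_b, minx_b, miny_b = box_b
    # Width and height of the intersection rectangle (zero if disjoint).
    iw = max(0, min(maxx_a, maxx_b) - max(minx_a, minx_b))
    ih = max(0, min(maxy_a, maxy_b) - max(miny_a, miny_b))
    inter = iw * ih
    area_a = (maxx_a - minx_a) * (maxy_a - miny_a)
    area_b = (maxx_b - minx_b) * (maxy_b - miny_b)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# iou([96, 84, 68, 68], [96, 84, 68, 68]) == 1.0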
