Microsoft's new generative AI image model can screen for cancer

Staff Writer
Nov 20, 2024

Updated: Dec 14, 2024

Microsoft has developed a generative AI-based medical image analysis model called BiomedParse that can improve early diagnosis of life-threatening diseases such as cancer. BiomedParse combines object recognition, detection, and segmentation to get faster and more precise clinical insights.

Currently, radiologists study medical images to track tumors and their growth using tools such as MedSAM and SAM that only focus on segmentation. Object recognition and detection are done separately, limiting the scope of analysis, claims Microsoft.

By combining these tasks, the model can more effectively identify, locate, and map tumor boundaries on complex medical images. Radiologists can use simple natural language prompts to direct the model’s attention to specific areas of interest.

Though several models can perform segmentation, detection, and object recognition together, BiomedParse is one of the first to leverage generative AI for medical image analysis.

For instance, Facebook AI Research’s (FAIR) Detectron2 and the TensorFlow Object Detection API are among the popular open-source platforms used for object detection and instance segmentation.

To train BiomedParse, researchers at Microsoft created a novel dataset using OpenAI's GPT-4 to synthesize data from standard segmentation datasets. This allows the new model to accurately segment biomedical objects across nine modalities, surpassing previous methods while significantly reducing user input.

The novel dataset, called BiomedParseData, comprises over 6 million image-mask-text triplets, covering 64 major and 82 fine-grained biomedical object types across nine modalities.
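To make the idea of an image-mask-text triplet concrete, here is a minimal sketch in Python. It is an illustration only, not the actual BiomedParseData format: the `Triplet` class, its field names, and the tiny 4x4 example are all assumptions. The helper also shows how a bounding box can be derived from a mask, whereas a box alone does not give an outline.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Grid = List[List[int]]


@dataclass(frozen=True)
class Triplet:
    """Hypothetical container for one image-mask-text training example."""
    image: Grid  # grayscale pixel grid (toy stand-in for a scan)
    mask: Grid   # binary grid, 1 where the described object is
    text: str    # natural-language description of the object

    def __post_init__(self) -> None:
        # The mask must line up with the image pixel for pixel.
        same_rows = len(self.image) == len(self.mask)
        same_cols = all(len(r) == len(m) for r, m in zip(self.image, self.mask))
        if not (same_rows and same_cols):
            raise ValueError("image and mask must have the same shape")

    def mask_area(self) -> int:
        """Number of pixels labelled as belonging to the object."""
        return sum(sum(row) for row in self.mask)


def bounding_box(mask: Grid) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the masked pixels, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Toy example: a 2x2 bright region described in plain language.
example = Triplet(
    image=[[0, 10, 12, 0], [0, 200, 210, 0], [0, 190, 205, 0], [0, 0, 0, 0]],
    mask=[[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
    text="tumor in abdominal CT",
)
```

Real triplets would hold full-resolution scans and masks, typically as arrays rather than nested lists.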

To build the dataset, Microsoft used natural language descriptions from 45 existing biomedical segmentation datasets. GPT-4 was used to organize these descriptions into a unified biomedical object taxonomy and to synthesize additional variations for robust text prompting. Training on this data helps the model identify, locate, and segment intricate biomedical objects within complex images.
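The two steps described here, mapping differently worded labels onto one taxonomy and generating several phrasings per object, can be sketched roughly as follows. This is a simplified stand-in that uses a hand-written lookup table and fixed templates instead of GPT-4; the `CANONICAL` entries, the `TEMPLATES`, and both function names are hypothetical.

```python
from typing import List

# Hypothetical mapping from raw dataset labels to one canonical object type.
CANONICAL = {
    "liver tumour": "liver tumor",
    "hepatic tumor": "liver tumor",
    "lt kidney": "left kidney",
}

# Hypothetical phrasing templates; the article says GPT-4 synthesized variations.
TEMPLATES = [
    "{obj} in {modality}",
    "{modality} showing {obj}",
    "segment the {obj} in this {modality} image",
]


def canonicalize(raw_label: str) -> str:
    """Normalize case and whitespace, then map the label onto the shared taxonomy."""
    key = " ".join(raw_label.lower().split())
    return CANONICAL.get(key, key)


def prompt_variations(raw_label: str, modality: str) -> List[str]:
    """Produce several text prompts that all refer to the same canonical object."""
    obj = canonicalize(raw_label)
    return [t.format(obj=obj, modality=modality) for t in TEMPLATES]
```

Pairing each mask with several such prompts is one plausible way to make a model less sensitive to how a user words a request.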

According to Microsoft, the model’s ability to learn semantic representations for individual object types is particularly relevant in the analysis of challenging scenarios involving irregularly shaped objects.

Users can simply input a natural language description of an object, and the model will automatically predict both its label and its precise outline within the image, eliminating the need for manual bounding box annotations.

Several Indian firms in the medical sector are also using AI for image analysis and early detection of cancer. A case in point is Bengaluru-based startup Niramai, which captures thermal images of the breast to show temperature variations on the skin’s surface. It uses AI-powered image analysis tools to examine the thermal patterns and look for characteristics that indicate cancerous tumors. Niramai claims that its AI can detect significantly smaller tumors (around 5 mm) that are hard to detect with traditional mammograms.

Microsoft is leveraging its partnership with OpenAI to develop GPT-4-based models that fill information gaps in a range of fields. It recently announced a collaboration with NASA to develop Earth Copilot, which aims to simplify public access to geospatial data.

Built on Microsoft’s Azure OpenAI Service, Earth Copilot can provide information on Earth-related events such as hurricanes, floods, or deforestation, and on their impact as captured in satellite data.

Image credit: Pexels