(IJCNN2026) Cell Instance Segmentation via Multi-Task Image-to-Image Schrödinger Bridge

Cell Instance Segmentation via Multi-Task Image-to-Image Schrödinger Bridge
Hayato Inoue, Shota Harada, ◦Shumpei Takezaki, Ryoma Bise
Kyushu University, Fukuoka, Japan

Slide 1: Overview
• Multi-task image-to-image Schrödinger bridge
• Conventional method (seg. model + post-processing): complex post-processing is required to correct unnatural mask predictions.
• Proposed method (gen. model): prevents unnatural mask predictions without complex post-processing.
[Figure: microscopic image → Schrödinger bridge (× $T$ steps) → multi-task prediction (mask and reverse distance map, concatenated) → cell instance mask]

Slide 2: Background: Cell instance segmentation
• Cell instance segmentation (CIS)
  • Segments a cell image into individual cells, for quantitative cell analysis.
  • Manual CIS has problems (burden, cost, etc.).
• Manual CIS is being replaced by automated CIS.
[Figure: an expert ("A lot of work…") is replaced by a segmentation model that maps a microscopic image to a cell instance mask]

Slide 3: Challenges in cell instance segmentation
• Challenging images cause unnatural masks.
  • Examples: ambiguous cell boundaries and background artifacts
• Unnatural predicted instance masks lead to errors.
[Figure: Image / GT / Prediction for two cases. Ambiguous cell boundaries: two cell instances are merged. Background artifacts: non-cell regions are detected as cells.]

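Slide 4: Conventional methods rely on post-processing
• Post-processing is applied to the predicted masks.
• Post-processing requires optimizing both the combination of operations and their parameters.
[Figure: Seg. model → Post-processing. Optimization selects a combination such as $\{f_C, f_I, f_S\}$ from the candidates $\{f_A, f_B, \dots, f_Z\}$, and optimized parameters $\{\theta_C^*, \theta_I^*, \theta_S^*\}$ from $\{\theta_A, \theta_B, \dots, \theta_Z\}$, where $\theta^*$ denotes an optimized parameter.]
Optimal processing: $f_C(f_I(f_S(\cdot \mid \theta_S^*) \mid \theta_I^*) \mid \theta_C^*)$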
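To give a rough sense of why this optimization is costly, the sketch below counts candidate pipelines. All sizes are assumptions for illustration and do not come from the slides: 26 candidate operations (reading A to Z literally), an ordered pipeline of 3 distinct operations, one parameter per operation, and a hypothetical grid of 10 values per parameter.

```python
import math

def count_pipelines(num_ops: int, pipeline_len: int, values_per_param: int) -> int:
    """Count ordered pipelines of distinct operations times their parameter settings."""
    orderings = math.perm(num_ops, pipeline_len)        # ordered choice without repetition
    param_settings = values_per_param ** pipeline_len   # one parameter per operation (assumed)
    return orderings * param_settings

# Assumed sizes, for illustration only.
total = count_pipelines(num_ops=26, pipeline_len=3, values_per_param=10)
print(total)  # 15600000
```

Even under these modest assumptions an exhaustive search would evaluate millions of pipelines, which is the burden the slide points at.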

Slide 5: Purpose: CIS without complex post-processing
• Conventional method (seg. model + post-processing): complex post-processing is required to correct unnatural mask predictions.
• Proposed method (gen. model): prevents unnatural mask predictions without complex post-processing.
• Our goal is to prevent unnatural mask predictions without complex post-processing.

Slide 6: Proposed method: Multi-task image-to-image Schrödinger bridge
• Two components to mitigate unnatural mask prediction
  • Image-to-image (I2I) generative modeling (Schrödinger bridge)
  • Boundary-aware multi-task prediction without annotation cost
[Figure: gen. model; image-to-image generative modeling (× $T$ steps) combined with boundary-aware multi-task prediction (outputs concatenated)]

Slides 7-9: Image-to-image generative modeling
• A Schrödinger bridge [1] is used for I2I generative modeling.
  • It preserves structure while predicting a mask from an image.
  • Generative modeling avoids predicting unnatural masks.
[Figure, slide 7: generative process ($T$ steps) from $p_{\mathrm{Image}}$ to $p_{\mathrm{Mask}}$ via a Schrödinger bridge]
[Figure, slide 8: at each step of the generative process, a UNet predicts the next step]
[Figure, slide 9: natural masks have high $p_{\mathrm{Mask}}$; unnatural masks have low $p_{\mathrm{Mask}}$]
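The T-step generative process can be sketched roughly as the loop below. This is a simplified illustration, not the method on the slides: `toy_predictor` is a hypothetical stand-in for the UNet, and the update is a plain deterministic interpolation toward the predicted endpoint, whereas an actual Schrödinger bridge sampler would use the posterior of the underlying diffusion process and may inject noise.

```python
import numpy as np

def toy_predictor(x_t: np.ndarray, t: int) -> np.ndarray:
    """Hypothetical stand-in for the UNet: guesses the final mask by thresholding."""
    return (x_t > 0.5).astype(np.float64)

def generate(image: np.ndarray, num_steps: int, predictor=toy_predictor) -> np.ndarray:
    """Walk from the image (step T) to a mask (step 0) in num_steps updates."""
    x_t = image.astype(np.float64)
    for t in range(num_steps, 0, -1):
        x0_hat = predictor(x_t, t)  # current estimate of the mask endpoint
        # Move a fraction 1/t of the remaining way toward the estimate, so the
        # final step (t == 1) lands on it. This interpolation is an assumption
        # made for the sketch, not the bridge posterior.
        x_t = x_t + (x0_hat - x_t) / t
    return x_t

image = np.array([[0.9, 0.2], [0.6, 0.4]])  # toy "microscopic image"
mask = generate(image, num_steps=4)
print(mask)
```

Starting the process from the image itself, rather than from pure noise, is what lets the intermediate states keep the image structure.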

Slide 10: Boundary-aware multi-task prediction without annotation cost (1/2)
• Cell instance masks do not emphasize cell boundaries.
  • Cell centers and boundaries have the same mask value (center = 1, boundary = 1).
• Cell instance masks alone are insufficient for learning cell separation.
[Figure: generative process from $p_{\mathrm{Image}}$ (microscopic image) to $p_{\mathrm{Multi}}$; the cell instance mask is 1 at every pixel of a cell]

Slide 11: Boundary-aware multi-task prediction without annotation cost (2/2)
• Reverse distance maps emphasize cell boundaries (center = 0, boundary = 1).
  • Multi-task prediction learns both cell features and boundaries.
• Boundary-aware supervision can be obtained without additional annotations.
[Figure: generative process from $p_{\mathrm{Image}}$ (microscopic image) to $p_{\mathrm{Multi}}$; the reverse distance map is 1 on the cell boundary and 0 at the cell center]
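One way a reverse distance map could be derived from an existing instance mask, with no extra annotation, is sketched below. The slides only state "center = 0, boundary = 1"; the per-instance min-max normalization and the brute-force distance computation are assumptions made for this example.

```python
import numpy as np

def reverse_distance_map(instance_mask: np.ndarray) -> np.ndarray:
    """Per-instance map that is 1 on the cell boundary and 0 at the cell center.

    instance_mask: 2-D integer array, 0 = background, k > 0 = pixels of cell k.
    """
    rdm = np.zeros(instance_mask.shape, dtype=np.float64)
    for label in np.unique(instance_mask):
        if label == 0:
            continue
        inside = np.argwhere(instance_mask == label)   # (n_in, 2) pixel coordinates
        outside = np.argwhere(instance_mask != label)  # (n_out, 2) pixel coordinates
        if len(outside) == 0:
            continue  # instance fills the whole image: nothing to measure from
        # Brute-force Euclidean distance from each inside pixel to the nearest
        # pixel outside the instance (adequate for small examples only).
        diff = inside[:, None, :] - outside[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
        d_min, d_max = dist.min(), dist.max()
        if d_max > d_min:
            rev = (d_max - dist) / (d_max - d_min)  # boundary -> 1, center -> 0
        else:
            rev = np.ones_like(dist)                # every pixel is a boundary pixel
        rdm[inside[:, 0], inside[:, 1]] = rev
    return rdm

mask = np.zeros((7, 7), dtype=int)
mask[1:6, 1:6] = 1           # one 5x5 "cell"
rdm = reverse_distance_map(mask)
print(rdm[3, 3], rdm[1, 1])  # center and corner of the cell
```

Because pixels of neighboring cells count as "outside", the border between two touching cells also receives values near 1. For full-size images, `scipy.ndimage.distance_transform_edt` would be a faster way to obtain the distances.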

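Slide 12: Multi-task image-to-image Schrödinger bridge
• Two tasks are predicted jointly: image → instance mask and image → reverse distance map.
[Figure: generative process from $p_{\mathrm{Image}}$ to $p_{\mathrm{Multi}}$; at each step a UNet predicts the next step of the concatenated instance mask and reverse distance map]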
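The concatenation of the two targets might look like the following. The slides say the mask and the reverse distance map are concatenated but do not specify the layout; the channel-first `(2, H, W)` arrangement and both function names are assumptions made for this sketch.

```python
import numpy as np

def make_multitask_target(binary_mask: np.ndarray, reverse_dist: np.ndarray) -> np.ndarray:
    """Stack the two targets into one (2, H, W) array."""
    if binary_mask.shape != reverse_dist.shape:
        raise ValueError("mask and reverse distance map must have the same shape")
    return np.stack(
        [binary_mask.astype(np.float32), reverse_dist.astype(np.float32)], axis=0
    )

def split_multitask_output(output: np.ndarray):
    """Recover the mask channel and the reverse-distance channel."""
    return output[0], output[1]

toy_mask = np.array([[0, 1], [1, 1]])
toy_rdm = np.array([[0.0, 1.0], [1.0, 0.5]])
target = make_multitask_target(toy_mask, toy_rdm)
mask_ch, rdm_ch = split_multitask_output(target)
print(target.shape)  # (2, 2, 2)
```

With this layout a single generative process can produce both outputs at once, and the mask channel is read off at the end.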

Slide 13: Experiments: Quantitative results
• Dataset: PanNuke (7,904 images, 256×256 pixels)
• Evaluation metric: bPQ (detection and mask quality)
• Comparisons: U-Net and CellViT (state of the art for CIS; uses post-processing)

Model                     bPQ
U-Net                     0.5799
U-Net w/ RV dist.         0.5837
CellViT w/o Post-Proc.    0.6149
CellViT                   0.6221
Ours                      0.6362

RV dist. = reverse distance map, Proc. = processing.
(Table columns on the slide: Model, Post-Proc., RV dist., Generative, bPQ.)
• Ours achieves the best performance.
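For reference, panoptic quality (PQ) for one pair of instance label maps can be computed as sketched below, using the standard definition: matches require IoU > 0.5, and PQ = (sum of matched IoUs) / (TP + 0.5·FP + 0.5·FN). The slides do not define bPQ; this sketch assumes it is the binary variant in which all cells are treated as a single class, and it omits any averaging over images or tissue types.

```python
import numpy as np

def panoptic_quality(gt: np.ndarray, pred: np.ndarray, iou_thresh: float = 0.5) -> float:
    """PQ for one pair of instance label maps (0 = background)."""
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [i for i in np.unique(pred) if i != 0]
    matched_pred = set()
    iou_sum, tp = 0.0, 0
    for g in gt_ids:
        g_mask = gt == g
        for p in pred_ids:
            if p in matched_pred:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            if inter == 0:
                continue
            iou = inter / np.logical_or(g_mask, p_mask).sum()
            if iou > iou_thresh:  # with a 0.5 threshold, IoU above it makes the match unique
                matched_pred.add(p)
                iou_sum += iou
                tp += 1
                break
    fp = len(pred_ids) - tp  # predictions with no matching ground truth
    fn = len(gt_ids) - tp    # ground-truth cells that were missed
    denom = tp + 0.5 * fp + 0.5 * fn
    # Returning 0.0 when both maps are empty is an arbitrary convention here.
    return float(iou_sum / denom) if denom > 0 else 0.0

gt = np.zeros((4, 4), dtype=int)
gt[0:2, 0:2] = 1             # first cell
gt[0:2, 2:4] = 2             # second cell, touching the first
merged = np.zeros((4, 4), dtype=int)
merged[0:2, 0:4] = 1         # prediction that merges the two cells
print(panoptic_quality(gt, gt), panoptic_quality(gt, merged))
```

In the toy example the merged prediction overlaps each ground-truth cell with an IoU of exactly 0.5, so neither cell is matched. This shows how merged instances are penalized in both the detection and mask terms.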

Slide 14: Experiments: Qualitative results
[Figure: columns Input, GT, CellViT w/o Post-Proc., CellViT, Ours]
• GT: two cells, separated
• CellViT w/o Post-Proc.: not separated
• CellViT: unnatural
• Ours: separated and natural

Slide 15: Summary
• Purpose: CIS without complex post-processing
• Method: Multi-task image-to-image Schrödinger bridge
• Results: Best segmentation performance on PanNuke
• Future work: Extension to multi-class cell segmentation
The multi-task Schrödinger bridge generates natural cell instance masks without complex post-processing.
