(IJCNN2026) Cell Instance Segmentation via Multi-Task Image-to-Image Schrödinger Bridge

https://arxiv.org/abs/2604.12318

Existing cell instance segmentation pipelines typically combine deterministic predictions with post-processing, which imposes limited explicit constraints on the global structure of instance masks. In this work, we propose a multi-task image-to-image Schrödinger Bridge framework that formulates instance segmentation as a distribution-based image-to-image generation problem. Boundary-aware supervision is integrated through a reverse distance map, and deterministic inference is employed to produce stable predictions. Experimental results on the PanNuke dataset demonstrate that the proposed method achieves competitive or superior performance without relying on SAM pre-training or additional post-processing. Additional results on the MoNuSeg dataset show robustness under limited training data. These findings indicate that Schrödinger Bridge-based image-to-image generation provides an effective framework for cell instance segmentation.

Shumpei Takezaki

June 25, 2026

Transcript

  1. Cell Instance Segmentation via Multi-Task Image-to-Image Schrödinger Bridge
    Hayato Inoue, Shota Harada, ◦Shumpei Takezaki, Ryoma Bise
    Kyushu University, Fukuoka, Japan
  2. Overview
    • Multi-task image-to-image Schrödinger bridge
    Conventional method: Seg. model + post-processing. Complex post-processing is required to correct unnatural mask predictions.
    Proposed method: Gen. model (Schrödinger bridge, × $T$ steps) with multi-task prediction (mask + reverse distance map, concatenated). Unnatural mask predictions are prevented without complex post-processing.
    Figure: microscopic image → cell instance mask.
  3. Background: Cell instance segmentation
    • Cell instance segmentation (CIS): segmenting a cell image into individual cells, for quantitative cell analysis.
    • Manual CIS has problems (burden, cost, etc.), so manual CIS is replaced by automated CIS.
    Figure: an expert turning a microscopic image into a cell instance mask ("A lot of work…") is replaced by a segmentation model.
  4. Challenges in cell instance segmentation
    • Challenging images cause unnatural masks.
    • Examples: ambiguous cell boundaries and background artifacts.
    Predicted unnatural instance masks lead to errors.
    Figure (Image / GT / Prediction): with ambiguous cell boundaries, two cell instances are merged; with background artifacts, non-cell regions are detected as cells.
  5. Conventional methods rely on post-processing
    • Post-processing is applied to the predicted masks.
    • Post-processing requires optimizing both the combination of steps and their parameters.
    Candidate combinations: $\{f_A, f_B, \dots, f_Z\}$; candidate parameters: $\{\theta_A, \theta_B, \dots, \theta_Z\}$.
    After optimization: selected steps $\{f_C, f_I, f_S\}$ with parameters $\{\theta_C^*, \theta_I^*, \theta_S^*\}$ ($\theta^*$: optimized parameter).
    Optimal processing: $f_C(f_I(f_S(\cdot \mid \theta_S^*) \mid \theta_I^*) \mid \theta_C^*)$
    Figure: Seg. model → post-processing.
  6. Purpose: CIS without complex post-processing
    Our goal is to prevent unnatural mask predictions without complex post-processing.
    Conventional method: Seg. model + post-processing. Complex post-processing is required to correct unnatural mask predictions.
    Proposed method: Gen. model. Unnatural mask predictions are prevented without complex post-processing.
  7. Proposed method: Multi-task image-to-image Schrödinger bridge
    • Two components mitigate unnatural mask predictions:
      • Image-to-image (I2I) generative modeling (Schrödinger bridge)
      • Boundary-aware multi-task prediction without annotation cost
    Figure: Gen. model performing image-to-image generative modeling (× $T$ steps); boundary-aware multi-task prediction concatenates the two outputs.
  8. Image-to-image generative modeling
    • A Schrödinger bridge [1] is used for I2I generative modeling.
    • Structure is preserved while a mask is predicted from an image.
    • Generative modeling avoids predicting unnatural masks.
    Figure: the Schrödinger bridge connects $p_{\text{Image}}$ to $p_{\text{Mask}}$ through a generative process of $T$ steps.
  9. Image-to-image generative modeling
    • A Schrödinger bridge [1] is used for I2I generative modeling.
    • Structure is preserved while a mask is predicted from an image.
    • Generative modeling avoids predicting unnatural masks.
    Figure: in the generative process from $p_{\text{Image}}$ to $p_{\text{Mask}}$ ($T$ steps), a UNet predicts the next step.
  10. Image-to-image generative modeling
    • A Schrödinger bridge [1] is used for I2I generative modeling.
    • Structure is preserved while a mask is predicted from an image.
    • Generative modeling avoids predicting unnatural masks.
    Figure: in the generative process from $p_{\text{Image}}$ to $p_{\text{Mask}}$ ($T$ steps), a UNet predicts the next step. Natural masks lie where $p_{\text{Mask}}$ is high; unnatural masks lie where $p_{\text{Mask}}$ is low.
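
The multi-step generative process with deterministic inference can be illustrated with a minimal sketch. It assumes a Gaussian bridge in which the network estimates the mask-domain endpoint at every step and the sampler keeps only the posterior mean, dropping the noise term. The function name, the `predict_clean` callable (a stand-in for the trained UNet), and the `betas` schedule are all hypothetical; this is not the authors' implementation.

```python
import numpy as np

def deterministic_bridge_sampling(image, predict_clean, betas):
    """Walk from x_T = image toward the mask domain using posterior means only.

    image:         array, the starting point (a sample from p_Image).
    predict_clean: hypothetical callable (x_n, n) -> estimate of the endpoint x_0;
                   it stands in for the trained UNet.
    betas:         positive per-step variance increments, length T (assumed schedule).
    """
    betas = np.asarray(betas, dtype=float)
    # sigma2[n] = variance accumulated between step 0 and step n.
    sigma2 = np.concatenate([[0.0], np.cumsum(betas)])
    x = np.asarray(image, dtype=float)
    for n in range(len(betas), 0, -1):      # n = T, ..., 1
        x0_hat = predict_clean(x, n)
        alpha2 = betas[n - 1]               # variance accumulated between n-1 and n
        s2 = sigma2[n - 1]                  # variance accumulated between 0 and n-1
        w0 = alpha2 / (alpha2 + s2)         # weight on the predicted endpoint
        # Mean of the Gaussian posterior of x_{n-1} given (x0_hat, x_n);
        # no noise is added, so the trajectory is deterministic.
        x = w0 * x0_hat + (1.0 - w0) * x
    return x
```

At the last step the accumulated variance `s2` is zero, so the output equals the network's final endpoint estimate; earlier steps blend the estimate with the current state.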
  11. Boundary-aware multi-task prediction without annotation cost (1/2)
    • Cell instance masks do not emphasize cell boundaries.
    • Cell centers and boundaries have the same mask value.
    Cell instance masks alone are insufficient for learning cell separation.
    Figure: generative process from $p_{\text{Image}}$ (microscopic image) to $p_{\text{Multi}}$; in the cell instance mask, center = 1 and boundary = 1.
  12. Boundary-aware multi-task prediction without annotation cost (2/2)
    • Reverse distance maps emphasize the cell boundaries.
    • Multi-task prediction learns cell features and boundaries.
    Boundary-aware supervision can be obtained without additional annotations.
    Figure: generative process from $p_{\text{Image}}$ (microscopic image) to $p_{\text{Multi}}$; in the reverse distance map, center = 0 and boundary = 1.
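
A reverse distance map of this kind can be derived from existing instance labels alone, which is why no extra annotation is needed. The sketch below is one plausible construction, not the authors' code: it assumes a labelled instance image (0 = background), a per-cell Euclidean distance transform, per-cell min-max normalization, and a background value of 0. The function name and these choices are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def reverse_distance_map(instance_labels):
    """Build a map that is 1 on each cell's boundary and 0 at its center.

    instance_labels: 2D int array, 0 = background, k > 0 = k-th cell.
    """
    labels = np.asarray(instance_labels)
    rdm = np.zeros(labels.shape, dtype=float)   # background stays 0 (assumption)
    for k in np.unique(labels):
        if k == 0:
            continue
        inside = labels == k
        # Distance from each pixel of this cell to the nearest pixel outside it.
        dist = ndimage.distance_transform_edt(inside)
        d = dist[inside]
        span = d.max() - d.min()
        if span > 0:
            # Normalize per cell, then reverse: boundary -> 1, center -> 0.
            rdm[inside] = 1.0 - (d - d.min()) / span
        else:
            # Cells too thin to have an interior are all boundary.
            rdm[inside] = 1.0
    return rdm
```

Computing the transform per cell, rather than on the merged foreground, keeps the boundary between two touching cells at 1, which is the signal that separates them.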
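  13. Multi-task image-to-image Schrödinger bridge
    Figure: generative process from $p_{\text{Image}}$ to $p_{\text{Multi}}$, with a UNet predicting the next step. The two tasks, image → instance mask and image → reverse distance map, are predicted together, and the instance mask and reverse distance map are concatenated.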
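
The concatenation of the two targets can be sketched as a simple channel stack. The channel-first layout and both function names are assumptions for illustration; the slides only state that the mask and the reverse distance map are concatenated.

```python
import numpy as np

def make_multitask_target(mask, reverse_distance):
    """Stack the two maps channel-first, giving shape (2, H, W)."""
    mask = np.asarray(mask, dtype=float)
    reverse_distance = np.asarray(reverse_distance, dtype=float)
    if mask.shape != reverse_distance.shape:
        raise ValueError("both maps must share the same spatial shape")
    return np.stack([mask, reverse_distance], axis=0)

def split_multitask_output(output):
    """Inverse of make_multitask_target: returns (mask, reverse_distance)."""
    return output[0], output[1]
```

With this layout a single generative model produces both maps in one pass, and the two channels are separated again after inference.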
  14. Experiments: Quantitative results
    • Dataset: PanNuke (7,904 images, 256×256 pixels)
    • Evaluation metric: bPQ (detection and mask quality)
    • Comparisons: U-Net and CellViT (SOTA for CIS; uses post-processing)

    Model                      bPQ
    U-Net                      0.5799
    U-Net w/ RV dist.          0.5837
    CellViT w/o Post-Proc.     0.6149
    CellViT                    0.6221
    Ours                       0.6362

    RV dist. = Reverse distance map, Proc. = Processing. (The check marks in the slide's Post-Proc., RV dist., and Generative columns are not recoverable from the transcript.)
    Ours achieves the best performance.
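
Assuming bPQ refers to binary panoptic quality (all cells treated as one class), the score for one image follows the standard panoptic quality definition, $\mathrm{PQ} = \sum_{(p,g) \in TP} \mathrm{IoU}(p,g) \,/\, (|TP| + \tfrac{1}{2}|FP| + \tfrac{1}{2}|FN|)$, where a predicted and a ground-truth instance match when their IoU exceeds 0.5. The sketch below computes it for a single pair of label images. The function name is hypothetical, and how scores are aggregated over the dataset is not stated in the slides and is not covered here.

```python
import numpy as np

def binary_panoptic_quality(gt_labels, pred_labels, iou_threshold=0.5):
    """Panoptic quality for one image; 0 = background, k > 0 = instance id."""
    gt = np.asarray(gt_labels)
    pred = np.asarray(pred_labels)
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [i for i in np.unique(pred) if i != 0]
    matched_pred = set()
    iou_sum = 0.0
    tp = 0
    for g in gt_ids:
        g_mask = gt == g
        # Only predictions overlapping this ground-truth cell can match it.
        for p in np.unique(pred[g_mask]):
            if p == 0 or p in matched_pred:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            iou = inter / union
            # With a threshold of at least 0.5 the match is unique.
            if iou > iou_threshold:
                matched_pred.add(p)
                iou_sum += iou
                tp += 1
                break
    fp = len(pred_ids) - tp     # predictions with no matching cell
    fn = len(gt_ids) - tp       # cells with no matching prediction
    denom = tp + 0.5 * fp + 0.5 * fn
    # Returning 0.0 for an image with no instances at all is a convention chosen here.
    return iou_sum / denom if denom > 0 else 0.0
```

The numerator rewards mask quality of matched cells, and the denominator penalizes missed and spurious detections, which is the "detection and mask quality" trade-off the slide mentions.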
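  15. Experiments: Qualitative results
    Figure (Input / GT / CellViT w/o Post-Proc. / CellViT / Ours): the GT shows two separated cells; CellViT w/o Post-Proc. does not separate them; CellViT produces an unnatural mask; Ours produces separated and natural masks.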
  16. Summary
    • Purpose: CIS without complex post-processing
    • Method: Multi-task image-to-image Schrödinger Bridge
    • Results: Best segmentation performance on PanNuke
    • Future work: Extension to multi-class cell segmentation
    The multi-task Schrödinger Bridge generates natural cell instance masks without complex post-processing.