EarthSynth: Generating Informative Earth Observation with Diffusion Models

by SatAI.challenge

Slide 1

Slide 1 text

EarthSynth: Generating Informative Earth Observation with Diffusion Models
Kenichi Sasaki, Helios
11th SatAI.challenge Study Session

Slide 2

Slide 2 text

Contents
● Self-introduction
● One-page summary of the research
● Background (Introduction)
● Method
● Experiments
● Conclusion

Slide 3

Slide 3 text

About the author
This image was generated by ChatGPT

Slide 4

Slide 4 text

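Kenichi Sasaki
• 2012-2016: Tokyo Institute of Technology, Department of Mechanical and Aerospace Engineering
• 2016-2019: Tokyo Institute of Technology graduate school, Matsunaga Laboratory
• 2019-2023: CU Boulder Aerospace Engineering, Ph.D. in Remote Sensing, Marine pollution monitoring
• Internship
• 2023-2025: Esri, product engineer in spatial analysis team
• 2025: Helios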

Slide 5

Slide 5 text

Summary
This image was generated by ChatGPT

Slide 6

Slide 6 text

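EarthSynth: Generating Informative Earth Observation with Diffusion Models
EarthSynth: a cross-task synthetic image generation method based on diffusion models
● Builds a generative model that supports multiple tasks, including classification, detection, and segmentation
● CF-Comp (Counterfactual Composition): logically recomposes objects and backgrounds taken from multiple images
● R-Filter: uses CLIP scores to keep only high-quality synthetic data
● Effective for pretraining downstream models and for data augmentation
Source: Jiancheng Pan et al. (2025), "EarthSynth: Generating Informative Earth Observation with Diffusion Models", arXiv.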

Slide 7

Slide 7 text

Paper overview
This image was generated by ChatGPT

Slide 8

Slide 8 text

Introduction
Challenges of remote sensing imagery (RSI)
● High labeling cost
● Class imbalance (e.g., cars and buildings are common, but helipads are rare)
● Inefficiency of using a separate synthesis model for each task
Role of generative models
● Synthesizing high-quality data with diffusion models
● Improving data diversity and generalization performance

Slide 9

Slide 9 text

Method: Overview of EarthSynth
● Conditional diffusion (text + semantic mask)
● Builds the multi-source, multi-category dataset "EarthSynth-180K"
● The generated output is an (image, mask, text) triplet

Slide 10

Slide 10 text

Method: EarthSynth
1. Data collection and construction of EarthSynth-180K
● Integrates public datasets: OEM, LoveDA, DeepGlobe, and others
● For each image:
○ Semantic mask (m)
○ Text description (t), generated automatically or semi-automatically
● 180,000 (image, mask, text) triplets
2. Model training
● Retrained with Stable Diffusion v1.5 as the base
● Conditional inputs: semantic mask m and text t
● Semantic enhancement
○ CF-Comp (dynamic composition of objects and backgrounds)
○ Spatial control via Local/Global Loss

Slide 11

Slide 11 text

Method: EarthSynth
3. Sample generation (inference)
● Given an arbitrary mask and text as input, the model generates a new RS image x
● Output: an (x, m, t) triplet (image, mask, text)
● Quality check with R-Filter
4. Application to downstream tasks
● Scene classification: image + category label (extracted from the text)
● Object detection: bounding boxes extracted from the mask
● Semantic segmentation: the mask is used as-is
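The conversion from one triplet to task-specific labels can be sketched as follows. This is a minimal illustration, not the authors' code: the mask is assumed to be a 2D grid of integer class IDs with 0 as background, and `mask_to_bbox` and `text_to_category` are hypothetical helper names. The caption-matching rule is likewise an assumption, since the slides only say the label is extracted from the text.

```python
from typing import List, Optional, Sequence, Tuple

Mask = List[List[int]]  # 2D grid of class IDs; 0 = background (assumed convention)


def mask_to_bbox(mask: Mask, class_id: int) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) enclosing every pixel of class_id, or None."""
    rows = [y for y, row in enumerate(mask) if class_id in row]
    if not rows:
        return None
    cols = [x for row in mask for x, value in enumerate(row) if value == class_id]
    return min(cols), min(rows), max(cols), max(rows)


def text_to_category(text: str, categories: Sequence[str]) -> Optional[str]:
    """Return the first known category name mentioned in the caption (hypothetical rule)."""
    lowered = text.lower()
    for name in categories:
        if name.lower() in lowered:
            return name
    return None


mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 2],
]
bbox_class_1 = mask_to_bbox(mask, 1)   # detection label for class 1
bbox_class_2 = mask_to_bbox(mask, 2)   # detection label for class 2
category = text_to_category("A satellite image of a helipad", ["building", "helipad"])
```

Note that this produces one box per class. Instance-level boxes would need a connected-component pass over the mask first.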

Slide 12

Slide 12 text

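Method: EarthSynth
Counterfactual Composition (CF-Comp)
● Dynamically generates semantically consistent composite images
● Uses Copy-Paste to combine objects and backgrounds from different images
● Compatibility criteria: ICS (color tone), MOR (mask overlap), TSS (text similarity)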
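The Copy-Paste step and a mask-overlap check can be sketched as below. This is only an illustration under stated assumptions: images are small grayscale grids of equal size, `copy_paste` and `mask_overlap_ratio` are hypothetical names, and the overlap ratio shown here is one plausible reading of "mask overlap", not the paper's definition of MOR.

```python
from typing import List

Image = List[List[int]]  # grayscale pixel grid, kept tiny for illustration
Mask = List[List[int]]   # 1 = object pixel, 0 = everything else


def copy_paste(obj_img: Image, obj_mask: Mask, bg_img: Image) -> Image:
    """Take object pixels from obj_img and all other pixels from bg_img."""
    return [
        [o if m else b for o, m, b in zip(o_row, m_row, b_row)]
        for o_row, m_row, b_row in zip(obj_img, obj_mask, bg_img)
    ]


def mask_overlap_ratio(obj_mask: Mask, bg_mask: Mask) -> float:
    """Fraction of pasted object pixels that land on an object already in the background.

    Assumed definition for illustration only.
    """
    obj_pixels = sum(1 for row in obj_mask for m in row if m)
    if obj_pixels == 0:
        return 0.0
    overlap = sum(
        1
        for obj_row, bg_row in zip(obj_mask, bg_mask)
        for a, b in zip(obj_row, bg_row)
        if a and b
    )
    return overlap / obj_pixels


obj_img = [[9, 9], [9, 9]]
obj_mask = [[1, 0], [0, 0]]
bg_img = [[1, 2], [3, 4]]
bg_mask = [[0, 1], [0, 0]]   # the background already has an object at the top right

composite = copy_paste(obj_img, obj_mask, bg_img)
overlap = mask_overlap_ratio(obj_mask, bg_mask)
```

A composition whose overlap ratio is too high could then be rejected before it is used for training. The cutoff value would be a design choice that the slides do not specify.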

Slide 13

Slide 13 text

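Method: EarthSynth
Rule-based Filtering (R-Filter)
● Evaluates the generated synthetic data with CLIP scores
● Scores the whole image, the object regions, and the background regions
● Only data whose score is at or above a threshold is used for training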
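The thresholding rule can be sketched as follows. The CLIP scores are assumed to be precomputed and passed in as plain floats. `ScoredSample` and `r_filter` are hypothetical names. Requiring all three scores to clear a single shared threshold is an assumption, since the slides do not say how the three scores are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredSample:
    name: str
    overall: float     # CLIP score of the whole image against its text (precomputed)
    obj: float         # CLIP score of the object region
    background: float  # CLIP score of the background region


def r_filter(samples: List[ScoredSample], threshold: float) -> List[ScoredSample]:
    """Keep samples whose three scores are all at or above the threshold (assumed rule)."""
    return [
        s for s in samples
        if min(s.overall, s.obj, s.background) >= threshold
    ]


samples = [
    ScoredSample("a", overall=0.31, obj=0.28, background=0.30),
    ScoredSample("b", overall=0.33, obj=0.12, background=0.29),  # weak object region
    ScoredSample("c", overall=0.25, obj=0.25, background=0.25),  # exactly at threshold
]
kept = [s.name for s in r_filter(samples, threshold=0.25)]
```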

Slide 14

Slide 14 text

Results: Downstream tasks
Classification: CLIP
Detection: GroundingDINO
Segmentation: GSNet

Slide 15

Slide 15 text

Results: Downstream tasks
Classification: CLIP
Detection: GroundingDINO
Segmentation: GSNet

Slide 16

Slide 16 text

Ablation study
Contribution of key modules
- Best setting
- 128 samples/class
- R-Filter: 1 pt improvement
- CF-Comp: 1.4 pt improvement

Slide 17

Slide 17 text

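Summary
Conclusion
● EarthSynth achieves cross-task synthesis with a single diffusion model
● CF-Comp and R-Filter strengthen semantic and structural control
● Applicable as a foundation for pretraining and few-shot learning in remote sensing
Impressions
● Training took 45 h on four A100 GPUs, and it is unclear how training and generation efficiency weigh against the computational cost
● The paper claims multi-task applicability, but the adaptation to each task is only done in post-processing
● Looking forward to applications to time-series data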