Slide 16
Supervised Alignment
• The entropy-regularized OT is differentiable and thus can be
directly integrated into neural models.
• Fine-tune the entire model by minimizing the binary cross-entropy loss:
$$\mathcal{L}(P_{i,j}, Y_{i,j}) = -Y_{i,j} \log P_{i,j} - (1 - Y_{i,j}) \log(1 - P_{i,j})$$
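The two bullets above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the presenters' implementation: the function names `sinkhorn` and `bce_loss`, the uniform marginals, the regularization strength `eps`, the iteration count, and the averaging of the per-entry loss over all (i, j) pairs are choices made here, not taken from the slide. P is read as the predicted alignment matrix and Y as the binary gold alignment. Every step in the Sinkhorn loop is a smooth operation (exp, multiply, divide, sum), which is why the same loop written in an autodiff framework would let the loss gradient flow back into the model that produces the cost matrix.

```python
import math

def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularized OT via Sinkhorn iterations.

    Assumes uniform marginals on both sides (an assumption of this sketch).
    Returns the transport plan P as a list of lists.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n                      # source marginal (assumed uniform)
    b = [1.0 / m] * m                      # target marginal (assumed uniform)
    K = [[math.exp(-c / eps) for c in row] for row in cost]  # Gibbs kernel
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def bce_loss(P, Y, tiny=1e-12):
    """Binary cross-entropy between P and Y, averaged over all entries.

    Averaging is an assumption; the slide only gives the per-entry loss.
    """
    total, count = 0.0, 0
    for p_row, y_row in zip(P, Y):
        for p, y in zip(p_row, y_row):
            p = min(max(p, tiny), 1.0 - tiny)   # clamp to keep log finite
            total += -y * math.log(p) - (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count

# Toy example: low cost on the diagonal, gold alignment on the diagonal.
cost = [[0.0, 1.0], [1.0, 0.0]]
Y = [[1, 0], [0, 1]]
P = sinkhorn(cost)
loss = bce_loss(P, Y)
print(P, loss)
```

With uniform marginals the entries of P sum to 1, so individual entries stay well below 1 even for confident alignments. A real model would likely rescale P or use different marginals before applying the loss, which the slide does not specify.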