Slide 45
Slide 45 text
© SAKURA internet Inc.
Estimating KV Cache Size and Data Transfer
KV Cache size
[Table columns: two Llama models (…B and …B; parameter counts not recoverable from the extraction)]
n_layers
n_heads
head_dim
KV Cache / layer (bytes per token) = 2 [K and V] × (n_heads × head_dim) × 2 [bytes per element, FP16]
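The per-layer formula above can be sketched in code as follows. This is a minimal illustration, not the calculator used for the slide: the function names are hypothetical, and the example configuration (32 layers, 8 heads, head_dim 128, 1,000 input tokens) is an assumed placeholder rather than the values from the slide's table. For models that use grouped-query attention, n_heads here would presumably mean the number of KV heads.

```python
def kv_cache_bytes_per_token_per_layer(n_heads: int, head_dim: int,
                                       bytes_per_elem: int = 2) -> int:
    """Slide formula: 2 (K and V) x (n_heads x head_dim) x bytes per element.

    bytes_per_elem defaults to 2, matching FP16.
    """
    return 2 * n_heads * head_dim * bytes_per_elem


def kv_cache_bytes(n_layers: int, n_heads: int, head_dim: int,
                   n_tokens: int, n_requests: int = 1,
                   bytes_per_elem: int = 2) -> int:
    """Total KV cache: per-layer, per-token size scaled by layers, tokens, requests."""
    per_token_per_layer = kv_cache_bytes_per_token_per_layer(
        n_heads, head_dim, bytes_per_elem)
    return per_token_per_layer * n_layers * n_tokens * n_requests


if __name__ == "__main__":
    # Hypothetical configuration, chosen only for illustration.
    total = kv_cache_bytes(n_layers=32, n_heads=8, head_dim=128, n_tokens=1000)
    print(f"KV cache size: {total / 1e9:.3f} GB")
```

Passing n_requests greater than 1 scales the result linearly, which corresponds to the "input tokens × concurrent requests" case.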
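Input tokens (…K): KV Cache size [GB]
Input tokens (…K): KV Cache size [GB]
Input tokens (…K) × concurrent requests: KV Cache size [GB]
[Token counts, request counts, and sizes not recoverable from the extraction]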
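KV Cache transfer time
[Table: three link speeds, each given as … Gbps (… GB/s), with two rows of transfer times per link speed reported in ms and s; numeric values not recoverable from the extraction]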
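The relationship between cache size, link speed, and transfer time can be sketched as below. This is a rough estimate under stated assumptions: it converts Gbps to GB/s by dividing by 8 and ignores protocol overhead, latency, and contention. The function name and the example values (1 GB over a 100 Gbps link) are hypothetical and are not taken from the slide's table.

```python
def transfer_time_seconds(size_gb: float, link_gbps: float) -> float:
    """Idealized time to move size_gb gigabytes over a link of link_gbps gigabits/s."""
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    bandwidth_gb_per_s = link_gbps / 8  # 8 bits per byte
    return size_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    seconds = transfer_time_seconds(size_gb=1.0, link_gbps=100)
    print(f"Estimated transfer time: {seconds * 1000:.1f} ms")
```

Because the KV cache grows linearly with input tokens and concurrent requests, the estimated transfer time grows linearly with them as well.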
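The time this KV Cache transfer takes degrades the user experience, but
the load on the system changes depending on what users input, which is what makes infrastructure design difficult.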
Calculated using the KV Cache Size Calculator
https://lmcache.ai/kv_cache_calculator.html