Slide 35
Slide 35 text
GoogLeNet
oss as the classifier (pre-
as the main classifier, but
ng network is depicted in
ined using the DistBe-
ing system using mod-
rallelism. Although we
n only, a rough estimate
work could be trained to
GPUs within a week, the
usage. Our training used
escent with 0.9 momen-
ule (decreasing the learn-
lyak averaging [13] was
at inference time.
e changed substantially
ompetition, and already
with other options, some-
d hyperparameters, such
Therefore, it is hard to
most effective single way
ate matters further, some
on smaller relative crops,
[8]. Still, one prescrip-
y well after the competi-
sized patches of the im-
y between 8% and 100%
constrained to the inter-
photometric distortions
to combat overfitting to
data.
fication Challenge
n challenge involves the
ne of 1000 leaf-node cat-
There are about 1.2 mil-
r validation and 100,000
is associated with one
mance is measured based
predictions. Two num-
-1 accuracy rate, which
the first predicted class,
mpares the ground truth
es: an image is deemed
input
Conv
7x7+2(S)
MaxPool
3x3+2(S)
LocalRespNorm
Conv
1x1+1(V)
Conv
3x3+1(S)
LocalRespNorm
MaxPool
3x3+2(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
MaxPool
3x3+1(S)
DepthConcat
Conv
3x3+1(S)
Conv
5x5+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
MaxPool
3x3+1(S)
DepthConcat
Conv
3x3+1(S)
Conv
5x5+1(S)
Conv
1x1+1(S)
MaxPool
3x3+2(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
MaxPool
3x3+1(S)
DepthConcat
Conv
3x3+1(S)
Conv
5x5+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
MaxPool
3x3+1(S)
AveragePool
5x5+3(V)
DepthConcat
Conv
3x3+1(S)
Conv
5x5+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
MaxPool
3x3+1(S)
DepthConcat
Conv
3x3+1(S)
Conv
5x5+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
MaxPool
3x3+1(S)
DepthConcat
Conv
3x3+1(S)
Conv
5x5+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
MaxPool
3x3+1(S)
AveragePool
5x5+3(V)
DepthConcat
Conv
3x3+1(S)
Conv
5x5+1(S)
Conv
1x1+1(S)
MaxPool
3x3+2(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
MaxPool
3x3+1(S)
DepthConcat
Conv
3x3+1(S)
Conv
5x5+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
Conv
1x1+1(S)
MaxPool
3x3+1(S)
DepthConcat
Conv
3x3+1(S)
Conv
5x5+1(S)
Conv
1x1+1(S)
AveragePool
7x7+1(V)
FC
Conv
1x1+1(S)
FC
FC
SoftmaxActivation
softmax0
Conv
1x1+1(S)
FC
FC
SoftmaxActivation
softmax1
SoftmaxActivation
softmax2
Figure 3: GoogLeNet network with all the bells and whistles.
⼤規模画像認識コンペティション
ILSVRCで2014年に優勝
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.