CW, AROW, and SCW

Confidence-weighted Algorithm (CW), Adaptive Regularization of Weight Vectors (AROW), and Exact Soft Confidence-Weighted Learning (SCW).

Sorami Hisamoto

May 27, 2014

Transcript

  1. Sorami Hisamoto
    27 May, 2014
    CW, AROW, and SCW

  2. Various online learning algorithms
    ‣ Perceptron [Rosenblatt 1958]
    ‣ {Margin, Voted, Averaged, Second-order, Structured, …} Perceptron
    ‣ MIRA / PA: Margin Infused Relaxed Algorithm / Passive-Aggressive [Crammer+ 2006]
    ‣ CW: Confidence-Weighted Algorithm [Dredze+ 2008]
    ‣ IELLIP: Improved Ellipsoid Method [Yang+ 2009]
    ‣ AROW: Adaptive Regularization of Weight Vectors [Crammer+ 2009]
    ‣ NAROW: New Adaptive Algorithms for Online Classification [Orabona & Crammer 2010]
    ‣ NHERD: Normal Herd [Crammer+ 2010]
    ‣ AdaGrad: Adaptive Sub-Gradient Methods [Duchi+ 2011]
    ‣ SCW: Exact Soft Confidence-Weighted Learning [Wang+ 2012]
    ‣ …

  6. PA: Passive-Aggressive algorithm [Crammer+ 2006]
    ‣ If result correct: do nothing (passive).
    ‣ If result wrong: minimally update weights to classify correctly (aggressive).
    ‣ Has closed-form solution.
    [Formula annotations: minimally update; correctly classify]
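
The formula on this slide is an image in the deck. As a sketch following [Crammer+ 2006], with the hinge loss ℓ and step size τ_t as the notation assumed here, the "minimally update" objective, the "correctly classify" constraint, and the resulting closed form can be written as:

```latex
\begin{aligned}
\mathbf{w}_{t+1} &= \operatorname*{arg\,min}_{\mathbf{w}}\ \tfrac{1}{2}\lVert \mathbf{w} - \mathbf{w}_t \rVert^2
  \quad \text{s.t.} \quad \ell(\mathbf{w}; (\mathbf{x}_t, y_t)) = 0, \\
\ell(\mathbf{w}; (\mathbf{x}, y)) &= \max\{0,\ 1 - y\,(\mathbf{w} \cdot \mathbf{x})\}, \\
\mathbf{w}_{t+1} &= \mathbf{w}_t + \tau_t\, y_t\, \mathbf{x}_t,
  \qquad \tau_t = \frac{\ell(\mathbf{w}_t; (\mathbf{x}_t, y_t))}{\lVert \mathbf{x}_t \rVert^2}.
\end{aligned}
```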

  9. PA illustrated
    Figure from http://kazoo04.hatenablog.com/entry/2012/12/20/000000
    [Figure annotations: Do nothing. / Minimally update to correctly classify.]

  10. Extensions: PA-I, PA-II
    ‣ Original PA too “aggressive” in some situations (constraint: always classify correctly).
    ‣ Weak to label noise.
    → Gentler update strategies (soft margin).
    [Formulas: PA-I, PA-II]
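
The PA-I and PA-II formulas are images in the deck. A sketch of both soft-margin variants as given in [Crammer+ 2006], where the slack variable ξ, the aggressiveness parameter C, and the shorthand ℓ_t for the current hinge loss are the notation assumed here:

```latex
\begin{aligned}
\text{PA-I:}\quad
  & \mathbf{w}_{t+1} = \operatorname*{arg\,min}_{\mathbf{w}}\ \tfrac{1}{2}\lVert \mathbf{w} - \mathbf{w}_t \rVert^2 + C\,\xi
    \quad \text{s.t.} \quad \ell(\mathbf{w}; (\mathbf{x}_t, y_t)) \le \xi,\ \ \xi \ge 0, \\
  & \tau_t = \min\Bigl\{ C,\ \frac{\ell_t}{\lVert \mathbf{x}_t \rVert^2} \Bigr\}, \\[4pt]
\text{PA-II:}\quad
  & \mathbf{w}_{t+1} = \operatorname*{arg\,min}_{\mathbf{w}}\ \tfrac{1}{2}\lVert \mathbf{w} - \mathbf{w}_t \rVert^2 + C\,\xi^2
    \quad \text{s.t.} \quad \ell(\mathbf{w}; (\mathbf{x}_t, y_t)) \le \xi, \\
  & \tau_t = \frac{\ell_t}{\lVert \mathbf{x}_t \rVert^2 + \frac{1}{2C}}.
\end{aligned}
```

In both cases the update keeps the form w_{t+1} = w_t + τ_t y_t x_t; only the step size τ_t changes.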

  13. PA pseudocode
    Algorithm from [Crammer+ 2006] “Online Passive-Aggressive Algorithms”
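
The pseudocode on this slide is an image. A minimal Python sketch of one update step, assuming binary labels in {-1, +1} and the closed-form step sizes of [Crammer+ 2006]; the function name `pa_step` and its `variant` argument are illustrative and not taken from the deck:

```python
import numpy as np


def pa_step(w, x, y, C=1.0, variant="PA"):
    """One Passive-Aggressive update for a label y in {-1, +1}."""
    loss = max(0.0, 1.0 - y * float(np.dot(w, x)))  # hinge loss
    sq_norm = float(np.dot(x, x))
    if loss == 0.0 or sq_norm == 0.0:
        return w  # passive: margin already >= 1 (or empty input)
    if variant == "PA":
        tau = loss / sq_norm  # hard constraint: loss becomes exactly 0
    elif variant == "PA-I":
        tau = min(C, loss / sq_norm)  # step size capped at C
    elif variant == "PA-II":
        tau = loss / (sq_norm + 1.0 / (2.0 * C))  # step size damped by C
    else:
        raise ValueError(f"unknown variant: {variant}")
    return w + tau * y * x  # aggressive: move toward correct classification
```

With `variant="PA"` the example is classified with margin exactly 1 after the step; PA-I and PA-II take a smaller step when C is small, which is what makes them gentler under label noise.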

  17. CW: Confidence-Weighted algorithm [Dredze+ 2008]
    ‣ Idea: weights for frequent features are more “confident” than those for rare ones.
    → Consider a Gaussian distribution for the weights; update mean & variance.
    Figure from http://kazoo04.hatenablog.com/entry/2012/12/20/000000
    [Figure annotations: Previous: No memory. / CW: More “confident”.]

  20. Inside CW
    ‣ Instead of weight w, use μ (mean vector) and Σ (covariance matrix).
    [Formula annotations: Minimally update; Correctly classify with prob >= η]
    It has a closed-form solution (cf. [Dredze+ 2008]).
    Often only the diagonal of Σ is used, to make it simpler (not much performance change).
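
The CW formula is an image in the deck. A sketch of the optimization problem as stated in [Dredze+ 2008], where the KL divergence is the "minimally update" term and the probability constraint is the "correctly classify with prob >= η" term (Φ, the standard normal CDF, and φ are notation assumed here):

```latex
\begin{aligned}
(\boldsymbol{\mu}_{t+1}, \Sigma_{t+1}) &= \operatorname*{arg\,min}_{\boldsymbol{\mu},\,\Sigma}\
  D_{\mathrm{KL}}\bigl(\mathcal{N}(\boldsymbol{\mu}, \Sigma)\,\Vert\,\mathcal{N}(\boldsymbol{\mu}_t, \Sigma_t)\bigr) \\
&\quad \text{s.t.} \quad \Pr_{\mathbf{w} \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)}\bigl[\, y_t\,(\mathbf{w} \cdot \mathbf{x}_t) \ge 0 \,\bigr] \ge \eta, \\
\text{equivalently} \quad & y_t\,(\boldsymbol{\mu} \cdot \mathbf{x}_t) \ge \phi \sqrt{\mathbf{x}_t^{\top} \Sigma\, \mathbf{x}_t},
  \qquad \phi = \Phi^{-1}(\eta).
\end{aligned}
```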

  21. CW pseudocode
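
The pseudocode on this slide is an image. A minimal sketch of one CW step, assuming the closed-form "variance" update of [Dredze+ 2008] (which linearizes the constraint to y(μ·x) >= φ xᵀΣx) and a diagonal covariance; the function name `cw_step` and the default `eta=0.9` are illustrative and not taken from the deck:

```python
import math
from statistics import NormalDist

import numpy as np


def cw_step(mu, sigma, x, y, eta=0.9):
    """One diagonal CW update; sigma is the diagonal of Σ, y is -1 or +1."""
    assert 0.5 < eta < 1.0
    phi = NormalDist().inv_cdf(eta)      # phi = Φ^{-1}(eta) > 0
    m = y * float(np.dot(mu, x))         # mean margin
    v = float(np.dot(sigma * x, x))      # margin variance x^T Σ x
    if v == 0.0:
        return mu, sigma
    b = 1.0 + 2.0 * phi * m
    # Positive root of the quadratic obtained from the tight constraint.
    gamma = (-b + math.sqrt(b * b - 8.0 * phi * (m - phi * v))) / (4.0 * phi * v)
    alpha = max(0.0, gamma)              # alpha = 0 means no update is needed
    mu_new = mu + alpha * y * sigma * x
    # Σ^{-1} grows along the features that were seen: variance only shrinks.
    sigma_new = 1.0 / (1.0 / sigma + 2.0 * alpha * phi * x * x)
    return mu_new, sigma_new
```

Features that appear often have their variance reduced repeatedly, so later updates move their mean weights less; this is the "more confident" behaviour described above.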

  22. What’s wrong with CW?
    ‣ Update is too “aggressive”.
    ‣ Always classify with prob >= η.
    → Weak to label noise, easily overfit.
    ‣ CW assumes linearly separable data.
    ‣ Difficult to soften constraint from its original form.

  23. AROW: Adaptive Regularization of Weight Vectors [Crammer+ 2009]
    ‣ Large margin training
    ‣ Confidence weighting
    ‣ Handling non-separable data

  27. Inside AROW
    [Formula annotations: Minimally update; Minimize loss; More data, more confident]
    The values of the hyperparameters λ are not so important (e.g. 0.1).
    It has a closed-form solution (cf. [Crammer+ 2009]).
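
The AROW formula is an image in the deck. A sketch of the objective from [Crammer+ 2009]; its three terms correspond to the three annotations (minimally update, minimize loss, more data means more confidence). The squared hinge loss symbol and the single parameter r are notation taken from that paper, not from the slide:

```latex
\begin{aligned}
\mathcal{C}(\boldsymbol{\mu}, \Sigma) &=
  D_{\mathrm{KL}}\bigl(\mathcal{N}(\boldsymbol{\mu}, \Sigma)\,\Vert\,\mathcal{N}(\boldsymbol{\mu}_{t-1}, \Sigma_{t-1})\bigr)
  + \lambda_1\, \ell_{h^2}\bigl(y_t,\ \boldsymbol{\mu} \cdot \mathbf{x}_t\bigr)
  + \lambda_2\, \mathbf{x}_t^{\top} \Sigma\, \mathbf{x}_t, \\
\ell_{h^2}\bigl(y_t,\ \boldsymbol{\mu} \cdot \mathbf{x}_t\bigr) &= \bigl(\max\{0,\ 1 - y_t\,(\boldsymbol{\mu} \cdot \mathbf{x}_t)\}\bigr)^2,
  \qquad \lambda_1 = \lambda_2 = \frac{1}{2r}.
\end{aligned}
```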

  28. AROW pseudocode
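
The pseudocode on this slide is an image. A minimal sketch of one AROW step with a full covariance matrix, assuming the closed-form update of [Crammer+ 2009] with λ1 = λ2 = 1/(2r); the function name `arow_step` and the default `r=0.1` are illustrative choices:

```python
import numpy as np


def arow_step(mu, Sigma, x, y, r=0.1):
    """One AROW update for a label y in {-1, +1}; Sigma is a full matrix."""
    m = y * float(mu @ x)              # signed margin
    if m >= 1.0:
        return mu, Sigma               # margin large enough: no update
    Sx = Sigma @ x
    v = float(x @ Sx)                  # confidence term x^T Σ x
    beta = 1.0 / (v + r)
    alpha = (1.0 - m) * beta           # hinge loss scaled by beta
    mu_new = mu + alpha * y * Sx
    # Sigma is symmetric, so Σ x x^T Σ equals outer(Sx, Sx).
    Sigma_new = Sigma - beta * np.outer(Sx, Sx)
    return mu_new, Sigma_new
```

Unlike CW, the step does not force the example to be classified correctly with a given probability, so a single mislabeled example cannot move the weights arbitrarily far.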

  29. SCW: Soft Confidence-Weighted learning [Wang+ 2012]
    ‣ Large margin training
    ‣ Confidence weighting
    ‣ Handling non-separable data
    ‣ Adaptive margin
    Can be seen as the PA-I/PA-II equivalent of CW.

  30. Inside SCW
    They (SCW-I and SCW-II) have closed-form solutions (cf. [Wang+ 2012]).
    Formulas from d.hatena.ne.jp/kisa12012/20120625/
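
The SCW formulas are images in the deck. A sketch of the two objectives as given in [Wang+ 2012]; they add a soft penalty on the CW constraint violation in the same way PA-I and PA-II soften PA (the loss symbol ℓ^φ and the trade-off parameter C are that paper's notation):

```latex
\begin{aligned}
\ell^{\phi}\bigl(\mathcal{N}(\boldsymbol{\mu}, \Sigma); (\mathbf{x}_t, y_t)\bigr)
  &= \max\Bigl\{0,\ \phi \sqrt{\mathbf{x}_t^{\top} \Sigma\, \mathbf{x}_t} - y_t\,(\boldsymbol{\mu} \cdot \mathbf{x}_t)\Bigr\}, \\
\text{SCW-I:}\quad (\boldsymbol{\mu}_{t+1}, \Sigma_{t+1})
  &= \operatorname*{arg\,min}_{\boldsymbol{\mu},\,\Sigma}\
     D_{\mathrm{KL}}\bigl(\mathcal{N}(\boldsymbol{\mu}, \Sigma)\,\Vert\,\mathcal{N}(\boldsymbol{\mu}_t, \Sigma_t)\bigr)
     + C\, \ell^{\phi}\bigl(\mathcal{N}(\boldsymbol{\mu}, \Sigma); (\mathbf{x}_t, y_t)\bigr), \\
\text{SCW-II:}\quad (\boldsymbol{\mu}_{t+1}, \Sigma_{t+1})
  &= \operatorname*{arg\,min}_{\boldsymbol{\mu},\,\Sigma}\
     D_{\mathrm{KL}}\bigl(\mathcal{N}(\boldsymbol{\mu}, \Sigma)\,\Vert\,\mathcal{N}(\boldsymbol{\mu}_t, \Sigma_t)\bigr)
     + C\, \ell^{\phi}\bigl(\mathcal{N}(\boldsymbol{\mu}, \Sigma); (\mathbf{x}_t, y_t)\bigr)^2.
\end{aligned}
```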

  32. SCW pseudocode
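
The pseudocode on this slide is an image. A minimal sketch of one SCW-I step with a full covariance matrix, assuming the closed-form solution of [Wang+ 2012]; the function name `scw1_step` and the defaults `C=1.0`, `eta=0.9` are illustrative, and SCW-II (which uses a different step size) is not shown:

```python
import math
from statistics import NormalDist

import numpy as np


def scw1_step(mu, Sigma, x, y, C=1.0, eta=0.9):
    """One SCW-I update for a label y in {-1, +1}; Sigma is a full matrix."""
    phi = NormalDist().inv_cdf(eta)    # phi = Φ^{-1}(eta)
    Sx = Sigma @ x
    v = float(x @ Sx)                  # x^T Σ x
    m = y * float(mu @ x)              # signed margin
    if v == 0.0 or phi * math.sqrt(v) - m <= 0.0:
        return mu, Sigma               # zero loss: no update
    psi = 1.0 + phi ** 2 / 2.0
    zeta = 1.0 + phi ** 2
    root = math.sqrt(m * m * phi ** 4 / 4.0 + v * phi ** 2 * zeta)
    alpha = min(C, max(0.0, (-m * psi + root) / (v * zeta)))  # capped at C
    sqrt_u = 0.5 * (-alpha * v * phi + math.sqrt((alpha * v * phi) ** 2 + 4.0 * v))
    beta = alpha * phi / (sqrt_u + v * alpha * phi)
    # Sigma is symmetric, so Σ x x^T Σ equals outer(Sx, Sx).
    return mu + alpha * y * Sx, Sigma - beta * np.outer(Sx, Sx)
```

When C is large enough that the cap is inactive, the step coincides with the exact CW update and the constraint holds with equality afterwards; a small C caps the step size, as in PA-I.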

  33. Online algorithms compared
    Table from [Wang+ 2012] “Exact Soft Confidence-Weighted Learning”

  34. Evaluation
    ‣ Margin-based usually outperforms non-margin-based.
    → Large margin
    ‣ Second-order usually outperforms first-order.
    → Confidence weighting
    ‣ CW outperforms first-order, but not with real-world (noisy) data.
    → Handling non-separable data
    ‣ AROW outperforms CW on many real-world datasets, but makes more updates.
    → Adaptive margin
    Experiments in [Wang+ 2012]: various classification tasks.

  35. Summary
    ‣ CW: consider confidence of weights. Too aggressive, weak to noise.
    ‣ AROW, SCW: soften the constraint of CW.
    ‣ AROW, SCW: comparable performance.
    ‣ SCW if you don’t mind finding optimal hyperparameters …
    ‣ AROW otherwise …
    … but it all depends on the data sets!

  36. References

  37. Papers
    ‣ [Rosenblatt 1958] The perceptron: A probabilistic model for information storage and organization in the brain
    ‣ [Collins 2002] Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms (EMNLP)
    ‣ [Cesa-Bianchi+ 2005] A second-order perceptron algorithm (SICOMP)
    ‣ [Crammer+ 2006] Online passive-aggressive algorithms (JMLR)
    ‣ [Dredze+ 2008] Confidence-Weighted Linear Classification (ICML)
    ‣ [Crammer+ 2009] Multi-Class Confidence Weighted Algorithms (EMNLP)
    ‣ [Yang+ 2009] Online learning by ellipsoid method (ICML)
    ‣ [Crammer+ 2009] Adaptive Regularization of Weight Vectors (NIPS)
    ‣ [Orabona & Crammer 2010] New Adaptive Algorithms for Online Classification (NIPS)
    ‣ [Crammer+ 2010] Learning via Gaussian Herding (NIPS)
    ‣ [Duchi+ 2011] Adaptive Subgradient Methods for Online Learning and Stochastic Optimization (JMLR)
    ‣ [Crammer+ 2012] Confidence-Weighted Linear Classification for Text Categorization (JMLR)
    ‣ [Wang+ 2012] Exact Soft Confidence-Weighted Learning (ICML)
    ‣ [Hoi+ 2014] LIBOL: A Library for Online Learning Algorithms (JMLR)

  38. Presentations
    ‣ “Second Order Learning”, tutorial @ ECML PKDD, Koby Crammer
      http://www.ecmlpkdd….org/…
    ‣ Online Linear Classifiers: From Perceptron to CW, Hidekazu Oiwa
      http://www.slideshare.net/kisa…
    ‣ Adaptive Regularization of Weight Vectors, Hidekazu Oiwa
      http://www.r.dl.itc.u-tokyo.ac.jp/~oiwa/upload/AROW.pdf
    ‣ Natural Language Processing Based on Large-Scale Data @ SIG-FPAI, Daisuke Okanohara
      https://sites.google.com/site/daisukeokanohara/
    ‣ The Frontier of Online Convex Optimization and Linear Discriminative Model Learning (IBIS), Daisuke Okanohara
      http://www.slideshare.net/pfi/…
    ‣ TokyoNLP: Making Fun Friends with the Perceptron, Yoshihiko Suhara
      http://www.slideshare.net/sleepy_yoshi/…
    ‣ PFI Christmas seminar
      http://www.slideshare.net/pfi/…
    ‣ Second Order Perceptron, Michael R. Lyu
      http://www.cse.cuhk.edu.hk/lyu/_media/…
    ‣ AROW, Willem Krayenhoff
      https://www.youtube.com/watch?v=…
    ‣ The Perceptron Algorithm, Graham Neubig
      http://www.phontron.com/slides/…

  39. Articles
    ‣ I Read “Confidence Weighted Linear Classification”, 射撃しつつ前転
      http://d.hatena.ne.jp/tkng/…
    ‣ I Tried Writing AROW Code, tsubosaka's diary
      http://d.hatena.ne.jp/tsubosaka/…
    ‣ I Tried Implementing AROW in Ruby (blog)
      http://blog…
    ‣ I Read “Exact Soft Confidence-Weighted Learning” (ICML 2012), kisa12012's diary
      http://d.hatena.ne.jp/kisa12012/20120625/
    ‣ Online Linear Classifiers and SCW, Sideswipe
      http://kazoo04.hatenablog.com/entry/2012/12/20/000000
    ‣ Is AROW Somewhat Better than CW?, ny23's diary
      http://d.hatena.ne.jp/ny23/…
    ‣ “MIRA (Margin Infused Relaxed Algorithm)”, Toshiaki Nakazawa
      http://nlp.ist.i.kyoto-u.ac.jp/member/nakazawa/pubdb/other/MIRA.pdf
    ‣ Super Introduction to Machine Learning II: Learn the PA Method, Also Used in Gmail's Priority Inbox, in … Minutes!, EchizenBlog-Zwei
      http://d.hatena.ne.jp/echizen_tm/…
    ‣ The similarity between confidence weighted learning and the natural gradient, Alexandre Passos's ML blog
      http://atpassos.me/post/…
    ‣ natural language processing blog: Classifier performance: alternative metrics of success
      http://nlpers.blogspot.jp/…

  40. Implementations
    ‣ LIBOL: A Library for Online Learning Algorithms
      http://www.cais.ntu.edu.sg/~chhoi/libol/
    ‣ Jubatus: Distributed Online Machine Learning Framework
      http://jubat.us/
    ‣ AROW++: An implementation of the efficient confidence-weighted classifier
      https://code.google.com/p/arowpp/
    ‣ oll: Online-Learning Library
      https://code.google.com/p/oll/wiki/OllMainJa