
Mercari 3rd Place Solution, Kaggle Winner Call, Zhenzhe Ying

mercari

May 09, 2018

Transcript

  1. Agenda
     1. Background
     2. Summary
     3. Feature selection & engineering
     4. Training methods
     5. Important findings
  2. Background
     • Master's degree from Xi'an JiaoTong University, China.
     • NLP engineer at 10jqka.
     • Champion of the DiDi & Udacity self-driving competition.
  3. Summary
     • NN model: 40 min training, text features, PB 0.400
     • FM model: 20 min training, n-gram TF-IDF vectors + count vectors, PB 0.417
  4. NN model: Feature Selection / Engineering
     1. Secondhand marketplace (a wide variety of item types)
     2. Most of the users are women
     3. Long descriptions, which carry correlated features
  5. NN model: Feature Selection / Engineering
     • Discard unknown characters such as ^ @ { [
     • Merge similar words by considering n-grams
     • Concatenate the short text fields (brand and name)
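A minimal sketch of this cleaning step; the exact character set, function names, and merge rule are assumptions based on the slide, not the authors' actual code:

```python
import re

# Characters the slide says to discard: ^ @ { [  (illustrative set)
UNKNOWN_CHARS = re.compile(r"[\^@{\[]")

def clean_text(text: str) -> str:
    """Drop unknown characters and collapse extra whitespace."""
    text = UNKNOWN_CHARS.sub(" ", text)
    return " ".join(text.lower().split())

def concat_short_text(brand: str, name: str) -> str:
    """Concatenate the short text fields (brand and name) into one sequence."""
    return clean_text(brand + " " + name)

print(concat_short_text("Nike^", "Air{Max 90"))  # nike air max 90
```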
  6. NN model: Feature Selection / Engineering

     feature name                                  | description        | type
     name + brand                                  | max length = 8-10  | token seq
     description                                   | max length = 64-72 | token seq
     category3                                     |                    | onehot
     condition                                     | 1/2/3/4/5          |
     shipping                                      | 0/1                |
     length of description                         |                    |
     mean & std of price, and count, per category3 |                    |
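The token-seq inputs above have fixed maximum lengths, so shorter sequences must be padded and longer ones truncated. A minimal sketch (the function name and pad value are illustrative assumptions):

```python
def pad_sequence(tokens, max_length, pad_value=0):
    """Truncate or right-pad a list of token ids to a fixed length,
    as the fixed-size token-seq inputs require."""
    return (tokens + [pad_value] * max_length)[:max_length]

# name+brand capped at 8-10 tokens, description at 64-72 tokens
print(pad_sequence([5, 9, 2], 8))  # [5, 9, 2, 0, 0, 0, 0, 0]
print(len(pad_sequence(list(range(100)), 64)))  # 64
```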
  7. NN model: Training Methods

     batch size | number of data           | lr     | optimizer
     907        | 600k                     | 0.0055 | adam
     907        | 600k                     | 0.0055 | adam
     1027       | 600k                     | 0.0055 | adam
     1127       | all                      | 0.0055 | adam
     1424       | 400k                     | 0.0055 | adam
     1424       | 800k                     | 0.008  | adagrad
     900        | data of women and beauty | 0.008  | adagrad
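The table above can be expressed as data and iterated, one row per training pass; a sketch under the assumption that each row is a separate pass, with `train_one_pass` as a hypothetical stand-in for the real training call:

```python
# One dict per row of the slide's table; the optimizer switches from
# adam to adagrad (with a higher lr) in the later passes.
SCHEDULE = [
    {"batch_size": 907,  "data": "600k", "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 907,  "data": "600k", "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 1027, "data": "600k", "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 1127, "data": "all",  "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 1424, "data": "400k", "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 1424, "data": "800k", "lr": 0.008,  "optimizer": "adagrad"},
    {"batch_size": 900,  "data": "women and beauty", "lr": 0.008, "optimizer": "adagrad"},
]

def run_schedule(schedule, train_one_pass):
    """Apply a (hypothetical) training function to each pass in order."""
    for step in schedule:
        train_one_pass(**step)

run_schedule(SCHEDULE, lambda **kw: print(kw["batch_size"], kw["optimizer"]))
```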
  8. FM model: Feature Selection / Engineering

     feature name | description | type
     name         |             | count vector
     brand        |             | onehot
     description  |             | TF-IDF vector
     category1~3  |             | onehot
     condition    | 1/2/3/4/5   |
     shipping     | 0/1         |
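These features map directly onto standard scikit-learn vectorizers; a minimal sketch with toy rows standing in for the real dataset (the row values are illustrative, not from the competition data):

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder

# Toy rows standing in for the real listings (illustrative only).
names = ["nike air max", "gucci bag"]
descriptions = ["lightly used shoes", "brand new leather bag"]
brands = [["nike"], ["gucci"]]

# name -> count vector, description -> TF-IDF vector, brand -> onehot,
# matching the feature table above.
name_vec = CountVectorizer().fit_transform(names)
desc_vec = TfidfVectorizer().fit_transform(descriptions)
brand_vec = OneHotEncoder().fit_transform(brands)

# Stack everything into one sparse design matrix for the FM.
X = hstack([name_vec, desc_vec, brand_vec]).tocsr()
print(X.shape)  # (2, 14) with these toy rows
```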
  9. NN model: Important and Interesting Findings
     • The score rises when you spend more time on manual work: French
       letters, cellphone names, ...
     • Manually steering the training process helps the model descend the
       gradient faster than it otherwise would in the same amount of time.
  10. FM model: Training Methods
      • The word dictionary is very large; be cautious about memory.
      • FM overfits easily because it has many parameters; keep the number
        of epochs below 10.
      • Ensembled with the NN model by weighted sum:
        FinalRes = 0.6 * nnres + 0.4 * fmres
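The final blend is a fixed weighted sum of the two models' predictions; a minimal sketch with illustrative prediction values:

```python
def ensemble(nn_res, fm_res, w_nn=0.6, w_fm=0.4):
    """Blend NN and FM predictions elementwise with the slide's weights:
    FinalRes = 0.6 * nnres + 0.4 * fmres."""
    return [w_nn * n + w_fm * f for n, f in zip(nn_res, fm_res)]

# Illustrative per-item price predictions from the two models.
print(ensemble([2.0, 3.0], [1.0, 2.0]))
```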