
Mercari 3rd Place Solution, Kaggle Winner Call, Zhenzhe Ying

mercari

May 09, 2018

Transcript

  1. Agenda
     1. Background
     2. Summary
     3. Feature selection & engineering
     4. Training methods
     5. Important findings
  2. Background
     • Master's degree from Xi'an JiaoTong University, China.
     • NLP engineer at 10jqka.
     • Champion of the DiDi & Udacity self-driving competition.
  3. Summary
     • NN model: 40 min training, text features, PB 0.400
     • FM model: 20 min training, n-gram TF-IDF vectors + count vectors, PB 0.417
  4. NN model: Feature Selection / Engineering
     1. Secondhand marketplace (a wide variety of item types)
     2. Most of the users are women
     3. Long descriptions, which carry correlated features
  5. NN model: Feature Selection / Engineering
     • Discard unknown characters such as ^ @ { [
     • Merge similar words by considering n-grams
     • Concatenate the short text fields (brand and name)
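A minimal sketch of this cleaning step; the exact character set, function names, and merge rule are assumptions based on the slide, not the authors' actual code:

```python
import re

# Characters the slide says to discard: ^ @ { [  (illustrative set)
UNKNOWN_CHARS = re.compile(r"[\^@{\[]")

def clean_text(text: str) -> str:
    """Drop unknown characters and collapse extra whitespace."""
    text = UNKNOWN_CHARS.sub(" ", text)
    return " ".join(text.lower().split())

def concat_short_text(brand: str, name: str) -> str:
    """Concatenate the short text fields (brand and name) into one sequence."""
    return clean_text(brand + " " + name)

print(concat_short_text("Nike^", "Air{Max 90"))  # nike air max 90
```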
  6. NN model: Feature Selection / Engineering

     feature name                                  | description        | type
     name + brand                                  | max length = 8-10  | token seq
     description                                   | max length = 64-72 | token seq
     category3                                     |                    | onehot
     condition                                     | 1/2/3/4/5          |
     shipping                                      | 0/1                |
     length of description                         |                    |
     mean & std of price, and count, per category3 |                    |
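The token-seq inputs above have fixed maximum lengths, so shorter sequences must be padded and longer ones truncated. A minimal sketch (the function name and pad value are illustrative assumptions):

```python
def pad_sequence(tokens, max_length, pad_value=0):
    """Truncate or right-pad a list of token ids to a fixed length,
    as the fixed-size token-seq inputs require."""
    return (tokens + [pad_value] * max_length)[:max_length]

# name+brand capped at 8-10 tokens, description at 64-72 tokens
print(pad_sequence([5, 9, 2], 8))  # [5, 9, 2, 0, 0, 0, 0, 0]
print(len(pad_sequence(list(range(100)), 64)))  # 64
```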
  7. NN model: Training Methods

     batch size | number of data           | lr     | optimizer
     907        | 600k                     | 0.0055 | adam
     907        | 600k                     | 0.0055 | adam
     1027       | 600k                     | 0.0055 | adam
     1127       | all                      | 0.0055 | adam
     1424       | 400k                     | 0.0055 | adam
     1424       | 800k                     | 0.008  | adagrad
     900        | data of women and beauty | 0.008  | adagrad
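The table above can be expressed as data and iterated, one row per training pass; a sketch under the assumption that each row is a separate pass, with `train_one_pass` as a hypothetical stand-in for the real training call:

```python
# One dict per row of the slide's table; the optimizer switches from
# adam to adagrad (with a higher lr) in the later passes.
SCHEDULE = [
    {"batch_size": 907,  "data": "600k", "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 907,  "data": "600k", "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 1027, "data": "600k", "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 1127, "data": "all",  "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 1424, "data": "400k", "lr": 0.0055, "optimizer": "adam"},
    {"batch_size": 1424, "data": "800k", "lr": 0.008,  "optimizer": "adagrad"},
    {"batch_size": 900,  "data": "women and beauty", "lr": 0.008, "optimizer": "adagrad"},
]

def run_schedule(schedule, train_one_pass):
    """Apply a (hypothetical) training function to each pass in order."""
    for step in schedule:
        train_one_pass(**step)

run_schedule(SCHEDULE, lambda **kw: print(kw["batch_size"], kw["optimizer"]))
```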
  8. FM model: Feature Selection / Engineering

     feature name | description | type
     name         |             | count vector
     brand        |             | onehot
     description  |             | TF-IDF vector
     category1~3  |             | onehot
     condition    | 1/2/3/4/5   |
     shipping     | 0/1         |
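These features map directly onto standard scikit-learn vectorizers; a minimal sketch with toy rows standing in for the real dataset (the row values are illustrative, not from the competition data):

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder

# Toy rows standing in for the real listings (illustrative only).
names = ["nike air max", "gucci bag"]
descriptions = ["lightly used shoes", "brand new leather bag"]
brands = [["nike"], ["gucci"]]

# name -> count vector, description -> TF-IDF vector, brand -> onehot,
# matching the feature table above.
name_vec = CountVectorizer().fit_transform(names)
desc_vec = TfidfVectorizer().fit_transform(descriptions)
brand_vec = OneHotEncoder().fit_transform(brands)

# Stack everything into one sparse design matrix for the FM.
X = hstack([name_vec, desc_vec, brand_vec]).tocsr()
print(X.shape)  # (2, 14) with these toy rows
```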
  9. NN model: Important and Interesting Findings
     • The score rises when you spend more time on manual work: French
       letters, cellphone names, ...
     • Manually steering the training process helps the model descend the
       gradient faster than it otherwise would in the same amount of time.
  10. FM model: Training Methods
      • The word dictionary is very large; be cautious about memory.
      • FM overfits easily because it has many parameters; keep the number
        of epochs below 10.
      • Ensembled with the NN model by weighted sum:
        FinalRes = 0.6 * nnres + 0.4 * fmres
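The final blend is a fixed weighted sum of the two models' predictions; a minimal sketch with illustrative prediction values:

```python
def ensemble(nn_res, fm_res, w_nn=0.6, w_fm=0.4):
    """Blend NN and FM predictions elementwise with the slide's weights:
    FinalRes = 0.6 * nnres + 0.4 * fmres."""
    return [w_nn * n + w_fm * f for n, f in zip(nn_res, fm_res)]

# Illustrative per-item price predictions from the two models.
print(ensemble([2.0, 3.0], [1.0, 2.0]))
```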