Categorization using Neural Networks

forLoop
August 22, 2016

Fatai Salami showed the audience how to implement categorization using Neural Networks.
Transcript

  1. What is Categorization
     • Sometimes referred to as classification
     • Categorization is assigning a category ci to a document di
  2. Categorization Tasks
     • Text categorization (e.g., spam filtering)
     • Fraud detection
     • Optical character recognition
     • Machine vision (e.g., face detection)
     • Natural-language processing (e.g., spoken language understanding)
     • Market segmentation (e.g., predicting whether a customer will respond to a promotion)
     • Bioinformatics (e.g., classifying proteins according to their function)
  3. What Algorithm should I use?
     • IT DEPENDS
     • How large is your training set? How large is your feature space? Is your data set linearly separable?
     • Quality data trumps everything
  4. Getting data
     • Clean the data
     • Identify possible features
     • Split the data in the ratio 60 (train) : 20 (validate) : 20 (test) (a sketch of such a split follows)
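
A minimal sketch of the 60/20/20 split, assuming the rows have already been shuffled; the function and variable names are illustrative, not from the talk:

```python
def split_dataset(records):
    """Split pre-shuffled rows 60/20/20 into train, validate, and test sets."""
    n = len(records)
    train_end = int(n * 0.60)        # first 60% for training
    validate_end = int(n * 0.80)     # next 20% for validation
    return (records[:train_end],
            records[train_end:validate_end],
            records[validate_end:])  # final 20% for testing
```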
  5. Identify features
     • Organization
     • State of residence
     • Area of residence
     • Monthly income
     • Gender
     • Marital status
     • Number of dependants
     • How much was borrowed?
     • Account balance at the middle of the month
     • Credit score, etc.
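
For concreteness, one applicant record with these features might look like the following before any encoding; every field name and value here is hypothetical:

```python
# A hypothetical loan-applicant record, prior to encoding or normalization.
applicant = {
    "organization": "Acme Ltd",
    "state_of_residence": "Lagos",
    "area_of_residence": "urban",
    "monthly_income": 250000.0,
    "gender": "F",
    "marital_status": "married",
    "dependants": 2,
    "amount_borrowed": 500000.0,
    "mid_month_balance": 120000.0,
    "credit_score": 640,
}
```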
  6. Prepare features for Neural Network
     • It is good to have all features in the range -1 to +1
     • Numeric x-data can be Gaussian normalized (really depends on your data)
     • Binary x-data is (-1, +1) encoded
     • Categorical x-data is 1-of-(C-1) effects-coded (e.g., [0,1] or [1,0] or [-1,-1]); a sketch of all three encodings follows
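
Below is a sketch of those three encodings in plain Python; the helper names are mine, and the slot ordering in the effects coding is an arbitrary convention:

```python
import math

def gaussian_normalize(values):
    """Gaussian-normalize numeric x-data: (x - mean) / stddev."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0] * len(values)   # a constant column carries no signal
    return [(v - mean) / std for v in values]

def encode_binary(flag):
    """Binary x-data is (-1, +1) encoded."""
    return 1.0 if flag else -1.0

def effects_code(value, categories):
    """1-of-(C-1) effects coding: C-1 slots, last category is all -1s.
    For 3 categories this yields [1,0], [0,1], and [-1,-1]."""
    c = len(categories)
    idx = categories.index(value)
    if idx == c - 1:
        return [-1.0] * (c - 1)
    code = [0.0] * (c - 1)
    code[idx] = 1.0
    return code
```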
  7. • Numeric y-data is left alone
     • Binary y-data is 1-of-C dummy-coded
     • Categorical y-data is 1-of-C dummy-coded (see the sketch below)
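
A sketch of 1-of-C dummy coding, assuming plain Python lists:

```python
def dummy_code(value, categories):
    """1-of-C dummy coding: one slot per category, 1.0 in the matching slot.
    For ["run away", "maybe", "return money"]: "maybe" -> [0.0, 1.0, 0.0]."""
    code = [0.0] * len(categories)
    code[categories.index(value)] = 1.0
    return code
```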
  8. • Input layer – only one layer; the number of neurons is determined by the number of features used
     • Output layer – only one layer; usually one neuron, except for multiclass classification (run away: 0, return money: 1). Introducing “maybe” gives 3 neurons (run away: {1,0,0}, maybe: {0,1,0}, return money: {0,0,1})
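
As a concrete sketch of that sizing rule (the feature count of 10 is an assumption for illustration):

```python
# Layer sizing for the three-class example above.
n_features = 10                      # assumed count of encoded input features
classes = ["run away", "maybe", "return money"]

input_neurons = n_features           # one input neuron per feature
output_neurons = len(classes)        # 3 outputs: {1,0,0}, {0,1,0}, {0,0,1}
```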
  9. • Hidden layers – linearly separable data requires no hidden layer. Problems requiring more than two hidden layers are rare; one is usually enough.
     • There are many rule-of-thumb methods (a worked example follows this list):
       • The number of hidden neurons should be between the size of the input layer and the size of the output layer.
       • The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
       • The number of hidden neurons should be less than twice the size of the input layer.
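
Applying the three rules of thumb to the assumed 10-input, 3-output network from the earlier sketch:

```python
# Worked example of the three rules of thumb, assuming 10 inputs and 3 outputs.
n_in, n_out = 10, 3

rule_1 = (n_out, n_in)               # between output and input size: 3 to 10
rule_2 = (2 * n_in) // 3 + n_out     # 2/3 of input size plus output size: 9
rule_3 = 2 * n_in                    # strictly fewer than 2 * input size: < 20

print("rule 1: between %d and %d hidden neurons" % rule_1)
print("rule 2: about %d hidden neurons" % rule_2)
print("rule 3: fewer than %d hidden neurons" % rule_3)
```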
  10. • Too few hidden neurons result in under-fitting (the model is not able to get good results)
      • Too many hidden neurons result in over-fitting (the model fits the training data well but does not perform well on test data)
  11. Coding the Neural Network
      • It is always better to use an existing library/service (a PyBrain sketch follows this list):
        • Azure machine learning services
        • Weka
        • C# - AForge.NET, NeuronDotNet
        • Java - Neuroph
        • Python - PyBrain, NeuroLab
        • PHP - Neural Mesh
        • R - nnet
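
As one example, here is the classic minimal PyBrain workflow (XOR data is used purely for illustration); the network shape and epoch count are arbitrary choices, not recommendations from the talk:

```python
from pybrain.tools.shortcuts import buildNetwork
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer

net = buildNetwork(2, 3, 1)          # 2 inputs, 3 hidden neurons, 1 output
ds = SupervisedDataSet(2, 1)         # 2-dimensional input, 1-dimensional target
for x1, x2, target in [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]:
    ds.addSample((x1, x2), (target,))

trainer = BackpropTrainer(net, ds)
for epoch in range(1000):            # each train() call runs one epoch
    trainer.train()

print(net.activate((1, 0)))          # should be close to 1
```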
  12. Network Configuration Terms
      • Epoch – number of iterations (too many can result in model over-fitting)
      • Learning rate – how much an update step influences the current value of the weights (too small – training runs for a long time; too large – can overshoot the optimum weights)
      • Momentum – used to diminish the fluctuations in weight changes over consecutive iterations
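
All three knobs appear directly on the PyBrain trainer from the sketch above; the values here are illustrative:

```python
# Learning rate and momentum are keyword arguments on BackpropTrainer;
# the epoch count is simply the number of train() calls.
trainer = BackpropTrainer(net, ds, learningrate=0.05, momentum=0.9)
for epoch in range(500):
    trainer.train()

# Conceptually, momentum back-propagation updates each weight roughly as:
#   delta = -learning_rate * gradient + momentum * previous_delta
#   weight += delta
```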
  13. [The formula on this slide is not captured in the transcript.] Where:
      • Vi = original output of the data
      • Vo = output from the neural net
      • Vl = length of the data
      • TP = True Positive – accurately predicted valid for a valid transaction
      • FP = False Positive – predicted valid for a fraudulent transaction
      • FN = False Negative – predicted fraudulent for a valid transaction
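
Given those definitions, the usual metrics would be computed as below; this is a standard reconstruction from the named quantities, not code taken from the slide:

```python
def accuracy(vi, vo):
    """Fraction of neural-net outputs (Vo) matching the original outputs (Vi),
    over the length of the data (Vl)."""
    vl = len(vi)
    return sum(1 for a, b in zip(vi, vo) if a == b) / float(vl)

def precision(tp, fp):
    """Of the transactions predicted valid, the fraction that really were valid."""
    return tp / float(tp + fp)

def recall(tp, fn):
    """Of the genuinely valid transactions, the fraction predicted valid."""
    return tp / float(tp + fn)
```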