Categorization using Neural Networks

forLoop
August 22, 2016

Fatai Salami showed the audience how to implement categorization using Neural Networks.

Transcript

  1. Categorization using Neural Networks by Fatai Salami (Phatye)

  2. Profile
    • Fatai Salami
    • Software Developer – Lecturer – Researcher
    • fatai.salami@gmail.com or @phatye
  3. What is Categorization
    • Sometimes referred to as classification
    • Categorization is assigning a category c_i to a document d_i
  4. Categorization Tasks
    • Text categorization (e.g., spam filtering)
    • Fraud detection
    • Optical character recognition
    • Machine vision (e.g., face detection)
    • Natural-language processing (e.g., spoken language understanding)
    • Market segmentation (e.g., predicting whether a customer will respond to a promotion)
    • Bioinformatics (e.g., classifying proteins according to their function)
  5. What Algorithm should I use?
    • IT DEPENDS
    • How large is your training set? How large is your feature space? Is the data set linearly separable?
    • Quality data trumps everything
  6. Case Study: Will s/he run away with my money?

  7. Getting data
    • Clean data
    • Identify possible features
    • Separate data in the ratio 60 (train) : 20 (validate) : 20 (test)
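
The 60 : 20 : 20 split can be sketched in a few lines of Python. This is only an illustration: the function name `split_60_20_20`, the fixed `seed`, and the stand-in data (`range(100)`) are assumptions, not something shown in the talk.

```python
import random


def split_60_20_20(rows, seed=0):
    """Shuffle rows, then split them 60% train, 20% validate, 20% test."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 60 // 100      # integer arithmetic avoids float rounding
    n_validate = n * 20 // 100
    train = shuffled[:n_train]
    validate = shuffled[n_train:n_train + n_validate]
    test = shuffled[n_train + n_validate:]   # whatever remains
    return train, validate, test


# Hypothetical stand-in for 100 cleaned loan records.
train, validate, test = split_60_20_20(range(100))
```

Shuffling before splitting matters when the records are ordered (for example by date or by outcome); otherwise the three sets may not resemble each other.
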
  8. Identify features
    • Organization
    • State of residence
    • Area of residence
    • Monthly income
    • Gender
    • Marital status
    • No. of dependants
    • How much borrowed?
    • Account balance at middle of the month
    • Credit score, etc…
  9. Prepare features for Neural Network
    Good to have all features in the range -1 to +1
    • Numeric x-data can be Gaussian normalized (really depends on your data)
    • Binary x-data is (-1, +1) encoded
    • Categorical x-data is 1-of-(C-1) effects-coded (e.g., [0,1] or [1,0] or [-1,-1])
  10. • Numeric y-data is left alone
    • Binary y-data is 1-of-C dummy-coded
    • Categorical y-data is 1-of-C dummy-coded
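
The encodings on the two slides above could look roughly like this in plain Python. The helper names and the example category lists (marital status, the three outcomes) are hypothetical. Note that Gaussian normalization centres the data on 0 with unit spread but does not strictly confine it to the range -1 to +1.

```python
from statistics import mean, pstdev


def gaussian_normalize(values):
    """Numeric x-data: subtract the mean, divide by the standard deviation.
    Assumes the values are not all identical (the deviation would be 0)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def encode_binary(value, positive):
    """Binary x-data: +1 for the chosen 'positive' value, -1 otherwise."""
    return 1.0 if value == positive else -1.0


def effects_code(value, categories):
    """Categorical x-data: 1-of-(C-1) effects coding.
    The last category is encoded as all -1."""
    width = len(categories) - 1
    index = categories.index(value)
    if index == width:
        return [-1.0] * width
    return [1.0 if i == index else 0.0 for i in range(width)]


def dummy_code(value, categories):
    """Binary or categorical y-data: 1-of-C dummy coding."""
    return [1.0 if c == value else 0.0 for c in categories]


# Hypothetical category lists, for illustration only.
marital = ["married", "single", "divorced"]
outcomes = ["run away", "maybe", "return money"]

x_marital = effects_code("divorced", marital)
y_outcome = dummy_code("maybe", outcomes)
```
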
  11. Design Neural Network Model

  12. • Input layer – only one layer. The number of neurons is determined by the number of features to be used
    • Output layer – only one layer. Usually one neuron, except for multiclass classification (run away: 0, return money: 1). Introducing “maybe” gives 3 neurons (run away: {1,0,0}, maybe: {0,1,0}, return money: {0,0,1})
  13. • Hidden layers – linearly separable data requires no hidden layer. Problems requiring more than two hidden layers are rare; one is usually enough. There are many rule-of-thumb methods:
      • The number of hidden neurons should be between the size of the input layer and the size of the output layer.
      • The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
      • The number of hidden neurons should be less than twice the size of the input layer.
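
As a quick illustration, the three rules of thumb can be evaluated for a given network shape. The function name and the example sizes (10 inputs, 1 output) are assumptions; the real input count depends on how the features are encoded.

```python
def hidden_neuron_hints(n_inputs, n_outputs):
    """Apply the three rule-of-thumb methods for sizing one hidden layer."""
    return {
        # Rule 1: somewhere between the output size and the input size.
        "between": (min(n_inputs, n_outputs), max(n_inputs, n_outputs)),
        # Rule 2: 2/3 of the input size plus the output size, rounded.
        "two_thirds_rule": round(2 * n_inputs / 3 + n_outputs),
        # Rule 3: strictly less than twice the input size.
        "less_than": 2 * n_inputs,
    }


# Hypothetical network: 10 encoded input features, 1 output neuron.
hints = hidden_neuron_hints(n_inputs=10, n_outputs=1)
```

These values are only starting points; the validation set is what tells you whether a given size under-fits or over-fits.
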
  14. • Too few hidden neurons result in under-fitting (the model is not able to get good results)
    • Too many hidden neurons result in over-fitting (the model fits the training data well but does not perform well on test data)
  15. Coding the Neural Network
    Always better to use an existing library/service
    • Azure machine learning services
    • Weka
    • C# – Aforge.Net, NeuronDotNet
    • Java – Neuroph
    • Python – PyBrain, NeuroLab
    • PHP – Neural Mesh
    • R – nnet
  16. Network Configuration Terms
    • Epochs – number of training iterations (too many can result in model over-fitting)
    • Learning rate – how much an updating step influences the current value of the weights (too small – runs for long; too large – can go past the optimum weights)
    • Momentum – used to diminish the fluctuations in weight changes over consecutive iterations
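
To make the three terms concrete, here is a toy sketch of gradient descent with momentum on a single weight. It is not the Aforge.Net training code from the talk; the loss (w - target)^2, the function name, and all numeric settings are assumptions chosen for illustration.

```python
def train_weight(learning_rate, momentum, epochs, start=0.0, target=3.0):
    """Minimise (w - target)**2 with gradient descent plus momentum."""
    w = start
    velocity = 0.0  # running blend of previous weight changes
    for _ in range(epochs):
        gradient = 2.0 * (w - target)
        # Momentum carries over part of the previous step;
        # the learning rate scales the new gradient step.
        velocity = momentum * velocity - learning_rate * gradient
        w += velocity
    return w


# Hypothetical settings showing the trade-offs named on the slide.
reasonable = train_weight(learning_rate=0.1, momentum=0.0, epochs=100)
too_small = train_weight(learning_rate=0.001, momentum=0.0, epochs=100)
too_large = train_weight(learning_rate=1.1, momentum=0.0, epochs=50)
with_momentum = train_weight(learning_rate=0.1, momentum=0.9, epochs=100)
```

With the small rate the weight is still far from the target after 100 epochs, and with the large rate each step overshoots further than the last, which matches the "runs for long" and "can go past optimum weights" remarks.
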
  17. Training using Aforge.Net

  18. Predicting using Aforge.Net

  19. Analysing the result

  20. Where:
    • Vi = Original output of data
    • Vo = Output from neural net
    • Vl = Length of data
    • TP = True Positive – accurately predicted valid for a valid transaction.
    • FP = False Positive – predicted valid for a fraudulent transaction.
    • FN = False Negative – predicted fraudulent for a valid transaction.
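
The formulas these symbols belong to are not part of the transcript. One common set of measures built from the same quantities is accuracy, precision and recall; the sketch below assumes those are what was meant, with "valid" as the positive class. The function name, the label values (1 = valid, 0 = fraudulent) and the sample data are hypothetical.

```python
def evaluate(actual, predicted, positive=1):
    """Compare original outputs (Vi) with network outputs (Vo)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)
    return {
        # Share of all Vl records that were predicted correctly.
        "accuracy": correct / len(pairs),
        # Of everything predicted valid, how much really was valid.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of everything really valid, how much was predicted valid.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels for illustration: 1 = valid, 0 = fraudulent.
actual = [1, 1, 0, 1, 0, 1]
predicted = [1, 0, 0, 1, 1, 1]
scores = evaluate(actual, predicted)
```
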
  21. Thank you