@galuhsahid Text Classification INFO RESMI Tri Care SELAMAT Nomor Anda terpilih mendapatkan Hadiah 1 Unit MOBIL Dri Tri Care Dengan PIN Pemenang: br25h99 info: www.gebeyar-3care.tk Da sebenernya makan sama indomie ge cukup aku mah 😂 RAWIT harganya miring, Internetan makin sering! AYO isi ulang kartu AXIS kamu sekarang, pastikan besok ikut promo Rabu RAWIT. Info838. LD328 Fraud Normal Promo
@galuhsahid Transform Source Review 2: This movie is not scary and is slow Vocabulary: ‘This’, ‘movie’, ‘is’, ‘very’, ‘scary’, ‘and’, ‘long’, ‘not’, ‘slow’, ‘spooky’, ‘good’
Number of words in Review 2 = 8
TF for the word ‘this’ = (number of times ‘this’ appears in review 2)/ (number of terms in review 2) = 1/8
@galuhsahid Transform Source Review 2: This movie is not scary and is slow IDF(‘this’) = log(number of documents/number of documents containing the word ‘this’) = log(3/3) = log(1) = 0
@galuhsahid Transfer learning A deep learning model is trained on a large dataset, then used to perform similar tasks on another dataset (e.g. text classification)
@galuhsahid “...we train a general-purpose ‘language understanding’ model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering)” https://github.com/google-research/bert