Daily Apple...”, …]
target = np.array([1, ...])

vectorizer = CountVectorizer(ngram_range=(1, 1))  # unigrams only
train_set_dense = vectorizer.fit_transform(train_set).toarray()
vectorizer.get_feature_names()  # get_feature_names_out() in scikit-learn >= 1.0

'00', '01gzw6l7h8', '2nite', '40gb', 'applenews',
'co',  # no 't'
'jam', 'sauce', 'iphone', 'mac', …
'would', 'wouldn', …
'ya', 'yay', 'yaaaay', …

# hashtags? http:// @users?
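The gaps flagged in the vocabulary above (no 't', no hashtags, broken URLs, no @users) most likely come from CountVectorizer's default `token_pattern`, which keeps only runs of two or more word characters. The sketch below illustrates this on two made-up tweets; the tweets and the alternative pattern are assumptions for illustration, not part of the original data or a recommended tokenizer.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical tweets, invented for illustration (not the original train_set).
tweets = [
    "I wouldn't mind a new #iPhone http://t.co/abc123 @apple",
    "Apple jam and apple sauce, yay",
]

# Default settings: lowercase=True, token_pattern=r"(?u)\b\w\w+\b".
# Single-character tokens ('i', 'a', 't') are dropped, and '#', '@', "'",
# ':' and '/' act as separators.
default_vec = CountVectorizer(ngram_range=(1, 1))
default_vec.fit(tweets)
default_vocab = sorted(default_vec.vocabulary_)
print(default_vocab)

# Assumed alternative pattern: keep an optional leading '#' or '@' and an
# optional apostrophe suffix, so "#iphone", "@apple" and "wouldn't" survive.
# URLs are still split into pieces by this pattern.
tweet_vec = CountVectorizer(token_pattern=r"(?u)[#@]?\w+(?:'\w+)?")
tweet_vec.fit(tweets)
tweet_vocab = sorted(tweet_vec.vocabulary_)
print(tweet_vocab)
```

With the default pattern, "wouldn't" contributes only 'wouldn' and "t.co" contributes only 'co', which matches the vocabulary shown above. Whether hashtags, mentions and URLs should be kept, normalized or dropped depends on the classification task.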