Slide 16
Evaluation settings
• Baselines
  • random instance selection (i.e., passive learning)
  • naively querying the longest sequence (by token count)
• Features
  • include words, orthographic patterns, part-of-speech tags, lexicons, etc.
• N-best approximation, N = 15
• QBC methods: committee size C = 3
• For information density, β = 1 (i.e., the information and density terms have equal weight)
• L is initialized with five random labeled instances
• 150 queries are selected from U in batches of size B = 5
• five-fold cross-validation
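
The query schedule above (five initial labeled instances, then 150 queries in batches of five) can be sketched roughly as follows. This is a minimal illustration, not the implementation used in the experiments: the function names `random_selection`, `longest_selection`, and `run_schedule` are hypothetical, sequences are assumed to be plain lists of tokens, and model training and evaluation are left as a comment.

```python
import random


def random_selection(pool, k, rng):
    """Passive-learning baseline: pick k pool indices uniformly at random."""
    return rng.sample(range(len(pool)), k)


def longest_selection(pool, k, rng):
    """Naive baseline: pick the indices of the k sequences with the most tokens."""
    order = sorted(range(len(pool)), key=lambda i: len(pool[i]), reverse=True)
    return order[:k]


def run_schedule(sequences, select, n_init=5, n_queries=150, batch_size=5, seed=0):
    """Simulate the labeling schedule: seed L, then query U in batches.

    All names and defaults here are assumptions chosen to mirror the slide.
    """
    rng = random.Random(seed)
    pool = list(sequences)
    rng.shuffle(pool)
    # L starts with n_init random instances; the rest form the unlabeled pool U.
    labeled, pool = pool[:n_init], pool[n_init:]
    rounds = 0
    target = n_init + n_queries
    while len(labeled) < target and pool:
        k = min(batch_size, len(pool), target - len(labeled))
        chosen = set(select(pool, k, rng))
        labeled += [s for i, s in enumerate(pool) if i in chosen]
        pool = [s for i, s in enumerate(pool) if i not in chosen]
        rounds += 1
        # A real experiment would retrain on `labeled` and evaluate here.
    return labeled, pool, rounds
```

With the defaults, 150 queries in batches of five amount to 30 query rounds per run; under five-fold cross-validation this schedule would be repeated once per fold. An actual active learning strategy would replace the `select` argument with a function that scores the pool using the current model.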