Data: thumbs up/down corpus

Experiment: Train the classifier to mark a "p" annotation thumbs=up or thumbs=down, based on n-grams (mostly unigrams) of various features on small, contained annotations.

All results based on 10-fold cross-validation.

***** 2008-05-12

1-gram on Token.root
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7613889, 0.9875, 0.8513155);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.4083333, 0.50000006);
  Overall results as: (precision, recall, F1)= (0.7888888, 0.7888888, 0.7888888);

1-gram on Token.root + Token.category
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.75853175, 0.9708333, 0.8407844);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.8, 0.425, 0.5066667);
  Overall results as: (precision, recall, F1)= (0.7777778, 0.7777778, 0.7777778);

1-gram on Token.root + Token.category + Token.orth
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.76031744, 0.9732143, 0.8419565);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.8, 0.425, 0.50000006);
  Overall results as: (precision, recall, F1)= (0.7777778, 0.7777778, 0.77777773);

1-gram on Token.string + Token.category + Token.orth
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.76031744, 0.9732143, 0.8419565);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.8, 0.425, 0.50000006);
  Overall results as: (precision, recall, F1)= (0.7777778, 0.7777778, 0.77777773);

1-gram on Token.string + Token.category
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.76031744, 0.9732143, 0.8419565);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.8, 0.425, 0.50000006);
  Overall results as: (precision, recall, F1)= (0.7777778, 0.7777778, 0.77777773);

1-gram on Token.string
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7621032, 0.9875, 0.8495755);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.425, 0.51666677);
  Overall results as: (precision, recall, F1)= (0.7888889, 0.7888889, 0.7888888);

***** 2008-05-13

2-gram on Token.string
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7186508, 0.9708333, 0.8107419);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.6, 0.28333333, 0.33190477);
  Overall results as: (precision, recall, F1)= (0.7222222, 0.7222222, 0.7222222);

2-gram on Token.root
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7103175, 0.9833333, 0.81169426);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.55, 0.25, 0.3152381);
  Overall results as: (precision, recall, F1)= (0.72222215, 0.72222215, 0.72222215);

2-gram on Token.root + Token.category
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7103175, 0.9708333, 0.8050276);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.5, 0.25, 0.28190476);
  Overall results as: (precision, recall, F1)= (0.71111107, 0.71111107, 0.71111107);

3-gram on Token.root + Token.category
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.6847222, 0.9708333, 0.78820527);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.25, 0.14166667, 0.1352381);
  Overall results as: (precision, recall, F1)= (0.67777777, 0.67777777, 0.67777777);

3-gram on Token.root
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.68055546, 0.9708333, 0.7836598);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.15, 0.125, 0.10666667);
  Overall results as: (precision, recall, F1)= (0.6666666, 0.6666666, 0.6666666);

3-gram on Token.string
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.6847222, 0.9708333, 0.78820527);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.25, 0.14166667, 0.1352381);
  Overall results as: (precision, recall, F1)= (0.67777777, 0.67777777, 0.67777777);

2-gram on Token.string + Token.orth
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7186508, 0.9708333, 0.8107419);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.6, 0.28333333, 0.33190477);
  Overall results as: (precision, recall, F1)= (0.7222222, 0.7222222, 0.7222222);

2-gram on Token.root + Token.orth
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7103175, 0.9708333, 0.8050276);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.5, 0.25, 0.28190476);
  Overall results as: (precision, recall, F1)= (0.71111107, 0.71111107, 0.71111107);

1-gram on Token.root + Token.orth
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7721032, 0.9875, 0.8579089);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.44166666, 0.5300001);
  Overall results as: (precision, recall, F1)= (0.79999995, 0.79999995, 0.79999995);

1-gram on Token.string + Token.orth
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7621032, 0.9875, 0.8495755);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.425, 0.51666677);
  Overall results as: (precision, recall, F1)= (0.7888889, 0.7888889, 0.7888888);

1-gram on Token.category + Token.orth
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.67222226, 0.9833333, 0.784783);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.1, 0.05, 0.06666667);
  Overall results as: (precision, recall, F1)= (0.6666667, 0.6666667, 0.6666667);

2-gram on Token.category + Token.orth
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.77936506, 0.9565476, 0.84722453);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.75, 0.4916667, 0.5390476);
  Overall results as: (precision, recall, F1)= (0.7888888, 0.7888888, 0.7888888);

***** 2008-05-14

1-gram: Token.root + Token.orth

thresholdProbabilityClassification = 0.1
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7721032, 0.9875, 0.8579089);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.44166666, 0.5300001);
  Overall results as: (precision, recall, F1)= (0.79999995, 0.79999995, 0.79999995);

thresholdProbabilityClassification = 0.3
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7721032, 0.9875, 0.8579089);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.44166666, 0.5300001);
  Overall results as: (precision, recall, F1)= (0.79999995, 0.79999995, 0.79999995);

thresholdProbabilityClassification = 0.4
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7721032, 0.9875, 0.8579089);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.44166666, 0.5300001);
  Overall results as: (precision, recall, F1)= (0.79999995, 0.79999995, 0.79999995);

thresholdProbabilityClassification = 0.45
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7721032, 0.9875, 0.8579089);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.44166666, 0.5300001);
  Overall results as: (precision, recall, F1)= (0.79999995, 0.79999995, 0.79999995);

thresholdProbabilityClassification = 0.5 (standard)
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7721032, 0.9875, 0.8579089);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.44166666, 0.5300001);
  Overall results as: (precision, recall, F1)= (0.79999995, 0.79999995, 0.79999995);

thresholdProbabilityClassification = 0.55
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7721032, 0.9875, 0.8579089);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.44166666, 0.5300001);
  Overall results as: (precision, recall, F1)= (0.79999995, 0.79999995, 0.79999995);

thresholdProbabilityClassification = 0.6
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7721032, 0.9875, 0.8579089);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.85, 0.44166666, 0.5300001);
  Overall results as: (precision, recall, F1)= (0.79999995, 0.79999995, 0.79999995);

thresholdProbabilityClassification = 0.65
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.76853174, 0.9708333, 0.84911764);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.9, 0.39166668, 0.52000004);
  Overall results as: (precision, recall, F1)= (0.8041667, 0.76666665, 0.7843137);

thresholdProbabilityClassification = 0.7
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (0.7875794, 0.9565476, 0.8543192);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.7, 0.3333333, 0.43);
  Overall results as: (precision, recall, F1)= (0.81527776, 0.73333335, 0.7699346);

thresholdProbabilityClassification = 0.9
  0 LabelName=down, number of instances=56 (precision, recall, F1)= (1.0, 0.44809526, 0.60162723);
  1 LabelName=up, number of instances=27 (precision, recall, F1)= (0.1, 0.1, 0.1);
  Overall results as: (precision, recall, F1)= (1.0, 0.29999998, 0.44842157);
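
Note on reading the (precision, recall, F1) triples: in many rows the reported F1 is not the harmonic mean of the reported precision and recall. For "1-gram on Token.root", label up, P=0.85 and R=0.4083333 would give roughly 0.552, while the log reports 0.50000006. One plausible explanation, which is an assumption and is not stated anywhere in the log, is that each of the three figures is averaged over the 10 folds separately. The Python sketch below illustrates that effect. The helper names (f1_score, mean) and the per-fold values are hypothetical and chosen only for illustration; they are not taken from the experiment.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Figures copied from the log: "1-gram on Token.root", label "up".
reported_p, reported_r, reported_f1 = 0.85, 0.4083333, 0.50000006

# F1 recomputed directly from the reported precision and recall.
f1_from_reported = f1_score(reported_p, reported_r)

# HYPOTHETICAL per-fold (precision, recall) pairs, invented for illustration.
# They are NOT the real fold results, which the log does not contain.
folds = [(1.0, 0.5), (0.5, 0.25), (1.0, 1.0), (0.0, 0.0)]

mean_p = mean([p for p, _ in folds])
mean_r = mean([r for _, r in folds])

# Averaging the per-fold F1 values ...
mean_of_f1 = mean([f1_score(p, r) for p, r in folds])
# ... generally differs from the F1 of the averaged precision and recall.
f1_of_means = f1_score(mean_p, mean_r)

print(f"F1 from reported P and R: {f1_from_reported:.4f} (log reports {reported_f1})")
print(f"mean of per-fold F1:      {mean_of_f1:.4f}")
print(f"F1 of mean P and mean R:  {f1_of_means:.4f}")
```

If the per-fold averaging assumption holds, the three numbers in each triple are best compared across configurations column by column, and not recombined with the F1 formula.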