***** 2008-05-14

5-fold cross-validation (full corpus: 1280 documents)
Using a reduced corpus of 600 documents.

1-gram Token.root + Token.category
  0 LabelName=1_Star_Review, number of instances=454 (precision, recall, F1)= (0.76470774, 0.77866936, 0.7695627);
  1 LabelName=2_Star_Review, number of instances=132 (precision, recall, F1)= (0.25, 0.012828283, 0.024035087);
  2 LabelName=3_Star_Review, number of instances=187 (precision, recall, F1)= (0.3222222, 0.041714974, 0.07292265);
  3 LabelName=4_Star_Review, number of instances=1102 (precision, recall, F1)= (0.40315515, 0.17673904, 0.2429281);
  4 LabelName=5_Star_Review, number of instances=3966 (precision, recall, F1)= (0.7893709, 0.9078147, 0.84306157);
  Overall results as: (precision, recall, F1)= (0.7573564, 0.71861595, 0.73745704);

1-gram Token.root
  0 LabelName=1_Star_Review, number of instances=454 (precision, recall, F1)= (0.80601394, 0.8005349, 0.79887116);
  1 LabelName=2_Star_Review, number of instances=132 (precision, recall, F1)= (0.3, 0.009191919, 0.017669173);
  2 LabelName=3_Star_Review, number of instances=187 (precision, recall, F1)= (0.44761905, 0.03147343, 0.058424734);
  3 LabelName=4_Star_Review, number of instances=1102 (precision, recall, F1)= (0.44071302, 0.1514436, 0.225185);
  4 LabelName=5_Star_Review, number of instances=3966 (precision, recall, F1)= (0.7905634, 0.925197, 0.8515309);
  Overall results as: (precision, recall, F1)= (0.77023274, 0.72967184, 0.74939024);

1-gram Token.string
  0 LabelName=1_Star_Review, number of instances=454 (precision, recall, F1)= (0.79177153, 0.77579737, 0.78005296);
  1 LabelName=2_Star_Review, number of instances=132 (precision, recall, F1)= (0.26666668, 0.012828283, 0.024291497);
  2 LabelName=3_Star_Review, number of instances=187 (precision, recall, F1)= (0.4833333, 0.039076082, 0.071640216);
  3 LabelName=4_Star_Review, number of instances=1102 (precision, recall, F1)= (0.41225925, 0.1678707, 0.23693332);
  4 LabelName=5_Star_Review, number of instances=3966 (precision, recall, F1)= (0.7874697, 0.91636753, 0.845864);
  Overall results as: (precision, recall, F1)= (0.759956, 0.7229568, 0.7409666);

1-gram Token.root + Token.orth
  0 LabelName=1_Star_Review, number of instances=454 (precision, recall, F1)= (0.7887579, 0.7750289, 0.77804005);
  1 LabelName=2_Star_Review, number of instances=132 (precision, recall, F1)= (0.4666667, 0.025858587, 0.0481685);
  2 LabelName=3_Star_Review, number of instances=187 (precision, recall, F1)= (0.65, 0.040621977, 0.07631316);
  3 LabelName=4_Star_Review, number of instances=1102 (precision, recall, F1)= (0.4694272, 0.15873934, 0.23662534);
  4 LabelName=5_Star_Review, number of instances=3966 (precision, recall, F1)= (0.7866119, 0.9231432, 0.848108);
  Overall results as: (precision, recall, F1)= (0.7656352, 0.7269664, 0.74576813);

2-gram Token.root
====
java.lang.NegativeArraySizeException
    at gate.learning.NLPFeaturesOfDoc.gatedoc2NgramFeatures(NLPFeaturesOfDoc.java:177)
    at gate.learning.NLPFeaturesOfDoc.obtainDocNLPFeatures(NLPFeaturesOfDoc.java:113)
    at gate.learning.LightWeightLearningApi.annotations2NLPFeatures(LightWeightLearningApi.java:194)
    at gate.learning.EvaluationBasedOnDocs.oneRun(EvaluationBasedOnDocs.java:306)
    at gate.learning.EvaluationBasedOnDocs.kfoldEval(EvaluationBasedOnDocs.java:161)
    at gate.learning.EvaluationBasedOnDocs.evaluation(EvaluationBasedOnDocs.java:88)
    at gate.learning.LearningAPIMain.execute(LearningAPIMain.java:762)
    at gate.creole.ConditionalSerialController.runComponent(ConditionalSerialController.java:143)
    at gate.creole.SerialController.executeImpl(SerialController.java:148)
    at gate.creole.ConditionalSerialAnalyserController.executeImpl(ConditionalSerialAnalyserController.java:71)
    at gate.creole.AbstractController.execute(AbstractController.java:65)
    at gate.gui.SerialControllerEditor$RunAction$1.run(SerialControllerEditor.java:1253)
    at java.lang.Thread.run(Thread.java:595)

2-gram Token.string