Evaluation of project works
Maximum points: Presentation 10, Document 10, Program 12, Validation 8,
Total 40.
Jesse Hauninen, Lukas Obrdlik and Dominik Wisniewski: "Zoo" animal
recognition system
Notice! The Program score could be increased if I get it compiled + tested!
- Presentation (student evaluation): 9.6
- Document 10
All parts clearly written. The only unclear thing was how the animal is
predicted when the group is known -> explained afterwards. Diagrams of
the network would be nice. Use of the Bayes rule carefully described.
- Program 10 (can be increased, if I get it compiled!)
Naive Bayes implemented correctly.
- Validating results (either using data or literature) 3
Only the training error reported. No attempt to evaluate the generalization
error or to test with new instances outside the training set.
Dominik has afterwards done a careful analysis of error cases. +4p for him.
- Total (preliminary): 32.6
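For reference, the Bayes-rule prediction discussed above can be sketched as a minimal Naive Bayes classifier. The features, animals, and counts below are invented for illustration and are not the group's actual data set:

```python
from collections import defaultdict

def train_nb(samples):
    """samples: list of (features_dict, label). Returns class counts and
    per-class counts of (feature, value) pairs."""
    label_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for features, label in samples:
        label_counts[label] += 1
        for name, value in features.items():
            feature_counts[label][(name, value)] += 1
    return label_counts, feature_counts

def predict_nb(label_counts, feature_counts, features):
    """Pick the label maximizing P(label) * prod P(feature|label),
    with Laplace smoothing for binary features."""
    total = sum(label_counts.values())
    best, best_score = None, -1.0
    for label, count in label_counts.items():
        score = count / total
        for name, value in features.items():
            score *= (feature_counts[label][(name, value)] + 1) / (count + 2)
        if score > best_score:
            best, best_score = label, score
    return best

# Invented toy data: binary animal features -> class
data = [
    ({"hair": 1, "feathers": 0}, "mammal"),
    ({"hair": 1, "feathers": 0}, "mammal"),
    ({"hair": 0, "feathers": 1}, "bird"),
    ({"hair": 0, "feathers": 1}, "bird"),
]
priors, likes = train_nb(data)
print(predict_nb(priors, likes, {"hair": 0, "feathers": 1}))  # -> bird
```

With Laplace smoothing the likelihood estimate stays nonzero even for feature values never observed with a class.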
Tersia Gowases, Ahmed Riadh Hashim and Belinda Wafula: Medical
diagnosis system for malaria
Quite easy task to implement (just forward predicting in a Bayesian network),
but really hard to find data!
Notice! Can be increased, if I get justifications for parameter settings.
- Presentation (student evaluation) 9.1
- Document 6
Background well described, but unfortunately not the program. The model
structure is described & ok. Problem: how are the model parameters defined?!
In the first document, the user manual was especially attractive.
- Program 10
Very easy-to-read code. The program works nicely and implements what it should.
Problem: when the user answers "Maybe", a prior probability of 0.5 is used
instead of the known probabilities. However, the user can also define some
other prior probability. Nice UI (two program versions).
- Validating results (either using data or literature) 0
No validation! Could be validated by giving literature references
showing where the model parameters come from. Impossible to validate in the
usual sense (by cross-validation)!
- Total: 25.1 (can be increased by giving a report of literature references)
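The "Maybe" issue above can be illustrated with a toy diagnosis model (all probabilities below are invented, not the group's actual parameters). One way to avoid plugging in an arbitrary 0.5 is to treat "Maybe" as no evidence at all, i.e. marginalize the unknown symptom out, which leaves its likelihood ratio at 1:

```python
# Toy single-disease model; every number here is invented.
P_MALARIA = 0.1                  # prior P(disease)
P_SYMPTOM = {                    # symptom -> (P(yes|disease), P(yes|no disease))
    "fever":  (0.9, 0.2),
    "chills": (0.8, 0.1),
}

def posterior(answers):
    """answers: symptom -> 'yes'/'no'/'maybe'. A 'maybe' answer contributes
    no evidence: skipping the symptom keeps its likelihood ratio at 1,
    instead of substituting an arbitrary prior of 0.5."""
    odds = P_MALARIA / (1 - P_MALARIA)
    for symptom, answer in answers.items():
        p_d, p_nd = P_SYMPTOM[symptom]
        if answer == "yes":
            odds *= p_d / p_nd
        elif answer == "no":
            odds *= (1 - p_d) / (1 - p_nd)
        # 'maybe': no update
    return odds / (1 + odds)

print(round(posterior({"fever": "yes", "chills": "maybe"}), 3))  # -> 0.333
```

With no answers at all the posterior stays at the prior, which is what marginalization should give.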
Alfiya Akhmetova, Alina Gutnova and Maxim Mozgovoy: Language
recognition system
Testing gave strange results (much poorer than reported). The selected
test documents were "harder", especially a piece from the
Kalevala. The word frequency metric in particular worked poorly; the bigram
metric did better.
- Presentation (student evaluation) 9
- Document 9
Quite clear text. Problem: the used metrics and their justification are
still (in the updated version) unclearly described. Data sets not described.
Otherwise good! No user manual!
- Program 12
Original idea, implements the design, works surprisingly well!
- Validating results (either using data or literature) 6
A test set was used + results reported. However, it seems that the test
documents were very similar to the training documents - how were they selected?
You should have tried with different documents, even if they do not produce
so nice results! (Testing aims especially for finding weaknesses!)
Good: Also tested with languages which did not belong to the
training set + results well analyzed.
- Total: 36
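A character-bigram metric of the kind mentioned above can be sketched as follows. The profile texts are tiny invented stand-ins for real training documents, and cosine similarity is one of several reasonable choices of distance:

```python
from collections import Counter
import math

def bigram_profile(text):
    """Normalized character-bigram frequencies of a text."""
    text = text.lower()
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}

def cosine(p, q):
    """Cosine similarity of two sparse frequency vectors."""
    dot = sum(w * q.get(bg, 0.0) for bg, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q)

def classify(text, profiles):
    """Return the training language whose bigram profile is most similar."""
    query = bigram_profile(text)
    return max(profiles, key=lambda lang: cosine(query, profiles[lang]))

# Invented miniature "training documents"
profiles = {
    "english": bigram_profile("the quick brown fox jumps over the lazy dog and then some"),
    "finnish": bigram_profile("vaka vanha väinämöinen laulaja iänikuinen sanat suussansa"),
}
print(classify("over the lazy river", profiles))  # -> english
```

A real system would build the profiles from large corpora; with texts this small the metric is fragile, which mirrors the Kalevala problem noted above.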
Carolina Isalas Sedano, Mikko Vinni and Marek Winkler: Predicting
course outcomes by Bayesian methods
The method was given with the topic -> quite easy design, although
required understanding the idea of TAN Bayesian networks.
- Presentation (student evaluation) 8.6
- Document 10
Very clear, contains all parts, comprehensive. The only problem is some messy
pictures -> Corrected! The final document is exceptionally excellent!
- Program 12
Implements correctly what it should.
Special emphasis put on cross-validation (a more general environment than
required). Easy to enlarge. Nice UI.
- Validating results (either using data or literature) 8
Really well done. Mikko even discovered the reason for some strange
results by reshuffling the training set for cross-validation, with much
better results. Compared with NB and linear regression (for numeric data).
Results + comparisons well presented as diagrams. Unfortunately true
positives and true negatives not reported -> hard to draw conclusions!
- Total 38.6
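The missing true-positive/true-negative report amounts to a confusion matrix. A minimal sketch, with invented course outcomes:

```python
def confusion(actual, predicted, positive="pass"):
    """Count true/false positives and negatives for a binary outcome."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    fp = sum(1 for a, p in zip(actual, predicted) if a != positive and p == positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    tn = sum(1 for a, p in zip(actual, predicted) if a != positive and p != positive)
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

# Invented example: actual vs. predicted course outcomes
actual    = ["pass", "pass", "fail", "fail", "pass"]
predicted = ["pass", "fail", "fail", "pass", "pass"]
print(confusion(actual, predicted))  # -> {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 1}
```

Reporting these four counts alongside overall accuracy makes it possible to see which kind of error a classifier actually makes.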
Yuriy Lakhtin, Fedor Nikitin and Lukasz Racoczy: Predicting course
outcomes by neural networks and SVMs
The methods were given with the topic, but more demanding and larger
topic than the previous, because two (quite difficult) modelling
paradigms were implemented. NN and SVM parts were implemented separately.
- Presentation (student evaluation): 9.4
- Document NN part 8, SVM part 10
NN: many parts are hard to read, sentences are not understandable.
Text structure is in some parts unclear. Otherwise covers everything
quite well. Functions are represented as figures -> illustrative.
SVM: Really good! Clear and contains everything.
- Program: NN: 12p, SVM 8p
SVM part: uses an SVM tool + a self-programmed small data converter. However,
Lukasz did all the studying and work himself. 8p
NN part implemented by Matlab scripts + a Java application for data
preprocessing and cross-validation. Learning to use the Matlab NN toolbox
is quite demanding. All in all, a demanding task was well implemented,
and the system is easy to enlarge.
- Validating results (either using data or literature) 5
NN is very unstable and can produce very different results in consecutive
executions. However, this was understood in the group (it could have been
analyzed in the document, too!).
Unfortunately true positives and true negatives are not
reported for either part. 5p
- Total: NN 34.4, SVM 32.4
Antti Mikkonen, Konstantin Petrukhnov and Anahit Poghosova: Selecting
the optimal implementation method for an expert system
Demanding topic! The newest version tries to select the optimal method
for a classification problem (among DT, BN, NB, NN, SVM). Special difficulty:
how to use the available material - managed quite well!
- Presentation (student evaluation) 8.6
- Document 7
Dec. tree part very briefly described, but Konstantin explained the idea
afterwards. The picture is not exactly a dec. tree and is not understandable
without explanation. Clear, easy-to-read text.
- Program 10 (Doesn't always work in a very sensible way, but the
problems are due to the data set + design. The current version could work
very well if the data set just were larger!)
Nearest neigh. part otherwise clever, but always returns the first match.
Problem: too little data for the nearest neighbours - more was available
but not used.
Clever UI, easy to enlarge.
The D-S part doesn't represent ignorance -> doing so could increase the
accuracy! Still a clever idea to use it for combining evidence.
- Validating results (either using data or literature) 3
Sometimes produces quite strange results, sometimes correct. Reasons are
known and the program follows its own logic. No real validation done!
- Total 28.6
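Combining evidence with Dempster-Shafer, including an explicit ignorance mass on the full frame, can be sketched as below. The hypotheses and mass values are invented, not taken from the group's system:

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination: masses are assigned to frozensets of
    hypotheses; conflicting mass is discarded and the rest renormalized."""
    combined, conflict = {}, 0.0
    for (a, w1), (b, w2) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

h1, h2 = frozenset({"h1"}), frozenset({"h2"})
theta = h1 | h2                    # full frame = explicit ignorance
m_a = {h1: 0.6, theta: 0.4}        # source A commits 0.6 to h1, rest uncommitted
m_b = {h2: 0.3, theta: 0.7}        # source B weakly supports h2
result = combine(m_a, m_b)
print(result)
```

The mass left on the full frame is exactly the "ignorance" the note above refers to: evidence that commits to nothing still influences the combination correctly.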
Maxim Dudochin, Matti Hyvärinen and Wojciech Wawrzyniak: Tree
recognition system
- Presentation (student evaluation) 9.6
- Document 8
D-S part not described! Picture of dec. tree given + idea of splitting it
explained quite well. However, the idea of the whole program is not clear
without further explanation -> explained afterwards (appendix).
Otherwise clearly structured, nice reading.
- Program 10
Clever, original idea how to handle ignorance in dec. trees.
The original dec. tree is constructed using ID3, everything else is self
implemented. Nice UI.
- Validating results (either using data or literature) 3
Just the training error, no real validation. Tested whether the program allows
one "I don't know" answer and still classifies correctly. More systematic
testing would have been possible! (e.g. cross-validation)
- Total 31.6
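The cross-validation repeatedly suggested above can be sketched generically; `train` and `predict` are placeholders for any learner, and the toy test data below is invented:

```python
import random

def cross_validation_error(samples, train, predict, k=5, seed=0):
    """k-fold cross-validation: shuffle, split into k folds, and average the
    error rate on each held-out fold. samples: list of (input, label)."""
    data = samples[:]
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    error_rates = []
    for i in range(k):
        held_out = folds[i]
        train_set = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train(train_set)
        wrong = sum(1 for x, y in held_out if predict(model, x) != y)
        error_rates.append(wrong / len(held_out))
    return sum(error_rates) / k

# Toy check: a constant "predictor" on single-class data has zero CV error.
err = cross_validation_error([(i, "a") for i in range(10)],
                             train=lambda s: "a",
                             predict=lambda model, x: model)
print(err)  # -> 0.0
```

Unlike reporting the training error alone, the held-out folds give an estimate of the generalization error, which is what most of the validation scores above are missing.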