Evaluation of project works
Maximum points: Presentation 10, Document 10, Program 12, Validation 8,
Total 40.
Jesse Hauninen, Lukas Obrdlik and Dominik Wisniewski: "Zoo" animal
recognition system
Notice! The Program score could be increased if I get it compiled + tested!
- Presentation (student evaluation): 9.6
- Document 10
All parts clearly written. The only unclear thing was how the animal is
predicted when the group is known -> explained afterwards. Diagrams of
the network would be nice. Use of the Bayes rule carefully described.
- Program 10 (can be increased, if I get it compiled!)
Naive Bayes implemented correctly.
- Validating results (either using data or literature) 3
Only the training error reported. No attempt to evaluate the generalization
error or to test with new instances outside the training set.
Dominik has afterwards done a careful analysis of error cases. +4p for him.
- Total (preliminary): 32.6
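For reference, the Bayes-rule prediction discussed above can be sketched as a minimal Naive Bayes classifier. The features, animals, and counts below are invented for illustration and are not the group's actual data set:

```python
from collections import defaultdict

def train_nb(samples):
    """samples: list of (features_dict, label). Returns class counts and
    per-class counts of (feature, value) pairs."""
    label_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for features, label in samples:
        label_counts[label] += 1
        for name, value in features.items():
            feature_counts[label][(name, value)] += 1
    return label_counts, feature_counts

def predict_nb(label_counts, feature_counts, features):
    """Pick the label maximizing P(label) * prod P(feature|label),
    with Laplace smoothing for binary features."""
    total = sum(label_counts.values())
    best, best_score = None, -1.0
    for label, count in label_counts.items():
        score = count / total
        for name, value in features.items():
            score *= (feature_counts[label][(name, value)] + 1) / (count + 2)
        if score > best_score:
            best, best_score = label, score
    return best

# Invented toy data: binary animal features -> class
data = [
    ({"hair": 1, "feathers": 0}, "mammal"),
    ({"hair": 1, "feathers": 0}, "mammal"),
    ({"hair": 0, "feathers": 1}, "bird"),
    ({"hair": 0, "feathers": 1}, "bird"),
]
priors, likes = train_nb(data)
print(predict_nb(priors, likes, {"hair": 0, "feathers": 1}))  # -> bird
```

With Laplace smoothing the likelihood estimate stays nonzero even for feature values never observed with a class.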
Tersia Gowases, Ahmed Riadh Hashim and Belinda Wafula: Medical
diagnosis system for malaria
Quite easy task to implement (just forward predicting in a Bayesian network),
but really hard to find data!
Notice! Can be increased, if I get justifications for parameter settings.
- Presentation (student evaluation) 9.1
- Document 6
Background well described, but unfortunately not the program. The model
structure is described & ok. Problem: how are the model parameters defined?!
In the first document, the user manual was especially attractive.
- Program 10
Very easy-to-read code. The program works nicely and implements what it should.
Problem: when the user answers "Maybe", a prior probability of 0.5 is used
instead of the known probabilities. However, the user can also define some
other prior probability. Nice UI (two program versions).
- Validating results (either using data or literature) 0
No validation! Could be validated by giving literature references
showing where the model parameters come from. Impossible to validate in the
usual sense (by cross-validation)!
- Total: 25.1 (can be increased by giving a report of literature references)
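The "Maybe" issue above can be illustrated with a toy diagnosis model (all probabilities below are invented, not the group's actual parameters). One way to avoid plugging in an arbitrary 0.5 is to treat "Maybe" as no evidence at all, i.e. marginalize the unknown symptom out, which leaves its likelihood ratio at 1:

```python
# Toy single-disease model; every number here is invented.
P_MALARIA = 0.1                  # prior P(disease)
P_SYMPTOM = {                    # symptom -> (P(yes|disease), P(yes|no disease))
    "fever":  (0.9, 0.2),
    "chills": (0.8, 0.1),
}

def posterior(answers):
    """answers: symptom -> 'yes'/'no'/'maybe'. A 'maybe' answer contributes
    no evidence: skipping the symptom keeps its likelihood ratio at 1,
    instead of substituting an arbitrary prior of 0.5."""
    odds = P_MALARIA / (1 - P_MALARIA)
    for symptom, answer in answers.items():
        p_d, p_nd = P_SYMPTOM[symptom]
        if answer == "yes":
            odds *= p_d / p_nd
        elif answer == "no":
            odds *= (1 - p_d) / (1 - p_nd)
        # 'maybe': no update
    return odds / (1 + odds)

print(round(posterior({"fever": "yes", "chills": "maybe"}), 3))  # -> 0.333
```

With no answers at all the posterior stays at the prior, which is what marginalization should give.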
Alfiya Akhmetova, Alina Gutnova and Maxim Mozgovoy: Language
recognition system
Testing gave strange results (much poorer than reported). The selected
test documents were "harder", especially a piece from the
Kalevala. The word frequency metric in particular worked poorly; the bigram
metric did better.
- Presentation (student evaluation) 9
- Document 9
Quite clear text. Problem: the used metrics and their justification are
still (in the updated version) unclearly described. Data sets not described.
Otherwise good! No user manual!
- Program 12
Original idea, implements the design, works surprisingly well!
- Validating results (either using data or literature) 6
A test set was used + results reported. However, it seems that the test
documents were very similar to the training documents - how were they selected?
You should have tried with different documents, even if they do not produce
so nice results! (Testing aims especially for finding weaknesses!)
Good: Also tested with languages which did not belong to the
training set + results well analyzed.
- Total: 36
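A character-bigram metric of the kind mentioned above can be sketched as follows. The profile texts are tiny invented stand-ins for real training documents, and cosine similarity is one of several reasonable choices of distance:

```python
from collections import Counter
import math

def bigram_profile(text):
    """Normalized character-bigram frequencies of a text."""
    text = text.lower()
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}

def cosine(p, q):
    """Cosine similarity of two sparse frequency vectors."""
    dot = sum(w * q.get(bg, 0.0) for bg, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q)

def classify(text, profiles):
    """Return the training language whose bigram profile is most similar."""
    query = bigram_profile(text)
    return max(profiles, key=lambda lang: cosine(query, profiles[lang]))

# Invented miniature "training documents"
profiles = {
    "english": bigram_profile("the quick brown fox jumps over the lazy dog and then some"),
    "finnish": bigram_profile("vaka vanha väinämöinen laulaja iänikuinen sanat suussansa"),
}
print(classify("over the lazy river", profiles))  # -> english
```

A real system would build the profiles from large corpora; with texts this small the metric is fragile, which mirrors the Kalevala problem noted above.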
Carolina Isalas Sedano, Mikko Vinni and Marek Winkler: Predicting
course outcomes by Bayesian methods
The method was given with the topic -> quite easy design, although
required understanding the idea of TAN Bayesian networks.
- Presentation (student evaluation) 8.6
- Document 10
Very clear, contains all parts, comprehensive. The only problem is some messy
pictures -> Corrected! The final document is exceptionally excellent!
- Program 12
Implements correctly what it should.
Special emphasis put on cross-validation (a more general environment than
required). Easy to enlarge. Nice UI.
- Validating results (either using data or literature) 8
Really well done. Mikko even discovered the reason for some strange
results by reshuffling the training set for cross-validation, with much
better results. Compared with NB and linear regression (for numeric data).
Results + comparisons well presented as diagrams. Unfortunately true
positives and true negatives not reported -> hard to draw conclusions!
- Total 38.6
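The missing true-positive/true-negative report amounts to a confusion matrix. A minimal sketch, with invented course outcomes:

```python
def confusion(actual, predicted, positive="pass"):
    """Count true/false positives and negatives for a binary outcome."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    fp = sum(1 for a, p in zip(actual, predicted) if a != positive and p == positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    tn = sum(1 for a, p in zip(actual, predicted) if a != positive and p != positive)
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

# Invented example: actual vs. predicted course outcomes
actual    = ["pass", "pass", "fail", "fail", "pass"]
predicted = ["pass", "fail", "fail", "pass", "pass"]
print(confusion(actual, predicted))  # -> {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 1}
```

Reporting these four counts alongside overall accuracy makes it possible to see which kind of error a classifier actually makes.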
Yuriy Lakhtin, Fedor Nikitin and Lukasz Racoczy: Predicting course
outcomes by neural networks and SVMs
The methods were given with the topic, but more demanding and larger
topic than the previous, because two (quite difficult) modelling
paradigms were implemented. NN and SVM parts were implemented separately.
- Presentation (student evaluation): 9.4
- Document NN part 8, SVM part 10
NN: many parts are hard to read, sentences are not understandable.
Text structure is in some parts unclear. Otherwise covers everything
quite well. Functions are represented as figures -> illustrative.
SVM: Really good! Clear and contains everything.
- Program: NN: 12p, SVM 8p
SVM part: uses an SVM tool + a self-programmed small data converter. However,
Lukasz did all the studying and work himself. 8p
NN part implemented by Matlab scripts + a Java application for data
preprocessing and cross-validation. Learning to use the Matlab NN toolbox
is quite demanding. All in all, a demanding task was well implemented,
and the system is easy to enlarge.
- Validating results (either using data or literature) 5
NN is very unstable and can produce very different results in consecutive
executions. However, this was understood in the group (it could have been
analyzed in the document, too!).
Unfortunately true positives and true negatives are not
reported for either part. 5p
- Total: NN 34.4, SVM 32.4
Antti Mikkonen, Konstantin Petrukhnov and Anahit Poghosova: Selecting
the optimal implementation method for an expert system
Demanding topic! The newest version tries to select the optimal method
for a classification problem (among DT, BN, NB, NN, SVM). Special difficulty:
how to use the available material - managed quite well!
- Presentation (student evaluation) 8.6
- Document 7
Dec. tree part very briefly described, but Konstantin explained the idea
afterwards. The picture is not exactly a dec. tree and is not understandable
without explanation. Clear, easy-to-read text.
- Program 10 (Doesn't always work in a very sensible way, but the
problems are due to the data set + design. The current version could work
very well if the data set just were larger!)
Nearest neigh. part otherwise clever, but always returns the first match.
Problem: too little data for the nearest neighbours - more was available
but not used.
Clever UI, easy to enlarge.
The D-S part doesn't represent ignorance -> doing so could increase the
accuracy! Still a clever idea to use it for combining evidence.
- Validating results (either using data or literature) 3
Sometimes produces quite strange results, sometimes correct. Reasons are
known and the program follows its own logic. No real validation done!
- Total 28.6
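Combining evidence with Dempster-Shafer, including an explicit ignorance mass on the full frame, can be sketched as below. The hypotheses and mass values are invented, not taken from the group's system:

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination: masses are assigned to frozensets of
    hypotheses; conflicting mass is discarded and the rest renormalized."""
    combined, conflict = {}, 0.0
    for (a, w1), (b, w2) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

h1, h2 = frozenset({"h1"}), frozenset({"h2"})
theta = h1 | h2                    # full frame = explicit ignorance
m_a = {h1: 0.6, theta: 0.4}        # source A commits 0.6 to h1, rest uncommitted
m_b = {h2: 0.3, theta: 0.7}        # source B weakly supports h2
result = combine(m_a, m_b)
print(result)
```

The mass left on the full frame is exactly the "ignorance" the note above refers to: evidence that commits to nothing still influences the combination correctly.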
Maxim Dudochin, Matti Hyvärinen and Wojciech Wawrzyniak: Tree
recognition system
- Presentation (student evaluation) 9.6
- Document 8
D-S part not described! Picture of dec. tree given + idea of splitting it
explained quite well. However, the idea of the whole program is not clear
without further explanation -> explained afterwards (appendix).
Otherwise clearly structured, nice reading.
- Program 10
Clever, original idea how to handle ignorance in dec. trees.
The original dec. tree is constructed using ID3, everything else is self
implemented. Nice UI.
- Validating results (either using data or literature) 3
Just the training error, no real validation. Tested whether the program allows
one "I don't know" answer and still classifies correctly. More systematic
testing would have been possible! (e.g. cross-validation)
- Total 31.6
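The cross-validation repeatedly suggested above can be sketched generically; `train` and `predict` are placeholders for any learner, and the toy test data below is invented:

```python
import random

def cross_validation_error(samples, train, predict, k=5, seed=0):
    """k-fold cross-validation: shuffle, split into k folds, and average the
    error rate on each held-out fold. samples: list of (input, label)."""
    data = samples[:]
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    error_rates = []
    for i in range(k):
        held_out = folds[i]
        train_set = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train(train_set)
        wrong = sum(1 for x, y in held_out if predict(model, x) != y)
        error_rates.append(wrong / len(held_out))
    return sum(error_rates) / k

# Toy check: a constant "predictor" on single-class data has zero CV error.
err = cross_validation_error([(i, "a") for i in range(10)],
                             train=lambda s: "a",
                             predict=lambda model, x: model)
print(err)  # -> 0.0
```

Unlike reporting the training error alone, the held-out folds give an estimate of the generalization error, which is what most of the validation scores above are missing.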