Traditionally, fuzzy rules are defined by a human expert. Nowadays, however, they can also be learned from data, at least to some extent. The approaches use either neural networks (they first learn "fuzzy neural networks" and then read the fuzzy rules from the network) or genetic algorithms. One example is Roubos et al.: Learning fuzzy classification rules from data.
The most flexible feature of D-S theory is that we can represent ignorance. For example, if the beliefs in A and not A are m(A) and m(-A), it is possible that m(A)+m(-A)<1, and the amount of ignorance is 1-m(A)-m(-A).
We can demonstrate graphically how beliefs m1 and m2 are combined by the
Dempster-Shafer rule:
The areas of the rectangles describe the combined belief in A, not A, or
ignorance (any). We have to normalize the beliefs by dividing by the
area of consistent (non-contradictory) beliefs. The whole square
corresponds to belief value 1.0. The contradictory area has value
m1(A)*m2(-A)+m1(-A)*m2(A). Thus the remaining area is
K=1-(m1(A)*m2(-A)+m1(-A)*m2(A)). Now the combined beliefs are:
m1+m2(A)=(m1(A)*m2(A)+m1(any)*m2(A)+m1(A)*m2(any))/K
m1+m2(-A)=(m1(-A)*m2(-A)+m1(any)*m2(-A)+m1(-A)*m2(any))/K
And the amount of ignorance is:
m1+m2(any)=m1(any)*m2(any)/K.
Notice that A can also be a set of propositions, e.g. A = A1 or A2.
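The combination rule above can be sketched in code. This is a minimal illustration for the two-proposition frame {A, -A} only, with masses stored in a plain dict (the names combine, m1, m2, and the example mass values are illustrative, not from the source):

```python
def combine(m1, m2):
    """Combine two basic belief assignments over {A, -A} by Dempster's rule.

    Each assignment is a dict with keys 'A', '-A', and 'any' (ignorance)
    whose values sum to 1.
    """
    # Mass falling on contradictory intersections (A from one source, -A from the other)
    conflict = m1['A'] * m2['-A'] + m1['-A'] * m2['A']
    # Normalization factor K: the remaining, consistent area of the square
    K = 1.0 - conflict
    return {
        'A':   (m1['A'] * m2['A'] + m1['any'] * m2['A'] + m1['A'] * m2['any']) / K,
        '-A':  (m1['-A'] * m2['-A'] + m1['any'] * m2['-A'] + m1['-A'] * m2['any']) / K,
        'any': (m1['any'] * m2['any']) / K,
    }

# Example: two sources, each leaving some mass to ignorance
m1 = {'A': 0.6, '-A': 0.1, 'any': 0.3}
m2 = {'A': 0.5, '-A': 0.2, 'any': 0.3}
m = combine(m1, m2)
print(m)
print(sum(m.values()))  # the combined masses again sum to 1
```

Note how the combined ignorance m(any) shrinks compared to either source's: agreement between sources converts shared ignorance into belief.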