www.delorie.com/gnu/docs/gnugo/gnugo_164.html | search |
Buy GNU books! | |
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The most flavouring point is the building of the minimal DFA recognizing a given set of patterns. To perform the insertion of a new pattern into an already existing DFA one must completly rebuild the DFA: the principle is to build the minimal CDFA recognizing the new pattern to replace the original CDFA with its synchronised product by the new one.
We first give a formal definition: Let L be the left CDFA and R be the right one. Let B be the synchronised product of L by R. Its states are the couples (l,r) where l is a state of L and r is a state of R. The state (0,0) is the error state of B and the state (1,1) is its initial state. To each couple (l,r) we associate the union of patterns recognized in both l and r. The transitions set of B is the set of transitions (l1,r1)---a--->(l2,r2) for each symbol 'a' such that both l1--a--->l2 in L and r1--a--->r2 in R.
The maximal number of states of B is the product of the number of states of L and R but almost all this states are non reachable from the initial state (1,1).
The algorithm used in function 'sync_product()
' builds
the minimal product DFA only by keeping the reachable states.
It recursively scans the product CDFA by following simultaneously
the transitions of L and R. A hast table
(gtest
) is used to check if a state (l,r) has
already been reached, the reachable states are remapped on
a new DFA. The CDFA thus obtained is minimal and recognizes the
union of the two patterns sets.
For example these two CDFA's:
Give by synchronization product the following one:
It is possible to construct a special pattern database that generates an "explosive" automaton: the size of the DFA is in the worst case exponential in the number of patterns it recognizes. But it doesn't occur in pratical situations: the dfa size tends to be stable. By stable we mean that if we add a pattern which greatly increases the size of the dfa it also increases the chance that the next added pattern does not increase its size at all. Nevertheless there are many ways to reduce the size of the DFA. Good compression methods are explained in Aho, Ravi Sethi, Ullman "COMPILERS: Principles, Techniques and Tools" chapter Optimization of DFA-based pattern matchers.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
webmaster | delorie software privacy |
Copyright © 2003 by The Free Software Foundation | Updated Jun 2003 |