Trees and their immediate generalization, forests, are useful for any system that
manipulates hierarchically organized information. The range of applications is broad,
from abstract syntax trees in compilers through document structures in text processing
systems to company organization charts in business software.
Trees, in particular binary trees and their variants, also provide convenient
implementations of container data structures.
A tree consists of a set of nodes. Each node may have zero or more children, other
nodes of which it is the parent. Each node has at most one parent, although it may have
an arbitrary number of children.
The closeness of the notions of tree and node yields an elegant definition of trees in terms of lists. If we look at trees or, equivalently, at tree nodes, we can consider each node as being both:
An example of dynamic tree structure is provided by class TWO_WA Y_TREE, an heir of both TWO_WAY_LIST and BI_LINKABLE. There is also LINKED_TREE, which inherits from LINKED_LIST and LINKABLE, but TWO_WA Y_TREE is generally preferable since children of a node often needs to be traversed both ways; the notion of order is usually less significant here than for lists. Such a form of definition is a particularly effective way of conveying the profoundly recursive nature of trees. The corresponding classes are useful in many different areas such as graphics, text processing and compilation. To create a one-way or two-way linked tree, use
create my_tree.make (root_value)
This will attach my_tree to a new one-node tree, with root_value at the root node. Here my_tree must be declared of type TWO_WA Y_TREE [MY_TYPE] for some type MY_TYPE, and root_value must be of type MY_TYPE.create my_tree.make (max_estimated_children, root_value)
The integer argument max_estimated_children only serves for the initial allocation; the array will be resized if the number of children grows beyond the initial size. As with the previous kinds of tree, the newly created node initially has no children.TWO_WA Y_TREE is useful for fully dynamic trees, in which a node may get new children at any time. For some applications, the number of children of a tree, although still arbitrary, is set for each node when the node is created, and will not change after that. Of course, some children may be absent; the corresponding entries will be void references. Class FIXED_TREE provides the basic mechanism; as you may have guessed, the implementation associates with each node an array of its children, arrays usually being the right structure when a collection of objects is known not to change size after creation. To create a fixed tree, use
create my_tree.make (how_many_children, root_value)
The root will have the associated value root value and will have how_many_ children children, all initially void. Unlike the argument max_estimated_children for the creation procedure of ARRAYED_TREE , the value of how_many_children is the final arity of the newly created node; since it set separately for each node on creation, the various nodes of a tree can have different arities, as illustrated on the above figure.
Whether fixed or dynamic, recursive trees fully enjoy their dual origin. This means in
particular that each node is viewed as a list of its children, and can apply to this list the
features inherited from LIST,
appropriately renamed; for example:
child_put_left
child_forth
child_put
and so on. Feature count,
inherited from LIST, indicates
the number of children; it is renamed arity to conform to accepted
tree terminology. (The word is a substantived form of the “ary” adjective ending, as in
“ternary”, “quaternary” and so on, which yielded the expression “n-ary”.)
Binary trees are a special case of fixed trees in which nodes always have two children, although either or both of these children may be void.
Class BINARY_TREE
describes binary trees.
Class BINARY_SEARCH_TREE
describes binary search trees, an implementation of bags which is appropriate for comparable items.
Binary search trees rely for insertion on a policy whereby any item less than the
root is inserted (recursively) into the left subtree, and any item greater than the root into
the right subtree. So if the insertion order is reasonably random the items will distribute
evenly among the various branches, This means that the average height of the tree will
be not much more than the optimal: [log2 n] where n is the number of nodes and [x],
for any x, is the integer part of x.
Since search operations will follow the same principle (search left if smaller than
the root, and so on), the time to find an item or ascertain that it is not there is
proportional to the average height. In normal cases this means about [log2 n] basic
operations, rather than n with a linear structure such as a list, and hence much better
performance for large n.
Recursive trees, as described so far, are not active data structures: even though each node has its own cursor to traverse the list of its children, there is no global cursor on the tree as a whole. It is not hard to see that the notion of recursive tree is in fact incompatible with the presence of a global cursor. In situations where you need such a cursor, enabling you to move freely from a node to its children, siblings and parents, you may use class CURSOR_TREE and its descendants.
With cursor trees the model is different from what we have seen earlier in this chapter: there is a clear distinction between the nodes and the tree itself. The manipulated object is a tree, and the notion of node is merely implicit. In the various operations presented below and illustrated on the following figure, “up” means towards the root and “down” towards the leaves. This, of course, is the reverse of the properties of trees of the other kind - those which grow towards the sun and serve to print books about software.
The cursor supported by instances of
CURSOR_TREE has
a position referring to a node of the tree, which is then considered to be the active node,
or is off the tree. The different off positions are:
above
(above the root),
below
(below a leaf),
before
(before a leftmost sibling),
after
(after a rightmost sibling.) As with linear structures, fictitious sentinel elements are
assumed to be present to the left, right, top and bottom.
Various procedures are available to move the cursor in all directions:
You can move the cursor in any one direction (
up,
down,
forth,
back),
repeatedly, until it is
off
(above,
below,
after,
before
respectively), but once it is
off,
further movement in the same direction is prohibited. For example the precondition of
put_left
requires
before
to be false, and the precondition of
put_right
requires after
to be false.
It is possible to move down from an
above
position; in an empty tree this brings the cursor
below.
Similarly, it is possible to move up from
below,
left from
after,
right from
before.
The sentinel element above the tree’s root is considered to be the root of a forest
containing just one tree. This view justifies the convention for the result of arity when
the cursor is above: 0 if the tree
is_empty,
1 if it has a root (viewed as the child of the fictitious sentinel element).
The cursor attached to a cursor tree is not just a conceptual notion but an actual object, of type CURSOR. You may use the query cursor to obtain a reference to the current cursor. Procedure go_to takes a cursor as argument and brings the tree’s cursor to the node identified by the value of that argument.
A useful notion associated with trees and particularly applicable to cursor trees is that of
traversal.
A traversal is a certain policy for ordering all the nodes in a tree - usually to apply
an operation to all these nodes in the resulting order.
CURSOR_TREE and its
descendants support three forms of traversal: preorder, postorder and breadth-first.
They correspond to the most commonly used traversal policies on trees, illustrated on
the figure (where the children of each node are assumed to be ordered from left to
right):
For each of the traversals, procedures are available to move the cursor accordingly, for example breadth_start and breadth_ forth for breadth-first, and similar names for the others.