NOTE: This article is Chapter 8 from my unpublished textbook Applied Algorithms and Data Structures.
TopDown 234 Trees
The following discussion is adapted from Sedgewick, R Algorithms in C. Reading, Massachusetts: AddisonWesley, 1990, pp. 215220. In 234 trees, nodes can hold more than one key. Nodes with two keys (and three children) are called 3nodes, whereas nodes with 3 keys (and four children) are called 4nodes. The structure of 234 nodes is illustrated below.
Ordinary binary trees are a special case of 234 trees, consisting exclusively of 2nodes containing one key and having two children.
The keys stored in the subtrees of 34 nodes, and those stored in the nodes themselves, satisfy the following orderings.
Orderings for 3nodes  Orderings for 4nodes 
 
As an illustration of the kinds of transformations that can be performed on the nodes of 234 trees during insertion operations, consider the initial tree shown on the right.
 
If we insert the key "X", then the 2node containing "S" can be turned into a 3node to store the new key in its proper place. This is an example of augmenting a node. Similarly, a 3node can be transformed into a 4node.  
It is not difficult to imagine a situation where we end up having to insert a new key into a 4node. For example, the key "G" would go into the middle 4node of the example tree. A simpleminded solution would be to let the height of the tree increase.  
The problem with this approach is that the tree becomes unbalanced. A better solution is to spread the tree horizontally. First, split the 4node into two 2nodes containing the left and right keys, and transfer the middle key to the node’s parent.  
Then, insert the new key into one of the 2nodes produced. The second approach also presents a problem. What if it is necessary to split a 4node whose parent is also a 4node? In that case, split the parent and, if necessary, the grandparent, and so on all the way up to the root.  
To avoid the need to split multiple nodes after an insertion, we can follow a topdown approach. During insertion, make sure that the parent of any node visited will not be a 4node. This is accomplished by splitting any 4node visited on the way down the tree. In addition, whenever the root of the tree becomes a 4node, split it into three 2nodes. This is simpler than waiting until the next insertion to do the split because we need not worry abot the parent of the root. Splitting the root is the only operation that increases by one the height of the tree. Two of the possible transformation rules for 234 trees are illustrated below.
As an example of the topdown approach to build 234 trees, consider the insertion of the sequence of keys ABCDEFGHIJKLM
into an initially empty tree. (Note that such a sequence of keys would lead to a rightskewed binarysearch tree or "vine".) The following diagrams illustrate the insertion of keys "A" through "H". The reader is encouraged to finish the tree by inserting the remaining keys.
Properties of TopDown 234 Trees
TopDown 234 trees have three important properties. First, the transformation operations performed during insertions yield perfectlybalanced trees. Second, searches in Nnode 234 trees never visit more than (log)_2(N)+1 nodes because the distance from the root to every leaf is the same, and splitting the root increases the height by one. Hence, if all nodes are 2nodes the result holds since the tree is a perfect binary tree. If there are 3nodes and 4nodes, the height can only be lower. Third, insertions into Nnode 234 trees require fewer than (log)_2(N)+1 node splits in the worst case, and seem to require less than one node split on the average. Empirical studies consistently show that very few splits occur. (This will be confirmed later by sample executions.)
Representation of 234 Nodes
In order to implement search, insertion, and deletion operations on 234 trees, we must design a suitable data structure to represent the nodes. One possibility is to use unions in C or variant records in Pascal. For instance, assume that DataType
denotes a suitable type for the keys stored in 234 trees. We could define a Pascal variant record as follows.
type
nRange = 2 .. 4;
a234ptr = ^ a234rec;
a234rec = record
d1 : DataType;
left, right : a234ptr;
case n : nRange of
3 : ( d23 : DataType;
mid : a234ptr );
4 : ( d24, d3 : DataType;
midL, midR : a234ptr )
end;
Alternatively, we could define at the outset the largest possible Pascal record (or C/C++ structure), and enforce its proper manipulation. For example, in C/C++ we could define
typedef struct a234struct a234str;
typedef a234str * a234ptr;
struct a234struct {
int n;
DataType d1, d2, d3;
a234ptr p1, p2, p3, p4;
};
The above two possibilities for defining 234 nodes would cause the implementation of operations on 234 trees to be quite involved, because the data and pointer fields must be individually manipulated by name. (See, for example, Aho, A.V., J.E. Hopcroft, and J.D. Ullman Data Structures and Algorithms. Reading, Massachusetts: AddisonWesley, 1983, pp. 176178, which presents a twopage implementation of the insertion operation on 23 trees!) Such implementation is based on explicit naming of fields in a Pascal variant record.) In order to motivate our implementation, it is worth quoting what has been said on the issue of implementing 234 trees:
It seems that Sedgewick’s opinion explains why 234 trees are usually implemented in terms of binary redblack trees. (See, for example, Cormen T.H., C.E. Leiserson, R.L. Rivest, and C. Stein Introduction to Algorithms. Cambridge, Massachusetts: The MIT Press, Second Edition, 2001, Chapter 13.)
It turns out that 234 trees can be represented in a simple and uniform way, without resorting to the "coloring" trick used with redblack trees. The data and pointer fields of 234 trees can be implemented by means of arrays, which allow both systematic and efficient access, as shown below.
typedef struct a234struct a234str;
typedef a234str * a234ptr;
struct a234struct {
int n;
DataType d[3];
a234ptr p[4];
};
The above definitions are in the spirit of Wirth’s Pascal implementation of insertion, deletion, and search operations on Btrees. (Wirth, N. Algorithms + Data Structures = Programs. Englewood Cliffs, N.J.: PrenticeHall, 1976, pp. 252257.) We will generalize this representation in order to implement generic, balanced, multiway trees (MWTs) of order m, where m is the number of pointers stored in a node. We will use generic pointers for the data items stored in the nodes.
In general, for implementation purposes, an nnode of a multiway tree (an MWT) contains n data items (or keys), and n+1 pointers. The following declarations define MWTs that can contain from 3nodes to 6nodes. The order of the MWT equals the size of the maximum node plus one.
#define N 2 // minimum mumber of data items in a node
#define M 5 // maximum number of data items in a node (M >= N + 1)
typedef struct mwtStruct mwtStr;
typedef mwtStr * mwtPtr;
typedef char * genPtr;
typedef struct InfoPtrStr {
genPtr info;
mwtPtr ptr;
};
typedef struct mwtStruct {
int m;
InfoPtrStr e[ M + 1 ];
};
Given these definitions, an object stored at position i
(actually pointed to by InfoPtrStr e[ i ].info
) of a node is "bracketed" by the pointers InfoPtrStr e[ i1 ].ptr
and e[ i+1 ].ptr
. The valid range for i
is from 1
to M
. The allocation of structures for MWTs is accomplished by means of the following two functions.
In practical applications, the heap space used to allocate dynamic structures might become exhausted. This is the reason for the implementation and use of function AllocAbort
. If that function is ever called, the program terminates issuing an appropriate message.
The implementation of operations on MWTs is not a trivial task. Therefore, I will follow a wishfulthinking, topdown approach to implement a generic MWT abstract data type (ADT). We start off with the skeleton of a class definition, showing the data and function members that immediately come to mind from past experience with other ADTs. In addition to the constructor and the destructor, the obvious operations on MWTs are insert
, remove
, search
, print
, and destroy
. Observe that the public functions defined to perform these operations hide from the user the pointer to the root of the tree. Observe also that an MWT is destroyed only if both the OKtoDelInfo
flag is nonzero, and the function pointer DelFn
is not bound to NULL
.
enum Relval { LT, EQ, GT };
class gMWtree {
protected:
mwtPtr root;
int Mdiv2,
PrintFlag,
OKtoDelInfo;
void (*PrnFn)( genPtr, int );
RelVal (*CmpFn)( genPtr, genPtr );
void (*DelFn)( genPtr );
genPtr (*CoyFn)( genPtr, genPtr );
...
public:
gMWtree();
~gMWtree()
{
if ( OKtoDelInfo && (DelFn != NULL) )
destroy();
}
void destroy() { DeleteAll( root ); }
genPtr search( genPtr gp ) { return prSearch( gp, root ); }
genPtr insert( genPtr );
void print() { PrintTree( root ); }
genPtr remove( genPtr );
};
The constructor initializes all the data members to define a truly abstract class. The data member Mdiv2
defines the "mid point" of an mnode that must be split after becoming full.
gMWtree::gMWtree()
{
root = NULL;
PrnFn = NULL; CmpFn = NULL;
DelFn = NULL; CpyFn = NULL;
OKtoDelInfo = 1; PrintFlag = 0;
Mdiv2 = M >> 1;
if ( IsOdd( M ) )
++Mdiv2;
printf( "M div 2 == %d\n", Mdiv2 );
}
Function IsOdd
is simply defined as
int IsOdd( int i )
{
return i & 0x1;
} .
The deletion of all the elements of an MWT is done recursively. Observe that function DeleteAll
(protected
) performs an in order traversal of the tree. Because of the call free( t )
at the end of the function, the second recursive call is not tailrecursive. Hence it cannot be replaced by a goto statement to a label before the if statement.
void gMWtree::DeleteAll( mwtPtr t )
{
int i;
if ( t != NULL ) {
DeleteAll( t>e[ 0 ].ptr );
for ( i = 1; i <= t>m; ++i ) {
(*DelFn)( t>e[ i ].info );
DeleteAll( t>e[ i ].ptr );
}
free( t );
}
}
Search, Insertion, and Print Operations on Multiway Trees
Assuming that we have built an MWT, searching the tree is the first operation that comes to mind. In the class definition, the public
function search
initiates the search at the root
of the tree by calling the protected
function prSearch
.
Argument gp
to function prSearch
points to an object containing partial information, that is, search key(s). If an object matching the keys is found in the tree, the function returns a pointer to such object; otherwise it returns NULL
. Since MWTs are recursive structures, it is natural to write this function recursively, as indicated by the assertion before the two statements in the last else
clause. Such a call would be tailrecursive. These statements and the label before the first if
eliminate the tail recursion.
genPtr gMWtree::prSearch( genPtr gp, mwtPtr t )
{
int i; RelVal cmp;
l0:
if ( t == NULL )
return NULL;
else {
i = FindPlace( gp, t, 0, &cmp );
if ( cmp == EQ )
return t>e[ i+1 ].info;
else {
t = t>e[ i ].ptr;
goto l0;
}
}
}
Function prSearch
relies on function FindPlace
(protected) in order to find the place of the object sought in the multiway tree. It receives a pointer to the search object (gp
), a pointer to a node (t
), the index to start the search within the node ( i
), and a reference argument to return the result of the comparison function (cmp
). Since MWTs are extensions of binarysearch trees, the search continues as long as the object sought for is less than the objects stored in the node. The function returns the index of the pointer to the left of the object within the node that is greater than or equal to the search object. (In my first implementation of MWTs as 234 trees, the while
loop in function FindPlace
was part of function prSearch
. Pretty quickly, I realized that such a loop would have to be used during insertions, and hence I turned it into a function.)
int gMWtree::FindPlace( genPtr gp, mwtPtr t, int i, RelVal *cmp )
{
while ( i < t>m
&&
(*cmp = (*CmpFn)( t>e[ i+1 ].info, gp )) == LT )
++i;
return i;
}
The next operation that comes to mind is the insertion of an object into an MWT. If the tree is empty, function insert
creates a 1node (1 key, 2 pointers), and returns NULL
as an indication of the insertion of a new object.
Following the discussion on how to perform insertions in order to maintain a balanced tree, if the tree is not empty, function prInsert
(protected
) is called to perform the insertion. After the insertion, a full root node is split in preparation for the next insertion. Such a split is the only operation that makes the height of an MWT to increase by one.
int gMWtree::FindPlace( genPtr gp, mwtPtr t, int i, RelVal *cmp )
{
while ( i < t>m
&&
(*cmp = (*CmpFn)( t>e[ i+1 ].info, gp )) == LT )
++i;
return i;
}
The creation of a 1node is accomplished by the protected
functions Mk1node
and Mk0node
, defined as follows. (For selftracing, function Mk1node
prints a message and the object stored in the new 1node.)
mwtPtr gMWtree::Mk1node( genPtr gp,
mwtPtr p0, mwtPtr p1 )
{
mwtPtr q;
printf( "Mk1node: " );
(*PrnFn)( gp, PrintFlag ); NL();
q = Mk0node();
q>m = 1;
q>e[ 0 ].info = NULL;
q>e[ 0 ].ptr = p0;
q>e[ 1 ].info = gp;
q>e[ 1 ].ptr = p1;
return q;
}

mwtPtr gMWtree::Mk0node()
{
mwtPtr p = AllocMWTstr();
p>m = 0;
return p;
}
void NL()
{
printf( "\n" );
}

Function prInsert
(protected
) below does all the heavyduty work of inserting a new object into a multiway tree. The basic algorithm for its implementation is an inorder traversal of the tree, looking for the insertion place. Naturally, this function would also be defined recursively, with tailrecursion removed. Observe the use of function FindPlace
in much the same way it was used by function prSearch
.
The protected
function FullNode
, called by both search
and prSearch
, simply determines whether or not a node of an MWT has been filled completely with m
objects (indexed from 1 to m
).
int FullNode( mwtPtr t ) { return t != NULL && t>m == M; }
We turn now to the operations of splitting a full root node, splitting a full child node, and augmenting a leaf node.
A full root node contains M objects and M+1 pointers. It is split by creating two new nodes at p0 and p1 , storing the objects and pointers in the range 0 to Mdiv21 in the node at p0 , storing the objects and pointers in the range Mdiv2+1 to M in the node at p1 , storing the Mdiv2 object in the root node, and making the root node a 2node with children at p0 and p1 . 

This algorithmic description is implemented by the protected
function SplitRoot
:
void gMWtree::SplitRoot()
{
mwtPtr p0, p1;
ReportSplit( "root", root );
split( root, &p0, &p1 );
root>e[ 1 ].info = root>e[ Mdiv2 ].info;
root>e[ 0 ].ptr = p0;
root>e[ 1 ].ptr = p1;
root>m = 1;
}
To do its job, SplitRoot
relies on the protected
function split
, which creates the two new nodes, and calls the protected
function CopyRange
twice to copy two ranges of objects and pointers from the node at q
into the nodes at *ap0
and *ap1
. The justification for defining function split
is that we can use it again to split a full child node.
void gMWtree::split( mwtPtr q,
mwtPtr *ap0,
mwtPtr *ap1 )
{
mwtPtr p0, p1;
*ap0 = p0 = Mk0node();
*ap1 = p1 = Mk0node();
CopyRange( p0, 1, q, 1, Mdiv21 );
CopyRange( p1, 1, q, Mdiv2+1, M );
}
void gMWtree::CopyRange( mwtPtr dst,
int k,
mwtPtr src,
int i,
int j )
{
int s = i;
dst>e[ k1 ].ptr = src>e[ i1 ].ptr;
while ( s <= j )
dst>e[ k++ ] = src>e[ s++ ];
dst>m += j  i + 1;
}
A graphical description of function CopyRange
is shown below.
Copy objects and pointers in the ranges i
to j
, and i1
to j
, respectively, of the node at src
, starting at the pointer position k1
of the node at dst
, and update its size.
The justification for defining function CopyRange
is obviously the fact that it is used twice.
The protected
function ReportSplit
is called by split
for selftracing purposes, and is implemented as follows.
void gMWtree::ReportSplit( char *str, mwtPtr p )
{
printf( "split %s ", str );
PrintNode( "", p, 1 );
}
Augmenting a node involves adding one object and two pointers to nodes. To augment a node pointed to by, say t , at the pointer position i , we must first move one position to the right all the objects and pointers to nodes starting at i+1 and i , respectively.
After the movement, the object is stored in the node at position i+1 , and the pointers to the two nodes, say tL and tR , are stored at pointer positions i and i+1 , respectively.
 
This algorithmic description can be implemented in such a way that the movement of objects and pointers takes place from the end of the node toward pointer position i
and object position i+1
, as follows.
void gMWtree::augment
( mwtPtr t, genPtr gp, mwtPtr tL, mwtPtr tR, int i )
{
int j;
PrintNode( "augmenting", t, 0 );
printf( " with " ); (*PrnFn)( gp, PrintFlag );
++(t>m);
printf( " at i == %d\n", i );
for ( j = t>m1; j > i; j )
t>e[ j+1 ] = t>e[ j ];
t>e[ i+1 ].info = gp;
t>e[ i ].ptr = tL;
t>e[ i+1 ].ptr = tR;
}
To split a full child node at, say tI (corresponding to pointer position i from its parent node at t ), simply use function split on the child, to obtain two nodes at tL and tR , and then augment the parent node at position i with the "middle" object from the child, and the pointers to the new nodes. Behold the simplicity of the implementation! 

void gMWtree::SplitChild( mwtPtr t, mwtPtr tI, int i )
{
mwtPtr tL, tR;
ReportSplit( "child", tI );
split( tI, &tL, &tR );
augment( t, tI>e[ Mdiv2 ].info, tL, tR, i );
free( tI );
}
After the split, the child node is no longer needed, and it is returned to the heap space. Function SplitChild
should be protected
.
Observe that the augmentation of the parent node in function SplitChild
is justified because such a node was found not to be full during the search for the insertion place. This is the rationale behind splitting a full root node after an insertion, and splitting a full child on the way down the tree during the insertion.
To print an MWT, it is useful to implement a levelbylevel traversal. The task at hand is to print all the objects stored in a node, and to traverse all the children of each node. The implementation is pretty much straightforward. (The reader may skip the use of xorQueue, and write a class to implement ordinary linked queues to contain pointers to MWT nodes as the base class for mwtPtrQ.)
class mwtPtrQ : public xorQueue {
};
void gMWtree::PrintTree( mwtPtr t )
{
mwtPtrQ q;
NL();
q.enqueue( (genPtr)t );
LevelByLevelPrint( 0, &q );
NL();
}
void gMWtree::LevelByLevelPrint( int level, mwtPtrQ *pq0 )
{
int i;
mwtPtrQ q1;
mwtPtr p, c;
if ( !pq0>IsEmpty() ) {
printf( "level %2d: ", level );
while ( !pq0>IsEmpty() ) {
p = (mwtPtr)(pq0>dequeue());
if ( p != NULL ) {
PrintNode( "", p, 0 );
for ( i = 0; i <= p>m; ++i )
if ( (c = p>e[ i ].ptr) != NULL )
q1.enqueue( (genPtr)c );
}
else
printf( "_ " );
}
NL(); NL(); NL();
LevelByLevelPrint( level+1, &q1 );
}
}
void gMWtree::PrintNode( char *str, mwtPtr p, int nl )
{
if ( *str )
printf( "%s node ", str );
if ( p == NULL )
printf( "/" );
else {
printf( "<" );
for ( int i = 1; i < p>m; ++i ) {
(*PrnFn)( p>e[ i ].info, PrintFlag );
printf( "," );
}
(*PrnFn)( p>e[ p>m ].info, PrintFlag );
printf( "> " );
}
if ( nl )
NL();
}
Illustration of a Search Operation on Multiway Trees
Consider the following detailed drawing of a multiway tree and a search object.
The info fields of the tree nodes point to structures containing a pointer to a search key (such as education
and multiprocessor
), and a pointer to some other information related to the key. Suppose that we define these structures, and a specific class as derived from the gMWtree
class,
typedef struct {
genPtr key;
genPtr data;
} DataStr;
typedef DataStr * DataPtr;
class DataTree : public gMWTree { ... };
whose constructor binds the PrnFn
and CmpFn
function pointers to appropriate functions. For instance, the function to compare keys might be as follows.
RelVal CmpKeys( genPtr gp1, genPtr gp2 )
{
DataPtr dp1 = (DataPtr)gp1, dp2 = (DataPtr)gp2;
string s1 = (string)(dp1>key), s2 = (string)(dp2>key);
int i = strcmp( s1, s2 );
return i < 0 ? LT : i == 0 ? EQ : GT;
}
To the left of the tree, a similar structure containing only a search key is pointed to by the argument gp
to function gMWtree::search
. Beneath the leftmost node of the tree, the code for function gMWtree::prSearch
is written with the call to FindPlace
replaced by its implementation (the while
loop, in which I mistakenly renamed the m
data member as n
). The initial value of argument t
to prSearch
is a copy of the root
pointer. So, we are searching for a node in the tree containing trie
as its search key. The following table shows an execution trace of prSearch
.
t  t>n  i   t>e[ i+1 ].info  cmp 
root node  2  0   object with key education  LT 
  1   object with key multiprocessor  LT 
  2  exit the while loop   
rightmost node of level 2  2  0   object with key program  LT 
  1   object with key scatter  LT 
  2  exit the while loop   
only node shown at level 3  5  0   object with key text  LT 
  1   object with key traversal  LT 
  2   object with key trie  EQ 
Upon finding an object with the same key, prSearch
returns a pointer to it.
Illustration of Insertion and Print Operations on Multiway Trees
Suppose that an appropriate class has been defined as derived from the gMWTree
class to create an MWT of order 6 (N == 2
, M == 5
) to store integer numbers. The following table illustrates a sequence of insertion and print operations on the tree. (I edited the output of function gMWtree::print
, and added by hand the lines linking nodes.) The table should be read rowbyrow.
Removal Operation on Multiway Trees
The removal of elements from MWTs is seemingly a very difficult operation because of the requirement that these trees must be perfectly balanced. The insertion algorithm guarantees a balanced tree, and we must make the removal give the same guarantee. We wish we had a means of implementing selfadjusting MWTs similar to the one shown below. As was done in the case of binary trees, we proceed in a topdown fashion by analyzing increasingly more complicated cases. I will use the integer MWT just built to illustrate the various removal scenarios. (I must point out that most of the textbooks on advanced data structures do not address the problem of removals (or deletions) from MWTs. Those that address it do so in a cursory fashion and only in graphical form. All the code presented here to manipulate MWTs, especially the one to remove elements, is entirely my own. It is yet one more example of the power of topdown, and bottomup structured programming.)
A SelfAdjusting Search Tree.
(Published on the occasion of Robert E. Tarjan’s Turing Award Lecture "Algorithm Design." Communications of the ACM Vol. 30, No. 3, March 1987.)


The toplevel function to remove
objects from an MWT is given below. The argument gp0
to remove
is a generic pointer to a structure containing partial information (the search key[s]). This function simply initiates the inorder traversal of the MWT, which is done by the protected function prRemove
. The arguments to the latter function are: a pointer to the search object, a pointer (initially NULL
) to a parent node, and pointer (initially root
) to the child node. (The function has an optional fourth argument described later) within the parent node to access the pointer to the child. If prRemove
returns NULL
, there is no object in the tree whose key(s) match the search key(s). In the case of a successful removal, if the root node has become empty, it is returned to the heap. Function remove
returns whatever it got back from prRemove
.
genPtr gMWtree::remove( genPtr gp0 )
{
genPtr gp1;
printf( "removing key " );
(*PrnFn)( gp0, PrintFlag );
NL();
gp1 = prRemove( gp0, NULL, root );
if ( gp1 == NULL )
printf( " (not present)\n" );
if ( root>m == 0 ) {
free( root );
root = NULL;
}
return gp1;
}
Due to the complexity of removal operations (and to illustrate bottomup software design), function prRemove
will be built from the ground up. First, we organize the inorder traversal of the MWT, to search for the element to be removed, following the code of functions prSearch
and prInsert
.
Removal From Leaf Nodes
Given the integer MWT tree obtained with the last insertion (shown below), suppose that we remove number 35.
This number is in the 4node <32,35,38> pointed to by tC
. Its parent node, pointed to by tP
, is the 3node <30,40>.
This is the simplest kind of removal operation. It involves reducing the node at tC, that is, shifting number 38 to replace 35, and turning the 4node into a 3node. As long as a node has at least the minimum number of elements (N
), nothing else must be done. On the other hand, if a reduced node underflows, then it must be restored using an element from its parent node at tP
. We anticipate that this operation may have to be performed several times. These considerations lead to the second version of function prRemove
.
Second Version of prRemove
genPtr gMWtree::prRemove( genPtr gp0,
mwtPtr tP,
mwtPtr tC,
int j )
{
int i; RelVal cmp; genPtr gp1; mwtPtr tI;
if ( tC == NULL )
return NULL;
else {
i = FindPlace( gp0, tC, 0, &cmp );
if ( cmp != EQ )
gp1 = prRemove( gp0,
tC,
tC>e[ i ].ptr,
i );
else {
gp1 = tC>e[ i+1 ].info;
tI = tC>e[ i ].ptr;
if ( tI == NULL )
reduce( tC, i+1 );
else
}
if ( tP != NULL && UnderflowNode( tC ) )
RestoreChild( tP, tC, j );
return gp1;
}
}
 When the comparison returned by
FindPlace
is EQ
, removal takes place from the node at tC
.  If the node at
tC
is a leaf node then it must be reduced.  After the removal, if the node at
tP
exists (parent node), and the node at tC
(child node) underflowed, then the child must be restored with an element from the parent at index position j+1
.
The reduction of the node pointed to by t
at object position i
involves shifting to the left by one position all the objects and pointers to the right of the object being removed. The node size is decremented by one.
void gMWtree::reduce( mwtPtr t, int i )
{
PrintNode( "reducing", t, 0 );
printf( " at i == %d\n", i );
for ( int j = i+1; j <= t>m; ++j )
t>e[ j1 ] = t>e[ j ];
(t>m);
}
The node at t
has underflowed if its size is less than the minimum size allowed.
int UnderflowNode( mwtPtr t )
{
return t != NULL && t>m < N;
}
The following figure illustrates three removals from leaf nodes of the integer MWT. The resulting leaf nodes after the removals are not underflow nodes.
Removal of elements from leaf nodes of the integer MWT without causing them to underflow
Restoration of an Underflowed Child Leaf Node, Case 1: Nonminimal Siblings
In order to figure out how to restore an underflow child node, consider again the original integer MWT (shown below), and suppose that we remove elements 32 and 38 in that order.
As shown in the following figure, these removals cause the leaf 4node <32,35,38> to be reduced to the 3node <35,38>, and then to the 2node <35>, which becomes and underflow node.
As shown by the dotted arrows in the last tree, to restore the underflow 2node <35> we can move number 40 from its parent node to make the 3node <35,40>, and replace 40 in the parent node with the leftmost number (42) from the nodes right sibling. The resulting balanced tree is as follows.
There is, of course, the symmetric case in which the borrowed number is the rightmost number from the underflow node’s left sibling, also performing the rotation of numbers through the parent node. The following figures illustrate this situation.
Tree resulting after the removal of numbers 46, 35, and 20, in that order.  
Tree resulting after the removal of number 13.The dotted arrows indicate the movement of 10 from the parent node to the underflow node <15>, and the replacement of 10 with the rightmost number of the left sibling.  
Balanced tree after the restoration of the node.  
It would seem that these two symmetric restoration operations for an underflow node are justified only if the following conditions hold
 The parent of the underflow node is at least a 3node.
 Either the right or the left sibling of the underflow node is at least a 4node.
As will become apparent later, the first condition it is not necessary at all. If it were, we could easily define a function Is3node
to test the parent node, and modify version 2 of function prRemove
as follows
if ( Is3node( tP ) && UnderflowNode( tC ) )
RestoreChild( tP, tC, j );
Finally, when an underflow node has two siblings and both of them are at least 4nodes, then it is advisable to borrow from the largest. We are now ready to start writing RestoreChild
.
RestoreChild
void gMWtree::RestoreChild( mwtPtr tP,
mwtPtr tC,
int i )
{
mwtPtr tLs, tRs;
int minLs, minRs,
mLs, mRs;
PrintNode( "underflow", tC, 1 );
print();
GetSiblings( tP, i, &tLs, &tRs );
minLs = MinimalNode( tLs );
minRs = MinimalNode( tRs );
if ( minLs && minRs )
else {
mLs = tLs>m;
mRs = tRs>m;
if ( minLs  mRs > mLs )
MoveFromRight( tP, tRs, tC, i );
else
MoveFromLeft( tP, tLs, tC, i );
}
}
 The function defines variables to point to potentially two siblings, for the underflow child node at
tC
has at least one. There are two flags to know if the siblings are minimal (i.e. cannot give away anything), and two variables for the sizes of the siblings.  Get pointers to the sibling(s), and determine whether it(they) is(are) minimal.
 If both siblings are minimal, we must figure out something else to do.
 Otherwise, there is at least one nonminimal sibling.
 If the left sibling is minimal, or if the right sibling is larger, then move an item from the right sibling.
 Otherwise, the left sibling is nonminimal or larger than the right. Move an item from the left sibling.
Thus far, function RestoreChild
relies on the protected
functions GetSiblings
, MinimalNode
, MoveFromRight
, and MoveFromLeft
. We have all the information we need to implement them.
If the pointer t
is NULL
, it does not point to a node, and the nonexisting node is minimal. Otherwise, the node at t
is minimal if it contains fewer than N+1
info
items.
int gMWT::MinimalNode( mwtPtr t )
{
return t == NULL  t>m < N+1;
}
To find the siblings of a node, we need a pointer to its parent (tP
). If the child whose siblings we want is at tP>e[ i ].ptr
, its potential siblings would be at positions i – 1
and i + 1
of the array.
void gMWtree::GetSiblings( mwtPtr tP, int i, mwtPtr *aLs, mwtPtr *aRs )
{
*aLs = *aRs = NULL;
if ( i < tP>m )
*aRs = tP>e[ i+1 ].ptr;
if ( i > 0 )
*aLs = tP>e[ i1 ].ptr;
}
Functions MoveFromRight
and MoveFromLeft
must implement the two rotation operations described above with the example integer MWT. We can implement them using functions augment
, reduce
, and some additional code. Observe carefully their input assertions.
void gMWtree::MoveFromRight
( mwtPtr tP, mwtPtr tRs, mwtPtr tC, int i )
{
ReportMoveThrough( tC, tP, tRs );
augment( tC, tP>e[ i+1 ].info,
tC>e[ tC>m ].ptr,
tRs>e[ 0 ].ptr,
tC>m );
tP>e[ i+1 ].info = tRs>e[ 1 ].info;
reduce( tRs, 1 );
}
void gMWtree::MoveFromLeft
( mwtPtr tP, mwtPtr tLs, mwtPtr tC, int i )
{
int mLs = tLs>m;
ReportMoveThrough( tC, tP, tLs );
augment( tC, tP>e[ i ].info,
tLs>e[ tLs>m ].ptr,
tC>e[ 0 ].ptr,
0 );
tP>e[ i ].info = tLs>e[ mLs ].info;
reduce( tLs, mLs );
}
Function ReportMoveThrough
is called by MoveFromLeft
and MoveFromRight
for selftracing purposes. Its implementation, and the actual output produced for the two restoration examples are shown below.
void gMWtree::ReportMoveThrough( mwtPtr dst, mwtPtr thr, mwtPtr src )
{
PrintNode( "moving from node ", src, 0 );
PrintNode( " through node ", thr, 0 );
PrintNode( " to node ", dst, 1 );
}
The reader should appreciate the simplicity of the implementation of functions MoveFromLeft
and MoveFromRight
. Ignoring the comments, each of the functions consists of three instructions, two of which are a call to function augment
(designed to perform insertions!) and a call to function reduce
. (As the reader may have noticed, I do not put many comments in my code because I believe that code should be selfdocumenting. The comments that I added, after the fact, to functions MoveFromLeft
and MoveFromRight
are just to repeat the steps of the restoration process in each case.)
Restoration of an Underflowed Child Leaf Node, Case 2: Minimal Siblings
The first version of RestoreChild is incomplete because it does nothing when the siblings of an underflowed node are minimal. Again, to determine what to do we must study a concrete case. Consider the figure on the right, showing a tree, and the result after removing number 38.
As indicated by the dashed arrows, to restore the tree, we do the following:
 Reduce the parent node to send 27 down to the left (and only) sibling of the underflow node.
 Move all the information (data and tree pointers) from the underflow node to its sibling.
 Return the underflow node to the heap.
 
This operation constitutes a merge of the underflow node with its left sibling. Obviously, there is the symmetric situation of merging an underflow node with its right sibling. We are now in the position to complete function RestoreChild
.
Third (and definitive) version of RestoreChild
Both siblings are minimal nodes.
If the left sibling exists, merge the underflow node at the end of its sibling.
Otherwise, merge the underflow node at the beginning of its right sibling.
void gMWtree::RestoreChild( mwtPtr tP,
mwtPtr tC,
int i )
{
mwtPtr tLs, tRs;
int minLs, minRs,
mLs, mRs;
PrintNode( "underflow", tC, 1 );
print();
GetSiblings( tP, i, &tLs, &tRs );
minLs = MinimalNode( tLs );
minRs = MinimalNode( tRs );
if ( minLs && minRs )
if ( tLs != NULL )
MergeLeft( tP, tLs, tC, i );
else
MergeRight( tP, tRs, tC, i );
else {
mLs = tLs>m;
mRs = tRs>m;
if ( minLs  mRs > mLs )
MoveFromRight( tP, tRs, tC, i );
else
MoveFromLeft( tP, tLs, tC, i );
}
}
The implementation of functions MergeLeft
and MergeRight
requires a little more work than what was done in functions MoveFromLeft
and MoveFromRight
, but the basic idea is the same: every time we move a data pointer, we also move its enclosing tree pointers. Again, pay close attention to the input assertions of the functions.
Function MergeLeft
can be implemented in terms of reduce
(on the parent node) and CopyRange
(to put elements at the end of the left sibling), plus some additional code:
void gMWtree::MergeLeft( mwtPtr tP, mwtPtr tLs, mwtPtr tC, int i )
{
++(tLs>m);
tLs>e[ tLs>m ].info = tP>e[ i ].info;
reduce( tP, i );
ReportMoveAll( tLs, tC );
CopyRange( tLs, tLs>m+1, tC, 1, tC>m );
CheckEmptyRoot( tP, tLs );
free( tC );
}
Function MergeRight
can also use reduce
on the parent node, but must resort to function augment
to put elements, one by one, at the beginning of the right sibling. The function also has some additional code.
void gMWtree::MergeRight( mwtPtr tP, mwtPtr tRs, mwtPtr tC, int i )
{
augment( tRs, tP>e[ i+1 ].info, NULL, tRs>e[ 0 ].ptr, 0 );
reduce( tP, i );
ReportMoveAll( tRs, tC );
for ( int k = tC>m; k > 0; k )
augment( tRs, tC>e[ k ].info, NULL, tC>e[ k ].ptr, 0 );
tRs>e[ 0 ].ptr = tC>e[ 0 ].ptr;
CheckEmptyRoot( tP, tRs );
free( tC );
}
Functions MergeLeft
and MergeRight
call function ReportMoveAll
, for selftracing purposes,
void gMWtree::ReportMoveAll( mwtPtr dst, mwtPtr src )
{
PrintNode( "moving into", dst, 0 );
PrintNode( " all {info, ptr} from", src, 1 );
}
and function CheckEmptyRoot to replace the root of the MWT with a new root when it becomes empty because of compression of the tree during rebalancing.
void gMWtree::CheckEmptyRoot( mwtPtr tP, mwtPtr NewRoot )
{
if ( tP == root && tP>m == 0 ) {
root = NewRoot;
free( tP );
}
}
The following figure illustrates the complete process of removal of number 38 from the example tree, and the rebalancing of the tree.
Before dealing with removal of elements from internal nodes of a MWT, I will present a beautiful example of the operation of functions MergeLeft
and MergeRight
. Consider the following tree, and the result after removal of number 22.
At the point where the 3node <22, 24> underflows, function prRemove
, has called itself recursively once, and the last tree shows pointer tP
at the 3node <10,18> and pointer tC
at the underflow 2node <24>. The the test tP != NULL && UnderflowNode( tC )
is true, and prRemove
calls RestoreChild
. The underflow node has only a left sibling, pointed to by tLs
, and it is a minimal node. Therefore, function MergeLeft
is called to perform the merge indicated by the dashed arrows. The following figure shows the tree after the merge.
After the merge that produced the top tree in the figure, function prRemove
returns from the recursive call. (For convenience, the relevant code is shown below with the recursive call in bold.)
The reader should observe how remarkable this behavior is. Function RestoreChild
was designed to restore underflowed leaf nodes and yet, due to the recursive traversal of the tree by prRemove
, it works just as well with internal nodes! This is the reason for not demanding the node at tP
to be at least a 3node when calling RestoreChild
. The test tP != NULL && UnderflowNode( tC )
simply asks whether the node at tP
is not NULL
before asking whether the node at tC
is underflowed bcause such value is assigned to tP
only when tC
is pointing to the root of the whole tree on the first call to prRemove
. As a consequence, the root node of an MWT is the only one allowed to remain underflowed as a 2node.
The second thing the reader should observe is the power of recursive solutions to problems that are inherently recursive, to wit, the traversal of a recursive structure. Finally, because of the possibility of having to restore an underflowed node at the end of prRemove
, the recursive call shown in bold above is not tailrecursive. Even though I consider myself a seasoned programmer, I cannot begin to imagine how to implement the removal operation in an iterative fashion without using an explicit stack.
Removals From Internal Nodes of Multiway Trees
To complete the definition of function prRemove
, we consider now the removal of elements from internal nodes. Again, in order to determine what to do we study some concrete situations.
In general, the removal of an element from an internal node involves replacing it with either its successor or its predecessor, as we did when removing an internal node of a BST in Chapter 6. Using wishful thinking, the third (and definitive) version of prSearch
is shown below, with the added code in bold.
genPtr gMWtree::prRemove( genPtr gp0,
mwtPtr tP,
mwtPtr tC,
int j )
{
int i; RelVal cmp; genPtr gp1; mwtPtr tI;
if ( tC == NULL )
return NULL;
else {
i = FindPlace( gp0, tC, 0, &cmp );
if ( cmp != EQ )
gp1 = prRemove( gp0, tC, tC>e[ i ].ptr, i );
else {
gp1 = tC>e[ i+1 ].info;
tI = tC>e[ i ].ptr;
if ( tI == NULL )
reduce( tC, i+1 );
else
ReplaceWithPredOrSucc( tC, tI, i );
}
if ( tP != NULL && UnderflowNode( tC ) )
RestoreChild( tP, tC, j );
return gp1;
}
}
The reader may be wondering what we must do when, after performing the replacement in the internal node and the reduction of a child node, the latter underflows. Well, we already have taken care of that! We now turn to the implementation of function ReplaceWithPredOrSucc
.
Following the logic behind the removal of an internal node in a BST, ReplaceWithPredOrSucc
must find the nodes containing the predecessor and the successor of the element to be removed. To minimize the amount of work, the replacement element should be taken from the largest node, if both of them exist. Instead of calling reduce on the node that "lost" an element, we simply call prRemove to remove such an element
void gMWtree::ReplaceWithPredOrSucc( mwtPtr tC, mwtPtr tI, int i )
{
mwtPtr tIp1, tPred, tSucc;
int mPred, mSucc;
genPtr gp;
tIp1 = tC>e[ i+1 ].ptr;
tPred = PredecessorNode( tI );
tSucc = SuccessorNode( tIp1 );
mPred = tPred>m;
mSucc = tSucc>m;
if ( CopyFn == NULL ) {
printf( "ERROR: NULL CopyFn pointer\n" );
exit( 0 );
}
if ( mPred > mSucc ) {
gp = (*CopyFn)( tC>e[ i+1 ].info, tPred>e[ mPred ].info );
ShowReplacement( "predecessor", gp );
prRemove( gp, tC, tI, i );
}
else {
gp = (*CopyFn)( tC>e[ i+1 ].info, tSucc>e[ 1 ].info );
ShowReplacement( "successor", gp );
prRemove( gp, tC, tIp1, i+1 );
}
}
The functions to find the node containing the predecessor and the node containing the successor of the element being removed are generalizations of standard functions to find the predecessor and successor of a node in binry search trees, and are as follows.
mwtPtr gMWtree::PredecessorNode( mwtPtr t )
{
mwtPtr tM;
while ( (tM = t>e[ t>m ].ptr) != NULL )
t = tM;
return t;
}
mwtPtr gMWtree::SuccessorNode( mwtPtr t )
{
mwtPtr t0;
while ( (t0 = t>e[ 0 ].ptr) != NULL )
t = t0;
return t;
}
Finally, a function to show the replacement (used for selftracing purposes) is as follows.
void gMWtree::ShowReplacement( char *str, genPtr gp )
{
printf( "replacing with %s: ", str );
(*PrnFn)( gp, PrintFlag );
NL();
}
As an illustration of the execution of prRemove
and its auxiliary functions, the following figure shows an initial tree and the actions that take place with the removal of some of its elements. The reader is encouraged to determine, from the messages issued, which function(s) is(are) responsible for each of the actions.
Examples of removal operations on an MWT
This concludes the topdown design and implementation of operations on balanced MWTs.