Manager Classes, Passive and Active Objects, and Two-Layered
Frameworks
Jiri Soukup, Code Farms Inc.
7214 Jock Trail, Richmond, Ont., Canada, K0A 2Z0
613-838-4829, fax 613-838-3316, jiri@debra.dgbt.doc.ca
Abstract
This paper presents a new design strategy which treats relations
between classes as objects. This concept has a profound effect on
the entire software architecture: Instead of having many active
objects that are busy communicating with each other, the architecture
is based on active objects which manage less active, sometimes
totally passive, objects. This arrangement brings system and order
into the chaos of uncontrolled object interaction. When applying
this concept to framework design, we get frameworks composed of
two layers: A layer of classes that carry framework relevant data
and pointers, plus a layer of classes that have no data, but manage
the relations. Application classes inherit only the data, not the
framework methods. Experience from numerous commercial projects
indicates that the new approach is effective for rapid generation
of production-grade code; it also improves code modularity and maintenance
by reducing dependency cycles among application classes.
Category:
Research
Topic Area:
Architectures; reuse, components, and frameworks; patterns; analysis
& design methods.
Relations between objects
The object-oriented paradigm, which encapsulates both the data and
methods that operate on those data within one class, breaks down
when dealing with two or more closely cooperating classes. This
situation typically occurs in traditional data structures, in situations
that Booch calls "mechanisms", or when implementing design patterns.
The problem of how to manage groups of cooperating classes safely
and efficiently is quickly becoming one of the critical problems we are
facing today. After all, design patterns (see [2]) describe the
cooperation of several classes; most frameworks are conglomerates
of cooperating classes, and many data structures involve two or
more different types (see [1]), while object coordination is considered
an important part of the architecture (see [4]). In [5], this topic
is covered in Chap.7.3.3 - Control Objects. The same problem is
the center of attention in [6], which will be published later this
year.
Example 1: Implementing a graph
Assume that the classes
Node and Edge represent a directed graph; for each Node, we have
a list of Edges which originate in it. Each Node has a pointer to
the beginning of the list, each Edge has two pointers: one to the
next Edge on the list, and one to the target Node. Methods that
control the graph manipulation are associated with either Node or
Edge, depending on the character of the operation. For example,
using C++ notation:
Edge* Node::firstEdge(void); // assigned to Node
Edge* Edge::nextEdge(void); // assigned to Edge
Node* Edge::nextNode(void); // assigned to Edge
Example 2: Implementing the many-to-many relation
When implementing the many-to-many relation (Entity-Relationship
Model M:N), the situation is similar, except that now three classes
are involved: Source, Relation, and Target. Again, the methods
that control the relation are assigned to different classes, depending
on where the data required for the particular operation reside.
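A minimal sketch of this conventional arrangement, assuming intrusive singly linked lists. The member names and the methods link(), firstRelation(), firstIncoming(), nextFromSource(), and nextToTarget() are hypothetical and chosen only for illustration. Note how the interface ends up scattered across all three classes, and how Relation needs the full definitions of both Source and Target.

```cpp
#include <cassert>

class Relation;

class Source {
    friend class Relation;
    Relation *firstOut = nullptr;   // head of the list of relations leaving this source
public:
    Relation *firstRelation() const { return firstOut; }   // assigned to Source
};

class Target {
    friend class Relation;
    Relation *firstIn = nullptr;    // head of the list of relations entering this target
public:
    Relation *firstIncoming() const { return firstIn; }    // assigned to Target
};

class Relation {
    Source   *src = nullptr;
    Target   *tgt = nullptr;
    Relation *nextOut = nullptr;    // next relation with the same source
    Relation *nextIn = nullptr;     // next relation with the same target
public:
    // Assigned to Relation, but it touches private data of all three classes.
    void link(Source *s, Target *t) {
        assert(s && t && !src && !tgt);   // a relation may be linked only once
        src = s;
        tgt = t;
        nextOut = s->firstOut;  s->firstOut = this;   // push onto the source's list
        nextIn = t->firstIn;    t->firstIn = this;    // push onto the target's list
    }
    Relation *nextFromSource() const { return nextOut; }
    Relation *nextToTarget() const { return nextIn; }
    Source *source() const { return src; }
    Target *target() const { return tgt; }
};
```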
Example 3: Implementing the design pattern 'Composite'
The design pattern called Composite involves the cooperation of
three classes: Component, Leaf, and Composite. Component is a
base class, from which the classes Leaf and Composite are derived.
Leaf is a simple single object, but Composite includes a collection
of Components. This is an elegant way of representing general
hierarchies, where any object can include a collection of other
objects, including nesting at arbitrary levels. Operations such
as add() or remove() are usually virtual methods assigned to both
Component and Composite.
Dependency between classes
Even though many recent publications discuss various strategies
of controlling cooperating objects (see for example [4], Chap.18.3
and 22), class dependency is usually neglected. And yet, class dependency
has a major impact on code modularity and the ability to maintain
complex designs (for more details, see [1], Chap.2).
There are several ways to define dependency between classes.
In my opinion, the simplest definition, which also catches the
essence of the problem, is to say that class A depends on class
B if class A cannot be compiled without having the full definition
of class B. According to this definition, independent classes
can be compiled and debugged in separate modules.
Example 4: How one class can depend on another
In the following code, class B depends on class A in 3 different ways.
Note that 'friend' statements do not cause dependency, but are
usually an indication that some dependency is present.
class A {
    friend class B;           // most likely, B depends on A
    int i;
public:
    void fun(void){ /* whatever */ }
};
class B {
    A a;                      // (1) member of type A
public:
    int g(A *p){
        p->fun();             // (2) method of A called
        return p->i;          // (3) member of A is used (friends)
    }
};
Example 5: Pointer members do not make classes dependent
In the following code, even though class C works with pointers to
A, it is still independent of A. The code can be compiled without
knowing the implementation of class A.
class A;
class C {
    friend class A;
    A *a;                     // pointer does not cause dependency
public:
    void set(A *h){ a=h; }    // pointer does not cause dependency
};
In practical applications, a certain amount of class dependency
is always present. However, the character of the dependency graph
determines whether the design can be developed in a modular, structured
fashion, or whether all classes form one interconnected cluster
which must always be compiled, debugged, and maintained together
- the situation which I like to call "spaghetti ++".
In Fig.1, the dependency graph forms
a lattice. All classes at level 0 can be designed and debugged
in isolation; after that, classes at level 1 can be developed
using classes from level 0 which are already solid and tested.
Continuing level by level, class J is finally designed and tested.
For example, if class C changes, we know that the changes may
affect only classes F,G,H, and J.
The situation in Fig.2 is different,
because the dependency graph forms a loop. Classes G and H can
be developed and tested in isolation, but all remaining classes
must be handled as one large unit.
In designs where some classes participate in more than one relation,
we frequently run into this situation. The resulting cluster of
classes is often called a framework. Frameworks are handled as
one large unit not only because of their particular functionality,
but also because the dependencies among the classes do not permit
the splitting of design into smaller modules. Complex class dependencies
often occur in CAD and CASE systems, financial and business systems,
compiler design, or in connection with fast graphical displays.
The fact that we cannot split these systems into smaller modules
is a major design obstacle. One of the case studies described
in [1] mentions a CASE system which works with 500 classes organized
at 7 inheritance levels, and connected by 52 relations, usually
involving 2-3 classes.
Manager classes
The reason why we get these clusters of interdependent classes is
that, within these classes, we hide the description of class relations.
The cure for this is to think in a more object-oriented way, and
to represent not only the objects, but also their relations as classes.
For each relation, we will use a special class which will contain
all of the operations pertinent to the relation; the related data
and pointers will be stored in the original classes (as we did
before). This arrangement requires one additional class for each
relation, but we get a better architecture. Fig.3
shows what happens when representing a graph. The class which
manages the relation (class Graph in Fig.3)
will be called the "manager class".
"Manager Class" is just another name for the "Pattern Class"
introduced in [1]. Both names relate to the fact that the class
manages the entire pattern. In [4], this type of class is called
"Coordinator", in [5] it is called "Control Object".
Note that data structures can be viewed as more primitive design
patterns (see [1]). Since the manager class includes all of the
interface, it must have access to the classes that contain the
data. Typically, the manager class is a friend of all the classes
that contain its data.
Manager classes work well even when inheritance is present.
For example (see [1],[3]), when representing pattern Composite
in this style, classes Composite and Leaf are both subclasses
of class Component.
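As an illustration only, the Composite pattern in this style might look like the sketch below. The manager name CompositeMgr, the pointer members, and the method names are assumptions made for this example and are not taken from [1] or [3]. The three pattern classes carry only pointers; every operation on the hierarchy is a method of the manager.

```cpp
#include <cassert>

class Composite;
class CompositeMgr;

class Component {                 // passive: sibling and parent links only
    friend class CompositeMgr;
    Component *next = nullptr;    // next sibling in the parent's child list
    Composite *parent = nullptr;  // owning composite, if any
public:
    virtual ~Component() = default;
};

class Leaf : public Component {};

class Composite : public Component {
    friend class CompositeMgr;
    Component *firstChild = nullptr;   // head of the child list
};

class CompositeMgr {              // all operations on the hierarchy live here
public:
    // Inserts x at the head of c's child list.
    // Only self-insertion is checked; deeper cycles are the caller's problem.
    void add(Composite *c, Component *x) {
        assert(c && x && !x->parent && x != c);
        x->parent = c;
        x->next = c->firstChild;
        c->firstChild = x;
    }
    // Detaches x from its parent; x keeps its own children.
    void remove(Component *x) {
        assert(x && x->parent);
        Component **link = &x->parent->firstChild;
        while (*link != x) link = &(*link)->next;
        *link = x->next;
        x->next = nullptr;
        x->parent = nullptr;
    }
    Composite *parent(const Component *x) const { return x->parent; }
    Component *firstChild(const Composite *c) const { return c->firstChild; }
    Component *next(const Component *x) const { return x->next; }
};
```

With this layout, add() and remove() no longer need to be virtual methods of Component and Composite.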
Example 6: Directed graph representation using a manager class,
Graph
class Edge;
class Graph;
class Node {
    friend class Graph;
    Edge *first;
};
class Edge {
    friend class Graph;
    Edge *next;
    Node *target;
};
class Graph {
public:
    Edge *nextEdge(Edge *e){return e->next;}
    Edge *firstEdge(Node *n){return n->first;}
    Node *nextNode(Edge *e){return e->target;}
    void addEdge(Node *n1, Edge *e, Node *n2);
    // ... all the remaining interface
};
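Example 6 declares addEdge() without defining it. One possible definition is sketched below, assuming that a new edge is pushed onto the head of the source node's list; the text does not specify the ordering. The null initialisers and the assert() checks are additions made for this illustration.

```cpp
#include <cassert>

class Edge;
class Graph;

class Node {
    friend class Graph;
    Edge *first = nullptr;    // head of the list of outgoing edges
};

class Edge {
    friend class Graph;
    Edge *next = nullptr;     // next edge leaving the same node
    Node *target = nullptr;   // node this edge points to
};

class Graph {
public:
    Edge *firstEdge(Node *n) { return n->first; }
    Edge *nextEdge(Edge *e)  { return e->next; }
    Node *nextNode(Edge *e)  { return e->target; }

    // Assumed semantics: insert e at the head of n1's edge list, pointing to n2.
    void addEdge(Node *n1, Edge *e, Node *n2) {
        assert(n1 && e && n2);
        assert(!e->target);   // e must not already be part of the graph
        e->target = n2;
        e->next = n1->first;
        n1->first = e;
    }
};
```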
Paradigm shift
The new representation of the relations slightly changes the access
syntax. For example, when connecting Node n1 to Node n2 using Edge
e, the typical interface used today looks like this:
Edge *e; Node *n1,*n2;
e->addEdge(n1,n2);
e=n1->firstEdge();
e=e->nextEdge();
The new interface is
Graph g; // object representing the relation
Edge *e; Node *n1,*n2;
g.addEdge(n1,e,n2);
e=g.firstEdge(n1);
e=g.nextEdge(e);
Formally, there is not that much difference; however, there
is a significant paradigm shift. In the currently used style, methods
are assigned to the application objects. For example, you think
about adding Edge e between Nodes n1 and n2; the fact that you are
adding an edge to your Graph is implicit and neglected. The notion
of the graph is the last on your mental list.
In the new style, you first think about the graph. Your prime
thought is that you are adding an edge to the graph! Which edge
and between which nodes is information that must be provided,
but it is of only secondary importance.
The advantage of the new interface style is apparent when your
objects become involved in more than one relation (the typical
framework situation). For example, if Node and Edge are involved
in two independent Graphs, the new interface can handle it gracefully:
Graph g1,g2; Edge *e; Node *n1,*n2;
g1.addEdge(n1,e,n2); // adding edge on graph g1
g2.addEdge(n1,e,n2); // adding edge on graph g2
The interface which is typically used today cannot handle
this case, unless you start to customize the interfaces.
Two-layered frameworks
The advantage of using manager classes can be seen from Fig.4.
In the traditional design style, this framework which involves 3
patterns forms a cluster of interdependent classes that must be
developed, debugged, and tested together. Manager classes P1,P2,
and P3 break the dependency cycle and bring structure to the entire
design. Classes A,B,...,G can be developed and tested independently;
each pattern can be tested without even compiling the other patterns.
This approach results in a new, two-layered style of framework
architecture. Base classes which form the main framework layer
contain relational data and pointers, but have very little (if
any) interaction. All of the framework interface is concentrated
in the second, management layer (manager classes - see Fig.5).
The main difference from the traditional framework is that,
in the new, two-layered framework, the application objects do
not inherit the framework interface. When operating on the framework
(traversing, adding or deleting relations), we use the interface
provided by the management layer.
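The two layers can be sketched with templates, as one possible realisation under stated assumptions. The names NodeLinks, EdgeLinks, City, Route, RoadMap, and AirMap, and the integer Id used to tell relations apart, are hypothetical; this is not the interface of any existing library. Application classes inherit one set of pointers per relation from the data layer, while each stateless manager class touches only the pointers that belong to its own relation.

```cpp
#include <cassert>

template <class N, class E, int Id> class Graph;

// Data layer: pointers only, no interface.
template <class N, class E, int Id>
class NodeLinks {
    friend class Graph<N, E, Id>;
    E *first = nullptr;       // head of the outgoing-edge list for relation Id
};

template <class N, class E, int Id>
class EdgeLinks {
    friend class Graph<N, E, Id>;
    E *next = nullptr;        // next edge leaving the same node in relation Id
    N *target = nullptr;      // target node in relation Id
};

// Management layer: interface only, no data.
template <class N, class E, int Id>
class Graph {
    using NL = NodeLinks<N, E, Id>;
    using EL = EdgeLinks<N, E, Id>;
public:
    void addEdge(N *n1, E *e, N *n2) {
        assert(n1 && e && n2);
        NL *from = n1;        // base-class conversion selects this relation's pointers
        EL *link = e;
        assert(!link->target);    // e must not already be in this relation
        link->target = n2;
        link->next = from->first;
        from->first = e;
    }
    E *firstEdge(N *n) { NL *from = n; return from->first; }
    E *nextEdge(E *e)  { EL *link = e; return link->next; }
    N *nextNode(E *e)  { EL *link = e; return link->target; }
};

// Application classes inherit the data for two independent relations.
class Route;
class City  : public NodeLinks<City, Route, 1>, public NodeLinks<City, Route, 2> {};
class Route : public EdgeLinks<City, Route, 1>, public EdgeLinks<City, Route, 2> {};

using RoadMap = Graph<City, Route, 1>;
using AirMap  = Graph<City, Route, 2>;
```

In this sketch the same Route object can be linked into a RoadMap and an AirMap at once, because each manager reads and writes a separate set of inherited pointers.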
It is interesting to see how this approach (see Fig.4)
relates to the role modelling described in [6]. Patterns P1, P2,
and P3 represent models, and classes A,B,... play certain roles
in these models. Classes C,E, and F play multiple roles.
Passive and active objects
You may ask why this new design style, which has so many obvious
advantages, is not already widely used today. Part of the problem
may be a Smalltalk legacy. Since "friends" are not available in
Smalltalk, implementing manager classes in this language either
runs into problems with the encapsulation of data, or into performance
degradation (function calls for any access to relational pointers).
Another reason may be that there is only one class library using
manager classes, the C++ Data Object Library from Code Farms Inc.,
and because this library uses a code generator, it may create
an impression that manager classes do not fit into the set of
regular features of the C++ language. That is not true, though.
Manager classes can be added to any other existing class library,
and they can be based on templates.
Perhaps the old design style is just a matter of habit. Everybody
designs programs that way, and adding an additional class may
appear unnecessary and redundant. For example, Chap.18.3 in [4]
mentions the possibility of name pollution when too many free-standing
operations are used. Note, however, that we are not adding
free-standing operations here: all the functionality is encapsulated
under the manager class.
Each manager class adds very little overhead (only one instance
of each manager class is actually used), but that may not be apparent.
I believe that the main problem is in how we educate programmers.
The following short episode demonstrates the situation.
While consulting for a large telecommunication project which
involves hundreds of programmers and millions of lines of code,
I pointed out that using manager classes would improve both software
architecture and system performance. After I explained the advantages
of manager classes, one of the programmers told me:
"What you recommend here is a system with many passive objects
and a few active objects that control them. This arrangement is
similar to old data structures (structured programming). I have
been taught to use active objects; I don't like your architecture."
This way of thinking is typical for code designers
who often forget about the other programmers who will have to
maintain the system in the years to follow.
We are at a crossroads similar to the one we faced when structured
programming was introduced 15 or 20 years ago. At that time, many programmers
believed that the freedom of arbitrary jumps (GO TO statements)
was essential to the art of programming. Only with time did the
advantages of structured programming become apparent. In a similar
vein, it's time to introduce more structure into object-oriented
programming, and manager classes provide exactly that.
Similar approaches
Coordinator classes from [4], Control classes from [5], and our
Manager classes (i.e. Pattern classes from [1]) are quite similar.
The difference is in the reason why these classes are introduced,
and also in the scope of what they cover. We assume that a manager
class will be used whenever two or more classes cooperate, and that
the manager class will encapsulate the entire interface related
to the cooperation. Our primary motivation is to avoid class dependency
cycles, and to represent relations as objects. We use only one instance
of each manager class which usually (but not necessarily) has a
global meaning just like class declarations; the lifespan of this
instance typically covers the entire program run. The interface
is always a part of the manager class.
On the other hand, [5] recommends that one use control objects
only when the behavior of the group is hard to assign to any of
the other object types (p.185): "The control objects typically
work as glue or cushions to unite the remaining objects so they
form one use case. They are typically the most ephemeral of all
the object types and usually last only during the performance
of one use case. It is, however, difficult to strike a balance
between what is placed in entity objects, control objects and
interface objects... Typical types of functionality placed in
the control objects are thus transaction-related behavior, or
control sequences specific to one or a few use cases, or functionality
to isolate the entity object from the interface objects."
We assume that each attribute stored in the passive objects
is managed by exactly one manager class. These attributes are
typically pointers that form lists and other inter-class relations.
This is not the assumption in [4], where the main concern is the
possibility of interference between coordinator classes (p.403):
"An intrinsic danger of compositional design is that since we
are always building objects by linking together other objects,
it is possible to lose track of where any link may actually lead.
This creates the potential for interference." This is very interesting,
since one of our primary reasons for the use of manager classes
is separation of patterns, and a complete prevention of interference.
In a private communication, Doug Lea suggested that a paper
like this should answer two questions:
Question 1: When do you need manager classes?
Question 2: When should objects be pure data-slaves?
Answer 1: You need to use manager classes when two or
more classes depend on each other, and when there is no clear
candidate class for the interface representing the cooperation.
However, I recommend using a manager class in any situation where
two or more classes cooperate or are involved in pointer relations.
This includes simple collections which I like to represent as
a triplet of classes: the first application class which holds
the collection, typically by containing a pointer to the beginning
of a list; the second application class which forms the list using
embedded pointers; and then a manager class which provides the
methods and the interface. There may be more than one manager
class, for example the main manager class, and one or two iterators.
Answer 2: Objects should be pure data-slaves if their
only purpose is to build data structures or structural design
patterns. This may typically happen when building certain types
of frameworks, from which the application classes inherit their
relational behavior. Manager classes contain the relational part
of the architecture. If there is any other action to be performed,
the object will be smaller and have fewer methods, but it will
not be a pure data-slave.
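The collection triplet described in Answer 1 can be sketched as follows. The class names Company, Employee, Staff, and StaffIterator are hypothetical, and a singly linked list with insertion at the head is assumed. Company and Employee are pure data-slaves with respect to this relation; Staff is the main manager class and StaffIterator is a second, smaller manager.

```cpp
#include <cassert>

class Employee;
class Staff;
class StaffIterator;

class Company {               // first application class: holds the collection
    friend class Staff;
    friend class StaffIterator;
    Employee *first = nullptr;    // pointer to the beginning of the list
};

class Employee {              // second application class: forms the list
    friend class Staff;
    friend class StaffIterator;
    Employee *next = nullptr;     // embedded pointer
};

class Staff {                 // manager class: methods and interface
public:
    // Caller must ensure e is not already in a collection.
    void add(Company *c, Employee *e) {
        assert(c && e);
        e->next = c->first;
        c->first = e;
    }
    // Returns true if e was found in c's list and unlinked.
    bool remove(Company *c, Employee *e) {
        assert(c && e);
        for (Employee **link = &c->first; *link; link = &(*link)->next) {
            if (*link == e) {
                *link = e->next;
                e->next = nullptr;
                return true;
            }
        }
        return false;
    }
};

class StaffIterator {         // additional manager class: traversal
    Employee *cur;
public:
    explicit StaffIterator(Company *c) : cur(c->first) {}
    // Returns the current element and advances; nullptr at the end.
    Employee *next() {
        Employee *e = cur;
        if (cur) cur = cur->next;
        return e;
    }
};
```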
Practical experience
Data structures based on manager classes have been commercially
distributed by Code Farms for over 5 years, and have been successfully
used by hundreds of companies, and on projects exceeding 100k lines
of code. Programmers who have been working with this method for
several years generally find three positive things about this approach:
(a) It permits generation of production-grade code in
the time normally required for rapid prototyping. Manager classes
(and related inter-class relations) can be added or removed without
upsetting existing software.
(b) Debugging is much improved, because the design can
be tested in smaller, more manageable modules.
(c) The resulting software is easier to maintain; manager
classes remain in the code and record the architecture.
Example 7: Universal CAD interface
High Level Design Systems of Santa Clara, California, designed
a universal interface, which connects various commercial VLSI
design systems together. The result is that, today, it is possible
to combine algorithms from different, usually competing commercial
vendors. The heart of the HLDS system is a memory-resident framework
called "Pillar" which describes and manipulates objects such as
transistors, circuit cells, and connection wires running at multiple
layers. Pillar stores the physical layout (the geometry of all
objects, their locations, and rotations), electrical information
(power consumption, resistance, capacitance, signal delay, etc.),
and the connectivity of the circuit. Pillar includes about 30
application classes used in 50 aggregates and other data organizations
(50 manager classes).
Pillar supports both automatic layout systems and an interactive
graphics editor. Using a library based on manager classes, the
basic Pillar framework was designed by one person in 3 months.
By comparison with several similar past projects, the work had
been estimated to take 9 months. The program has 15,000 lines of
C++, compared to 60,000 lines of C for an earlier version. An
important function of Pillar is to provide fast color display
of massive VLSI data. The data are stored hierarchically; a typical
design includes 100,000 objects in 2 MB of memory. Recently, the
system was used to design a major microprocessor, which required
500 MB of data.
Example 8: Large CASE graphics system
One of the best CASE systems I know (S-CASE, Multi-Quest Corp.,
Ill.) has been designed using this technique. Internally, S-CASE
uses about 600 classes in up to 7 inheritance levels, including
50 manager classes; each manager class controls a group of 2-3
cooperating classes. However, the central database-like core involves
only 44 application classes controlled by 31 manager classes.
This is a situation quite similar to Pillar, with its densely interconnected
framework in the heart of the architecture. S-CASE runs on Sun
SPARCstation under OPEN LOOK, on HP 9000 under OSF/Motif, on
IBM-PC under MS Windows, and on Apple Macintosh. The entire system
includes 100,000+ lines of C++, and was coded and polished to
a commercial release by 3 programmers in 4 years.
References
[1] Soukup J.: Taming C++, Pattern Classes and Persistence for
Large Projects, Addison-Wesley 1994, ISBN 0-201-52826-6
[2] Gamma E., Helm R., Johnson R., Vlissides J.: Design Patterns,
Elements of Reusable Object-Oriented Software, Addison-Wesley 1994,
ISBN 0-201-63361-2
[3] Soukup J.: Implementing Patterns, PLoP'94 Conference, also
listed in "Pattern Languages of Program Design", Addison-Wesley
1995, ISBN 0-201-60734-4.
[4] de Champeaux D., Lea D., Faure P.: Object-Oriented System
Development, Addison-Wesley 1993, ISBN 0-201-56355-X
[5] Jacobson I., Christerson M., Jonsson P., Overgaard G.: Object-Oriented
Software Engineering - A Use Case Driven Approach, Addison-Wesley
1992, ISBN 0-201-54435-0
[6] Reenskaug T.: Working With Objects, to be published by
Manning/Prentice Hall, 1995