Implementing complex relations between applications classes is an important part of Object-Oriented Design. Handling many-to-many relations can be a real challenge for application programmers. To see why this is the case, let's consider two classes: Student and Course. Each Student may be enrolled in several Courses and each Course might contain several Students. If we use these two classes in a registration system then such operations as canceling a class or deleting a student become inherently non-trivial. If we are to delete a student from the system, we would have to remove this student from all classes she is currently enrolled in. This is a classic example of maintaining data integrity. Another important question is where should we keep such information as a student's grade for a class. Should it be kept in either Class or Student, or perhaps we need to introduce another class for this particular purpose?
One of the benefits of the OO approach is its support for reusability. Class libraries are designed to solve some common programming problems. The existing commercial class libraries provide support for many standard data-structures and algorithms. Using a library can result in significant time savings spent in development and debugging which often results in increased productivity and more robust applications.
We would like to investigate how the common C++ libraries today help programmers deal with the challenges of handling the many-to-many relations between application classes. The class examples we will consider are: Data Object Library and Pattern Template Library from Code Farms Inc., Standard Template Library, and Microsoft Foundation Classes.
In order to accomplish the task we will design a complete working example which reads some data and does something reasonable with it. Then we will implement the example using each of the aforementioned class libraries. This will allow us to measure the performance of different implementations. We would like to assess such design issues as: efficiency, abstractions, simplicity, generality, the need for macros and non-standard language extensions, and the use of code-generators.
In our assessment of these issues, we would like to concentrate on the support each class library offers to the programmer rather than on the internal structure and implementation of the libraries.
The question of dealing with many-to-many associations has been tackled by many authors who wrote about Object-Oriented Design. Thus James Rumbaugh et. al [1] defines three approaches to implement the two-way associations:
As far as the implementation of Link Attributes (Student grades and attendance), the authors suggest that "the best approach is usually to implement the association as a distinct class, in which each instance represents one link and its attributes".
In our example, the attributes will be accessed equally in both directions and accesses are as common as updates. Those requirements rule out the first two approaches leaving us with the third one. Also having a distinct class to store a link and its attributes answers the question of where we are going to store grades and attendance information.

Every major programming language comes with a standard library of reusable methods that can be used by application programmers. Over the years, it was identified which programming paradigms were used most of the time. So it made sense to provide a standard implementation of them in standard libraries. For instance, it is fair to say that almost every application of a reasonable size at one time or another needs to compare and copy strings. That's how strcmp() and strcpy() came into existence.
The object-oriented methodology is commonly recognized as a preferred way to structure large, extensible, and flexible software. It brought code reusability to a new level. Now application programmers are provided with class libraries and frameworks. These class libraries have a set of interacting classes that offer solutions to most of the standard programming tasks. Since the classes in such libraries are tightly coupled, it is rarely possible to use just a single class in an application. For example, if one chooses to use one of the standard containers in STL, it is almost guaranteed that some other STL iterators and algorithms classes will be used as well. Very often large applications might use several class libraries at the same time. For example, when writing Windows applications, one might choose to use MFC for Windows OS specific functionality and STL for data manipulation logic. Using class libraries relieves application programmers from worrying about many implementation details. This might provide for an opportunity to think on a higher level of abstraction and thus increase productivity resulting in more flexible and extensible implementations.
Class libraries are as different as the programming tasks they are intended to accommodate. For instance, the primary purpose of Microsoft Foundation Classes was to encapsulate the enormous Windows API hiding most of its complexities and inconsistencies. MFC also provides several standard containers which were originally written only to support Windows classes. For example, Frame needs a some kind of container to hold its Views. We believe that's how a set of standard containers came into existence in MFC. Quite the opposite, the primary goal in the design of Standard Template Library was to provide a standard library for the C++ programming language. According to Bjarne Stroustrup, "a standard library is something that every implementer must supply so that every programmer can rely on it". The design of STL was determined by the following three goals:
The Data Object Library from Code Farms Inc. was produced to gain benefits from using intrusive datastructures that had been neglected in most C++ libraries. The Pattern Template Library also from Code Farms Inc. was designed to provide a framework for implementing some of the Design Patterns.
It seems like implementing associations between application classes is a common programming task in Object-Oriented development. It just makes sense that the major class libraries should provide some support for application programmers in handling this non-trivial programming task. In the next section, we'll explore how the design outlined in the previous section can be implemented in four common commercial C++ class-libraries.
In order to assess the services provided by each class library, we implement a simple registration system with a very limited functionality. This registration system allows students to add and drop classes, list all classes for a particular student, and delete students from the system. Despite the fact that the functionality of the system is very limited for a real world application, it allows us to specifically concentrate on handling the many-to-many associations between students and classes.
We would like to express thanks to Dr. Jiri Soukup, author of DOL and PTL, who provided code samples which helped us in the design of our test programs. In the original implementation, Dr. Soukup defined the main function for the example which is shared by all four implementations. All the many-to-many associations related functionality is encapsulated by the School class interface.
class School {
public:
//Helper function
static char* getName(int i,char* buf);
//Add a new class to the system given the class' name
Class* add_class(const char* name);
//Add a new student to the system given the student's name
Student* add_student(const char* name);
//Find a student give a name
//If not found, return NULL
Student* find_student(const char* name);
//List the classes this student is registered for
void list_student_classes(Student* sp);
//Find a Takes relation for the student if she is registered
//for a class named class_name
//If not found return 0
Takes* find_student_class(const Student* sp, const char* class_name);
//Find class with this name
Class* find_class(const char* name);
//Add the class for this student
void add_student_class(Student* sp, Takes *tp, Class * cp);
//Delete the class (Takes relation) for a student
void delete_student_class(Takes* tp);
//Delete the student from the system
void delete_student(Student* sp);
};
Not all the methods in the School class deal with the handling of the many-to-many associations. The methods that we will look at are: list_student_classes(), find_student_class(), add_student_class(), delete_student_class(), and delete_student().
Before we look the implementations of these methods using the four libraries, let's review how each class library organizes its data.
The DOL is the only library that treats all data structures as objects. This is the approach that Dr. Soukup [3] has been advocating for a long time. DOL uses a code generator to create the actual data structure class definitions. What's unique about this data organization is that DOL specifically provides support for many-to-many associations by providing ZZ_HYPER_M_TO_N. The relation takes three parameters and directly supports our design. The implementation provides iterators to traverse through the collections of students and classes, and the relation between the two. The NAME data organization in DOL is similar to the String class in other libraries..
// ----------- OTL data organization -------------- ZZ_HYPER_UTILITIES(util); ZZ_HYPER_M_TO_N(scRelation,Student,Takes,Class); ZZ_HYPER_NAME(studentName,Student); ZZ_HYPER_NAME(className,Class); ZZ_HYPER_DOUBLE_COLLECT(students,School,Student); ZZ_HYPER_DOUBLE_COLLECT(classes,School,Class); static students_iterator sit; static classes_iterator cit; static scRelation_sIterator sourceIter; // --------------------------------------------
The PTL data organization is very similar to the one of DOL. It provides the same facilities. The only difference is that everything that the DOL accomplished via the means of a code generator is done in the PTL implementation using pure C++ templates. The very fact that the PTL has an in-built support for many-to-many associations by having such template classes as ManyToManySource, ManyToManyTarget, ManyToManyRelation, proves that such support can be added to a class library without using code generators.
class School : public CollectionParent<School,Student>,
public CollectionParent<School,Class> {
...
};
class Student : public CollectionChild<School,Student>,
public ManyToManySource<Student,Takes,Class> {
...
};
class Class : public CollectionChild<School,Class>,
public ManyToManyTarget<Student,Takes,Class> {
...
};
class Takes : public ManyToManyRelation<Student,Takes,Class> {
...
}
// ----------- data organization --------------
ManyToMany<Student,Takes,Class> scRelation;
Collection<School,Student> students;
Collection<School,Class> classes;
// --------------------------------------------
...
CollectionIterator<School,Student> sit;
CollectionIterator<School,Class> cit;
ManyToManySourceIterator<Student,Takes,Class> sourceIter;
...
The STL data organization consists of three lists. We choose a list instead of a vector for all three collections because the former is specifically optimized for insertions and deletions from any position in the collection.The data structures are declared as private members in the School class. Since all STL containers come with iterators, the very declaration of the three lists gives us the corresponding iterators automatically.
// ----------- STL data organization
--------------
typedef list
<Student*>
Student_list;
typedef list <Class*> Class_list;
typedef list <Takes*> Takes_list;
// School encapsulates your
problem
class School {
Student_list lstStudents;
Class_list lstClasses;
Takes_list lstSC;
...
};
//
--------------------------------------------
The MFC data
organization is very similar to the one of STL. The TypedPtrLis class provides a
type-safe "wrapper" for objects of classCPtrLis. When you use CTypedPtrList
rather than CObList or CPtrList, the C++ type-checking facility helps eliminate
errors caused by mismatched pointer types. Notice that this is the default
behavior in STL. And of course MFC doesn't provide iterators for its containers.
This shortcoming can be easily overcome if we use a technique presented in [5]
where the author suggests a way to use STL iterators with MFC containers.
// ----------- MFC data organization --------------
typedef CTypedPtrList<CPtrList, Student*> Student_list;
typedef CTypedPtrList<CPtrList, Class*> Class_list;
typedef CTypedPtrList<CPtrList, Takes*> Takes_list;
// School encapsulates your problem
class School {
Student_list lstStudents;
Class_list lstClasses;
Takes_list lstSC;
...
};
// --------------------------------------------
Below is the implementation of the functions dealing with many-to-many associations written using DOL, STL, and MFC. Since PTL and DOL only differ in their data definitions having almost identical identical implementations of all the functions of the School class, all the DOL-related discussion from that point on also applies to PTL unless noted otherwise.
The function list_student_classes() takes a Student pointer as a parameter and prints the names of classes this Student is registered for. DOL offers a convenient way to traverse through the Takes relationships for a given student. The source iterator is initialized with the student pointer and then the ITERATE macro walks through all Takes for this Student. The Student name is then retrieved from the className class.
The STL implementation takes the advantage of the traversal function for_each to go through the list of Takes. The STL's for_each() template function applies a given predicate to each member of a sequence. In this case, the predicate is implemented with a class with an overloaded parenthesis operator. This is a common approach in cases when the function needs to store a value that is used for comparison when the function call operator is invoked. Another alternative would be to declare such value global which would violate the encapsulation principle. The efficiency of implementation depends entirely on how well the C++ compiler can inline the function call to the overloaded operator.
Since MFC doesn't provide any iterators classes per se, it still has an elegant way to traverse through its containers. This is accomplished via the means of a handle which is stored in a variable of type POSITION. This variable is initialized with the call GetHeadPosition(). After that the traversal is accomplished with the call to GetNext().
Even though it seems that STL provides quite an elegant way to traverse a linked list with its for_each algorithm, its implementation involves the creation of another object that might prove to be quite expensive. The DOL implementation impresses with its elegant use of iterators when a pointers retrieved from one iterator is used to initialize another iterator.
/**list_student_classes
List the classes this student is registered for
*/
//DOL
void School::list_student_classes(Student * sp){
Takes *tp;
Class *cp;
printf("registered in:\n");
sourceIter.start(sp);
ITERATE(sourceIter, tp){ // walk through all Takes's of sp
cp=scRelation.target(tp);
printf("%s\n", className.fwd(cp));
}
printf("\n");
}
//STL
class Print_Classes{
Student *sp;
public:
Print_Classes(Student *p) : sp(p) {}
inline void operator() (Takes * tp){
if(EQ_STR(tp->GetStudent()->GetName(), sp->GetName()))
printf("%s\n", tp->GetClass()->GetName());
}
};
void School::list_student_classes(Student * sp){
printf("%s is enrolled in:\n", sp->GetName());
Print_Classes pc(sp);
for_each(lstSC.begin(), lstSC.end(), pc);
}
//MFC
void School::list_student_classes(Student * sp){
printf("%s is enrolled in:\n", sp->GetName());
Takes * tp;
POSITION pos = lstSC.GetHeadPosition();
while(pos != NULL){
tp = lstSC.GetNext(pos);
if(EQ_STR(tp->GetStudent()->GetName(), sp->GetName()))
printf("%s\n", tp->GetClass()->GetName());
}
}
The same observations that were made about the list_student_classes() function implementations, can be applied to the ones of the find_student_class() function.
/////////////////////////////////////////////////////////////////////
/** find_student_class
Find a Takes relation for the student if she is registered
for a class name class_name
If not found return 0
*/
//DOL
Takes* School::find_student_class(Student *sp, const char * name){
Class *cp;
Takes *tp;
sourceIter.start(sp);
ITERATE(sourceIter, tp){ // walk through all Takes's of sp
cp = scRelation.target(tp);
if(!strcmp(name, className.fwd(cp)))break;
}
return tp;
}
//STL
class same_Class{
const Student * sp;
char class_name[1<<5];
public:
same_Class(const Student *s, const char* name) : sp(s) {
strcpy(class_name, name);
}
bool operator() (Takes * tp){
return (EQ_STR(tp->GetStudent()->GetName(), sp->GetName()) &&
EQ_STR(tp->GetClass()->GetName(), class_name));
}
};
Takes* School::find_student_class(const Student *sp, const char * class_name){
same_Class sc(sp, class_name);
Takes_list::iterator i = find_if(lstSC.begin(), lstSC.end(), sc);
return (i == lstSC.end()) ? NULL : *i;
}
//MFC
Takes* School::find_student_class(const Student *sp, const char * class_name){
Takes * tp;
POSITION pos = lstSC.GetHeadPosition();
while(pos != NULL){
tp = lstSC.GetNext(pos);
if(EQ_STR(tp->GetStudent()->GetName(), sp->GetName()) &&
EQ_STR(tp->GetClass()->GetName(), class_name))
return tp;
}
return NULL;
}
Even though the add_student_class() function seems not to be doing much, it's worth comparing the implementations of it. All three of them create the Take object first. However, only the DOL implementation uses three pointers for the add operation. The reason for it is that DOL directly supports many-to-many associations and provides the appropriate data structure for this purpose (scRelation in this case).
////////////////////////////////////////////////////////////////////
/** add_student_class
Add the class for this student
*/
//DOL
void School::add_student_class(Student * sp, Takes * tp, Class * cp){
tp = new Takes; // many-to-many involves 3 object types!!
scRelation.add(sp,tp,cp);
}
//STL
void School::add_student_class(Student * sp, Class * cp){
Takes * tp = new Takes(sp, cp);
lstSC.push_back(tp);
}
//MFC
void School::add_student_class(Student * sp, Class * cp){
Takes * tp = new Takes(sp, cp);
lstSC.AddTail(tp);
}
What's interesting about the delete_student_class() implementations is that both DOL and STL provide functions to handle the removal of the Takes relation from the appropriate object. Thus in those two libraries it can be accomplished in one statement. The MFC implementation is much longer since it has to iterate through the list in order to find an iterm to remove.
//////////////////////////////////////////////////////////////
/** delete_student_class
Delete the class (Takes relation) for a student
*/
//DOL
void School::delete_student_class(Takes * tp){
scRelation.del(tp); // disconnect tp
delete tp;
}
//STL
void School::delete_student_class(Takes * tp){
lstSC.remove(tp);
delete tp;
}
//MFC
void School::delete_student_class(Takes * tp){
Takes * t;
POSITION pos = lstSC.GetHeadPosition();
while(pos != NULL){
t = lstSC.GetNext(pos);
if(t == tp){
lstSC.RemoveAt(pos);
delete tp;
break;
}
}
}
The delete_student function implementation is the most involved and probably most interesting to analyze and compare. In order to delete a student from the system, we have to remove that student from all the classes she is enrolled in first.
In order to measure the performance of the implementations, we have generated an unrealistic test case where a student to be removed was registered for 100,000 classes. Since different libraries might choose to provide alternative implementations of the global delete operator, the delete statements were commented out for the experiment. If a library, for example, allocates a large chunk of memory for a program so that operators new and delete don't result in a system call for every invocation, this immediately compromises the accuracy of a timing experiment. If we chose to perform the measurements leaving the delete statements where they are, then the time of destructing the objects might overwrite the time for the remove operation that we are trying to measure. The test environment used was a Pentium 166 machine running Windows NT 4.0 Service Pack 5. All three test programs were compiled using Visual C++ 5.0 and run twice: the first time all optimizations were turned off; the second time the speed optimization option was specified. The results are summarized in the comparison table. DOL shows the fastest running times in both optimized and non-optimized runs. This happens because internally the DOL's many-to-many datastructure doesn't have to do any searching prior to removal. Since the internal implementation of the libraries is not a topic of this discussion, we won't elaborate on this any further. What's interesting is that both STL and MFC perform comparably in optimized runs. However, in non-optimized runs, the STL version takes almost twice as much as the MFC one. What this means is that the Visual C++ compiler and perhaps most of the modern C++ compilers are well-equipped with optimizing various C++ constructs such as inlining member function calls. STL was designed to rely on availability of such optimizations.
We don't think that the sheer performance factor can be a single determining factor when deciding which library to use for a given project. High-performance applications rarely use class libraries anyway. What's more important is how the use of a class-library promotes building robust, flexible, and extensible applications.
What the delete_student function() does is quite a challenging programming paradigm. It has to traverse through a list of Takes object pointers and remove the ones that contain a given Student. This is not a trivial task at all. Please, check the following hyperlink for the discussion of why this is the case:
It should be clear in this case that DOL provides the most natural and concise facility to accomplish the deletion of items from the container that is being iterated. The reason for it is because DOL iterators have an in-built support for this operation. In the STL implementation, the existence of the remove_if() template function doesn't make things any more straightforward. The STL implementation is about three times larger than the one implemented in DOL requiring an extra class and a template function. The MFC implementation requires to keep an extra POSITION variable that makes it harder to understand.
////////////////////////////////////////////////////////
/** delete_student
Delete the student from the system
*/
//DOL
void School::delete_student(Student * sp){
Takes * tp;
sourceIter.start(sp); // delete all class links of this student
ITERATE(sourceIter,tp){ // walk through all Takes's of sp
scRelation.del(tp); // disconnect tp
delete tp; // yes, this is allowed within the loop!
}
students.del(this, sp);
}
//STL
template <class Cont>
class erase_elem{
Cont & cont;
public:
erase_elem(Cont& c) : cont(c) {}
inline void operator () (Cont::iterator it){
cont.erase(it);
delete *it;
}
};
template <class FwdIt, class Pred, class Pred1>
void real_remove_if(FwdIt first, FwdIt last, Pred pr, Pred1 destroy_pr){
FwdIt i = remove_if(first, last, pr);
while(i != last){
FwdIt del = i ;
i++;
destroy_pr(del);
}
}
void School::delete_student(Student * sp){
// delete all class links of this student
erase_elem <Takes_list> ee(lstSC);
same_Student ss(sp);
real_remove_if(lstSC.begin(), lstSC.end(), ss, ee);
//delete student
lstStudents.remove(sp);
delete sp;
}
//MFC
void School::delete_student(Student * sp){
//delete Student's Takes relations
Takes * t;
POSITION pos = lstSC.GetHeadPosition();
while(pos != NULL){
POSITION del = pos;
t = lstSC.GetNext(pos);
if(EQ_STR(t->GetStudent()->GetName(), sp->GetName())){
lstSC.RemoveAt(del);
delete t;
}
}
pos = lstStudents.Find(sp);
//delete Student
lstStudents.RemoveAt(pos);
delete sp;
}
The following table summarizes the general observations about the support for many-to-many associations provided by the libraries we considered.
| DOL | PTL | STL | MFC | |
| Has a many-to-many datastructure | Yes | Yes | No | No |
| Has iterators | Yes | Yes | Yes | No |
| Has "smart iterators (see 1) | Yes | Yes | No | No |
| Uses code-generators | Yes | No | No | No |
| Time to remove a student "enrolled" in 100,000 classes (non-optimized version) | 180ms | N/A | 590ms | 330ms |
| Time to remove a student "enrolled" in 100,000 classes (optimized version) | 130ms | N/A | 230ms | 230ms |
| Portable to different platforms | Yes | Yes | Yes | No |
1) "Smart" iterators support the deletion of items from a container being iterated
The Data Obejct Library uses a code generator that generates the macro expansion code that allows to have a fairly efficient code while having most of the complexity hidden from application programmers. The Pattern Template Library achieves the same results by using templates. What's important that the latest version of PTL has a many-to-many datastructure that can be used by application programmers.
Only DOL and PTL have a built-in support for many-to-many associations with an appropriate datastructure and a set of corresponding iterators. The many-to-many datastructure can be built using the existing classes in both MFC and STL. However, it takes a mature designer well-versed in the Object-Oriented Design subtleties to do this properly. Even in this case, the resulting implementations might prove to have certain shortcomings not being always elegant and concise.
This brings us to a conclusion that any major general purpose class library designed in the future, has to provide some support for implementing various associations between application classes. Being able not to reinvent the wheel and to reuse such provided solution is bound to result into more robust, flexible, and extensible software. How a class library should be organized to have this support implemented in an efficient manner is a topic for another discussion.
James Rumbaugh Object-Oriented Modeling and Design, Prentice Hall, 1991
Jiri Soukup Taming C++: Pattern Classes and Persistence for Large Projects, Addison-Wesley, 1994
Jiri Soukup Writing Complex Software, Dr. Dobb's Journal, October 1999
Bjarne Stroustrup C++ Programming Language, Addison-Wesley, 1997
The working source code for the example used in the report.
I would like to thank Dr. Jiri Soukup for his considerable help with this report. In addition to motivating my interest in the topic, he also provided the code examples in both DOL and PTL and patiently answered my numerous questions.