Conrad Weisert, August 15, 2003
©2003 Information Disciplines, Inc.
This article may be freely circulated, as long as the copyright credit is included.
We're continuing to follow up on Koenig & Moo's excellent advice
on character-string representation and manipulation in this month's C/C++ Users Journal.
In part one we examined how a programmer can use
old-fashioned C character strings (arrays of char)
in a disciplined and relatively safe way.
In pointing out the superiority of a C++ string class over C's crude character manipulation facilities, the authors showed this example:
Pure C version:

    struct Person {
        char name[30];
        char address[50];
    };

C++ / STL version:

    struct Person {
        string name;
        string address;
    };
Which of those is "better"? Which is appropriate in an object-oriented program?
Of course, neither is appropriate. The pure C version may be worse than the C++ / STL version, but both versions seriously violate the letter and spirit of OOP.[1]
In the first version an address is a string of maximum length 49.
In the second version an address is a string of arbitrary length.
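To make that difference concrete, here is a minimal sketch of the two layouts. The names `PersonC`, `PersonCpp`, and `set_address` are hypothetical, introduced only for this illustration; they are not from Koenig & Moo's article.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Pure C layout from the example: address holds at most 49 characters
// plus the terminating '\0'.
struct PersonC {
    char name[30];
    char address[50];
};

// C++ / STL layout from the example: no built-in length limit.
struct PersonCpp {
    std::string name;
    std::string address;
};

// Hypothetical helper: copy text into the fixed buffer. Anything that
// doesn't fit is silently truncated; the terminator is always written.
inline void set_address(PersonC& p, const char* text) {
    std::strncpy(p.address, text, sizeof p.address - 1);
    p.address[sizeof p.address - 1] = '\0';
}
```

Passing an 80-character string to `set_address` leaves 49 characters in the buffer, while assigning the same text to `PersonCpp::address` keeps all 80. Neither behavior says anything about what an address actually is.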
Some obvious questions:
Every systems analyst, programmer, and data administrator knows that an
address has structure. That structure is an obvious candidate for
standardization within an organization, whether or not the organization
follows an object-oriented approach to software design and development.
Designing a class or struct for Address is not only desirable, it's mandatory. Failure to do so is a serious design flaw.
So, we might replace the earlier examples by this:
    struct Person {
        string name;
        Address address;
    };
but that's not all.
Address class or structure

We're so well acquainted with mailing addresses that it's tempting to code without much reflection something like this:
    struct Address {
        string street;
        string city;
        string state;
        string zip;
    };
but unfortunately, that structure is loaded with problems. For one thing it's limited to one particular kind of mailing address, a U.S. address that isn't a post-office box. Isn't this a perfect example of an is-a hierarchy?
In addition, we see these problems:
- Is street an array? Should we define multiple fields street1, street2, etc.? Or should we embed a line-feed character or slash in the single string to signify the break?
- There is no limit on the length of city and other fields that may have to fit on a mailing label or a screen form. It's customary and good practice to limit the maximum size of all these fields.
- The state field is redundant, implied by the zip code.[2] We might want a data-entry user to enter it for consistency checking, but it's silly to store it in the object.
We'll refrain from suggesting specific solutions here, since there are many valid ones. But we have to do something to resolve those issues.
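Purely as an illustration of one such option, not a recommendation, the sketch below stores the street as a bounded list of lines, enforces a maximum field length, and computes the state from the zip code instead of storing it. The limits, the names `kMaxStreetLines`, `kMaxLineLength`, and `stateForZip`, and the two-row zip table are all assumptions; a real table would cover every zip-code range.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical limits; an organization would set its own standards.
constexpr std::size_t kMaxStreetLines = 3;
constexpr std::size_t kMaxLineLength  = 40;

// Tiny illustrative sample only, keyed on the first three zip digits.
struct ZipRange { int lowPrefix; int highPrefix; const char* state; };
constexpr ZipRange kZipRanges[] = {
    {600, 629, "IL"},
    {900, 961, "CA"},
};

// Derive the state abbreviation from a zip code of five or more characters.
inline std::string stateForZip(const std::string& zip) {
    if (zip.size() < 5) throw std::invalid_argument("zip code too short");
    for (std::size_t i = 0; i < 5; ++i)
        if (!std::isdigit(static_cast<unsigned char>(zip[i])))
            throw std::invalid_argument("zip code must start with 5 digits");
    const int prefix = std::stoi(zip.substr(0, 3));
    for (const ZipRange& r : kZipRanges)
        if (prefix >= r.lowPrefix && prefix <= r.highPrefix) return r.state;
    throw std::out_of_range("zip code not in the sample table");
}

class USA_Address {
public:
    USA_Address(std::vector<std::string> streetLines, std::string city,
                std::string zip)
        : streetLines_(std::move(streetLines)), city_(std::move(city)),
          zip_(std::move(zip)) {
        if (streetLines_.empty() || streetLines_.size() > kMaxStreetLines)
            throw std::length_error("bad number of street lines");
        for (const std::string& line : streetLines_) checkLength(line);
        checkLength(city_);
        stateForZip(zip_);  // validates the zip code; the result is not stored
    }

    const std::vector<std::string>& streetLines() const { return streetLines_; }
    const std::string& city() const { return city_; }
    const std::string& zip() const { return zip_; }
    std::string state() const { return stateForZip(zip_); }  // computed on demand

private:
    static void checkLength(const std::string& s) {
        if (s.empty() || s.size() > kMaxLineLength)
            throw std::length_error("field is empty or too long");
    }

    std::vector<std::string> streetLines_;
    std::string city_;
    std::string zip_;
};
```

Because the constructor rejects invalid values, every `USA_Address` object that exists satisfies the size limits, and a data-entry program can still compare a user-supplied state against `state()` for consistency checking.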
No. We agree that addresses have structure, but so do people's names. Representing a person's name by a character string (or by two or three character strings) is naive, inflexible, inconsistent, and error-prone. Taking a top-down design approach, we can rewrite the original example:
    struct Person {
        PersonName name;
        USA_Address address;
    };
You can work out the definitions of the two member classes as a not-so-trivial exercise. Once that's done, your organization will find them useful in many applications.
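As a starting point for that exercise, here is a minimal sketch of what PersonName might look like. The decomposition into given, middle, and family names plus a suffix, and the member names `displayForm` and `sortForm`, are assumptions; real naming conventions vary far more than this handles.

```cpp
#include <string>
#include <utility>

// Hypothetical decomposition of a person's name into parts.
class PersonName {
public:
    PersonName(std::string given, std::string family,
               std::string middle = "", std::string suffix = "")
        : given_(std::move(given)), family_(std::move(family)),
          middle_(std::move(middle)), suffix_(std::move(suffix)) {}

    // "Given Middle Family, Suffix", omitting the parts that are empty.
    std::string displayForm() const {
        std::string out = given_;
        if (!middle_.empty()) out += " " + middle_;
        out += " " + family_;
        if (!suffix_.empty()) out += ", " + suffix_;
        return out;
    }

    // "Family, Given", the form used for alphabetical listings.
    std::string sortForm() const { return family_ + ", " + given_; }

    // Order names by their sort form.
    friend bool operator<(const PersonName& a, const PersonName& b) {
        return a.sortForm() < b.sortForm();
    }

private:
    std::string given_, family_, middle_, suffix_;
};
```

Even this small class removes a common source of inconsistency: every program that uses it formats and sorts names the same way.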
There remain serious questions about the anemic
Person structure itself.
Those issues, however, are well beyond the scope of either this article
or Koenig & Moo's original, and we won't go into them here.
In the next article, we'll look at some shortcomings of the standard C++
string class that render it
unsuitable for representing some text fields, and discuss
practical alternatives.
1 -- Of course Koenig & Moo's purpose was limited to the details of string manipulation, so it wouldn't be fair to condemn the authors for violating higher-level design criteria. Still, since the example may mislead less experienced readers, it shouldn't stand unchallenged.
2 -- IDI's reusable component library contains the tables relating zip-code ranges to state names and state abbreviations, as well as BASIC subroutines to search them. Call or E-mail us if you're interested.
Last modified August 16, 2003