by Conrad Weisert
July 1, 2012
© 2012 Information Disciplines, Inc.
This article may be circulated freely as long as the copyright notice is included.
A couple of years ago we noted that many programmers overlook
obvious opportunities to
define object-oriented classes for simple text data items. Given the power and ease of use
of the standard C++ and Java library classes, it's so convenient to make everything a raw
string that they do so unthinkingly.
Since then, we've encountered situations where even an otherwise sensibly-defined text data item
class contains an unrestricted
string member data item!
One young programmer vehemently argued that there should be no limit whatever on the length of
a person's name and no restriction on the characters that are acceptable in such a data item. If
a data-entry clerk entered 2000 asterisks in a name field, who are we (class designers) to impose
rules forbidding it?
Too-long data items were rarely a problem in older languages such as PL/I or even Cobol, since those languages provided a way to specify a fixed-length string. A PL/I programmer might code:
DECLARE CITY_NAME CHARACTER(18);
If a shorter string were assigned to CITY_NAME, it would automatically be padded with trailing blanks.
If a longer string were assigned to CITY_NAME, it would be truncated.
That's the way most programmers wanted text data to work, but if you wanted to do something else you could still use varying-length strings or code custom logic.
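The padding and truncation rules described above can be sketched in C++ as a small class template. This is only an illustration under our own assumptions: the name FixedString and its member functions are hypothetical and are not the interface of any library mentioned in this article.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>

// Hypothetical fixed-length string: it always holds exactly N characters,
// in the spirit of a PL/I CHARACTER(N) variable.
template <std::size_t N>
class FixedString {
public:
    // A new object starts out as N blanks.
    FixedString() { chars_.fill(' '); }

    explicit FixedString(const std::string& text) { assign(text); }

    // Copy at most N characters (truncation), then fill whatever is
    // left with trailing blanks (padding).
    void assign(const std::string& text) {
        const std::size_t n = std::min(text.size(), N);
        std::copy_n(text.begin(), n, chars_.begin());
        std::fill(chars_.begin() + n, chars_.end(), ' ');
    }

    static constexpr std::size_t length() { return N; }

    // Return the full N-character value, blanks included.
    std::string str() const { return std::string(chars_.begin(), chars_.end()); }

private:
    std::array<char, N> chars_;
};
```

With this sketch, FixedString<18> plays the role of CITY_NAME: assigning "Chicago" yields those seven characters followed by eleven blanks, and assigning a longer value keeps only its first 18 characters.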
Older programmers recall that C++ was available for general use long before the standard library
provided a string class. Not surprisingly, many
programmers developed their own string classes. Some of them, including a few textbook examples,
were naive and seriously flawed. A few, however, inspired by PL/I and other sound examples,
provided solid string-manipulation capability.
Since the earliest days of C++ our IDI library supported four separate but interacting C++ string classes:
|fixed-length string, like the above PL/I example|
|varying-length string up to a specified maximum; avoids reallocation as a string is built up from pieces.|
|dynamic string, like the current C++ and Java standard classes|
|constant length string contiguous with any record (composite object) of which it's a member|
Even though the newer standard library classes are probably more efficient (e.g. using "reference counting"), we still use our old library classes for situations where they simplify our code. In particular we rarely use the C++ library string class or our own Dstring class where it's important to enforce a fixed or maximum length.
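To make the second kind of string concrete, here is a minimal sketch of a varying-length string with a specified maximum. The name BoundedString, its members, and the choice to truncate silently and report it through a return value are all our assumptions for illustration; the sketch does not reproduce the IDI library classes.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical bounded string: the length varies from 0 to Max, but the
// storage is a fixed in-place buffer that is never reallocated.
template <std::size_t Max>
class BoundedString {
public:
    BoundedString() : buffer_(), size_(0) {}

    // Append as much of text as still fits.
    // Returns false if part of text had to be cut off.
    bool append(const std::string& text) {
        const std::size_t room = Max - size_;
        const std::size_t n = text.size() < room ? text.size() : room;
        for (std::size_t i = 0; i < n; ++i) {
            buffer_[size_ + i] = text[i];
        }
        size_ += n;
        return n == text.size();
    }

    std::size_t size() const { return size_; }
    static constexpr std::size_t capacity() { return Max; }

    // Return only the characters currently in use.
    std::string str() const { return std::string(buffer_.data(), size_); }

private:
    std::array<char, Max> buffer_;
    std::size_t size_;
};
```

Building a BoundedString<10> from the pieces "New " and "York" succeeds and gives "New York"; appending " City" afterwards returns false and leaves "New York C", because only two characters of room remain. A stricter design could throw an exception instead of truncating.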
One might argue that there's always some maximum size beyond which something must have gone horribly wrong if a program encounters a longer string. Nevertheless, we can and should make a useful distinction between:
Notwithstanding the claims of our naïve young colleague, in conventional applications
type a items always call for a specified size, either fixed or maximum. We can't
imagine anything good coming from a program encountering a 200-character
or a 600-character
Therefore, class definitions for elementary text data items should always enforce a length, either by using a fixed-length string class for the member data item or by explicit logic.
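The "explicit logic" alternative might look like the following sketch. Everything specific in it is an assumption made for illustration: the class name PersonName, the limit of 40 characters, and the set of acceptable characters (ASCII letters, blank, hyphen, apostrophe and period). A real application would choose its own limit and would need a more generous character rule for international names.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical elementary text data item. The maximum length and the
// permitted characters are illustrative choices, not rules from this article.
class PersonName {
public:
    static constexpr std::size_t maxLength() { return 40; }

    // Reject empty, too-long, or oddly-composed values at construction,
    // so that every PersonName object is known to be valid.
    explicit PersonName(const std::string& text) : value_(text) {
        if (text.empty() || text.size() > maxLength()) {
            throw std::invalid_argument("PersonName: length out of range");
        }
        for (unsigned char c : text) {
            const bool ok = std::isalpha(c) || c == ' ' || c == '-' ||
                            c == '\'' || c == '.';
            if (!ok) {
                throw std::invalid_argument("PersonName: unacceptable character");
            }
        }
    }

    const std::string& value() const { return value_; }

private:
    std::string value_;
};
```

Here the member data item is still a dynamic string, but the constructor is the only way to set it, so a clerk's 2000 asterisks are rejected with an exception instead of being stored.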
Last modified July 1, 2012