Standard string classes discourage standard data representations . . .

What ever happened to fixed-length strings?

by Conrad Weisert
July 1, 2012
© 2012 Information Disciplines, Inc.

This article may be circulated freely as long as the copyright notice is included.

A strange modern practice

A couple of years ago we noted that many programmers [1] were overlooking obvious opportunities to define object-oriented classes for simple text data items. The standard C++ and Java library classes are so powerful and easy to use that it's convenient to make everything a raw string, and many programmers do so unthinkingly.

Since then, we've encountered situations where even an otherwise sensibly defined class for a text data item contains an unrestricted string as its member data item! One young programmer vehemently argued that there should be no limit whatever on the length of, for example, a person's name and no restriction on the characters that are acceptable in such a data item. If a data-entry clerk entered 2000 asterisks in a name field, who are we (class designers) to impose rules forbidding it?

Old-fashioned technology

Too-long data items were rarely a problem in older languages such as PL/I or even Cobol, since those languages provided a way to specify a fixed-length string. A PL/I programmer might code:



That's the way most programmers wanted text data to work, but if you wanted to do something else you could still use varying-length strings or code custom logic.

C++ beginnings

Older programmers recall that C++ was available for general use long before the standard library classes, including std::string. Not surprisingly, many programmers developed their own string classes. Some of them, including a few textbook examples, were naive and seriously flawed. A few, however, inspired by PL/I and other sound examples, provided solid string-manipulation capability.

Since the earliest days of C++ our IDI library supported four separate but interacting C++ string classes:

Fstring: fixed-length string, like the above PL/I example
Vstring: varying-length string up to a specified maximum; avoids reallocation as a string is built up from pieces
Dstring: dynamic string, like the current C++ and Java standard classes
Cstring: constant-length string contiguous with any record (composite object) of which it's a member
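To make the fixed-length idea concrete, here is a minimal sketch of a class with PL/I-style assignment semantics: a short value is padded with blanks and a long one is truncated. The name FixedString and its member functions are hypothetical and chosen only for illustration; this is not the IDI Fstring class, whose interface isn't shown in this article.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>

// Hypothetical fixed-length string: always holds exactly N characters.
// Assignment pads short values with blanks and truncates long ones,
// as PL/I does for a CHARACTER(N) variable. Not the IDI Fstring class.
template <std::size_t N>
class FixedString {
public:
    FixedString() { data_.fill(' '); }
    explicit FixedString(const std::string& s) { assign(s); }

    FixedString& operator=(const std::string& s) {
        assign(s);
        return *this;
    }

    static constexpr std::size_t length() { return N; }

    // Full N-character value, including any trailing blanks.
    std::string str() const { return std::string(data_.begin(), data_.end()); }

    // Value with trailing blanks removed, e.g. for display.
    std::string trimmed() const {
        std::string s = str();
        s.erase(s.find_last_not_of(' ') + 1);
        return s;
    }

private:
    void assign(const std::string& s) {
        const std::size_t n = std::min(s.size(), N);
        std::copy(s.begin(), s.begin() + n, data_.begin());  // keep first n chars
        std::fill(data_.begin() + n, data_.end(), ' ');      // blank-pad the rest
    }

    std::array<char, N> data_;
};
```

With this sketch, a FixedString<8> constructed from "Chicago" holds "Chicago" followed by one blank, and assigning "New Orleans" to it leaves "New Orle". Silent truncation is only one possible policy; a class designer might prefer to reject an over-long value instead.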

Even though the newer standard library classes are probably more efficient (e.g. using "reference counting"), we still use our old library classes for situations where they simplify our code. In particular, we rarely use the C++ library string class or our own Dstring class where it's important to enforce a fixed or maximum length.
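A maximum length can be enforced in a similar spirit. The following is a rough sketch, under our own assumptions, of a varying-length string with a fixed capacity: storage is set aside once, so building the string up from pieces never reallocates, and an append that would exceed the maximum is rejected. The name BoundedString, its member functions, and the choice to throw std::length_error are all hypothetical; they are not taken from the IDI Vstring class.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical varying-length string limited to Max characters.
// Illustrative only; not the IDI Vstring class.
template <std::size_t Max>
class BoundedString {
public:
    BoundedString() : size_(0) {}

    // Appends a piece, or throws and leaves the value unchanged
    // if the result would be longer than Max.
    BoundedString& append(const std::string& piece) {
        if (piece.size() > Max - size_) {
            throw std::length_error("BoundedString: maximum length exceeded");
        }
        std::copy(piece.begin(), piece.end(), buf_.begin() + size_);
        size_ += piece.size();
        return *this;
    }

    std::size_t size() const { return size_; }
    static constexpr std::size_t capacity() { return Max; }

    // Current value as an ordinary std::string.
    std::string str() const { return std::string(buf_.data(), size_); }

private:
    std::array<char, Max> buf_;  // fixed storage, never reallocated
    std::size_t size_;           // number of characters currently in use
};
```

For example, appending "New", " " and "York" to a BoundedString<10> yields "New York" with a size of 8, and a further attempt to append "ers" throws because only two characters of capacity remain.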

When is it important to enforce length?

One might argue that there's always some maximum size beyond which something must have gone horribly wrong if a program encounters a longer string. Nevertheless, we can and should make a useful distinction between:

  1. data items, such as names or identifiers
  2. arbitrary text, such as book chapters or source programs.

Notwithstanding the claims of our naïve young colleague, in conventional applications items of the first kind always call for a specified size, either fixed or maximum. We can't imagine anything good coming from a program encountering a 200-character cityName or a 600-character bookTitle.

Therefore, class definitions for elementary text data items should always enforce a length, either by using a fixed-length string class for the member data item or by explicit logic.
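As an illustration of the "explicit logic" alternative, here is a minimal sketch of an elementary data item class that checks both length and content in its constructor. The class name CityName, the 30-character limit, and the set of accepted characters are assumptions made for this example only; a real application would take them from its own data definitions.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical elementary data item: a city name of 1 to 30 characters.
// The limit and the accepted character set are illustrative assumptions.
// std::isalpha in the default "C" locale accepts only unaccented letters,
// so names such as "São Paulo" would need a broader rule.
class CityName {
public:
    static constexpr std::size_t max_length() { return 30; }

    explicit CityName(const std::string& value) : value_(value) {
        if (value.empty() || value.size() > max_length()) {
            throw std::length_error("CityName must be 1 to 30 characters");
        }
        for (unsigned char c : value) {
            const bool ok = std::isalpha(c) || c == ' ' || c == '-' ||
                            c == '.' || c == '\'';
            if (!ok) {
                throw std::invalid_argument("CityName has an invalid character");
            }
        }
    }

    const std::string& str() const { return value_; }

private:
    std::string value_;  // guaranteed valid once construction succeeds
};
```

Under these assumptions "St. Louis" is accepted, a 2000-asterisk value is rejected for its length, and a 20-asterisk value is rejected for its characters. Because the check lives in the constructor, no CityName object can ever hold an invalid value.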

[1] As well as a surprising number of textbooks and articles.

Last modified July 1, 2012