Part 1 of 3
©Conrad Weisert, August 3, 2012
Last month we challenged readers to help in preparing a
Person class that would satisfy the
needs of a broad range of applications. Many of us are surprised that such an
obvious class remains unspecified [1] two decades
into the object-oriented era.
In the earlier article we deferred choosing a standard way of specifying a
person's name. In these articles we'll examine some of the choices leading
to a usable PersonName class, one
that can simplify programs whether or not they elect to use a standard
Person class.
We'll begin by considering mainly the internal data representation. After we're satisfied with it, we'll move on to specify the functionality (not much for a simple name) that it must support.
Let's examine some choices that we find in textbooks, presentations, and actual programs.
Textbooks, especially in Java, often specify a simple String data
item:
class Person {
    String name;
    .
    .
}
ignoring the opportunity to define a class. They may attempt a minimal
structure:
class Person {
    String lastName;
    String firstName;
    char middleInitial;
    .
    .
}
What's wrong with that?
PersonName must observe
a maximum size. Data administrators may argue about what that limit should
be, but they have to agree on some value.
More "sophisticated" textbooks wrap the above in a
class:
class Person {
    PersonName name;
    .
    .
}
class PersonName {
    String lastName;
    String firstName;
    char middleInitial;
    .
    .
}
That has all the same disadvantages as the non-class solution, except that the representation is more localized and therefore easier to change globally.
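To illustrate the localization point, here is a minimal sketch in which Person never touches the name's fields directly. The private fields, the constructors, and the fullName() and displayName() methods are assumptions made for this illustration; they do not appear in the textbook examples.

```java
// Sketch only: constructors and method names are hypothetical.
class PersonName {
    // The representation is private, so it can change without touching callers.
    private final String lastName;
    private final String firstName;
    private final char middleInitial;   // '\0' means "no middle initial" (an assumption)

    PersonName(String firstName, char middleInitial, String lastName) {
        this.firstName = firstName;
        this.middleInitial = middleInitial;
        this.lastName = lastName;
    }

    // The only view of the name that callers get; the format is illustrative.
    String fullName() {
        if (middleInitial == '\0') {
            return firstName + " " + lastName;
        }
        return firstName + " " + middleInitial + ". " + lastName;
    }
}

class Person {
    private final PersonName name;

    Person(PersonName name) {
        this.name = name;
    }

    // Person delegates and knows nothing about how the name is stored.
    String displayName() {
        return name.fullName();
    }
}
```

If the three fields were later replaced by some other representation, only PersonName would need editing. The limitations of the representation itself remain.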
A common reaction to the above criticisms, especially from students, is to
add more components:
class PersonName {
    String lastName;
    String firstName;
    String secondName;
    String thirdName;
    .
    .
}
and then, realizing the wasteful messiness of that solution, to generalize
it:
class PersonName {
    String names[];
    .
    .
}
That's a slight improvement in terms of flexibility, but it carries a lot of
overhead in both space and execution speed. Can't we do better?
Do you know anyone with a 300-character last name? If so, would he or she find it easy and natural to write out his or her full name?
Of course not. We accept practical limits and we always have. In the course of a thoughtful discussion we might agree to limit the length of a person's surname to, say, 40 characters. Or pick any number you prefer.
That raises the next obvious question: What about the other name components? Suppose we also limit each of them to 40 positions and limit the number of components to four. We will then have allowed names up to a total length of 160 characters or, using international Unicode (standard in Java), 320 bytes, not counting the control information needed for each character string. That might be acceptable for a few applications, but even in today's cheap storage world, it strikes most of us as extravagant and impractical.
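The arithmetic can be checked with a few lines of Java. This is only a sketch: the class and constant names are hypothetical, and the figures ignore the per-string control information mentioned above.

```java
class NameStorageEstimate {
    // Limits taken from the paragraph above; the constant names are hypothetical.
    static final int MAX_COMPONENT_CHARS = 40;
    static final int MAX_COMPONENTS = 4;

    // Worst-case character count if every component is full.
    static int maxChars() {
        return MAX_COMPONENT_CHARS * MAX_COMPONENTS;
    }

    // A Java char is a 16-bit UTF-16 code unit, so 2 bytes each.
    static int maxBytes() {
        return maxChars() * Character.BYTES;
    }

    public static void main(String[] args) {
        // Cross-check by actually encoding a worst-case string of plain letters.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < maxChars(); i++) {
            sb.append('x');
        }
        int encoded = sb.toString()
                .getBytes(java.nio.charset.StandardCharsets.UTF_16BE).length;
        // prints "160 chars, 320 bytes, encoded 320"
        System.out.println(maxChars() + " chars, " + maxBytes()
                + " bytes, encoded " + encoded);
    }
}
```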
Of course, that's just the maximum length. The average or typical length would turn out to be more modest, consistent with what we see in a telephone directory. But if each component of each name is of variable size, we'll have to carry a lot of extra control information, and programs that manipulate names will incur added complexity.
Someone might have a very, very long surname, a short first name and no middle name. The total length might easily fit in 44 positions. It's clear, then, that any length limitation should apply to the whole name, not to each component. Is there a straightforward way of implementing that limitation while retaining full flexibility for the individual components?
The answer is yes.
We'll explore this important topic in detail in September and October. Meanwhile, let's have your ideas and proposed solutions.
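As one starting point for discussion, here is a naive sketch of a whole-name limit. It is not necessarily the approach the later parts take. The 44-position limit is borrowed from the illustration above rather than recommended, and the rule of counting one separating space between components, along with every method name, is an assumption.

```java
// Sketch only: MAX_TOTAL_LENGTH, the separator rule, and all method names are assumptions.
final class PersonName {
    // 44 is borrowed from the illustration in the text, not a recommended value.
    static final int MAX_TOTAL_LENGTH = 44;

    private final String[] components;   // e.g. given name(s) followed by surname

    PersonName(String... components) {
        if (components.length == 0) {
            throw new IllegalArgumentException("a name needs at least one component");
        }
        // The limit applies to the whole name: all components plus
        // one space between each adjacent pair.
        int total = components.length - 1;
        for (String c : components) {
            if (c == null || c.isEmpty()) {
                throw new IllegalArgumentException("empty name component");
            }
            total += c.length();
        }
        if (total > MAX_TOTAL_LENGTH) {
            throw new IllegalArgumentException(
                    "name needs " + total + " positions; limit is " + MAX_TOTAL_LENGTH);
        }
        this.components = components.clone();   // defensive copy
    }

    int componentCount() {
        return components.length;
    }

    String component(int index) {
        return components[index];
    }

    String fullName() {
        return String.join(" ", components);
    }
}
```

Under this sketch, new PersonName("Al", "Wolfeschlegelsteinhausenbergerdorff") is accepted even though the surname alone is 35 characters, while adding two long given names to the same surname is rejected. No individual component has a limit of its own. The sketch still stores each component as a separate variable-length string, so it does nothing about the control-information overhead discussed earlier.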
[1] By "unspecified" we mean only that a solution is not well known, does not appear in typical textbooks, and is not a part of a standard class library. Readers may well have superior solutions, and are invited to share them with us.
Last modified 3 August 2012