Following up on last month's challenge . . .

A PersonName class

Part 1 of 3
©Conrad Weisert, August 3, 2012

Background:

Last month we challenged readers to help in preparing a Person class that would satisfy the needs of a broad range of applications. Many of us are surprised that such an obvious class remains unspecified [1] two decades into the object-oriented era.

In the earlier article we deferred choosing a standard way of specifying a person's name. In these articles we'll examine some of the choices leading to a usable PersonName class, one that can simplify programs whether or not they elect to use a standard Person class.

We'll begin by considering mainly the internal data representation. After we're satisfied with it, we'll move on to specify the functionality (not much for a simple name) that it must support.

Step 1: Eliminate the absolutely atrocious choices

Let's examine some choices that we find in textbooks, presentations, and actual programs.

1. Using raw strings without a class

Textbooks, especially in Java, often specify a simple String data item:

   class Person {
       String name;
        .
        .
   }
ignoring the opportunity to define a class. They may attempt a minimal structure:
   class Person {
       String lastName;
       String firstName;
       char   middleInitial;
        .
        .
   }

What's wrong with that?

  1. There's no length limit. Yet we know that names have to fit
    1. on mailing labels,
    2. in all sorts of forms,
    3. on formatted reports and screen displays.
    Therefore a standard representation for a PersonName must observe a maximum size. Data administrators may argue about what that limit should be, but they have to agree on some value.

  2. It assumes that everyone's name has exactly those components, but we find exceptions everywhere. Just among prominent U.S. politicians we know of:
    1. J. Danforth Quayle
    2. George Herbert Walker Bush
    and in other societies we find even more variety.
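
To make the second objection concrete, here is a minimal sketch. The class name `RigidName` and its `display` method are hypothetical, introduced only for illustration; the three fields simply mirror the textbook layout shown above.

```java
// Hypothetical illustration of the three-field textbook layout.
class RigidName {
    final String lastName;
    final String firstName;
    final char middleInitial;   // room for exactly one middle initial

    RigidName(String lastName, String firstName, char middleInitial) {
        this.lastName = lastName;
        this.firstName = firstName;
        this.middleInitial = middleInitial;
    }

    // Formats the name the only way this layout allows.
    String display() {
        return firstName + " " + middleInitial + ". " + lastName;
    }
}

class RigidNameDemo {
    public static void main(String[] args) {
        // "George Herbert Walker Bush": the second middle name has nowhere to go.
        RigidName bush = new RigidName("Bush", "George", 'H');
        System.out.println(bush.display());    // George H. Bush

        // "J. Danforth Quayle": the first name is the initial and the middle
        // name is spelled out, so the layout renders it the wrong way round.
        RigidName quayle = new RigidName("Quayle", "J.", 'D');
        System.out.println(quayle.display());  // J. D. Quayle
    }
}
```

Neither result is the name the person actually uses, and no choice of arguments can make it so.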

2. Using raw strings within a class

More "sophisticated" textbooks wrap the above in a class:

   class Person {
       PersonName name;
        .
        .
   }
   class PersonName {
       String lastName;
       String firstName;
       char   middleInitial;
        .
        .
   }

That has all the same disadvantages as the non-class solution, except that the representation is more localized and therefore easier to change globally.
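
The localization benefit can be sketched as follows. This is only an illustration under assumptions: the private fields, the constructor, and the `mailingLabelLine` method are hypothetical additions, not part of the textbook example.

```java
// Hypothetical sketch: client code sees only PersonName's methods, so the
// private fields can later be replaced without touching Person.
class PersonName {
    private final String lastName;
    private final String firstName;
    private final char middleInitial;

    PersonName(String lastName, String firstName, char middleInitial) {
        this.lastName = lastName;
        this.firstName = firstName;
        this.middleInitial = middleInitial;
    }

    // The one place that knows how the components are stored.
    @Override
    public String toString() {
        return firstName + " " + middleInitial + ". " + lastName;
    }
}

class Person {
    private final PersonName name;

    Person(PersonName name) {
        this.name = name;
    }

    // Depends on PersonName's behavior, not on its fields.
    String mailingLabelLine() {
        return name.toString();
    }
}
```

Note that the constructor still bakes in the three-component assumption, so the wrapper localizes the problem without solving it.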

3. Generalizing the number of names

A common reaction to the above criticisms, especially from students, is to add more components:

   class PersonName {
       String lastName;
       String firstName;
       String secondName;
       String thirdName;
        .
        .
   }
and then, realizing the wasteful messiness of that solution, to generalize it:
   class PersonName {
       String[] names;
        .
        .
   }
That's a slight improvement in terms of flexibility, but it carries a lot of overhead in both space and execution time. Can't we do better?

Step 2: Consider some alternative choices

4. Limiting the size of each component

Do you know anyone with a 300-character last name? If so, would he or she find it easy and natural to write out his or her full name in the space an ordinary form provides?

Of course not. We accept practical limits and we always have. In the course of a thoughtful discussion we might agree to limit the length of a person's surname to, say, 40 characters. Or pick any number you prefer.

That raises the next obvious question: What about the other name components? Suppose we also limit each of them to 40 positions and limit the number of components to four. We will then have allowed names up to a total length of 160 characters or, using international Unicode (standard in Java), 320 bytes, not counting the control information needed for each character string. That might be acceptable for a few applications, but even in today's cheap storage world, it strikes most of us as extravagant and impractical.
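
The worst-case arithmetic above can be checked in a few lines. The class and constant names are hypothetical, and the figure of two bytes per character assumes Java's 16-bit `char`; like the text, the sketch ignores the per-string control information.

```java
// Hypothetical sketch of the worst-case storage estimate.
class NameStorageEstimate {
    static final int MAX_COMPONENT_CHARS = 40;  // assumed limit per component
    static final int MAX_COMPONENTS = 4;        // assumed number of components

    // Worst-case number of characters across all components.
    static int maxChars() {
        return MAX_COMPONENT_CHARS * MAX_COMPONENTS;
    }

    // Worst-case bytes, at Character.BYTES (2) bytes per 16-bit char.
    static int maxBytes() {
        return maxChars() * Character.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(maxChars());  // 160
        System.out.println(maxBytes());  // 320
    }
}
```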

Of course, that's just the maximum length. The average or typical length would turn out to be more modest, consistent with what we see in a telephone directory. But if each component of each name is of a variable size, we'll have to carry a lot of extra control information, and programs that manipulate names will incur added complexity.

5. Limiting the total length

Someone might have a very, very long surname, a short first name and no middle name. The total length might easily fit in 44 positions. It's clear, then, that any length limitation should apply to the whole name, not to each component. Is there a straightforward way of implementing that limitation while retaining full flexibility for the individual components?

The answer is yes.
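
As a rough sketch of the idea only, and not necessarily the design the later parts develop, a class can check the length of the whole name while leaving the number and size of the components open. The class name `BoundedName`, the limit of 44 (borrowed from the example above), and the choice to count one separating space between components are all assumptions.

```java
// Hypothetical sketch: one limit on the whole name, none on its parts.
final class BoundedName {
    static final int MAX_TOTAL_LENGTH = 44;  // assumed limit, taken from the example above

    private final String[] components;

    BoundedName(String... components) {
        // Length of the name as displayed: components joined by single spaces.
        int total = String.join(" ", components).length();
        if (components.length == 0 || total > MAX_TOTAL_LENGTH) {
            throw new IllegalArgumentException(
                "name must have 1 to " + MAX_TOTAL_LENGTH + " characters, got " + total);
        }
        this.components = components.clone();  // defensive copy
    }

    String fullName() {
        return String.join(" ", components);
    }
}
```

With this check, `new BoundedName("Hubert", "Wolfeschlegelsteinhausenbergerdorff")` is accepted even though the surname alone runs to 35 characters, while four 40-character components are rejected.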

Coming next time

We'll explore this important topic in detail in September and October. Meanwhile, let's have your ideas and proposed solutions.


[1] By "unspecified" we mean only that a solution is not well known, does not appear in typical textbooks, and is not a part of a standard class library. Readers may well have superior solutions, and are invited to share them with us.

Last modified 3 August 2012
