by Conrad Weisert
July 21, 2010
© 2010 Information Disciplines, Inc.
This article may be circulated freely as long as the copyright notice is included.
In the past week of reading articles and textbooks, I've encountered these source-code (Java, C#, etc.) data declarations:
string customerName;
.
string city;
.
string productCode;
What do those declarations mean? The first one specifies that a customerName can be a sequence of between 0 and 65535 [1] positions, each of which is a letter, a digit, a punctuation sign, or a control character. That's probably not what the programmer intended.
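As a rough illustration of what the programmer might have intended instead, here is a minimal Java sketch of a dedicated type. The class name CustomerName, the 60-character limit, and the specific validation rules are assumptions made for this example, not rules taken from any of the sources mentioned above.

```java
import java.util.Objects;

/** Hypothetical value type for a customer's name; the rules are illustrative assumptions. */
final class CustomerName {
    // Assumed limit; a real one would come from the organization's data standard.
    static final int MAX_LENGTH = 60;

    private final String text;

    CustomerName(String text) {
        Objects.requireNonNull(text, "text");
        String trimmed = text.trim();
        // Reject empty names and names longer than the assumed limit.
        if (trimmed.isEmpty() || trimmed.length() > MAX_LENGTH) {
            throw new IllegalArgumentException("name length must be 1.." + MAX_LENGTH);
        }
        // Reject control characters, which a bare string would happily accept.
        for (int i = 0; i < trimmed.length(); i++) {
            if (Character.isISOControl(trimmed.charAt(i))) {
                throw new IllegalArgumentException("control character in name");
            }
        }
        this.text = trimmed;
    }

    @Override public String toString() { return text; }

    @Override public boolean equals(Object o) {
        return o instanceof CustomerName && ((CustomerName) o).text.equals(text);
    }

    @Override public int hashCode() { return text.hashCode(); }
}
```

With a type like this, an invalid value is rejected once, at construction, instead of being checked (or not) wherever the string happens to be used.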
The second example combines the same flaw with a misleading data name. The programmer obviously meant cityName. In an object-oriented programming world, a code reader would naturally expect city to be an object that models the properties (e.g. name, location, population, founding date) and behavior of a city.
Note that a CityName object would be a member of City (where its data name would be just name), but could also be used independently, e.g. for a field on a mailing label. Presumably the City class would have a name() [2] accessor function, which should return a CityName object, not a string. [3]
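The relationship described above might be sketched in Java as follows. This is only one possible shape: the population field stands in for the other properties, and the MailingLabel helper and its cityLine method are hypothetical names introduced here to show CityName being used on its own.

```java
import java.util.Locale;
import java.util.Objects;

/** Hypothetical value type for the name of a city. */
final class CityName {
    private final String text;

    CityName(String text) {
        Objects.requireNonNull(text, "text");
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("city name must not be blank");
        }
        this.text = trimmed;
    }

    @Override public String toString() { return text; }

    @Override public boolean equals(Object o) {
        return o instanceof CityName && ((CityName) o).text.equals(text);
    }

    @Override public int hashCode() { return text.hashCode(); }
}

/** Models a city; its name is a CityName member, not a bare string. */
final class City {
    private final CityName name;
    private final long population;   // stand-in for location, founding date, etc.

    City(CityName name, long population) {
        this.name = Objects.requireNonNull(name, "name");
        this.population = population;
    }

    CityName name() { return name; }            // accessor returns the typed name
    long population() { return population; }
}

/** Hypothetical independent use of CityName, with no City object involved. */
final class MailingLabel {
    private MailingLabel() {}

    static String cityLine(CityName city) {
        return city.toString().toUpperCase(Locale.ROOT);
    }
}
```

Conversion to text happens only at the edge, in cityLine, where actual output is being produced.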
The surprising thing is that many of these examples come from sources that claim to support the object paradigm. OOP offers an obvious, simple, and type-safe way of representing such data items. We pointed this out seven years ago, but the flood of silly string data continues at an increasing rate.
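To make the type-safety point concrete, consider a small Java sketch. The classes PersonName, ProductCode, and OrderLines are hypothetical and deliberately bare; they exist only to show what the compiler can catch once the two kinds of data have distinct types.

```java
import java.util.Objects;

/** Hypothetical wrapper for a person's name. */
final class PersonName {
    private final String text;
    PersonName(String text) { this.text = Objects.requireNonNull(text, "text"); }
    @Override public String toString() { return text; }
}

/** Hypothetical wrapper for a product code. */
final class ProductCode {
    private final String text;
    ProductCode(String text) { this.text = Objects.requireNonNull(text, "text"); }
    @Override public String toString() { return text; }
}

final class OrderLines {
    private OrderLines() {}

    /** Parameters have distinct types, so they cannot be passed in the wrong order. */
    static String describe(PersonName buyer, ProductCode item) {
        return buyer + " ordered " + item;
    }

    // If both parameters were strings, a call with the arguments swapped would
    // compile and quietly produce a wrong line. With distinct types, the
    // following call is rejected by the compiler:
    //
    //   OrderLines.describe(new ProductCode("A-100"), new PersonName("Ada Lovelace"));
}
```

A correct call looks like `OrderLines.describe(new PersonName("Ada Lovelace"), new ProductCode("A-100"))`.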
Is avoiding objects an error or just a poor choice? Experienced programmers could have a long debate on that without agreeing. [4] However, at best such code indicates a naïve beginner programmer and betrays a lack of understanding of what OOP is about.
string data?
String data items are appropriate for actual text. If you're developing a compiler, a query decoder, or a word processor, then your input consists of character strings. If you're generating a report or free-text messages, then your output consists of character strings. That's what character strings are for.
In a mature software development organization, individual programmers wouldn't even be making these choices. Why should each computer application system have its own way of representing, say, names of people? A corporate data administrator or enlightened analyst, having recognized the need, would have proposed a standard. Upon appropriate review and approval that standard would have been disseminated through a corporate data dictionary or other standards repository. Then if the organization practiced object-oriented programming, the supporting class definition would be developed and placed in a central library.
From that point on, programmers wouldn't be tempted to declare a PersonName data item as a string, and doing so would be recognized by everyone as an error.
[2] Or getName(), for those who prefer verbs as function names.
Last modified August 20, 2010