These are notes from a talk given to both the New York C++ SIG and the
SFBA Center for Advanced Technology C++ Industrial Seminar Group, in
1995 and 1996.
Interface Design in the
Standard C++ Library:
The Grand Challenge
by Nathan Myers
ncm-nospam@cantrip.org
http://www.cantrip.org/
Abstract
-
The ANSI/ISO Standard C++ Library had
some very difficult goals.
-
Achieving those goals required some
innovative library interface techniques.
-
Those techniques are equally useful for
anyone designing classes.
Overview
-
ANSI/ISO C++ Standard Status
-
Standard Library Goals
-
Library Interface Techniques
Standard C++ Status
-
ANSI X3J16 (U.S.) and ISO WG21 (Int'l)
meet jointly to standardize C++ & Library.
-
Meetings each March, July & November.
-
Language core is stable
-
Complete compilers by year-end
Standard C++
-
Core language
- C-compatible subset
- Classes, members, virtual, etc.
- Templates, exceptions, namespaces
-
Library
Support for I/O, inter-library communication,
basic data structures & algorithms.
-
Language is perhaps the first specifically
designed to support Library Design.
The Standard C++ Library
-
iostream,
locale
provide extensible I/O, formatting, parsing;
-
string, list, vector,
set, map & iterators
provide efficient aggregation, communication;
-
valarray, numeric_limits, complex
support optimized numerics, extensibly;
-
Dozens of algorithms (sort, search)
usable on standard and user-supplied data
structures.
Standard Library Goals
The Standard Library must be:
-
General:
usable anywhere C++ is.
-
Efficient:
no unnecessary overhead.
-
Adaptable:
tunable to varied needs.
-
Exemplary:
worth copying from.
(Non-goal: Cfront-compatible)
The Standard Library Must Be General
Must support:
-
Any style of programming.
-
Any execution environment:
- pacemakers to supercomputers
- ROM/fixed RAM to garbage-collection &
virtual memory
- real-time, multi-threading, shared memory,
object databases
The Standard Library Must Be Efficient
-
Unnecessary overhead would lead to:
- millions of inefficient user programs;
- reimplementation by users who need
efficiency.
-
Components serve as portable building
blocks for "fancier" libraries.
- Interfaces remain compatible with higher-
level structures.
- Basic structures suffice for interlibrary
communication.
The Standard Library Must Be Flexible
Support extensions & variations:
- character types (char, wchar_t, ?)
- memory models (flat, "huge", OODB, shared,
garbage-collected)
- user data structures (hash tables, graphs)
- languages, character sets & display formats
The Standard Library Must Be Exemplary
-
The most-scrutinized of all C++ libraries.
-
The first library seen by majority of C++
programmers.
-
The most-used of all C++ libraries:
- used in more programs
- used in more varied circumstances
Non-goals
-
Not restricted to "Cfront" features
- Majority of (future) C++ users will have
complete compilers, won't know about Cfront
- Newer language features are visibly used both
to exercise compilers and to stimulate use.
-
Does not try to hide C++ from users
- Trying to hide our inheritance from C would
be misguided, ultimately doomed.
- Instead, extends C++ compatibly with its own
computational model.
Interface Techniques
How do we achieve these goals?
-
aggressive use of template facilities.
-
avoid imposing policy.
-
avoid depending on high-overhead
language features in low-level constructs.
-
Innovative interface design using the full
power of the language.
Interface Techniques
Overview:
-
Requirements
-
Convenience for simple uses.
-
Flexibility.
-
Backward compatibility
-
Examples
-
iostream, string & traits
-
locale as a type-indexed collection
-
Collections & allocator
Requirements: Convenience
The overwhelming majority of library uses
are of the simplest form:
-
built-in, native character set
-
ordinary C-like memory model.
Ordinary use must be so convenient that
users can ignore the alternatives, e.g.
string str; // this...
basic_string<char, char_traits<char>, allocator<char> >
str((allocator<char>())); // not this.
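A minimal sketch of the point above, assuming a C++11 compiler for static_assert. The names LongHandString and greeting_length are hypothetical, introduced only for illustration. It checks that the short name is the same type as the long-hand spelling, so ordinary users never need to write the latter.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical name for the fully spelled-out instantiation.
typedef std::basic_string<char, std::char_traits<char>, std::allocator<char> >
    LongHandString;

// std::string is only a typedef for that instantiation.
static_assert(std::is_same<std::string, LongHandString>::value,
              "string is the default instantiation of basic_string");

// Ordinary code never mentions traits or allocators.
inline std::size_t greeting_length() {
    std::string str("hello");
    return str.size();
}
```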
Requirements: Flexibility
-
Character-type independence: iostream & string permit
- representation in different underlying types:
char, wchar_t, etc.,
- formatting in different encodings: ASCII,
Latin-1, EUC, Unicode,
etc.
-
Memory Model independence
-
Extensibility
Requirements: Backward Compatibility
-
The iostreams library has years of history
-
Must generalize and internationalize
without breaking existing programs.
-
This is almost the same problem as
ensuring convenience for the common
case.
Problem 1
iostream must allow alternate character
types without changing the basic interface.
template <class charT, ... >
class basic_streambuf { ... };
typedef basic_streambuf<char> streambuf;
typedef basic_streambuf<wchar_t> wstreambuf;
...but character types have extra semantics that iostream depends on:
streambuf::sgetc()
must return the int value EOF.
Are int and EOF right for other character types?
Solution 1: "traits"
Define a separate template class,
specialized for the character type:
template <class charT>
struct char_traits { }; // empty
Specialize it for char, wchar_t, etc.
template <> struct char_traits<char> {
typedef int int_type;
static inline int_type eof() { return EOF; }
...
};
Using "traits" (1)
Use the "traits" template in the streambuf:
template <class charT>
class basic_streambuf {
public:
typedef charT char_type;
typedef typename char_traits<charT>::int_type int_type;
int_type sgetc(); // returns char_traits<charT>::eof() at end-of-file.
...
};
Old code still works, as fast as before:
while ((ch = sbuf.sgetc()) != EOF) { ... }
Using "traits" (2)
Greater flexibility is possible. Suppose we declare
basic_streambuf so:
template <class charT,
class traits = char_traits<charT> >
class basic_streambuf {
public:
typedef typename traits::int_type int_type;
int_type sgetc();
...
};
Now users can substitute some other EOF ...
Using "traits" (3)
For example, a control-Z?
struct MyTraits : public char_traits<char> {
static inline int_type eof() { return 26; }
};
typedef basic_ifstream<char,MyTraits> MyIfstream;
MyIfstream book("book.txt", ios::in);
basic_string<char,MyTraits> line;
while (getline(book, line)) { ... }
In a MyIfstream, control-Z is recognized
as end-of-file.
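The traits technique can be tried without a real stream. The following is only a sketch: mini_buf is a hypothetical stand-in for basic_streambuf that reads from an in-memory string, and CtrlZTraits and count_until_eof are likewise illustrative names, not library components. It shows the defaulted traits parameter keeping the simple case simple, while a substituted traits class changes what counts as end-of-file.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for basic_streambuf, reading from a string.
template <class charT, class traits = std::char_traits<charT> >
class mini_buf {
public:
    typedef charT char_type;
    typedef traits traits_type;
    typedef typename traits::int_type int_type;

    explicit mini_buf(const std::basic_string<charT>& s) : data_(s), pos_(0) {}

    // Return the next character and advance, or traits::eof() at the end.
    int_type sbumpc() {
        if (pos_ == data_.size()) return traits::eof();
        return traits::to_int_type(data_[pos_++]);
    }

private:
    std::basic_string<charT> data_;
    std::size_t pos_;
};

// Hypothetical traits: everything as for char, but control-Z (26) is EOF.
struct CtrlZTraits : std::char_traits<char> {
    static int_type eof() { return 26; }
};

// Generic reader: counts characters up to whatever the traits call EOF.
template <class Buf>
std::size_t count_until_eof(Buf& buf) {
    typedef typename Buf::traits_type tr;
    std::size_t n = 0;
    while (!tr::eq_int_type(buf.sbumpc(), tr::eof())) ++n;
    return n;
}
```

With the text "ab", control-Z, "xy", a mini_buf<char> sees five characters, while a mini_buf<char, CtrlZTraits> stops after two.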
An Aside: public typedefs
Notice the public typedefs in the template:
template <class charT ,
class traits = char_traits<charT> >
class basic_streambuf {
public:
typedef charT char_type; // <-- here
typedef traits traits_type; // <-- and here
...
};
Always, always, always do this.
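One reason for the advice, sketched under the assumption that the container follows the standard convention of publishing value_type and const_iterator: generic code can name the element type only through such public typedefs. The function sum_all is a hypothetical helper, not a library component.

```cpp
#include <list>
#include <vector>

// Works for any container that publishes value_type and const_iterator.
template <class Container>
typename Container::value_type sum_all(const Container& c) {
    typedef typename Container::value_type value_type;
    value_type total = value_type();  // value-initialized, e.g. zero for numbers
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        total += *it;
    return total;
}
```

Without the typedefs, sum_all would need the element type as an extra template argument at every call.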
Numeric Traits
When writing numeric templates, you sometimes need the equivalent
of FLT_MAX for the element type used:
template <class Num>
void double_if_possible(Num& val)
{
if (val < ???_MAX/2)
val += val;
}
Numeric Traits (2)
The standard header <limits> contains:
template <class Num> struct numeric_limits {};
template <> struct numeric_limits< float > { // specialize
static inline float max()
{ return FLT_MAX ; }
...
};
template <> struct numeric_limits< double > { // specialize
static inline double max()
{ return DBL_MAX ; }
...
};
Numeric Traits (3)
Now the function looks like:
template <class Num>
void double_if_possible(Num& val)
{
if (val < numeric_limits<Num>::max() /2)
val += val;
}
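A compilable variant of the function above, using std::numeric_limits from <limits>. Two things here are assumptions added for illustration: the bool return value, which only makes the outcome visible to a caller, and the restriction to non-negative inputs, since doubling a very negative signed value could overflow as well.

```cpp
#include <limits>

// Doubles val when the result cannot exceed the type's maximum.
// Assumes val is non-negative. Returns whether doubling took place.
template <class Num>
bool double_if_possible(Num& val) {
    if (val < std::numeric_limits<Num>::max() / 2) {
        val += val;
        return true;
    }
    return false;
}
```

The same source works for int, unsigned char, float, or any user type for which numeric_limits is specialized.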
Problem 2: Type-safe Extensibility
Internationalization is open-ended:
-
Built-in categories include character sets,
numeric, time, and money formats, string
collation, message catalogs.
-
Possible categories not standardized include
time zones/daylight saving, national holidays,
measurement units, ...
-
Users must be able to extend the Standard Library.
Type-safe Extensibility (2)
Locale facilities encapsulate local
preferences.
-
A locale object must be a collection of
objects representing a choice for each
category:
e.g. metric units, Unicode character set,
Japanese currency...
-
Each category has a different interface.
-
Need a collection indexed by abstract
type.
Type-safe Extensibility (3)
A locale is a collection of facets:
-
Locale categories are represented by abstract
class interfaces, collectively called facets.
-
Each facet implements one such interface.
-
A locale maps from a facet type to an instance of that type.
Type-safe Extensibility (4)
Template notation provides the interface:
template <class Facet> const Facet& use_facet(const locale&);
Given a locale loc and a facet supporting, e.g., string comparison:
collate<char>::compare( ... ),
call it as:
use_facet< collate<char> >(loc).compare( ... );
Similar to a cast: use_facet throws an exception (bad_cast) if
the facet is not present in the argument locale object.
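A user-defined category can be added in the same way. The sketch below assumes a hypothetical facet Units (measurement units are one of the non-standardized categories mentioned earlier); the helper functions length_unit_of and throws_without_facet are also illustrative names. The static id member is what lets a locale index the facet by its type.

```cpp
#include <cstddef>
#include <locale>
#include <string>
#include <typeinfo>

// Hypothetical user-defined facet: the preferred unit of length.
class Units : public std::locale::facet {
public:
    static std::locale::id id;  // identifies this facet type within a locale
    explicit Units(const std::string& length_unit, std::size_t refs = 0)
        : std::locale::facet(refs), length_unit_(length_unit) {}
    const std::string& length_unit() const { return length_unit_; }
private:
    std::string length_unit_;
};
std::locale::id Units::id;

// Look the facet up by type; fall back when the locale does not carry it.
std::string length_unit_of(const std::locale& loc) {
    if (!std::has_facet<Units>(loc)) return "unspecified";
    return std::use_facet<Units>(loc).length_unit();
}

// use_facet on a locale lacking the facet throws std::bad_cast.
bool throws_without_facet(const std::locale& loc) {
    try {
        (void)std::use_facet<Units>(loc);
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

A locale carrying the facet can be built as std::locale(std::locale::classic(), new Units("metre")); the locale takes ownership of the facet object.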
An Aside: Use with Defaults
Users see locale as a unit, pass it along to
functions that use it; usually only class
designers look at the facets of a locale.
The default for normal operations is the
global locale, locale(), so users need
mention a locale only if they have one:
string f(const locale& = locale());
...
f(); // use the global locale, locale()
f(loc); // use a private locale, instead.
An Aside: Use with Defaults (2)
Example: a date class Date:
class Date {
public:
string asString(
const locale& use = locale());
...
};
An Aside: Use with Defaults (3)
To convert a Date to a string
using (as in the common case) the default global
locale:
Date today = Date::now();
string s = today.asString();
To convert using a specific locale, e.g. French:
locale french("fr_FR");
Date today = Date::now();
string s = today.asString(french);
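The defaulted-argument pattern can be sketched without a full Date class. Everything named here is an assumption made for illustration: Temperature stands in for Date, and CommaDecimal is a hypothetical facet used in place of a named locale such as "fr_FR", which may not be installed on a given system.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: like the default numpunct, but with a decimal comma.
class CommaDecimal : public std::numpunct<char> {
protected:
    virtual char do_decimal_point() const { return ','; }
};

// Hypothetical value class standing in for Date.
class Temperature {
public:
    explicit Temperature(double degrees) : degrees_(degrees) {}

    // Callers who do not care about locales simply omit the argument.
    std::string asString(const std::locale& use = std::locale()) const {
        std::ostringstream out;
        out.imbue(use);   // format with the requested locale
        out << degrees_;
        return out.str();
    }

private:
    double degrees_;
};
```

Calling asString() formats with the global locale; passing std::locale(std::locale::classic(), new CommaDecimal) switches the decimal point for that one call.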
Another aside: imbue
Passing locale objects around a program
adds clutter. Also:
-
Many functions are too high-level to
warrant adding a locale argument.
-
For backward compatibility, you may be
unable to add a locale argument.
-
Operators have a fixed number of
arguments, can't add any.
Another aside: imbue (2)
Sometimes an object can "hitchhike" on an
existing argument, particularly a long-
lived argument like a stream.
Standard iostream provides a member:
locale ios::imbue(const locale&)
Another aside: imbue (3)
With imbue(), locales can be carried along
in the iostream object to affect operations
far down the call chain, without cluttering
the interface:
Date today = Date::now();
cout << today; // use global locale
cout.imbue(locale("fr_FR"));
cout << today; // use French locale
Another aside: imbue (4)
The operator<< on Date
retrieves the imbued locale from the stream and uses it
to do formatting.
-
iostream knows nothing about class Date.
-
locale knows nothing about class Date.
Imbuing (hitchhiking) is a good way to
help keep interfaces simple and general.
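A sketch of such an operator<<, under stated assumptions: Date is reduced to three integers, and DateOrder is a hypothetical facet invented here to carry one formatting preference. The operator asks the stream for its imbued locale with getloc() and consults the facet if present.

```cpp
#include <cstddef>
#include <locale>
#include <ostream>
#include <sstream>

// Hypothetical facet: whether the day is written before the month.
class DateOrder : public std::locale::facet {
public:
    static std::locale::id id;
    explicit DateOrder(bool day_first, std::size_t refs = 0)
        : std::locale::facet(refs), day_first_(day_first) {}
    bool day_first() const { return day_first_; }
private:
    bool day_first_;
};
std::locale::id DateOrder::id;

// Hypothetical, minimal Date.
struct Date { int year, month, day; };

// The locale "hitchhikes" on the stream; no extra argument is needed.
std::ostream& operator<<(std::ostream& os, const Date& d) {
    const std::locale loc = os.getloc();
    const bool day_first = std::has_facet<DateOrder>(loc) &&
                           std::use_facet<DateOrder>(loc).day_first();
    if (day_first)
        os << d.day << '/' << d.month << '/' << d.year;
    else
        os << d.month << '/' << d.day << '/' << d.year;
    return os;
}
```

Neither the stream nor the locale class needs to know about Date or DateOrder; only the operator and the code that builds the locale do.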
Pause...
In all the examples thus far, notice a
common theme:
- Simple uses are easy.
- Specialized uses are possible.
Problem: Memory Models
Programs often need to place objects in very
odd places:
-
Object databases
-
Shared memory
-
Garbage-collected memory
-
"huge" memory
-
across a network.
Problem: Memory Models (2)
We would like to use the same objects
regardless of where they are.
-
Objects are often composed of subobjects
which must also find their way to the same
storage.
-
Memory model choice must appear in the
template and class interfaces.
Solution: Memory Models
The standard library encapsulates memory
models in class interfaces called
allocators.
-
The default allocator type, called
allocator, provides ordinary C-style
memory management.
-
Users can define an allocator for standard
classes to use instead.
Use of Allocators
Standard Collection templates take a
defaulted Allocator parameter:
template <class T, class Allocator = allocator<T> >
class list {
public:
typedef T value_type;
typedef Allocator allocator_type;
explicit list(
const Allocator& = Allocator());
...
};
Use of Allocators (2)
Ordinary uses of collections can ignore
allocators:
new list<string>
Specialized uses can substitute another regime:
typedef basic_string< ... >
HugeString;
new(HugeAlloc())
list<HugeString,HugeAlloc>
Use of Allocators (3)
Memory models that require a choice at
runtime are passed to the constructor, too:
typedef basic_string< ... > OString;
OdbAlloc parts("parts.odb");
new(parts)
list<OString,OdbAlloc>(parts)
List elements come from the specified
Object Database storage allocator object.
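The runtime-choice case can be tried with a small stateful allocator. This is a sketch only: TallyAlloc is a hypothetical allocator that counts allocations in a caller-supplied counter, and it is written against the minimal allocator interface of C++11 and later, which is an assumption and differs from the interface current when this talk was given.

```cpp
#include <cstddef>
#include <list>
#include <new>

// Hypothetical allocator: ordinary heap storage, plus a tally chosen at run time.
template <class T>
class TallyAlloc {
public:
    typedef T value_type;

    explicit TallyAlloc(std::size_t* tally) : tally_(tally) {}

    // Containers rebind the allocator to their node type; keep the same tally.
    template <class U>
    TallyAlloc(const TallyAlloc<U>& other) : tally_(other.tally()) {}

    T* allocate(std::size_t n) {
        ++*tally_;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }

    std::size_t* tally() const { return tally_; }

private:
    std::size_t* tally_;
};

// Two allocators are interchangeable when they share a tally.
template <class T, class U>
bool operator==(const TallyAlloc<T>& a, const TallyAlloc<U>& b) {
    return a.tally() == b.tally();
}
template <class T, class U>
bool operator!=(const TallyAlloc<T>& a, const TallyAlloc<U>& b) {
    return !(a == b);
}
```

As with the parts example, the allocator object is passed to the container's constructor: std::list<int, TallyAlloc<int> > parts(alloc); every node the list creates then goes through alloc.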
Allocators
NOTE:
Standard C++ Library allocators are
not the same as those in current commercial and
public-domain STL implementations.
Don't depend on that allocator interface.
In Conclusion...
-
Difficult challenges led to exploring the
strengths of the language.
-
The language permits:
- generality without adding runtime overhead
or a complicated interface;
- user choices at coding time or deferred to
compile time, link time, run time, or function-
call time.
-
The Standard C++ Library exploits these
qualities.
Copyright ©1997 by Nathan Myers. All Rights Reserved.
URL: <http://www.cantrip.org/stdlibif.html>