Chapter Five

Object Life Cycle

There are a few things you should think about when declaring, initializing and copying objects.

You should have as few variables as possible, since that can improve performance. This also means that you should not create a copy of an object unless you have to.

You should not have to browse through many pages of code to find the declaration of a variable.

You should not have to modify many pages of code if you want to change the value of a literal.

Copying and initialization should always create objects with valid states.

Initialization of variables and constants

A little discipline when declaring and initializing variables and constants can do wonders to make your code easier to understand and maintain. What may come as a surprise is that you can also improve the performance of your program.

RULES AND RECOMMENDATIONS

Rec 5.1 Declare and initialize variables close to where they are used.

Rec 5.2 If possible, initialize variables at the point of declaration.

Rec 5.3 Declare each variable in a separate declaration statement.

Rec 5.4 Literals should only be used in the definition of constants and enumerations.

Rec 1.2, Style 1.4, variable names.

Rule 7.10, how to access string literals.

Rec 5.1 Declare and initialize variables close to where they are used.

It is best to declare variables close to where they are used. Otherwise you may have trouble finding out what type a particular variable has. Another advantage of localized variable declarations is more efficient code, since only those objects that are actually needed will be initialized.

Initializing variables

Instead of declaring the variable at the beginning of a code block and giving it a value much later:

 
int i;

// 20 lines of code not using i

i = 10;          // No

try to declare and initialize the variable close to its first use:

 
int j = 10;      // Better
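The efficiency argument can be made concrete with a small sketch. The type Report and the two functions below are hypothetical and exist only for illustration; Report counts how many times it is constructed, so the difference between an early and a late declaration can be observed.

```cpp
#include <string>

// Hypothetical helper type that counts how many times it is constructed.
struct Report
{
   static int constructed;
   std::string text;
   Report() : text("report") { ++constructed; }
};

int Report::constructed = 0;

// Declared at the top: the object is built even when it is never used.
int lengthEarly(bool verbose)
{
   Report r;                 // always constructed
   if (verbose)
   {
      return static_cast<int>(r.text.size());
   }
   return 0;
}

// Declared close to its use: built only on the path that needs it.
int lengthLate(bool verbose)
{
   if (verbose)
   {
      Report r;              // constructed only when needed
      return static_cast<int>(r.text.size());
   }
   return 0;
}
```

Calling lengthEarly(false) still constructs one Report, while lengthLate(false) constructs none.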

Rec 5.2 If possible, initialize variables at the point of declaration.

Try to initialize a variable to a well-defined value at the point of declaration. The main reason is to avoid redundant member function calls. Suppose you have a class with both a constructor and an assignment operator taking the same type of argument. If you assign an object of that class instead of using the corresponding constructor, then two member function calls are needed to give the object a proper value. The first call is to a default constructor that must be provided when an object is declared without an initializer.

Initialization instead of assignment

 
// Not recommended
EmcString string1;          // calls default constructor
string1 = "hello";          // calls assignment operator

// Better
EmcString string2("hello"); // calls constructor

Initialization at the point of declaration can also remove many potential bugs in your code, since the risk of using an uninitialized object will be reduced.

Variables of built-in types are a special case, since they have no default constructors that are called when an initializer is missing. Instead such variables remain uninitialized until they are assigned to, so if you do not initialize them, you should assign to them as soon as possible.

The reason that such variables are not always initialized is that it can sometimes be very difficult or even impossible to do so. Suppose, for example, that the variable must be passed to a function as a reference argument to be initialized.

Assignment instead of initialization

 
int i;       // no reason to initialize i
cin >> i;    // modifies both cin and i

Rec 5.3 Declare each variable in a separate declaration statement.

Declaring multiple variables on the same line is not recommended. The code will be difficult to read and understand.

Separate declarations also make the code more readable and easier to comment, if you want to attach a comment to each variable.

Some common mistakes are also avoided. Remember that when declaring a pointer, unary * is only bound to the variable that immediately follows.

Declaring multiple variables

 
int i, *ip, ia[100], (*ifp)();    // Not recommended

LoadModule* oldLm = 0;    // pointer to the old object
LoadModule* newLm = 0;    // pointer to the new object

// declares one int*, m, and one int, n.
int* m, n;                // Not recommended

Rec 5.4 Literals should only be used in the definition of constants and enumerations.

Literals (often called "magic numbers") should only be used in the definition of constants and enumerations.

One reason is that literals need an additional comment to be understood. Some integers like 0 and 1 are exceptions since their meaning can often be deduced from the context in which they are used. Many of them can now be replaced by the new bool values, true and false.

Code with magic numbers is also more difficult to maintain, since their use may be sprinkled all over the code.

Correct use of "magic" number

 
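#include <cstddef>   // size_t

// Literal in definition of const,
const size_t charMapSize = 256;

// but not to specify array size!
char charMap[charMapSize];

// Or for comparison! The index has the same type as the
// constant, which avoids a signed/unsigned comparison.
for (size_t i = 0; i < charMapSize; i++)
{
   // ...
}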
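The recommendation also mentions enumerations. The following is a minimal sketch of that variant; the names PacketLimits, headerSize, payloadSize, packetSize and packetsIn are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>

// Hypothetical protocol constants, named once and used everywhere else.
enum PacketLimits
{
   headerSize  = 8,
   payloadSize = 56,
   packetSize  = headerSize + payloadSize
};

// Returns the number of whole packets that fit in a buffer of the given size.
std::size_t packetsIn(std::size_t bufferSize)
{
   // No literal appears here; the meaning is carried by the name.
   return bufferSize / static_cast<std::size_t>(packetSize);
}
```

If the header size changes, only the enumeration needs to be modified.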

Constructor initializer lists

Base classes and non-static data members should be initialized in the constructor initializer list since it is more efficient than using assignment inside the body of the constructor.

RULES AND RECOMMENDATIONS

Rec 5.5 Initialize all data members.

Rule 5.6 Let the order in the initializer list be the same as the order of declaration in the header file. First base classes, then data members.

Rec 5.7 Do not use or pass this in constructor initializer lists.

Rec 1.2 - Style 1.5, names of data members.

Rule 10.1, access to data members.

Rec 5.5 Initialize all data members.

Initialization is the recommended way to give data members and base classes proper values. All direct base classes, non-static data members and virtual base classes can have initializers in the constructor initializer list. If the object to be initialized is a class with constructors, the expression determines what constructor to use. If not, the expression could be a value to copy.

If you do not specify an initializer, the default constructor will be used to initialize the data member or the base class, if such a constructor exists. Data members of a built-in type will not be initialized, which is potentially very dangerous. Clearly this is not desirable. Initializing integers to a value like zero can sometimes be a good idea.

It is possible to give data members values inside the body instead of in the initialization list. We do not recommend this practice, since it is less efficient to first call the default constructor and then the assignment operator, than to call only one constructor. For data members of built-in types there is no such difference, but for the sake of consistency, even these should be initialized in the constructor initialization list.
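The cost difference described above can be observed with a small sketch. The types Counter, ViaBody and ViaList are hypothetical; Counter simply counts which of its member functions are called.

```cpp
// Hypothetical member type that records which member functions ran.
struct Counter
{
   static int defaultCtor;
   static int valueCtor;
   static int assignments;
   int value;

   Counter() : value(0) { ++defaultCtor; }
   explicit Counter(int v) : value(v) { ++valueCtor; }
   Counter& operator=(const Counter& other)
   {
      value = other.value;
      ++assignments;
      return *this;
   }
};

int Counter::defaultCtor = 0;
int Counter::valueCtor   = 0;
int Counter::assignments = 0;

// Not recommended: cM is default constructed, then assigned to.
class ViaBody
{
   public:
      explicit ViaBody(int v) { cM = Counter(v); }
      int value() const { return cM.value; }
   private:
      Counter cM;
};

// Better: cM is constructed once, directly with the right value.
class ViaList
{
   public:
      explicit ViaList(int v) : cM(v) { }
      int value() const { return cM.value; }
   private:
      Counter cM;
};
```

Constructing a ViaBody costs one default constructor call, one constructor call for the temporary and one assignment. Constructing a ViaList costs a single constructor call.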

There are some exceptions. If a data member must be initialized by an expression that in any way must access the containing object, it is sometimes necessary to defer initialization to the body of the constructor. Another situation is when an expression is too complex to appear in the initialization list.

Base classes are treated as data members in the initialization list, which means that they are also initialized by the default constructor, if no initializer is provided.

Constructor initialization lists

 
class Base
{
   public:
      explicit Base(int i);
      Base();
   private:
      int iM;
};

Base::Base(int i) : iM(i) // iM must be initialized
{
   // Empty
}

Base::Base() : iM(0)      // iM must be initialized
{
   // Empty
}

class Derived : public Base
{
   public:
      explicit Derived(int i);
      Derived();
   private:
      int  jM;
      Base bM;
};

Derived::Derived(int i)  // jM must be initialized
: Base(i), jM(i)         // Default constructor used for bM
{
   // Empty
}

Derived::Derived()       // jM must be initialized
: jM(0), bM(1)           // Default constructor used for Base
{
   // Empty
}

Rule 5.6 Let the order in the initializer list be the same as the order of declaration in the header file. First base classes, then data members.

It is legal C++ to list initializers in any order you wish, but you should list them in the same order as they will be executed.

The order in the initializer list is irrelevant to the execution order of the initializers. Putting initializers for data members and base classes in any other order than their actual initialization order is therefore highly confusing and error-prone. A data member could be accessed before it is initialized if the order in the initializer list is incorrect.

Virtual base classes are always initialized first, followed by the direct base classes and then the data members. Finally, the body of the constructor for the most derived class is run.

Order of initializers

 
class Derived : public Base    // Base is number 1
{
   public:
      explicit Derived(int i);
      Derived();
   private:
      int  jM;                 // jM is number 2
      Base bM;                 // bM is number 3
};

Derived::Derived(int i) : Base(i), jM(i), bM(i)  
// Recommended order       1        2      3
{
   // Empty
}
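The initialization order described for this rule can be made visible with a small sketch. All names below (initLog, Tracer, VirtualBase, PlainBase, Widget) are hypothetical; each part of the object appends its name to a log when it is initialized.

```cpp
#include <string>
#include <vector>

// Records the order in which the parts of an object are initialized.
std::vector<std::string>& initLog()
{
   static std::vector<std::string> log;
   return log;
}

// Hypothetical helper: logs the given name when constructed.
struct Tracer
{
   explicit Tracer(const char* name) { initLog().push_back(name); }
};

struct VirtualBase { VirtualBase() { initLog().push_back("VirtualBase"); } };
struct PlainBase   { PlainBase()   { initLog().push_back("PlainBase"); } };

// The virtual base is listed last in the base specifier list,
// but it is still initialized first.
class Widget : public PlainBase, public virtual VirtualBase
{
   public:
      Widget()
      : VirtualBase(), PlainBase(), firstM("firstM"), secondM("secondM")
      {
         initLog().push_back("body");
      }
   private:
      Tracer firstM;     // declared first, initialized before secondM
      Tracer secondM;
};
```

After a Widget has been created, the log contains VirtualBase, PlainBase, firstM, secondM and body, in that order. The initializer list is written in the same order, so the code reads the way it executes.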

Rec 5.7 Do not use or pass this in constructor initializer lists.

Another unsafe practice is to use or pass this in the initializer list. The object pointed at by this is not fully constructed until the body of the constructor is being run.

The object is not fully constructed when base classes and data members are initialized. Calling a virtual member function through a pointer or reference to the partially constructed object is not safe. Doing so is probably wrong and the program is likely to crash.

Calling a member function in a member initializer list can be equally dangerous, since such a member function could try to access uninitialized members of the class.

Passing this to base class and member initializers, or using this implicitly by calling a member function in the initializer list, should therefore be avoided as much as possible.
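The following sketch illustrates why this is unsafe. It deliberately does what the recommendation advises against, in a form that is harmless only because the function called does not touch any data member. The classes Shape and Circle are hypothetical.

```cpp
#include <string>

class Shape
{
   public:
      // For illustration only: name() is called through the implicit
      // this while the object is still being constructed.
      Shape() : nameAtConstructionM(name()) { }
      virtual ~Shape() { }
      virtual std::string name() const { return "Shape"; }
      const std::string& nameAtConstruction() const
      {
         return nameAtConstructionM;
      }
   private:
      std::string nameAtConstructionM;
};

class Circle : public Shape
{
   public:
      virtual std::string name() const { return "Circle"; }
};
```

For a Circle object, nameAtConstruction() returns "Shape" and not "Circle", since the Circle part of the object does not yet exist when the Shape constructor runs. A function that also accessed data members could read members that have not yet been initialized.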

Copying of objects

A general rule is to avoid copying as much as possible, but it is sometimes necessary to copy objects and you need to know when. It is equally important to understand when copying is inappropriate.

Copying can be done by initialization or by assignment. Copying by assignment is similar to initialization but is more difficult since you modify an existing object that may hold resources that must be correctly managed.

The compiler will generate a copy constructor and a copy assignment operator if the class does not declare one. It is important to understand when the compiler-generated ones are appropriate.

RULES
AND
RECOMMENDATIONS

Rec 5.8 Avoid unnecessary copying of objects that are costly to copy.

Rule 5.9 A function must never return, or in any other way give access to, references or pointers to local variables outside the scope in which they are declared.

Rec 5.10 If objects of a class should never be copied, then the copy constructor and the copy assignment operator should be declared private and not implemented.

Rec 5.11 A class that manages resources should declare a copy constructor, a copy assignment operator, and a destructor.

Rule 5.12 Copy assignment operators should be protected from doing destructive actions if an object is assigned to itself.

Rec 7.3 - Rec 7.5, Rule 7.6, argument passing.

Rule 7.7, return value of copy assignment operator.

Rule 7.9, parameter type for copy constructor and copy assignment operator.

Rec 12.7, Rule 12.8, resource management.

Rec 5.8 Avoid unnecessary copying of objects that are costly to copy.

Copying an object is not the same as making a bitwise copy of its storage. Bitwise copying, for example through the use of memcpy(), only works for a limited number of objects and should almost always be avoided.

For most objects, copying is the same as calling either the copy constructor or the assignment operator for the class. Since a class could have other objects as data members or inherit from other classes, many member function calls would be needed to copy the object. To improve performance, you should not copy an object unless it is necessary.

It is possible to avoid copying by using pointers and references to objects, but then you will instead have to worry about the lifetime of objects. You must understand when it is necessary to copy an object and when it is not.
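A minimal sketch of the difference between passing an object by value and passing it by reference is shown here. The type Samples and the two functions are hypothetical; Samples counts how many times its copy constructor is called.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type that is costly to copy.
struct Samples
{
   static int copies;
   std::vector<double> data;

   explicit Samples(std::size_t n) : data(n, 1.0) { }
   Samples(const Samples& other) : data(other.data) { ++copies; }
};

int Samples::copies = 0;

// Pass by value: the argument is copied on every call.
double sumByValue(Samples s)
{
   double sum = 0.0;
   for (std::size_t i = 0; i < s.data.size(); ++i)
   {
      sum += s.data[i];
   }
   return sum;
}

// Pass by reference to const: no copy is made.
double sumByReference(const Samples& s)
{
   double sum = 0.0;
   for (std::size_t i = 0; i < s.data.size(); ++i)
   {
      sum += s.data[i];
   }
   return sum;
}
```

Both functions return the same result, but only sumByValue() calls the copy constructor. The reference version is safe here because the caller's object outlives the call.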

Rule 5.9 A function must never return, or in any other way give access to, references or pointers to local variables outside the scope in which they are declared.

Returning a pointer or reference to a local variable is always wrong since it gives the user a pointer or reference to an object that no longer exists. Such a pointer or reference cannot be used without the risk of overwriting the caller's stack space. Most compilers warn about this, but mistakes are still possible to make.

Returning dangling pointers and references

 
int& dangerous()
{
   int i = 5;
   return i;          // NO: Reference to local returned
}

int& j = dangerous(); // NO: j is dangerous to use

// much later:

cout << j;            // Crash, boom, bang, program dies

There are less obvious ways of making the same mistake, as in this example:

 
struct MyStruct
{
   const char *p;     // const, so that the address of a const array can be stored
   // ...
};

MyStruct ms;

void alsoDangerous()
{
   const char str[100] = "Bad news up ahead";
   ms.p = str;        // No: address of local stored
}

alsoDangerous();

cout << ms.p << endl; // Garbage printed

The function alsoDangerous() does not explicitly pass any pointer or reference to any local object, but it lets such a pointer leak through by assigning it to a struct with a scope larger than the local data in the function. The result in this case is that garbage will be printed since the memory pointed at is likely to be overwritten.
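One way to avoid both mistakes is to hand out values instead of addresses. The following is a minimal sketch that assumes std::string may be used; the names greeting, Holder and fill are hypothetical.

```cpp
#include <string>

// Returns the value itself; the caller gets its own object.
std::string greeting()
{
   std::string text = "Good news up ahead";
   return text;       // OK: the value is handed to the caller
}

// Owns its characters instead of pointing at someone else's.
struct Holder
{
   std::string text;
};

void fill(Holder& h)
{
   const char local[] = "Good news up ahead";
   h.text = local;    // OK: the characters are copied into h.text
}
```

Neither function leaves a pointer or reference to a local variable behind, so the results remain valid after the functions return.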

Rec 5.10 If objects of a class should never be copied, then the copy constructor and the copy assignment operator should be declared private and not implemented.

Before you go ahead and implement copy constructors and copy assignment operators for a class, you should ask yourself if the class has reasonable copy semantics or not. Is it reasonable to be able to copy an object of the class? Sometimes this is a very simple question to answer, such as for a string class which of course should be copyable. In many other cases the question about copying can be quite hard to answer. But remember that even if you cannot copy objects, you can still copy pointers and that is often sufficient.

Hopefully the question of copy semantics or not for a class will naturally come out of the design process. Do not push copy semantics on a class that should not have it.

By declaring the copy constructor and copy assignment operator as private, a class is made non-copyable. These member functions must be declared, since the compiler would otherwise generate a public copy constructor and a public copy assignment operator for the class. The two privately declared member functions should not be called, which means they do not have to be implemented, only declared.

Non-copyable class

 
class CommunicationPort
{
   public:
      explicit CommunicationPort(int port);
      ~CommunicationPort();
      // ...
   private:
      CommunicationPort(const CommunicationPort& cp);
      CommunicationPort&
         operator=(const CommunicationPort& cp);
      // ...
};
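Whether a class really is non-copyable can be checked at compile time. The sketch below assumes a compiler that supports C++11, since it uses static_assert and the type traits in <type_traits>. The class Port is a hypothetical, reduced stand-in for CommunicationPort.

```cpp
#include <type_traits>

class Port
{
   public:
      explicit Port(int number) : numberM(number) { }
      int number() const { return numberM; }
   private:
      Port(const Port&);              // declared, never defined
      Port& operator=(const Port&);   // declared, never defined
      int numberM;
};

// Compilation fails here if Port ever becomes copyable by accident.
static_assert(!std::is_copy_constructible<Port>::value,
              "Port must not be copy constructible");
static_assert(!std::is_copy_assignable<Port>::value,
              "Port must not be copy assignable");
```

With C++11 or later, the same effect can also be obtained by declaring the two member functions with = delete instead of making them private.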

Rec 5.11 A class that manages resources should declare a copy constructor, a copy assignment operator, and a destructor.

As said before, the compiler will generate a copy constructor, a copy assignment operator and a destructor if these member functions have not been declared. For many classes, the generated member functions have the wrong behavior.

A good example is a string object that stores a pointer to memory allocated with new. If we implement a destructor that deletes the pointer, but do not provide a copy constructor, there is a good chance that some pointers will be deleted twice.

A compiler-generated copy constructor does memberwise initialization and a compiler-generated copy assignment operator does memberwise assignment of data members and base classes. For a string class, this would mean that the pointer, not the character array, is copied. If the class has been written with the assumption that the character array is owned by the object, the bug is that two objects will store a pointer to the same character array after a call to the compiler-generated copy constructor or copy assignment operator.

If a class should be copyable, we must implement a copy constructor, a copy assignment operator and a destructor when the ones generated by the compiler will not work correctly. This means that there is a large category of classes that should both declare and implement these three member functions. An even larger category of classes are those that declare them, since that would include all non-copyable classes as well.

Classes that manage resources belong to this category. We have to make sure that a resource is only acquired and released once.
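The effect of memberwise copying of a pointer can be shown without risking a double delete. The struct NaiveString below is hypothetical. It deliberately has no destructor and points at an ordinary array, so the demonstration itself is safe.

```cpp
#include <cstring>

// Hypothetical string-like struct that relies on the
// compiler-generated copy constructor.
struct NaiveString
{
   char* cpM;
};

// Returns true if the copy shares its storage with the original.
bool sharesStorageAfterCopy()
{
   char buffer[] = "hello";
   NaiveString a = { buffer };
   NaiveString b = a;          // memberwise copy: only the pointer is copied
   b.cpM[0] = 'j';             // writing through b ...
   return a.cpM == b.cpM       // ... is visible through a
       && std::strcmp(a.cpM, "jello") == 0;
}
```

If NaiveString had a destructor that deleted cpM, both a and b would try to release the same memory.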

Copyable class that manages memory

EmcIntStack is a simple stack class that manages an array of integers. Since we want to be able to copy stack objects, we declare the copy constructor, the assignment operator and the destructor as public members of the class.

 
// EmcIntStack is copyable

class EmcIntStack
{
   public:
      EmcIntStack(unsigned allocated = defaultSizeM);
      EmcIntStack(const EmcIntStack& s, unsigned extra = 0);
      ~EmcIntStack();
      EmcIntStack& operator=(const EmcIntStack& s);
      // ...
   private:
      enum      { defaultSizeM = 100 };
      unsigned  allocatedM;
      int*      vectorM;
      int       topM;
};

EmcIntStack::EmcIntStack(unsigned allocated)
: allocatedM(allocated),
  vectorM(new int[allocatedM]),
  topM(0)
{
}

EmcIntStack::EmcIntStack(const EmcIntStack& s, 
                         unsigned extra)
: allocatedM(s.topM+extra),
  vectorM(new int[allocatedM]),
  topM(s.topM)
{
   // std::copy (declared in <algorithm>) takes first, last, destination
   std::copy(s.vectorM, s.vectorM + s.topM, vectorM);
}

EmcIntStack::~EmcIntStack()
{
   delete [] vectorM;
}

We will study the assignment operator when explaining the next rule.

Rule 5.12 Copy assignment operators should be protected from doing destructive actions if an object is assigned to itself.

When implementing the copy assignment operator we must make sure that self-assignment does not corrupt the state of the object. There is a risk that you delete pointers and then assign them to themselves. To prevent that, you could copy the new state of the object to local variables before assigning to the data members. This always works, but is less efficient than assigning to the data members directly. The most common solution is to check the address of the object passed as argument before modifying the state of the object. If the current object is passed as argument, the copy assignment operator simply returns without modifying the object.

Self-assignment

 
EmcString s = "Aguirre";
s = s;                     // Self assignment
cout << s << endl;         // Should print "Aguirre"

Implementing a copy assignment operator

When implementing the copy assignment operator for the EmcIntStack described above, we check the this pointer before modifying the object. This is necessary since we want to be able to reuse already allocated memory instead of allocating new memory after each assignment.

 
EmcIntStack& EmcIntStack::operator=(const EmcIntStack& s)
{
   if (this != &s)
   {
      int* newVector = vectorM;
      if (allocatedM < s.topM)
      {
          // operator new may throw bad_alloc
          newVector = new int[s.topM];
          allocatedM = s.topM;
      }
      // copy elements (std::copy is declared in <algorithm>)
      std::copy(s.vectorM, s.vectorM + s.topM, newVector);
      if (vectorM != newVector)
      {
          // release memory
          delete [] vectorM;
          vectorM = newVector;
      }
      // assign to object last to avoid changing state 
      // if the assignment fails due to bad_alloc
      topM = s.topM;
   }
   return *this;
}
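For readers who want to try the technique, here is a reduced, self-contained sketch of a stack with a self-assignment check. The class IntStack and its members push(), top() and size() are hypothetical simplifications and not the actual EmcIntStack interface. Standard std::copy is assumed for copying the elements.

```cpp
#include <algorithm>
#include <cstddef>

class IntStack
{
   public:
      explicit IntStack(std::size_t allocated = 4)
      : allocatedM(allocated), vectorM(new int[allocatedM]), topM(0)
      {
      }

      IntStack(const IntStack& s)
      : allocatedM(s.allocatedM), vectorM(new int[allocatedM]), topM(s.topM)
      {
         std::copy(s.vectorM, s.vectorM + s.topM, vectorM);
      }

      ~IntStack() { delete [] vectorM; }

      IntStack& operator=(const IntStack& s)
      {
         if (this != &s)                  // self-assignment: nothing to do
         {
            if (allocatedM < s.topM)      // too small: get a bigger array
            {
               // operator new may throw; *this is still untouched here
               int* newVector = new int[s.topM];
               delete [] vectorM;
               vectorM = newVector;
               allocatedM = s.topM;
            }
            std::copy(s.vectorM, s.vectorM + s.topM, vectorM);
            topM = s.topM;
         }
         return *this;
      }

      // Precondition (assumed for brevity): size() is less than the capacity.
      void push(int value) { vectorM[topM++] = value; }
      // Precondition (assumed for brevity): the stack is not empty.
      int top() const { return vectorM[topM - 1]; }
      std::size_t size() const { return topM; }

   private:
      std::size_t allocatedM;   // declared before vectorM, which uses it
      int*        vectorM;
      std::size_t topM;
};
```

Assigning a stack to itself leaves it unchanged, and assigning a larger stack to a smaller one reallocates before copying.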

Another similar class is our string class, EmcString. Like most other string classes, objects of this class have a character array to store the value of the string. EmcString has two data members, cpM and lengthM. When assigning to a string, we simply create a new character array of appropriate size, copy the string into it, and then deallocate the array previously pointed at by cpM.

 
class EmcString
{
   public:
      // ...
      EmcString& operator=(const EmcString& s);
      size_t length() const;
      // ...
   private:
      size_t lengthM;
      char*  cpM;
};

Instead of checking the this pointer, we make sure that self-assignment does not corrupt the state of the object by making a copy of the argument before modifying the string. This will be slightly more efficient except when the parameter string is the same object as the one assigned to. This could be considered a special case that is not worth optimizing for. An even more efficient solution would be to avoid memory allocation altogether when the existing string is big enough, as in the previous example.

 
EmcString& EmcString::operator=(const EmcString& s)
{
   // Not optimized for self-assignment
   char* tmp = new char[s.length() + 1];
   strcpy(tmp, s.cpM);
   delete [] cpM;
   cpM  = tmp;
   lengthM = s.lengthM;

   return *this;
}
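The copy-first technique can be tried out with the following self-contained sketch. MiniString is a hypothetical, simplified stand-in for EmcString; only the members needed to exercise the assignment operator are included.

```cpp
#include <cstddef>
#include <cstring>

class MiniString
{
   public:
      MiniString(const char* cp)
      : lengthM(std::strlen(cp)), cpM(new char[lengthM + 1])
      {
         std::strcpy(cpM, cp);
      }

      MiniString(const MiniString& s)
      : lengthM(s.lengthM), cpM(new char[lengthM + 1])
      {
         std::strcpy(cpM, s.cpM);
      }

      ~MiniString() { delete [] cpM; }

      MiniString& operator=(const MiniString& s)
      {
         // Copy first: if new throws, the object is unchanged.
         char* tmp = new char[s.lengthM + 1];
         std::strcpy(tmp, s.cpM);   // s.cpM is still valid here
         delete [] cpM;             // safe even if &s == this
         cpM = tmp;
         lengthM = s.lengthM;
         return *this;
      }

      const char* c_str() const { return cpM; }
      std::size_t length() const { return lengthM; }

   private:
      std::size_t lengthM;   // declared before cpM, which uses it
      char*       cpM;
};
```

Assigning a MiniString to itself costs one unnecessary allocation and copy, but the value of the string is preserved.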