Huawei C++ Programming Specification

C++ Programming Specification

Purpose

Rules are not perfect. By prohibiting features that might be useful in certain situations, they may constrain how code is implemented. However, the purpose of our rules is “for the benefit of the majority of programmers.” If a team believes a rule cannot be followed, we hope to improve the rule together.

This specification is not a way to learn C++. Before using it, you are expected to have the necessary foundational knowledge of the language:

  1. Understand the ISO standard for the C++ language.
  2. Be familiar with the basic language features of C++, including features related to C++ 03/11/14/17.
  3. Understand the C++ standard library.

General Principles

Code must ensure functional correctness while meeting the characteristic requirements of being readable, maintainable, secure, reliable, testable, efficient, and portable.

Key Focus Areas

  1. Define C++ programming style, such as naming, formatting, etc.
  2. Modular design in C++, including how to design header files, classes, interfaces, and functions.
  3. Best practices for C++ language features, such as constants, type conversions, resource management, templates, etc.
  4. Best practices for modern C++, including conventions in C++11/14/17 that can improve code maintainability and reliability.
  5. This specification prioritizes C++17.

Conventions

Rule: A convention that must be followed during programming (must).

Recommendation: A convention that should be followed during programming (should).

This specification applies to the C++ standards in general. Where no specific standard version is mentioned, an item applies to all versions (C++03/11/14/17).

Exceptions

Whether it’s a ‘Rule’ or a ‘Recommendation’, you must understand the reason behind the item and strive to follow it. However, some rules and recommendations may have exceptions.

When it does not violate the general principles, after full consideration and with sufficient justification, you may appropriately deviate from the conventions in this specification. Exceptions undermine code consistency, so please try to avoid them. Exceptions to ‘Rules’ should be rare.

In the following situations, the principle of style consistency takes precedence: When modifying external open-source code or third-party code, you should follow the existing specifications of the open-source or third-party code to maintain a unified style.

2 Naming

General Naming

CamelCase: A mix of uppercase and lowercase letters with words joined together; each word is marked by capitalizing its first letter. Depending on whether the first letter of the whole name is capitalized, the style is either UpperCamelCase or lowerCamelCase.

| Type | Naming Style |
|---|---|
| Type definitions such as class types, struct types, enum types, union types, and scope names | UpperCamelCase |
| Functions (including global functions, scope functions, member functions) | UpperCamelCase |
| Global variables (including variables in global and namespace scopes, class static variables), local variables, function parameters, members of classes, structs, and unions | lowerCamelCase |
| Macros, constants (const), enum values, goto labels | ALL_UPPERCASE, with underscores |

Note: The constant in the table above refers to variables of basic data types, enums, or string types modified by const or constexpr in the global scope, namespace scope, or class static member scope. It does not include arrays and other types. The variable in the table above refers to all other variables except constant definitions, all using lowerCamelCase style.

File Naming

Rule 2.2.1 C++ files end with .cpp, header files end with .h

We recommend using .h as the suffix for header files, as this makes them directly compatible with C and C++. We recommend using .cpp as the suffix for implementation files, as this directly distinguishes C++ code from C code.

Currently, there are some other suffix conventions in the industry:

  • Header files: .hh, .hpp, .hxx
  • Implementation files: .cc, .cxx, .C

If your current project team uses a specific suffix, you can continue to use it, but please maintain a consistent style. However, for this document, we default to using .h and .cpp as suffixes.

Rule 2.2.2 C++ file names should be consistent with the class name

The names of C++ header and .cpp files should be consistent with the class name, using snake_case style.

If there is a class called DatabaseConnection, the corresponding file names should be:

  • database_connection.h
  • database_connection.cpp

File naming for structs, namespaces, enums, etc., is similar.
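
The mapping from a class name to its file names can be sketched as follows. `ToSnakeCaseFileName` and `CheckFileNames` are hypothetical helpers introduced only for this illustration; the sketch assumes an ASCII UpperCamelCase name and does not special-case acronyms.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: converts an ASCII UpperCamelCase class name to a snake_case file name.
// Assumption: every uppercase letter starts a new word (acronyms are not special-cased).
std::string ToSnakeCaseFileName(const std::string& className, const std::string& suffix)
{
    std::string fileName;
    for (const char ch : className) {
        const auto uch = static_cast<unsigned char>(ch);
        if (std::isupper(uch) != 0) {
            if (!fileName.empty()) {
                fileName += '_';
            }
            fileName += static_cast<char>(std::tolower(uch));
        } else {
            fileName += ch;
        }
    }
    return fileName + suffix;
}

// Usage sketch for the DatabaseConnection example above.
void CheckFileNames()
{
    assert(ToSnakeCaseFileName("DatabaseConnection", ".h") == "database_connection.h");
    assert(ToSnakeCaseFileName("DatabaseConnection", ".cpp") == "database_connection.cpp");
}
```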

Function Naming

Function names should uniformly use UpperCamelCase, generally in the form of a verb or a verb-object structure.

class List {
public:
    void AddElement(const Element& element);
    Element GetElement(const unsigned int index) const;
    bool IsEmpty() const;
};

namespace Utils {
    void DeleteUser();
}

Type Naming

Type names should use UpperCamelCase. All type names—classes, structs, unions, type definitions (typedef), enums—use the same convention, for example:

// classes, structs and unions
class UrlTable { ...
class UrlTableTester { ...
struct UrlTableProperties { ...
union Packet { ...

// typedefs
typedef std::map<std::string, UrlTableProperties*> PropertiesMap;

// enums
enum UrlTableErrors { ...

For namespace naming, it is recommended to use UpperCamelCase:

// namespace
namespace OsUtils {
 
namespace FileUtils {
     
}
 
}

Recommendation 2.4.1 Avoid abusing typedef or #define to alias basic types

Unless there is a clear necessity, do not use typedef/#define to redefine basic data types. Prefer using the basic types from the <cstdint> header:

| Signed Type | Unsigned Type | Description |
|---|---|---|
| int8_t | uint8_t | Exactly 8-bit signed/unsigned integer type |
| int16_t | uint16_t | Exactly 16-bit signed/unsigned integer type |
| int32_t | uint32_t | Exactly 32-bit signed/unsigned integer type |
| int64_t | uint64_t | Exactly 64-bit signed/unsigned integer type |
| intptr_t | uintptr_t | Signed/unsigned integer type sufficient to hold a pointer |
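
As a small illustration of using the <cstdint> types directly instead of project-specific aliases, the sketch below packs two 16-bit values into one 32-bit value. `PackPorts` and `CheckPackPorts` are hypothetical names chosen for this example only.

```cpp
#include <cassert>
#include <cstdint>

// Bad (for contrast): typedef unsigned int U32;  // The width is not visible and depends on the platform

// Hypothetical example: combines two 16-bit port numbers into one 32-bit value.
// The fixed-width types state the intended sizes directly in the signature.
std::uint32_t PackPorts(std::uint16_t sourcePort, std::uint16_t destPort)
{
    return (static_cast<std::uint32_t>(sourcePort) << 16) | destPort;
}

void CheckPackPorts()
{
    static_assert(sizeof(std::uint32_t) == 2 * sizeof(std::uint16_t), "32-bit type holds two 16-bit values");
    assert(PackPorts(0x1234, 0x5678) == 0x12345678U);
}
```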

Variable Naming

General variable naming uses lowerCamelCase, including global variables, function parameters, local variables, and member variables.

std::string tableName;  // Good: Recommended style
std::string tablename;  // Bad: Prohibited style
std::string path;       // Good: When there's only one word, lowerCamelCase is all lowercase

Rule 2.5.1 Global variables should be prefixed with ‘g_’, static variable naming does not need a special prefix

Global variables should be used as sparingly as possible, and special care should be taken when using them. Therefore, a prefix is added for visual prominence to encourage developers to be more cautious with these variables.

  • Global static variables are named the same as global variables.
  • Static variables within functions are named the same as ordinary local variables.
  • Class static member variables are named the same as ordinary member variables.

int g_activeConnectCount;

void Func()
{
    static int packetCount = 0; 
    ...
}

Rule 2.5.2 Class member variable names consist of lowerCamelCase with a trailing underscore

class Foo {
private:
    std::string fileName_;   // Add a trailing underscore, similar to K&R naming style
};

For member variables of structs/unions, the lowerCamelCase style without a trailing underscore is still used, consistent with local variable naming style.

Macro, Constant, and Enumeration Naming

Macros and enum values use ALL_UPPERCASE, with words connected by underscores. Within the global scope, named and anonymous namespaces, const constants, class static member constants, should be ALL_UPPERCASE with underscores. Local const constants in functions and ordinary const member variables of classes should use lowerCamelCase naming style.

#define MAX(a, b)   (((a) < (b)) ? (b) : (a)) // This is only an example of macro naming, using macros for such functionality is not recommended

enum TintColor {    // Note: the enum type name is UpperCamelCase, its values are ALL_UPPERCASE with underscores
    RED,
    DARK_RED,
    GREEN,
    LIGHT_GREEN
};

int Func(...)
{
    const unsigned int bufferSize = 100;    // Local constant in function
    char *p = new char[bufferSize];
    ...
}

namespace Utils {
    const unsigned int DEFAULT_FILE_SIZE_KB = 200;        // Global constant
}
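
The naming rules of this chapter can be seen side by side in the following sketch. Every identifier in it (`NetUtils`, `Endpoint`, `Connection`, `MAX_RETRY_COUNT`, and so on) is hypothetical and exists only to show the conventions together; it is not code from any real project.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

int g_activeConnectCount = 0;                   // Global variable: g_ prefix

namespace NetUtils {                            // Namespace: UpperCamelCase
const std::uint32_t MAX_RETRY_COUNT = 3;        // Namespace-scope constant: ALL_UPPERCASE

enum ConnectState {                             // Enum type: UpperCamelCase
    DISCONNECTED,                               // Enum values: ALL_UPPERCASE
    CONNECTED
};

struct Endpoint {                               // Struct members: lowerCamelCase, no trailing underscore
    std::string hostName;
    std::uint16_t portNumber;
};

class Connection {                              // Class type: UpperCamelCase
public:
    explicit Connection(const Endpoint& endpoint) : endpoint_(endpoint), state_(DISCONNECTED)
    {
        assert(!endpoint_.hostName.empty());
    }

    const std::string& GetHostName() const { return endpoint_.hostName; }

    bool TryConnect(std::uint32_t attemptCount) // Function: UpperCamelCase; parameter: lowerCamelCase
    {
        if (attemptCount > MAX_RETRY_COUNT) {
            return false;
        }
        state_ = CONNECTED;
        ++g_activeConnectCount;
        return true;
    }

private:
    Endpoint endpoint_;                         // Class members: lowerCamelCase with a trailing underscore
    ConnectState state_;
};
}  // namespace NetUtils
```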

3 Formatting

Line Width

Rule 3.1.1 Line width should not exceed 120 characters

It is recommended that the number of characters per line should not exceed 120. If it exceeds 120 characters, please choose a reasonable way to break the line.

Exceptions:

  • If a line of comment contains a command or URL longer than 120 characters, it can be kept on one line to facilitate copying, pasting, and searching with grep.
  • #include statements containing long paths can exceed 120 characters, but this should also be avoided as much as possible.
  • Preprocessor #error messages may exceed the line width. They are easier to read and understand on a single line, even if they exceed 120 characters.

#ifndef XXX_YYY_ZZZ
#error Header aaaa/bbbb/cccc/abc.h must only be included after xxxx/yyyy/zzzz/xyz.h, because xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#endif

Indentation

Rule 3.2.1 Use spaces for indentation, indenting 4 spaces at a time

Only use spaces for indentation, with each indent being 4 spaces. Using the Tab character for indentation is not allowed. Currently, almost all Integrated Development Environments (IDEs) support configuring the Tab key to automatically expand to 4 spaces. Please configure your IDE to support using spaces for indentation.

Braces

Rule 3.3.1 Use K&R indentation style

K&R style: When breaking lines, the left brace of a function (excluding lambda expressions) starts on a new line at the beginning of the line and occupies that line alone; other left braces follow the statement at the end of the line. The right brace occupies a line alone, unless it is followed by the remainder of the same statement, such as while in a do statement, else/else if in an if statement, a comma, or a semicolon.

For example:

struct MyType {     // Follows the statement at the end of the line, with a preceding space
    ...
};

int Foo(int a)
{                   // Function left brace on its own line at the beginning
    if (...) {
        ...
    } else {
        ...
    }
}

Reasons for recommending this style:

  • More compact code.
  • Compared with starting a new line, placing the left brace at the end of the line keeps the reading rhythm of the code more continuous.
  • Conforms to the habits of newer languages and mainstream industry practices.
  • Modern Integrated Development Environments (IDEs) have auxiliary features for displaying indentation and alignment. Placing braces at the end of a line does not affect the understanding of indentation and scope.

For empty function bodies, the braces can be placed on the same line:

class MyClass {
public:
    MyClass() : value_(0) {}
   
private:
    int value_;
};

Function Declarations and Definitions

Rule 3.4.1 The return type and function name of a declaration or definition should be on the same line; if the parameter list exceeds the line width, wrap it and align it reasonably

When declaring and defining a function, the function’s return type should be on the same line as the function name. If the line width allows, the function parameters should also be on one line; otherwise, the function parameters should be wrapped and reasonably aligned. The left parenthesis of the parameter list is always on the same line as the function name, not on a separate line; the right parenthesis always follows the last parameter.

Wrapping examples:

ReturnType FunctionName(ArgType paramName1, ArgType paramName2)   // Good: All on one line
{
    ...
}

ReturnType VeryVeryVeryLongFunctionName(ArgType paramName1,     // Line width not enough for all parameters, wrap
                                        ArgType paramName2,     // Good: Aligned with the parameter above
                                        ArgType paramName3)
{
    ...
}

ReturnType LongFunctionName(ArgType paramName1, ArgType paramName2, // Line width limit, wrap
    ArgType paramName3, ArgType paramName4, ArgType paramName5)     // Good: 4-space indent after wrap
{
    ...
}

ReturnType ReallyReallyReallyReallyLongFunctionName(            // Line width not enough for the first parameter, wrap directly
    ArgType paramName1, ArgType paramName2, ArgType paramName3) // Good: 4-space indent after wrap
{
    ...
}

Function Calls

Rule 3.5.1 The argument list of a function call should be on one line; when it exceeds the line width, wrap it and align the parameters reasonably

When calling a function, the argument list should be on one line. If the argument list exceeds the line width, it needs to be wrapped and the parameters should be reasonably aligned. The left parenthesis always follows the function name, and the right parenthesis always follows the last parameter.

Wrapping examples:

ReturnType result = FunctionName(paramName1, paramName2);   // Good: Function parameters on one line

ReturnType result = FunctionName(paramName1,
                                 paramName2,                // Good: Aligned with the parameter above
                                 paramName3);

ReturnType result = FunctionName(paramName1, paramName2,
    paramName3, paramName4, paramName5);                    // Good: Parameters wrapped, 4-space indent

ReturnType result = VeryVeryVeryLongFunctionName(           // Line width not enough for the first parameter, wrap directly
    paramName1, paramName2, paramName3);                    // After wrap, 4-space indent

If the parameters of a function call have an inherent relationship, prioritize comprehensibility over formatting requirements, grouping parameters reasonably for line breaks.

// Good: Each line's parameters represent a group of closely related data structures, placing them on one line for easier understanding
int result = DealWithStructureLikeParams(left.x, left.y,     // Represents a group of related parameters
                                         right.x, right.y);  // Represents another group of related parameters

if Statements

Rule 3.6.1 if statements must use braces

We require that all if statements use braces, even if there is only one statement.

Reasons:

  • The code logic is intuitive and easy to read.
  • It is less error-prone when adding new code to existing conditional statements.
  • It provides protection against errors when using function-like macros in if statements (if the macro definition omitted braces).

if (objectIsNotExist) {         // Good: Single-line conditional statements also use braces
    return CreateNewObject();
}
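
The macro-related reason above can be made concrete with a small sketch. `RESET_COUNTERS` is a hypothetical, deliberately careless two-statement macro, and the unbraced function is included only for contrast; none of these names come from the specification.

```cpp
#include <cassert>

// Hypothetical two-statement macro, written carelessly without do { ... } while (0).
#define RESET_COUNTERS(a, b) (a) = 0; (b) = 0

// Good: the braces keep both statements of the expanded macro inside the condition.
void ResetIfRequested(bool requested, int& first, int& second)
{
    if (requested) {
        RESET_COUNTERS(first, second);
    }
}

// Bad (for contrast only): after expansion, just "(first) = 0;" is guarded;
// "(second) = 0;" runs unconditionally.
void ResetIfRequestedUnbraced(bool requested, int& first, int& second)
{
    if (requested)
        RESET_COUNTERS(first, second);
}

void CheckMacroGuard()
{
    int first = 5;
    int second = 7;
    ResetIfRequested(false, first, second);
    assert(first == 5 && second == 7);      // Nothing was reset

    ResetIfRequestedUnbraced(false, first, second);
    assert(first == 5 && second == 0);      // second was reset although requested is false
}
```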

Rule 3.6.2 Do not write if/else/else if on the same line

In conditional statements, if there are multiple branches, they should be on different lines.

The correct way to write is as follows:

if (someConditions) {
    DoSomething();
    ...
} else {  // Good: else is on a different line from if
    ...
}

Here is a non-compliant example:

if (someConditions) { ... } else { ... } // Bad: else is on the same line as if

Loop Statements

Rule 3.7.1 Loop statements must use braces

Similar to if statements, we require that for/while loop bodies be enclosed in braces, even if the loop body is empty or contains only one statement.

for (int i = 0; i < someRange; i++) {   // Good: Braces are used
    DoSomething();
}
while (condition) { }   // Good: Empty loop body, using braces
while (condition) {
    continue;           // Good: continue indicates empty logic, using braces
}

Bad examples:

for (int i = 0; i < someRange; i++)
    DoSomething();      // Bad: Braces should be added
while (condition);      // Bad: A bare semicolon as the loop body is easy to overlook or misread

switch Statements

Rule 3.8.1 The case/default blocks in a switch statement should be indented by one level

The indentation style for a switch statement is as follows:

switch (var) {
    case 0:             // Good: Indented
        DoSomething1(); // Good: Indented
        break;
    case 1: {           // Good: Format with braces
        DoSomething2();
        break;
    }
    default:
        break;
}

switch (var) {
case 0:                 // Bad: case not indented
    DoSomething();
    break;
default:                // Bad: default not indented
    break;
}

Expressions

Recommendation 3.9.1 When wrapping expressions, maintain consistency and place operators at the end of the line

For long expressions that do not meet the line width requirement, you need to break the line at an appropriate location. Generally, break after lower-precedence operators or connectors, placing the operator or connector at the end of the line. Placing operators and connectors at the end of the line indicates “not finished, more to follow.” Example:

// Assume the first line below does not meet the line width requirement

if ((currentValue > threshold) &&  // Good: After wrapping, the logical operator is at the end of the line
    someCondition) {
    DoSomething();
    ...
}

int result = reallyReallyLongVariableName1 +    // Good
             reallyReallyLongVariableName2;

After wrapping an expression, pay attention to maintaining reasonable alignment or a 4-space indent. Refer to the examples below.

int sum = longVariableName1 + longVariableName2 + longVariableName3 +
    longVariableName4 + longVariableName5 + longVariableName6;         // Good: 4-space indent

int sum = longVariableName1 + longVariableName2 + longVariableName3 +
          longVariableName4 + longVariableName5 + longVariableName6;   // Good: Aligned

Variable Assignment

Rule 3.10.1 Multiple variable definitions and assignment statements are not allowed on the same line

Having only one variable initialization statement per line makes it easier to read and understand.

int maxCount = 10;
bool isCompleted = false;

Here is a non-compliant example:

int maxCount = 10; bool isCompleted = false; // Bad: Multiple variable initializations should be on separate lines, one per line
int x, y = 0;  // Bad: Multiple variable definitions should be on separate lines, one per line

int pointX;
int pointY;
...
pointX = 1; pointY = 2;  // Bad: Multiple variable assignment statements on the same line

Exception: Multiple variables can be declared and initialized in for loop headers, if initialization statements (C++17), and structured binding statements (C++17). The variable declarations in these statements are strongly related. Forcibly splitting them into multiple lines can cause issues like inconsistent scope and separation of declaration and initialization.
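
The three exceptions can be sketched as follows; the two C++17 constructs require a C++17 compiler. The function names (`CountOrZero`, `SumCounts`, `SumOuterPairs`) and the data are hypothetical and serve only as illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical lookup used only for illustration.
int CountOrZero(const std::map<std::string, int>& wordCounts, const std::string& word)
{
    // if statement with initializer (C++17): iter is scoped to the if statement only.
    if (auto iter = wordCounts.find(word); iter != wordCounts.end()) {
        return iter->second;
    }
    return 0;
}

int SumCounts(const std::map<std::string, int>& wordCounts)
{
    int total = 0;
    // Structured binding (C++17): word and count are declared together from one pair.
    for (const auto& [word, count] : wordCounts) {
        static_cast<void>(word);    // Unused here; avoids unused-variable warnings on some compilers
        total += count;
    }
    return total;
}

// Sums elements pairwise from both ends; the middle element of an odd-sized array is skipped.
int SumOuterPairs(const int* values, int size)
{
    assert(values != nullptr && size >= 0);
    int total = 0;
    // for loop header: both indices are declared and initialized together.
    for (int left = 0, right = size - 1; left < right; ++left, --right) {
        total += values[left] + values[right];
    }
    return total;
}
```

Splitting `left` and `right` into separate declarations before the loop would widen their scope beyond the loop, which is the kind of scope inconsistency the exception is meant to avoid.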

Initialization

Initialization includes the initialization of structs, unions, and arrays.

Rule 3.11.1 When wrapping initialization, use indentation and align reasonably

When initializing structs or arrays, if wrapping is required, maintain a 4-space indent. Choose the wrapping point and alignment position with readability in mind.

const int rank[] = {
    16, 16, 16, 16, 32, 32, 32, 32,
    64, 64, 64, 64, 32, 32, 32, 32
};

Pointers and References

Recommendation 3.12.1 For pointer types, place “*” next to either the type or the variable name; do not put spaces on both sides of it or on neither side

Pointer declarations: * may be attached to the type on its left or to the variable name on its right, but it must not have spaces on both sides or on neither side.

int* p = nullptr;  // Good
int *p = nullptr;  // Good

int*p = nullptr;   // Bad
int * p = nullptr; // Bad

Exception: When the pointer itself is const-qualified, “*” cannot be placed next to the variable name. In this case, do not place it next to the type either.

const char * const VERSION = "V100";

Recommendation 3.12.2 For reference types, place “&” next to either the type or the variable name; do not put spaces on both sides of it or on neither side

Reference declarations: & may be attached to the type on its left or to the variable name on its right, but it must not have spaces on both sides or on neither side.

int i = 8;
int* pi = &i;

int& r = i;     // Good
int &r = i;     // Good
int*& rp = pi;  // Good, reference to a pointer, *& follows the type
int *&rp = pi;  // Good, reference to a pointer, *& follows the variable name
int* &rp = pi;  // Good, reference to a pointer, * follows the type, & follows the variable name

int & r = i;    // Bad
int&r = i;      // Bad

Preprocessor Directives

Rule 3.13.1 The “#” of preprocessor directives should be uniformly placed at the beginning of the line. For nested preprocessor directives, “#” can be indented

The “#” of preprocessor directives should be uniformly placed at the beginning of the line. Even if the preprocessor code is embedded in a function body, “#” should be placed at the beginning of the line.
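
A minimal sketch of both placements follows. `EXAMPLE_LARGE_BUFFER` and `EXAMPLE_HUGE_BUFFER` are hypothetical switches that are not defined here, so only the final branch is compiled; the function names and sizes are likewise illustrative.

```cpp
#include <cassert>

int GetBufferSize()
{
#if defined(EXAMPLE_LARGE_BUFFER)       // "#" at the start of the line, even inside a function body
    #if defined(EXAMPLE_HUGE_BUFFER)    // Nested directive: "#" may be indented
    return 65536;
    #else
    return 4096;
    #endif
#else
    return 1024;
#endif
}

void CheckBufferSize()
{
    assert(GetBufferSize() == 1024);    // Neither hypothetical switch is defined in this sketch
}
```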

Rule 3.13.2 Avoid using macros

Macros ignore scope, the type system, and various rules, which can easily cause problems. The use of macro definitions should be avoided as much as possible. If macros must be used, ensure the uniqueness of the macro name. In C++, there are many ways to avoid using macros:

  • Use const or enum to define understandable constants.
  • Use namespace to avoid name conflicts.
  • Use inline functions to avoid the overhead of function calls.
  • Use template functions to handle multiple types.

Macros can be used in necessary scenarios such as header guard macros, conditional compilation, and logging.

Rule 3.13.3 Do not use macros to represent constants

A macro is a plain text replacement performed in the preprocessing stage. When a runtime error occurs, the report shows only the substituted value; a debugger likewise shows the value, not the macro name. Macros have no type checking, which makes them unsafe, and they have no scope.
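
As a sketch of the alternative, a typed constant and an enum inside a namespace can replace such macros. The `Protocol` namespace and all names in it are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Bad (for contrast): #define MAX_MSG_LEN 256   // No type, no scope

namespace Protocol {
// Good: the constant has a type and is confined to its namespace.
constexpr std::uint32_t MAX_MSG_LEN = 256;

enum MsgType : std::uint8_t {
    HEARTBEAT = 1,
    DATA = 2
};

bool IsValidLength(std::uint32_t length)
{
    return length <= MAX_MSG_LEN;
}

std::uint32_t RemainingCapacity(std::uint32_t usedLength)
{
    assert(usedLength <= MAX_MSG_LEN);
    return MAX_MSG_LEN - usedLength;
}
}  // namespace Protocol
```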

Rule 3.13.4 Do not use function-like macros

Before defining a function-like macro, consider whether it can be replaced by a function. For replaceable scenarios, it is recommended to use a function instead of a macro. The disadvantages of function-like macros are as follows:

  • Function-like macros lack type checking and are not as strict as function calls.
  • Macro arguments are substituted as text rather than evaluated once, so an argument with side effects may be evaluated several times and produce unexpected results.
  • Macros do not have an independent scope.
  • Macros rely on tricky constructs, such as the # operator and pervasive parentheses, which hurt readability.
  • In specific scenarios, compiler-specific macro extension syntax must be used, such as GCC’s statement expression, which affects portability.
  • After macros are expanded in the preprocessing stage, they are invisible during subsequent compilation, linking, and debugging. Moreover, multi-line macros are expanded into a single line. Function-like macros are therefore hard to debug and to set breakpoints in, which makes problems harder to locate.
  • For macros containing many statements, they are expanded at every call point. If there are many call points, it can cause code space bloat.

Functions do not have the above disadvantages of macros. However, the biggest disadvantage of functions compared to macros is lower execution efficiency (function call overhead and more difficulty for compiler optimization). To address this, you can use inline functions when necessary. Inline functions are similar to macros in that they are also expanded at the call point. The difference is that inline functions are expanded by the compiler at compile time, whereas macros are expanded by the preprocessor.

Inline functions combine the advantages of both functions and macros:

  • Inline functions perform strict type checking.
  • Parameters of inline functions are evaluated only once.
  • Inline functions are expanded in place, with no function call overhead.
  • Inline functions can be optimized better than ordinary functions.

For performance-critical product code, consider using inline functions instead of ordinary functions.
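
The difference in argument evaluation can be observed with a small sketch. `MAX_MACRO`, `MaxValue`, `NextValue`, and `g_callCount` are hypothetical names for this illustration; the macro has the same shape as the MAX example in the naming chapter.

```cpp
#include <cassert>

// Function-like macro, shown only to demonstrate the pitfall.
#define MAX_MACRO(a, b) (((a) < (b)) ? (b) : (a))

// Inline function template: type-checked, each argument is evaluated exactly once.
template<typename T>
inline T MaxValue(T a, T b)
{
    return (a < b) ? b : a;
}

// Hypothetical helper with a visible side effect: counts how often it is called.
int g_callCount = 0;

int NextValue()
{
    ++g_callCount;
    return g_callCount;
}

void CheckEvaluationCount()
{
    g_callCount = 0;
    int viaMacro = MAX_MACRO(NextValue(), 0);       // NextValue() appears twice after expansion
    assert(viaMacro == 2);
    assert(g_callCount == 2);

    g_callCount = 0;
    int viaFunction = MaxValue(NextValue(), 0);     // NextValue() is evaluated once
    assert(viaFunction == 1);
    assert(g_callCount == 1);
}
```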

Exception: In logging scenarios, it is necessary to retain information about the call point’s filename (__FILE__), line number (__LINE__), etc., through function-like macros.
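
One possible shape for such a logging macro is sketched below: the macro only captures the call site and forwards to an ordinary function, so the real work stays type-checked. `FormatLogRecord` and `LOG_RECORD` are hypothetical names, not an API defined by this specification.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: an ordinary function, so its arguments are type-checked.
inline std::string FormatLogRecord(const char* file, int line, const std::string& message)
{
    assert(file != nullptr);
    std::ostringstream stream;
    stream << file << ":" << line << " " << message;
    return stream.str();
}

// The macro exists only to capture __FILE__ and __LINE__ at the call point.
#define LOG_RECORD(message) FormatLogRecord(__FILE__, __LINE__, (message))
```

A call such as `LOG_RECORD("disk full")` yields a string that starts with the file name and line number of the caller, followed by the message.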

Spaces and Blank Lines

Rule 3.14.1 Horizontal spaces should highlight keywords and important information, avoiding unnecessary whitespace

Horizontal spaces should highlight keywords and important information. Do not leave trailing spaces at the end of a line of code. The general rules are as follows:

  • Add a space after keywords like if, switch, case, do, while, for, etc.
  • Do not add spaces on either side inside parentheses.
  • Be consistent about whether spaces are placed on the inside of braces.
  • Do not add a space after unary operators (& * + - ~ !).
  • Add spaces on both sides of binary operators (= + - < > * / % | & ^ <= >= == !=).
  • Both sides of the ternary operator (? :) need spaces.
  • No space between pre/post-increment/decrement (++ --) and the variable.
  • No spaces before or after the structure member operator (. ->).
  • No space before the comma (,), add a space after it.
  • Do not add spaces between the angle brackets (<>) of templates or type casts and the type.
  • Do not add spaces before or after the scope operator (::).
  • Whether to add spaces before and after the colon (:) depends on the situation.

General cases:

void Foo(int b) {  // Good: There should be a space before the opening brace

int i = 0;  // Good: When initializing a variable, there should be spaces around =, no space before the semicolon

int buf[BUF_SIZE] = {0};    // Good: No spaces on either side inside the braces

Function definitions and calls:

int result = Foo(arg1,arg2);
                    ^    // Bad: A space is needed after the comma

int result = Foo( arg1, arg2 );
                 ^          ^  // Bad: There should be no space after the left parenthesis of the function parameter list, and no space before the right parenthesis

Pointers and address-of

x = *p;     // Good: No space between the * operator and the pointer p
p = &x;     // Good: No space between the & operator and the variable x
x = r.y;    // Good: No space when accessing member variables through .
x = r->y;   // Good: No space when accessing member variables through ->

Operators:

x = 0;   // Good: Add spaces on both sides of the assignment operator =
x = -5;  // Good: No space between the negative sign and the number
++x;     // Good: No space between pre/post ++/-- and the variable
x--;

if (x && !y)  // Good: Add spaces around Boolean operators, no space between ! and the variable
v = w * x + y / z;  // Good: Add spaces around binary operators
v = w * (x + z);    // Good: No spaces needed before and after the expression inside parentheses

int a = (x < y) ? x : y;  // Good: For ternary operators, add spaces around ? and :

Loops and conditional statements:

if (condition) {  // Good: Add a space between the if keyword and the parenthesis, no spaces inside the parenthesis around the condition
    ...
} else {           // Good: Add a space between the else keyword and the brace
    ...
}

while (condition) {}   // Good: Add a space between the while keyword and the parenthesis, no spaces inside the parenthesis around the condition

for (int i = 0; i < someRange; ++i) {  // Good: Add a space between the for keyword and the parenthesis, add a space after the semicolon
    ...
}

switch (condition) {  // Good: 1 space after the switch keyword
    case 0:     // Good: No space between the case condition and the colon
        ...
        break;
    ...
    default:
        ...
        break;
}

Templates and casts

// Angle brackets (< and >) are not adjacent to spaces, no space before <, and no space between > and (.
vector<string> x;
y = static_cast<char*>(x);

// It's also okay to leave a space between the type and the pointer operator, but be consistent.
vector<char *> x;

Scope operator

std::cout;    // Good: For namespace access, do not leave a space

int MyClass::GetValue() const {}  // Good: For member function definitions, do not leave a space

Colon

// Scenarios where spaces are added

// Good: Spaces are required for class derivation
class Sub : public Base {
   
};

// Constructor initialization list requires spaces
MyClass::MyClass(int var) : someVar_(var)
{
    DoSomething();
}

// Bit field representation also has spaces
struct XX {
    char a : 4;
    char b : 5;
    char c : 4;
};

// Scenarios where spaces are not added

// Good: For class access rights like public:, private:, no space is needed after the colon
class MyClass {
public:
    MyClass(int var);
private:
    int someVar_;
};

// For the colon after case and default in switch-case, no space is needed
switch (value) {
    case 1:
        DoSomething();
        break;
    default:
        break;
}

Note: Current Integrated Development Environments (IDEs) can be configured to remove trailing spaces. Please configure this correctly.

Recommendation 3.14.1 Arrange blank lines reasonably to keep code compact

Reducing unnecessary blank lines can display more code and make it easier to read. Here are some recommended rules to follow:

  • Arrange blank lines based on the relevance of the context.
  • Do not use consecutive blank lines inside functions, type definitions, macros, or initialization expressions.
  • Do not use 3 or more consecutive blank lines.
  • Do not add blank lines before the first line or after the last line of a code block inside braces, but this is not required for the braces of a namespace.

int Foo()
{
    ...
}



int Bar()  // Bad: Use a maximum of 2 consecutive blank lines.
{
    ...
}


if (...) {
        // Bad: Do not add a blank line at the beginning of a code block inside braces
    ...
        // Bad: Do not add a blank line at the end of a code block inside braces
}

int Foo(...)
{
        // Bad: Do not add a blank line at the beginning of a function body
    ...
}

Classes

Rule 3.15.1 Declare class access control blocks in the order public:, protected:, private:, each aligned with the class keyword (no extra indentation)

class MyClass : public BaseClass {
public:      // Note: no indentation
    MyClass();  // Standard 4-space indent
    explicit MyClass(int var);
    ~MyClass() {}

    void SomeFunction();
    void SomeFunctionThatDoesNothing()
    {
    }

    void SetVar(int var) { someVar_ = var; }
    int GetVar() const { return someVar_; }

private:
    bool SomeInternalFunction();

    int someVar_;
    int someOtherVar_;
};

Within each section, it is recommended to group similar declarations together and in the following order: types (including typedef, using, and nested structs and classes), constants, factory functions, constructors, assignment operators, destructors, other member functions, and data members.

Rule 3.15.2 Constructor initialization lists should fit on one line, or be wrapped with a four-space indent and aligned across lines

// If all variables can be on the same line:
MyClass::MyClass(int var) : someVar_(var)
{
    DoSomething();
}

// If they cannot be on the same line,
// wrap before the colon and indent the initialization list by 4 spaces
MyClass::MyClass(int var)
    : someVar_(var), someOtherVar_(var + 1)  // Good: space after the comma
{
    DoSomething();
}

// If the initialization list needs to be on multiple lines, each line should be aligned
MyClass::MyClass(int var)
    : someVar_(var),             // 4-space indent
      someOtherVar_(var + 1)
{
    DoSomething();
}

4 Comments

Generally, try to improve code readability through clear architectural logic and good symbol naming; use comments to supplement when necessary. Comments are to help the reader quickly understand the code, so they should be written from the reader’s perspective and on-demand.

Comment content should be concise, clear, unambiguous, comprehensive, and not redundant.

Comments are as important as the code. When writing comments, think from the reader’s perspective and give the information the reader actually needs at that point. Comment at the level of function and intent: explain the intent that the code cannot express, and do not repeat what the code already says. When modifying code, keep the related comments consistent with it. Changing code without updating its comments is unprofessional: it breaks the consistency between code and comments and can confuse or even mislead the reader.

Use English for comments.

Comment Style

In C++ code, both /* */ and // are acceptable. Based on the purpose and location of the comment, comments can be divided into different types, such as file header comments, function header comments, code comments, etc. Comments of the same type should maintain a consistent style.

Note: In the example code in this document, the extensive use of trailing comments with // is only for a more precise description of the problem and does not mean that this comment style is better.

File Header Comments

/*
 * Copyright (c) 2020 XXX
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

Function Header Comments

Rule 4.3.1 Public (public) functions must have function header comments

Public functions are the external interfaces provided by a class. Callers need to understand the function’s functionality, parameter value ranges, return results, and precautions to use it correctly. Information such as parameter value ranges, return results, and precautions cannot be self-explanatory, so function header comments are needed for supplementary explanation.

Rule 4.3.2 Do not use empty, formatted function header comments

Not all functions need function header comments. For information that cannot be expressed by the function signature, add function header comments for supplementary explanation.

Function header comments should be uniformly placed above the function declaration or definition, using one of the following styles:

Use // for function headers

// Single-line function header
int Func1(void);

// Multi-line function header
// Second line
int Func2(void);

Use /* */ for function headers

/* Single-line function header */
int Func1(void);

/*
 * Another single-line function header
 */
int Func2(void);

/*
 * Multi-line function header
 * Second line
 */
int Func3(void);

Functions should be self-commenting through their names as much as possible, and function header comments should be written on demand. Do not write useless or redundant function headers; do not write empty, formatted function headers.

The content of function header comments is optional and may include, but is not limited to: functional description, return value, performance constraints, usage, memory conventions, algorithm implementation, reentrancy requirements, etc. For function interface declarations in a module's external header files, the function header comments should clearly express important and useful information.

Example:

/*
 * Returns the actual number of bytes written, -1 indicates write failure.
 * Note: The caller is responsible for freeing the memory buf.
 */
int WriteString(const char *buf, int len);

Bad example:

/*
 * Function name: WriteString
 * Function: Write a string
 * Parameters:
 * Return value:
 */
int WriteString(const char *buf, int len);

Problems with the above example:

  • Parameters and return value have format but no content.
  • Function name information is redundant.
  • It is not clearly stated who is responsible for freeing buf.

Code Comments

Rule 4.4.1 Code comments should be placed above or to the right of the corresponding code

Rule 4.4.2 There should be 1 space between the comment symbol and the comment content; right-aligned comments should have at least 1 space from the preceding code

Comments above the code should maintain the same indentation as the corresponding code. Choose and consistently use one of the following styles:

Use //

// This is a single-line comment
DoSomething();

// This is a multi-line comment
// Second line
DoSomething();

Use /* */

/* This is a single-line comment */
DoSomething();

/*
 * Another style of multi-line comment
 * Second line
 */
DoSomething();

For comments to the right of the code, leave at least 1 space between the comment and the code. It is recommended not to exceed 4 spaces. With tabs expanded to spaces, pressing the TAB key usually produces a gap of 1-4 spaces.

Choose and consistently use one of the following styles:

int foo = 100;  // Comment on the right
int bar = 200;  /* Comment on the right */

For right-aligned format, top-and-bottom alignment can be more aesthetically pleasing when appropriate. For aligned comments, the line closest to the code on the left should maintain a 1-4 space interval. Example:

const int A_CONST = 100;         /* For related similar comments, consider top-and-bottom alignment */
const int ANOTHER_CONST = 200;   /* When aligned, maintain spacing from the code on the left */

When a right-aligned comment exceeds the line width, consider placing the comment above the code.

Rule 4.4.3 Unused code segments should be deleted directly, not commented out

Commented-out code cannot be maintained normally. When an attempt is made to reuse this code, it is highly likely to introduce easily overlooked defects. The correct approach is to delete the code directly if it is not needed. If it is needed again, consider porting or rewriting the code.

The commented-out code mentioned here includes using /* */ and //, as well as #if 0, #ifdef NEVER_DEFINED, etc.

5 Header Files

Header File Responsibilities

Header files are the external interfaces of a module or file, and the design of header files reflects most of the system design. Header files are suitable for placing interface declarations, not implementations (except for inline functions). Functions, macros, enums, struct definitions, etc., that are only needed internally in a .cpp file should not be placed in header files. Header files should have a single responsibility. Overly complex header files and dependencies are a major cause of long compilation times.

Recommendation 5.1.1 Every .cpp file should have a corresponding .h file for declaring classes and interfaces that need to be made public

Typically, each .cpp file has a corresponding .h file for placing function declarations, macro definitions, type definitions, etc., that are provided externally. If a .cpp file does not need to expose any interfaces, then it should not exist. Exception: Program entry points (e.g., files containing the main function), unit test code, dynamic library code.

Example:

// Foo.h

#ifndef FOO_H
#define FOO_H

class Foo {
public:
    Foo();
    void Fun();
   
private:
    int value_;
};

#endif

// Foo.cpp
#include "Foo.h"

namespace { // Good: Declarations of internal functions are placed at the top of the .cpp file and declared in an anonymous namespace or static to limit their scope
    void Bar()
    {
    }
}

...

void Foo::Fun()
{
    Bar();
}

Header File Dependencies

Rule 5.2.1 Circular dependencies between header files are forbidden

A circular dependency between header files means that a.h includes b.h, b.h includes c.h, and c.h includes a.h. Any modification to one of these header files then causes all code that includes a.h, b.h, or c.h to be recompiled. With a one-way dependency, e.g., a.h includes b.h, b.h includes c.h, and c.h does not include any header files, modifying a.h does not cause source code that includes only b.h or c.h to be recompiled. Circular dependencies between header files directly reflect unreasonable architectural design and can be avoided by optimizing the architecture.

Rule 5.2.2 Header files must have #define guards to prevent multiple inclusions

To prevent header files from being included multiple times, all header files should use #define guards; do not use #pragma once.

When defining guard symbols, the following rules should be observed:

  1. The guard symbol uses a unique name.
  2. Do not place code or comments before or after the protected part, except for file header comments.

Example: Assuming the timer module’s timer.h is in the directory timer/include/timer.h, it should be protected as follows:

#ifndef TIMER_INCLUDE_TIMER_H
#define TIMER_INCLUDE_TIMER_H
...
#endif

Rule 5.2.3 Do not reference external function interfaces or variables through declarations

Only use interfaces provided by other modules or files by including their header files. Using external function interfaces or variables through extern declarations can easily lead to inconsistencies between declarations and definitions when the external interface changes. At the same time, such implicit dependencies can easily lead to architectural decay.

Non-compliant example:

// Content of a.cpp

extern int Fun();   // Bad: Using external functions via extern

void Bar()
{
    int i = Fun();
    ...
}

// Content of b.cpp

int Fun()
{
    // Do something
    return 0;
}

Should be changed to:

// Content of a.cpp

#include "b.h"   // Good: Use the interface provided by other .cpp files by including the header file

void Bar()
{
    int i = Fun();
    ...
}

// Content of b.h

int Fun();

// Content of b.cpp

int Fun()
{
    // Do something
    return 0;
}

Exception: In some scenarios where you need to reference an internal function but do not want to modify the code, you can reference it via an extern declaration. For example, when unit testing a specific internal function, you can reference the function under test through an extern declaration; when you need to stub or patch a specific function, an extern declaration of that function is also allowed.

Rule 5.2.4 Do not include header files within extern "C"

Including header files within extern "C" can lead to nesting of extern "C". Some compilers have limits on the nesting level of extern "C", and too many levels can cause compilation errors.

In C/C++ mixed programming, including a header file within extern "C" may corrupt the original intent of the included header file, for example, by incorrectly changing the linkage specification.

Example: Given two header files, a.h and b.h:

// Content of a.h

...
#ifdef __cplusplus
void Foo(int);
#define A(value) Foo(value)
#else
void A(int);
#endif

// Content of b.h

...
#ifdef __cplusplus
extern "C" {
#endif

#include "a.h"
void B();

#ifdef __cplusplus
}
#endif

Expanding b.h with the C++ preprocessor will result in

extern "C" {
    void Foo(int);
    void B();
}

According to the original intent of the author of a.h, the function Foo is a C++ free function with "C++" linkage. However, in b.h, because #include "a.h" is placed inside extern "C", the linkage specification of the function Foo is incorrectly changed to "C" linkage.

Exception: In a C++ compilation environment, you may need to include pure C header files that have no extern "C" decoration of their own. In that case, a non-intrusive approach is to include the C header file within extern "C".
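
A minimal sketch of this exception is shown below. The header name legacy_math.h and the function legacy_add are hypothetical. To keep the example self-contained, the C declaration is written out where the #include would normally appear, and the definition stands in for the compiled C library.

```cpp
// Hypothetical pure C header "legacy_math.h" contains only:
//     int legacy_add(int a, int b);
// It has no extern "C" decoration of its own, so the C++ side wraps the include.
extern "C" {
// #include "legacy_math.h"    // Real code would include the C header here
int legacy_add(int a, int b);  // Written out so that this sketch is self-contained
}

// Stand-in for the C library's definition; it also has C linkage.
extern "C" int legacy_add(int a, int b)
{
    return a + b;
}

int AddViaLegacyLibrary(int a, int b)
{
    return legacy_add(a, b);  // Resolves to the unmangled C symbol legacy_add
}
```

The C header itself stays untouched, which is why this approach is described as non-intrusive.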

Recommendation 5.2.1 Avoid using forward declarations; instead, use #include to include header files

A forward declaration usually refers to a pure declaration of a class or template, without its definition.

  • Advantages:
    1. Forward declarations can save compilation time. Unnecessary #include forces the compiler to expand more files and process more input.
    2. Forward declarations can save time on unnecessary recompilation. #include can cause code to be recompiled multiple times due to unrelated changes in the header file.
  • Disadvantages:
    1. Forward declarations hide dependencies. When a header file is changed, user code may skip necessary recompilation.
    2. Forward declarations can be broken by subsequent library changes. Forward declaring templates can sometimes prevent header file developers from changing their API. For example, expanding a parameter type or adding a template parameter with a default value.
    3. The behavior is undefined when forward declaring symbols from the std:: namespace (as explicitly stated in the C++11 standard).
    4. Forward declaring many symbols from a header file can be more verbose than a single line of include.
    5. Refactoring code just to allow a forward declaration (e.g., using pointer members instead of object members) can make the code slower and more complex.
    6. It is difficult to judge when to use a forward declaration and when to use #include. In some scenarios, swapping a forward declaration for an #include can lead to unexpected results.

Therefore, we should avoid using forward declarations as much as possible and use #include to include header files to ensure dependencies are clear.

6 Scope

Namespaces

Recommendation 6.1.1 For variables, constants, or functions in a .cpp file that do not need to be exported, please use an anonymous namespace or the static modifier

In the C++ 2003 standard, using the static modifier for file-scope variables, functions, etc., is marked as a deprecated feature (the deprecation was removed again in C++11), so using an anonymous namespace is preferred.

Main reasons are as follows:

  1. static has been given too many meanings in C++: static member variables, static member functions, static global variables, static function-local variables, each with special handling.
  2. static can only guarantee file scope for variables, constants, and functions, but a namespace can also encapsulate types, etc.
  3. Unify scope management in C++ using namespaces, without needing to use both static and namespace for management.
  4. In C++03, functions modified by static have internal linkage and cannot be used as template arguments, while functions in anonymous namespaces can. C++11 lifted this restriction.

However, do not use anonymous namespaces or static in .h files.

// Foo.cpp

namespace {
    const int MAX_COUNT = 20;
    void InternalFun() {}
}

void Foo::Fun()
{
    int i = MAX_COUNT;
   
    InternalFun();
}

Rule 6.1.1 Do not use using to import namespaces in header files or before #include

Description: Using using to import a namespace affects subsequent code and can easily cause symbol conflicts. Therefore, do not use using to import namespaces in header files or before #include in source files. Example:

// Header file a.h
namespace NamespaceA {
    int Fun(int);
}

// Header file b.h
namespace NamespaceB {
    int Fun(int);
}

using namespace NamespaceB;

void G()
{
    Fun(1);
}

// Source code a.cpp
#include "a.h"
using namespace NamespaceA;
#include "b.h"

int main()
{
    G(); // using namespace NamespaceA is before #include "b.h", causing ambiguity: call to NamespaceA::Fun, NamespaceB::Fun is ambiguous
}

Using using in a header file to import a single symbol or to define an alias is allowed within a module's custom namespace but prohibited in the global namespace.

// foo.h

#include <fancy/string>
using fancy::string;  // Bad, prohibited from importing symbols into the global namespace

namespace Foo {
    using fancy::string;  // Good, symbols can be imported in a module's custom namespace
    using MyVector = fancy::vector<int>;  // Good, C++11 allows defining aliases in a custom namespace
}

Global Functions and Static Member Functions

Description: Placing non-member functions in a namespace avoids polluting the global scope. Also, do not simply manage global functions with a class + static member methods. If a global function is closely related to a certain class, it can be a static member function of that class.

If you need to define some global functions for use by a specific .cpp file, please use an anonymous namespace to manage them.

namespace MyNamespace {
    int Add(int a, int b);
}

class File {
public:
    static File CreateTempFile(const std::string& fileName);
};

Global Constants and Static Member Constants

Description: Placing global constants in a namespace avoids polluting the global scope. Also, do not simply manage global constants with a class + static member constants. If a global constant is closely related to a certain class, it can be a static member constant of that class.

If you need to define some global constants for use only by a specific .cpp file, please use an anonymous namespace to manage them.

namespace MyNamespace {
    const int MAX_SIZE = 100;
}

class File {
public:
    static const std::string SEPARATOR;
};
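
The declaration of File::SEPARATOR above still needs a definition in exactly one .cpp file. The following is a minimal sketch of how the pieces could fit together; the separator value "/", the constant RETRY_LIMIT, and the function ClampRetries are assumptions made for illustration.

```cpp
#include <string>

// File.h
class File {
public:
    static const std::string SEPARATOR;  // Declaration only
};

// File.cpp
const std::string File::SEPARATOR = "/";  // Single definition; the value is an assumption

namespace {
    const int RETRY_LIMIT = 3;  // Needed only in this .cpp file, so it stays file-local
}

int ClampRetries(int requested)
{
    return requested > RETRY_LIMIT ? RETRY_LIMIT : requested;
}
```

RETRY_LIMIT is invisible to other translation units, while File::SEPARATOR is reachable only through the class it belongs to.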

Global Variables

Recommendation 6.4.1 Avoid using global variables; consider using the singleton pattern

Description: Global variables can be read and modified from anywhere, which leads to data coupling between business logic and the global variable.

int g_counter = 0;

// a.cpp
g_counter++;

// b.cpp
g_counter++;

// c.cpp
std::cout << g_counter << std::endl;

Using the singleton pattern

class Counter {
public:
    static Counter& GetInstance()
    {
        static Counter counter;
        return counter;
    }  // Simple example of singleton implementation
   
    void Increase()
    {
        value_++;
    }
   
    void Print() const
    {
        std::cout << value_ << std::endl;
    }

private:
    Counter() : value_(0) {}

private:
    int value_;
};

// a.cpp
Counter::GetInstance().Increase();

// b.cpp
Counter::GetInstance().Increase();

// c.cpp
Counter::GetInstance().Print();

After implementing the singleton pattern, there is a globally unique instance, which has the same effect as a global variable, and the singleton provides better encapsulation.

Exception: Sometimes the scope of a global variable is only within a module. In this case, there will be multiple instances of the global variable in the process space, with each module holding one. This scenario cannot be solved by the singleton pattern.

7 Classes

Constructors, Copy Constructors, Assignment, and Destructors

Constructors, copy, move, and destructors provide methods for object lifecycle management:

  • Constructor: X()
  • Copy constructor: X(const X&)
  • Copy assignment operator: operator=(const X&)
  • Move constructor: X(X&&) Available since C++11
  • Move assignment operator: operator=(X&&) Available since C++11
  • Destructor: ~X()

Rule 7.1.1 Class member variables must be explicitly initialized

Description: If a class has member variables but does not define any constructor, the compiler automatically generates a default constructor. However, the compiler-generated constructor does not initialize member variables of built-in types, leaving the object's state indeterminate.

Exception:

  • If the class’s member variables have default constructors, then explicit initialization is not necessary.

Example: The following code has no constructor, so the private data members of built-in types are not initialized:

class Message {
public:
    void ProcessOutMsg()
    {
        //…
    }

private:
    unsigned int msgID_;
    unsigned int msgLength_;
    unsigned char* msgBuffer_;
    std::string someIdentifier_;
};

Message message;   // message's member variables are not initialized
message.ProcessOutMsg();   // Subsequent use has hidden risks

// Therefore, it is necessary to define a default constructor, as follows:
class Message {
public:
    Message() : msgID_(0), msgLength_(0), msgBuffer_(nullptr)
    {
    }

    void ProcessOutMsg()
    {
        // …
    }

private:
    unsigned int msgID_;
    unsigned int msgLength_;
    unsigned char* msgBuffer_;
    std::string someIdentifier_; // Has a default constructor, no need for explicit initialization
};

Recommendation 7.1.1 Prioritize in-class initialization (C++11) and constructor initializer lists for member variables

Description: C++11's in-class initialization makes the member's initial value clear at a glance and should be prioritized. If a member's initial value depends on the constructor's arguments, or C++11 is not supported, prioritize the constructor's initializer list. Compared to assigning values to members in the constructor body, the initializer list is more concise, performs better, and can initialize const and reference members.

class Message {
public:
    Message() : msgLength_(0)  // Good, prioritize using the initializer list
    {
        msgBuffer_ = nullptr;  // Bad, not recommended to assign in the constructor body
    }
   
private:
    unsigned int msgID_{0};  // Good, used in C++11
    unsigned int msgLength_;
    unsigned char* msgBuffer_;
};
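
The sketch below illustrates the last point: const and reference members can only be initialized, never assigned in the constructor body. The class RetryPolicy and its members are hypothetical names used for illustration.

```cpp
#include <string>

class RetryPolicy {
public:
    RetryPolicy(int maxRetries, const std::string& name)
        : maxRetries_(maxRetries), name_(name)  // The only place these two members can be set
    {
    }

    int MaxRetries() const { return maxRetries_; }
    const std::string& Name() const { return name_; }
    int Attempts() const { return attempts_; }

private:
    const int maxRetries_;     // const member: cannot be assigned in the constructor body
    const std::string& name_;  // Reference member: the referenced string must outlive this object
    int attempts_{0};          // In-class initialization (C++11)
};
```

Moving either of the first two initializations into the constructor body would not compile, which is one practical reason to default to the initializer list.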

Rule 7.1.2 To avoid implicit conversion, declare single-parameter constructors as explicit

Description: A single-parameter constructor without an explicit declaration becomes an implicit conversion function. Example:

class Foo {
public:
    explicit Foo(const std::string& name) : name_(name)
    {
    }

private:
    std::string name_;
};

void ProcessFoo(const Foo& foo) {}

int main(void)
{
    std::string test = "test";
    ProcessFoo(test);  // Compilation fails
    return 0;
}

The code above fails to compile because ProcessFoo expects a parameter of type Foo, but a std::string was passed, and the explicit constructor cannot be used for an implicit conversion.

If the explicit keyword is removed from the Foo constructor, then calling ProcessFoo with a string will trigger an implicit conversion, generating a temporary Foo object. Often, this kind of implicit conversion is confusing and can hide bugs, leading to an unexpected type conversion. Therefore, single-parameter constructors are required to be declared explicit.
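
With the explicit constructor in place, the caller has to spell out the conversion. The following is a minimal sketch; the Name accessor, the return value of ProcessFoo, and the Demo function are additions made only so that the example can be checked.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

class Foo {
public:
    explicit Foo(const std::string& name) : name_(name)
    {
    }

    const std::string& Name() const { return name_; }

private:
    std::string name_;
};

// explicit removes the implicit std::string -> Foo conversion
static_assert(!std::is_convertible<std::string, Foo>::value, "Foo must not convert implicitly");

std::size_t ProcessFoo(const Foo& foo)
{
    return foo.Name().size();
}

std::size_t Demo()
{
    std::string test = "test";
    return ProcessFoo(Foo(test));  // Good: the conversion is visible at the call site
}
```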

Rule 7.1.3 If copy constructors and copy assignment operators, or move constructors and move assignment operators, are not needed, explicitly disable them

Description: If not defined by the user, the compiler generates a copy constructor and a copy assignment operator by default, and, since C++11, possibly also a move constructor and a move assignment operator. If we do not want a class to be copied or assigned, explicitly refuse these operations in one of the following ways:

  1. Set the copy constructor or assignment operator to private and do not implement it:
class Foo {
private:
    Foo(const Foo&);
    Foo& operator=(const Foo&);
};
  2. Use delete provided by C++11; refer to the relevant sections on Modern C++ later.

  3. It is recommended to inherit from NoCopyable and NoMovable, and avoid using macros like DISALLOW_COPY_AND_MOVE, DISALLOW_COPY, and DISALLOW_MOVE.

class Foo : public NoCopyable, public NoMovable {
};

Implementation of NoCopyable and NoMovable:

class NoCopyable {
public:
    NoCopyable() = default;
    NoCopyable(const NoCopyable&) = delete;
    NoCopyable& operator=(const NoCopyable&) = delete;
};

class NoMovable {
public:
    NoMovable() = default;
    NoMovable(NoMovable&&) noexcept = delete;
    NoMovable& operator = (NoMovable&&) noexcept = delete;
};
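
For the delete-based option, a minimal sketch is shown below. The class name Connection is hypothetical; the static_assert lines only verify at compile time that the operations are really unavailable.

```cpp
#include <type_traits>

class Connection {
public:
    Connection() = default;

    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
    Connection(Connection&&) = delete;
    Connection& operator=(Connection&&) = delete;
};

static_assert(!std::is_copy_constructible<Connection>::value, "copy construction is disabled");
static_assert(!std::is_copy_assignable<Connection>::value, "copy assignment is disabled");
static_assert(!std::is_move_constructible<Connection>::value, "move construction is disabled");
static_assert(!std::is_move_assignable<Connection>::value, "move assignment is disabled");
```

Compared with private, unimplemented declarations, a deleted function is rejected at compile time even when the call comes from inside the class or from a friend.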

Rule 7.1.4 Copy constructors and copy assignment operators should appear in pairs or be disabled together

Copy constructors and copy assignment operators both have copy semantics and should either both be present or both be disabled.

// Both present
class Foo {
public:
    ...
    Foo(const Foo&);
    Foo& operator=(const Foo&);
    ...
};

// Both defaulted, supported in C++11
class Foo {
public:
    Foo(const Foo&) = default;
    Foo& operator=(const Foo&) = default;
};

// Both disabled, can use delete in C++11
class Foo {
private:
    Foo(const Foo&);
    Foo& operator=(const Foo&);
};

Rule 7.1.5 Move constructors and move assignment operators should appear in pairs or be disabled together

In C++11, move operations were added. If you want a class to support move operations, you need to implement a move constructor and a move assignment operator.

Move constructors and move assignment operators both have move semantics and should either both be present or both be disabled.

// Both present
class Foo {
public:
    ...
    Foo(Foo&&);
    Foo& operator=(Foo&&);
    ...
};

// Both defaulted, supported in C++11
class Foo {
public:
    Foo(Foo&&) = default;
    Foo& operator=(Foo&&) = default;
};

// Both disabled, using C++11's delete
class Foo {
public:
    Foo(Foo&&) = delete;
    Foo& operator=(Foo&&) = delete;
};
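
When the defaulted versions are not sufficient, for example because the class owns a raw resource, both move operations are implemented together. The following is a minimal sketch under that assumption; the Buffer class is hypothetical.

```cpp
#include <cstddef>
#include <utility>

class Buffer {
public:
    explicit Buffer(std::size_t size) : data_(new int[size]()), size_(size)
    {
    }

    ~Buffer()
    {
        delete[] data_;
    }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Move constructor: take over the resource and leave the source empty
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_)
    {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    // Move assignment: release the current resource first, then take over the source's
    Buffer& operator=(Buffer&& other) noexcept
    {
        if (this != &other) {
            delete[] data_;
            data_ = other.data_;
            size_ = other.size_;
            other.data_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    std::size_t Size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};
```

Both operations leave the moved-from object in a valid, empty state, so its destructor remains safe to run.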

Rule 7.1.6 Do not call virtual functions in constructors and destructors

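Description: Calling a virtual function on the current object in a constructor or destructor does not produce polymorphic behavior. In C++, an object is constructed one class layer at a time: while the base class constructor runs, the derived part does not exist yet, so virtual calls resolve to the base class version.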

Example: Class Base is the base class, Sub is the derived class

class Base {                      
public:               
    Base();
    virtual void Log() = 0;    // Different derived classes call different log files
};

Base::Base()         // Base class constructor
{
    Log();           // Calls virtual function Log
}                                                 

class Sub : public Base {      
public:
    virtual void Log();         
};

When the statement Sub sub; is executed, Sub's constructor is entered first, but it calls Base's constructor before doing anything else. When Base's constructor calls the virtual function Log, the object is still of type Base, so the call resolves to the base class's version (and because Log is pure virtual in Base here, the behavior is undefined). The derived part is constructed only after the base class construction is complete, so polymorphic behavior is not achieved. The same logic applies to destructors.
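
One common way to comply is to construct the object first and run any initialization that depends on virtual functions afterwards. The sketch below shows both the problem and this workaround; the classes Logger and FileLogger, the Init function, and the CreateLogger factory are hypothetical, and Target is deliberately not pure virtual so that the constructor call stays well-defined.

```cpp
#include <memory>
#include <string>

class Logger {
public:
    Logger()
    {
        targetSeenInConstructor_ = Target();  // Bad: always resolves to Logger::Target
    }
    virtual ~Logger() = default;

    virtual std::string Target() const { return "base.log"; }

    void Init() { target_ = Target(); }  // Good: runs after construction, so dispatch is dynamic

    const std::string& TargetSeenInConstructor() const { return targetSeenInConstructor_; }
    const std::string& InitializedTarget() const { return target_; }

private:
    std::string targetSeenInConstructor_;
    std::string target_;
};

class FileLogger : public Logger {
public:
    std::string Target() const override { return "file.log"; }
};

// Hypothetical factory: construct first, then run the initialization that needs virtual dispatch
template <typename T>
std::unique_ptr<T> CreateLogger()
{
    std::unique_ptr<T> logger(new T());
    logger->Init();
    return logger;
}
```

For a FileLogger created through CreateLogger, the constructor sees the base class target, while Init sees the derived one.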

Rule 7.1.7 Copy constructors, copy assignment operators, move constructors, and move assignment operators in a polymorphic base class must be non-public or deleted functions

If a derived class object is directly assigned to a base class object, object slicing occurs: only the base class part is copied or moved, which damages polymorphic behavior.

Negative example: In the following code, the base class does not define a copy constructor or copy assignment operator, so the compiler automatically generates these two special member functions. If a derived class object is assigned to a base class object, slicing occurs. In this example, the copy constructor and copy assignment operator can be declared as delete, allowing the compiler to detect such assignment behavior.

class Base {                      
public:               
    Base() = default;
    virtual ~Base() = default;
    ...
    virtual void Fun() { std::cout << "Base" << std::endl;}
};

class Derived : public Base {
    ...
    void Fun() override { std::cout << "Derived" << std::endl; }
};

void Foo(const Base &base)
{
    Base other = base; // Non-compliant: Slicing occurs
    other.Fun(); // Calls the Fun function of the Base class
}
Derived d;
Foo(d); // A derived class object is passed in
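
A compliant variant could make the copy operations of the base class non-public and offer a virtual Clone function for callers that need a copy. The sketch below is one possible shape under that assumption; Shape, Circle, Clone, and NameOfClone are hypothetical names.

```cpp
#include <memory>
#include <string>
#include <type_traits>

class Shape {
public:
    Shape() = default;
    virtual ~Shape() = default;

    virtual std::string Name() const { return "Shape"; }

    // Copies the complete object, whatever its dynamic type is
    virtual std::unique_ptr<Shape> Clone() const
    {
        return std::unique_ptr<Shape>(new Shape(*this));
    }

protected:
    Shape(const Shape&) = default;             // Non-public: "Shape other = circle;" is rejected
    Shape& operator=(const Shape&) = default;
};

class Circle : public Shape {
public:
    std::string Name() const override { return "Circle"; }

    std::unique_ptr<Shape> Clone() const override
    {
        return std::unique_ptr<Shape>(new Circle(*this));
    }
};

// Outside the hierarchy, Shape can no longer be copied, so slicing cannot happen by accident
static_assert(!std::is_copy_constructible<Shape>::value, "Shape must not be publicly copyable");

std::string NameOfClone(const Shape& shape)
{
    return shape.Clone()->Name();
}
```

Because the copy operations are protected rather than deleted, derived classes can still use them to implement Clone.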

Inheritance

Rule 7.2.1 A base class’s destructor should be declared virtual, and classes not intended to be inherited from should be declared final

Description: Only if the base class destructor is virtual can the derived class’s destructor be guaranteed to be called when invoked polymorphically.

Example: A non-virtual destructor in the base class leads to a memory leak.

class Base {
public:
    virtual std::string getVersion() = 0;
   
    ~Base()
    {
        std::cout << "~Base" << std::endl;
    }
};
class Sub : public Base {
public:
    Sub() : numbers_(nullptr)
    { 
    }
   
    ~Sub()
    {
        delete[] numbers_;
        std::cout << "~Sub" << std::endl;
    }
   
    int Init()
    {
        const size_t numberCount = 100;
        numbers_ = new (std::nothrow) int[numberCount];
        if (numbers_ == nullptr) {
            return -1;
        }
       
        ...
    }

    std::string getVersion()
    {
        return std::string("hello!");
    }
private:
    int* numbers_;
};
int main(int argc, char* args[])
{
    Base* b = new Sub();

    delete b;
    return 0;
}

Because the Base class's destructor is not declared virtual, deleting the object through a Base pointer is undefined behavior. In practice, only the base class's destructor is called, not the derived class Sub's destructor, leading to a memory leak.

Exception: Classes like NoCopyable and NoMovable, which have no behavior and are only used as markers, can omit a virtual destructor and need not be declared final.
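
A corrected version of the idea could look like the sketch below: the base class destructor is virtual and the leaf class is marked final. The destroyed flag and the DestroyThroughBase function are hypothetical additions that make the destructor call observable.

```cpp
#include <memory>

class Base {
public:
    virtual ~Base() = default;  // Virtual: deleting through Base* runs the derived destructor
    virtual int Version() const = 0;
};

class Sub final : public Base {  // final: Sub is not intended to be inherited from
public:
    explicit Sub(bool* destroyed) : destroyed_(destroyed)
    {
    }

    ~Sub() override
    {
        *destroyed_ = true;  // Records that the derived destructor really ran
    }

    int Version() const override { return 1; }

private:
    bool* destroyed_;
};

bool DestroyThroughBase()
{
    bool destroyed = false;
    {
        std::unique_ptr<Base> base(new Sub(&destroyed));
    }  // The unique_ptr deletes the object through a Base pointer here
    return destroyed;
}
```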

Rule 7.2.2 Do not use default parameter values for virtual functions

Description: In C++, virtual functions are dynamically bound, but default arguments are statically bound at compile time. This means that the function ultimately executed may be the one defined in a derived class while the default argument value comes from the base class. To avoid the confusion and problems caused by inconsistent parameter declarations when overriding virtual functions, no virtual function shall declare default parameter values.

Example: The default value of the parameter text for the virtual function Display is determined at compile time, not at runtime, so polymorphism is not achieved:

class Base {
public:
    virtual void Display(const std::string& text = "Base!")
    {
        std::cout << text << std::endl;
    }
   
    virtual ~Base(){}
};

class Sub : public Base {
public:
    virtual void Display(const std::string& text  = "Sub!")
    {
        std::cout << text << std::endl;
    }
   
    virtual ~Sub(){}
};

int main()
{
    Base* base = new Sub();
    Sub* sub = new Sub();
  
    ...
   
    base->Display();  // Program output: Base! but expected output: Sub!
    sub->Display();   // Program output: Sub!
   
    delete base;
    delete sub;
    return 0;
}
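
One way to keep a default value without violating this rule is to declare it once on a non-virtual public function that forwards to a private virtual function. This is a sketch of that technique rather than a pattern mandated by this specification; DoDisplay and DisplayThroughBase are hypothetical names, and the functions return the text instead of printing it so that the result can be checked.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() = default;

    // Non-virtual: the only place where the default value is declared
    std::string Display(const std::string& text = "Default!") const
    {
        return DoDisplay(text);
    }

private:
    virtual std::string DoDisplay(const std::string& text) const
    {
        return "Base: " + text;
    }
};

class Sub : public Base {
private:
    std::string DoDisplay(const std::string& text) const override
    {
        return "Sub: " + text;
    }
};

std::string DisplayThroughBase(const Base& object)
{
    return object.Display();  // Same default everywhere, behavior chosen at runtime
}
```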

Rule 7.2.3 Do not redefine a non-virtual function inherited from a base class

Description: Non-virtual functions are bound statically, so which version is called depends on the static type of the pointer or reference rather than on the actual object. Only virtual functions are dynamically bound: as long as you operate on a pointer to the base class, you get the correct result.

Example:

class Base {
public:
    void Fun();
};

class Sub : public Base {
public:
    void Fun();
};

Sub* sub = new Sub();                    
Base* base = sub;

sub->Fun();    // Calls the derived class's Fun                 
base->Fun();   // Calls the parent class's Fun
//...

Multiple Inheritance

In actual development, scenarios using multiple inheritance are relatively rare due to the following typical problems:

  1. Data duplication and name ambiguity caused by diamond inheritance. Therefore, C++ introduced virtual inheritance to solve such problems.
  2. Even without diamond inheritance, names between multiple parent classes can conflict, leading to ambiguity.
  3. If a derived class needs to extend or override methods from multiple parent classes, it can lead to unclear responsibilities and semantic confusion for the derived class.
  4. Compared to delegation, inheritance is a form of white-box reuse, meaning a subclass can access the parent class’s protected members, which leads to stronger coupling. Multiple inheritance, due to coupling with multiple parent classes, creates even stronger coupling relationships compared to single-root inheritance.
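
The first point can be sketched as follows. The class names are hypothetical, and the public data member is used only to keep the example short.

```cpp
class Device {
public:
    int id = 0;
};

class Scanner : public virtual Device {};
class Printer : public virtual Device {};

// Copier contains exactly one Device subobject, shared by Scanner and Printer
class Copier : public Scanner, public Printer {};

bool SharesSingleDevice()
{
    Copier copier;
    copier.id = 42;  // Unambiguous only because Device is a virtual base
    const Scanner& asScanner = copier;
    const Printer& asPrinter = copier;
    return &asScanner.id == &asPrinter.id;  // Both paths reach the same member
}
```

Without the virtual keyword, Copier would contain two Device subobjects and the access copier.id would be rejected as ambiguous.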

Multiple inheritance has one main advantage: it provides a simpler way to compose and reuse multiple interfaces or classes.

Therefore, multiple inheritance is only allowed in the following situations.

Recommendation 7.3.1 Use multiple inheritance to implement interface separation and multi-role composition

If a class needs to implement multiple interfaces, you can combine multiple separate interfaces through multiple inheritance, similar to mixins in Scala.

class Role1 {};
class Role2 {};
class Role3 {};

class Object1 : public Role1, public Role2 {
    // ...
};

class Object2 : public Role2, public Role3 {
    // ...
};

Similar implementation examples can be found in the C++ standard library:

class basic_istream {};
class basic_ostream {};

class basic_iostream : public basic_istream, public basic_ostream {
 
};

Overloading

Operator overloading should have sufficient reasons, and do not change the original semantics of the operator, for example, do not use the ‘+’ operator for subtraction. Operator overloading makes code more intuitive, but it also has some drawbacks:

  • It can be counter-intuitive, leading one to mistakenly believe the operation is as high-performance as built-in types, ignoring potential performance degradation.
  • It makes locating problems harder; searching by function name is clearly more convenient than searching by operator.
  • If the behavior of an overloaded operator is not intuitive (e.g., using the ‘+’ operator for subtraction), it can make the code confusing.
  • The implicit conversion introduced by overloading the assignment operator can hide deep bugs. You can define functions like Equals() or CopyFrom() to replace the == and = operators.
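
The points above can be sketched as follows, assuming a hypothetical Vec2 type: the '+' overload keeps its arithmetic meaning, and a named Equals() function is used for comparison so that it is easy to search for.

```cpp
// Hypothetical 2D vector type used only for illustration.
struct Vec2 {
    int x;
    int y;
};

// Good: '+' performs component-wise addition, matching the built-in meaning.
inline Vec2 operator+(const Vec2& lhs, const Vec2& rhs)
{
    return Vec2{lhs.x + rhs.x, lhs.y + rhs.y};
}

// A named function instead of an overloaded '=='; it can be found by name.
inline bool Equals(const Vec2& lhs, const Vec2& rhs)
{
    return lhs.x == rhs.x && lhs.y == rhs.y;
}
```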

8 Functions

Function Design

Rule 8.1.1 Avoid overly long functions; functions should not exceed 50 lines (excluding blank lines and comments)

A function should be displayable on a single screen (within 50 lines), do only one thing, and do it well.

Overly long functions often mean that the function’s functionality is not singular, is overly complex, or excessively presents details without further abstraction.

Exception: Some functions that implement algorithms may exceed 50 lines due to the cohesiveness and comprehensiveness of the algorithm.

Even if a long function works perfectly now, once someone modifies it, new problems may arise, and even hard-to-find bugs can be introduced. It is recommended to split it into shorter, more manageable functions for easier reading and modification by others.

Inline Functions

Recommendation 8.2.1 Inline functions should not exceed 10 lines (excluding blank lines and comments)

Description: Inline functions have the characteristics of general functions. The only difference lies in how calls are handled. When a general function is called, control is transferred to the called function and then returns to the caller. In contrast, when an inline function is called, the call expression is replaced with the body of the inline function.

Inline functions are only suitable for small functions with 1-10 lines. For a large function with many statements, the overhead of the function call and return is relatively insignificant, and there is no need to implement it as an inline function. Compilers will generally decline to inline such a function and call it in the normal way.

If an inline function contains complex control structures, such as loops, branches (switch), try-catch, etc., most compilers will treat the function as a normal function. Calls to virtual functions that are resolved at run time, and calls to recursive functions, generally cannot be expanded inline.
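
For illustration, a function of the size this recommendation has in mind might look like the sketch below. Clamp is a hypothetical helper, and the inline keyword remains a request that the compiler is free to ignore.

```cpp
// Hypothetical helper: a few lines, no loops, no exception handling.
inline int Clamp(int value, int low, int high)
{
    if (value < low) {
        return low;
    }
    return (value > high) ? high : value;
}
```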

Function Parameters

Recommendation 8.3.1 Use references instead of pointers for function parameters

Description: References are safer than pointers because they are guaranteed to be non-null and will not be re-bound to another target. References do not require checking for illegal NULL pointers.

If developing a product based on an older platform, prioritize the handling style of the original platform. Use const to prevent parameters from being modified, which makes it clear to the code reader that the parameter will not be modified and greatly enhances code readability.

Exception: When the passed parameter is an array of unknown length at compile time, a pointer can be used instead of a reference.
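
A minimal sketch of this recommendation and its exception, assuming a hypothetical Account type: object parameters are passed by reference (const where they are not modified), while an array whose length is known only at run time is passed as a pointer plus a count.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record type used only for illustration.
struct Account {
    std::string owner;
    int balance = 0;
};

// Good: a reference cannot be null, so no null check is needed.
void Deposit(Account& account, int amount)
{
    account.balance += amount;
}

// Good: read-only access is expressed with a const reference.
bool IsEmpty(const Account& account)
{
    return account.balance == 0;
}

// Exception: array of run-time length, passed as pointer plus element count.
int Sum(const int* values, std::size_t count)
{
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```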

Recommendation 8.3.2 Use strongly typed parameters; avoid using void*

Although different languages have their own views on strong and weak typing, C/C++ is generally considered a strongly typed language. Since we are using a strongly typed language, we should maintain this style. The benefit is to let the compiler catch type mismatch issues as much as possible at the compilation stage.

Using strong types helps the compiler find errors for us. In the following code, note the use of the function FooListAddNode:

struct FooNode {
    struct List link;
    int foo;
};

struct BarNode {
    struct List link;
    int bar;
};

void FooListAddNode(void *node) // Bad: Using void * type to pass parameters here
{
    FooNode *foo = (FooNode *)node;
    ListAppend(&g_FooList, &foo->link);
}

void MakeTheList()
{
    FooNode *foo = nullptr;
    BarNode *bar = nullptr;
    ...

    FooListAddNode(bar);        // Wrong: The intention was to pass parameter foo, but bar was passed by mistake, and no error was reported
}

Alternatives to void*:

  1. You can use template functions to implement parameter type variations.
  2. You can use base class pointers to achieve polymorphism.
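
The first alternative can be sketched as follows. This is an illustration under assumptions, not the original list implementation: std::vector stands in for the List type, and g_fooList and ListAddNode are hypothetical names.

```cpp
#include <vector>

struct FooNode {
    int foo = 0;
};

struct BarNode {
    int bar = 0;
};

// Hypothetical stand-in for the global list in the example above.
std::vector<FooNode*> g_fooList;

// Good: the parameter type states exactly what the function accepts.
void FooListAddNode(FooNode& node)
{
    g_fooList.push_back(&node);
}

// A function template lets the node type vary while keeping type checking:
// the list element type and the node type must match.
template <typename Node>
void ListAddNode(std::vector<Node*>& list, Node& node)
{
    list.push_back(&node);
}
```

With these signatures, passing a BarNode to FooListAddNode, or adding a BarNode to a list of FooNode pointers through ListAddNode, is rejected at compile time.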

Recommendation 8.3.3 The number of function parameters should not exceed 5

Having too many parameters in a function makes it susceptible to external changes, thus affecting maintenance work. Having too many parameters also increases the testing workload.

If this number is exceeded, consider:

  • Whether the function can be split
  • Whether related parameters can be grouped together into a struct
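
As a sketch of the second option, related parameters can be grouped into a struct. ConnectionOptions and DescribeConnection are hypothetical names used only for illustration.

```cpp
#include <string>

// Hypothetical parameter object: values that always travel together.
struct ConnectionOptions {
    std::string host;
    int port = 0;
    int timeoutMs = 0;
    int retryCount = 0;
    bool useTls = false;
    bool keepAlive = false;
};

// One parameter instead of six; adding an option does not change the signature.
std::string DescribeConnection(const ConnectionOptions& options)
{
    return options.host + ":" + std::to_string(options.port);
}
```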

9 Other C++ Features

Constants and Initialization

Immutable values are easier to understand, track, and analyze, so you should use constants instead of variables whenever possible. When defining a value, const should be the default option.

Rule 9.1.1 Do not use macros to represent constants

Description: Macros are simple text replacements that are completed during the preprocessing stage. When a runtime error occurs, it reports the corresponding value directly; during debugging, it also displays the value, not the macro name. Macros have no type checking and are unsafe. Macros have no scope.

#define MAX_MSISDN_LEN 20    // Bad

// Use const constants in C++
const int MAX_MSISDN_LEN = 20; // Good

// For C++11 and later, constexpr can be used
constexpr int MAX_MSISDN_LEN = 20;

Description: Enums are safer than #define or const int. The compiler checks that each argument has the expected enumeration type, which catches errors such as arguments passed in the wrong order.

// Good example:
enum Week {
    SUNDAY,
    MONDAY,
    TUESDAY,
    WEDNESDAY,
    THURSDAY,
    FRIDAY,
    SATURDAY
};

enum Color {
    RED,
    BLACK,
    BLUE
};

void ColorizeCalendar(Week today, Color color);

ColorizeCalendar(BLUE, SUNDAY); // Compilation error, incorrect parameter type

// Bad example:
const int SUNDAY = 0;
const int MONDAY = 1;

const int BLACK  = 0;
const int BLUE   = 1;

bool ColorizeCalendar(int today, int color);
ColorizeCalendar(BLUE, SUNDAY); // No error

When enumeration values need to correspond to specific numbers, they must be explicitly assigned in the declaration. Otherwise, do not assign values explicitly; this avoids duplicate assignments and reduces the maintenance effort when members are added or removed.

// Good example: Device ID values defined in the S protocol, used to identify device types
enum DeviceType {
    DEV_UNKNOWN = -1,
    DEV_DSMP = 0,
    DEV_ISMG = 1,
    DEV_WAPPORTAL = 2
};

For internal program use, when only used for categorization, explicit assignment should not be performed.

// Good example: Enumeration definition used to identify session state in a program
enum SessionState {
    INIT,
    CLOSED,
    WAITING_FOR_RESPONSE
};

You should avoid duplicate enumeration values. If duplication is necessary, use a defined enumeration to qualify it.

enum RTCPType {
    RTCP_SR = 200,
    RTCP_MIN_TYPE = RTCP_SR,       
    RTCP_RR    = 201,
    RTCP_SDES  = 202,
    RTCP_BYE   = 203,
    RTCP_APP   = 204,
    RTCP_RTPFB = 205,
    RTCP_PSFB  = 206,
    RTCP_XR  = 207,
    RTCP_RSI = 208,
    RTCP_PUBPORTS = 209,
    RTCP_MAX_TYPE = RTCP_PUBPORTS 
};

Rule 9.1.2 Do not use magic numbers

A magic number is a numeric literal whose meaning is hard to understand from the code alone.

The concept of a magic number is not black and white; incomprehensibility has degrees, and you must judge for yourself. For example, the number 12 means different things in different contexts: type = 12; is incomprehensible, but monthsCount = yearsCount * 12; is understandable. The number 0 can sometimes be a magic number, for example, status = 0; does not express what state it is.

Solutions: For numbers used locally, add comments to explain them. For numbers used in multiple places, you must define const constants and use symbolic names for self-commenting.

The following situations are prohibited:

  • The symbol does not explain the meaning of the number, e.g., const int ZERO = 0.
  • The symbol name limits its value, e.g., const int XX_TIMER_INTERVAL_300MS = 300. Instead, use XX_TIMER_INTERVAL_MS to indicate that the constant is a timer interval.
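
A minimal sketch of replacing magic numbers with named constants. MAX_RETRY_COUNT, ShouldRetry, and TotalWaitMs are hypothetical names used only for illustration.

```cpp
// Good: each name states the meaning (and unit) of the value, not the value itself.
const int XX_TIMER_INTERVAL_MS = 300;
const int MAX_RETRY_COUNT = 3;

bool ShouldRetry(int attempt)
{
    return attempt < MAX_RETRY_COUNT;       // instead of: attempt < 3
}

int TotalWaitMs(int ticks)
{
    return ticks * XX_TIMER_INTERVAL_MS;    // instead of: ticks * 300
}
```

If the interval later changes, only the constant's value is edited; its name and every use site stay valid.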

Rule 9.1.3 Constants should ensure a single responsibility

Description: A constant is used to represent only one specific function; that is, one constant cannot have multiple uses.

// Good example: For protocol A and protocol B, the mobile number (MSISDN) length is 20.
const unsigned int A_MAX_MSISDN_LEN = 20;
const unsigned int B_MAX_MSISDN_LEN = 20;

// Or use different namespaces:
namespace Namespace1 {
    const unsigned int MAX_MSISDN_LEN = 20;
}

namespace Namespace2 {
    const unsigned int MAX_MSISDN_LEN = 20;
}

Rule 9.1.4 Do not use memcpy_s or memset_s to initialize non-POD objects

Description: POD stands for Plain Old Data, a concept introduced in the C++ 98 standard (ISO/IEC 14882, first edition, 1998-09-01). POD types mainly include primitive types like int, char, float, double, enumeration, void, pointers, as well as aggregate types. They cannot use encapsulation and object-oriented features (such as user-defined constructors/assignment/destructors, base classes, virtual functions, etc.).

For non-POD types, such as non-aggregate class objects, the memory layout may be uncertain and compiler-dependent due to the possible presence of virtual functions. Abusing memory copies can lead to serious problems.

Even for aggregate classes, using direct memory copy and comparison undermines information hiding and data protection, so memcpy_s and memset_s operations are also not recommended.

For a detailed description of POD types, please refer to the appendix.
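
As a sketch under assumptions, the standard type trait std::is_trivially_copyable (C++11) can document at compile time which types must never be passed to byte-wise copy functions. Session and Header are hypothetical types.

```cpp
#include <string>
#include <type_traits>

// Hypothetical non-POD type: std::string manages its own memory.
struct Session {
    int id = 0;
    std::string user;
};

// Hypothetical plain data type.
struct Header {
    int version;
    int length;
};

static_assert(!std::is_trivially_copyable<Session>::value,
              "Session must not be initialized or copied byte by byte");
static_assert(std::is_trivially_copyable<Header>::value,
              "Header holds plain data only");

// Good: value-initialization and copy assignment instead of memset_s/memcpy_s.
Session CopySession(const Session& source)
{
    Session target{};    // members are initialized by their own constructors
    target = source;     // copy assignment copies the string correctly
    return target;
}
```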

Recommendation 9.1.2 Declare and initialize variables when they are used

Description: Using a variable before it has been assigned an initial value is a common low-level programming error. Declaring a variable just before using it and initializing it at the same time conveniently avoids such low-level errors.

Declaring all variables at the beginning of a function and using them later, with the scope covering the entire function implementation, can easily lead to the following problems:

  • The program is difficult to understand and maintain: the definition of the variable is separated from its use.
  • It is difficult to initialize the variable properly: at the beginning of the function, there is often not enough information to initialize the variable. A default empty value (like zero) is often used for initialization, which is usually a waste. If the variable is used before being assigned a valid value, it will also cause an error.

Follow the principles of minimizing variable scope and declaring locally, making the code easier to read and convenient for understanding the variable’s type and initial value. In particular, initialization should be used to replace declaration followed by assignment.

// Bad example: Declaration and initialization are separate
string name;        // Not initialized when declared: calls the default constructor
name = "zhangsan";  // Calls the assignment operator function again; declaration and initialization are in different places, making it relatively difficult to understand

// Good example: Declaration and initialization are combined, relatively easy to understand
string name("zhangsan");  // Calls the constructor

Expressions

Rule 9.2.1 Do not reference a variable again in an expression that contains a variable’s increment or decrement operation

In an expression containing a variable’s increment (++) or decrement (--) operation, if the variable is referenced again, the result is not clearly defined in the C++ standard. Implementations may vary across different compilers or even different versions of the same compiler. For better portability, no assumptions should be made about operation orders that are undefined by the standard.

Note that the issue of operation order cannot be solved with parentheses, as this is not a matter of precedence.

Example:

x = b[i] + i++; // Bad: The order of b[i] operation and i++ is not clear.

The correct way is to put the increment or decrement operation on a separate line:

x = b[i] + i;
i++;            // Good: On a separate line

Function parameters

Func(i++, i);   // Bad: When passing the second parameter, it's uncertain whether the increment operation has occurred

Correct way

i++;            // Good: On a separate line
x = Func(i, i);

Rule 9.2.2 A switch statement should have a default branch

In most cases, a switch statement should have a default branch to ensure there is a default handling behavior when a case label is missed.

Exception: If the switch condition variable is an enumeration type and the case branches cover all values, adding a default branch is somewhat redundant. Modern compilers can check if any enumeration values are missed in a switch statement’s case branches and will issue a corresponding warning.

enum Color {
    RED = 0,
    BLUE
};

// Since the switch condition variable is an enumeration value, the default handling branch can be omitted here
switch (color) {
    case RED:
        DoRedThing();
        break;
    case BLUE:
        DoBlueThing();
        ...
        break;
}

Recommendation 9.2.1 In expression comparisons, the left side should tend to be variable and the right side should tend to be constant

When comparing a variable with a constant, if the constant is on the left, such as if (MAX == v), it does not conform to reading habits, and if (MAX > v) is even harder to understand. You should follow normal human reading and expression habits by placing the constant on the right. Write it as follows:

if (value == MAX) {
 
}

if (value < MAX) {
 
}

There are special cases, such as: if (MIN < value && value < MAX) used to describe a range, where the first part has the constant on the left.

Don’t worry about mistyping ‘==’ as ‘=’, because if (value = MAX) will generate a compiler warning, and other static analysis tools will also report an error. Let tools handle typos; code should prioritize readability.

Recommendation 9.2.2 Use parentheses to clarify operator precedence

Use parentheses to clarify operator precedence to prevent program errors caused by default precedence not matching the design intent; it also makes the code clearer and more readable. However, excessive parentheses can disperse the code and reduce readability. Here are some suggestions on how to use parentheses.

  • For binary and higher operators, if multiple types of operators are involved, parentheses should be used.

x = a + b + c;         /* Same operators, parentheses can be omitted */
x = Foo(a + b, c);     /* Expressions on both sides of the comma do not need parentheses */
x = 1 << (2 + 3);      /* Different operators, parentheses needed */
x = a + (b / 5);       /* Different operators, parentheses needed */
x = (a == b) ? a : (a - b);    /* Different operators, parentheses needed */

Type Conversion

Avoid customizing behavior with type branching: customizing behavior with type branching is error-prone and a clear sign of attempting to write C code in C++. This is a very inflexible technique; when adding new types, if you forget to modify all branches, the compiler will not tell you. Use templates and virtual functions to let the types themselves, not the code that calls them, decide the behavior.

It is recommended to avoid type conversion. In our code’s type design, we should consider what the data type of each piece of data is, rather than overusing type conversion to solve problems. When designing a certain basic type, please consider:

  • Whether it is unsigned or signed
  • Whether it is suitable for float or double
  • Whether to use int8, int16, int32, or int64, determining the length of the integer

However, we cannot prohibit the use of type conversion because C++ is a language for machine-level programming, involving pointer addresses, and we interact with various third-party or low-level APIs whose type designs may not be reasonable. Type conversion is easily introduced during this adaptation process.

Exception: When calling a function, if you do not want to handle the function’s result, the first thing to consider is whether this is your best choice. If you indeed do not want to handle the function’s return value, then you can use a (void) cast to solve it.

Rule 9.3.1 If you are certain you need to use type conversion, please use the type conversions provided by C++, not C-style type conversions

Description:

The type conversion operators provided by C++ are more targeted, easier to read, and safer than C-style conversions. The conversions provided by C++ are:

  • Type conversions:
  1. dynamic_cast: Mainly used for downcasting in inheritance hierarchies. dynamic_cast has type checking functionality. Please design base and derived classes properly to avoid using dynamic_cast for conversion.

  2. static_cast: Similar to C-style conversion, it can be used for forced value conversion or upcasting (converting a derived class’s pointer or reference to a base class’s pointer or reference). This conversion is often used to eliminate type ambiguity caused by multiple inheritance and is relatively safe. If it is purely an arithmetic conversion, please use the brace-initialization conversion method described later.

  3. reinterpret_cast: Used for converting unrelated types. reinterpret_cast forces the compiler to reinterpret the memory of an object of one type as another type. This is an unsafe conversion, and it is recommended to use reinterpret_cast as little as possible.

  4. const_cast: Used to remove the const attribute of an object, making it modifiable. This breaks the immutability of data, and it is recommended to use const_cast as little as possible.

  • Arithmetic conversion: (Supported starting from C++11) For arithmetic conversions where type information is not lost, such as float to double, int32 to int64, it is recommended to use brace initialization.
  double d{ someFloat };
  int64_t i{ someInt32 };
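
A short sketch contrasting the two conversion styles described above. The function names are hypothetical; the commented-out line shows the kind of narrowing that brace initialization rejects.

```cpp
#include <cstdint>

// static_cast: an intentional, visible value conversion.
int TruncateToInt(double value)
{
    return static_cast<int>(value);   // the fractional part is discarded on purpose
}

// Brace initialization compiles only when no information can be lost.
double WidenFloat(float value)
{
    double widened{value};            // float -> double is not narrowing
    return widened;
}

std::int64_t WidenInt(std::int32_t value)
{
    std::int64_t widened{value};      // int32_t -> int64_t is not narrowing
    return widened;
}

// int narrowed{3.5};                 // rejected by the compiler: narrowing conversion
```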

Recommendation 9.3.1 Avoid using dynamic_cast

  1. dynamic_cast relies on C++ RTTI, allowing programmers to identify C++ class object types at runtime.
  2. The appearance of dynamic_cast generally indicates problems in our base class and derived class design. The derived class breaks the base class’s contract, forcing the use of dynamic_cast to convert to a subclass for special handling. In this case, it’s better to improve the class design rather than solve the problem through dynamic_cast.
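
One possible way to improve the design, sketched with hypothetical Shape classes: the type-specific behavior moves into a virtual function, so the caller never needs to ask which derived type it holds.

```cpp
#include <string>

class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string Name() const = 0;
};

class Circle : public Shape {
public:
    std::string Name() const override { return "circle"; }
};

class Square : public Shape {
public:
    std::string Name() const override { return "square"; }
};

// No dynamic_cast: the object selects its own behavior through the virtual call.
std::string Describe(const Shape& shape)
{
    return "shape: " + shape.Name();
}
```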

Recommendation 9.3.2 Avoid using reinterpret_cast

Description: reinterpret_cast is used for converting unrelated types. Attempting to use reinterpret_cast to force-convert one type to another breaks type safety and reliability, making it an unsafe conversion. Avoid conversions between different types as much as possible.

Recommendation 9.3.3 Avoid using const_cast

Description: const_cast is used to remove the const and volatile properties of an object.

Using a pointer or reference converted by const_cast to modify a const object results in undefined behavior.

// Bad example
const int i = 1024;
int* p = const_cast<int*>(&i);
*p = 2048;      // Undefined behavior

// Bad example
class Foo {
public:
    Foo() : i(3) {}

    void Fun(int v)
    {
        i = v;
    }

private:
    int i;
};

int main(void)
{
    const Foo f;
    Foo* p = const_cast<Foo*>(&f);
    p->Fun(8);  // Undefined behavior
}

Resource Allocation and Release

Rule 9.4.1 Use delete for single object release, use delete[] for array object release

Description: Use delete for single object deletion, use delete[] for array object deletion. Reasons:

  • Actions included in calling new: Request a block of memory from the system and call the constructor of this type.
  • Actions included in calling new[n]: Request memory that can accommodate n objects and call the constructor for each object.
  • Actions included in calling delete: First call the corresponding destructor, then return the memory to the system.
  • Actions included in calling delete[]: Call the destructor for each object, then release all memory.

If the form of delete does not match the form of new that was used, the behavior is undefined. For non-class types, new and delete do not call constructors and destructors.

Incorrect writing:

const int MAX_ARRAY_SIZE = 100;
int* numberArray = new int[MAX_ARRAY_SIZE];
...
delete numberArray;
numberArray = nullptr;

Correct writing:

const int MAX_ARRAY_SIZE = 100;
int* numberArray = new int[MAX_ARRAY_SIZE];
...
delete[] numberArray;
numberArray = nullptr;
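
The matching rule can be sketched as below. The second function shows one way to avoid the manual pairing altogether, using std::unique_ptr<int[]> from C++11, which calls delete[] automatically. The function names are hypothetical.

```cpp
#include <cstddef>
#include <memory>

const std::size_t MAX_ARRAY_SIZE = 100;

// Matching forms: new[] is paired with delete[].
int FirstWithRawArray()
{
    int* numberArray = new int[MAX_ARRAY_SIZE]();   // () zero-initializes the elements
    numberArray[0] = 7;
    int first = numberArray[0];
    delete[] numberArray;
    numberArray = nullptr;
    return first;
}

// Alternative: the array specialization of std::unique_ptr uses delete[].
int FirstWithUniquePtr()
{
    std::unique_ptr<int[]> numberArray(new int[MAX_ARRAY_SIZE]());
    numberArray[0] = 7;
    return numberArray[0];    // storage is released when numberArray goes out of scope
}
```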

Recommendation 9.4.1 Use RAII features to help track dynamic allocation

Description: RAII is the abbreviation for “Resource Acquisition Is Initialization”, a simple technique that uses object lifecycle to control program resources (such as memory, file handles, network connections, mutexes, etc.).

The general practice of RAII is: acquire resources when constructing the object, then control access to resources so they remain valid throughout the object’s lifecycle, and finally release resources when the object is destructed. This approach has two major benefits:

  • We don’t need to explicitly release resources.
  • The resources required by the object remain valid throughout its lifecycle. This eliminates the need to check resource validity, simplifying logic and improving efficiency.

Example: Using RAII doesn’t require explicitly releasing mutex resources.

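class LockGuard {
public:
    explicit LockGuard(LockType& lock) : lock_(lock)
    {
        lock_.Acquire();
    }

    ~LockGuard()
    {
        lock_.Release();
    }

    LockGuard(const LockGuard&) = delete;
    LockGuard& operator=(const LockGuard&) = delete;

private:
    LockType& lock_;   // Refers to the caller's lock; storing a copy would guard a different lock
};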


bool Update()
{
    LockGuard lockGuard(mutex);
    if (...) {
        return false;
    } else {
        // Operate on data
    }
   
    return true;
}
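
The same pattern is available in the C++11 standard library as std::lock_guard. The sketch below assumes a hypothetical global counter protected by a std::mutex; the lock is released on every return path without an explicit unlock call.

```cpp
#include <mutex>

// Hypothetical shared state used only for illustration.
std::mutex g_counterMutex;
int g_counter = 0;

bool Increment(int limit)
{
    std::lock_guard<std::mutex> guard(g_counterMutex);   // locks here
    if (g_counter >= limit) {
        return false;    // guard's destructor unlocks on this path
    }
    ++g_counter;
    return true;         // and on this path
}
```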

Standard Library

Usage of the STL (Standard Template Library) varies across different products. Here are some basic rules and recommendations for team reference.

Rule 9.5.1 Do not save pointers returned by std::string’s c_str()

Description: The C++ standard does not guarantee that the pointer returned by string::c_str() stays valid indefinitely. An STL implementation may return a pointer to temporary storage and release it soon after the call. To keep the program portable, do not save the result of string::c_str(); call it directly each time the pointer is needed.

Example:

void Fun1()
{
    std::string name = "demo";
    const char* text = name.c_str();  // After the expression ends, name's lifecycle is still valid, pointer is valid

    // If non-const member functions of string are called in between, causing the string to be modified, such as operator[], begin(), etc.
    // This may cause text's content to become unavailable or not the original string
    name = "test";
    name[1] = '2';

    // Subsequent use of text pointer, its string content is no longer "demo"
}

void Fun2()
{
    std::string name = "demo";
    std::string test = "test";
    const char* text = (name + test).c_str(); // After the expression ends, the temporary object created by + is destroyed, pointer is invalid

    // Subsequent use of text pointer, it no longer points to valid memory space
}

Exception: In a few code sections with very high performance requirements, to adapt to existing functions that only accept const char* type parameters, you can temporarily save pointers returned by string::c_str(). However, you must strictly ensure that the string object’s lifecycle is longer than the saved pointer’s lifecycle, and ensure that the string object is not modified during the saved pointer’s lifecycle.
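
A minimal sketch of the rule: c_str() is called at the point of use, after all modifications to the string. LegacyLength is a hypothetical stand-in for an existing function that only accepts const char*.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for an existing C-style interface.
std::size_t LegacyLength(const char* text)
{
    return std::strlen(text);
}

std::size_t MeasureName()
{
    std::string name = "demo";
    name += "-test";
    // Good: the pointer is obtained and consumed in the same expression,
    // so it cannot outlive or pre-date a change to name.
    return LegacyLength(name.c_str());
}
```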

Recommendation 9.5.1 Use std::string instead of char*

Description: Using string instead of char* has many advantages, such as:

  1. No need to consider the terminating ‘\0’;
  2. Can directly use operators like +, =, == and other string manipulation functions;
  3. No need to consider memory allocation operations, avoiding explicit new/delete and the errors caused by them;

It should be noted that some STL implementations of string are based on copy-on-write strategy, which brings two problems: first, some versions of copy-on-write strategy are not thread-safe, which can cause program crashes in multi-threaded environments; second, when passing copy-on-write strategy-based strings with dynamic link libraries, reference counts cannot be reduced when the dynamic link library is unloaded, potentially leading to dangling pointers. Therefore, carefully choosing a reliable STL implementation is very important for ensuring program stability.

Exception: When calling system or other third-party library APIs, for already defined interfaces, you can only use char*. However, you can use string before calling the interface and use string::c_str() to get the character pointer when calling the interface. When allocating character arrays on the stack as buffers, you can directly define character arrays, do not use string, and there’s no need to use containers like vector<char>.

Rule 9.5.2 Prohibit using auto_ptr

Description: std::auto_ptr in the STL library has an implicit ownership transfer behavior, as shown in the following code:

auto_ptr<T> p1(new T);
auto_ptr<T> p2 = p1;

After executing the second line of code, p1 no longer points to the object allocated in the first line, but becomes nullptr. Because of this, auto_ptr cannot be placed in various standard containers.

Ownership transfer behavior is usually not the expected result. For scenarios where ownership must be transferred, implicit transfer methods should not be used either. This often requires programmers to be extra cautious with code using auto_ptr, otherwise accessing null pointers may occur.

There are two common scenarios for using auto_ptr: the first is passing smart pointers to functions outside where the auto_ptr was generated, and the second is using auto_ptr as an RAII management class to automatically release resources when the auto_ptr’s lifecycle ends. For the first scenario, std::shared_ptr can be used instead. For the second scenario, std::unique_ptr from the C++11 standard can be used instead. std::unique_ptr is the replacement for std::auto_ptr and supports explicit ownership transfer.

Exception: Before the C++11 standard became widely used, in scenarios where ownership transfer is necessary, std::auto_ptr can be used, but it’s recommended to encapsulate std::auto_ptr and disable the encapsulated class’s copy constructor and assignment operator to prevent this encapsulated class from being used in standard containers.
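
For comparison, a sketch of explicit ownership transfer with std::unique_ptr. Widget, Consume, and TransferExample are hypothetical names. The transfer must be written out with std::move, so it cannot happen by accident.

```cpp
#include <memory>
#include <utility>

// Hypothetical resource type used only for illustration.
struct Widget {
    int id = 0;
};

// Taking the parameter by value states that the function takes ownership.
int Consume(std::unique_ptr<Widget> widget)
{
    return widget->id;
}

int TransferExample()
{
    std::unique_ptr<Widget> p1(new Widget());
    p1->id = 5;
    // std::unique_ptr<Widget> p2 = p1;          // does not compile: copying is disabled
    std::unique_ptr<Widget> p2 = std::move(p1);  // explicit transfer; p1 is now empty
    if (p1 != nullptr) {
        return -1;
    }
    return Consume(std::move(p2));
}
```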

Recommendation 9.5.2 Use new standard header files

Description: When using C++ standard header files, please use <cstdlib> instead of <stdlib.h>.

Usage of const

Adding the keyword const before declared variables or parameters is used to indicate that the variable value cannot be tampered with (such as const int foo). Adding const qualifier to functions in a class indicates that the function will not modify the state of class member variables (such as class Foo { int Bar(char c) const; };). const variables, data members, functions, and parameters add a layer of protection for compile-time type checking, facilitating early error detection. Therefore, we strongly recommend using const wherever possible. Sometimes, using C++11’s constexpr to define true constants might be better.

Rule 9.6.1 For pointer and reference type parameters that do not need modification, please use const

Immutable values are easier to understand/track and analyze. Using const as the default option gets checked at compile time, making code more robust/secure.

class Foo;

void PrintFoo(const Foo& foo);

Rule 9.6.2 Use const modifier for member functions that will not modify member variables

Declare member functions as const whenever possible. Accessor functions should always be const. All member functions that do not modify data members should be declared as const. For virtual functions, consider from a design perspective whether all classes in the inheritance chain need to modify data members in this virtual function, rather than focusing only on individual class implementations.

class Foo {
public:

    // ...

    void PrintValue() const // const modifies member function, will not modify member variables
    {
        std::cout << value_ << std::endl;
    }

    int GetValue() const  // const modifies member function, will not modify member variables
    {
        return value_;
    }

private:
    int value_;
};

Recommendation 9.6.1 Define member variables that will not be modified after initialization as const

class Foo {
public:
    Foo(int length) : dataLength_(length) {}
private:
    const int dataLength_; 
};

Exceptions

Recommendation 9.7.1 In C++11, if a function will not throw exceptions, declare it as noexcept

Reason

  1. If a function will not throw exceptions, declaring it as noexcept allows the compiler to optimize the function to the greatest extent, such as reducing execution paths and improving error exit efficiency.
  2. For STL containers like vector, to ensure interface robustness, if the move constructor of stored elements is not declared as noexcept, then when the container expands and relocates elements, it will not use the move mechanism but the copy mechanism, bringing the risk of performance loss.

If a function cannot throw exceptions, or a program does not catch and handle exceptions thrown by a function, then this function can be decorated with the new noexcept keyword, indicating that this function will not throw exceptions or that thrown exceptions will not be caught and handled. For example:

extern "C" double sqrt(double) noexcept;  // Will never throw exceptions

// Even if exceptions might be thrown, you can use noexcept
// Here we don't plan to handle out-of-memory exceptions, simply declare the function as noexcept
std::vector<int> MyComputation(const std::vector<int>& v) noexcept
{
    std::vector<int> res = v;    // Might throw exceptions
    // do something
    return res;
}

Example

RetType Function(Type params) noexcept;   // Maximum optimization
RetType Function(Type params);            // Less optimization

// Move operations of elements stored in std::vector should be declared noexcept
class Foo1 {
public:
    Foo1() = default;
    Foo1(const Foo1& other);
    Foo1(Foo1&& other);  // no noexcept
};

std::vector<Foo1> a1;
a1.push_back(Foo1());
a1.push_back(Foo1());  // Triggers container expansion, calls copy constructor when relocating existing elements

class Foo2 {
public:
    Foo2() = default;
    Foo2(const Foo2& other);
    Foo2(Foo2&& other) noexcept;
};

std::vector<Foo2> a2;
a2.push_back(Foo2());
a2.push_back(Foo2());  // Triggers container expansion, calls move constructor when relocating existing elements

Note: Default constructors, destructors, swap functions, and move operations should not throw exceptions.
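
The effect on std::vector can be observed with a small sketch. Tracked is a hypothetical element type that counts how it is copied and moved; the static_assert uses the standard trait std::is_nothrow_move_constructible to verify the noexcept declaration at compile time.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical element type that counts how it is relocated.
class Tracked {
public:
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }

    static int copies;
    static int moves;
};

int Tracked::copies = 0;
int Tracked::moves = 0;

static_assert(std::is_nothrow_move_constructible<Tracked>::value,
              "Tracked must be nothrow-movable so that std::vector moves it when it grows");

int CopiesDuringGrowth()
{
    std::vector<Tracked> items;
    items.push_back(Tracked());
    items.push_back(Tracked());   // if storage is reallocated, existing elements are moved
    return Tracked::copies;
}
```

Removing noexcept from the move constructor makes the static_assert fail, which turns the performance risk described above into a compile-time error.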

Templates and Generic Programming

Rule 9.8.1 Prohibit generic programming in OpenHarmony projects

Generic programming and object-oriented programming have completely different ideas, concepts, and techniques. OpenHarmony projects primarily use object-oriented thinking.

C++ provides powerful generic programming mechanisms that can implement very flexible and concise type-safe interfaces, achieving code reuse for different types but same behaviors.

However, C++ generic programming has the following disadvantages:

  1. People who are not very proficient in generic programming often write object-oriented logic as templates, put members that don’t depend on template parameters in templates, etc., leading to logic confusion, code bloat, and many other problems.
  2. The techniques used in template programming are relatively obscure and difficult to understand for people who are not very proficient in C++. Code using templates in complex places is harder to read, and debugging and maintenance are very troublesome.
  3. Template programming often leads to very unfriendly compilation error messages: when code errors occur, even if the interface is very simple, complex internal implementation details of templates will be displayed in error messages, making compilation error messages very difficult to understand.
  4. If templates are used improperly, they can lead to excessive bloat of the generated code.
  5. Template code is difficult to modify and refactor. Template code is instantiated in many contexts, making it hard to confirm that a refactoring is valid for every instantiation.

Therefore, most OpenHarmony components prohibit template programming. Only a few components are allowed to use generic programming, and any templates they develop must have detailed comments. Exception:

  1. STL adaptation layers can use templates

Macros

In the C++ language, we strongly recommend using complex macros as little as possible.

  • For constant definitions, please use const or enums as described in previous chapters;
  • For macro functions, keep them as simple as possible and follow the principles below, and prefer to use inline functions, template functions, etc. for replacement.
// Not recommended to use macro functions
#define SQUARE(a, b) ((a) * (b))

// Please use template functions, inline functions, etc. for replacement.
template<typename T> T Square(T a, T b) { return a * b; }

If you need to use macros, please refer to the relevant chapters of the C language specification. Exception: Some universal and mature applications, such as: encapsulation handling of new and delete, can retain the use of macros.
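One reason to prefer inline functions is that a macro may evaluate its arguments more than once. The following sketch illustrates this; MACRO_MAX, Max, and the two helper functions are hypothetical names that do not come from this specification.

```cpp
// Hypothetical macro, shown only to illustrate the pitfall.
#define MACRO_MAX(a, b) ((a) > (b) ? (a) : (b))

// Inline replacement: each argument is evaluated exactly once.
inline int Max(int a, int b) { return a > b ? a : b; }

// Returns how many times the argument expression runs when passed to the macro.
inline int EvaluationsWithMacro()
{
    int calls = 0;
    auto next = [&calls]() { return ++calls; };
    int result = MACRO_MAX(next(), 0);   // expands to two next() calls on this path
    (void)result;
    return calls;
}

// Returns how many times the argument expression runs when passed to the function.
inline int EvaluationsWithFunction()
{
    int calls = 0;
    auto next = [&calls]() { return ++calls; };
    int result = Max(next(), 0);         // next() runs once, before the call
    (void)result;
    return calls;
}
```

EvaluationsWithMacro() returns 2, while EvaluationsWithFunction() returns 1.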

10 Modern C++ Features

With ISO’s release of the C++11 language standard in 2011 and C++17 in 2017, modern C++ (C++11/14/17, etc.) has added numerous new language features and standard libraries that improve programming efficiency and code quality. This chapter describes some guidelines that can help teams use modern C++ more efficiently and avoid language traps.

Code Conciseness and Safety Improvements

Recommendation 10.1.1 Use auto reasonably

Reason

  • auto can avoid writing lengthy, repetitive type names, and also ensures initialization when defining variables.
  • auto type deduction rules are complex and require careful understanding.
  • If an explicit type makes the code clearer, keep using the explicit type; use auto only for local variables.

Example

// Avoid lengthy type names
std::map<std::string, int>::iterator iter = m.find(val);
auto iter = m.find(val);

// Avoid repeating type names
class Foo {...};
Foo* p = new Foo;
auto p = new Foo;

// Ensure initialization
int x;    // Compiles correctly, no initialization
auto x;   // Compilation fails, must initialize

auto type deduction can lead to confusion:

auto a = 3;           // int
const auto ca = a;    // const int
const auto& ra = a;   // const int&
auto aa = ca;         // int, ignores const and reference
auto ila1 = { 10 };   // std::initializer_list<int>
auto ila2{ 10 };      // int since C++17 (std::initializer_list<int> under the original C++11 rule)

auto&& ura1 = x;      // int&
auto&& ura2 = ca;     // const int&
auto&& ura3 = 10;     // int&&

const int b[10] = {};
auto arr1 = b;        // const int*
auto& arr2 = b;       // const int(&)[10]

If you don’t pay attention to auto type deduction ignoring references, it might introduce hard-to-detect performance issues:

std::vector<std::string> v;
auto s1 = v[0];  // auto deduced as std::string, copies v[0]

If auto is used to define interfaces, such as constants in header files, the type may change silently when a developer modifies the value.
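The copy introduced by plain auto, and the way const auto& avoids it, can be checked with a short sketch. AutoCopiesElement is a hypothetical function name used only for this illustration.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Returns true if plain auto produced an independent copy while
// const auto& stayed bound to the vector element.
inline bool AutoCopiesElement()
{
    std::vector<std::string> v{"first"};

    auto copy = v[0];          // deduced as std::string: copies v[0]
    const auto& ref = v[0];    // deduced as const std::string&: no copy

    static_assert(std::is_same<decltype(copy), std::string>::value,
                  "auto drops the reference");
    static_assert(std::is_same<decltype(ref), const std::string&>::value,
                  "const auto& keeps the reference");

    v[0] = "changed";          // assigns to the existing element; ref stays valid
    return copy == "first" && ref == "changed";
}
```

The function returns true: the copy keeps the old value, while the reference observes the change.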

Rule 10.1.1 Use override or final keywords when overriding virtual functions

Reason Both override and final keywords ensure that the function is virtual and overrides a base class virtual function. If the subclass function prototype is inconsistent with the base class function, a compilation error is generated. final also ensures that the virtual function won’t be overridden by subclasses.

After using override or final keywords, if you modify the base class virtual function prototype but forget to modify the subclass’s overridden virtual function, it can be discovered at compile time. It also avoids missing modifications when there are multiple subclasses overriding virtual functions.

Example

class Base {
public:
    virtual void Foo();
    virtual void Foo(int var);
    void Bar();
};

class Derived : public Base {
public:
    void Foo() const override; // Compilation failure: Derived::Foo and Base::Foo prototypes are inconsistent, not an override
    void Foo() override;       // Correct: Derived::Foo overrides Base::Foo
    void Foo(int var) final;   // Correct: Derived::Foo(int) overrides Base::Foo(int), and Derived's derived classes can no longer override this function
    void Bar() override;       // Compilation failure: Base::Bar is not a virtual function
};

Summary

  1. When defining a virtual function for the first time in a base class, use the virtual keyword
  2. When a subclass overrides a base class virtual function (including destructors), use override or final keywords (but not both together), and do not use the virtual keyword
  3. For non-virtual functions, do not use virtual, override, or final
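The following sketch shows what can go wrong without override. The class and function names are hypothetical; the point is that a mismatched prototype silently declares a new function instead of overriding the base class one.

```cpp
struct Base {
    virtual ~Base() = default;
    virtual int Version(int) { return 1; }
};

struct Unchecked : Base {
    // Parameter type differs from Base::Version(int). Without override this
    // compiles and silently introduces a new function that hides the base one.
    int Version(long) { return 2; }
};

struct Checked : Base {
    // With override, a mismatched prototype would be rejected by the compiler.
    int Version(int) override { return 2; }
};

// Calls Version through the base class interface.
inline int CallThroughBase(Base& b) { return b.Version(0); }
```

Calling CallThroughBase on an Unchecked object still runs Base::Version and returns 1; on a Checked object it returns 2.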

Rule 10.1.2 Use delete keyword to delete functions

Reason Compared to declaring class member functions as private but not implementing them, the delete keyword is more explicit and has a wider scope of application.

Example

class Foo {
private:
    // Looking at just the header file, you don't know if the copy constructor is deleted
    Foo(const Foo&);
};

class Foo {
public:
    // Explicitly delete copy assignment function
    Foo& operator=(const Foo&) = delete;
};

The delete keyword also supports deleting non-member functions

template<typename T>
void Process(T value);

template<>
void Process<char>(char) = delete;   // Process can no longer be called with a char argument
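As a further sketch, deleted functions can be verified with type traits, and a deleted overload can block an unwanted implicit conversion. Handle and SetLevel are hypothetical names used only for this illustration.

```cpp
#include <type_traits>

// Hypothetical non-copyable class: both copy operations are deleted explicitly.
class Handle {
public:
    Handle() = default;
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
};

static_assert(!std::is_copy_constructible<Handle>::value,
              "copy construction is deleted");
static_assert(!std::is_copy_assignable<Handle>::value,
              "copy assignment is deleted");

// Deleting an overload of a non-member function rejects a specific argument type.
inline int SetLevel(int level) { return level; }
int SetLevel(double) = delete;   // SetLevel(1.5) is a compile error, not a truncation
```

With the private-but-unimplemented approach, misuse inside the class is only reported at link time; with = delete it is reported at compile time at the point of use.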

Rule 10.1.3 Use nullptr instead of NULL or 0

Reason For a long time, C++ lacked a keyword representing null pointers, which was quite awkward:

#define NULL ((void *)0)

char* str = NULL;   // Error: void* cannot be automatically converted to char*

void(C::*pmf)() = &C::Func;
if (pmf == NULL) {} // Error: void* cannot be automatically converted to pointer to member function

If NULL is defined as 0 or 0L, it can solve the above problems.

Or directly use 0 where null pointers are needed. But this introduces another problem: unclear code, especially when using auto automatic deduction:

auto result = Find(id);
if (result == 0) {  // Does Find() return a pointer or an integer?
    // do something
}

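0 is literally an int (0L is a long), so neither NULL nor 0 is a pointer type. When a function is overloaded on pointer and integer types, passing 0 calls the integer overload, and passing NULL either calls the integer overload or is ambiguous, depending on how NULL is defined. Neither selects the pointer overload:

void F(int);
void F(int*);

F(0);      // Calls F(int), not F(int*)
F(NULL);   // Calls F(int) if NULL is defined as 0, ambiguous if NULL is defined as 0L; never F(int*)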

Additionally, sizeof(NULL) == sizeof(void*) doesn’t always hold true, which is also a potential risk.

Summary: Directly using 0 or 0L results in unclear code and cannot achieve type safety; using NULL cannot achieve type safety. These are all potential risks.

The advantage of nullptr is not only that it literally represents a null pointer, making code clear, but also that it is no longer an integer type.

nullptr is of type std::nullptr_t, and std::nullptr_t can be implicitly converted to all raw pointer types, allowing nullptr to behave as a null pointer to any type.

void F(int);
void F(int*);
F(nullptr);   // Calls F(int*)

auto result = Find(id);
if (result == nullptr) {  // Find() returns a pointer
    // do something
}
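The overload behavior can be checked with a small sketch. Which is a hypothetical function name; its two overloads report which one was selected.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Two overloads that report which one overload resolution picked.
inline std::string Which(int) { return "int"; }
inline std::string Which(int*) { return "pointer"; }

// nullptr has its own type, distinct from every integer type.
static_assert(std::is_same<decltype(nullptr), std::nullptr_t>::value,
              "nullptr is of type std::nullptr_t");
static_assert(!std::is_integral<std::nullptr_t>::value,
              "std::nullptr_t is not an integer type");
```

Which(0) returns "int", while Which(nullptr) returns "pointer".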

Rule 10.1.4 Use using instead of typedef

Before C++11, type aliases were defined through typedef. No one wants to write a type like std::map<uint32_t, std::vector<int>> over and over.

typedef std::map<uint32_t, std::vector<int>> SomeType;

Type aliases are actually encapsulations of types. Through encapsulation, code can be made clearer, and to a large extent, shotgun-style modifications caused by type changes can be avoided. Since C++11, using is provided to implement alias declarations:

using SomeType = std::map<uint32_t, std::vector<int>>;

Comparing the formats of both:

typedef Type Alias;   // Is Type first, or Alias first?
using Alias = Type;   // Conforms to 'assignment' usage, easy to understand, less error-prone

If this point isn’t enough to switch to using, let’s look at template aliases:

// Define template alias, one line of code
template<class T>
using MyAllocatorVector = std::vector<T, MyAllocator<T>>;

MyAllocatorVector<int> data;       // Using alias defined with using

template<class T>
class MyClass {
private:
    MyAllocatorVector<T> data_;   // Using alias defined with using in template class
};

typedef, by contrast, doesn’t support aliases with template parameters, so you can only take a “roundabout approach”:

// Wrap typedef through templates, need to implement a template class
template<class T>
struct MyAllocatorVector {
    typedef std::vector<T, MyAllocator<T>> type;
};

MyAllocatorVector<int>::type data;  // Using alias defined with typedef, extra ::type

template<class T>
class MyClass {
private:
    typename MyAllocatorVector<T>::type data_;  // Using in template class, besides ::type, also need to add typename
};
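The two spellings can be compared in a self-contained sketch that uses only standard containers. NameMap, NameMapOld, and Registry are hypothetical names chosen for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>
#include <vector>

// Alias template: one declaration, used like any other template.
template<class T>
using NameMap = std::map<std::string, std::vector<T>>;

// The typedef equivalent needs a wrapper struct and a nested ::type.
template<class T>
struct NameMapOld {
    typedef std::map<std::string, std::vector<T>> type;
};

static_assert(std::is_same<NameMap<int>, NameMapOld<int>::type>::value,
              "both spellings name the same type");

template<class T>
class Registry {
public:
    void Add(const std::string& name, const T& value) { data_[name].push_back(value); }

    std::size_t Count(const std::string& name) const
    {
        auto it = data_.find(name);
        return it == data_.end() ? 0 : it->second.size();
    }

private:
    NameMap<T> data_;   // dependent type, yet neither typename nor ::type is needed
};
```

With NameMapOld, the member would have to be written as typename NameMapOld<T>::type data_;.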

Rule 10.1.5 Prohibit using std::move on const objects

Literally, std::move means to move an object. Const objects are not allowed to be modified, so naturally they cannot be moved. Therefore, using std::move on const objects brings confusion to code readers. In terms of actual functionality, std::move converts the object to an rvalue reference type; for const objects, it converts them to const rvalue references. Since very few types define move constructors and move assignment operators with const rvalue reference parameters, the actual functionality of the code often degrades to object copying rather than object moving, bringing performance losses.

Incorrect example:

std::string g_string;
std::vector<std::string> g_stringList;

void func()
{
    const std::string myString = "String content";
    g_string = std::move(myString); // bad: doesn't move myString, but copies instead
    const std::string anotherString = "Another string content";
    g_stringList.push_back(std::move(anotherString));    // bad: doesn't move anotherString, but copies instead
}
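The silent fallback to copying can be made visible with a type that records which constructor ran. Tracked and the two helper functions are hypothetical names used only for this sketch.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type that records how it was constructed.
struct Tracked {
    Tracked() = default;
    Tracked(const Tracked&) : copied(true) {}
    Tracked(Tracked&&) noexcept : moved(true) {}
    bool copied = false;
    bool moved = false;
};

// std::move on a const object yields const Tracked&&, which cannot bind to
// Tracked&&, so overload resolution falls back to the copy constructor.
inline bool MoveFromConstCopies()
{
    const Tracked source{};
    static_assert(std::is_same<decltype(std::move(source)), const Tracked&&>::value,
                  "std::move keeps the const qualifier");
    Tracked target(std::move(source));
    return target.copied && !target.moved;
}

// The same code with a non-const source selects the move constructor.
inline bool MoveFromNonConstMoves()
{
    Tracked source{};
    Tracked target(std::move(source));
    return target.moved && !target.copied;
}
```

Both functions return true, which confirms that the const version copies while the non-const version moves.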

Smart Pointers

Rule 10.2.1 For singletons, class members, etc., where ownership will not be held by multiple parties, prioritize using raw pointers instead of smart pointers

Reason Smart pointers automatically release object resources to avoid resource leaks, but they bring additional overhead, such as the classes generated for smart pointers, construction and destruction costs, and higher memory usage.

For situations where object ownership will not be held by multiple parties, such as singletons and class members, resources can be released only in the class destructor. Smart pointers should not be used to add additional overhead.

Example

class Foo;
class Base {
public:
    Base() {}
    virtual ~Base()
    {
        delete foo_;
    }
private:
    Foo* foo_ = nullptr;
};

Exception

  1. When returning created objects, if a pointer destruction function is needed, smart pointers can be used.
class User;
class Foo {
public:
    std::unique_ptr<User, void(*)(User *)> CreateUniqueUser() // Can use unique_ptr to ensure object creation and release are in the same runtime
    {
        sptr<User> ipcUser = iface_cast<User>(remoter);
        return std::unique_ptr<User, void(*)(User *)>(::new User(ipcUser), [](User *user) {
            user->Close();
            ::delete user;
        });
    }

    std::shared_ptr<User> CreateSharedUser() // Can use shared_ptr to ensure object creation and release are in the same runtime
    {
        sptr<User> ipcUser = iface_cast<User>(remoter);
        return std::shared_ptr<User>(ipcUser.GetRefPtr(), [ipcUser](User *user) mutable {
            ipcUser = nullptr;
        });
    }
};
  2. When returning created objects and the object needs to be referenced by multiple parties, shared_ptr can be used.

Rule 10.2.2 Use std::make_unique instead of new to create unique_ptr

Reason

  1. make_unique provides a more concise creation method
  2. Ensures exception safety for complex expressions

Example

// Bad: MyClass appears twice, repetition leads to inconsistency risk
std::unique_ptr<MyClass> ptr(new MyClass(0, 1));
// Good: MyClass appears only once, no possibility of inconsistency
auto ptr = std::make_unique<MyClass>(0, 1);

Repeating types can lead to very serious problems that are hard to detect:

// Compiles correctly, but new and delete don't match
std::unique_ptr<uint8_t> ptr(new uint8_t[10]);
std::unique_ptr<uint8_t[]> ptr(new uint8_t);

// Not exception safe before C++17: the compiler might interleave argument evaluation in the following order:
// 1. Allocate memory for Foo,
// 2. Construct Foo,
// 3. Call Bar,
// 4. Construct unique_ptr<Foo>.
// If Bar throws an exception, Foo won't be destroyed, causing memory leak.
F(unique_ptr<Foo>(new Foo()), Bar());

// Exception safe: allocation and unique_ptr construction happen inside one function call and cannot be interleaved with Bar().
F(make_unique<Foo>(), Bar());

Exception std::make_unique doesn’t support custom deleters. In scenarios where a custom deleter is needed, it’s recommended to implement a custom version of make_unique in your own namespace. Using new directly to create a unique_ptr with a custom deleter is the last resort.
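One possible shape for such a helper is sketched below. The namespace example and the function MakeUniqueWithDeleter are hypothetical names, not part of the standard library or of this specification; the sketch assumes the deleter is a movable callable that takes a T*.

```cpp
#include <memory>
#include <utility>

namespace example {   // hypothetical project namespace

// Creates a T from args and wraps it in a unique_ptr that uses the given deleter.
// The type T is written once by the caller, and the new expression is kept
// inside this helper instead of being repeated at every call site.
template<typename T, typename Deleter, typename... Args>
std::unique_ptr<T, Deleter> MakeUniqueWithDeleter(Deleter deleter, Args&&... args)
{
    return std::unique_ptr<T, Deleter>(new T(std::forward<Args>(args)...),
                                       std::move(deleter));
}

}  // namespace example
```

A call such as example::MakeUniqueWithDeleter<int>(deleter, 7) then yields a unique_ptr whose deleter runs exactly once, when the pointer goes out of scope.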

Rule 10.2.4 Use std::make_shared instead of new to create shared_ptr

Reason Besides the consistency benefits it shares with std::make_unique, std::make_shared also has a performance advantage. std::shared_ptr manages two entities:

  • Control block (stores reference count, deleter, etc.)
  • Managed object

When std::make_shared creates a std::shared_ptr, it allocates enough heap memory at once to hold both the control block and the managed object. When std::shared_ptr<MyClass>(new MyClass) is used instead, new MyClass triggers one heap allocation and the std::shared_ptr constructor triggers a second one for the control block, generating additional overhead.

Exception Similar to std::make_unique, std::make_shared doesn’t support custom deleters.

Lambda

Recommendation 10.3.1 Choose to use lambda when functions don’t work (capture local variables, or write local functions)

Reason Functions cannot capture local variables or be declared within local scope; if you need these things, choose lambda whenever possible, rather than hand-written functor. On the other hand, lambda and functor cannot be overloaded; if you need overloading, use functions. In scenarios where both lambda and functions work, prioritize functions; use the simplest tool possible.

Example

// Write a function that only accepts int or string
// -- overloading is the natural choice
void F(int);
void F(const string&);

// Need to capture local state, or appear in statement or expression scope
// -- lambda is the natural choice
vector<Work> v = LotsOfWork();
for (int taskNum = 0; taskNum < max; ++taskNum) {
    pool.Run([=, &v] {...});
}
pool.Join();

Rule 10.3.1 When using lambdas in non-local scope, avoid using reference capture

Reason Uses of lambdas in non-local scope include returning them, storing them on the heap, and passing them to other threads. Local pointers and references should not exist outside their scope. A lambda that captures by reference stores references to local objects. If this would let the references outlive the local variables, reference capture should not be used.

Example

// Bad
void Foo()
{
    int local = 42;
    // Capture local by reference.
    // After the function returns, local no longer exists,
    // Therefore Process()'s behavior is undefined!
    threadPool.QueueWork([&]{ Process(local); });
}

// Good
void Foo()
{
    int local = 42;
    // Capture local by value.
    // Because of copying, local is always valid during Process() calls
    threadPool.QueueWork([=]{ Process(local); });
}
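Returning a lambda is another non-local use. The sketch below assumes a hypothetical function MakeAdder; because the lambda captures by value, it stays valid after the function that created it has returned.

```cpp
#include <functional>

// Returns a callable that outlives this function. Both captured variables are
// copied into the lambda, so no reference to a dead local is left behind.
inline std::function<int()> MakeAdder(int base)
{
    int offset = 10;
    return [base, offset]() { return base + offset; };
}
```

MakeAdder(5)() evaluates to 15. Writing [&] here instead would leave the returned callable holding dangling references to base and offset.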

Recommendation 10.3.2 If capturing this, explicitly capture all variables

Reason In member functions, [=] appears to be capture by value. But because it implicitly gets the this pointer by value and can operate on all member variables, data members are actually captured by reference, which is generally recommended to avoid. If you really need to do this, explicitly write the capture of this.

Example

class MyClass {
public:
    void Foo()
    {
        int i = 0;

        auto Lambda = [=]() { Use(i, data_); };   // Bad: looks like copy/capture by value, member variables are actually captured by reference

        data_ = 42;
        Lambda(); // Calls Use(0, 42);
        data_ = 43;
        Lambda(); // Calls Use(0, 43);

        auto Lambda2 = [i, this]() { Use(i, data_); }; // Good, explicitly specify capture by value, most clear, least confusing
    }

private:
    int data_ = 0;
};
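When a real copy of a member is wanted, a C++14 init capture can make that explicit. The sketch below uses a hypothetical class Counter to contrast capturing this with copying the member.

```cpp
class Counter {
public:
    // Captures this explicitly: the lambda reads the member through the
    // pointer, so it observes the later assignment.
    int ReadThroughThis()
    {
        auto read = [this]() { return value_; };
        value_ = 42;
        return read();
    }

    // Copies the member into the closure with an init capture: the lambda
    // keeps the value the member had when the lambda was created.
    int ReadThroughCopy()
    {
        auto read = [value = value_]() { return value; };
        value_ = 42;
        return read();
    }

private:
    int value_ = 0;
};
```

On a fresh Counter, ReadThroughThis() returns 42, while ReadThroughCopy() on another fresh Counter returns 0.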

Recommendation 10.3.3 Avoid using default capture modes

Reason Lambda expressions provide two default capture modes: by reference (&) and by value (=). Default capture by reference implicitly captures references to all local variables, easily leading to access to dangling references. In contrast, explicitly writing the variables that need to be captured makes it easier to check object lifecycles and reduces the chance of errors. Default capture by value implicitly captures the this pointer, and it’s difficult to see which variables the lambda function depends on. If static variables exist, it will also mislead readers into thinking the lambda copied a static variable. Therefore, you should generally explicitly write the variables that the lambda needs to capture, rather than using default capture modes.

Incorrect example

auto func()
{
    int addend = 5;
    static int baseValue = 3;

    return [=]() {  // Actually only copies addend
        ++baseValue;    // Modification will affect the static variable's value
        return baseValue + addend;
    };
}

Correct example

auto func()
{
    int addend = 5;
    static int baseValue = 3;

    return [addend, baseValue = baseValue]() mutable {  // Use C++14's capture initialization to copy a variable
        ++baseValue;    // Modify your own copy, won't affect the static variable's value
        return baseValue + addend;
    };
}

Reference: “Effective Modern C++”: Item 31: Avoid default capture modes.

Interfaces

Recommendation 10.4.1 In scenarios not involving ownership, use T* or T& as parameters instead of smart pointers

Reason

  1. Only use smart pointers to transfer or share ownership when you need to clearly define ownership mechanisms.
  2. Passing through smart pointers limits function callers to use smart pointers (e.g., if the caller wants to pass this).
  3. Passing smart pointers with shared ownership has runtime overhead.

Example

// Accept any int*
void F(int*);

// Only accept int where ownership transfer is intended
void G(unique_ptr<int>);

// Only accept int where shared ownership is intended
void G(shared_ptr<int>);

// Don't change ownership, but need callers with specific ownership
void H(const unique_ptr<int>&);

// Accept any int
void H(int&);

// Bad
void F(shared_ptr<Widget>& w)
{
    // ...
    Use(*w); // Only uses w -- completely doesn't involve lifecycle management
    // ...
}
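A corrected version of the last case can be sketched as follows. Widget, Use, Consume, and Share are hypothetical definitions added for illustration; they show that a function taking a plain reference serves every caller regardless of how the object is owned.

```cpp
#include <memory>
#include <utility>

struct Widget {
    int value = 0;
};

// No ownership involved: works for stack objects and for any smart pointer.
inline int Use(const Widget& w) { return w.value; }

// Ownership transfer is intended: the caller gives the object up.
inline int Consume(std::unique_ptr<Widget> w) { return w->value; }

// Shared ownership is intended: the parameter holds one more owner for the
// duration of the call.
inline long Share(std::shared_ptr<Widget> w) { return w.use_count(); }
```

A caller holding a std::shared_ptr<Widget> simply passes *ptr to Use, so the function signature does not force any particular ownership model on its callers.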