C++ Coding Standards – time is the answer

The following is not intended to generate a holy war, but merely a place for me to remind myself of the decisions I took along the way to get to a consistent style for any new code that I write.

#pragma warning( push, 3 )
#include <stdio.h>
#include <math.h>   // floorf
#pragma warning( pop )

// /Wall Used. Compiler warnings turned off.
#pragma warning (disable :  5045) // Spectre code insertion warning.

class entity
{
    static constexpr size_t     _maxEntities = 1000;
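    entity(const char* name)
    {
        // An array cannot be assigned to, so copy the characters and terminate.
        size_t index = 0;
        for (; index < sizeof(_name) - 1 && name[index] != '\0'; index++)
        {
            _name[index] = name[index];
        }
        _name[index] = '\0';
    };
    void set();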
    char    _name[8];
};

inline float _floor(const float& a)
{
    if (a == 0.0f) return 0.0f;
    return floorf(a);
}

Tabs vs Spaces

Tabs are set to 4, with the editor set to replace all tabs with spaces. Shift-Tab will backspace a tab’s worth of spaces. Spaces are used so that the code looks the same when cut and pasted into other environments, like these blogs, email, and online code links to godbolt.org.

Compiler Warnings and Errors

While it can slow down some fast coding, I find it much more beneficial to keep on top of whatever code analysis can be done in real time. Thus I set the warning level to the max and also set warnings as errors, asking the compiler and static analysis to give as much feedback as possible in real time.

Use /Wall and then turn off, using a pragma, any warnings that are not needed. Being explicit about what you are prepared to ignore is a much better position to be in.

#pragma warning( push, 3 )
#include <stdio.h>
#pragma warning( pop )

// /Wall Used. Compiler warnings turned off.
#pragma warning (disable :  5045) // Spectre code insertion warning.
#pragma warning (disable :  4710) // Compiler decided not to inline.

Braces

Allman style with forced braces even for one line. Collapsing code into one neat little line is very tempting, but the key reasons not to are code readability and having a place to step to inside the debugger. Vertical space is not that critical if the code is clear, simple and broken down to low cyclomatic complexity. I keep trying and can never get on with K&R style and its variants.

These exceptions prove the rule (grin):

Super trivial lines, typically with return for early out in a function. It helps to get these visually out of the way.

if (lines < 0) return;

For code using if constexpr() where it is a one liner selection, that can be without the braces.

if constexpr (isStack)
    newBlock = alloca(_poolBlockSize);
else
    newBlock = malloc(_poolBlockSize);
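
As a minimal sketch of the brace-free if constexpr selection in a complete function: the template, the isZeroed flag and the use of calloc are all made up for illustration and are not the pool code referred to above. It needs C++17.

```cpp
#include <stdlib.h>

// Hypothetical allocator helper: isZeroed is a compile-time flag, so only one
// of the two branches is ever compiled into each instantiation.
template <bool isZeroed>
void* allocateBlock(size_t blockSize)
{
    void* newBlock{nullptr};
    if constexpr (isZeroed)
        newBlock = calloc(1, blockSize);
    else
        newBlock = malloc(blockSize);
    return newBlock;
}
```

Calling allocateBlock<true>(64) picks the calloc branch at compile time, and the caller still has to free() the result.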

Identifiers

Take your pick from: camelCase, PascalCase, snake_case, SCREAMING_SNAKE_CASE.

My preference is camelCase as that is easier to type than snake_case, even if the latter feels more C++. Modern C++ has moved away from SCREAMING_SNAKE_CASE and I think that is a good move.

I tried snake_case for a while and the underscore is not that much of an issue because of the autocomplete. I still feel that camelCase is more readable and natural. It also takes up less space on a line.

There is also no need to use PascalCase for types and classes. The IDE can work that out.

camelCase it is then.

The exception proves the rule…

There is a special case for old-style #define. This should only be used when a constexpr could not be, which should only be when the preprocessor is needed. See the _assert code and the defer() and _assertEndOfFile(filename) implementations. Note that where the defer needs to look like normal syntax it gets an exception to the exception, so that it does not stand out in normal use. This is an example:

#define DEFER_VARIABLE CONCATENATE(_defer, __LINE__)
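
To show where a macro like this fits, here is a rough sketch of a defer built on it. It is in the spirit of the Ginger Bill article credited in the Copyright section, but deferRunner, deferTag, CONCATENATE_INNER and deferExample are names invented for this sketch, not the real implementation. It assumes C++17, so that the returned runner is constructed in place and the deferred code runs exactly once.

```cpp
// Two-step concatenation so that __LINE__ is expanded before pasting.
#define CONCATENATE_INNER(a, b) a##b
#define CONCATENATE(a, b) CONCATENATE_INNER(a, b)
#define DEFER_VARIABLE CONCATENATE(_defer, __LINE__)

// Hypothetical helper: runs the stored callable when it goes out of scope.
template <typename F>
struct deferRunner
{
    F _function;
    deferRunner(F function) : _function(function) {}
    ~deferRunner() { _function(); }
};

// Hypothetical tag type so that 'deferTag{} + lambda' can deduce the lambda type.
struct deferTag {};

template <typename F>
deferRunner<F> operator+(deferTag, F function)
{
    return deferRunner<F>(function);
}

// Lower case on purpose, so that it reads like normal syntax where it is used.
#define defer auto DEFER_VARIABLE = deferTag{} + [&]()

int deferExample()
{
    int order{0};
    {
        defer { order = order * 10 + 2; };  // Runs second, at scope exit.
        order = order * 10 + 1;             // Runs first.
    }
    return order;
}
```

The unique variable name per line is what allows more than one defer in the same scope, as long as they sit on different lines.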

Numbers

Use the C++14 digit separator for long numbers.

    int         start{1'231'100};
    float       place{12.001'098f};
    byte        mask{0b1111'0000};
    uint32_t    registerValue{0x1ff0'0008};   // 'register' is a keyword.
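
A quick compile-time check that the separators are purely visual. The constant name is made up for this sketch.

```cpp
#include <stdint.h>

// The separators are ignored by the compiler; they only help the reader.
static_assert(1'231'100 == 1231100, "decimal");
static_assert(0b1111'0000 == 0xF0, "binary");
static_assert(0x1ff0'0008 == 0x1ff00008, "hexadecimal");
static_assert(12.001'098 == 12.001098, "floating point");

// Hypothetical constant, grouped in 16-bit halves for readability.
constexpr uint32_t registerAddress{0x1ff0'0008};
```

The grouping is free-form, so use whatever split matches the data: thousands for decimal, nibbles or bytes for binary and hex.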

Pointers

The * binds to the type; the variable is then of a pointer type.

    int*    counterList;

This leads us on to Hungarian Notation.

Hungarian or English?

No Prefix

I grew up with strong Hungarian notation and it makes sense in C where the typing is very weak. For modern C++ there is a view that this is not necessary any more as the compiler will warn of any conversions. That leaves member variables and static/global variables. While the m_ approach is nice, if you felt you did not need prefixes elsewhere then why here?

Underscores for local scope members

My solution was arrived at by realising the member and static variables are to some degree special and that this can follow the direction indicated by the reserved identifier formats (__example, _ANOTHER). I like the notation to show the intent that these things are ‘part of inner code’. The same approach is to be used for static variables in the class or struct scope and also for local functions that clash with standard library versions.

class entity
{
    static constexpr size_t     _maxEntities{1000};
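    entity(const char* name)
    {
        // An array cannot be assigned to, so copy the characters and terminate.
        size_t index = 0;
        for (; index < sizeof(_name) - 1 && name[index] != '\0'; index++)
        {
            _name[index] = name[index];
        }
        _name[index] = '\0';
    };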
    char    _name[8];
};

inline float _floor(const float& a);

It turns out that the community does not like an underscore starting a variable name, and there are some very strong views, to the point that it is thought of as a hard C++ rule. That said, the true official rules can be seen at cppreference.com:

the identifiers with a double underscore anywhere are reserved;
the identifiers that begin with an underscore followed by an uppercase letter are reserved;
the identifiers that begin with an underscore are reserved in the global namespace.

This has been interpreted by the community to mean that even in a local namespace this should be banned. Given the level of heat in this subject it would seem best to not use this in public places like Stack Overflow.

Getters and Setters

I have never liked the excess code needed to provide access to internal member variables. The mantra is to have getters and setters for all the values a class may need externally. Most of the time this is just access to a variable that is well understood to be part of the class. The real benefit of function getters and setters is that you can change the implementation underneath the user of the class, and that it gets you read-only protection so you don’t accidentally change an internal value. However, when the variable is key to the class it is very unlikely to change, and the const protection can still be given with a simple external reference. The naming works well too, as the prefix underscore is used on the member variable that the access is provided to.

private: // members
    int _i{2};
    int _j{3};
    int* _pointer{nullptr};
public: // getters and setters
    int  const  &i{_i};
    int         &j{_j};
    int* const  &pointer{_pointer};
};

...

    int _i{2};
    int _j{3};
    int* _pointer{nullptr};
public: // getters and setters
    int i() { return _i; };
    int j() { return _j; };
    void j(int j) { _j = j; };
    int* pointer() { return _pointer; };

Note the position of the const so it is the same for the more tricky case of a pointer etc. Also put the & next to the variable name. This is different to the pointer case of

int* pointer;

The performance is the same when compiled for release with optimisation on. This can be seen here at this Godbolt example.
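
As a self-contained sketch of the reference approach in use: the class name counter, the increment() method and the helper function are invented for illustration.

```cpp
class counter
{
private: // members
    int _i{2};
    int _j{3};
public: // getters and setters
    int const  &i{_i};  // Read only from outside the class.
    int        &j{_j};  // Read and write.

    void increment() { _i++; }
};

int referenceGetterExample()
{
    counter c;
    c.increment();      // _i is now 3 and c.i sees the change.
    c.j = 10;           // Writes through to _j.
    // c.i = 5;         // Would not compile: i refers to a const int.
    return c.i + c.j;
}
```

Two things to watch with this approach. A class with reference members loses its default copy assignment, and a default copy construction binds the new object's references to the members of the object it was copied from, so copying needs to be written by hand or disabled. Each reference member also typically takes up a pointer's worth of space in the object.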

I have now seen another case that explains why some people prefer the binding of the * and & to the variable. This case forces the reference to the variable:

public:
    i32     (&size)[2]{_size};
private:
    i32     _size[2]{0, 0};

...

private: // members
    int        _i{2};
    int        _j{3};
    int*       _pointer{nullptr};
public: // getters and setters
    const int  &i{_i};
    int        &j{_j};
    int* const &pointer{_pointer};
};

Thus this implies that, to be consistent, the pointer and reference syntax needs to sit next to the variable. I will try that for a while and see if it is too ugly to use. Due to needing int* const, the * can’t be put next to the variable name.

True Globals

For true global variables, use a single global struct called ‘g’. This allows for a central way to create a warning that these values must be written to very carefully.

struct globals
{
    int     count = 0;
    bool    isGameRunning = false;
};
struct globals g;
...
g.count++;    // Not thread safe.

Filenames

camelCase can also be used for filenames. This is in contradiction to keeping all filenames lowercase to help with any accidental file use under Windows due to case-insensitivity. But for consistency’s sake and a nice look it will be used here as well.

hashTable.h
imageScaler.h
vectorTests.cpp

Template Meta Programming Style

All this code is to be done without STL. This gets hard when wanting to use templates. Using templates for classes can be dangerous for complexity and compile times, so they need to be used sparingly. It is expected to only use them in a few key places to overcome some C++ language limitations. The pair and hash map classes are good examples: to get them to work efficiently they need some template tricks. In these cases the template helpers like is_same_v have been lifted out of STL and put in the defines.h file. They are not perfect implementations, so care must be taken when using them to check they are doing what is wanted.

declval was needed to check if a supplied class to a hash map has its own hash function.
add_rvalue_reference was needed for declval.
type_identity was needed for add_rvalue_reference.
enable_if was needed for getting movable types into the pair instantiation.
is_pod was used as a cheap substitute for is_moveable in the pair class.
is_float was used for is_pod and other use cases.
is_integer was used for is_pod and other use cases.
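
To give a feel for what lifting these out of STL involves, here is a simplified sketch of three of the helpers. It is not the contents of defines.h: declval is cut down and skips the add_rvalue_reference and type_identity steps listed above (so it does not handle void), and has_hash, twice and the two test structs are invented for this example. It assumes C++17.

```cpp
namespace project
{
template <typename T, typename U> struct is_same       { static constexpr bool value = false; };
template <typename T>             struct is_same<T, T> { static constexpr bool value = true; };
template <typename T, typename U> inline constexpr bool is_same_v = is_same<T, U>::value;

template <bool condition, typename T = void> struct enable_if {};
template <typename T>                        struct enable_if<true, T> { using type = T; };

// Declared but never defined: only for use inside decltype or sizeof.
template <typename T> T&& declval() noexcept;

// Hypothetical check: does T have a hash() callable on a const object?
template <typename T, typename = void>
struct has_hash { static constexpr bool value = false; };

template <typename T>
struct has_hash<T, decltype((void)declval<const T&>().hash())>
{ static constexpr bool value = true; };

// Hypothetical function that only exists when T is int.
template <typename T>
typename enable_if<is_same_v<T, int>, int>::type twice(T value)
{
    return value * 2;
}

struct withHash    { unsigned hash() const { return 42u; } };
struct withoutHash { int value; };

static_assert(is_same_v<int, int>, "same type");
static_assert(!is_same_v<int, float>, "different types");
static_assert(has_hash<withHash>::value, "hash() found");
static_assert(!has_hash<withoutHash>::value, "no hash()");
} // namespace project
```

The static_asserts double up as the check that these cut-down versions are doing what is wanted.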

Naming Conventions

size_t  n;          // For bytes in a buffer
int     i, j;       // For loops and coordinates only
size_t  index;      // For array index - don't use 'i' that is for an integer.
size_t  length;     // For char* only
size_t  size;       // For number of items in classes
size_t  capacity;   // For current max number of items
type    rhs;        // For operators overloads with lhs and rhs

Use nullptr over NULL to get the type information for the compiler.

Some extra ideas.

Using the type as a unit of measurement at the end. Put the adjective at the front.

This leads to ‘new’ and ‘old’ etc. at the front, the name ‘Count’ (not ‘n’ due to capitalisation) and the unit ‘bytes’/‘items’ as the suffix. This gives a nice unit-maths check: the line below reads as (count of items) * (bytes per item), and the items cancel to correctly leave a count of bytes.

    u32 newCountBytes = newCountItem * itemBytes;

Initialising

The named/designated initialisers are nice and were available in C. However, they are only available in C++ from C++20, and while it is nice to have the two big benefits of context when reading and zero initialising unlisted items, it can also be said that they are not really needed.

If you are initialising a struct then I prefer the initializer-list approach. As the braces are required for a struct anyway, they can be used for all variable initialisation. This does require filling in the items in order up to the last one needed, but from there the rest can still rely on the automatic zeroing of the remaining values. When used with normal types it may look odd as the = has disappeared. But this makes more sense from the point of view that this is constructing the item, rather than static data being assigned to the variable. While it will be the same assembler, it can feel more efficient when thinking about what is going on.

The assignment case is still needed when the value to be initialised to comes from a function.

Note that the details of the structure members should be very easy to get to via the IDE. Then you can have the structure showing in another window while reviewing the code.

    int               i{10};
    float             x{0.123f};
    point2d           point{0.1f, -0.2f};   // point2d: a hypothetical two-float struct.
    HANDLE            handle{INVALID_HANDLE_VALUE};
    HIDD_ATTRIBUTES   attributes{sizeof(HIDD_ATTRIBUTES)};
    // Can fully fill it out like this or can rely on the zero initialisation.
    //HIDD_ATTRIBUTES attributes{sizeof(HIDD_ATTRIBUTES), 0x0, 0x0, 0x0};
    // Normal assignment used when calling a function or more complex code:
    u8  nLog2 = (u8)_ceilLog2(newCapacity);

The HID example above would look like this with designated initialisers:

    HIDD_ATTRIBUTES   attributes{ .Size = sizeof(HIDD_ATTRIBUTES) };
    //HIDD_ATTRIBUTES attributes{
    //                                 .Size = sizeof(HIDD_ATTRIBUTES),
    //                                 .VendorID = 0x0,
    //                                 .ProductID = 0x0,
    //                                 .VersionNumber = 0x0
    //                            };
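
The zeroing of unlisted members can be checked with a small sketch. The struct here is a made-up stand-in with the same layout idea as HIDD_ATTRIBUTES, so that it builds without the Windows headers, and it uses the plain brace form rather than C++20 designated initialisers.

```cpp
#include <stdint.h>

// Hypothetical stand-in for HIDD_ATTRIBUTES.
struct deviceAttributes
{
    uint32_t size;
    uint16_t vendorId;
    uint16_t productId;
    uint16_t versionNumber;
};

bool restIsZeroed()
{
    // Only the first member is listed; the others are initialised to zero.
    deviceAttributes attributes{sizeof(deviceAttributes)};
    return attributes.size == sizeof(deviceAttributes)
        && attributes.vendorId == 0
        && attributes.productId == 0
        && attributes.versionNumber == 0;
}
```

Note this only holds when the braces are present. A plain 'deviceAttributes attributes;' inside a function leaves every member uninitialised.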

Don’t pass the value directly to a function, always make a variable and then pass that value to the function.

    HANDLE            handle{INVALID_HANDLE_VALUE};
    HIDD_ATTRIBUTES   attributes{sizeof(HIDD_ATTRIBUTES)};
    
    HidD_GetAttributes(handle, &attributes);
    // Don't do this:
    // HidD_GetAttributes(handle, HIDD_ATTRIBUTES{sizeof(HIDD_ATTRIBUTES)});
    // Nor this popular approach using type inference and designated initialisers.
    // HidD_GetAttributes(handle, { .Size = sizeof(HIDD_ATTRIBUTES) });

Use ‘no-spaces’ on the initialisers and ‘spaces’ on functions so they are clearly different.

public: // interface
    ~mallocPointers() { _setCapacity(0); };
private: // members
    int               _i{10};

Namespace

Add only one level of namespace for the whole project. Name collisions are not a concern, as all the code is locally editable and can simply be renamed if one happens.

Nameless namespaces to allow for local ‘static’ classes are a nice idea, but they only help the compiler a bit. Let’s wait on that idea for a bit first.

This is the one case that does not indent the code. This looks wrong and ugly but what else can you do?

To help, always repeat the namespace name in a comment on the closing brace.

#include "project.h"

namespace project
{
int main()
{
    /* Normal program code... */
}
} // namespace project

Enum

Use the C++ enums (enum class). This means that bitwise operations require the operator functions to be written. Also state the underlying type explicitly.

enum class nodeType: uint32_t { walk = 0, wall = 1, start = 2, end = 3, path };
enum class nodeFlag: uint8_t { off = 0x01 << 0, on = 0x01 << 1, bright = 0x01 << 2 };

nodeFlag operator|(nodeFlag lhs, nodeFlag rhs)
{
    return nodeFlag((uint8_t)lhs | (uint8_t)rhs); // Look into using __underlying_type.
}
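
Following up on the __underlying_type note, this is one way the operators might look. The nodeFlagBits alias, the operator& and the hasFlag helper are additions for this sketch, and __underlying_type is a compiler builtin rather than standard C++, assumed here to be available in the compiler being used (GCC, Clang and MSVC all provide one).

```cpp
#include <stdint.h>

enum class nodeFlag: uint8_t { off = 0x01 << 0, on = 0x01 << 1, bright = 0x01 << 2 };

// Assumption: the compiler provides the __underlying_type builtin.
using nodeFlagBits = __underlying_type(nodeFlag);

constexpr nodeFlag operator|(nodeFlag lhs, nodeFlag rhs)
{
    return nodeFlag(nodeFlagBits(lhs) | nodeFlagBits(rhs));
}

constexpr nodeFlag operator&(nodeFlag lhs, nodeFlag rhs)
{
    return nodeFlag(nodeFlagBits(lhs) & nodeFlagBits(rhs));
}

// Hypothetical helper: true when every bit of 'flag' is set in 'value'.
constexpr bool hasFlag(nodeFlag value, nodeFlag flag)
{
    return (value & flag) == flag;
}

static_assert(hasFlag(nodeFlag::on | nodeFlag::bright, nodeFlag::bright), "bright is set");
```

Using the alias means the casts follow the enum if its underlying type is ever changed.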

Alignment

alignas
alignof
sizeof

Align classes and structures to 8 bytes unless they only contain char. This is because all builds will be 64-bit.

If using SSE then align to 16 bytes and make sure the size of the class or struct is a multiple of 16 using padding.
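
A small sketch of these rules, checked at compile time. The two struct names are invented for illustration.

```cpp
// Hypothetical general-purpose struct, aligned to 8 bytes.
struct alignas(8) smallRecord
{
    char    tag;
    int     value;
};

// Hypothetical SIMD-friendly struct, aligned to 16 bytes.
struct alignas(16) vector4
{
    float   x, y, z, w;
};

static_assert(alignof(smallRecord) == 8, "aligned to 8 bytes");
static_assert(sizeof(smallRecord) % 8 == 0, "size is a multiple of 8");
static_assert(alignof(vector4) == 16, "aligned to 16 bytes");
static_assert(sizeof(vector4) % 16 == 0, "size is a multiple of 16");
```

With alignas the compiler adds any tail padding itself, as sizeof is always a multiple of alignof, so the size checks are there as documentation as much as protection.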

Copyright

Add these to every file to claim ownership. Most of the code will be written from scratch so it should all be mine, and where code or even just the idea comes from elsewhere, it will be mentioned directly in the code and a ‘Portions Copyright…’ is added.

// Copyright (c) 2023 Mark Butler
// Portions Copyright Ginger Bill
// See https://www.gingerbill.org/article/2015/08/19/defer-in-cpp
// See the end of file for the licence.

Then at the bottom of the file add the extra details and select either the MIT or commercial licence.

// About:
// This file does something interesting.
// 
// Copyright:
// Copyright (c) 2023 Mark Butler
// 
// MIT License:
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
// 
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
// 
// The software is provided "as is", without warranty of any kind, express or
// implied, including but not limited to the warranties of merchantability,
// fitness for a particular purpose and noninfringement. In no event shall the
// authors or copyright holders be liable for any claim, damages or other
// liability, whether in an action of contract, tort or otherwise, arising from,
// out of or in connection with the software or the use or other dealings in the
// software.
// 
// Commercial License:
// All rights reserved. No part of this software, either material or conceptual 
// may be copied or distributed, transmitted, transcribed, stored in a 
// retrieval system or translated into any human or computer language in any
// form by any means, electronic, mechanical, manual or otherwise, or
// disclosed to third parties without the express written permission of 
// Mark Butler (mark@timeistheanswer.com)
// 
// The software is provided "as is", without warranty of any kind, express or
// implied, including but not limited to the warranties of merchantability,
// fitness for a particular purpose and noninfringement. In no event shall the
// authors or copyright holders be liable for any claim, damages or other
// liability, whether in an action of contract, tort or otherwise, arising from,
// out of or in connection with the software or the use or other dealings in the
// software.

Project Include Files

I am generating a set of one-file includes for various core functionality, e.g. Vectors, Intrinsics, Maths, Lists, HashTable, Allocator etc. These are to be used in any project I work on and I have been temporarily using:

#include    "../include/maths.h"

Which is just terrible. The reason is that I am using the Visual Studio IDE and I don’t really want to have any changes from the default configuration of the IDE. This is clearly wrong and it needs to go back to the traditional rule that local includes must be without any path. The build process is what needs to take care of this.

As part of this, keep the .cpp and the .h file next to each other.

Comments from Various Sources

I’m a contract videogame programmer, so the answer is: I follow whatever the company guidelines call for. Typically, videogame code doesn’t use RTTI or exceptions, and follow CamelCase naming rules, m_ member variables, s_ static variables, and tab=4s. It’s remarkably consistent across the industry, for some reason. BoarsLair

I learned about how Plan 9 C code had non-traditional scheme of #include files where they don’t put #ifdef wrappers in each .h file to allow multiple inclusion and .h files don’t include other .h files. As a result .c files have to include every .h file they need and in the right order. It’s a bit of a pain and no other modern C++ codebase I know of maintains such discipline. But it’s my project so I did it and I keep doing it. It prevents circular dependencies between .h files and doesn’t inflate C++ build times because of careless including the same files over and over again. Chris/Krzysztof Kowalczyk