Three C++ Tricks That Make Everyday Coding Easier (and Debugging Harder)

27 November 2024

About the author

Dan Cazarín is the author of KFR, a performance-focused C++ DSP library adopted by companies and universities worldwide since 2016 (https://kfr.dev), and the creator of Brisk, a recently released cross-platform C++ GUI framework with reactive features (https://github.com/brisklib/brisk).

The “New” `new` Operator

The new operator has been a part of C++ since its early days. It integrates well with object-oriented programming, allowing constructors to be called on newly allocated (or provided) memory.

However, the new operator returns a raw pointer, and working with raw pointers is no longer considered good practice in modern C++, especially with the widespread adoption of the RAII (Resource Acquisition Is Initialization) idiom.

Today, the new operator is typically used either to pass a raw pointer to a smart pointer class constructor (though this is discouraged since std::make_shared and std::make_unique are safer alternatives) or is hidden beneath layers of code, wrappers, and libraries, rarely seen directly by developers.

// Bad
auto p = new Object(42);
someFunction(new Object(42));
// Not so good
std::unique_ptr<std::string> u(new std::string("C++"));

What Does the Standard Say?

C++11 introduced std::make_shared (with C++20 adding array support), and C++14 added std::make_unique, so now the code should look like this:

// Much better
auto p = std::make_shared<Object>(42);
someFunction(std::make_shared<Object>(42));
std::unique_ptr<std::string> u = std::make_unique<std::string>("C++");

What Do Other Languages Do?

In C++/CLI (Common Language Infrastructure for .NET), the gcnew operator creates a CLR-managed pointer (denoted as T^ instead of T*), though this is a language extension implemented in Microsoft’s proprietary compiler.

What Can We Do?

Consider the following code:

auto p = new std::string("abc");
static_assert(std::is_same_v<decltype(p), std::string*>);
auto s = shnew std::string("abc");
static_assert(std::is_same_v<decltype(s), std::shared_ptr<std::string>>);

Here, shnew is our custom “operator” that returns a std::shared_ptr created from the constructed object, just as new returns a plain pointer.

How?

This trick is simple and (relatively) safe, unless you consider any macro unsafe.

struct ShNew {
    template <typename T>
    std::shared_ptr<T> operator*(T* rawPtr) const { return std::shared_ptr<T>(rawPtr); }
};
constexpr inline ShNew shNew{};
#define shnew ::shNew * new
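
A minimal self-contained sketch of the macro in use may help here. Point and makePoint are hypothetical names introduced only for this illustration; the rest repeats the definitions above.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct ShNew {
    // Takes ownership of the raw pointer produced by the `new` expression.
    template <typename T>
    std::shared_ptr<T> operator*(T* rawPtr) const { return std::shared_ptr<T>(rawPtr); }
};
constexpr inline ShNew shNew{};
#define shnew ::shNew * new

// Hypothetical type, used only for this illustration.
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

std::shared_ptr<Point> makePoint(int x, int y) {
    // Expands to: ::shNew * new Point(x, y)
    auto p = shnew Point(x, y);
    static_assert(std::is_same_v<decltype(p), std::shared_ptr<Point>>);
    assert(p.use_count() == 1); // the shared_ptr is the sole owner
    return p;
}
```

Because a new-expression binds tighter than binary *, the whole new Point(x, y) becomes the right-hand operand of ShNew::operator*.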

Drawbacks

shnew T{} is equivalent to std::shared_ptr<T>(new T{}), which allocates the control block separately, unlike std::make_shared, leading to less efficient memory management. However, with std::make_shared and combined allocations, deallocation is delayed if weak pointers outlive strong ones.
Certain language constructs may break the shnew macro due to operator precedence, requiring parentheses. For instance, a C-style cast to a shared_ptr (C-style casts to a C++ class are bad style anyway) and dereferencing (e.g., int answer = *shnew int{42}) fail to compile. However, immediately dereferencing the result instead of storing or passing the shared_ptr seems pointless anyway.
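
The precedence issue can be sketched as follows; answerViaShnew is a hypothetical helper name, and the ShNew definitions are repeated so the snippet stands alone.

```cpp
#include <cassert>
#include <memory>

struct ShNew {
    template <typename T>
    std::shared_ptr<T> operator*(T* rawPtr) const { return std::shared_ptr<T>(rawPtr); }
};
constexpr inline ShNew shNew{};
#define shnew ::shNew * new

int answerViaShnew() {
    // int bad = *shnew int{42};  // would not compile: parsed as (*::shNew) * new int{42}
    auto p = shnew int{42};       // storing the pointer first needs no parentheses
    assert(*p == 42);
    return *(shnew int{42});      // parenthesized, the expansion is a single operand
}
```

Where the extra control-block allocation matters, std::make_shared remains the better choice; the macro is mostly a readability aid.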

Deferring Things

When it comes to releasing resources, C++ recommends using RAII (Resource Acquisition Is Initialization). But writing a separate class for every single resource can get tedious and lead to a lot of boilerplate code. A quick way to simplify things is to use built-in objects like std::unique_ptr<>, especially if the resource is a pointer to something that needs freeing. Here’s a simple example to show the logic behind RAII:

// Resource external to our code:
struct SomeResource;
SomeResource* createResource();
void destroyResource(SomeResource*);

struct ResourceDeleter {
    void operator()(SomeResource* r) { destroyResource(r); }
};

void fun()
{
    std::unique_ptr<SomeResource, ResourceDeleter> resource(createResource());

    // At scope exit, ResourceDeleter::operator() will be called, freeing the resource
}

This may be very helpful, but the downside is that the resource release logic is pretty far from the initialization.
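
To see the scope-exit behavior in a form that can be compiled and run, here is a sketch with stand-in definitions. The body of SomeResource, the liveResources counter, and liveDuringScope are assumptions made for this illustration, not part of any real API.

```cpp
#include <cassert>
#include <memory>

// Stand-in for an external resource; these definitions are hypothetical.
struct SomeResource { int value = 0; };
inline int liveResources = 0; // number of resources created but not yet destroyed

SomeResource* createResource() { ++liveResources; return new SomeResource{}; }
void destroyResource(SomeResource* r) { --liveResources; delete r; }

struct ResourceDeleter {
    void operator()(SomeResource* r) const { destroyResource(r); }
};
using ResourcePtr = std::unique_ptr<SomeResource, ResourceDeleter>;

int liveDuringScope() {
    ResourcePtr resource(createResource());
    assert(resource != nullptr);
    resource->value = 1;
    // The return value is computed first; the deleter runs afterwards, at scope exit.
    return liveResources;
}
```

Note how ResourceDeleter has to be declared well before the function that uses it, which is exactly the distance between acquisition and release described above.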

Other languages handle this differently. A great example is Go, which has the defer keyword to call arbitrary code at function exit. Imagine if C++ had something like Go’s defer:

FILE* f = fopen("data.bin", "rb"); // hypothetical file name
if (!f)
    return;
defer {
    fclose(f);
}
uint8_t buf[1];
fread(buf, 1, 1, f);
// other code working with f

Looks neat, right? Unfortunately, it’s unlikely that C++ will adopt this anytime soon. So, what can we do right now? Add one little semicolon:

FILE* f = fopen("data.bin", "rb"); // hypothetical file name
if (!f)
    return;
defer {
    fclose(f);
};
uint8_t buf[1];
fread(buf, 1, 1, f);
// other code working with f

Now it’s valid C++ — as long as defer is a macro that does some magic under the hood:

template <typename Fn>
struct DeferFn {
    template <typename Fn_>
    DeferFn(Fn_&& fn) : fn(std::forward<Fn_>(fn)) {}
    ~DeferFn() { fn(); }
    Fn fn;
};
template <typename Fn>
DeferFn(Fn&&) -> DeferFn<std::decay_t<Fn>>;

#define CONCAT2(a, b) a##b
#define CONCAT(a, b) CONCAT2(a, b)
#define defer ::DeferFn CONCAT(defer_, __LINE__) = [&]()

The magic is creating a stack object of a utility type that wraps a lambda capturing values by reference, and then calls it in the destructor. To make this work smoothly, we need a deduction guide so the compiler can deduce the lambda type automatically.

A small but important detail: instead of explicitly calling a constructor, we initialize the utility object with = (copy-initialization from the lambda). Why? If we used a constructor call, we’d need braces or parentheses around the entire lambda, which would ruin the clean look of the approach.

The only slightly annoying part is the extra semicolon at the end of the lambda, but it’s a small price to pay for elegance.

Comments

__LINE__ lets us pick a unique name for a utility object, so we don’t run into a compiler error if we have two defer statements in the same scope.
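
A short sketch of two defer statements sharing one scope; deferOrder is a hypothetical helper that records the order in which the blocks run. Since the generated names depend on __LINE__, two defer statements written on the same source line would still collide.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

template <typename Fn>
struct DeferFn {
    template <typename Fn_>
    DeferFn(Fn_&& fn) : fn(std::forward<Fn_>(fn)) {}
    ~DeferFn() { fn(); }
    Fn fn;
};
template <typename Fn>
DeferFn(Fn&&) -> DeferFn<std::decay_t<Fn>>;

#define CONCAT2(a, b) a##b
#define CONCAT(a, b) CONCAT2(a, b)
#define defer ::DeferFn CONCAT(defer_, __LINE__) = [&]()

std::string deferOrder() {
    std::string log;
    {
        defer { log += "first-registered;"; };
        defer { log += "second-registered;"; };
        log += "body;";
        assert(log == "body;"); // nothing deferred has run yet
    } // destructors run here, in reverse order of declaration
    return log;
}
```

As with Go's defer, the most recently registered block runs first, because local objects are destroyed in reverse order of construction. Unlike Go, the blocks run at the end of the enclosing scope rather than at function exit.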

Note that it makes sense to name it DEFER in all caps, as it makes it clearer that it’s a macro.

UB-Driven Properties for C++

The previous two tricks rely on preprocessor macros. In modern code, this approach might be considered poor style or even prohibited by coding standards. The next trick goes even further by leveraging Undefined Behavior. Spoiler: it is (almost) guaranteed to work in current and future compilers and has been extensively tested in production code.

Properties in Other Languages

Languages that support properties implement them in various ways, but the underlying principle is common: they introduce a variable-like construct that invokes a getter and setter for every read and write access. The value may be computed on-the-fly or read from a backing field, and the setter may notify listeners about changes, etc.

The syntax for properties often resembles field access. Quick example:

auto w = shnew Widget{}; // Trick #1
w->id = "primary-button"; // setter updates style
w->borderWidth = 2; // setter triggers widget redraw
float op = w->opacity; // get the value

Simplest Implementation

Let’s start with a short class that mimics the properties found in other languages.

We need two things: an assignment operator to be able to write prop = value and a conversion operator to get the value back with value = prop. Also, let’s define old-fashioned get and set functions to be on the safe side.

It makes sense to pass the getter and setter as template arguments and keep Property as a simple wrapper around a class pointer.

template <typename T, typename Class, auto getter, auto setter>
struct Property {
    Class* class_;

    T get() const noexcept {
        return (class_->*getter)();
    }
    void set(T value) noexcept
        requires (setter != nullptr)
    {
        (class_->*setter)(std::move(value));
    }

    operator T() const noexcept {
        return get();
    }
    void operator=(T value) noexcept
        requires (setter != nullptr) {
        set(std::move(value));
    }
    Property& operator=(const Property&) = delete;
};

class Widget {
private:
    std::string getId() const noexcept;
    void setId(const std::string& value) noexcept;
    float getOpacity() const noexcept;
    void setOpacity(float value) noexcept;
public:
    Property<std::string, Widget, &Widget::getId, &Widget::setId> id{ this };
    Property<float, Widget, &Widget::getOpacity, &Widget::setOpacity> opacity{ this };
};

void test() {
    std::unique_ptr<Widget> widget{ new Widget{} };
    widget->id = "i1"; // operator=(T) called
    if (widget->opacity == 1) { // operator T() called
        std::print("widget is opaque\n");
    }
}

Drawbacks

Memory waste: Each such property occupies sizeof(void*) bytes, used solely to store a copy of this. What happens when the number of properties becomes large? Properties are especially useful for widget systems, where a widget can have hundreds of properties. For example, on a 64-bit platform, 128 properties would consume 1 KiB (128 × 8 bytes) in every instance of a widget, potentially leading to cache misses and other performance issues.
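
The overhead can be checked at compile time. In this sketch, PropertyStub, PlainWidget, WidgetWithProperties, and propertyOverhead are hypothetical names; the stub keeps only the back-pointer, which is the part that costs memory. It assumes object pointers have the size of void*, which holds on mainstream platforms.

```cpp
#include <cstddef>

// Trimmed-down stand-in for Property: only the back-pointer matters here.
template <typename Class>
struct PropertyStub {
    Class* class_;
};

// The same data without any property wrappers.
struct PlainWidget {
    float opacity_;
    float borderWidth_;
};

// Every property adds one pointer to each instance.
struct WidgetWithProperties {
    float opacity_;
    float borderWidth_;
    PropertyStub<WidgetWithProperties> opacity{ this };
    PropertyStub<WidgetWithProperties> borderWidth{ this };
};

// Bytes spent on back-pointers for `count` properties in one instance.
constexpr std::size_t propertyOverhead(std::size_t count) {
    return count * sizeof(void*);
}

static_assert(sizeof(PropertyStub<WidgetWithProperties>) == sizeof(void*));
static_assert(sizeof(WidgetWithProperties) >= sizeof(PlainWidget) + propertyOverhead(2));
```

With 8-byte pointers, propertyOverhead(128) evaluates to 1024 bytes, the 1 KiB per instance mentioned above.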

UB to the Rescue

To be honest, this type of “UB” occurs more frequently than you might think: accessing a non-active union member. This behavior has been around for a long time in legacy codebases, and compilers are generally designed to tolerate it.

In C, using a union to perform a bit-cast between types is a common practice. Since most C++ compilers share their codebase with C compilers, this can often lead to what is considered a “defined undefined behavior”.

To clarify, we’re not doing anything particularly odd. We’re simply writing to one member of type Class* and then reading from another member of the same type (Class*) that maps to the same memory location. This technique has been extensively tested in a commercial project and works reliably with all major C++ compilers.

One note: The property must not have any user-defined constructors, and its sole member must be public.

Modified example:

class Widget {
private:
    std::string getId() const noexcept;
    void setId(const std::string& value) noexcept;
    float getOpacity() const noexcept;
    void setOpacity(float value) noexcept;
public:
    Widget() {
        self = this;
    }
    union {
        Widget* self;
        Property<std::string, Widget, &Widget::getId, &Widget::setId> id;
        Property<float, Widget, &Widget::getOpacity, &Widget::setOpacity> opacity;
    };
    
    // Just to be sure
    static_assert(sizeof(Property<std::string, Widget, &Widget::getId, &Widget::setId>) == sizeof(Widget*));
    static_assert(alignof(Property<std::string, Widget, &Widget::getId, &Widget::setId>) == alignof(Widget*));
};

Notes

Some notes and edge cases regarding this technique:

Passing a property to a function with deduced parameters will deduce the property type instead of the value type. For example, writing std::min(w->opacity, 0.5f) will result in a compiler error due to parameter incompatibility. In such cases, using w->opacity.get() provides a quick fix.
It is no longer possible to keep the copy constructor trivial, as it must initialize the self member variable to the address of the new this. Deleting the copy constructor also resolves this issue if copying the class is unnecessary. This, of course, applies to any class containing pointers to itself.
The self member variable must be placed in every such union block and initialized before the first use of properties. A derived class may use its own property block but must ensure that its union is properly initialized.
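
The first two notes can be sketched together. This is an illustration under stated assumptions, not the exact production code: the opacity_ backing field, originalOpacityAfterEditingCopy, and clampOpacity are hypothetical, the requires clauses are omitted for brevity, and the snippet deliberately relies on the same non-active union member read described above, which is formally undefined behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

template <typename T, typename Class, auto getter, auto setter>
struct Property {
    Class* class_;
    T get() const noexcept { return (class_->*getter)(); }
    void set(T value) noexcept { (class_->*setter)(std::move(value)); }
    operator T() const noexcept { return get(); }
    void operator=(T value) noexcept { set(std::move(value)); }
    Property& operator=(const Property&) = delete;
};

class Widget {
private:
    float opacity_ = 1.0f; // hypothetical backing field
    float getOpacity() const noexcept { return opacity_; }
    void setOpacity(float value) noexcept { opacity_ = value; }
public:
    Widget() { self = this; }
    // A defaulted copy would copy `self` from `other`; re-point it at the new object.
    Widget(const Widget& other) : opacity_(other.opacity_) { self = this; }
    Widget& operator=(const Widget& other) { opacity_ = other.opacity_; return *this; }

    union {
        Widget* self;
        // Reading `opacity` after writing `self` is the formally-UB part.
        Property<float, Widget, &Widget::getOpacity, &Widget::setOpacity> opacity;
    };
};

float originalOpacityAfterEditingCopy() {
    Widget original;
    Widget copy = original;
    assert(copy.self == &copy); // the copy points at itself, not at `original`
    copy.opacity = 0.25f;       // must reach `copy` only
    return original.opacity.get();
}

float clampOpacity(const Widget& w, float limit) {
    // std::min(w.opacity, limit) would not compile: the arguments deduce different types.
    return std::min(w.opacity.get(), limit);
}
```

If the copy constructor were left defaulted, copy.self would still hold the address of original, and the write through copy.opacity would silently modify the wrong widget.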
