UserData class

(1202 words)

Many libraries provide their users with a way to add user-defined content to the library’s objects. This is especially true for libraries in the graphics domain. Examples:

You get the point. Sometimes libraries help you add context to their objects, but they don’t know your classes (nor should they). They are stuck with suboptimal solutions:

There are many variations of these solutions, but I have yet to encounter one I like. So I’d like to propose one.

UserData class

Let’s start with an API:

class UserData {
public:
    template <class T>
    void Set(const T& t);

    template <class T>
    T& Get() const;

    template <class T>
    bool Has() const;

    template <class T>
    void Clear();
};

This allows users to store and retrieve their very own classes (plural!), in a type-safe manner. It is also not a templated class (though it obviously has templated methods), so it has a fixed size no matter what you store in it. I think this is a very neat API. In my particular scenario I don’t care about thread-safety, which simplifies things a bit.

Now take a moment to think about how you would implement this. It is non-trivial. If you don’t have a compiler at hand, try Ideone.com or cpp.sh. Come back with an answer!

You’re back? Awesome. Let’s explore a few possible solutions for this interesting problem.

static-based solution

This is the first solution I came up with, and it is far from perfect. It is interesting nontheless, and so I decided to put it here despite how embarrassing it is.

The idea here it to have a static template member-function (called it Multitool()). This function has a static unordered_map<int, T> (defined inside the function) which is used as storage for Ts. Each UserData instance has a unique int id (which is the key in the above map). This function takes care of storing and retrieving Ts, and is called by the templated-public API methods.

This is one ugly solution:

Here’s a complete implementation:

class UserData final {
public:
    UserData();
    ~UserData();

    template <class T>
    bool Has() const {
        void* vt = nullptr;
        Multitool<T>(MultitoolCommand::Get, &vt, m_UniqueId);
        T* t = (T*)vt;
        return (t != nullptr);
    }

    template <class T>
    T& Get() const {
        void* vt = nullptr;
        Multitool<T>(MultitoolCommand::Get, &vt, m_UniqueId);
        T* t = (T*)vt;
        assert(t != nullptr);

        m_Clear.insert(Multitool<T>);

        return *t;
    }

    template <class T>
    void Set(const T& t) {
        T* tmp = &t;
        void* vt = (void*)tmp;
        Multitool<T>(MultitoolCommand::Set, &vt, m_UniqueId);

        m_Clear.insert(Multitool<T>);
    }

    template <class T>
    void Clear() {
        void* tmp = nullptr;
        Multitool<T>(MultitoolCommand::Clear, &tmp, m_UniqueId);

        m_Clear.erase(Multitool<T>);
    }

private:
    enum class MultitoolCommand {
        Get,
        Set,
        Clear,
    };

    // This function is weird because it is the single point of
    // storage for Ts, and so has to satisfy all public operations
    template <class T>
    static void Multitool(MultitoolCommand command, void** vt, int id) {
        static std::unordered_map<int, T> store;
        T*& t = *((T**)(vt));

        switch (command) {
        case MultitoolCommand::Get: {
                auto it = store.find(id);
                if (it == store.end()) {
                    t = nullptr;
                } else {
                    t = &(it->second);
                }
            }
            break;
        case MultitoolCommand::Set: {
                auto it = store.find(id);
                if (it == store.end()) {
                    store.emplace(id, *t);
                } else {
                    it->second = *t;
                }
            }
            break;
        case MultitoolCommand::Clear:
            store.erase(id);
            break;
        }
    }

    static int m_NextUniqueId;
    int const m_UniqueId;

    typedef void (*TClearFunc)(MultitoolCommand, void**, int);
    mutable std::set<TClearFunc> m_Clear;
};

// .cpp file:
int UserData::m_NextUniqueId = 0;

UserData::UserData()
:    m_UniqueId(m_NextUniqueId++) {
}

UserData::~UserData() {
    void* tmp = nullptr;
    for (auto it : m_Clear) {
        it(MultitoolCommand::Clear, &tmp, m_UniqueId);
    }
}

type_index-based solution

In >= C++11 we can use type_index to store types as keys in associative containers (in this case an unordered_map).

We need some small trickery to call T’s correct destructor. We also require RTTI support (although we don’t use the ‘runtime’ part of it - we just need the tables to exist in the binary).

Here’s the complete solution:

class UserData final {
public:
    UserData() = default;
    ~UserData() = default;

    template <class T>
    bool Has() const {
        auto const& it = m_Items.find(typeid(T));
        return (it != m_Items.end() && it->second.get() != nullptr);
    }

    template <class T>
    T& Get() const {
        auto const& it = m_Items.find(typeid(T));
        assert(it != m_Items.end());
        assert(it->second.get() != nullptr);
        return static_cast<Wrapper<T>*>(it->second.get())->t;
    }

    // It may have been better to take a pointer to T instead. Up to you.
    template <class T>
    void Set(const T& t) {
        m_Items[typeid(T)] = std::make_unique<Wrapper<T>>(t);
    }

    template <class T>
    void Clear() {
        m_Items.erase(typeid(T));
    }

private:
    class EmptyBase {
    public:
        virtual ~EmptyBase() = default;
    };

    template <class T>
    class Wrapper : public EmptyBase {
    public:
        Wrapper() = default;
        Wrapper(T t_) : t(t_) {}

        T t;
    };

    std::unordered_map<std::type_index, std::unique_ptr<EmptyBase>> m_Items;
};

Custom type-id solution

I have not yet encountered a team who is asking the compiler to not generate RTTI tables. I’m sure that there are such teams out there. But even if you have RTTI - it’s much slower than, say, a simple integer.

But how can we assign a unique integer per-type? Using a simple templates trick:

size_t type_id = 0;

template <typename T>
size_t GetTypeId() {
    static size_t t_id = type_id++; // use std::atomic for thread safety
    return t_id;
}

This even allows us to have a contiguous std::vector (as long as we never shrink it). With a solution similar to EmptyBase and Wrapper above we can get better performance, better CPU caching and less memory usage. Unfortunately, we still have to use a vector of pointers rather than have the objects directly in it, as we don’t know their sizes up front.

Complete solution:

class UserData {
public:
    template <class T>
    void Set(const T& t) {
        auto id = GetTypeId<T>();
        assert(m_Items.size() >= id);
        if (id <= m_Items.size()) {
            m_Items.resize(id+1);
        }
        m_Items[id] = std::make_unique<Wrapper<T>>(t);
    }

    template <class T>
    T& Get() const {
        auto id = GetTypeId<T>();
        assert(m_Items.size() > id);
        auto const& it = m_Items[id];
        assert(it);
        return static_cast<Wrapper<T>&>(*it.get()).t;
    }

    template <class T>
    bool Has() const {
        auto id = GetTypeId<T>();
        return m_Items.size() > id && m_Items[id];
    }

    template <class T>
    void Clear() {
        auto id = GetTypeId<T>();
        assert(m_Items.size() > id);
        m_Items[id].reset();
    }

private:
    class EmptyBase {
    public:
        virtual ~EmptyBase() = default;
    };

    template <class T>
    class Wrapper : public EmptyBase {
    public:
        Wrapper() = default;
        Wrapper(T t_) : t(t_) {}

        T t;
    };

    std::vector<std::unique_ptr<EmptyBase>> m_Items;
};

If you have better ideas I am happy to hear! Let me know what you think in the comments below.


Comments