Szczepan Szpilczyński

Published
April 25, 2024

How about Finite State Machines & Networking in C++?

Here’s a fun fact – most of the things we use daily are basically state machines – from the toaster in your kitchen to the traffic lights you pass every day on your way home. Why would I even start thinking about this? Well, let’s go back in time a few months.

Story time

Before we proceed, let's make sure we have a clear understanding of Finite State Machines (FSM). FSM is an abstract machine that can be in one of a finite number of states at any given time and can transition from one state to another in response to certain inputs. In this article, I will be introducing you to the FSM topic using C++ and ASIO. However, if you are not familiar with ASIO, there is no need to worry.
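To make the definition concrete, here is a minimal sketch of the traffic light from the intro as an FSM. It is not code from our networking layer; the names Light, Event, transition and run are made up for illustration. The whole machine is a finite set of states plus a function that maps (current state, input) to the next state.

```cpp
#include <initializer_list>

// The finite set of states (hypothetical example).
enum class Light { Red, Green, Yellow };

// The inputs the machine reacts to (hypothetical example).
enum class Event { TimerExpired, Emergency };

// Transition function: (current state, input) -> next state.
constexpr Light transition(Light current, Event event) {
  if (event == Event::Emergency) return Light::Red;  // any state -> Red
  switch (current) {
    case Light::Red:    return Light::Green;
    case Light::Green:  return Light::Yellow;
    case Light::Yellow: return Light::Red;
  }
  return current;  // not reached for valid enum values
}

// Feeds a sequence of events through the machine and returns the final state.
inline Light run(Light start, std::initializer_list<Event> events) {
  Light state = start;
  for (Event e : events) state = transition(state, e);
  return state;
}
```

Calling run(Light::Red, {Event::TimerExpired, Event::TimerExpired}) walks Red, Green, Yellow. An enum and a switch are fine at this size; the rest of the article is about what happens when the machine grows.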

So, with all the explaining out of the way, let me tell you a story, an urban legend of sorts that might seem unrelated at first yet gives valuable insight. Did you know that anyone who attempted to read our old networking code was cursed and could not look at it again for two weeks? This created an uncomfortable situation in which one might need to inspect the code to fix issues but could not do so due to the curse.

Fortunately, the time finally came, a time to set us free from the curse. Everything was in place, from the people who were cursed finally being back in the game to the brave man who ignited the flame of change with his proof of concept. All we needed was someone to finally kick off the long-needed rework of our networking component… we needed a code exorcist. They would take the holy POC and implement it further to bless us all. And as you might guess, I took that responsibility.

As you might already know, we aim to be compatible with PostgreSQL protocol to easily integrate with all the tools that speak Postgres, and we do that with sometimes mixed results. You might also know that we are able to scale into multiple nodes on different machines, and they need to communicate somehow.

After weighing all the pros and cons of which one to implement first, we decided to start with something you can actually connect to without writing your own client. Days flew by as I hammered my keyboard in the basement to produce something that worked. The result was a working starting point that even the PostgreSQL command-line tool could connect to.

There was just one thing that…

Do first, think later

It became pretty apparent that my design, drafted on my lap, was less than acceptable in the long term. Some keen eyes spotted two things.

  • I unconsciously wrote the worst Finite State Machine possible
  • It won't be long until we repeat the same mistakes as before

I know the first point is intriguing, but the second raises an interesting case.

How to (hopefully) prevent your future self from writing bad code

I slightly modified the original code to make the issues more visible.

class StartupState : public IState {
 public:
  StartupState() {}

  StateType onData(asio::const_buffer buffer) override {
    _request_count++;
    
    if(_should_crash) {
      // ...
      return StateType::StartupState;
    }
    
    if(_hit_ssl_request) {
      // ...
      return StateType::StartupState;
    }
  
    // ...
    return StateType::StartupState;
  }

  void onError(std::error_code& ec) override {};

 private:
  Serializer _ser;
  Deserializer _des;
  
  bool _should_crash = false;
  bool _hit_ssl_request = false;
  
  i32 _request_count = 0;
};

It doesn’t look that bad… unless… we start adding more things. During the early stages, it's easy to justify bloating classes and adding logic. However, as we start adding more things, it becomes clear that it doesn't necessarily fit.

Imagine if there were 10 or 20 fields. Still manageable. Now, multiply the number of states by, let’s say, 4. It starts to get out of control.

Another issue related to the design might be seen inside onData. We are creating one mega-state with many branches. Again – currently, there's no issue, but if it continues to grow, you'll soon be lost.

After some discussions, we have decided on two major things:

  • Firstly – no data in states! Now, we will pass a special StateContext struct (like in good old C) and operate on that;
  • Secondly – we will try to avoid handling too many branching situations inside one state. Instead, we will create a new state just for the branch and redirect our machine to that state.

These two things became the backbone of our final Finite State Machine implementation and helped us ensure pretty clean code.

This is what the code above would look like after applying our rules to it:

class StartupState : public IState {
 public:
  StartupState() {}

  StateType onData(StateContext *ctx, asio::const_buffer buffer) override {
    ctx->request_count++;
    
    if(should_crash(buffer)) {
      // No more code here
      return StateType::Crash;
    }
    
    if(hit_ssl_request(buffer)) {
      // No more code here
      return StateType::SSLRequest;
    }
  
    // ...
    return StateType::StartupState;
  }

  void onError(StateContext *ctx, std::error_code& ec) override {};
};

class CrashState : public IState {
public:
  StateType onData(StateContext *ctx, asio::const_buffer buffer) override {
    // Some logic here...
    return StateType::Crash;  // placeholder: a non-void function must return something
  }
  
  void onError(StateContext *ctx, std::error_code& ec) override {};
};

class SSLRequest : public IState {
  // ...
};

Back to the Future

Time passed, and we were able to implement our startup communication and later parts of the Postgres protocol by following this graph (reimagined for this article):

In the meantime, ideas were being thrown at the drawing board on how to improve our machine further. These included removing enums and replacing them with pointers to states, generalizing our code for managing states, using one overloaded function named handle instead of many different ones, and introducing the default state.

But… why replace the enum? Let’s go back to the code I showed you before, and take the onData method. Now, imagine we want to call something like onEnter in the state returned by onData. It should be simple, right?

StateType current_state = StateType::StartupState;
StateType previous_state = StateType::StartupState;

if(current_state == StateType::StartupState) {
  // Assume that onData and other methods are static for a moment
  current_state = StartupState::onData(ctx, buffer);
} else if(current_state == StateType::Crash) {
  current_state = CrashState::onData(ctx, buffer);
} else if(current_state == StateType::SSLRequest) {
  current_state = SSLRequest::onData(ctx, buffer);
}

if(current_state != previous_state) {
  previous_state = current_state;
  
  if(current_state == StateType::StartupState) {
    current_state = StartupState::onEnter(ctx);
  } else if(current_state == StateType::Crash) {
    current_state = CrashState::onEnter(ctx);
  } else if(current_state == StateType::SSLRequest) {
    current_state = SSLRequest::onEnter(ctx);
  }
    
  // Here would be recursive call to the top
}

I believe you understand the issue now. It would bloat with every state added. You could add some helper function to reduce the duplication and stuff, but it’s not worth it in the long run, really. It is also one more thing you need to remember when changing something, and I don’t have a good memory.

The holy coding grail

Now I will hand over the universal, not-yet-AI-indexed code for writing your own state machines in C++. Given enough grit, it can be adapted to other languages like C#, or really anything with classes. We will also make a small server using this stuff.

#include <utility>  // std::forward

template <typename StateBaseType, typename StorageType>
class StateMachine {
  StateBaseType const* _state;

 protected:
  StorageType* _storage;

 public:
  StateMachine(StateBaseType const& init_state, StorageType* storage) : _state(&init_state), _storage(storage) {}

  // Forwards the event to the current state; the state returns the next one.
  template <typename... Args>
  void handle(Args&&... args) {
    StateBaseType const* next = &(_state->handle(*_storage, std::forward<Args>(args)...));
    // onEnter runs only on an actual transition, never when the state stays put.
    if (next != _state) next->onEnter(*_storage);
    _state = next;
  }

  StateBaseType const* getCurrentState() const { return _state; }
};

This guy will handle everything related to switching and calling states, so we can sleep pretty soundly at night without worrying if we forget something.
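Before plugging in the real networking types, it may help to see the template drive something tiny. The sketch below repeats a trimmed copy of the template so it compiles on its own, and feeds it a toy door. DoorContext, Push, Pull, DoorState, Closed and Open are all hypothetical names invented for this example, not part of our code base.

```cpp
#include <string>
#include <utility>

// Trimmed copy of the StateMachine template, repeated so the sketch is self-contained.
template <typename StateBaseType, typename StorageType>
class StateMachine {
  StateBaseType const* _state;
  StorageType* _storage;

 public:
  StateMachine(StateBaseType const& init, StorageType* storage)
      : _state(&init), _storage(storage) {}

  template <typename... Args>
  void handle(Args&&... args) {
    StateBaseType const* next = &_state->handle(*_storage, std::forward<Args>(args)...);
    if (next != _state) next->onEnter(*_storage);
    _state = next;
  }

  StateBaseType const* getCurrentState() const { return _state; }
};

// Hypothetical storage and events for a door.
struct DoorContext { int times_opened = 0; };
struct Push {};
struct Pull {};

// Hypothetical state interface: one handle overload per event type.
struct DoorState {
  virtual ~DoorState() = default;
  virtual std::string name() const = 0;
  virtual void onEnter(DoorContext& ctx) const = 0;
  virtual DoorState const& handle(DoorContext& ctx, Push) const = 0;
  virtual DoorState const& handle(DoorContext& ctx, Pull) const = 0;
};

struct Closed : DoorState {
  static DoorState const& instance() { static Closed s; return s; }
  std::string name() const override { return "Closed"; }
  void onEnter(DoorContext&) const override {}
  DoorState const& handle(DoorContext&, Push) const override;  // defined after Open
  DoorState const& handle(DoorContext&, Pull) const override { return *this; }
};

struct Open : DoorState {
  static DoorState const& instance() { static Open s; return s; }
  std::string name() const override { return "Open"; }
  void onEnter(DoorContext& ctx) const override { ++ctx.times_opened; }
  DoorState const& handle(DoorContext&, Push) const override { return *this; }
  DoorState const& handle(DoorContext&, Pull) const override { return Closed::instance(); }
};

inline DoorState const& Closed::handle(DoorContext&, Push) const { return Open::instance(); }
```

With a StateMachine<DoorState, DoorContext> started in Closed::instance(), handle(Push{}) moves it to Open and bumps times_opened through onEnter, while a second Push leaves everything untouched because the state did not change. Note that the constructor does not call onEnter on the initial state; if you need that, call it yourself.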

Let’s fill StateBaseType and StorageType with something. For this purpose, I have two handy things. Remember StateContext? It’s here now with some of our previous fields, with the exception of the bools, as they are pretty useless for us now.

struct StateContext {
  Serializer _ser;
  Deserializer _des;
  i32 _request_count = 0;
};

We also have the IState interface, which we can feed into our universal FSM class.

class IState {
 public:
  virtual ~IState() = default;

  [[nodiscard]] virtual std::string stateName() const = 0;

  virtual void onEnter(StateContext& ctx) const = 0;

  [[nodiscard]] virtual IState const& handle(StateContext& ctx, std::string msg) const = 0;
  [[nodiscard]] virtual IState const& handle(StateContext& ctx, std::error_code& ec) const = 0;
};

I do not have much here. There are just a few handlers I have stolen from the production code. When copy-pasting, just change some things to make sure no one notices.

***

Now, after putting it all together, we have something like this.

class ConnectionFSM : public StateMachine<IState, StateContext> {
  using base_t = StateMachine<IState, StateContext>;

 public:
  ConnectionFSM(IState const& init_state, StateContext* ctx)
    : base_t(init_state, ctx) {}

  // Thin, named wrappers over the generic handle(): one per event type.
  void processStartupMsgReceivedEvent(std::string msg) {
    handle(std::move(msg));
  }
  void processErrorEvent(std::error_code& ec) {
    handle(ec);
  }
};

I think it looks awesome!

ASIO & FSM

Now, let’s glue it with a small ASIO server and make a state!

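#include <asio.hpp>
#include <array>
#include <memory>

std::string decodeMessage(asio::const_buffer buffer) {
  // Magic here. As a placeholder, treat the raw bytes as text.
  return std::string(static_cast<const char*>(buffer.data()), buffer.size());
}

// Everything a connection needs has to outlive the asynchronous read,
// so it lives in one object that the completion handler keeps alive.
struct Session {
  explicit Session(asio::ip::tcp::socket s)
    : socket(std::move(s)), fsm(WaitForStartupMessage::instance(), &state_ctx) {}

  asio::ip::tcp::socket socket;
  std::array<char, 1024> data{};
  StateContext state_ctx;
  ConnectionFSM fsm;
};

int main() {
  asio::io_context io;
  asio::ip::tcp::acceptor acceptor(io, asio::ip::tcp::endpoint(asio::ip::tcp::v4(), 1337));

  acceptor.async_accept([](std::error_code ec, asio::ip::tcp::socket socket) {
    if (ec) return;

    auto session = std::make_shared<Session>(std::move(socket));
    session->socket.async_read_some(asio::buffer(session->data),
      [session](std::error_code ec, std::size_t length) {
        if (!ec) {
          session->fsm.processStartupMsgReceivedEvent(
            decodeMessage(asio::buffer(session->data.data(), length)));
        } else {
          session->fsm.processErrorEvent(ec);
        }
      });
  });

  io.run();  // nothing happens until the event loop runs
}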

Simple, right? As I said before, you don’t need to know ASIO to follow along: all this means is that we start a TCP server on port 1337 and accept a connection asynchronously. async_read_some starts an asynchronous read and calls our callback when we… read some data (or when something goes wrong).

See how graceful this looks. It consists of just some function calls. Everything else is taken care of behind the scenes.

I end this part with this lil’ snippet:

template <class SpecificState>
class DefaultState : public IState {
public:
  static IState const& instance() {
    static SpecificState i;
    // States must stay data-free: only the hidden vtable pointer is allowed.
    static_assert(sizeof(i) == sizeof(void*));
    return i;
  }

  void onEnter(StateContext& ctx) const override {}

  // By default, ignore the event and stay in the current state.
  IState const& handle(StateContext& ctx, std::string msg) const override {
    return this->instance();
  }
  IState const& handle(StateContext& ctx, std::error_code& ec) const override {
    return this->instance();
  }
};

// Our awesome state!!!!!!!
class WaitForStartupMessage : public DefaultState<WaitForStartupMessage> {
public:
  using DefaultState::handle;  // keep the overloads we do not override visible

  // IState declares stateName() as pure virtual, so every concrete state names itself.
  std::string stateName() const override { return "WaitForStartupMessage"; }

  IState const& handle(StateContext& ctx, std::string msg) const override {
    std::cout << msg << std::endl;  // needs <iostream>

    return this->instance();
  }
};

Very handy construct. Especially when there are a lot of handles. Your fingers will thank you later.
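To show why the fingers are grateful, here is a self-contained sketch of the same idea with three event types. Everything in it (Ctx, Ping, Quit, Fault, Stopped, Running, and this cut-down IState) is hypothetical and only mirrors the shape of the code above. Each concrete state overrides just the handlers it cares about, and the CRTP base supplies the singleton plus "stay where you are" defaults for the rest.

```cpp
#include <string>

struct Ctx { int errors = 0; };  // hypothetical context
struct Ping {};                  // hypothetical events
struct Quit {};
struct Fault {};

struct IState {
  virtual ~IState() = default;
  virtual std::string stateName() const = 0;
  virtual IState const& handle(Ctx& ctx, Ping) const = 0;
  virtual IState const& handle(Ctx& ctx, Quit) const = 0;
  virtual IState const& handle(Ctx& ctx, Fault) const = 0;
};

// CRTP base: provides the singleton and default handlers that keep the current state.
template <class Specific>
struct DefaultState : IState {
  static IState const& instance() {
    static Specific s;
    static_assert(sizeof(s) == sizeof(void*), "states must not carry data");
    return s;
  }
  IState const& handle(Ctx&, Ping) const override { return instance(); }
  IState const& handle(Ctx&, Quit) const override { return instance(); }
  IState const& handle(Ctx& ctx, Fault) const override {
    ++ctx.errors;  // data lives in the context, never in the state
    return instance();
  }
};

// Overrides nothing except its name: every event is ignored.
struct Stopped : DefaultState<Stopped> {
  std::string stateName() const override { return "Stopped"; }
};

// Overrides a single handler; the other two come from DefaultState.
struct Running : DefaultState<Running> {
  using DefaultState<Running>::handle;
  std::string stateName() const override { return "Running"; }
  IState const& handle(Ctx&, Quit) const override { return Stopped::instance(); }
};
```

Starting from Running::instance(), a Ping keeps the machine in Running, a Fault only bumps ctx.errors, and a Quit moves it to Stopped, where further events change nothing. Adding a fourth event type means touching IState and DefaultState once, not every state.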

You are probably wondering about this line: static_assert(sizeof(i) == sizeof(void*));. It is really important, as it ensures that no one steps out of line and tries to expand states with fields. The obvious way to enforce that would be to check whether the size ever goes above 0.

A class that has virtual methods needs a Virtual Method Table (VMT) for dynamic dispatch etc., etc.; read more about this on your own. This means we can’t just assume the size will be 0, because it isn’t true (in C++, even a completely empty class has a non-zero size). We need to account for the invisible pointer to the VMT that every such object carries. Fortunately, it is usually exactly one pointer, so a quick void* helps as always.

Thanks to that little check, we could make our states into singletons for simplicity and still have no blood on our hands.

The number of pointers will change depending on the number of classes inherited and the hierarchy of inheritance itself. Fortunately, this behavior is well-defined in the Itanium C++ ABI, https://itanium-cxx-abi.github.io/cxx-abi/abi.html (literature for cyborgs), which is used by GCC 3 onwards and Clang. Keep in mind this may not fully apply to MSVC, as it uses its own ABI.
If you are looking for a slightly more in-depth explanation of how VTable works, I recommend https://guihao-liang.github.io/2020/05/30/what-is-vtable-in-cpp.
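
If you would rather poke at the sizes yourself, here is a small sketch. The struct names and the isStateless helper are made up for this example, and the exact numbers depend on your compiler and ABI; the comments describe what mainstream 64-bit GCC and Clang typically produce, not a guarantee.

```cpp
#include <cstddef>

struct Empty {};                                      // no data, no virtuals: still not size 0
struct Poly { virtual ~Poly() = default; };           // one hidden vtable pointer
struct PolyWithField : Poly { int counter = 0; };     // vtable pointer + data (+ padding)
struct OtherPoly { virtual ~OtherPoly() = default; };
struct TwoBases : Poly, OtherPoly {};                 // typically one vtable pointer per polymorphic base

// The same check as in DefaultState::instance(), wrapped in a hypothetical helper.
template <class State>
constexpr bool isStateless() {
  return sizeof(State) == sizeof(void*);
}

static_assert(sizeof(Empty) >= 1, "every complete object type has a non-zero size");
static_assert(isStateless<Poly>(), "a single-inheritance polymorphic class is one pointer wide here");
static_assert(!isStateless<PolyWithField>(), "adding a field is caught at compile time");
```

With multiple inheritance, as in TwoBases, the object usually carries more than one vtable pointer, so the single void* comparison would reject it even though it has no fields. For a hierarchy like that, the check would have to be adjusted.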

Epilogue

To this day, we are still improving the design. This is what I like about programming and creating things in general – endless improvement. Of course, it needs to end somewhere, but it’s fun while it lasts. Right now, it is time to move on. We were able to implement everything we had in mind with what I showed you earlier, though I left out some details. I know you’ll figure it out.

Now, the stories about horrible code are just that: tales of old times. Maybe there are more of them, maybe not. We’ll see, as always.

I hope you got something useful from this post. I made it slightly for myself as I wanted to have a complete, easy-to-use state machine code for my personal projects, but in the end, it might also be useful to you. If you want to get in touch, find us on LinkedIn, Reddit, and other platforms. And if you want to see what we're working on, click this link to try Oxla for free.

See ya~
