networking - Acknowledgement reliability using UDP

Wednesday, May 9, 2018

I have a question about UDP. For context, I'm working on a real-time action game.

I've read quite a bit about the differences between UDP and TCP and I feel I understand them quite well, but there's one piece that has never felt correct, and that's reliability, and specifically acknowledgements. I understand that UDP offers no reliability by default (i.e. packets can be dropped or arrive out of order). When some reliability is required, the solution I've seen (which makes sense conceptually) is to use acknowledgements (i.e. the server sends a packet to the client, and when the client receives that message, it sends back an acknowledgement to the server).

What happens when the acknowledgement is dropped?

In the example above (one server sending a packet to one client), the server handles potential packet loss by re-sending packets every frame until acknowledgements are received for those packets. You could still run into issues of bandwidth or out-of-order messages, but purely from a packet-loss perspective, the server is covered.
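A minimal sketch of that resend loop, in Python. The class and method names are made up for illustration, and the actual socket I/O is left out so only the bookkeeping is shown.

```python
class ResendingSender:
    """Keeps every unacknowledged packet and offers it again each frame."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}  # seq -> payload, still waiting for an ack

    def queue(self, payload):
        """Register a new packet and return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = payload
        return seq

    def packets_for_this_frame(self):
        # Everything not yet acknowledged goes out again, oldest first.
        return sorted(self.pending.items())

    def on_ack(self, seq):
        # Acks for unknown or already-acknowledged packets are ignored,
        # so receiving the same ack twice is harmless.
        self.pending.pop(seq, None)


sender = ResendingSender()
a = sender.queue(b"spawn player")
b = sender.queue(b"open door")
frame1 = sender.packets_for_this_frame()  # both packets go out
sender.on_ack(a)                          # only the first ack arrives
frame2 = sender.packets_for_this_frame()  # "open door" is sent again
```

If an ack goes missing, the packet simply stays in `pending` and goes out again on the next frame.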

However, if the client sends an acknowledgement that never arrives, the server would have no choice but to eventually stop sending that message, which could break the game if the information contained in that packet was required. You could take a similar approach to the server (i.e. keep sending acknowledgements until you receive an ack for the ack?), but that approach would have you looping back and forth forever (since you'd need an ack for the ack for the ack and so on).

I feel my basic logic is correct here, which leaves me with two options:

1. Send a single acknowledgement packet and hope for the best.

2. Send a handful of acknowledgement packets (maybe 3-4) and hope for the best, assuming that not all of them will be dropped.

Is there an answer to this problem? Am I fundamentally misunderstanding something? Is there some guarantee UDP provides that I'm not aware of? I feel hesitant to move forward with too much networking code until I feel comfortable that my logic is sound.

Answer

This is a form of the Two Generals Problem, and you're right - no number of retries is enough to perfectly guarantee receipt.
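To put rough numbers on why retries help but never reach certainty: if each copy of a packet is lost independently with probability p, the chance that all n copies are lost is p^n, which shrinks quickly but never hits zero. The sketch below assumes independent loss (real loss tends to come in bursts, so treat the results as optimistic) and uses a made-up 10% loss rate; the function name is only for illustration.

```python
def chance_all_copies_lost(loss_rate, copies):
    """Probability that every one of `copies` independent sends is dropped."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return loss_rate ** copies


# Assumed 10% loss rate, purely as an example.
for copies in (1, 3, 4, 10):
    print(copies, chance_all_copies_lost(0.10, copies))
```

Under those assumptions, three copies already bring the failure chance down to about one in a thousand, but no finite number of copies makes it exactly zero.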

In practice, though, games usually have a time horizon beyond which the information no longer matters, even if it does technically arrive reliably. It's like finding out you had a perfect headshot lined up 2 seconds ago: it's too late for the player to use that information now.

If your packet loss is so high that you can't routinely get the needed info through inside a tight reaction window, then for a realtime game you might be better off kicking the player and trying to find a better match for them elsewhere, rather than continuing to resend the packet to emulate a reliable connection.

Because of this, some game replication systems skip acknowledgement & retries altogether and opt to just spam the newest update as often as they can. If one gets dropped or arrives late, too bad, skip it, pick up the next one and carry on, relying on the prediction & interpolation systems to smooth the gap and minimize hiccups visible to the player.

I suddenly want to start calling this "Simba Replication" for how it disregards problems in the past and tries to live in the present moment. ;)
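On the receiving side, the "newest update wins" approach can be as small as the sketch below. It assumes each update carries a sequence number that only ever increases (wraparound is ignored here), and the class name is hypothetical.

```python
class LatestStateReceiver:
    """Keeps only the newest state; stale or duplicate updates are dropped."""

    def __init__(self):
        self.latest_seq = -1
        self.state = None

    def on_packet(self, seq, state):
        """Apply the update if it is newer than what we hold. Returns True if applied."""
        if seq <= self.latest_seq:
            return False  # arrived late or twice: skip it and carry on
        self.latest_seq = seq
        self.state = state
        return True


receiver = LatestStateReceiver()
receiver.on_packet(1, {"x": 10})
receiver.on_packet(3, {"x": 30})                 # update 2 was lost; nothing waits for it
applied_late = receiver.on_packet(2, {"x": 20})  # shows up late, so it is ignored
```

There are no acks and no retries here; any gap left by a missing update is for prediction and interpolation to smooth over.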

A hybrid solution is to race ahead sending the new update and, since game state updates can often be quite small or compressible, also pack in the last update, and maybe the one before that, just in case the client missed them. That way you don't have to wait a full round trip to find out about a missed update and fix it. Most of the time the client has already seen the older updates, so this sends redundant data, but the latency for correcting a missed message is lower. The client's own packets can include the index of the most recent consecutive update it has seen, so the server only needs to include the updates after that index in its next packet.
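Here is one way the redundant-packing idea might look, as a sketch only. The class names, the cap of three updates per packet, and the choice to skip ahead when the client falls further behind than the cap are all assumptions, not something prescribed above.

```python
from collections import OrderedDict


class RedundantSender:
    """Packs the newest update together with older ones the client has not confirmed."""

    def __init__(self, max_updates_per_packet=3):
        self.max_updates_per_packet = max_updates_per_packet
        self.next_index = 0
        self.unconfirmed = OrderedDict()  # index -> update, oldest first

    def build_packet(self, update):
        index = self.next_index
        self.next_index += 1
        self.unconfirmed[index] = update
        # Cap the redundancy: give up on the oldest entries beyond the limit.
        while len(self.unconfirmed) > self.max_updates_per_packet:
            self.unconfirmed.popitem(last=False)
        return list(self.unconfirmed.items())

    def on_client_report(self, last_seen_index):
        # The client has everything up to this index, so stop re-sending it.
        for index in [i for i in self.unconfirmed if i <= last_seen_index]:
            del self.unconfirmed[index]


class RedundantReceiver:
    def __init__(self):
        self.last_seen_index = -1
        self.applied = []

    def on_packet(self, updates):
        """Apply any updates not seen yet and return the index to report back."""
        for index, update in updates:  # the sender lists them oldest first
            if index > self.last_seen_index:
                # A gap larger than the sender's cap is simply skipped here.
                self.applied.append(update)
                self.last_seen_index = index
        return self.last_seen_index


sender, client = RedundantSender(), RedundantReceiver()
sender.on_client_report(client.on_packet(sender.build_packet("u0")))
lost = sender.build_packet("u1")         # this packet never arrives
recovered = sender.build_packet("u2")    # carries u1 again alongside u2
sender.on_client_report(client.on_packet(recovered))
```

The client recovers `u1` from the very next packet, without a resend request or a round trip in between.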

You could also implement a two-tier system as another type of hybrid, where short-lived state is replicated in an unreliable rapid-fire manner, and long-term state is synchronized reliably, using TCP or your own reliability implementation with a high retry count. This gets more complex to manage though, because you have two messaging systems to maintain, and the two sets of state can be out of sync with one another, adding a whole new class of edge cases.
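A rough sketch of the two-tier split, assuming a hand-rolled reliable tier rather than TCP. The names, the default retry budget of 20, and raising `ConnectionError` when the budget runs out are illustrative choices only.

```python
class TwoTierSender:
    """Short-lived state: newest value only. Long-term state: resent until acked."""

    def __init__(self, max_retries=20):
        self.max_retries = max_retries
        self.latest_transient = None  # unreliable tier: only the newest value matters
        self.reliable = {}            # msg_id -> [payload, times_sent]
        self.next_id = 0

    def set_transient(self, state):
        self.latest_transient = state  # overwrites whatever was there before

    def queue_reliable(self, payload):
        msg_id = self.next_id
        self.next_id += 1
        self.reliable[msg_id] = [payload, 0]
        return msg_id

    def on_ack(self, msg_id):
        self.reliable.pop(msg_id, None)

    def build_frame(self):
        """Return (transient_state, [(msg_id, payload), ...]) for this frame."""
        outgoing = []
        for msg_id, entry in sorted(self.reliable.items()):
            if entry[1] >= self.max_retries:
                # Retry budget exhausted: treat the connection as unusable.
                raise ConnectionError(f"message {msg_id} was never acknowledged")
            entry[1] += 1
            outgoing.append((msg_id, entry[0]))
        return self.latest_transient, outgoing


sender = TwoTierSender()
sender.set_transient({"x": 1})
sender.set_transient({"x": 2})                # the older position is simply replaced
msg = sender.queue_reliable("match started")
frame_a = sender.build_frame()                # reliable message goes out
frame_b = sender.build_frame()                # no ack yet, so it goes out again
sender.on_ack(msg)
frame_c = sender.build_frame()                # only transient state remains
```

Even in this toy version the two tiers are tracked separately, which is where the out-of-sync edge cases come from.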
