Bitwise compression

While creating an online RPG server, especially as an indie or small team developer, you will quickly learn that hardware and bandwidth becomes very expensive, very fast. The cost of a 3.2Ghz dual-core processor and 2 GB of RAM really isn't too much these days for a personal computer, but putting one of those on a server rack by a professional server company is going to boost the costs by at least $200 a month, often more. This is money that could be going straight to your pocket by either paying for a cheaper server or being able to support more players online at once. Simply, the faster code runs, the less RAM it uses, and the less bandwidth it uses, the better. There are plenty of compression algorithms out there, but they are for large chunks of data, not 50 byte packets. They have their times for usage, and in packets is not one of them.

Chances are that you are still a bit fuzzy on binary after reading the last section, which is something that will go away as you practice it. But you are probably asking, "How do you use these operators for compression?" As you probably noticed, when specifying certain values, we often left many bits untouched. Take a boolean (true [1] or false [0] only) for instance - in a byte, 7 of those bits go untouched. What a complete waste! So why don't we pack those bits with more information!

Writing Bit flags

Commonly, in a true/false situation, the numeric value 1 will represent true, and 0 will represent false. Lets say you have 8 skills in your game, and you want the server to tell the client what skills they know when loading their character. One way you could do this is create an array of 8 bytes, setting them to 1 and 0 accordingly. Each byte will then represent a skill, and the value of the byte represents whether they know it (true) or not (false). This would look like this:

	Dim b(1 to 8) As Byte
 	b(1) = UserKnowsSkill(1)
	b(2) = UserKnowsSkill(2)
	b(3) = UserKnowsSkill(3)
	...
	b(8) = UserKnowsSkill(8)
	

* On a side note, UserKnowsSkill() is an imaginary function that returns 1 if the user knows the skill by the ID passed in the parameter, and 0 if they do not know it. This is just for the example to make understanding easier.

That now leaves us with 8 bytes we have to send to the client, which is a horrible waste. Instead, what we can do is OR the values on a single byte, such as follows:

	Dim b As Byte
	Dim i as byte 'Our looping variable
	b = 0 'Clear all the bits
	For i = 1 to 8 'This code is run 8 times, each time i changes + 1
	    If UserKnowsSkill(i) Then 'Make sure the user knows the skill before applying the bit
		   b = b OR (2 ^ (i - 1)) 'Set the bit in the appropriate place
		End If
	Next i
	

Variable b now contains a bit = 1 for each skill known, and = 0 for each skill unknown. The equation (2 ^ (i - 1)) is used to get the bit location. The -1 is there because we start on 0, but the For/Next loop starts on 1, so we have to adjust it or else we could only fit 7 bits on there.

Lets pretend that for the above example, the user knows the skills represented in the index of 1, 3, 4, 5 and 7. In binary, that would look like this:

	0000 0001 'Skill 1
	0000 0100 'Skill 3
	0000 1000 'Skill 4
	0001 0000 'Skill 5
	0100 0000 'Skill 7
	---------
	0101 1101 'The resulting byte
   

I didn't write OR between every line because that would take up a lot of space, but you can pretend there is an OR operator between every line. As you can see, the index of the skill is represented by what bit is used. Skill 1 uses bit 2 ^ 0, farthest on the right, while skill 7 uses bit 2 ^ 6, second to farthest on the left. Again, none of the bits affect one another, so any combination possible between those 8 skills can be used.

Reading bit flags

Before we go any farther, it is important that we know how to read those bits after we wrote them. In our above example of the 8 skills, when the client receives the value from the server, this is what it would see as the value: 93. That is quite worthless to us as a decimal, so we are going to have to read it as binary.

There are many ways to approach this - my favorite is the simple method of using AND to see if the bit is equal to 1, then if so, we set the skill to 1, too.

   If b AND 2 ^ 0 Then UserKnowsSkill(1) = 1 Else UserKnowsSkill(1) = 0
   If b AND 2 ^ 1 Then UserKnowsSkill(2) = 1 Else UserKnowsSkill(2) = 0
   ...
   If b AND 2 ^ 7 Then UserKnowsSkill(7) = 1 Else UserKnowsSkill(7) = 0
   

There are many ways you can shorten this with a loop, but in many cases, you will have a lot of bits representing different things. It is all up to personal preference, and what the scenario is.

When you start getting more skills that have to be sent, you have two options: use a larger integer, or send multiple bytes. It is recommended that you use send only bytes. For one, it is more flexible - you won't run out of bytes you can send, but you won't be able to send a 200-bit variable. Secondly, bytes are the smallest, so you get the smallest results. If sending 20 skills, you can either use 3 bytes, or a 32-bit integer (4 bytes). Using an array of bytes is harder to follow, but well worth the practice.

Packet compression examples

It is time to put our knowledge to use. If you are a bit hazy on bits and binary still, I strongly suggest you go back and read the previous guides over again, play with binary a bit, and read the additional links supplied if needed.

Lets make up a packet structure - this packet is a packet that will go from the server to the client, and tells the client the structure of a map tile. Ignore the fact that this is a weird kind of packet to use, it's an easy example to imagine. The packet will be constructed, as follows, in the order shown:

   Graphic1 As Integer (16-bit int)
   Graphic2 As Integer (16-bit int)
   Light As Long (32-bit int)
   CanWalkOn As Byte (8-bit int)
   CanSeePast As Byte (8-bit int)
   

That is our imaginary tile. I added the bit sizes for the variables so for those unfamiliar with Visual Basic can easily do the conversions. What each variable does is irrelevant, but we do need to know that CanWalkOn and CanSeePast are only going to be 0 or 1. Ok, so now we want the server to tell the client the information. You could just send this all as shown, which would be a total of 10 bytes, but we want to make it even smaller. What we will do is make a byte that will state whether each variable needs to be sent. Each variable above has a "common value". For the two graphic integers, it will be 0, since no graphic is the most common graphic (unless you know half your world is going to be the same graphic, in which case, you have other problems to deal with). For the light, it is going to be -1, which is the same as white (or no) light. For CanWalkOn, we will use 0, assuming more then 50% of the map is covered by tiles you cant walk on. Finally, CanSeePast will be 1, assuming more then 50% of the map you can see past the tile. To clarify, here is a list of the variable and their assumed common value:

   Graphic1 = 0
   Graphic2 = 0
   Light = -1
   CanWalkOn = 0
   CanSeePast = 1
   

What we will now do is use a byte to state if the values are not as listed above. We will be using 1 for if the value is different then above, and 0 for if the value is the same as above. For example, if Graphic1 = 8, we will put a 1, but if it is 0, we will put a 0.

   Dim b As Byte
   b = 0 'Clear the bits
   If Graphic1 <> 0 Then b = b OR 2 ^ 0
   If Graphic2 <> 0 Then b = b OR 2 ^ 1
   If Light <> -1 Then b = b OR 2 ^ 2
   If CanWalkOn <> 0 Then b = b OR 2 ^ 3
   If CanSeePast <> 1 Then b = b OR 2 ^ 4
   

* The operator <> means "less then or greater then", or in other words, "not equal to". So the first line reads, "If Graphic1 is not equal to 0..."

Byte b now holds a 1 for every value that is not equal to the common value. For every non-common value, we have to send the real value. For every common value, we do not have to send the value, since we know what it is already by the bits - the common value! But to let the client know what the common values are, we have to send an extra byte, b.

* For this example, 6 functions will be made up: SendByte, SendInteger, SendLong, GetByte, GetInteger and GetLong. These are for example only. Pretend that the send functions send a variable to the socket, while the get functions get the variable of that size from the socket. More on packets and networking will be explained later.

   SERVER SEND INFORMATION:
   SendByte b
   If b AND 2 ^ 0 Then SendInteger Graphic1
   If b AND 2 ^ 1 Then SendInteger Graphic2
   If b AND 2 ^ 2 Then SendLong Light
   
   CLIENT GET INFORMATION:
   Dim b As Byte
   b = GetByte 'Get the bit flags from the socket
   If b AND 2 ^ 0 Then Graphic1 = GetInteger Else Graphic1 = 0
   If b AND 2 ^ 1 Then Graphic2 = GetInteger Else Graphic2 = 0
   If b AND 2 ^ 2 Then Light = GetLong Else Light = -1
   If b AND 2 ^ 3 Then CanWalkOn = 1 Else CanWalkOn = 0
   If b AND 2 ^ 4 Then CanSeePast = 0 Else CanSeePast = 1
   

As you can see, b is sent and received, no matter what. b tells us what information we need to get if it is not equal to the common value. For the client, if the bit is 1, the value has to be gathered, but if the bit is 0, then the value is the common value. The CanWalkOn and CanSeePast work differently, though - the server doesn't send their value no matter what. We know the value is going to be either 1 or 0, so if it is not the common value, then that only leaves one value left to be used. This prevents us from sending an extra byte holding the value. Think of it this way:

"Your mom will give you only one slice of pizza or two slices of pizza for dinner. If she tells you that you are not getting two slices of pizza, how much are you getting?"

Theres no chance of you getting anything but one or two slices, and you know you are not getting two, so you have to get one. No exceptions.

You can also use the bit to represent the value, so if the bit is 1, then the value is 1, but it is better to get in practice of using common values, in case you have instances where it could be 1 or 2, or -1 and 1.