As we established in the first article in this series, a game save password stores game state data. In this article, we’re going to talk about the low-level details of how this is done. We won’t be talking much about the password itself, or the game state data. Instead, we’ll focus rather narrowly on how to encode and decode the data.
Numeric data
As far as numeric data is concerned, we’re only going to worry about encoding/decoding integers, for simplicity’s sake. For the most part, we’ll be concerned with positive integers, but we’ll cover negative values as well.
Actually, treating numeric GML data values as integers is a little bit tricky, since in GameMaker the only data types available are strings and reals (floating point numbers). For the most part you can treat a number as though it were an integer in GameMaker, as long as you don’t do math on the number that causes the digits after the decimal point to become non-zero values.
If you want decimal values, you can store the value as an integer, and then divide it by 10, or 100, or however many decimal places as you need after you decode it. Or, if you want to store a fraction, you can store the numerator and denominator as integers, and divide them after decoding them.
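As a minimal sketch of the scaling idea, assume a hypothetical speed value that needs two decimal places. The variable names are illustrative only, and the calls rely on the dec_to_b64 and b64_to_dec scripts from this article:

```gml
// Hypothetical example: store a value with two decimal places as an integer.
var spd = 3.75;                     // game state value with decimals
var stored = round(spd * 100);      // 375, a whole number that is safe to encode
var encoded = dec_to_b64(stored);   // "F3" (5 * 64 + 55)

// After decoding, divide by the same factor to recover the decimal value.
var decoded = b64_to_dec(encoded) / 100;   // 3.75
```

The scale factor has to be agreed on by both the encoding and decoding side, so it is best treated as part of the password specification.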
As long as you don’t deal in fractions and decimals, you can pretend that your numbers are integers, and GameMaker for the most part will act as though they are. This will come up later when we hit the limit for the largest integer value that we can encode using our method (16,777,215). I’ll explain why below, but for now it’s enough to know this should be a large enough value that we don’t really need to worry about trying to encode larger values, at least for most games.
The basic idea is that each character in your password alphabet stands for an integer value. Notice that the complete alphabet (26 upper case + 26 lower case) + the 10 numerals + 2 special characters gives you a range of 64 values; 2^6 = 64, so each character in a password can represent a 6-bit binary value.
A | B | C | D | E | F | G | H |
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
I | J | K | L | M | N | O | P |
8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
Q | R | S | T | U | V | W | X |
16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
Y | Z | a | b | c | d | e | f |
24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 |
g | h | i | j | k | l | m | n |
32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 |
o | p | q | r | s | t | u | v |
40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 |
w | x | y | z | 0 | 1 | 2 | 3 |
48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 |
4 | 5 | 6 | 7 | 8 | 9 | ! | ? |
56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 |
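The table above does not need to be typed in as a lookup structure. A minimal sketch, assuming the alphabet is kept in a single string in the same order as the table, shows how a character and its value can be converted in either direction:

```gml
// The alphabet string doubles as the lookup table: (position - 1) is the value.
var b64_alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!?";

// Character to value: string_pos is 1-based and returns 0 if the character is not found.
var value = string_pos("g", b64_alphabet) - 1;    // 32

// Value to character: string_char_at is also 1-based, so add 1.
var ch = string_char_at(b64_alphabet, 62 + 1);    // "!"
```

Because string_pos returns 0 for a character that is not in the alphabet, a result of -1 here can serve as a cheap check for invalid password characters.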
Beyond 63
What can you do if you want to store a value larger than 63? Simple: you just use a second digit in your base-64 number. Just like the next number after 9 is 10, in base-64 the next number after ? is BA. Until you’ve looked at base-64 numbers for a long time, it’s going to be very difficult to recognize the value represented by a base-64 number, but we don’t need to; we’ll tell the computer to do it for us with some GML scripts.
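To make the place-value idea concrete, here is a small worked sketch. Each digit is multiplied by a power of 64 according to its position, exactly as decimal digits are multiplied by powers of 10:

```gml
// "BA": B = 1 in the 64s place, A = 0 in the 1s place
var ba = 1 * power(64, 1) + 0 * power(64, 0);     // 64

// "CC": C = 2 in both places
var cc = 2 * power(64, 1) + 2 * power(64, 0);     // 128 + 2 = 130

// "????": the largest 4-digit base-64 value
var max4 = power(64, 4) - 1;                      // 16777215
```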
We said before that a 4-digit base-64 number stores 24 bits, or up to 16,777,215 in base-10. But for some numbers that might be overkill. If we wanted to, we could treat each digit as a single base-64 value, and add them together. For a 4-character base-64 string, this would give us a range of 0-252, nearly a byte. It’s a much less compact way of storing the data, but for small values it’s not too bad.
To do the conversion from base-10 to base-64 and back, we’ll need some GML scripts.
b64_to_dec(b64)
/* Takes a string of a b64-encoded value and converts it to a real number */
var b64_alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!?";
var value = 0;
var digit = 0;
var neg = 1;
if string_copy(argument0,1,1) == "-"{
    argument0 = string_copy(argument0,2,string_length(argument0)-1);
    neg = -1;
}
for (var i = string_length(argument0); i >= 1; i--){
    value += (string_pos(string_copy(argument0,i,1), b64_alphabet)-1) * power(64,digit);
    digit++;
}
return neg * value;
dec_to_b64(dec)
/* Takes a real number and converts it to a base-64 encoded string. Supports integer values from 0-16777215. */
var b64_alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!?";
var value = abs(argument0);
var r;
var str = "";
var done = false;
while !done{
    r = value mod 64;
    str = string_char_at(b64_alphabet, r+1) + str;
    value = (value - r) div 64;
    if (value == 0){done = true;}
}
if (argument0 < 0){
    str = "-" + str;
}
return str;
Lastly, it will be handy to have a function that can pad a B64 encoded string to a specific length:
b64_padded
/*
Takes a base-64 encoded string (argument0) and pads it to padded_length (argument1) with leading "A"'s (0's).
If the number is negative, the first character in the padded string will be a "-".
If the padded_length is less than the length of the b64 value, the function returns -1 to signify an error.
(Note the error code returned is a real number, -1, not to be confused with the padded form of a negative value, which is always a *string* beginning with "-", such as "-B" for -1.)
*/
//argument0 the b64 string to pad to length
//argument1 the length to pad to
var b64 = argument0;
var len = argument1;
if len < string_length(b64){return -1;} //too short, return error
if len == string_length(b64){return b64;} //just right; we're already done
var str = "";
if string_copy(b64,1,1) == "-"{
    str = "-";
    len--;
    b64 = string_copy(b64,2,string_length(b64)-1);
}
repeat (len - string_length(b64)){str+="A";}
str+=b64;
return str;
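A short usage sketch may help show how the three scripts fit together. The values are arbitrary examples, and the results in the comments follow from the alphabet used in this article:

```gml
var code   = dec_to_b64(130);                  // "CC"
var padded = b64_padded(code, 4);              // "AACC"
var back   = b64_to_dec(padded);               // 130, since leading "A"s are zeros

var neg    = b64_padded(dec_to_b64(-1), 4);    // "-AAB", the sign counts toward the length
var err    = b64_padded("AACC", 2);            // -1 (a real), because 4 characters cannot fit in 2
```

Padding matters because a password has fixed positions for each field, so every encoded value needs a predictable width.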
Negative Numbers
The scripts above handle negative values just fine. But the minus sign is not part of the base-64 alphabet. If we need to store negative numbers, we have a few choices.
We can expand our password alphabet to include a minus sign (or an arbitrary symbol that we can use as a substitute).
Or we can designate certain characters in the password to store a boolean which signifies whether a given numeric value stored elsewhere in the password is positive or negative. Then encode the absolute value of the number, and re-combine it later with the boolean that holds the sign.
Or we can sacrifice a bit in our base-64 encoded numbers and treat them as signed integers, such that A-g represent values 0 through 32, and h-? represent negative values -1 through -31.
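The third option can be sketched in a few lines. This is only an illustration of the mapping described above (A-g for 0 through 32, h-? for -1 through -31), applied to a single character, with a hypothetical input value:

```gml
// Encoding: map a signed value in the range -31..32 to a single character.
var n = -5;                                    // hypothetical signed value
var v;
if (n >= 0) { v = n; } else { v = 32 - n; }    // -5 becomes 37
var ch = dec_to_b64(v);                        // "l"

// Decoding: values above 32 stand for negatives.
var d = b64_to_dec(ch);                        // 37
var result;
if (d <= 32) { result = d; } else { result = -(d - 32); }   // -5
```

The same idea extends to multi-character fields by splitting the available range in two, at the cost of halving the largest magnitude that can be stored.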
Beyond 16,777,215?
The encoding/decoding scripts work for values up to 16,777,215, or ???? in b64. This value is (2^24)-1. Beyond that, the numbers do not encode/decode reliably. The reason for this has to do with the way GameMaker stores numeric values. All numbers in GameMaker are floating point values. GameMaker uses a 32-bit floating point value, which is made up of 1 sign bit, 8 exponent bits, and a 24-bit significand (23 stored bits plus an implicit leading bit). The significand holds the digits of the number, so every integer up to 2^24 can be represented exactly, but above that the gaps between representable values become larger than 1. This means that for a number above that limit, we can’t always store the value in a 32-bit floating point variable without losing some precision. This loss of precision prevents us from cleanly encoding a larger value and decoding it back to the same value. Fortunately, such high values should be rare to encounter in the game state data.
What happens if you try to store a larger value depends on the build target. Some may lose precision, resulting in off-by-one conversions, which amounts to data corruption. Others may fail to return a value entirely when the conversion script is called, which would result in total loss of the data. Certain build targets (I’m not certain which ones) may use 64-bit floating point values, rather than 32-bit. In this case, we can go much higher: a 64-bit float has a 53-bit significand, so integers up to 2^53 are represented exactly. This is a ridiculously large number, and it’s almost certainly more than enough for any game state value you might want to encode.
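Since the behavior varies by build target, one practical approach is to test it directly. The helper below is a hypothetical sketch (the script name is not part of the demo project) that reports whether a value survives a round trip through the two conversion scripts:

```gml
/// b64_roundtrip_ok(n)
// Hypothetical helper: returns true if n survives encoding and decoding unchanged.
return (b64_to_dec(dec_to_b64(argument0)) == argument0);
```

Calling it from a debug event, for example with `show_debug_message(string(b64_roundtrip_ok(16777215)));` and then with a few larger values, should show where a given target stops converting cleanly.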
Tip for validation
If the password space allows a value larger than the max value for a given game state variable, you can use those larger values as a form of validation check. For example, let’s say that the maximum number of lives in your game is 255. This requires an 8-bit value, but since each character represents 6 bits, and we don’t want to bother splitting characters, we use 2 password characters, which represent 12 bits of data (values 0 through 4095). We can use the additional value space between 256 and 4095 for a validation check.
The simplest method would be to reject any password that contains data in the lives bits that decodes to a value greater than 255. Another way to handle this is to use a math value to obfuscate the lives value. Since 4096/256 = 16, we can take the value of the two characters that we use to encode the lives count as follows: lives x 16 = password substring. Now, when you’re validating the password, you can mod the value of the characters that represent the lives by 16, and if the calculation doesn’t work out to 0, then you know the password isn’t valid.
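The multiply-by-16 approach can be sketched as follows. The variable names are hypothetical (lives_count is used because GameMaker already has a built-in lives variable), and the two-character width matches the example above:

```gml
// Encoding: multiply by 16 so that every valid value is a multiple of 16.
var lives_count = 3;                                              // hypothetical value, 0..255
var lives_chars = b64_padded(dec_to_b64(lives_count * 16), 2);    // 48 -> "Aw"

// Validation and decoding.
var raw = b64_to_dec(lives_chars);                                // 48
var valid = ((raw mod 16) == 0);                                  // reject if there is a remainder
if (valid) { lives_count = raw div 16; }                          // 3
```

With this scheme only 256 of the 4096 possible two-character values are accepted, so a randomly altered pair has roughly a 1 in 16 chance of passing.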
Or you can make the “right” remainder be dependent upon some other part of the password — for example, when the Level is even, the lives substring should mod16 to 0, but when Level is odd, it should mod16 to 1, unless Level is divisible by 3, in which case… Sneaky/unnecessary complexity like this will make the password harder to understand, and therefore require more effort to crack the password system. Don’t fool yourself into thinking you’re coming up with a super secure uncrackable password, but it will make the password a little less obvious than a simple counter, and for password system crackers, will make the puzzle more fun.
String data
Old NES password games did not typically have any string variables to preserve in a save password. In fact, the few games where you could enter a name were all, to the best of my knowledge, battery backup games that used savefiles.
It’s simple enough to store a string in a password, though, since a password is a string. The only real issue is dealing with variable length strings. For a lot of reasons, it’s probably best to stick with a fixed length for strings and pad shorter strings with spaces, trimming them later when applying the password to game state.
However, it looks weird if your password has string values stored as plain text in the password — it’s an obvious indicator to the player that the password is storing data, which could invite mischief. So it’s probably a good idea to encode the data somehow.
We could encode strings by using the numeric ASCII values of the letters, and then convert them back, character by character. While not encrypting the data, it would be sufficiently obfuscated for our purposes.
I won’t bother implementing this for now, since our demo password only has 4 characters, and in any case it’s not really necessary, but it’s good to have an idea of how we might do it if we decide to later on.
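For readers who do want to try it, here is a minimal sketch of the idea, not part of the demo project. It assumes two hypothetical scripts, str_to_b64 and b64_to_str, and assumes every character code is below 4096 so that it fits in two base-64 digits:

```gml
/// str_to_b64(str)
// Hypothetical helper: encodes each character's code as two base-64 digits.
var out = "";
for (var i = 1; i <= string_length(argument0); i++){
    var code = ord(string_char_at(argument0, i));   // e.g. "A" -> 65
    out += b64_padded(dec_to_b64(code), 2);         // 65 -> "BB"
}
return out;

/// b64_to_str(b64)
// Hypothetical helper: reverses str_to_b64, reading two digits per character.
var out = "";
for (var i = 1; i <= string_length(argument0); i += 2){
    out += chr(b64_to_dec(string_copy(argument0, i, 2)));
}
return out;
```

This doubles the length of the stored string, which is one more reason to keep names short and fixed-length.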
Boolean data
GameMaker doesn’t have a true boolean data type. The boolean constants true and false are equal to 1 and 0 in GML. Boolean expressions in GameMaker are handled by evaluating to a real number, and any number <0.5 evaluates as false, and 0.5 or greater evaluates as true. Therefore, to encode a boolean, all we really need to do is encode a value of 0 or 1, or any other value that will evaluate to true or false.
But, since a single base-64 character can store up to 6 bits of data, and each bit can store a 0 or a 1, a single character could in theory store up to 6 boolean values, which means a 6x greater density than if we just stored one boolean value in each character.
To achieve this, we need to convert a base-64 value to a binary value, and vice versa. Then we would need to break the binary value into its individual bits, and designate each bit for a specific boolean value.
Note that the next few functions do not really deal with binary values, but rather with strings storing 0 and 1 characters, which we can convert to boolean values by using the real() function.
Edit: I’ve updated these functions with more elegant implementations, and replaced the brute force switch statement lookup table approach that I had here originally. Thanks to Ian Schreiber for the suggestion.
bin_to_dec(bin)
/* Takes a string of a binary encoded value and converts it to a real value. */
var bin_alphabet = "01";
var value = 0;
var digit = 0;
for (var i = string_length(argument0); i >= 1; i--){
    value += ((string_pos(string_copy(argument0,i,1), bin_alphabet)-1) * power(2,digit));
    digit++;
}
return value;
dec_to_bin(dec)
/* Takes a base-10 value and encodes it as a binary string */
var bin_alphabet = "01";
var r;
var str = "";
var done = false;
var neg = sign(argument0);
var value = abs(argument0);
while !done{
    r = value mod 2;
    str = string_char_at(bin_alphabet, r+1) + str;
    value = (value - r) div 2;
    if (value == 0){done = true;}
}
/* prefix negative values with "-", matching dec_to_b64 */
if (neg < 0) {str = "-" + str;}
return str;
bin_to_b64(bin)
/* Takes a 6-digit binary encoded string and converts it to a b64-encoded character. */ return dec_to_b64(bin_to_dec(argument0));
bin_to_bool(bin)
/*
Extracts the argument1-th bit out of the binary string supplied in argument0 and returns it, converted to a real number.
The real value can be interpreted as a boolean (0=false; 1=true).
Argument0 must be a string consisting only of "0"'s and "1"'s.
Argument1 must be a number between 1 and string_length(argument0).
*/
return real(string_copy(argument0,argument1,1));
bool6_to_bin(bool,bool,bool,bool,bool,bool)
/* Takes six boolean values and concatenates them to create a 6-bit binary encoded string, suitable for conversion to a b64 value with the bin_to_b64() function. */
/* force conversion of arguments to gml boolean constants */
if argument0{argument0 = true;}else{argument0 = false;}
if argument1{argument1 = true;}else{argument1 = false;}
if argument2{argument2 = true;}else{argument2 = false;}
if argument3{argument3 = true;}else{argument3 = false;}
if argument4{argument4 = true;}else{argument4 = false;}
if argument5{argument5 = true;}else{argument5 = false;}
/* concatenate bools to string */
return string(argument0) + string(argument1) + string(argument2) + string(argument3) + string(argument4) + string(argument5);
bin_padded
//pads a binary encoded string (argument0) to length (argument1)
var bin = argument0;//bin string to pad
var len = argument1;//padded length
if len < string_length(bin){return -1;}
if len == string_length(bin){return bin;}
repeat(len - string_length(bin)){bin = "0" + bin;}
return bin;
These scripts should suffice for encoding/decoding all of our data values.
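To show how the pieces combine, here is a sketch of the reverse of bin_to_b64, followed by a short usage example. The b64_to_bin body is my own composition of the scripts above rather than code from the demo project, and the flag values are hypothetical. It also assumes, as bool6_to_bin does, that string(true) yields "1":

```gml
/// b64_to_bin(b64)
// Sketch of the reverse of bin_to_b64: decode to a number, convert to bits, pad to 6.
return bin_padded(dec_to_bin(b64_to_dec(argument0)), 6);
```

```gml
// Pack six hypothetical flags into one password character, then read them back.
var bits = bool6_to_bin(true, false, true, true, false, false);   // "101100"
var ch   = bin_to_b64(bits);                                      // 44 -> "s"
var back = b64_to_bin(ch);                                        // "101100"
var first_flag = bin_to_bool(back, 1);                            // 1
var second_flag = bin_to_bool(back, 2);                           // 0
```

The padding step matters on the way back: without it, a character such as D would decode to "11" instead of "000011", and the bit positions would no longer line up with their flags.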
Advanced password validation with checksums and hashes
Additionally, to keep the player honest, you probably want some kind of tamper-proofing to go with your password encoding. Otherwise, it makes tinkering with the password to cheat too easy. For example, you could play the game, get a password, then get hurt, and get a new password, and by comparing the two, you could infer which characters in the password correspond to your health. Then, you could substitute different values for the password characters that correspond to your health until you hit upon a value that gave you max health, or, depending on how the game works, even infinite health or invulnerability.
It’s true that deciphering password encoding schemes is a lot of fun, but we don’t want to make the system so easy that it invites cheating. Make the player work at it, and feel like they truly hacked something when they figure out your system.
To do this, we can reserve certain characters in the password for storing checksum values, or even hash values. Understanding hash and checksum math isn’t too difficult, but isn’t terribly necessary, either. We don’t need something impossible to crack, just something functional. There are better checksum functions than I’ll demonstrate here, but this should suffice to make the concept understandable to anyone.
Simple checksum
Using our 4-character example password, we can add an additional two characters for checksum data.
Level | Lives | Health | Ammo | Checksum |
A-? | A-? | A-? | A-? | AA-?? |
Let’s say the game state data is encoded with the following 4 values from our cipher table, above:
| Level | Lives | Health | Ammo |
password char | B | D | ? | ? |
base-10 value | 1 | 3 | 63 | 63 |
A simple checksum for this would be 1+3+63+63 = 130. Of course, this checksum isn’t very strong; any four values for Level, Lives, Health, and Ammo that add up to 130 will have the same checksum. But it does stop someone from changing a single character in the password and getting another valid password.
To add the checksum value to the password, we need to encode 130 as a base-64 value. This would be a 2-digit base-64 number: 22. According to our base-64 encoding table, the value 2 is encoded by the symbol C. So, 22 == CC. So the entire password would be:
| Level | Lives | Health | Ammo | Checksum |
password char | B | D | ? | ? | CC |
base-10 value | 1 | 3 | 63 | 63 | 130 |
Note that since 63*4 = 252, the highest the checksum value will ever go will be 252, or D8.
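The checksum described above can be verified with a short script. This is a minimal sketch that assumes the 6-character layout from the table (four data characters followed by a two-character checksum); the script name is hypothetical:

```gml
/// password_checksum_ok(password)
// Hypothetical validator: characters 1-4 are data, characters 5-6 are the checksum.
var sum = 0;
for (var i = 1; i <= 4; i++){
    sum += b64_to_dec(string_copy(argument0, i, 1));
}
// Pad to two characters so that small sums such as 5 compare as "AF", not "F".
var expected = b64_padded(dec_to_b64(sum), 2);
return (string_copy(argument0, 5, 2) == expected);
```

With the example values, password_checksum_ok("BD??CC") returns true, while changing a single data character, as in "BD?!CC", makes the sum 129 (CB) and the check fails.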
If you wanted to get really crazy, you can scramble the order of the characters, shuffling checksum characters in between the data value characters. But we’re not that concerned with obfuscation, so we won’t bother with an example of this.
In the next article, we’ll cover how to design the password specification in greater detail, and write some sample scripts that take the password, validate and decode it, and assign the stored values to re-create the game state.
I avoid decimals/fractions on general principle. Just multiply by a large enough number to make them whole. Like, if you have increments of 1/2 on one piece of game data, just multiply it by 2, at least when you store it internally. Removes the possibility of rounding errors, GameMaker displays it numerically without decimals, in general it’s just cleaner, and there are precious few cases where game data REALLY NEEDS to be fractional. I mean, really, try to think right now of any game you’ve played that has fractional data shown on the HUD somewhere. I bet you can’t, at least off the top of your head.
For base-64 encoding, note that between upper case, lower case, and punctuation, you actually have quite a bit more than 64 characters to work with, so you can afford to be a bit selective. Give some thought to which characters are ‘in’ and which are ‘out’. The easiest way to do this programming-wise is to start with, say, ASCII 48 (the number ‘0’) and go up from there, so your 64 characters are: 0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmno
Then you don’t need any special custom function to map from characters to base 64, just take the ASCII value of each character and subtract 48.
However, while that’s great for the programmer, it’s downright ugly for the player. Better would be to use the method you laid out, but selectively remove confusing characters: so remove oO0, 1Il, and so on, and replace with other recognizable punctuation. Why yes, I do still have bad memories of writing down a 24-character password after a several-hour-long session, only to have accidentally written one of those characters wrong, thus destroying my progress, and yes, I’m still bitter about that. Make the password easy to read. (Ideally, print out upper and lower case in different colors, to prevent getting similar-looking letters mixed up, like cC kK pP sS uU vV wW xX zZ.)
Alternately, you can simplify your alphabet down to 16 or 32 characters and just have longer passwords. Or heck, just use the capital letters and use base 26; yes, hardcore programmers wince at something that’s not a power of 2, but for the algorithm you’ve outlined here there is no reason it couldn’t be used for an arbitrary base.
Didn’t understand what you meant when you were saying to add the four values together for a 4-digit base-64 number. As far as I can tell, this will flat out not work, because there’s no way to get the original values from the sum. Simple example: let’s say I have four values, and those values are HP=2, Attack=3, Defense=2, Speed=1. The sum is 8. But if the four values were 3/3/1/1 or 1/1/1/5, those also sum to 8. If you just store the value 8, you have no way of knowing which of those sets of values the 8 represents.
The b64_to_bin algorithm works, but it’s brute force and an obnoxious amount of mindless typing to implement. There’s a cleaner way: just convert argument0 to a number (using b64_to_dec), which gives you an integer from 0 to 63; then convert that to individual bits (and the reverse for bin_to_b64), exactly the way you convert between numbers and characters for b64_to_dec and dec_to_b64 (note that in these earlier functions you didn’t need a 64-case switch statement to convert).
Ian Schreiber
Thanks for all the suggestions, Ian.
I’m all in favor of a simplified alphabet that makes confusion due to ambiguous characters a non-issue. My thought on that was to use a custom font, rather than selectively culling the alphabet, but really either approach works and it’s a matter of preference, and it’s probably easier to cull the alphabet than it is to find or make a font that omits those characters.
You’re quite right that I could have simply used shifted ASCII values. However, I chose to follow the alphabet ordering of the IETF RFC 3548 standard for base-64 encoding (substituting ! and ? for its + and / characters), since I figured it was better to follow a standard, but again either approach is perfectly viable.
The “adding values together” for values above 63 approach isn’t a very good one, but I wanted to show that you can think of 4 characters in a b64-encoded string as a 4-digit b64 number, or as 4 separate single-digit values. You probably wouldn’t want to implement it that way, but you’re right that you can’t get the original values from the sum. It doesn’t necessarily matter — as long as you only care what the sum is, and have some way of splitting it up among the four numbers when encoding.
The b64_to_bin is brute force, yes. It’s ugly and I didn’t want to do it that way, but I have some personal deadlines to meet with getting this series finished, so I had to take a shortcut with writing that code, rather than research and test a more elegant method. I figured there had to be a better way of doing it — typing all that binary in and proofreading it is a chore, let me tell you — but now that I’ve done it, it’s done. When I share the project files for this, at the end of the series, users won’t have to re-type it themselves, they can just download it and use it. I really would like to replace the implementation in those functions with the cleaner approach that you describe. Maybe I’ll get that implemented before the final article.
Chris Sanyk