CS50x Caesar

From problem set 2

Guilherme Pirani
Oct 10, 2020


Photo by Kai Dahms on Unsplash

Although the salad may have overshadowed him, the real Caesar did have his fair share of inventions. One of them, supposedly, is a simple cryptography system that consists of adding a number, or “key”, to a letter, transforming it into another letter: a jump of positions based on said key. Here’s a working example of “Hello” with key = 1.

$ ./caesar 1
plaintext: Hello
ciphertext: Ifmmp

Unencrypted text is generally called plaintext. Encrypted text is generally called ciphertext. And the secret used is called a key.

So, how does it work for the code we’re supposed to write?

For the first time, we’re passing an argument at the same time we launch the program from the terminal: the key. Notice that the case of the original message has been preserved: lowercase letters remain lowercase, and uppercase letters remain uppercase. We’re given a formula to make sure our math is correct.

Ciphertext = (Plaintext + Key) % 26

That formula assumes that the index of ‘A’ in our alphabet is 0, so ‘B’ is 1 and ‘Z’ is 25. Computers don’t store letters that way, though: there’s no ‘A’ in computer memory, only a number that represents it, defined by a convention called the ASCII table. Uppercase ‘A’ is 65, while lowercase ‘a’ is 97, so we’ll have to work around that in our code. Take a look at the table; there’s no need to memorize it, since it’s always there to be consulted: http://www.asciitable.com/

One more consideration before starting to code:

$ ./caesar
Usage: ./caesar key

As we’re taking the key as an argument to our main function, we can’t re-prompt the user for it. Our only option is to print a message explaining the error and terminate the program. The same should happen if the key isn’t made up entirely of decimal digits. Inside the plaintext, the instruction is to leave numerals exactly as they were typed.

Now for pseudocode.

// Get Key
// Make sure parameters are valid
// Get text
// Encrypt with given formula
// Print ciphertext

There will be some steps inside encrypting and printing that I’d rather specify further on. Right now the base of our code looks like this:

#include <stdio.h>
#include <cs50.h>
#include <ctype.h>
#include <stdlib.h>

void caesarCode(char *, int);

int main(int argc, char *argv[])
{
    // Validate key here (this step also defines the int key used below)
    char *plaintext = get_string("Enter text to encrypt: ");
    caesarCode(plaintext, key);
}

void caesarCode(char *t, int k)
{
    // Encrypt and print here
}

By now you should’ve noticed that main’s parameters are not (void) but an int (argc) and an array of strings (char *argv[]). I started using char *, the common C syntax for a string, as preparation for the weeks ahead when the training wheels of cs50.h are taken off. argc keeps track of the number of items inside argv, which records everything we typed when starting the program. So we can validate the user’s input by checking these two variables.

if (argc != 2)
{
    printf("Usage: ./caesar key\n");
    return 1;
}

for (int i = 0; argv[1][i] != '\0'; i++)
{
    if (isalpha(argv[1][i]) != 0)
    {
        printf("Usage: ./caesar key\n");
        return 1;
    }
}
int key = atoi(argv[1]);

Explaining: ./caesar is the first item in our argv array, argv[0], and the key is the second, argv[1]. Knowing that the proper usage is “./caesar key”, any argc other than 2 is wrong. To check that our key is composed of numbers only, we need a for loop to iterate through each character of argv[1]. On every iteration we look for an alphabetical character and stop the program if we find one. Note that this only rejects letters: a key such as 1! would slip through, so a stricter check would reject any character for which isdigit returns 0. As always, the last character of the string is the null character (‘\0’), and that’s where our loop ends. argv[1][i] refers to position i (0 on the first iteration) of our string; the next iteration checks argv[1][1], and so on. Finally, even though we’re talking about numbers, they’re stored as characters because argv[1] is a string. So to finish this code block we need to use the function atoi to convert the key to an integer.

The code that asks the user for text is already in main, here:

char *plaintext = get_string("Enter text to encrypt: ");

Next in line is the actual encryption of the text.

printf("ciphertext: ");

for (int i = 0; t[i] != '\0'; i++)
{
    if (isalpha(t[i]) != 0)
    {
        if (isupper(t[i]) != 0)
        {
            printf("%c", ((t[i] - 'A' + k) % 26) + 'A');
        }
        else
        {
            printf("%c", ((t[i] - 'a' + k) % 26) + 'a');
        }
    }
    else
    {
        printf("%c", t[i]);
    }
}
printf("\n");

We start by printing “ciphertext: ” (without breaking the line) simply because the course requires it: we need that to pass the automatic tests. Then we need another loop to iterate over each of the text’s characters. The isalpha function checks whether each one is alphabetical (remember, it returns 0 if not). If it is, we reach a fork: it could be uppercase or lowercase. Luckily, the same library also has functions to check that, isupper and islower. They return the same way isalpha does, i.e. 0 on false. After we check whether the letter is uppercase or lowercase, it’s just a matter of applying the formula. There’s a catch I mentioned at the beginning, though. The formula assumes ‘A’ has index 0, but in ASCII it’s 65. Here’s a worked example of how to solve that (my choice; there are other methods). By the way, we print each ciphertext letter right after it’s calculated, which is a good way to avoid storing the plaintext and ciphertext in the same place.

Given: key = 1         plaintext index = 0         plaintext = ABEL
The % symbol means modulo
printf("%c", ((t[0] - 'A' + k) % 26) + 'A');
(65 - 65 + 1) % 26 + 65    <- these are ASCII codes
1 % 26 + 65                <- 1 is the alphabetical index
1 + 65
66                         <- now back to ASCII
ASCII 66 = B

When we subtract ‘A’ from our character’s ASCII code, we get the alphabetical index of that same character, in this case 0. We then add the key to transform it: A (0) plus 1 equals B (1). Now we have to make sure we didn’t run past the end of the alphabet. That’s what % 26 does. Suppose t[0] equals ‘Z’ (ASCII 90). That would give us 90 - 65 + 1 = 26. Not inside the alphabet! But 26 % 26 equals 0, so we’re back to alphabetical ‘A’. The magic is that every number from 0 to 25 stays the same under % 26, while 26 becomes 0. We might think it’s over, but if we printed the results right now we’d be terribly wrong, leading to unexpected behavior in our program. Remember that characters come from ASCII, and ASCII 0 is the null character (NUL). We can’t forget to transform our alphabetical index back into an ASCII code. That’s why we add ‘A’ back at the end of the formula. Hope it made sense. We do the same for the lowercase letters, but with ‘a’ instead of ‘A’.

The last else in caesarCode handles every non-alphabetical character: they just stay the same. A final newline ends the function and our code. Another fun one to work through! You can check the usual tests and full code with comments below. Thank you for reading!

Results generated by style50 v2.7.4
Looks good!
Results for cs50/problems/2020/x/caesar generated by check50 v3.1.2
:) caesar.c exists.
:) caesar.c compiles.
:) encrypts "a" as "b" using 1 as key
:) encrypts "barfoo" as "yxocll" using 23 as key
:) encrypts "BARFOO" as "EDUIRR" using 3 as key
:) encrypts "BaRFoo" as "FeVJss" using 4 as key
:) encrypts "barfoo" as "onesbb" using 65 as key
:) encrypts "world, say hello!" as "iadxp, emk tqxxa!" using 12 as key
:) handles lack of key
:) handles non-numeric key
:) handles too many arguments
