Project Rules: You should work on your own on this project; projects 4,5,6 carry more weight than projects 1,2,3 (which were team projects). You can use the code handed out for the calculator program, and you can reuse your code from Homework 5 and Lab assignments. Your project must meet the required specifications described in this document. However, you can choose to implement additional features that you build into the program (if the features are shown to be useful) and improve your grade on the project. There is no collaboration, of any sort, allowed on this project.
Recommendation: Read this project description as soon as possible. It is also recommended that you write out ‘pseudo code’ (or flow chart) for this program – this will help you get started with writing the assembly program.
In this project you are required to implement an “encryption module”, in LC3 assembly, that allows users to encrypt their messages using a simple encryption algorithm (using a variation of the Caesar’s cipher algorithm and One Time Pads) that mimics the ARX (Add Rotate Xor) ciphers of today (though we will leave out the Rotate aspect in this project!). Applications where a sequence of digits/numbers, such as the social security number, passwords or credit card numbers, have to be encrypted are numerous – examples include online purchasing (such as Amazon), patient records in health care systems. In this assignment, you will develop a program to encrypt a string of at most 10 characters using an encryption algorithm described in this document. (The characters from the keyboard, in terms of their ASCII representation, are all printable characters with ASCII representation from x21, for the symbol “!”, to x7E which is the symbol ~).
1. Cryptography Background
Cryptography, from the Greek word kryptos, meaning “hidden”, deals with the science of keeping information secret. The desire to do this has existed ever since humankind was first able to write. There are many ways to keep something secret. One way is to physically hide the document. Another way is to encrypt the text you wish to remain secret. Some people use this to keep others from understanding the contents of the files in their computer. Encrypting a file requires an input message, called the “plain text,” and an encryption algorithm. An encryption algorithm transforms “plain text” into “cipher text.” Just like a door needs a key to lock and unlock it, an encryption algorithm often requires a key to encrypt and decrypt a message. Just like a homeowner can selectively give the key to the front door to only those he wants to allow unaccompanied access, the author of a message can selectively give the encryption key to only those he wants to be able to read the message. In order for someone to read the encrypted message, they have to decrypt the cipher text, which usually requires the key.
For example, suppose the plain text message is HELLO WORLD. An encryption algorithm consisting of nothing more than replacing each letter with the next letter in the alphabet would produce the cipher text IFMMP XPSME. If someone saw IFMMP
XPSME, they would have no idea what it meant (unless, of course, they could break the encryption, or had the key to decrypt it.) The key to encrypt in this case is “pick the next letter in the alphabet,” and the decryption key is “pick the previous letter in the alphabet.” The decryption algorithm simply replaces each letter with the one before it, and presto: the plain text HELLO WORLD is produced.
Encryption Algorithm
In this programming project you will implement a simple ARX cipher (with the rotational component left out for simplicity; but you could think of it as adding a shift operation) encryption algorithm which we call “SAD” based on a composition of two methods – Caeser’s cipher and the Vigenere Cipher (both of which are described in detail at the end of this document). The goal is to encrypt a message, consisting of ASCII characters, using a key provided by the user. To decrypt the message correctly, the same key must be used by the decryption algorithm.
2. Project Requirements and Specifications
Your encryption/decryption programs (in LC3 assembly) must meet the following requirements. Any deviation from this, without approval from the instructor, or
the TAs will result in points being deducted. The program must use the SAD encryption algorithm described in this document.
Specifications:
Your program should prompt the user for three separate inputs from the keyboard, as follows:
(0) When you start the program, it first prints out “Starting Privacy Module” and then goes to Step 1.
(1)The prompt:
ENTER: E TO ENCRYPT, D TO DECRYPT, X TO EXIT:
o The user will type E or D or R or X. A real world program, with error checking for invalid inputs, should also detect any other character typed and respond with
INVALID ENTRY. PLEASE TRY AGAIN.
o Your program will accept that character, store it someplace (you should
specify where in memory) and use it, as we shall see momentarily.
o If the user types X then the program first erases the encrypted data (i.e.,
replace the contents in memory where the encrypted data was stored with zeros), and then exits and halts.
(2) If either E or D is chosen in (1) then the program prints the prompts:
ENTER KEY (Length 4, one non-numeric character followed by 3 digit number between 0 and 127).
WHEN DONE PRESS ENTER
1. (a) The user will type their key (note that the key in this case is a non-numeric character followed by a 3 digit number between 1 and 127). Again, we will assume the user can successfully hit digit keys on the keyboard. You can make any of the two assumptions: (1) user will enter return after they are done entering numeric number between 0 and 127 – for example A23 and enter return; or (2) user will always enter 3 digits after the non-numeric character – for example they will enter A023. To figure out which design is ‘easier’ take a look at the ASCII to Binary (and Binary to ASCII) subroutines from HW5 (and the Calculator example in the textbook).
Question (during your design): what do you echo when the user types
in the key?
2. (b) Your program will accept this input, store it someplace, and use it to encrypt or decrypt the message.
•Important: You will need to figure out how to store and interpret this (max) length 4 key entered by the user. Read details of the encryption algorithm to determine what needs to be done.
(c) IftheinputinStep1wasDthenitgoestoStep(3).IftheinputinStep1was E then program goes to step (4). If the input in Step 1 was X then it goes to step (5).
(3) If the input in (1) was D (decryption) then your program must decrypt the string (some ciphertext/message you stored) using the key entered in this step and print it out to the display. The cipher text string will be a 10 character cipher text (i.e., after encryption) stored starting at location with label MESSAGE – details follow in step (3).
• • Note: to protect against losing the encrypted data stored starting at MESSAGE if a wrong key was entered, make sure you do not write into MESSAGE the string you are going to print.
• • The decrypted message must be printed to the display. Note that if the user entered the wrong key, then it will print out a message (string of characters) that will not be equal to the original message.
• • After decrypting the program returns to Step 1 to prompt for next user input.
(4) If the input in (1) was E (i.e.,encryption) then after getting the key in Step (2), print the prompt:
Enter input plain text of length at most 10. When done press <enter>
1. The user will input a string/sequence of at most 10 characters (the ascii
printable characters and do not include control characters) from the keyboard, terminating the message with the <ENTER> key. You can make the simplifying assumption of exactly 10 characters – for penalty of 2 points.
2. Your program must store this ASCII string in memory location MESSAGE starting at x4000. It must then write the encrypted characters into the same location since you do not want plain text to persist. Question: – why ?
3. Using the value of the key entered in Step(2), and the encryption algorithm provided, generate the cipher text and store the cipher text into memory
starting at location labeled with MESSAGE (i.e., the label of the starting
address of the location is MESSAGE).
d. After encrypting, the program returns to Step 1 where it can prompt for next
user input.
Question: How would you change the program if instead of at most 10 message, you want to allow strings of any length ?
(4) If the input in (1) was X (i.e., Exit) then the program erases the memory locations where the encrypted data was stored (erase is simply writing 0 into the locations) and exits and halts.
3. Hand-in Requirements – What you need to submit.
You are required to hand-in two components (as a zip file): (1) your assembly code and (2) a report describing your implementation – this should include pseudo code. Your assembly code must be a text file – i.e., an .asm file. You must name the file as firstinitial-lastname.asm. For example, if your name is Jane Doe then the filename is J- Doe.asm. The report is meant to assess the effectiveness of your design and we are looking (a) for an algorithm describing your implementation and (b) answers to the (underlined) questions in this document. You must document your code, and include a flow-chart describing your solution. Note that you will get a grade of zero if your program does not work and you do not submit a report describing your algorithm.
4. Grading Criteria
• • You must submit code that assembles correctly without errors – if your code does not assemble then you get zero points. If the code crashes during testing of specifications, you cannot earn more than 50% of the points.
• • You must document your code – you lose up to 10% for poor (or no) code documentation.
• • The code must meet all the specifications (user interface, encryption and decryption algorithms).
• • The encryption algorithms must be correctly implemented. If you implement part (subset) of the encryption algorithms correctly then you will receive partial credit (of up to 80%).
• • Performance of the code – pay attention to minimizing redundant code.
• • Error checking – you can get correctly working code without error checking (for
invalid inputs) by assuming that the users enter valid input data. However, making this simplifying assumption means you will lose some points for not implementing error checking. (You can still get 90% without error checking if everything else works perfectly.)
5. Encryption Algorithms
The encryption algorithm you will implement is a composition of two well known (and no longer used!) techniques: (1) f1 : Caeser’s cipher (generalized to a shift cipher), and (2)
f2 : Vigenere Cipher – this implements a lightweight variation of the one time pad techniques. (For it to be in the ARX category, we need a third step of rotating the bits – this step is left out in this assignment but you are welcome to try adding this component for extra credit as a f3: shift k bits where 0<= k<8 is also input as a key.) This SAD encryption algorithm used by your security module works as follows:
• • System prompts user for a key consisting of one non-numeric character followed by a 3 digit number between 0 and 127 (i.e., they must enter 3 digits – each 0 through 9 ). Assume user enters: x1 y1y2y3 (the key must be of length 4 – you can go with a length between 2 and 4 if you your system does not require them to enter 3 digits for the number between 0 and 127) –the non-numeric character is x1 and the 3 digit number is y1y2y3 . For example, if they enter &106 then x1=&, y1=1, y2=0, y3=6 (for the 3 digit number 106). Note: If they must enter 3 digits, then to enter the number 29 they will enter 029.
• • User then enters (as described in the specifications in section 2) the plain text string w of 10 characters from the keyboard.
• • The string w is first encrypted using Caeser’s cipher with key K=y1y2y3 , i.e., the function f1 , to get the cipher text u = f1(w). (For example, with input key &106, K=106 for this phase of encryption.)
• • The string u (output of previous step encryption) is then encrypted using Vigenere cipher with key K = x1, i.e., the function f2 , to get the cipher text v= f2(y) = f2(f1(w)). (For example, with input key &106, K=& for this phase of encryption.)
• • Question: Figure out how decryption works – you have to apply the inverse functions f1-1 and f2-1 which you can derive based on the descriptions of f1 (Caeser’s cipher) and f2 (Vignere’s cipher) in what follows.
Function f1: Caesar’s Cipher
For the encryption function f1 , you will implement a simple version of a cryptography algorithm called Caesar’s Cipher. The term Caesar’s cipher is an ancient
encryption technique used by Julius Caesar to send secret messages. At its heart is the concept of modulo arithmetic (recall this from your discrete math class and from
a homework assignment). Caesar used it to encrypt the alphabet but the key was fixed at K=3. The shift cipher we use allows the key K to vary.
Conceptually it works as follows. The input (message) plain text to be encrypted is a string of M symbols p1 p2 p3…pM , each symbol pi can take on N different values.
The other part of the input is the encryption key K which must be less than N. To encrypt the message each character/value pi in the input is replaced by the cipher text character/value ci = (pi + K) modulo N.
•In this project, the value of N=128. Therefore, the final cipher text character ci is obtained by adding the key using modulo 128 arithmetic (addition mod 128). In other words, ci = (pi + K) Modulo 128 for each character.
o Example: If the input plain text character string is )r and the encryption key is 6, then in step 1 the program takes ASCII
value of r, 01110010 (x72) which is 114 in decimal, and adds (modulo 128) encryption key 6 to get 120 decimal ( binary 01111000 or x78). This is the ASCII value for x. It takes ASCII value of (, which is x29 =00101001, decimal 41) and adds 6 (modulo 128) to get x2F or decimal 47. This is the ASCII value for / and therefore the string in cipher text is /x
Question: Why are we using modulo 128 ? Should we be using 128 or a different value if the characters have to be alpha numeric or special characters (i.e., they have to be visible characters from the keyboard – cannot be space, tab etc.).
•The decryption algorithm works in exactly the reverse order as the encryption. The user provides the key K, and we apply the algorithm to the cipher text c1 c2 …cn (where n=16 since we are dealing with 16 character strings). We subtract the encryption key K from ASCII code for ci. The subtraction is once again using modulo arithmetic. Therefore, we compute pi = (ci – K ) modulo 128.
o Example: if the cipher text is G and the encryption key is 6, we first subtract encryption key (i.e., we add -6 modulo 128) from 01000111 to get 01000001.
•Note: Recall from cs1311 that (x-y)mod N = (x+ y’) mod N where y’=N-y (mod N) (i.e., y’ is the inverse of y modulo N). Therefore to decrypt we can add the inverse of the key modulo N. For example, (1-4) mod 10 = (1+6) mod 10 since inverse of 4 is 10-4=6. You need to use this property to implement your assembly code LC3, so make sure you understand how this works. Note that since the key K is less
than N, the number N-K is a positive integer less than N. Function f2: One Time Pad (OTP) – Vigenere Cipher version
The one time pad (OTP) encryption method is known to be the most secure form of encryption. In a simplistic definition of this scheme, a string of binary symbols p1 p2 p3…pM of length M is encrypted using a key k1 k2 k3…kM of length M to create a cipher text c1 c2 c3…cM where ci = (pi ki) (where is the exclusive OR operation). If the key is random and sufficiently long, encryption is impossible to break assuming the key is used exactly once. An example of a OTP for a single character from the keyboard (i.e., an ASCII character) would be a 7-bit key (since each ASCII character can be represented as an 7 bit number).
In this project we will use a predecessor (and weaker form) of this method – a Vigenere Cipher – to implement our encryption function f2. In this encryption scheme, for an input plain text string p1 p2 p3…pM of length M (which is 10) the cipher text is computed as c1 c2 c3…cM where ci = (pi K) where K is a 7-bit key (i.e., a 7 bit binary number which can also be interpreted as an integer between 0 and 127), each pi is a symbol/character from the keyboard, and is the bitwise exclusive OR operation. For example, if the input
string is a8 (with ascii codes x61 and x38 respectively or binary strings 01100001 and 00111000) and the key K is x49 (binary 01001001 and corresponds to character I on the keyboard) then the encryption results in the two binary strings (01100001 01001001= 00101000 = x28) and (00111000 01001001 = 0111001 = x71). (Note: LC3 uses 16 bits but in these examples the higher order 8 bits are all set to 0, so we did not include the values of bits 15 through 8). Therefore the two encrypted values, i.e., c1 c2 , are (in binary) 00101000 and 01110001 or x28 and x71 which if printed to the display would print (q.
Question: How do you decrypt a message that has been encrypted using this scheme ?
You will need to work this out to complete the project. Recall properties of XOR operation to figure this out.
(Note: The rotate aspect of an ARX cipher performs a circular rotate of the bit string. For this project, we are not requiring this step. An example of a simple rotate function is a bit shift– each 16 bit value will be shifted left K times where K is the key for this phase and will need to be input by the user.)