Programming Assignment Two: PA2


Due Wednesday, May 6 @ 11:59pm

Hackers abound and you have been asked to write a crypto program to encrypt/decrypt data. The program will be able to read the data to be encrypted either from a file specified on the command line or from stdin (data could either be typed in at the keyboard or stdin redirected from a file or via a pipe). The encrypted data will be written to stdout which can be redirected to a file.

The program will ask the user to enter a pass phrase of at least 8 characters, two 32-bit crypto keys, and a rotation key [-63 <--> +63]. The program will XOR the pass phrase and the two 32-bit crypto keys to form a 64-bit crypto mask that will be used to XOR the data in 8 byte chunks (two 32-bit register operations per 8 byte chunk) plus individual single byte masks for the trailing bytes. After each 8 bytes of data is encrypted with the 64-bit mask, the mask will be rotated according to the rotation key value (rotating left if the rotation key is positive and rotating right if the rotation key is negative).

The purpose of this assignment is to learn more SPARC assembly and become familiar with more useful Standard C Library routines. In particular, we will use load and store instructions to access memory locations and bit-wise instructions (shifts and masks) to implement various operations, allocate local variables on the runtime stack, perform 64-bit operations across two 32-bit registers maintaining this data in an array of two unsigned longs in memory, convert some C code you wrote in PA1 into equivalent assembly routines in PA2 (strToLong.c --> strToULong.s), reuse an assembly routine you wrote in PA1 (checkRange.s), access global data across C and assembly, read potentially non-ASCII input from either stdin or a file (fread()) and write potentially non-ASCII output to stdout (fwrite()), write user prompts to stderr and read expected input from stdin, parse a line into multiple input values/tokens (strtok_r()), and just lots of other good C Preprocessor and C language uses.

We will use a mixture of our own routines and Standard C Library routines to write prompts [fprintf() to stderr] to the user and read user input [fgets()] from stdin, tokenize user input for multiple input data [strtok_r()], validate the user input [strlen(), strToULong(), checkRange()], open [fopen()] and read blocks of (possibly non-ASCII) data [fread()] from a disk file or stdin (may be redirected or piped), and write blocks of (possibly non-ASCII) data [fwrite()] to stdout (which may be redirected to a file). Some of the Standard C Library routines we called from C in PA1 we will use/call from assembly this time [strtoul(), fprintf(), perror()]. Appropriate error checking and reporting is required. Example runs are given below.

NOTE: We are not trying to develop a great, generally usable crypto program to save the world. There are plenty of better crypto algorithms -- this is just a contrived exercise to use some of the routines and bit-wise operations detailed above. If anyone is truly interested in crypto systems, I would be happy to direct you to books, classes, and faculty who specialize in this area.

Grading Breakdown

README: 10 points
Compiling (supplied Makefile; no warnings): 10 points
Style (Including Comments): 30 points
Correctness: 50 points

Extra Credit: 5 points total
  1) Early Turnin: 3 points
      3 points if last turnin dated before Monday, May 4 @ 11:59pm
      2 points if last turnin dated before Tuesday, May 5 @ 11:59pm

  2) Optimization: 2 points
      Must correctly pass at least 70% of test cases to be eligible for Extra Credit
      Filling delay slots with useful instructions (eliminating nops)
      At least 80% of nops need to be filled to get full "filling delay slots" Extra Credit
      Sometimes there are many ways to perform a task, some ways are more optimal
      (and worth more Extra Credit points) than others

NOTE: If what you turn in does not compile, you will receive 0 points for this assignment. The files you turn in must compile with the supplied template Makefile in order for your programming assignment to be graded.

The Makefile for PA2 (~/../public/Makefile-PA2) creates an executable named mycrypt, so you will use this program name vs. the default a.out name.

A sample stripped executable for you to try is available at


~/../public/pa2test

Let us start off by looking at some examples (bold indicates what you type):

ieng9.ucsd.edu% mycrypt
Usage: mycrypt filename | -
ieng9.ucsd.edu% mycrypt - > test.encrypted
Enter the passphrase [at least 8 chars]: CS30 Rules
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: -5 0xCAFEBABE
Enter the rotation key [-63 <-> +63]: -43
This is my first crypto test.
This text could also be in a file that I specify on the command line.
The '-' on the command line means read from stdin instead of a file.
We keep reading input from the keyboard in this mode until the user
types Control-D (in Unix) as the first characters on a line.
^D [Control-D typed here.]
To decrypt our encrypted file, we run our mycrypt program on the encrypted file with the same passphrase, 32-bit keys, and rotation key:
ieng9.ucsd.edu% mycrypt test.encrypted
Enter the passphrase [at least 8 chars]: CS30 Rules
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: -5 0xcafebabe
Enter the rotation key [-63 <-> +63]: -43
This is my first crypto test.
This text could also be in a file that I specify on the command line.
The '-' on the command line means read from stdin instead of a file.
We keep reading input from the keyboard in this mode until the user
types Control-D (in Unix) as the first characters on a line.
We can even encrypt binary files like our mycrypt executable:
ieng9.ucsd.edu% mycrypt mycrypt > mycrypt.encrypted
Enter the passphrase [at least 8 chars]: Hello World
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: 075242 0xDEADBEEF
Enter the rotation key [-63 <-> +63]: 19
ieng9.ucsd.edu% mycrypt mycrypt.encrypted > mycrypt.encrypted.decrypted
Enter the passphrase [at least 8 chars]: Hello World
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: 075242 0xDEADBEEF
Enter the rotation key [-63 <-> +63]: 19
ieng9.ucsd.edu% diff mycrypt mycrypt.encrypted.decrypted
ieng9.ucsd.edu%
Beware -- you probably do not want to send the encrypted bytes to the screen because they are most likely non-ASCII control characters.

If you give a different passphrase, set of keys, or rotation key to decrypt an encrypted file, you will not get the original back.

As always, we need to perform appropriate error handling and reporting:
ieng9.ucsd.edu% mycrypt - > test
Enter the passphrase [at least 8 chars]: Hi
Passphrase must be at least 8 chars long; Try again.
Enter the passphrase [at least 8 chars]: Hi There
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: 14
Only 1 key entered. You must enter 2 keys.
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: 1n4 abc
"1n4" is not an integer
"abc" is not an integer
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: 14 999999999999999999999999
Converting "999999999999999999999999" base "0": Result too large
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: 14 098
"098" is not an integer
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: 14 0xfgh
"0xfgh" is not an integer
Enter the two 32-bit keys [octal, decimal, or hex]
separated by a space or tab: 14 0xabcdef
Enter the rotation key [-63 <-> +63]: -99
Rotation key must be within the range of [-63 <-> 63]
Enter the rotation key [-63 <-> +63]: 64
Rotation key must be within the range of [-63 <-> 63]
Enter the rotation key [-63 <-> +63]: ^D
The user can always exit the program by typing Control-D on Unix as the first character sequence on an input line. So at any point the user is asked for input (passphrase, 32-bit keys, rotation key), the user can exit without performing any encryption.

Strategy

Start Early!
The function prototypes for the various C and Assembly functions are as follows:

C routines

int main( int argc, char *argv[] );
void getPassPhrase( char passPhrase[] );
void getKeys( unsigned long keys[] );
int getRotateValue( void );

Assembly routines:

void maskPassPhrase( unsigned long keys[], char passPhrase[], unsigned long mask[] );
void mycrypt( FILE *inFile, unsigned long mask[], int rotateValue );
unsigned long strToULong( char* str, int base );
int checkRange( long value, long minRange, long maxRange );
void rotate( unsigned long mask[], int rotateCnt );

Other important parts of your mycrypt.h header file may include:

#define PASS_PHRASE_SIZE 8
#define WHITE_SPACE_CHARS " \t\n"

#define MIN_ROTATE -63
#define MAX_ROTATE +63

Here is some information on the various Standard C Library functions (see the man pages for more info):

The C routines
main.c
int main( int argc, char *argv[] );
This is the main program driver. Check for a filename on the command line and open it with fopen(). If argv[1][0] is '-', input will be coming from stdin. Output usage message if argc != 2. Change stdout to be unbuffered so stdout and stderr output is intermixed properly when we redirect both to a file for testing. This can be done with the line:

(void) setvbuf( stdout, NULL, _IONBF, 0 );
Get the pass phrase [getPassPhrase()], the two 32-bit keys [getKeys()], and the rotation key [getRotateValue()] from the user. Then create the 64-bit mask based on the pass phrase and two 32-bit keys [maskPassPhrase()]. And finally read and encrypt the data [mycrypt()]. Be sure to close any files you opened.

The two 32-bit keys and the 64-bit mask are each stored in an array of two 32-bit unsigned longs. The pass phrase is an array of 8 (PASS_PHRASE_SIZE) chars.
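
The input selection described above can be sketched in C roughly as follows. This is only a minimal sketch: openInput() and closeInput() are hypothetical helper names that are not part of the required prototypes, and your main() may well do this inline.

```c
#include <stdio.h>

/* Hypothetical helper (not a required PA2 routine): returns stdin when the
 * argument starts with '-', otherwise the file opened for reading.
 * Returns NULL if fopen() fails so the caller can report it with perror(). */
FILE *openInput(const char *arg)
{
    if (arg[0] == '-') {
        return stdin;
    }
    return fopen(arg, "rb");
}

/* Hypothetical helper: close only what we opened ourselves, never stdin. */
void closeInput(FILE *inFile)
{
    if (inFile != NULL && inFile != stdin) {
        (void) fclose(inFile);
    }
}
```

main() would only call something like this after confirming argc == 2, since argv[1] is not valid otherwise.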

getPassPhrase.c
void getPassPhrase( char passPhrase[] );
This function prompts the user (via stderr) for a passphrase, reads the user's input with fgets() (via stdin), and makes sure the pass phrase entered is at least PASS_PHRASE_SIZE chars long, not including the newline char. If it isn't at least this long (it could be longer), output an error message and reprompt until a valid length pass phrase has been entered or the user hits Control-D to indicate EOF. If the user hits Control-D, exit the program immediately. If you get valid input, copy PASS_PHRASE_SIZE chars into the passPhrase[] array back in main() referenced/pointed to by the passPhrase[] parameter/pointer.
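
The length check and copy can be sketched like this. The helper name extractPassPhrase() and its 1/0 return value are assumptions made for illustration; the prompting, reprompting, and EOF handling around it are left out.

```c
#include <string.h>

#define PASS_PHRASE_SIZE 8

/* Hypothetical helper: line is the buffer fgets() filled in (it may end
 * with '\n'). Returns 1 and fills passPhrase[] when the phrase is long
 * enough, 0 when it is too short. */
int extractPassPhrase(const char *line, char passPhrase[])
{
    /* Number of chars before the newline (or before '\0' if none). */
    size_t len = strcspn(line, "\n");

    if (len < PASS_PHRASE_SIZE) {
        return 0;
    }

    /* Copy exactly PASS_PHRASE_SIZE chars; no terminating '\0' is stored. */
    memcpy(passPhrase, line, PASS_PHRASE_SIZE);
    return 1;
}
```

Note that passPhrase[] ends up as 8 raw chars rather than a C string, which matches its use as 8 bytes of mask material.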

getKeys.c
void getKeys( unsigned long keys[] );
This function prompts the user (via stderr) to enter two 32-bit keys. Again use fgets() (via stdin) to read the line the user entered and parse the line into tokens [strtok_r()] that will be converted to unsigned longs [strToULong()]. Display an error message and reprompt on any invalid input. Again, exit immediately if the user types Control-D to indicate EOF. If you get two good 32-bit keys, store them in the keys[] array back in main() referenced by the keys[] parameter. Ignore extra keys (beyond the first two) entered by the user.
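
Here is a small sketch of the tokenizing step with strtok_r(). splitKeys() is a hypothetical helper name; converting each token and printing the error messages are not shown.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* makes strtok_r() visible in strict modes */
#endif
#include <string.h>

#define WHITE_SPACE_CHARS " \t\n"

/* Hypothetical helper: splits line in place and stores pointers to at most
 * two tokens in tokens[]. Returns how many tokens were found (0, 1, or 2);
 * anything after the second token is ignored. */
int splitKeys(char *line, char *tokens[2])
{
    char *savePtr = NULL;
    int count = 0;
    char *tok = strtok_r(line, WHITE_SPACE_CHARS, &savePtr);

    while (tok != NULL && count < 2) {
        tokens[count++] = tok;
        tok = strtok_r(NULL, WHITE_SPACE_CHARS, &savePtr);
    }
    return count;
}
```

A return value of 1 corresponds to the "Only 1 key entered" case in the sample run. Because strtok_r() writes '\0' chars into the line, it must be given a writable buffer such as the one fgets() filled.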

getRotateValue.c
int getRotateValue( void );
This function gets the rotation key from the user using the same basic mechanisms used in the above two routines. In addition to getting a valid number [strToULong()], make sure it is within (inclusive) MIN_ROTATE and MAX_ROTATE range [checkRange()]. Return the valid rotate value. Again, exit immediately if the user types Control-D to indicate EOF. Check to make sure the user actually entered a value (don't accept an Enter alone). You can check the length of the string entered by the user using strlen().
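
It may seem odd to validate a signed rotation key with an unsigned conversion. The sketch below shows one way it can work: strtoul() accepts a leading minus sign and returns the negated value as an unsigned long, so casting back to long recovers the sign on the usual two's complement machines. parseRotateValue() is a hypothetical name, it calls strtoul() directly rather than your strToULong(), and it uses a plain comparison where your code would call checkRange().

```c
#include <errno.h>
#include <stdlib.h>

#define MIN_ROTATE -63
#define MAX_ROTATE +63

/* Hypothetical helper: str is the user's input with the newline already
 * removed. Returns 1 and stores the value in *out when str is an integer
 * within [MIN_ROTATE, MAX_ROTATE] inclusive; returns 0 otherwise. */
int parseRotateValue(const char *str, int *out)
{
    char *endPtr;
    unsigned long raw;
    long value;

    if (str[0] == '\0') {
        return 0;                    /* Enter alone is not accepted */
    }

    errno = 0;
    raw = strtoul(str, &endPtr, 0);
    if (errno != 0 || endPtr == str || *endPtr != '\0') {
        return 0;                    /* overflow or not an integer */
    }

    value = (long) raw;              /* "-43" arrives negated; cast restores it */
    if (value < MIN_ROTATE || value > MAX_ROTATE) {
        return 0;
    }

    *out = (int) value;
    return 1;
}
```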

The Assembly routines
maskPassPhrase.s
void maskPassPhrase( unsigned long keys[], char passPhrase[], unsigned long mask[] );

This function will XOR the first 4 bytes of the pass phrase with the first 32-bit key (keys[0]) and store this result in the upper 32 bits of the 64-bit mask (mask[0]), and XOR the second 4 bytes of the pass phrase with the second 32-bit key (keys[1]) and store this result in the lower 32 bits of the 64-bit mask (mask[1]). This is not as long or as complicated as it may seem, but does involve loads and stores from/to memory locations pointed to by the parameters.
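
In C terms the routine amounts to something like the sketch below. Two assumptions are baked in: uint32_t stands in for the 32-bit unsigned long on our SPARC machines, and the 4 pass phrase bytes are packed with the first char in the most significant byte, which is what a single word load does on big-endian SPARC. The names packWord() and maskPassPhraseSketch() are made up for this illustration.

```c
#include <stdint.h>

/* Pack 4 chars into a 32-bit word, first char in the most significant
 * byte (the layout a word load sees on big-endian SPARC). */
static uint32_t packWord(const char *p)
{
    return ((uint32_t)(unsigned char) p[0] << 24) |
           ((uint32_t)(unsigned char) p[1] << 16) |
           ((uint32_t)(unsigned char) p[2] << 8)  |
            (uint32_t)(unsigned char) p[3];
}

/* Sketch of maskPassPhrase(): mask[0] is the upper 32 bits of the 64-bit
 * mask and mask[1] is the lower 32 bits. */
void maskPassPhraseSketch(const uint32_t keys[2], const char passPhrase[8],
                          uint32_t mask[2])
{
    mask[0] = packWord(&passPhrase[0]) ^ keys[0];
    mask[1] = packWord(&passPhrase[4]) ^ keys[1];
}
```

In assembly this is two word loads from passPhrase, two from keys, two XORs, and two stores, with no byte packing needed.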

strToULong.s
unsigned long strToULong( char* str, int base );

Simply convert your strToLong.c routine from PA1 into assembly and call strtoul() instead of strtol().

Passing a base of 0 to strtol()/strtoul() allows for either decimal, octal, or hexadecimal input. If the string starts with a 1-9 character the input is assumed to be decimal as if the value of base was 10. If the string starts with a 0 character the input is assumed to be octal as if the value of base was 8. If the string starts with a 0x or 0X the input is assumed to be hexadecimal as if the value of base was 16.
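
To see how base 0 and the usual strtoul() error checks interact, here is a minimal C sketch. The name parseULong() and its 0/-1 status return are assumptions for illustration only; keep whatever return and error-reporting convention your PA1 strToLong.c used.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts str with strtoul(). Returns 0 and stores
 * the result in *out on success, -1 if the value is out of range or str
 * is not entirely a valid integer in the given base. */
int parseULong(const char *str, int base, unsigned long *out)
{
    char *endPtr;
    unsigned long value;

    errno = 0;                          /* must be cleared before the call */
    value = strtoul(str, &endPtr, base);

    if (errno != 0) {
        return -1;                      /* e.g. ERANGE: result too large */
    }
    if (endPtr == str || *endPtr != '\0') {
        return -1;                      /* no digits, or trailing junk */
    }

    *out = value;
    return 0;
}
```

With base 0, "098" is rejected because the leading 0 selects octal and conversion stops at the 9, and "0xfgh" is rejected because conversion stops at the g. Both leave *endPtr pointing at a char other than '\0'.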

checkRange.s
int checkRange( long value, long minRange, long maxRange );
This is the same checkRange.s you wrote in PA1. Just copy it over to your pa2 directory. Code Reuse!!!

rotate.s
void rotate( unsigned long mask[], int rotateCnt );

This function rotates the current 64-bit mask (mask[]) by rotateCnt places. If rotateCnt is positive, rotate left; if rotateCnt is negative, rotate right. Only the lower 6 bits of rotateCnt should be used as the rotation count.
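
One way to think about a 64-bit rotate held in two 32-bit words is one bit at a time: the bit shifted out of one word becomes the bit shifted into the other. The C sketch below does exactly that. It assumes uint32_t stands in for our 32-bit unsigned long, and it assumes the count is the magnitude of rotateCnt masked to 6 bits; the name rotateSketch() is hypothetical. Your assembly version can be organized differently.

```c
#include <stdint.h>

/* Sketch of rotate(): mask[0] is the upper 32 bits, mask[1] the lower 32.
 * Assumption: the count is |rotateCnt| & 0x3F; the sign picks direction. */
void rotateSketch(uint32_t mask[2], int rotateCnt)
{
    int left = (rotateCnt >= 0);
    unsigned count = (unsigned)(left ? rotateCnt : -rotateCnt) & 0x3Fu;

    while (count-- > 0) {
        if (left) {
            uint32_t outHi = mask[0] >> 31;   /* bit leaving the upper word */
            uint32_t outLo = mask[1] >> 31;   /* bit leaving the lower word */
            mask[0] = (mask[0] << 1) | outLo;
            mask[1] = (mask[1] << 1) | outHi; /* wraps around to bit 0 */
        } else {
            uint32_t outHi = mask[0] & 1u;
            uint32_t outLo = mask[1] & 1u;
            mask[0] = (mask[0] >> 1) | (outLo << 31); /* wraps to bit 63 */
            mask[1] = (mask[1] >> 1) | (outHi << 31);
        }
    }
}
```

For example, rotating { 0x80000000, 0x00000001 } left by 1 gives { 0x00000000, 0x00000003 }, and rotating by 32 in either direction simply swaps the two words.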

mycrypt.s
void mycrypt( FILE *inFile, unsigned long mask[], int rotateValue );

This function does the encrypting/decrypting of the data using the keys and masks we got from the user. In a loop, read a block of data (BUFSIZ bytes at a time) from inFile with fread(). While there is more data to read (fread() returns 0 when it did not read any items), perform the following. Encrypt 8 byte chunks of the data with the 64-bit mask (XOR the first 4 bytes of data with mask[0] and the second 4 bytes of data with mask[1]). Rotate the 64-bit mask with the rotation key using rotate(). Then do the same for the next 8 bytes of data, and so on until there are fewer than 8 bytes of data left in your input buffer. At this point you have to encrypt the remaining bytes in the buffer one at a time, being sure to XOR each byte with the next byte of the 64-bit mask. For example, if there are 5 bytes left to encrypt, XOR the first byte with the most significant byte of mask[0], XOR the next byte with the next most significant byte of mask[0], etc. The last byte to encrypt will be XORed with the most significant byte of mask[1] in this example. We will discuss how to do this and many other seemingly difficult operations in discussion sections.
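
The per-buffer work can be sketched in C as follows. Several things here are assumptions for illustration: uint32_t stands in for our 32-bit unsigned long, the mask is applied byte by byte starting from the most significant byte of mask[0] (which matches word XORs on big-endian SPARC), the rotate helper uses a 64-bit C integer for brevity instead of two registers, and the mask is not rotated after the trailing bytes. The names encryptBuffer(), maskByte(), and rotateMask() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rotate(): count is |rotateCnt| & 0x3F, sign picks direction. */
static void rotateMask(uint32_t mask[2], int rotateCnt)
{
    uint64_t m = ((uint64_t) mask[0] << 32) | mask[1];
    unsigned count = (unsigned)(rotateCnt < 0 ? -rotateCnt : rotateCnt) & 0x3Fu;

    if (count != 0) {
        m = (rotateCnt > 0) ? (m << count) | (m >> (64 - count))
                            : (m >> count) | (m << (64 - count));
    }
    mask[0] = (uint32_t)(m >> 32);
    mask[1] = (uint32_t) m;
}

/* Byte k (0..7) of the mask, counting from the most significant byte of
 * mask[0] down to the least significant byte of mask[1]. */
static unsigned char maskByte(const uint32_t mask[2], size_t k)
{
    return (unsigned char)(mask[k / 4] >> (24 - 8 * (k % 4)));
}

/* XOR len bytes of buf in place: full 8 byte chunks first (rotating the
 * mask after each one), then the trailing bytes one at a time. */
void encryptBuffer(unsigned char *buf, size_t len, uint32_t mask[2],
                   int rotateValue)
{
    size_t i = 0;
    size_t k;

    while (len - i >= 8) {
        for (k = 0; k < 8; k++) {
            buf[i + k] ^= maskByte(mask, k);
        }
        rotateMask(mask, rotateValue);
        i += 8;
    }

    for (k = 0; i < len; i++, k++) {
        buf[i] ^= maskByte(mask, k);    /* fewer than 8 bytes remain */
    }
}
```

Because XOR undoes itself, running the same routine again with the same starting mask and rotation key restores the original bytes, which is why one program both encrypts and decrypts.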

When all the bytes in the input buffer have been encrypted, write the encrypted data out to stdout with fwrite().
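
The overall read/write loop has roughly the following shape in C. This is only a sketch of the block I/O: copyBlocks() is a hypothetical name, it takes the output stream as a parameter so it can be tried on any pair of streams, and the comment marks where the encryption of each block would go.

```c
#include <stdio.h>

/* Sketch of the block loop: read up to BUFSIZ bytes at a time as items of
 * size 1, process them, and write the same number of bytes back out.
 * Returns the total number of bytes written. */
size_t copyBlocks(FILE *in, FILE *out)
{
    unsigned char buf[BUFSIZ];
    size_t total = 0;
    size_t n;

    /* fread() returns the number of items read; 0 means nothing more. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        /* ... encrypt the n bytes in buf here ... */

        if (fwrite(buf, 1, n, out) != n) {
            break;                      /* write error; caller can report it */
        }
        total += n;
    }
    return total;
}
```

Using fread()/fwrite() with a byte count rather than string routines matters here because the data may contain '\0' and other non-ASCII bytes.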

See the sample pa2test in ~/../public.

The Unit Tests

The unit tests for this assignment are:

testgetPassPhrase.c
testgetKeys.c
testgetRotateValue.c
testmaskPassPhrase.c
teststrToULong.c - very similar to pa1's but with base 0; test octal and hex
testcheckRange.c - same as pa1's
testrotate.c
testmycrypt.c

You are given testrotate.c in the public dir.

The public Makefile-PA2 has the rules for all the unit tests similar to what was done in PA1.

Turn In, Due Wednesday night, May 6 @ 11:59pm

Start early and proceed with baby steps. Don't try to write the entire program all at once. Use small sample/test input data and files as you test/debug your program.

You can use od -x filename to display the bytes of a (binary) file as hex ASCII characters to help aid in debugging your program.

Use the turnin program to turn in the following:


Using stderr, stdout, and errno in Assembly:

In main.c:

Define a ***global*** variable

	FILE *stdError = stderr;

A *global* variable is defined *above* main() and not inside the body of main().

Then in your Assembly file:

	set stdError, %o0
	ld  [%o0], %o0

Now stderr is in %o0 as the first argument for fprintf().

A symbol in assembly is an address. Therefore in assembly, the name of a
global variable (symbol) is the address of where that variable has been allocated.

Do the same for stdout; you need access to stdout in assembly (mycrypt.s).

The same is true with the global variable errno defined in errno.h. errno
is already defined so we do not define errno anywhere in our program. We
just access it.

These were also discussed in class and in the class notes.