Multithreaded C++ implementation of the Vigenère polyalphabetic cipher

lotus 5797876121 Merge branch 'master' of https://git.zerohack.xyz/lotus/vigenere 1 month ago
include 228ca9bba9 added debug and optimized targets in makefile, added --version flag, moved from 8 space tabs to 2 space tabs (c++ is too verbose for 8 spaces!), updated compile_commands for rtags, removed old comments, version bump to 0.2 3 months ago
tests 228ca9bba9 added debug and optimized targets in makefile, added --version flag, moved from 8 space tabs to 2 space tabs (c++ is too verbose for 8 spaces!), updated compile_commands for rtags, removed old comments, version bump to 0.2 3 months ago
.clang_complete 1aaa1e4735 added more paths to clang_complete depending on lib version 2 months ago
.gitignore ffd1962bba add gitignore and fixed makefile 7 months ago
.travis.yml 88300c1a25 added travis build and updated readme with badge 1 month ago
LICENSE.md c1ecaa5a3e updated readme and license 6 months ago
README.md 88300c1a25 added travis build and updated readme with badge 1 month ago
compile_commands.json 228ca9bba9 added debug and optimized targets in makefile, added --version flag, moved from 8 space tabs to 2 space tabs (c++ is too verbose for 8 spaces!), updated compile_commands for rtags, removed old comments, version bump to 0.2 3 months ago
makefile 8bb5c5b0c7 add asan and ubsan to debug target, included .clang_complete for emacs 2 months ago
runner.cpp 228ca9bba9 added debug and optimized targets in makefile, added --version flag, moved from 8 space tabs to 2 space tabs (c++ is too verbose for 8 spaces!), updated compile_commands for rtags, removed old comments, version bump to 0.2 3 months ago
vigenere.cpp 228ca9bba9 added debug and optimized targets in makefile, added --version flag, moved from 8 space tabs to 2 space tabs (c++ is too verbose for 8 spaces!), updated compile_commands for rtags, removed old comments, version bump to 0.2 3 months ago

README.md



Vigenère Cipher

License Build Status

Takes a string of plaintext, encrypts it, then returns a ciphertext.

Motivation

I wanted a research project to familiarize myself with modern C++ concepts. While I hope the code is sound, please don't use this as a replacement for vetted cryptographic ciphers, this cipher is fundamentally vulnerable to cryptanalysis. I found the idea of the cipher interesting and wanted to write my own implementation. I plan on using what I've learned here to write an autokey cipher as well.

Project Goals

  1. Thread safety
  2. Modern C++ style
  3. Interactivity and scriptability
  4. Appropriate file modularity
  5. Follows C++ general guidelines
  6. Use file I/O (read in keyfile, save keyfile, etc.)
  7. Preserve case and symbols

Prerequisites

  • C++11 capable compiler
  • POSIX compatible make (Gmake, Bmake, etc.)

Build

git clone https://git.zerohack.xyz:443/lotus/vigenere.git
cd vigenere
make
# put it somewhere in your $PATH if you want

# if you want to build a debug version
make debug
# if you want to build the optimized version
make opt

Usage

# note: the order of arguments does not matter
vigenere                                                            # run interactively
vigenere -e -i plans.txt                                            # encrypt plans.txt as plans.txt.vc (try to use defaults)
vigenere -e -i plans.txt -t 8                                       # same + manually specify use of 8 threads
vigenere -e -i plans.txt -o ../secret.txt.vc -k ../mynew.key        # encrypt; manually specify output file name+location and keyfile name+location
vigenere -d -i secret.pac -o ../plans.txt -k ../mynew.key           # decrypt; manually specifying output file name+location and keyfile name+location

default keyfile is ./vc.key and default filename adds or removes *.vc

Core Algorithm Notes 🤖

If you want to know more about how this program work, you can read this section. Otherwise feel free to skip over it.

We start off with our ASCII alphabet (including symbols) whos size is 96 characters long.

... a b c d e f h i j k l m n o p q r s t u v w x y z { | } ~

We are going to make a randomized version of this alphabet for the amount of rows the user has specified. So, if the user specified 5 rows, we might get a final tabula recta (i.e. a matrix) that something that looks like this:

          *column 0*
          |
*row 0* - T % f N e F H I J k ...
         Q w E R ^ l y ) & r ...
         A T c e W Y $ i j k ...
         P M , | T F x Z ) ( ...
         L R V B E F K U J P ...

The column numbers associate with the index of our original alphabet, so if we are at position(0,0) a == T. We can now use this "matrix" to encrypt our message.

When stepping through the rows, we do so according to a unique order, a key order. This order is determined at random after we make our tabula recta. By determining the key_order at random for each file, we can more closely abide by Kerckhoffs's principle. The key_order corresponds to a row number. To explain, let's say we want to encrypt our secret message "cab". Assume that our determined key_order is 0, 2, 4, 1, 3. Note that a row value could be repeated. With this key_order our we use row 0 to turn our plaintext "c" into a "f". Moving to our next plaintext letter a and row 2 gives us the letter A. Next we move to row 4 and plaintext letter b gives us letter R

So, with this final "matrix" combined with the key order, the plaintext message cab is equivalent to fAR

To decrypt the message, you need to know both the original matrix, and the key order. The decryption process is simply the reverse of the encryption process.

Also, note that this algorithm does preserve case, meaning that in our earlier example CAB != fAR.

Multithreading

The core algorithm does not change with the addition of multithreading. As of now only encryption and decryption process is multithreaded; and not the other helper functions. The application will detect how many cores your system has available and spawn an appropriate number of threads by default. The idea to allow multithreading is that we want to partition off work for each of the threads we create. We will call the function start_workers() passing in the appropriate work function we wish each newly spawned thread execute. For instance, in our previous example if we want to process the word cab and we can spawn two threads, we could give c to thread α1, and give ab to thread α2. I have dubbed these units of work Chunks, and given them a class to help organize their data. This is an implementation of the map-reduce model.

In short; thread α1 can work on Chunk X, while thread α2 is working on Chunk Y. For small messages this overhead is not worth the effort, but for large messages it helps out immensely. A Chunk hold a couple simple pieces of data:

Chunk X:            Chunk Y:
    text = 'c';         text = 'ab';
    sa = 0;             sa = 1;
    rt;                 rt;

Note that since cab contains an odd number of characters while we have an even number of threads, Chunk Y gets more text to process. sa is the start of alphabet. Think of it as "what key_order do I start with?". This corresponded to the absolute index in the original message. We have to process the Chunks respecting their order in the original message. Otherwise we could end up with garbaled data that we cannot decrypt. In regards to the rt value this is a promise value, which is to be initialized after we complete the work of encrypting our text. For Chunk X, the ct variable will eventually be set to the value of f when its assigned thread completes its work.

If we want to get access to all of the Chunks, we can iterate over split_pt. This vector contains smart pointers to each of the created Chunks. Before we can get the resulting ciphertext to give back to the user, we have to make sure that all threads have completed their work. This is where you can see with iterate over each of the threads and make sure to .join() them.

More detail about the Vigenère Cipher can be found here, or here.

TODO

  • remove using namespace std
  • write out to file
  • allow for arg flags in any order
  • use c++11 random_device
  • move stuff to header file
  • add num of threads as arg flag (e.g. -t 4)
  • UTF8 support

License / Disclaimer

BSD 3-clause License Once again, do not use this as a substitute for well exercised standard ciphers (AES256, Serpent, Twofish, etc..) Don't call me if something blows up.