I have been reading Knuth’s classic work The Art of Computer Programming. I have been doing some of the exercises in a machine language (ARM assembly code for the Raspberry Pi). You can find those resources on a one-page blog I have created. It’s here.
This exercise, I chose to do in Python!
Simple Encode & Decode (Python)
Knuth, TAOCP, vol. 1, 1.4.2 ex. 7
This simple exercise in encoding and decoding comes in a section on coroutines, but works quite nicely using plain old Python functions. The idea is that we provide a string of characters which can contain only letters and numbers and which is terminated with a full stop.
For each character of the input, if the character is a digit (let’s call it i), then we write the following character i + 1 times (whether it is a digit or a letter). If the character is a letter, we just write it to the output once. Here’s a sample input (not the same as given by Knuth).
Knuth TAOCP 1.4.2
The output is arranged into groups of three characters separated by a space:
AAA B44 44C CCC CCD E77 777 77F GGH HIJ JJJ JJK.
The task Knuth sets is to write a program that will reverse the process. But first it makes sense to write one that performs the encoding. I’ll start with the nuts and bolts part:
# encoder.py # from Knuth vol. 1, 1.4.2 (p195) def main(): print("Knuth TAOCP 1.4.2") a = input("String: ") print(prepare(process(a))) if __name__ == "__main__": main()
The main work happens in two functions, process() and prepare():
def prepare(string): output = '' count = 0 for ch in string: if count == 3: output += ' ' count = 0 output += ch count += 1 return output + '.'
This is very simple, it takes the output from process() and adds the spacing, plus the final full-stop. It’s important to add the spacing before adding the character, or you could end up with an unneeded space at the end.
Process() itself is fairly simple.
def process(string): output = '' i = 0 temp = string[i] while temp != '.': temp1 = string[i+1] if temp.isalpha(): output += temp i += 1 elif temp.isdigit(): output += temp1 * (int(temp)+1) i += 2 temp = string[i] return output
First we set things up before we enter a loop. Then, in the while loop we add characters to the output string according to the rules of the encoding.
A Possible Solution
Once again, I’ll give you the boring bits first.
# decoder.py # Knuth 1.4.2, ex. 7 def main(): print("Knuth 1.4.2, ex. 7") a = input("String: ").replace(' ', '') print(process(a)) if __name__ == "__main__": main()
I’ve taken advantage of the handy Python replace() function to strip out the spaces. This program only has one other function, which I have called “process()”.
def process(string): i = 0 output = '' temp = string[i] while temp != '.': i += 1 count = 0 while temp == string[i+count]: count += 1 if count: output += str(count) + temp i += count else: output += temp temp = string[i] return output + '.'
Once again, I place the first character into a variable called temp before entering the while loop. I update temp with the next character at the end of the loop and this makes it easy to check for when we have finished. Nested within the main while loop is another while loop that counts how many times a character appears in the input string. Knuth specifies that he doesn’t want zeroes to appear in the final string, so we only add a number to the output if it is over 1.
This is one of Knuth’s simpler examples, but it’s quite a nice one.