Friday, June 10, 2016

The Enterprising Researcher: Innovation, Entrepreneurship and University IP

Let's have a break from pure cryptography for a moment. Lift your head from the security proof you are trying to come up with, or from the poly-time factoring algorithm you are coding. Now look around: each object surrounding you has passed through a lot of stages before becoming the commercial product you stare at. There is also a good chance that many of them were once ideas in published research papers, probably similar to those we are all planning to write at some point: papers carrying not only academic work but also content that, through effort, development and vision, ended up as an object in our everyday life.

"The Enterprising Researcher: Innovation, Entrepreneurship and University IP" is a workshop I attended, organised by Bristol Basecamp. It shed light on the process an idea needs to undergo to become a product and showed that the word "entrepreneurship" carries a stronger meaning than just "commercialising an idea" (valuable by itself). It suggests a sense of innovation and creation: like an artist who uses brushes to outline figures (and more importantly emotions), the entrepreneur combines resources and ideas to shape new markets and products (from which, most importantly, desires of people arise). This poetic point of view was particularly evident in the first talk.

Harry Destecroix and Laurent Chabanne are respectively CEO and CSO of Ziylo, a spin-off of the University of Bristol with expertise in continuous carbohydrate sensing technology. They shared their experience in bringing an academic idea to the market: successes, problems, obstacles and satisfactions are all part of the start-upper's life! Above all, I was impressed by how much passion and dedication they put into their work, so I will often quote their own words to briefly describe their story, which is really worth listening to.

Their journey began with a scientific discovery (sugar-sensing platforms using synthetic receptors to isolate glucose-like structures) which was the outcome of research carried out by Professor Anthony Davis' research group at the University of Bristol. We are in the academic world for now, hence the usual laws governing scientific publications still rule. Entering the market is different: first of all, it requires money. Since the idea was born in the university, the first source of help came from the inside. Research and Enterprise Development (RED) provides support and assists the growth of entrepreneurs from within the university. This is not just the step where you need money, though: it is also the phase during which awareness and information about business are vital. As they said, you need to "find complementary skill sets to yours", for instance in someone who is an expert in marketing, because "it costs a lot of money and time to write a business plan" (and it’s not something they teach you in school, I may add). Eventually, you need to "stop talking about tech and do marketing". I also enjoyed the human factor around this point, as they suggested starting these kinds of adventures with people you trust, with friends! It reminds us that start-ups as well as huge corporations are first of all human communities.

Another crucial point is that the trip is not easy. Now they are successfully working with SETsquared and Innovate UK, but raising funding was a real issue, and there may be moments in which the project seems to be failing: "you need to get used to the fact that you can lose your job at any time. If you’re not ok with that, don't bother". The important thing is to always have a positive attitude and to have faith in what you're doing; after all, "without downs you don't know when you're up".

Funding is just one (although quite steep to climb) obstacle you might encounter on your way to the market. I would like to mention another one they described, as there is a very important entrepreneurship lesson to learn there. Being a company in biochemistry, they needed labs: expensive and dedicated structures, not the kind listed on Rightmove! Instead of just coping with their specific problem, they identified a potential market and (very recently) founded NS Science, a commercial real estate company focused on providing labs and science facilities to businesses. I would like to stress the moral about "being an entrepreneur" lying behind this example: they didn't just commercialise a service, but they shaped the need of a community. Now that their company is around, other people who have similar issues may feel the desire to produce something new thanks to NS Science's facilities. Even better: NS Science may inspire other people who put aside their ideas because of a lack of spaces, for instance. Let me give a final (rather famous) example of the "needs/desires creation" aspect of entrepreneurship: did you need iPads before they were invented? What changed apart from the invention of the iPad itself? I believe that, after scratching the surface of possible answers, the outcome could be incredibly surprising.

The second talk took a step back: how to publish research ideas and related legal aspects. Kathryn Smith, Research Engagement Librarian, focused on the importance of making research papers easily available through green open access, that is to say repositories of manuscripts that differ from the published version (even just in the formatting, sometimes) so that they can be freely released. This is motivated by the fact that some entities may have restricted access to research outcomes (industry, public sector, charity foundations…) and that "you don't know who's out there: investors, new collaborators, people who could possibly develop your work in ways you hadn't even thought of". In our specific field, ePrint is an example of green open access. The University of Bristol also has its own green open access repository called Pure, whose team also checks for possible legal incompatibilities with the journals the paper is published in. In this regard, she pointed to a really useful search engine for manuscript publication policies of journals: SHERPA/RoMEO. For example, the picture shows the result of querying "LNCS", where being a "RoMEO green journal" means that you "can archive pre-print and post-print or publisher's version/PDF".

The third talk also focused on legal aspects: Sue Sunstrom, Head of Commercialisation and Impact Development, answered several questions about Intellectual Property (IP) and how it relates to the university. First of all, there exist different kinds of IP (trademark, copyright, patent, trade secret, design...). They offer different features and are applicable to different (either concrete or abstract) objects. Since protection can be quite expensive, choosing the right form of IP (hence the right form of protection) is crucial. But defence against competitors is not the only reason why IP matters: investors like it because it is proof that you own something different and possibly innovative. It makes the object it protects valuable in some situations (think of industrial secrets, for instance) and, like every form of value, it can be used as a currency in commercial trades. The interest of universities in IP on the outcomes of research stems from various motivations: developing practical applications (hence having an impact on society), attracting funding (for new research, hence possibly new IP, entering a virtuous circle) and strengthening the link with industry are just a few of them.

The last talk was given by Prof Alan Palmer on translational life science research, exemplified by the process of commercialising drugs. Such a topic is closely tied to the biomedical field and is about the academic and industrial efforts made in order to develop new solutions for prevention, diagnosis, and therapies. I am intentionally brief here: just have a look at Prof Palmer’s LinkedIn profile to get a feeling of who an entrepreneur is!

Back to our beloved crypto, I think that this event has added a brick to my personal awareness (and I hope yours, after this reading) about what is going on out there, following what AvonCrypt had started. That was the perfect occasion to see how these realities exist in our field too (and I have the feeling it is particularly fertile). Thanks to this event, I understood what probably happened to those companies behind the scenes and I also found out about another example (my bad I didn't know it before!): there is a start-up in Bristol called KETS which has recently won a prize and related funding for their usage of "quantum cryptography to improve data encryption, ensuring information is safe in all situations, from bank transactions to critical infrastructure, and to individuals shopping online from the comfort of their own home".

As Harry and Laurent said during their talk, "start-upping involves learning" so "take a book and read about business and economics": a suggestion I will certainly follow.

(Thanks to Matthias, Simon and Marie-Sarah for comments and corrections)

Monday, June 6, 2016

A visit to the National Museum of Computing in Bletchley Park

Last Friday, I went to the National Museum of Computing (TNMOC) in Bletchley Park, home of the British Government Code and Cypher School during World War II. It was hosting a special event to celebrate two new acquisitions: a Lorenz teleprinter recently found on eBay and a Lorenz SZ42 cipher machine on long-term loan from the Norwegian Armed Forces Museum. TNMOC now has all of the key parts (either original or rebuilt) used in the process of encryption, interception, and decryption of Lorenz messages sent by the German High Command during WWII. Five women of the WRNS (Women's Royal Naval Service, whose members are often called "wrens") who operated Colossus and the relatives of others who contributed to breaking Lorenz attended this special event.

John Whetter, one of the leaders of the team at TNMOC that rebuilt the British Tunny (in the background), holds a Spruchtafel, next to the Lorenz machine on loan from Norway.

The Lorenz cipher

The story of Lorenz, Bill Tutte, and Tommy Flowers is perhaps less well known than the story of Enigma and Alan Turing. The Enigma machine encrypted messages sent among units of the German army, navy, and air force. It had 3 or 4 rotors and operated directly on an alphabet of 26 letters, which were then transmitted in Morse code.

The Lorenz SZ42, on the other hand, was custom-built for the German High Command to send the most important strategic messages to its Field Marshals. It was more complex and less portable than an Enigma machine. The Lorenz cipher could handle letters, punctuation, and spacing: each character was encoded as 5 bits according to the Baudot code. The machine had 12 wheels, each with a number of cams (pins) on it. The numbers of cams on the wheels were co-prime. The "key" was in two parts: the starting position of each wheel ("wheel setting"), and the pattern of raised or lowered cams on each wheel ("wheel pattern"). The wheel settings were supposed to be changed for each message, while the wheel patterns were changed only infrequently during the first few years. By the time the wheel patterns began to change more frequently, however, Colossus II was operational and could find them.
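
To see why co-prime wheel sizes are a sensible design choice, here is a minimal Python sketch. The wheel sizes are toy values picked for illustration, not the real SZ42 sizes, and the sketch assumes that every wheel advances by one position per character, which simplifies how the real machine stepped. Under that assumption, the combined wheel positions repeat after the least common multiple of the wheel sizes, and that equals their product exactly when the sizes are pairwise co-prime.

```python
from functools import reduce
from math import gcd


def combined_period(wheel_sizes):
    """Number of steps before all wheels return to their start together.

    Assumes (for illustration only) that every wheel advances by one
    position per character, so the period is the least common multiple.
    """
    return reduce(lambda a, b: a * b // gcd(a, b), wheel_sizes, 1)


def simulate_period(wheel_sizes):
    """Brute-force check: step the wheels until every position is 0 again."""
    positions = [0] * len(wheel_sizes)
    steps = 0
    while True:
        positions = [(p + 1) % n for p, n in zip(positions, wheel_sizes)]
        steps += 1
        if all(p == 0 for p in positions):
            return steps


# Toy sizes, not the real SZ42 wheel sizes.
coprime_wheels = [3, 4, 5]  # pairwise co-prime: period is 3 * 4 * 5
shared_factor = [4, 6]      # share the factor 2: period is shorter than 4 * 6

print(combined_period(coprime_wheels), simulate_period(coprime_wheels))
print(combined_period(shared_factor), simulate_period(shared_factor))
```

With co-prime sizes no steps are "wasted": the wheels run through every combination of positions before the pattern repeats, so the keystream has the longest period the hardware allows.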

The entire process, from sending a message to decrypting it, went roughly as follows.

1. Setting up to send the message

  • The sending operator in Berlin picks six pairs of letters at random from a prepared sheet.
  • He or she types them in to the teleprinter (without the Lorenz machine attached). The output is a paper tape with punched holes corresponding to the Baudot encoding of the letters.
  • Next, the operator uses a board of wheel settings (a Spruchtafel) to determine the starting position of the Lorenz SZ42's rotors. Each of the letters corresponds to a number.

Lorenz SZ40 (Tunny) indicator reading board: German Lorenz operators consulted a Spruchtafel to determine which wheel settings (starting positions) to use based on a given 12-letter indicator.

2. Encrypting and sending the message

  • Now, the teleprinter operator in Berlin hooks up the Lorenz encryption machine to the teleprinter and types the plaintext message.
  • The encrypted message is output on the same perforated paper tape, again encoded with the Baudot code.
  • The paper tape corresponding to the 12-letter indicator and the ciphertext is fed to a radio transmitter, which broadcasts it.

3. Intercepting the message

  • Radio receivers at an intercept station at Knockholt, Kent (south-east of London) pick up the encrypted message.
  • The faint signals are fed to an undulator, which uses an ink pen to record a continuous trace of the signal on a strip of paper tape, the "slip".
  • Slip readers (people, not machines) translate the highs and lows on the slip to characters according to the Baudot code. To minimize errors, two or more slip-readers read each transmission.
  • The characters are typed in to a perforator that produces another strip of paper upon which the characters are encoded in Baudot code.
  • The intercepted message is sent to Bletchley Park (100 km away) in two ways: by secure landline and by motorcycle courier.

4. Decrypting the message

  • The perforated tape is fed to Colossus, which outputs the most likely wheel settings (Colossus I) and wheel patterns (Colossus II onwards).

The input to the Colossus machine is perforated paper tape with characters in 5-bit Baudot code.

WWII-era cryptography vs. modern cryptography

I went to TNMOC with Thyla van der Merwe, another PhD student at Royal Holloway, to speak to the guests for a few minutes about cryptography today and how it works now compared to how it worked in the WWII era.

Thyla and Marie-Sarah next to TNMOC's rebuilt Colossus.

Thyla explained the benefits of using a stream cipher, like the Lorenz cipher—they're fast, they don't propagate ciphertext errors, and they require only small buffers. These properties made it appropriate for encrypting radio transmissions. She pointed out how ordinary citizens of the WWII era probably didn't use encryption, while today, it is ubiquitous: everyone who's been online or had a cell phone has used it.

I talked about what makes "modern" cryptography different. At the time of WWII, public-key cryptography had not yet been discovered, so sharing keys for any kind of symmetric protocol was still hard. Cryptography in that era also didn't have the precise definitions, clear assumptions, and rigorous security reductions we have today. (Katz and Lindell's textbook does a wonderful job of explaining these three features of modern cryptography.) Although these more formal aspects of modern cryptography are powerful, their strength in the real world is limited in two ways. First, they may not capture all of the information or capabilities an attacker may have (e.g., side-channel attacks). Second, and maybe even more importantly, they come with the assumption that protocols are implemented and used exactly as they should be.

For example, cryptographers know how important it is that a stream cipher (like the Lorenz cipher) never re-uses the same keystream for different messages, because the XOR of two ciphertexts would equal the XOR of the two plaintexts. If the two messages are similar, then keystream re-use is particularly dangerous. This mistake is exactly what led cryptanalysts at Bletchley Park to decrypt two long messages and obtain 4000 characters of keystream: in August 1941, a long message was retransmitted with a few minor changes, but with the same key settings. Within a few months, cryptanalyst John Tiltman had recovered the keystream. By January 1942, Bill Tutte had fully reverse-engineered the Lorenz machine... without ever having seen it!
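
The keystream re-use problem is easy to reproduce. The following is a minimal Python sketch, not a model of the Lorenz machine: it works on bytes rather than 5-bit Baudot characters, the two messages are made up, and the keystream is simply random bytes. It shows that the keystream cancels out when two ciphertexts are XORed, and that knowing (or guessing) one plaintext then reveals both the other plaintext and the keystream.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# Two similar messages of equal length (made-up examples).
p1 = b"ATTACK AT DAWN ON THE EAST FLANK"
p2 = b"ATTACK AT DUSK ON THE WEST FLANK"

# One random keystream, wrongly used for both messages.
keystream = os.urandom(len(p1))
c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# The keystream cancels: c1 XOR c2 == p1 XOR p2.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(p1, p2)

# Zero bytes in the leak mark positions where the plaintexts agree.
differing = [i for i, b in enumerate(leak) if b != 0]
print("plaintexts differ at positions:", differing)

# Anyone who knows or guesses p1 recovers p2 and the keystream.
assert xor_bytes(leak, p1) == p2
assert xor_bytes(c1, p1) == keystream
```

The more similar the two messages are, the more zero bytes appear in the leak, which is why a retransmission with only minor changes is such a gift to a cryptanalyst.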

The operator or implementer of a cryptographic protocol that uses a stream cipher may not understand how important it is that the keystream never be re-used, or may simply make a mistake. This type of mistake was not confined to WWII. In the late 1990s, the IEEE 802.11 standard specified the WEP (Wired Equivalent Privacy) protocol for Wi-Fi networks. WEP uses the stream cipher RC4 to encrypt traffic between access points (wireless routers) and mobile stations (all devices wirelessly connected to the network). Partly due to the WEP protocol's design, and partly due to how the access points' keys tended to be managed in practice, the same RC4 keystream was frequently re-used in implementations. (The key supplied to RC4 was a concatenation of a 24-bit IV, sent in plaintext along with each encrypted message, and a 40-bit shared secret key, which was rarely changed.) Read more about WEP's shortcomings in Borisov, Goldberg, and Wagner's MobiCom 2001 paper.
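
A short calculation gives a feeling for how quickly a 24-bit IV runs out. Since the shared secret key rarely changes, two packets with the same IV are encrypted with the same RC4 keystream. The Python sketch below assumes that each packet's IV is drawn uniformly at random, which is an assumption made here for illustration (the text above does not say how IVs were chosen), and computes the birthday-style probability that at least two packets share an IV.

```python
import math


def iv_collision_probability(num_packets: int, iv_bits: int = 24) -> float:
    """Probability that at least two of num_packets packets share an IV,
    assuming each IV is drawn uniformly at random (an assumption here)."""
    iv_space = 2 ** iv_bits
    if num_packets > iv_space:
        return 1.0  # pigeonhole: more packets than distinct IVs
    # Log of the probability that all IVs are distinct.
    log_all_distinct = sum(
        math.log1p(-i / iv_space) for i in range(num_packets)
    )
    return 1.0 - math.exp(log_all_distinct)


for n in (1_000, 5_000, 20_000):
    print(f"{n:>6} packets: P(IV repeat) = {iv_collision_probability(n):.3f}")
```

Under this assumption, a repeat becomes more likely than not after only a few thousand packets, far fewer than the 2^24 possible IVs might suggest.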

Modern cryptography may offer many new tools, definitions, and rigorous proofs, but some things will never change: designing protocols that are secure in the real world is still really hard, and breaking cryptographic schemes still requires a great deal of creativity, analysis, and dedication.

More about the Lorenz story

Determining how the Lorenz machine worked was only the first step. Tommy Flowers, an engineer at the Post Office Research Station, designed and built an emulator, "Tunny," of the Lorenz machine. An entire section at Bletchley Park (the "Testery," named after the section head, Ralph Tester) was devoted to decrypting the messages—which they did by hand for the first 12 months. Max Newman and Tommy Flowers designed and built machines to speed up the decryption process: the "Heath Robinson" and "Colossus". Colossus was the first electronic digital machine that was programmable (with plugs and switches). Heath Robinson and Colossus were operated (and named, actually) by members of the WRNS.