Thursday, December 29, 2016

[33c3 TLDR] PUFs, protection, privacy, PRNGs

Pol van Aubel gave an excellent introduction to Physical(ly) Unclonable Functions from both a modern and historical perspective. If you want to know how to exploit intrinsic random properties of all kinds of things, give this talk a shot and don't be afraid of all the footnotes.

Takeaway: PUFs are kinda neat and if you're interested check out this talk.

[33c3 TLDR] Decoding the LoRa PHY

Matt reverse engineered the PHY layer of the proprietary and patented LoRa technology and even wrote a GNU Radio plug-in for it.

He also gave an excellent introduction to the basics of LoRa technology.
In fact, it was something of an introduction to the basics of radio technology in general, and I highly recommend watching his talk.


[33c3 TLDR] Making Technology Inclusive Through Papercraft and Sound

A talk by bunnie introducing their Love to Code program, a program for teaching children (and adults too) to code using cheap papercraft-inspired devices and tools. The argument is that the diversity problem in coding can perhaps be fixed by starting at the earliest stages, i.e., starting with children.

From a technical perspective they had to solve some interesting problems to make the technology fast, cheap and usable. In particular, they have a pretty awesome technique using sound to upload code to the microcontrollers they use.

Takeaway: Pretty cool technology for a societally relevant goal. Check it out.

[33c3 TLDR] Tapping into the core

An interesting talk about how to abuse the USB to JTAG bridge available in newer Intel processors to attack computers through their USB port.

It is not entirely clear to me if this is "just" another case of somebody forgetting to turn off a debug facility and leaving it as a security hole, or if there are actually cases in which it cannot be disabled.


[33c3 TLDR] Wheel of Fortune

Another talk about bad RNGs in embedded devices, this time discussing the challenges of embedded systems more generally.

The takeaway might be that these devices really need a hardware RNG even if it costs a little bit extra.

(Similar to Mathy's talk, maybe only watch one of them.)


[33c3 TLDR] Do as I Say not as I Do: Stealth Modification of Programmable Logic Controllers I/O by Pin Control Attack

This talk was about how to abuse a design peculiarity in PLCs: the behavior of the controllers can be changed by simply reconfiguring the I/O pins.

For example, switching a pin from output to input will suppress all writing to that pin without any feedback.
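A minimal sketch of why this works, using a hypothetical memory-mapped pin model (all names here are illustrative, not taken from any real PLC):

```python
# Hypothetical model of a GPIO pin, illustrating why flipping a pin's
# direction register silently suppresses writes. Illustrative only.

class Pin:
    def __init__(self):
        self.direction = "output"  # direction register: "output" or "input"
        self.level = 0             # latched output level

    def write(self, value):
        # The write only reaches the physical pin in output mode;
        # in input mode it is dropped without any error or feedback.
        if self.direction == "output":
            self.level = value

pin = Pin()
pin.write(1)             # legitimate control logic drives the pin high
assert pin.level == 1

pin.direction = "input"  # attacker flips the direction register
pin.write(0)             # the PLC logic "writes" 0, but nothing happens
assert pin.level == 1    # physical output stays high; the logic sees no error
```

The control logic keeps writing and never learns that its writes no longer have any effect, which is exactly what makes the attack stealthy.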


[33c3 TLDR] A Tale of Two Skype Calls

Earlier today we had two great talks focusing on the aftermath of the Snowden revelations:
- 3 Years After Snowden: Is Germany fighting State Surveillance?
- The Untold Story of Edward Snowden’s Escape from Hong Kong

First of all a chilling report on the state of the German surveillance machine detailing how the German intelligence agencies have gotten more money, more capabilities (under the law) and are processing more data. Often without good supervision. It is only because of Germany's attempt at an inquiry, which the two speaking journalists recorded, that we know some of the details of the mass surveillance apparatus.

At the end, instead of the regular Q&A, key witness #1 came and joined us over Skype: Edward Snowden.

Takeaway: things are getting worse, not better, and we need to do more to combat this.


Second, a talk detailing the story of the people who kept the very same Edward Snowden safe during his stay as a refugee in Hong Kong, and their particular brand of self-sacrifice. Being refugees in Hong Kong, they are forced to live in squalor without rights. Government support was stripped away once their identities were known. These people, these guardian angels, now subsist on the donation efforts of third parties. Further attempts are being made to get these seven people out of the country and to a safer place.

For the Q&A, Vanessa, one of the people that helped Snowden, skyped in and answered some of the questions pertaining to her recollection of the events as well as her current status.

Takeaway: the seven refugees that helped Edward Snowden stay hidden during his stay in Hong Kong need assistance. If you want to you can donate at the following places:

Tuesday, December 27, 2016

[33c3 TLDR] How Do I Crack Satellite and Cable Pay TV?

Chris had too much time and, over a span of three years, reverse engineered set-top boxes (the DigiCipher 2 system) with relatively low-cost equipment. In the end he was able to extract the long-term keys and understand the crypto in use (DES with XOR pre-processing of the data/key inputs), so that he can now watch satellite and cable pay TV for free (though, according to him, there is nothing interesting to see there).

Takeaway: reverse engineering was made simpler by the relatively old design without many countermeasures. Never underestimate the effort people are willing to invest to break your system.

Watch his talk on YouTube.

[33c3 TLDR] Everything you always wanted to know about Certificate Transparency -- Martin Schmiedecker

Martin gave a short review of problems with CAs in recent history and then presented the basics of Certificate Transparency, Google's proposed solution to the problem.

The takeaway message is that Certificate Transparency is great, but to make it really effective, clients need to check the public logs before accepting a certificate.

Martin tried to introduce the technical concepts but could not go into more detail. For a nice start, watch the video.

[33c3 TLDR] Predicting and Abusing WPA2/802.11 Group Keys -- Mathy Vanhoef

Mathy gave an excellent talk showing ways to attack wireless security (WPA2) via a group key used for broadcast transmissions. This leads to decryption of all traffic. It included introductions to all the relevant parts of the system and a demo.

The takeaway message is actually rather short:
People standardize and implement bad RNGs, and if those are used to generate cryptographic secrets, it leads to vulnerabilities.

The talk included a few more tricks and gimmicks, and I recommend you watch the video.

Sunday, December 18, 2016

Lightweight Cryptography

The need for lightweight cryptography emerged from the lack of primitives capable of running in constrained environments (e.g. embedded systems). In recent years, there has been an increased focus on the development of small computing devices that have limited resources. Current state-of-the-art cryptographic primitives provide the necessary security when put into constrained environments, but their performance may not be acceptable.
Over the last decade, numerous lightweight cryptographic primitives have been proposed providing performance advantage over conventional cryptographic primitives. These primitives include block ciphers, hash functions, stream ciphers and authenticated ciphers.

Lightweight Block Ciphers

A block cipher provides a keyed pseudo-random permutation that can be used in a more complex protocol. It should be impossible for an adversary with realistic computing power to retrieve the key even if the adversary has access to a black-box model of the cipher where she is able to encrypt/decrypt plaintext of her choice. Block ciphers are normally based on either Substitution-Permutation Networks or Feistel-Networks.

The components and operations in a lightweight block cipher are typically simpler than in normal block ciphers like AES. To compensate for the simpler round functions, the number of rounds is simply increased to achieve the same security. As memory is very expensive, implementing an S-Box as a look-up table can lead to a large hardware footprint. Therefore, lightweight block ciphers usually have small (e.g. 4-bit) S-Boxes. To save further memory, lightweight block ciphers use small block sizes (e.g. 64 or 80 bits rather than 128). Another option is to reduce the key size to less than 96 bits for efficiency. Simpler key schedules improve the memory, latency and power consumption of lightweight block ciphers.
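To illustrate how small such an S-Box is, here is the 4-bit S-Box of PRESENT applied nibble-wise to a 64-bit state in a short Python sketch (the function name is ours; the 16-entry table is the published PRESENT S-Box):

```python
# The 4-bit S-Box of PRESENT, applied nibble-wise to a 64-bit state.
# A 4-bit table needs only 16 entries, versus 256 for an 8-bit S-Box
# like the one in AES -- a big saving in constrained hardware.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def sub_nibbles(state):
    """Apply the S-Box to each of the 16 nibbles of a 64-bit state."""
    out = 0
    for i in range(16):
        nibble = (state >> (4 * i)) & 0xF
        out |= SBOX[nibble] << (4 * i)
    return out

assert sub_nibbles(0x0) == 0xCCCCCCCCCCCCCCCC  # S(0) = 0xC in every nibble
```

In hardware this substitution layer is 16 parallel copies of the same tiny 4-bit circuit, which is where the small gate count comes from.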

Recently, Banik et al. showed an implementation of AES requiring just 2227 GE and a latency of 246/326 cycles per byte for encryption and decryption, respectively. In 2007, Bogdanov et al. proposed PRESENT, an ultra-lightweight block cipher based on a Substitution-Permutation Network that is optimized for hardware and can be implemented with just 1075 GE. PRESENT is bit-oriented and has a hardwired diffusion layer. In 2011, Guo et al. designed LED, an SPN cipher that is heavily based on AES. Interesting in that design is the lack of a key schedule: LED-64 applies the same 64-bit key to the state every four rounds. The 128-bit version simply divides the key into two 64-bit sub-keys and then alternately adds them to the state. Reducing latency is the main goal of the block cipher PRINCE. There is no real key schedule in PRINCE either, as it derives three 64-bit keys from a 128-bit master key. PRINCE is a reflection cipher, meaning that the first rounds are the inverse of the last rounds, so that decryption with a key $k$ is identical to encryption with the key $k\oplus\alpha$, where $\alpha$ is a constant based on $\pi$. The block cipher Midori was designed to reduce energy consumption when implemented in hardware. It has an AES-like structure and a very lightweight almost-MDS involution matrix M in the MixColumn step. In 2013, Simon and Speck were designed by the NSA. Both ciphers perform exceptionally well in both hardware and software and were recently considered for standardization. In contrast to all the other ciphers, no security analysis or design rationale was given by the designers. Simon is hardware-oriented and based on a Feistel-Network with only the following operations: AND, rotation, XOR. Speck is software-oriented and based on an ARX construction with the typical operations: addition, rotation, XOR. Very recently, SKINNY has been published to compete with Simon.
The main idea behind the design is to be as efficient as possible without sacrificing security. SKINNY is a tweakable block cipher based on the Tweakey framework, with its components chosen because of the good compromise they provide between cryptographic properties and hardware costs.
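As an illustration of how cheap Speck's ARX operations are, here is a sketch of one Speck round for 32-bit words (the rotation amounts 8 and 3 follow the published specification for the larger word sizes; this is only a sketch of the round function, not a full implementation with a key schedule):

```python
# One round of Speck's ARX round function (Addition, Rotation, XOR),
# shown for 32-bit words as in Speck64. Sketch only, no key schedule.
MASK = 0xFFFFFFFF  # 32-bit words

def ror(v, r):
    """Rotate a 32-bit word right by r bits."""
    return ((v >> r) | (v << (32 - r))) & MASK

def rol(v, r):
    """Rotate a 32-bit word left by r bits."""
    return ((v << r) | (v >> (32 - r))) & MASK

def speck_round(x, y, k):
    x = (ror(x, 8) + y) & MASK  # modular addition of the rotated word
    x ^= k                      # XOR in the round key
    y = rol(y, 3) ^ x           # mix the second word
    return x, y
```

Three rotations, one addition and two XORs per round is essentially free on any CPU, which is why Speck performs so well in software.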

Lightweight Hash Functions

The desired properties of a hash function are:
  •  Collision resistance: it should not be feasible to find $x$ and $y$, such that $H(x) = H(y)$
  •  Preimage resistance: given a hash $h$, it should be infeasible to find a message $x$ such that $H(x) =  h$
  •  Second preimage resistance: given a message $y$, it should be infeasible to find $x\neq y$ such that $H(x) = H(y)$
Conventional hash functions such as SHA1, SHA2 (i.e. SHA-224, SHA-256, SHA-384, SHA-512, SHA-512-224 and SHA-512-256) and SHA3 (i.e. Keccak) may not be suitable for constrained environments due to their large internal state sizes and high power consumption. Lightweight hash functions differ in various aspects, as they are optimized for smaller message sizes and/or have smaller internal states and output sizes.
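Smaller output sizes come at a price: a generic birthday attack finds a collision after roughly $2^{n/2}$ evaluations of an $n$-bit hash. The sketch below uses SHA-256 truncated to 24 bits as a stand-in for a short-output hash and finds a collision almost instantly:

```python
# Why output size matters: a birthday attack on a hash truncated to
# n bits succeeds after about 2^(n/2) queries. With n = 24, roughly
# 2^12 attempts suffice on average.
import hashlib

def h24(msg):
    """SHA-256 truncated to its first 24 bits (3 bytes)."""
    return hashlib.sha256(msg).digest()[:3]

seen = {}
collision = None
for i in range(1 << 16):         # far fewer than the 2^24 of brute force
    m = i.to_bytes(4, "big")
    d = h24(m)
    if d in seen:
        collision = (seen[d], m)
        break
    seen[d] = m

assert collision is not None
m1, m2 = collision
assert m1 != m2 and h24(m1) == h24(m2)
```

This is why lightweight hash designs must balance a small internal state against the birthday bound on their digest size.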

PHOTON is a P-Sponge based, AES-like hash function with an internal state size of 100 to 288 bits and an output of 80 to 256 bits. The state update function is close to the LED cipher. In 2011, Bogdanov et al. designed SPONGENT, a P-Sponge whose permutation is a modified version of the block cipher PRESENT. SipHash has an ARX structure, is inspired by BLAKE and Skein, and has a digest size of 64 bits.
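The sponge idea behind PHOTON and SPONGENT is simple to sketch: message blocks are XORed into the rate part of the state, a fixed permutation is applied, and the digest is squeezed out afterwards. The toy 16-bit permutation and parameters below are purely illustrative and have no cryptographic strength; real designs use far larger, carefully analyzed permutations:

```python
# A minimal P-Sponge sketch: absorb byte blocks into the rate part of
# the state, permute, then squeeze out the digest. Toy parameters only.
RATE_BITS, CAPACITY_BITS = 8, 8   # 16-bit toy state

def permute(state):
    """Toy mixing permutation (NOT cryptographic): rotate and add."""
    for _ in range(8):
        state = ((state << 5) | (state >> 11)) & 0xFFFF  # rotate left 5
        state = (state + 0x9E37) & 0xFFFF                # add a constant
    return state

def sponge_hash(blocks, out_blocks=2):
    state = 0
    for b in blocks:                        # absorbing phase
        state ^= b << CAPACITY_BITS         # XOR block into the rate part
        state = permute(state)
    out = []
    for _ in range(out_blocks):             # squeezing phase
        out.append((state >> CAPACITY_BITS) & 0xFF)
        state = permute(state)
    return out

digest = sponge_hash([0x61, 0x62, 0x63])    # hash the bytes of "abc"
```

The capacity part of the state never leaves the sponge, which is what the security argument of real sponge constructions rests on.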

Lightweight Stream Ciphers

A stream cipher generates a key stream from a given key $k$ and an initialization vector $IV$, which is then simply XORed with the plaintext to generate the ciphertext. It must be infeasible for an attacker to retrieve the key, even if a large part of the keystream is available to the attacker. The eSTREAM competition, which concluded in 2008, aimed to identify a portfolio of stream ciphers suitable for widespread adoption. Three of the finalists are suitable for hardware applications in restricted environments.

Grain was designed by Hell et al. and is based on two feedback shift registers whose clocking influences each other's update function to make it non-linear. Grain requires 3239 GE in hardware. MICKEY v2 is based on two irregularly clocked LFSRs (linear feedback shift registers). MICKEY v2 requires 3600 GE in hardware. Trivium, also an eSTREAM finalist, has three LFSRs of different lengths. Trivium requires 3488 GE in hardware.
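The common building block of these designs can be illustrated with a toy Fibonacci LFSR in Python. The 16-bit register and tap positions below are a textbook maximal-length example, far too small and too linear to be secure on their own, but they show the keystream-XOR principle:

```python
# A toy LFSR keystream generator, illustrating the building block behind
# Grain, MICKEY and Trivium. Illustrative only -- a bare LFSR is linear
# and trivially breakable; real designs add non-linearity on top.

def lfsr_stream(state, nbits):
    """Yield nbits of keystream from a 16-bit Fibonacci LFSR
    (taps for the maximal-length polynomial x^16+x^14+x^13+x^11+1)."""
    out = []
    for _ in range(nbits):
        out.append(state & 1)  # output the low bit
        fb = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (fb << 15)
    return out

def xor_encrypt(plaintext_bits, key_state):
    """Stream-cipher style: XOR each plaintext bit with the keystream."""
    ks = lfsr_stream(key_state, len(plaintext_bits))
    return [p ^ k for p, k in zip(plaintext_bits, ks)]

pt = [1, 0, 1, 1, 0, 0, 1, 0]
ct = xor_encrypt(pt, 0xACE1)
assert xor_encrypt(ct, 0xACE1) == pt  # decryption is the same operation
```

Because encryption and decryption are the same XOR, the hardware cost of such a cipher is essentially the cost of its shift registers, which is why the GE counts above are so low.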

Lightweight Authenticated Ciphers

The aim of authenticated encryption is to provide confidentiality and integrity (i.e. data authenticity) simultaneously. In 2014, the CAESAR (Competition for Authenticated Encryption: Security, Applicability and Robustness) competition started with the aim to identify a portfolio of authenticated ciphers that offer advantages over AES-GCM and are suitable for widespread adoption.

ACORN is based on six LFSRs and has a state size of 293 bits. ACORN provides full security for both encryption and authentication. The hardware costs should be close to those of Trivium, according to the designers. SCREAM is a tweakable block cipher in the TAE (Tweakable Authenticated Encryption) mode. SCREAM is based on the LS-designs Robin and Fantomas. Bertoni et al. designed Ketje, a lightweight variant of SHA3 (i.e. Keccak). Ketje relies on the sponge construction in the MonkeyWrap mode. The internal state size is only 200 bits for Ketje-Jr and 400 bits for Ketje-Sr. Ascon is an easy-to-implement, sponge-based authenticated cipher with a custom-tailored SPN network. It is fast in both hardware and software, even with added countermeasures against side-channel attacks. Another CAESAR candidate is the 64-bit tweakable block cipher Joltik, which is based on the Tweakey framework. Joltik is AES-like and uses the S-Box of Piccolo and the round constants of LED. The MDS matrix is involutory and non-circulant.

Tuesday, December 13, 2016

Living in a Data Obsessed Society (Part II)

This post is about the event “Living in a Data Obsessed Society” that took place in Bristol on Friday the 2nd of December. If you missed the first part, you can find it here.
The view that decisions based on data are neutral, efficient and always desirable was probably the most challenged during the evening. A very disturbing example was the result of a recent investigation by ProPublica, which found that machine learning software used in the United States to assess the risk that past criminals will reoffend was twice as likely to mistakenly flag black defendants as being more prone to repeat an offence. The same AI-led software was also twice as likely to incorrectly flag white defendants as low risk.
The reasons for these biases remain mostly unknown, as the company responsible for these algorithms keeps them as a trade secret. Even assuming racism was not explicitly hardcoded into the algorithms, Charlesworth and Ladyman reminded the audience that not only humans, but also algorithms, make decisions in conditions that are far from ideal. Machines learn on datasets that are chosen by engineers, and the choice of which data is used in this step also teaches the underlying biases to the algorithms. Predictive programs are at most as good as the data on which they are trained, and that data has a convoluted history that does include discrimination.
There is, then, a big risk of perpetuating a vicious cycle in which people keep being discriminated against because they were in the past and are in the present. Moreover, the entrenching of these biases could be seen as something correct, simply because of the ideological unquestioning of the authority that we give to the rule of data and algorithms in these processes. Because, as the speakers pointed out at several moments, algorithms in general, and machine learning in particular, do not eliminate the human component: humans and human-related data take part in the design steps, but the centralized power related to these choices remains mostly hidden and unchallenged.
Whereas in existing social institutions there is usually a better or worse designed procedure for objecting when errors happen, it is hard to dispute the outputs of Artificial Intelligence. In the words of Ladyman, the usual answer would be something like “we used our proprietary algorithms, trained on our private dataset X of this many terabytes, and on your input the answer was Bla”. He also pointed out the huge shift of power this means from the classical institutions to computer scientists working at big companies, and their probable lack of exposure to the social sciences during their studies and careers.
Assuming we were to set aside the important aspect of dispute resolution, the question should then be framed as whether these algorithms make “better” decisions than humans, as biases are inherent to both. The meaning of “better” is, though, another choice that has to be made, and it is a moral one. As such, society should be well informed about the principles embedded in these new forms of authority, in order to establish what aligns with its goals. A very simple example was given: is it more desirable to have an algorithm that declares guilty people guilty with 99% probability while putting a lot of innocent people in jail, or one that almost never condemns the innocent but lets numerous criminals go free? The same kind of reasoning could be iterated for the discrimination between defendants discussed above.
Someone could perhaps conclude from this text that the speakers were some kind of dataphobes who would not like to see any automation or data ever involved in decision making, but that was not the case. At several points, all of them praised in one way or another the benefits of data and its use in our societies, from healthcare to science to the welfare state. The questioning was about the idealization of data gathering and its algorithmic processing as a ubiquitous, ultimate goal for all aspects of our lives. A critique of its authority based on objectivity claims and, ultimately, a review of the false dichotomy between the moral and empiricist traditions. Charlesworth celebrated the dialogue that has taken place between lawyers and computer scientists over the last years. Nevertheless, he urged that the horizons of these discussions be expanded to include philosophers and the rest of the social sciences as well. Cristianini, being a professor of Artificial Intelligence himself, concluded his intervention with a concise formulation: whatever we use these technologies for, humans should be at the centre of them.

The moral dimension of Cryptography, and its role in rearranging what can be done, by whom and from which data, might come to mind for several readers as well. Since the Snowden revelations, it has become a more and more central debate within the community, which I find best crystallized in the IACR distinguished lecture given by Rogaway at Asiacrypt 2015 and its accompanying paper. Technologies such as Fully Homomorphic Encryption, Secure Multiparty Computation or Differential Privacy can help mitigate some of the problems related to data gathering while retaining its processing utility. All in all, one of Ladyman's conclusions applies here: we should still question who benefits from this processing, and how. A privacy-preserving algorithm that incurs (unintended) discrimination is not a desirable one. As members of society, and as researchers, it is important that we understand the future roles and capabilities of data gathering, storage and processing. From mass surveillance, to biased news, to other forms of decision making.

Tuesday, December 6, 2016

Living in a Data Obsessed Society (Part I)

It’s Friday evening and, inside the Wills Memorial Building in Bristol, almost two hundred people have gathered to hear and discuss our lives in a data obsessed society. The opportunities, as well as some of the risks we are sleepwalking into by living in the current data-driven world, are discussed by three different people with three different perspectives: Nello Cristianini, professor of Artificial Intelligence; Andrew Charlesworth, reader in IT and Law; and James Ladyman, professor of the Philosophy of Science.

Dr. Abigail Fraser, researcher in Epidemiology, introduces the event: it is commonly agreed that different data infrastructures are mediating, or even leading, a broad range of aspects of our daily lives, from communications to transactions to decision-making. Fraser herself recognizes the crucial role of big volumes of data in her work, without which she would not be able to conduct much of her research, if any. But there is also another side to data gathering and the decisions being made based on it, whether directly by humans or with the mediation of human-designed algorithms. There seems to be an unquestioned acceptance of a data ideology, according to which all decisions based on data are inherently neutral, objective and effective. Nevertheless, the growing evidence provided by the speakers disproved many of these assumptions.

The evolution of the data infrastructure

Cristianini is the first to mention during the evening the enormous amount of research that has been put into hardware and software since his teen years in the early 80s. At that time, he would have had to instruct his computer beforehand for it to be able to answer who, for example, Alexander V was. Today, he just has to take out his smartphone and speak to it. Various internals based on Artificial Intelligence (AI) then recognize his speech, look for the answer amongst huge volumes of data and, in a matter of seconds, produce an answer for anyone listening.

Computers can now learn, and what Cristianini demonstrated live was possible not only because of our technological evolution, but also because of the state of our social institutions. We have overcome great challenges thanks to AI and the availability of large volumes of data, but we have not developed our laws, learning and morals in pace with these changes. A series of mostly unquestioned social consequences of the gathering and use of data then became the focus of the event, especially in the cases for which there is no individual consent or informed social consensus. Concrete cases included people being progressively fed only the kind of news they want to hear, the exercise of mass surveillance by several governments, staff selection processes, and banks and insurance companies making decisions using data even from social network profiles.

What can go wrong with massive data gathering?

Charlesworth elaborated the most on the matter of massive data gathering and storage. One of his main concerns was the evolution of law within the United Kingdom, particularly now that the Investigatory Powers Act has been passed by the House of Lords. One of the most infamous parts of this law mandates that internet providers keep a full record of every site their customers have visited for the government’s use.

Different aspects discussed in relation to this issue were the lack of evidence for the effectiveness of mass surveillance, its potential to cause alienation in communities, and the lack of adequate safeguards to protect against misuse or exfiltration of the surveillance data. Evidence supporting these views, amongst others, was addressed to the House of Commons by one of the Public Bill Committees earlier this year, but without much success.

On other grounds, Charlesworth was also concerned by the oft-repeated line that there is nothing to worry about, as these kinds of practices were already taking place outside the rule of law and public scrutiny. The legitimization of mass state surveillance and the extension of powers recently approved are, he explained, something to worry about. After enumerating some of the forty-seven agencies that can access this data without any warrant, he pointed out that the large number of data breaches suffered by UK institutions in the past does not help with trusting this data collection and storage.

Ladyman expanded on the chilling effect on people of being, or feeling, watched. Society is damaged when individuals have to think constantly about the consequences of their communications or web browsing, as it has been shown that people tend to avoid researching or debating matters that challenge established social norms and factual powers. A number of sociologists, philosophers and lawyers have addressed this question, but perhaps one of the most famous experiments is “The Chilling Effects of Surveillance”, conducted by Gregory White and Philip Zimbardo.

In White and Zimbardo’s study, participants were split into two groups: one that was told that their statements would be shared with the police for training purposes, and one that was not. Fear of authorities equating dissent with wrongdoing, a threat or any kind of punishable behaviour made the former group more likely to condemn marijuana usage and to use second- and third-person pronouns in their language. Whereas 77% of the group not given a surveillance warning advocated for legalization, only 44% of those who were told that their data would be shared with the police did so. Moreover, 31% of the monitored participants looked for approval from the people conducting the experiment when talking, whereas only 7% of the other group did so.

Last but not least, Ladyman remarked that the way in which we gather, store and represent data, as well as how we decide which of this information is to be retained, is not neutral. One of his side comments, of which the cryptologic community is very aware, was that eliminating some potentially sensitive or discriminatory entry from a database does not mean that it cannot be inferred through correlations within the same dataset or with the help of external information.

Next post will continue with the other main concern of the discussion: Are exclusively data-based decisions always objective and desirable?