Sponsored by the UK Research Institute in Verified Trustworthy Software Systems and the Royal Holloway Centre for Doctoral Training in Cyber Security.

The project sought to prove that an inherently dynamic language is best served by a dynamic approach to verification. We therefore chose to use test generation by dynamic symbolic execution (DSE) to systematically cover paths through programs and check security properties along those paths. While this prevents generating proofs, it ensures that all paths executed are feasible with respect to the execution environment. The two main lines of work were to improve DSE for real-world JavaScript code and to develop a flexible specification methodology for security properties.

JavaScript Symbolic Execution

ExpoSE, the open source DSE engine for JavaScript that we developed as part of the project, is able to automatically generate test cases for most Node.js programs without modifications. The main challenge faced by ExpoSE is that it needs to reason about the effects of the rich JavaScript semantics, including its vast standard library. In particular, most JavaScript programs process strings in some form and often rely on regular expressions, which are often a limiting factor for any kind of automated analysis.

In this project, we devised the first complete strategy to automatically reason about JavaScript regular expressions. We encoded the semantics of regular expression operations using string constraints and classical regular expressions and we devised a refinement scheme to address the problem of matching precedence and greediness: regular expressions will consume as many input characters as possible, from left to right. This constraint has so far been ignored in related work and existing solvers for regular expressions.

Our survey of over 400,000 JavaScript packages from the NPM software repository shows that one fifth makes use of complex regular expressions features. We implemented our encoding and refinement scheme in ExpoSE and evaluated it on 1,131 Node.js packages, demonstrating that the encoding is effective and can increase line coverage by up to 30%. We used line coverage here as a proxy metric for the effectiveness of the encoding: it demonstrates that more parts of the program can be reached, increasing the analysis surface for detecting bugs and vulnerabilities. We describe the encoding, refinement, and evaluation in an article published at PLDI 2019. ExpoSE is available on GitHub.

JavaScript Security Annotations

Several security-relevant properties, such as data provenance or integrity, cannot be written as assertions in JavaScript itself. We therefore developed an approach to attach additional security annotations to a language. This lightweight metatheory of security annotations allows tests to encode properties not present in the program state. We demonstrated the consistency of our system in a statically typed lambda calculus (published at PEPM 2018) and then followed up with a fully dynamic reference implementation for JavaScript. We specified a partial fragment of the WebCrypto API in terms of Security Annotations and demonstrated how to use it to detect security vulnerabilities. This work was published at ESORICS 2019.

Outputs

  • Blake Loring, Duncan Mitchell, and Johannes Kinder. Sound Regular Expression Semantics for Dynamic Symbolic Execution of JavaScript. In Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI), pp. 425–438, ACM, 2019. [ PDF ]
  • Duncan Mitchell and Johannes Kinder. A Formal Model for Checking Cryptographic API Usage in JavaScript. In Proc. European Symposium on Research in Computer Security (ESORICS), pp. 341–360, Springer, 2019. [ PDF ]
  • Duncan Mitchell, L. Thomas van Binsbergen, Blake Loring, and Johannes Kinder. Checking Cryptographic API Usage with Composable Annotations. In ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM), 2018. [ PDF ]
  • Blake Loring, Duncan Mitchell, and Johannes Kinder. ExpoSE: Practical Symbolic Execution of Standalone JavaScript. In Proc. Int. SPIN Symp. Model Checking of Software (SPIN), pp. 196–199, ACM, 2017. [ PDF ]
  • ExpoSE Dynamic Symbolic Execution Engine for JavaScript [ GitHub ]