Inline Integrity

Draft Community Group Report

This version:
https://mikewest.github.io/inline-integrity/
Issue Tracking:
GitHub
Editor:
(Google LLC.)

Abstract

This document defines a mechanism for asserting the provenance and integrity of inline script blocks, conceptually similar to other mechanisms that support assertions about externally-fetched resources. This is not intended to be a stand-alone specification, but should fold into HTML (and potentially SRI).

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

Websites are well-understood to be compositions of resources from a variety of servers, woven together in one origin’s context to satisfy a developer’s goals. These resources are generally requested through elements like script and link, instructing user agents to make explicit requests to other servers on the page’s behalf, and to include those resources as part of the page’s construction:

<script src="https://widgets-r-us.example/widget.js"></script>

Developers, however, often have reasons (performance, privacy, etc.) to avoid asking users' agents to take responsibility for these additional requests. Rather than composing the page at runtime, they might choose to embed code more directly, copy/pasting snippets into inline script blocks, or relying upon layers of infrastructure to inline their dependencies through server-side includes (Fastly implements a subset of [ESI-LANG], for example). Depending on the development team, these dependencies might not even be third-party in the traditional sense, but developed as internally-shared frameworks that are jammed together through internal infrastructure:

<!-- The main document: -->
<esi:include src="https://widgets-r-us.example/widget.include" />
<!-- https://widgets-r-us.example/widget.include -->
<script>
  /* Code goes here. */
</script>

These inlined blocks are a stumbling block for developers who wish to deploy strong protections against injection attacks, as architectural decisions might make it difficult to coordinate nonce attributes or content hashes between inlined scripts and `Content-Security-Policy` headers delivered with the page. Sites might fall back to allowing 'unsafe-inline', or simply forgoing a policy in the first place.

Signatures might provide an option that satisfies developers' needs without additionally complicating deployments. In short, if developers can agree with their dependencies on a (set of) signing key(s), they can encode those relatively static constraints in the page’s content security policy, and validate inlined scripts' signatures against those known-good keys. This provides a proof of provenance for the code in question, allowing developers to ensure the integrity of their supply chain in a dynamic fashion:

<!-- https://widgets-r-us.example/widget.include -->
<script x-inlined-content-signature="ed25519-[base64-encoded signature]"
        x-inlined-content-key="ed25519-[base64-encoded public key]">
  /* Code goes here. */
</script>

These attributes have terrible placeholder names on purpose. See § 3.2 x-inlined-content-signature is a terrible attribute name for some spelling options.

Pages will assert both a signature and a key for a given script or style element, either including both inline, or asserting a set of keys and assuming their applicability to elements that only include a signature. Here, we’ll use the test Ed25519 keys from [RFC9421] to demonstrate:

<script x-inlined-content-signature="ed25519-hyFFWrQ21vPXZDV07Mn17Q3ufvYBJDs23CeYu1hGUQi4D+LN99D9I1KmXBGV5kBZtf8h4JIxBLoBzIqLdpudDg=="
        x-inlined-content-key="ed25519-JrQLj5P/89iXES9+vFgrIy29clF9CC/oPPsw3c5D0bs=">
  alert(1);
</script>

Or:

<meta name="x-inlined-content-key" content="ed25519-JrQLj5P/89iXES9+vFgrIy29clF9CC/oPPsw3c5D0bs=">

<script x-inlined-content-signature="ed25519-hyFFWrQ21vPXZDV07Mn17Q3ufvYBJDs23CeYu1hGUQi4D+LN99D9I1KmXBGV5kBZtf8h4JIxBLoBzIqLdpudDg==">
  alert(1);
</script>
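
As a rough illustration of how the placeholder attribute values above might be decoded, the following Python sketch splits an `ed25519-[base64]` value into its algorithm label and raw bytes. The function name `parse_assertion` and the strict length checks (32 bytes for a public key, 64 for a signature, per [RFC8032]) are assumptions of this sketch; the draft does not yet define a parsing model.

```python
import base64
import binascii
from typing import NamedTuple, Optional

# Decoded lengths of Ed25519 public keys and signatures (RFC 8032).
ED25519_KEY_LEN = 32
ED25519_SIG_LEN = 64


class Assertion(NamedTuple):
    algorithm: str
    value: bytes


def parse_assertion(attr: str, expected_len: int) -> Optional[Assertion]:
    """Parse 'ed25519-<base64>' into (algorithm, bytes); None if malformed.

    Hypothetical helper: the real syntax depends on the spelling that is
    eventually chosen for these attributes.
    """
    algorithm, sep, encoded = attr.strip().partition("-")
    if sep != "-" or algorithm != "ed25519":
        return None
    try:
        value = base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError):
        return None
    if len(value) != expected_len:
        return None
    return Assertion(algorithm, value)
```

Returning `None` for anything unexpected keeps the caller's decision simple: an assertion that cannot be parsed is treated as absent rather than partially trusted.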

Pages can restrict execution of script through reference to these keys:

Content-Security-Policy: script-src 'ed25519-JrQLj5P/89iXES9+vFgrIy29clF9CC/oPPsw3c5D0bs='
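
One way to picture the policy check is as a set-membership test against the key sources listed in `script-src`. The Python sketch below is a simplification under stated assumptions: it looks only at the first `script-src` directive, ignores `default-src` fallback and every other source expression, and the helper names `ed25519_sources` and `key_is_allowed` are hypothetical. The draft does not specify how CSP matching would actually be wired up.

```python
from typing import Set


def ed25519_sources(policy: str, directive: str = "script-src") -> Set[str]:
    """Collect the quoted 'ed25519-...' expressions from one directive."""
    keys: Set[str] = set()
    for serialized in policy.split(";"):
        tokens = serialized.split()
        if not tokens or tokens[0].lower() != directive:
            continue
        for token in tokens[1:]:
            if token.startswith("'ed25519-") and token.endswith("'"):
                keys.add(token[1:-1])  # strip the surrounding quotes
        break  # only the first occurrence of a directive is honoured
    return keys


def key_is_allowed(policy: str, asserted_key: str) -> bool:
    """True if the element's asserted key appears in the policy."""
    return asserted_key in ed25519_sources(policy)
```

For the header above, `key_is_allowed(policy, "ed25519-JrQLj5P/89iXES9+vFgrIy29clF9CC/oPPsw3c5D0bs=")` would be true, while any other key would not match.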

Yay!

2. Framework

2.1. Validation

An element has an invalid inline signature if the following algorithm returns "Invalid" given an element el:

  1. Turn this into a step that produces signatures and keys from el. It’ll change a lot based on the spelling option we choose in § 3.2 x-inlined-content-signature is a terrible attribute name, ranging from replicating the SRI parsing model to walking around a bit to find keys that were placed elsewhere (headers, head, etc.). Assume we end up with those two variables, each of which contains a list of decoded byte sequences.

  2. If signatures is empty, return "Valid".

  3. Let inline content be el’s child text content.

  4. For each signature in signatures:

    1. For each key in keys:

      1. Execute the Ed25519 verification algorithm as defined in Section 5.1.7 of [RFC8032] using key as the public key portion of the verification key material (A), inline content as the message (M), and signature as the signature to be verified.

      2. If verification succeeded, return "Valid". Otherwise continue.

    Note: This means that the algorithm will return "Valid" if any asserted signature matches any required key.

  5. Return "Invalid".

    Note: If no keys are specified, no signatures can be validated. We could punt on this condition in step 2, but it feels reasonable to fail closed if a signature was asserted but couldn’t be validated.
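
The control flow of steps 2 through 5 can be sketched in Python as follows. This is a minimal illustration, not an implementation: step 1 is assumed to have already produced `signatures` and `keys` as lists of byte strings, and the Ed25519 check itself is delegated to a caller-supplied `verify` callable (a hypothetical parameter) that would wrap a real [RFC8032] verifier.

```python
from typing import Callable, Sequence

# (public_key, message, signature) -> True if the signature verifies.
Verifier = Callable[[bytes, bytes, bytes], bool]


def inline_signature_status(
    signatures: Sequence[bytes],
    keys: Sequence[bytes],
    inline_content: bytes,
    verify: Verifier,
) -> str:
    """Return "Valid" or "Invalid" following steps 2-5 of the algorithm."""
    # Step 2: nothing was asserted, so there is nothing to fail.
    if not signatures:
        return "Valid"
    # Step 4: any signature matching any key is sufficient.
    for signature in signatures:
        for key in keys:
            if verify(key, inline_content, signature):
                return "Valid"
    # Step 5: fail closed, including when no keys were supplied.
    return "Invalid"
```

Keeping the verifier as a parameter makes the any-of-any matching and the fail-closed behaviour easy to exercise without depending on a particular cryptographic library.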

2.2. Monkey-patching HTML

The following changes wire up both script and style elements to perform signature validation before being used on a page:

2.2.1. Script execution

We’ll add a new validation step alongside the callout to CSP in step 19 of HTML’s prepare the script element algorithm as follows:

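  1. If el does not have a src content attribute, then return if either of the following statements is true:

    1. The Should element’s inline type behavior be blocked by Content Security Policy? algorithm returns "Blocked" when given el, "script", and source text.

    2. el has an invalid inline signature.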

2.2.2. Style application

We’ll add a new validation step alongside the callout to CSP in step 5 of HTML’s update a style block algorithm as follows:

  1. If the Should element’s inline type behavior be blocked by Content Security Policy? algorithm returns "Blocked" when given the style element, "style", and the style element’s child text content, then return.

  2. If the style element has an invalid inline signature, then return.
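
The two style steps above amount to a pair of sequential gates, which the following Python sketch makes explicit. Both checks are passed in as zero-argument callables (hypothetical parameters standing in for the CSP algorithm and the validation algorithm in § 2.1), so the signature check only runs if CSP has not already blocked the element.

```python
from typing import Callable


def should_use_inline_block(
    csp_blocks: Callable[[], bool],
    has_invalid_signature: Callable[[], bool],
) -> bool:
    """True only if neither CSP nor signature validation rejects the block."""
    if csp_blocks():
        return False  # step 1: blocked by Content Security Policy
    if has_invalid_signature():
        return False  # step 2: signature asserted but not validated
    return True
```

An element with no asserted signature passes the second gate, since the validation algorithm returns "Valid" when there is nothing to check.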

3. Implementation Considerations

3.1. What is being signed?

The signature is asserted over the script or style element’s child text content, which importantly includes both leading and trailing whitespace. That means that <script>alert(1);</script> will have a different signature than <script> alert(1);</script> and <script>alert(1); </script>.
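
The whitespace sensitivity can be seen by extracting the signed message from each variant. The Python sketch below uses the standard library's `html.parser` as a rough stand-in for a real HTML parser, and assumes the child text content is encoded as UTF-8 before signing; the draft does not state an encoding, so treat that as an assumption, along with the helper name `signed_message`.

```python
from html.parser import HTMLParser
from typing import List


class ScriptText(HTMLParser):
    """Collects the raw text inside <script> elements."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.in_script = False
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


def signed_message(markup: str) -> bytes:
    """Bytes a signature would cover (UTF-8 encoding is assumed here)."""
    parser = ScriptText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks).encode("utf-8")
```

The three variants in the paragraph above produce three distinct byte strings, so a signature generated for one will not verify against the others. Build tooling that reindents or minifies inline blocks would therefore need to run before signing.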

3.2. x-inlined-content-signature is a terrible attribute name

It is. We’ll replace it with one that makes sense. Here are some spelling options:

  1. We could embed public keys in integrity attributes, just as we do for remote resources in [SIGSRI], and embed signatures via the as-yet-unused option syntax that [SRI] defines for that attribute:

    <script integrity="ed25519-[base64-encoded public key]?signature=[base64-encoded signature]">
        /* Code goes here. */
    </script>
    

    Beyond aesthetics and legibility, the downside of this approach is that it would not allow us to specify a key elsewhere: each script would embed both key and signature, which is a bit verbose.

    On the other hand, it doesn’t require any new attributes, and allows us to tell a somewhat clear story about the impact of integrity on script execution.

  2. We could embed public keys in integrity attributes, just as we do for remote resources in [SIGSRI], and embed signatures in a new signature attribute:

    <script signature="ed25519-[base64-encoded signature]"
            integrity="ed25519-[base64-encoded public key]">
        /* Code goes here. */
    </script>
    

    This has the upside of reusing an existing attribute, and not requiring us to exercise the option syntax. signature is also quite clear in the context, though it does invite some questions about the interaction between these attributes for subresource fetches that we’d need to answer.

  3. We could create inline variants of integrity and/or signature that apply only to inline blocks and explicitly not to subresource requests:

    <script inline-signature="ed25519-[base64-encoded signature]"
            inline-integrity="ed25519-[base64-encoded public key]">
        /* Code goes here. */
    </script>
    

    or:

    <script inline-integrity="ed25519-[base64-encoded public key]?signature=[base64-encoded signature]">
        /* Code goes here. */
    </script>
    
  4. We could rename everything in some arbitrary way that appeals to someone.
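
To make the first spelling option more concrete, here is a Python sketch that splits a combined `ed25519-[key]?signature=[signature]` value into its two parts. The `name=value` shape of the option and the function name `parse_key_with_signature` are assumptions taken from the example markup in option 1; [SRI] only reserves the `?` option syntax without defining any options.

```python
import base64
from typing import Dict, Optional, Tuple


def parse_key_with_signature(value: str) -> Tuple[bytes, Optional[bytes]]:
    """Split 'ed25519-<key>?signature=<sig>' into (key, signature) bytes.

    The signature is None when no 'signature' option is present.
    Raises ValueError on an unexpected algorithm or bad base64.
    """
    expression, *raw_options = value.strip().split("?")
    algorithm, _, encoded_key = expression.partition("-")
    if algorithm != "ed25519":
        raise ValueError("unsupported algorithm: " + algorithm)
    key = base64.b64decode(encoded_key, validate=True)

    options: Dict[str, str] = {}
    for raw in raw_options:
        # Split on the first '=' only: base64 padding also uses '='.
        name, _, option_value = raw.partition("=")
        options[name] = option_value

    encoded_sig = options.get("signature")
    signature = None
    if encoded_sig is not None:
        signature = base64.b64decode(encoded_sig, validate=True)
    return key, signature
```

Splitting on `?` first is safe because `?` is not part of the standard base64 alphabet, whereas `=` needs care since it doubles as padding.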

Consider this a placeholder while we work through the discussion in WICG/signature-based-sri#10 and WICG/signature-based-sri#42. [Issue #WICG/signature-based-sri#42]

3.3. How does this compare to Signature-Based Integrity?

The signature-based integrity proposal relies upon HTTP Message Signatures [RFC9421] to explain how signatures can be validated over resources requested from a remote server. As the request and response metadata play an important role in how the resource is treated by the user agent, the intermediate signature base concept is a necessary complexity that allows a server to ensure that the signature covers all the relevant data.

Here, we have a simpler task: the entirety of the content to be validated is embedded in the document, available at parse time. We can work with the content directly, as there’s no relevant metadata.

This means that a resource delivered via HTTP will have a different signature than a resource delivered inline, even if the keys used are the same. This is unfortunate, but the alternative of synthesizing a signature base for inline content seems worse in practically every way.

4. Security Considerations

4.1. Integration with CSP

Broadly, the mechanism described here aims to make it easier for developers to deploy protections against unintended injection attacks even while relying upon inlined script or style blocks. It aims to do so in a way consistent with existing protections like [SRI] and [CSP], giving developers a clear path towards more safely including their dependencies.

Rather than allowing 'unsafe-inline', developers will have the option of restricting themselves to

5. Privacy Considerations

5.1. Implications for Content Blocking

Developers inline script in many cases to improve user experience, but script is also inlined for purposes that undercut user agency. It’s more difficult, for example, for extensions and other mediating software to modify or block particular resources when they’re not fetched independently, but are instead part of the document itself.

This proposal has the potential to remove some of the security risk associated with inlined script, but might also be seen as encouraging inlining in ways that could have negative privacy implications. Two considerations mitigate this risk:

  1. This proposal doesn’t create any more encouragement to inline script for the purposes of evading users' intent to block it than the status quo already does. It allows developers to remove one risk associated with inlined content, but that seems quite unlikely to shift incentives to anything near the extent that content-blocking extensions already do.

  2. Tying inline content to a specific public key to prove provenance might provide an additional hook for content-blocking scripts that could allow cleaner identification of a given script’s owner. As these keys are more static than the content itself, this proposal might actually simplify the process of pointing to a specific script in a document as being worthy of additional inspection.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[CSP]
Mike West; Antonio Sartori. Content Security Policy Level 3. URL: https://w3c.github.io/webappsec-csp/
[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119

Informative References

[ESI-LANG]
ESI Language Specification 1.0. 4 August 2001. NOTE. URL: https://www.w3.org/TR/esi-lang
[RFC8032]
S. Josefsson; I. Liusvaara. Edwards-Curve Digital Signature Algorithm (EdDSA). January 2017. Informational. URL: https://www.rfc-editor.org/rfc/rfc8032
[RFC9421]
A. Backman, Ed.; J. Richer, Ed.; M. Sporny. HTTP Message Signatures. February 2024. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc9421
[SIGSRI]
Mike West. Signature-Based Integrity. cg-draft. URL: https://wicg.github.io/signature-based-sri/
[SRI]
Devdatta Akhawe; et al. Subresource Integrity. URL: https://w3c.github.io/webappsec-subresource-integrity/
