The Origin API

Unofficial Proposal Draft

This version:
https://mikewest.github.io/origin-api
Issue Tracking:
GitHub
Inline In Spec
Editor:
(Google)

Abstract

An Origin object might be nice to have.

Status of this document

1. Introduction

The origin is a fundamental component of the web’s implementation, essential to both the security and privacy boundaries which user agents maintain. The concept is well defined across the HTML and URL standards, as are widely used adjacent concepts like "site".

Origins, however, are not directly exposed to web developers. Though there are various origin getters on various objects, each of those returns the ASCII serialization of an origin, not the origin itself. This has a few negative implications. Practically, developers attempting to do same-origin or same-site comparisons when handling serialized origins often get things wrong in ways that lead to vulnerabilities (see PMForce: Systematically Analyzing postMessage Handlers at Scale (Steffens and Stock, 2020) as one illuminating study). Philosophically, it seems like a missing security primitive that developers struggle to polyfill accurately.
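To make the failure mode concrete, the following sketch contrasts a prefix check on a serialized origin, of the kind a message handler might perform, with a comparison of parsed origins. The trusted origin, the attacker host, and both function names are hypothetical and chosen only for illustration.

```typescript
// Hypothetical trusted origin, used only for illustration.
const TRUSTED = "https://example.com";

// Flawed: a prefix match on the serialized origin also accepts hosts
// that merely begin with the trusted string.
function looksTrustedByPrefix(serializedOrigin: string): boolean {
  return serializedOrigin.startsWith(TRUSTED);
}

// Safer: parse both values and require the serialized origins to be
// exactly equal. Input that does not parse is rejected.
function isTrustedByParsing(serializedOrigin: string): boolean {
  try {
    return new URL(serializedOrigin).origin === new URL(TRUSTED).origin;
  } catch {
    return false;
  }
}

const attacker = "https://example.com.evil.test";
console.log(looksTrustedByPrefix(attacker));             // true
console.log(isTrustedByParsing(attacker));               // false
console.log(isTrustedByParsing("https://example.com"));  // true
```

Even the parsing variant still compares strings. Two distinct opaque origins both serialize to "null", so string equality cannot tell them apart.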

We can address this gap in the platform by introducing an Origin object that encapsulates the origin concept, and provides helpful methods for comparison, serialization, parsing, and so on.

2. The Origin Object

This section is probably best read as a patch to HTML, adding a new subsection to 7.1.1 Origins. If there’s support for the idea, we should turn it into a PR rather than a standalone document. [whatwg/html Issue #11534]

[Exposed=*]
interface Origin {
  constructor();

  static Origin? from(any value);

  readonly attribute boolean opaque;

  boolean isSameOrigin(Origin other);
  boolean isSameSite(Origin other);
};

An Origin object has an [[origin]] internal slot, which holds an origin.

The new Origin() constructor steps are:
  1. Set this’s [[origin]] to a unique opaque origin.
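The consequence of this step is that every constructed object holds its own opaque origin, which is same origin only with itself. A minimal model of that behaviour, assuming a hypothetical OpaqueOriginModel class that is not the proposed interface:

```typescript
// Hypothetical model: an opaque origin has no readable components, so
// object identity is the only way to tell two of them apart.
class OpaqueOriginModel {
  readonly opaque = true;

  isSameOrigin(other: OpaqueOriginModel): boolean {
    return this === other;
  }
}

const first = new OpaqueOriginModel();
const second = new OpaqueOriginModel();
console.log(first.isSameOrigin(first));  // true
console.log(first.isSameOrigin(second)); // false
```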

The static from(value) method accepts an arbitrary value, and either returns a newly constructed Origin object if an origin can be extracted from value, or throws a TypeError otherwise:
  1. If value is a platform object:

    1. Let origin be the result of executing value’s extract an origin algorithm.

    2. If origin is not null, then return a new Origin object whose [[origin]] is set to origin.

  2. If value is a string:

    1. Let parsed url be the result of running the basic URL parser on value.

    2. If parsed url is not failure, then return a new Origin object whose [[origin]] is set to parsed url’s origin.

    Note: Unlike URL.parse, Origin.from(String) doesn’t accept a base, but requires the serialization of an absolute URL. That seems clearer, and guides developers towards constructing a URL object in some well-understood way that’s distinct from their work with Origin.

  3. Throw a TypeError.

    Note: We fall through to throwing a TypeError here if we fail to extract an origin from a given platform object or string.
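The control flow of these steps can be sketched as follows. This is an approximation under stated assumptions, not the proposed implementation: OriginModel, Extractable, extractOrigin(), and originFrom() are hypothetical names, "platform object" is modelled as any object exposing extractOrigin(), and an origin is reduced to its serialization plus an opaque flag.

```typescript
// Hypothetical stand-in for the origin concept.
interface OriginModel {
  serialized: string;
  opaque: boolean;
}

// Hypothetical stand-in for a platform object with an extraction algorithm.
interface Extractable {
  extractOrigin(): OriginModel | null;
}

function isExtractable(value: unknown): value is Extractable {
  return typeof value === "object" && value !== null &&
    typeof (value as Extractable).extractOrigin === "function";
}

function originFrom(value: unknown): OriginModel {
  // Step 1: platform objects delegate to their extraction algorithm.
  if (isExtractable(value)) {
    const origin = value.extractOrigin();
    if (origin !== null) return origin;
  }
  // Step 2: strings must parse as absolute URLs, since no base is supplied.
  if (typeof value === "string") {
    let url: URL | null = null;
    try {
      url = new URL(value);
    } catch {
      // Parse failure: fall through to step 3.
    }
    if (url !== null) {
      return { serialized: url.origin, opaque: url.origin === "null" };
    }
  }
  // Step 3: nothing could be extracted.
  throw new TypeError("Cannot extract an origin from the given value");
}

console.log(originFrom("https://example.com:8443/path").serialized);
// "https://example.com:8443"
```

A relative string such as "/path" fails to parse without a base, so it reaches step 3 and throws.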

The opaque attribute getter steps are to return true if this’s [[origin]] is an opaque origin, and false otherwise.

The isSameOrigin(other) method steps are to return true if this’s [[origin]] is same origin with other’s [[origin]], and false otherwise.

The isSameSite(other) method steps are to return true if this’s [[origin]] is same site with other’s [[origin]], and false otherwise.

Note: This is a same site, not schemelessly same site, comparison.
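The difference between the two comparisons can be illustrated with a small model restricted to tuple origins. Everything here is an assumption made for illustration: the TupleOrigin shape, the function names, and in particular SUFFIXES, which is a drastically shortened stand-in for the Public Suffix List that omits its wildcard and exception rules.

```typescript
interface TupleOrigin {
  scheme: string;
  host: string;
  port: number | null;
}

// Hypothetical, heavily abbreviated public suffix list.
const SUFFIXES = ["co.uk", "com", "uk"];

// Registrable domain: the longest matching suffix plus one more label,
// or null if the host is itself a suffix or matches no suffix.
function registrableDomain(host: string): string | null {
  const labels = host.split(".");
  for (let i = 0; i < labels.length; i++) {
    if (SUFFIXES.includes(labels.slice(i).join("."))) {
      return i === 0 ? null : labels.slice(i - 1).join(".");
    }
  }
  return null;
}

function isSameOrigin(a: TupleOrigin, b: TupleOrigin): boolean {
  return a.scheme === b.scheme && a.host === b.host && a.port === b.port;
}

function isSameSite(a: TupleOrigin, b: TupleOrigin): boolean {
  // The scheme is part of the site: this is not a schemeless comparison.
  if (a.scheme !== b.scheme) return false;
  // Fall back to the full host when there is no registrable domain.
  return (registrableDomain(a.host) ?? a.host) ===
         (registrableDomain(b.host) ?? b.host);
}

const app: TupleOrigin = { scheme: "https", host: "app.example.com", port: null };
const api: TupleOrigin = { scheme: "https", host: "api.example.com", port: 8443 };
const plain: TupleOrigin = { scheme: "http", host: "app.example.com", port: null };

console.log(isSameOrigin(app, api)); // false
console.log(isSameSite(app, api));   // true
console.log(isSameSite(app, plain)); // false
```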

2.1. Origin Extraction

Platform objects have an extract an origin operation which returns null unless otherwise specified.

The following implementations should be colocated with their definitions in HTML and elsewhere. [whatwg/html Issue #11534]

2.1.1. URL

An object implementing the URL interface’s extract an origin operation runs the following steps:
  1. Return this’s origin.

2.1.2. Origin

An object implementing the Origin interface’s extract an origin operation runs the following steps:
  1. Return this’s [[origin]].

2.1.3. ExtendableMessageEvent

An object implementing the ExtendableMessageEvent interface’s extract an origin operation runs the following steps:
  1. Return this’s origin.

    This doesn’t actually work, as ExtendableMessageEvent holds a serialized origin, not an origin. We’ll need to adjust step 6.2.2 of ServiceWorker’s postMessage(message, options) accordingly.

2.1.4. HTMLHyperlinkElementUtils

An object implementing the HTMLHyperlinkElementUtils mixin’s extract an origin operation runs the following steps:
  1. If this’s url is null, return null.

  2. Return this’s url’s origin.

2.1.5. Location

An object implementing the Location interface’s extract an origin operation runs the following steps:
  1. If this’s relevant Document is non-null, and its origin is not same origin-domain with the entry settings object’s origin, then return null.

  2. Return this’s url’s origin.

Note: As Location is potentially accessible cross-origin, we need to maintain its getters' security checks here.

2.1.6. WindowOrWorkerGlobalScope

An object implementing the WindowOrWorkerGlobalScope mixin’s extract an origin operation runs the following steps:
  1. If this’s relevant settings object’s origin is not same origin-domain with the entry settings object’s origin, then return null.

  2. Return this’s relevant settings object’s origin.

Note: As WindowProxy objects are potentially accessible cross-origin, we need to perform a security check here before granting access to the global scope’s origin.
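The guard used here and in the Location steps above can be sketched for tuple origins as follows. The TupleOrigin shape and the function names are hypothetical. The same origin-domain logic is written from HTML's definition, in which the domain component is non-null only after document.domain has been set.

```typescript
// Hypothetical model of a tuple origin, including its domain component.
interface TupleOrigin {
  scheme: string;
  host: string;
  port: number | null;
  domain: string | null;
}

function sameOrigin(a: TupleOrigin, b: TupleOrigin): boolean {
  return a.scheme === b.scheme && a.host === b.host && a.port === b.port;
}

function sameOriginDomain(a: TupleOrigin, b: TupleOrigin): boolean {
  // Both domains set: schemes and domains must match; host and port are ignored.
  if (a.domain !== null && b.domain !== null) {
    return a.scheme === b.scheme && a.domain === b.domain;
  }
  // Neither domain set: fall back to a same origin check.
  if (a.domain === null && b.domain === null) {
    return sameOrigin(a, b);
  }
  // Exactly one domain set: never same origin-domain.
  return false;
}

// Guarded extraction: null is returned, rather than the origin, when the
// caller (entry) is not same origin-domain with the target.
function extractFromGlobal(target: TupleOrigin, entry: TupleOrigin): TupleOrigin | null {
  return sameOriginDomain(target, entry) ? target : null;
}

const page: TupleOrigin = { scheme: "https", host: "app.example", port: null, domain: null };
const other: TupleOrigin = { scheme: "https", host: "other.example", port: null, domain: null };

console.log(extractFromGlobal(page, page) === page); // true
console.log(extractFromGlobal(other, page));         // null
```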

2.1.7. WorkerLocation

An object implementing the WorkerLocation interface’s extract an origin operation runs the following steps:
  1. Return this’s WorkerGlobalScope’s url’s origin.

2.1.8. MessageEvent

An object implementing the MessageEvent interface’s extract an origin operation runs the following steps:
  1. Return this’s relevant settings object’s origin.

3. Security Considerations

The isSameSite(other) method exposes each particular user agent’s understanding of an origin’s site. As this understanding generally depends on a specific snapshot of the Public Suffix List [PSL], the delineation of a site can and does differ between user agents, and even between versions of one user agent. Developers are encouraged to exercise caution when making decisions based on sites, and are likewise encouraged to rely upon isSameOrigin(other) when making security decisions.
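As an illustration of how the answer can depend on the snapshot in use, the sketch below evaluates the same pair of hosts against two suffix lists. Both lists are hypothetical and drastically shortened, the function names are invented for this example, and the schemes are assumed to be equal.

```typescript
// Registrable domain under a given suffix snapshot: the longest matching
// suffix plus one more label, or null if there is none.
function registrableDomain(host: string, suffixes: string[]): string | null {
  const labels = host.split(".");
  for (let i = 0; i < labels.length; i++) {
    if (suffixes.includes(labels.slice(i).join("."))) {
      return i === 0 ? null : labels.slice(i - 1).join(".");
    }
  }
  return null;
}

// Host-only same site comparison (schemes assumed equal).
function sameSiteHosts(a: string, b: string, suffixes: string[]): boolean {
  return (registrableDomain(a, suffixes) ?? a) ===
         (registrableDomain(b, suffixes) ?? b);
}

// Two hypothetical snapshots; the newer one lists "github.io" as a suffix.
const olderSnapshot = ["io", "com"];
const newerSnapshot = ["io", "com", "github.io"];

console.log(sameSiteHosts("a.github.io", "b.github.io", olderSnapshot)); // true
console.log(sameSiteHosts("a.github.io", "b.github.io", newerSnapshot)); // false
```

Under the older list the two hosts share the registrable domain github.io. Under the newer list each host is its own registrable domain, so a decision based on the first result would be unsafe.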

4. Privacy Considerations

The Public Suffix List [PSL] changes over time, and will likely be different from browser to browser and from one version of one browser to another. While the list is likely strongly correlated to a specific version of a user agent, exposing the list via isSameSite() could leak some information in cases where the user agent desires to limit its brand’s or version’s visibility.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[SERVICE-WORKERS]
Yoshisato Yanagisawa; Monica CHINTALA. Service Workers. URL: https://w3c.github.io/ServiceWorker/
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
[WEBIDL]
Edgar Chen; Timothy Gu. Web IDL Standard. Living Standard. URL: https://webidl.spec.whatwg.org/

Informative References

[PSL]
Public Suffix List. URL: https://publicsuffix.org/
