Source: doc.go in package github.com/google/safehtml/template

Copyright (c) 2017 The Go Authors. All rights reserved. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file or at https://developers.google.com/open-source/licenses/bsd

Package template (safehtml/template) implements data-driven templates for generating HTML output safe against code injection. It provides an interface similar to that of package html/template, but produces HTML output that is more secure. Therefore, it should be used instead of html/template to render HTML.

The documentation here focuses on the security features of the package. For information about how to program the templates themselves, see the documentation for text/template.

Basic usage

This package provides an API almost identical to that of text/template and html/template to parse and execute HTML templates safely.

	tmpl := template.Must(template.New("name").Parse(`<div>Hello {{.}}</div>`))
	err := tmpl.Execute(out, data)

If successful, out will contain code-injection-safe HTML. Otherwise, err's string representation will describe the error that occurred.

Elements of data might be modified at run time before being included in out, or rejected completely if such a conversion is not possible. Pass values of appropriate types from package safehtml to ensure that they are included in the template's HTML output in their expected form. More details are provided below in "Contextual autosanitization" and "Sanitization contexts".

Security improvements

safehtml/template produces HTML more resistant to code injection than html/template because it:
  * Allows values of types only from package safehtml to bypass run-time sanitization. These types represent values that are known, by construction or by run-time sanitization, to be safe for use in various HTML contexts without being processed by certain sanitization functions.
  * Does not attempt to escape CSS or JavaScript. Instead of attempting to parse and escape these complex languages, safehtml/template allows values of only the appropriate types from package safehtml (e.g. safehtml.Style, safehtml.Script) to be used in these contexts, since they are already guaranteed to be safe.
  * Emits an error if user data is interpolated in unsafe contexts, such as within non-whitelisted elements or unquoted attribute values.
  * Only loads templates from trusted sources. This ensures that the contents of the template are always under programmer control. More details are provided below in "Trusted template sources".
  * Differentiates between URLs that load code and those that do not. URLs in the former category must be supplied to the template as values of type safehtml.TrustedResourceURL, whose type contract promises that the URL identifies a trustworthy resource. URLs in the latter category can be sanitized at run time.

Threat model

safehtml/template assumes that programmers are trustworthy. Therefore, data fully under programmer control, such as string literals, are considered safe. The types from package safehtml are designed around this same assumption, so their type contracts are trusted by this package.

safehtml/template considers all other data values untrustworthy and conservatively assumes that such values could result in a code-injection vulnerability if included verbatim in HTML.

Trusted template sources

safehtml/template loads templates only from trusted sources. Therefore, template text, file paths, and file patterns passed to Parse* functions and methods must be entirely under programmer control.

This constraint is enforced by using unexported string types for the parameters of Parse* functions and methods, such as trustedFilePattern for ParseGlob. The only values that may be assigned to these types (and thus provided as arguments) are untyped string constants such as string literals, which are always under programmer control.
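The mechanism can be illustrated with a minimal standard-library sketch. The names trustedTemplate and parse below are hypothetical stand-ins, not part of this package's API, and the sketch keeps everything in one package for brevity. In a real library the string type would be unexported from a separate package, so callers could neither name it nor convert values to it.

```go
package main

import "fmt"

// trustedTemplate is a hypothetical stand-in for the unexported string types
// described above. It is not a type from safehtml/template.
type trustedTemplate string

// parse is a hypothetical Parse-like function that accepts only trustedTemplate.
func parse(text trustedTemplate) string {
	return "parsed: " + string(text)
}

// greeting is an untyped string constant, so it is assignable to trustedTemplate.
const greeting = "<p>Hi {{.}}</p>"

func main() {
	// Untyped string constants convert implicitly to the parameter type.
	fmt.Println(parse(`<div>Hello {{.}}</div>`))
	fmt.Println(parse(greeting))
	fmt.Println(parse("<b>" + "{{.}}" + "</b>")) // constant expression

	// A value of type string does not convert implicitly, so the call below
	// would fail to compile. From another package, an explicit conversion
	// would be impossible as well, because the type name is unexported.
	//
	//	userInput := os.Args[1]
	//	parse(userInput)
}
```

Because the restriction is checked by the compiler, no run-time validation of template sources is needed in this sketch.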

Contextual autosanitization

Code injection vulnerabilities, such as cross-site scripting (XSS), occur when untrusted data values are embedded in an HTML document. For example,

	import "text/template"
	...
	var t = template.Must(template.New("foo").Parse(`<a href="{{ .X }}">{{ .Y }}</a>`))

	func renderHTML(x, y string) string {
		var out bytes.Buffer
		err := t.Execute(&out, struct{ X, Y string }{x, y})
		// Error checking elided
		return out.String()
	}

If x and y originate from user-provided data, an attacker who controls these strings could arrange for them to contain the following values:

	x = "javascript:evil()"
	y = "</a><script>alert('pwned')</script><a>"

which will cause renderHTML to return the following unsafe HTML:

<a href="javascript:evil()"></a><script>alert('pwned')</script><a></a>
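A runnable version of this vulnerable example, using only the standard library, might look like the sketch below. It follows the snippet above, except that renderHTML returns the Execute error instead of eliding it.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// t is parsed with text/template, which performs no escaping or sanitization.
var t = template.Must(template.New("foo").Parse(`<a href="{{ .X }}">{{ .Y }}</a>`))

// renderHTML interpolates x and y verbatim into the template.
func renderHTML(x, y string) (string, error) {
	var out bytes.Buffer
	if err := t.Execute(&out, struct{ X, Y string }{x, y}); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Attacker-controlled values from the text above.
	got, err := renderHTML("javascript:evil()", "</a><script>alert('pwned')</script><a>")
	if err != nil {
		fmt.Println("execute failed:", err)
		return
	}
	// The injected markup appears unchanged in the output.
	fmt.Println(got)
}
```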

To prevent such vulnerabilities, untrusted data must be sanitized before being included in HTML. A sanitization function takes untrusted data and returns a string that will not create a code-injection vulnerability in the destination context. The function might return the input unchanged if it deems it safe, escape special runes in the input's string representation to prevent them from triggering undesired state changes in the HTML parser, or entirely replace the input by an innocuous string (also known as "filtering"). If none of these conversions are possible, the sanitization function aborts template processing.
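The three behaviors (pass through, escape, filter) can be sketched with the standard library alone. The functions escapeHTML and filterURL are hypothetical and much simpler than the sanitizers in package safehtml. The allowed scheme list is an assumption for illustration, and the replacement string is the one that appears in this document's sanitized output.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// innocuousURL replaces URLs that this sketch refuses to emit.
const innocuousURL = "about:invalid#zGoSafez"

// escapeHTML escapes characters that could change HTML parser state.
func escapeHTML(s string) string {
	return html.EscapeString(s)
}

// filterURL returns s unchanged if it has no scheme or an allowed scheme,
// and replaces it with innocuousURL otherwise.
func filterURL(s string) string {
	i := strings.IndexAny(s, ":/?#")
	if i < 0 || s[i] != ':' {
		return s // no scheme: treated as a relative URL
	}
	switch strings.ToLower(s[:i]) {
	case "http", "https", "mailto", "ftp":
		return s // allowed scheme: returned unchanged
	}
	return innocuousURL // anything else is filtered
}

func main() {
	fmt.Println(filterURL("https://example.com/")) // unchanged
	fmt.Println(filterURL("javascript:evil()"))    // filtered
	fmt.Println(escapeHTML("</a><script>"))        // escaped
}
```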

safehtml/template contextually autosanitizes untrusted data by adding appropriate sanitization functions to template actions to ensure that the action output is safe to include in the HTML context in which the action appears. For example, in

	import "github.com/google/safehtml/template"
	...
	var t = template.Must(template.New("foo").Parse(`<a href="{{ .X }}">{{ .Y }}</a>`))

	func renderHTML(x, y string) string {
		var out bytes.Buffer
		err := t.Execute(&out, struct{ X, Y string }{x, y})
		// Error checking elided
		return out.String()
	}

the contextual autosanitizer rewrites the template to

<a href="{{ .X | _sanitizeTrustedResourceURLOrURL | _sanitizeHTML }}">{{ .Y | _sanitizeHTML }}</a>

so that the template produces the following safe, sanitized HTML output (split across multiple lines for clarity):

	<a href="about:invalid#zGoSafez">
	    &lt;/a&gt;&lt;script&gt;alert(&#39;pwned&#39;)&lt;/script&gt;&lt;a&gt;
	</a>

Similar template systems such as html/template, Soy, and Angular refer to this functionality as "contextual autoescaping". safehtml/template uses the term "autosanitization" instead of "autoescaping" since "sanitization" broadly captures the operations of escaping and filtering.

Sanitization contexts

The types of sanitization functions inserted into an action depend on the action's sanitization context, which is determined by its surrounding text. The following table describes these sanitization contexts.

	+--------------------+----------------------------------+------------------------------+-----------------------+
	| Context            | Examples                         | Safe types                   | Run-time sanitizer    |
	|--------------------+----------------------------------+------------------------------+-----------------------+
	| HTMLContent        | Hello {{.}}                      | safehtml.HTML                | safehtml.HTMLEscaped  |
	|                    | <title>{{.}}</title>             |                              |                       |
	+--------------------------------------------------------------------------------------------------------------+
	| HTMLValOnly        | <iframe srcdoc="{{.}}"></iframe> | safehtml.HTML*               | N/A                   |
	+--------------------------------------------------------------------------------------------------------------+
	| URL                | <q cite="{{.}}">Cite</q>         | safehtml.URL                 | safehtml.URLSanitized |
	+--------------------------------------------------------------------------------------------------------------+
	| URL or             | <a href="{{.}}">Link</a>         | safehtml.URL                 | safehtml.URLSanitized |
	| TrustedResourceURL |                                  | safehtml.TrustedResourceURL  |                       |
	+--------------------------------------------------------------------------------------------------------------+
	| TrustedResourceURL | <script src="{{.}}"></script>    | safehtml.TrustedResourceURL† | N/A                   |
	+--------------------------------------------------------------------------------------------------------------+
	| Script             | <script>{{.}}</script>           | safehtml.Script*             | N/A                   |
	+--------------------------------------------------------------------------------------------------------------+
	| Style              | <p style="{{.}}">Paragraph</p>   | safehtml.Style*              | N/A                   |
	+--------------------------------------------------------------------------------------------------------------+
	| Stylesheet         | <style>{{.}}</style>             | safehtml.StyleSheet*         | N/A                   |
	+--------------------------------------------------------------------------------------------------------------+
	| Identifier         | <h1 id="{{.}}">Hello</h1>        | safehtml.Identifier*         | N/A                   |
	+--------------------------------------------------------------------------------------------------------------+
	| Enumerated value   | <a target="{{.}}">Link</a>       | Whitelisted string values    | N/A                   |
	|                    |                                  | ("_self" or "_blank" for     |                       |
	|                    |                                  | the given example)           |                       |
	+--------------------------------------------------------------------------------------------------------------+
	| None               | <h1 class="{{.}}">Hello</h1>     | N/A (any type allowed)       | N/A (any type         |
	|                    |                                  |                              | allowed)              |
	+--------------------+----------------------------------+------------------------------+-----------------------+
	*: Values only of this type are allowed in this context. Other values will trigger a run-time error.
	†: If the action is a prefix of the attribute value, values only of this type are allowed.
	   Otherwise, values of any type are allowed. See "Substitutions in URLs" for more details.

For each context, the function named in "Run-time sanitizer" is called to sanitize the output of the action. However, if the action outputs a value of any of the types listed in "Safe types", the run-time sanitizer is not called. For example, in

	<title>{{ .X }}</title>

if X is a string value, an HTML sanitizer that calls safehtml.HTMLEscaped will be added to the action to sanitize X.

	// _sanitizeHTML calls safehtml.HTMLEscaped.
	<title>{{ .X | _sanitizeHTML }}</title>

However, if X is a safehtml.HTML value, _sanitizeHTML will not change its value, since safehtml.HTML values are already safe to use in HTML contexts. Therefore, the string contents of X will bypass context-specific sanitization (in this case, HTML escaping) and appear unchanged in the template's HTML output. Note that in attribute value contexts, HTML escaping will always take place, whether or not context-specific sanitization is performed. More details can be found at the end of this section.
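The bypass for safe types can be sketched as a type check in front of the escaper. The type safeHTML and the function sanitizeHTML below are hypothetical stand-ins written against the standard library. They are not the package's internal sanitizer, and they cover only the HTML content case.

```go
package main

import (
	"fmt"
	"html"
)

// safeHTML is a hypothetical stand-in for a type like safehtml.HTML: a value
// whose contents are already known to be safe in HTML content contexts.
type safeHTML struct{ s string }

// sanitizeHTML returns safeHTML values unchanged and HTML-escapes the string
// form of every other value.
func sanitizeHTML(v interface{}) string {
	if h, ok := v.(safeHTML); ok {
		return h.s // safe type: bypasses escaping
	}
	return html.EscapeString(fmt.Sprint(v))
}

func main() {
	fmt.Println(sanitizeHTML("<b>hi</b>"))           // plain string: escaped
	fmt.Println(sanitizeHTML(safeHTML{"<b>hi</b>"})) // safe type: unchanged
}
```

Making the safe type a distinct Go type, rather than a flag on a string, lets the decision rest on the value's type contract instead of on inspecting its contents.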

In certain contexts, the autosanitizer allows values only of that context's "Safe types". Any other values will trigger an error and abort template processing. For example, the template

	<style>{{ .X }}</style>

triggers a run-time error if X is not a safehtml.StyleSheet. Otherwise, the string form of X will appear unchanged in the output. The only exception to this behavior is in TrustedResourceURL sanitization contexts, where actions may output data of any type if the action occurs after a safe attribute value prefix. More details can be found below in "Substitutions in URLs".

Unconditional sanitization

In attribute value contexts, action outputs are always HTML-escaped after context-specific sanitization to ensure that the attribute values cannot change the structure of the surrounding HTML tag. In URL or TrustedResourceURL sanitization contexts, action outputs are additionally URL-normalized to reduce the likelihood of downstream URL-parsing bugs. For example, the template

	<a href="{{ .X }}">Link</a>
	<p id="{{ .Y }}">Text</p>

is rewritten by the autosanitizer into

	// _sanitizeHTML calls safehtml.HTMLEscaped.
	<a href="{{ .X | _sanitizeTrustedResourceURLOrURL | _normalizeURL | _sanitizeHTML }}">Link</a>
	<p id="{{ .Y | _sanitizeIdentifier | _sanitizeHTML }}">Text</p>

Even if X is a safehtml.URL or safehtml.TrustedResourceURL value, which remains unchanged after _sanitizeTrustedResourceURLOrURL, X will still be URL-normalized and HTML-escaped. Likewise, Y will still be HTML-escaped even if its string form is left unchanged by _sanitizeIdentifier.

Substitutions in URLs

Values of any type may be substituted into attribute values in URL and TrustedResourceURL sanitization contexts only if the action is preceded by a safe URL prefix. For example, in

	<a href="http://www.foo.com/{{ .PathComponent }}">foo</a>

Since "http://www.foo.com/" is a safe URL prefix, PathComponent can safely be interpolated into this URL sanitization context after URL normalization. Similarly, in

	<script src="https://www.bar.com/{{ .PathComponent }}"></script>

Since "https://www.bar.com/" is a safe TrustedResourceURL prefix, PathComponent can safely be interpolated into this TrustedResourceURL sanitization context after URL escaping. Substitutions after a safe TrustedResourceURL prefix are escaped instead of normalized to prevent the injection of any new URL components, including additional path components. URL escaping also takes place in URL sanitization contexts where the substitutions occur in the query or fragment part of the URL, such as in:

	<a href="/foo?q={{ .Query }}&hl={{ .LangCode }}">Link</a>

A URL prefix is considered safe in a URL sanitization context if it does not end in an incomplete HTML character reference (e.g. https&#1) or incomplete percent-encoding character triplet (e.g. /fo%6), does not contain whitespace or control characters, and one of the following is true:
  * The prefix has a safe scheme (i.e. http, https, mailto, or ftp).
  * The prefix has the data scheme with base64 encoding and a whitelisted audio, image, or video MIME type (e.g. data:image/jpeg;base64, data:video/mp4;base64).
  * The prefix has no scheme at all, and cannot be interpreted as a scheme prefix (e.g. /path).

A URL prefix is considered safe in a TrustedResourceURL sanitization context if it does not end in an incomplete HTML character reference (e.g. https&#1) or incomplete percent-encoding character triplet (e.g. /fo%6), does not contain whitespace or control characters, and one of the following is true:
  * The prefix has the https scheme and contains a domain name (e.g. https://www.foo.com).
  * The prefix is scheme-relative and contains a domain name (e.g. //www.foo.com/).
  * The prefix is path-absolute and contains a path (e.g. /path).
  * The prefix is "about:blank".
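Part of the prefix rule for URL sanitization contexts can be sketched as follows. The function isSafeURLPrefix and its helpers are hypothetical and deliberately incomplete: the sketch omits the HTML character reference check and the data scheme case, and it does not cover the stricter TrustedResourceURL rules. It is an illustration of the conditions listed above, not the package's implementation.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// endsInIncompleteEscape reports whether prefix ends in "%" or "%X", that is,
// a percent-encoding triplet that is missing one or two characters.
func endsInIncompleteEscape(prefix string) bool {
	n := len(prefix)
	return (n >= 1 && prefix[n-1] == '%') || (n >= 2 && prefix[n-2] == '%')
}

// hasSpaceOrControl reports whether s contains whitespace or control characters.
func hasSpaceOrControl(s string) bool {
	return strings.IndexFunc(s, func(r rune) bool {
		return unicode.IsSpace(r) || unicode.IsControl(r)
	}) >= 0
}

// isSafeURLPrefix applies a simplified version of the URL context rules.
func isSafeURLPrefix(prefix string) bool {
	if endsInIncompleteEscape(prefix) || hasSpaceOrControl(prefix) {
		return false
	}
	i := strings.IndexAny(prefix, ":/?#")
	if i < 0 {
		// No delimiter yet: text such as "java" could still become a
		// scheme once the action output is appended.
		return false
	}
	if prefix[i] != ':' {
		return true // a path, query, or fragment began first, so no scheme
	}
	switch strings.ToLower(prefix[:i]) {
	case "http", "https", "mailto", "ftp":
		return true
	}
	return false
}

func main() {
	for _, p := range []string{"http://www.foo.com/", "/path", "javascript:", "/fo%6", "java"} {
		fmt.Printf("%q safe: %v\n", p, isSafeURLPrefix(p))
	}
}
```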
