Help for search

Texts are composed in HTML5 with US-ASCII encoding. Before a search runs, the texts are processed to remove comments, tags, and spaces, including newlines. As a result, you cannot use search to find HTML tags or attribute values, such as the filenames of images.
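The processing step described above can be pictured with a short Python sketch. It is an illustration only, not the code this site runs: the function name searchable_text is hypothetical, and it assumes that runs of spaces and newlines collapse to a single space.

```python
import re

def searchable_text(html: str) -> str:
    """Reduce an HTML fragment to the text a search would see (illustrative only)."""
    # Drop comments first so that any tags inside them disappear as well.
    without_comments = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Drop every tag together with its attributes, so src="..." values vanish.
    without_tags = re.sub(r"<[^>]+>", "", without_comments)
    # Assumption: runs of spaces, tabs, and newlines collapse to one space.
    return re.sub(r"\s+", " ", without_tags).strip()

sample = '<p>Halley&rsquo;s\n  comet <!-- note --><img src="comet.png" alt=""></p>'
print(searchable_text(sample))  # Halley&rsquo;s comet
```

Because the image tag is removed before the search runs, a search for comet.png finds nothing in this sample.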

Regular expressions

The phrase “regular expression” is computer jargon for a widely adopted syntax for defining search patterns. This syntax lets you declare alternatives, negatives, abstract character classes (such as digits or space characters), Unicode character properties, and sentence, line, or word boundaries (a.k.a. “anchors”). Regular expressions become an option for search terms when the normal searches are implemented using regular expressions; see Pattern syntax for regular expressions. If you check the “Regular expression” option, we treat your complete “Terms to search for” as one regular expression, but you must leave off the enclosing slash delimiters.
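As a rough illustration of the difference the “Regular expression” option makes, the following Python sketch treats the whole input either as one pattern or as literal text. The helper compile_search is hypothetical, and Python's re module stands in for whatever regular expression engine the site actually uses, so details of the syntax may differ.

```python
import re

def compile_search(terms: str, is_regex: bool) -> re.Pattern:
    """Build a pattern from the search box contents (hypothetical helper)."""
    if is_regex:
        # The whole input is one pattern; no enclosing /slashes/ are expected.
        return re.compile(terms)
    # Plain search: escape metacharacters so that they match literally.
    return re.compile(re.escape(terms))

text = "Halley observed the comet in 1682; it returned in 1758."
# Alternatives plus a digit class and word boundaries, entered without slashes.
pattern = compile_search(r"comet|\b\d{4}\b", is_regex=True)
print(pattern.findall(text))  # ['comet', '1682', '1758']
```

With the option unchecked, the same input would be searched for as the literal characters comet|\b\d{4}\b and would match nothing in this text.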

Special characters

Quotation marks and special characters in names require special consideration. René Descartes (with the special character for the e-acute) is not equal to Ren&eacute; Descartes (with the HTML entity for the special character). A straight single or double quotation mark (' or ") is not the same as a curled opening or closing quotation mark (‘ or “ or ’ or ”).

If you know the HTML entities for special characters, you may use them, but you don’t need to, because special characters in your search terms are converted to HTML entities before they are used to search the HTML files.

You may either search for Halley’s comet or enter the HTML entity &rsquo; for the right single quotation mark: Halley&rsquo;s comet.

Similarly, you may use the following HTML entities:

&amp; & ampersand
&deg; ° degree symbol
&mdash; — em dash
&#9785; ☹ frowning face
&#9786; ☺ smiley face
&#601; ə schwa
&reg; ® registered mark
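The conversion of special characters to HTML entities, described earlier in this section, might look roughly like the Python sketch below. It is an assumption-laden illustration, not the site's own code: the function to_entities is hypothetical, it uses the named-entity table from Python's standard html.entities module, and it falls back to numeric references where that table has no name. Whether the site prefers named or numeric entities is not stated here.

```python
from html.entities import codepoint2name

def to_entities(term: str) -> str:
    """Replace each non-ASCII character with an HTML entity (sketch)."""
    pieces = []
    for ch in term:
        code = ord(ch)
        if code < 128:
            # ASCII passes through, so entities typed by hand stay intact.
            pieces.append(ch)
        elif code in codepoint2name:
            # Named entity from Python's HTML 4 table, e.g. rsquo or eacute.
            pieces.append(f"&{codepoint2name[code]};")
        else:
            # No named entity in this table: use a numeric reference instead.
            pieces.append(f"&#{code};")
    return "".join(pieces)

print(to_entities("Halley’s comet"))   # Halley&rsquo;s comet
print(to_entities("René Descartes"))   # Ren&eacute; Descartes
```

Under these assumptions, a search term typed with the curled apostrophe and one typed with the entity end up as the same string before the search runs.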