Is there any authoritative reference about the syntax and encoding of an URL for


Asked: May 15, 2026 (2026-05-15T18:48:39+00:00)


Is there any authoritative reference about the syntax and encoding of a URL for the pseudo-protocol javascript:? (I know it’s not very well regarded, but it’s useful for bookmarklets anyway.)

First, we know that standard URLs follow the syntax:

scheme://username:password@domain:port/path?query_string#anchor

but this format doesn’t seem to apply here. Indeed, it seems it would be more correct to speak of a URI instead of a URL: here is listed the “unofficial” format javascript:{body}.

Now, then, which are the valid characters for such a URI (that is, what are the escape/unescape rules) when embedding it in HTML?

Specifically, if I have the code of a javascript function and I want to embed it in a javascript: URI, which are the escape rules to apply?

Of course one could escape every non-alphanumeric character, but that would be overkill and make the code unreadable. I want to escape only the necessary characters.

Further, it’s clear that it would be bad to use some urlencode/urldecode routine pair (those are for query string values); we don’t want to decode ‘+’ to a space, for example.


1 Answer

Editorial Team · Answer 1 · 2026-05-15T18:48:40+00:00

My findings, so far:

First, there are the rules for writing a valid HTML attribute value: but here the standard only requires (if the attribute value is enclosed in quotes) arbitrary CDATA (actually a %URI, but HTML itself does not impose additional validation at its level: any CDATA will validate).

Some examples:

 <a href="javascript:alert('Hi!')">     (1)
 <a href="javascript:if(a > b && 1 < 0) alert(  b ? 'hi' : 'bye')">   (2)
 <a href="javascript:if(a&gt;b &amp;&&amp; 1 &lt; 0) alert( b ? 'hi' : 'bye')">  (3)

Example (1) is valid. Example (2) is also valid HTML 4.01 Strict. To make it valid XHTML we only need to escape the XML special characters < > & (example (3) is valid XHTML 1.0 Strict).
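The step from example (2) to example (3) can be sketched as a small function. This is only an illustration: `escapeHtmlAttribute` is a hypothetical helper name, not a standard API, and it additionally escapes the double quote on the assumption that the attribute value is enclosed in double quotes.

```javascript
// Hypothetical helper: escape a string for use inside a double-quoted
// HTML/XHTML attribute value.
function escapeHtmlAttribute(value) {
  return value
    .replace(/&/g, "&amp;")   // must run first, or it would re-escape the entities below
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;"); // assumption: the attribute is delimited by double quotes
}

const href = "javascript:if(a > b && 1 < 0) alert( b ? 'hi' : 'bye')";
const anchor = '<a href="' + escapeHtmlAttribute(href) + '">';
console.log(anchor);
```

This only handles the HTML layer; it says nothing about whether the result is a valid URI.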

Now, is example (2) a valid javascript: URI? I’m not sure, but I’d say it’s not.

From RFC 2396: a URI is subject to some additional restrictions and, in particular, to escaping/unescaping via %xx sequences. And some characters are always prohibited, among them spaces and {}#.

The RFC also defines a subset of opaque URIs: those that do not have hierarchical components, and for which the separating characters have no special meaning (for example, they don’t have a ‘query string’, so the ? can be used like any non-special character). I assume javascript: URIs should be considered among them.

This would imply that the valid characters inside the ‘body’ of a javascript: URI are

 a-zA-Z0-9 
 -_.!~*'();/?:@&=+$,
 %hh : (escape sequence, with two hexadecimal digits)

with the additional restriction that it can’t begin with /.
This still leaves out some “important” ASCII characters, for example

{}#[]<>^\

Also % (because it’s used for escape sequences), double quotes " and (most important) all blanks.
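To make the rule concrete, here is a minimal sketch of a checker for the ‘body’ part, written from my reading of the `opaque_part` production in RFC 2396 (unreserved characters, reserved characters, and %hh escapes, with no leading slash). The names `OPAQUE_PART` and `isValidJavascriptUriBody` are made up for this example.

```javascript
// One allowed unit: an unreserved or reserved character, or a %hh escape.
// The first unit excludes "/", the following ones allow it.
const OPAQUE_PART = new RegExp(
  "^(?:[A-Za-z0-9\\-_.!~*'();?:@&=+$,]|%[0-9A-Fa-f]{2})" +
  "(?:[A-Za-z0-9\\-_.!~*'();/?:@&=+$,]|%[0-9A-Fa-f]{2})*$"
);

// Hypothetical helper: true if `body` (the text after "javascript:")
// uses only characters permitted in an RFC 2396 opaque part.
function isValidJavascriptUriBody(body) {
  return OPAQUE_PART.test(body);
}

console.log(isValidJavascriptUriBody("alert('Hi!')"));       // true
console.log(isValidJavascriptUriBody("if(a > b) alert(1)")); // false: spaces and ">"
console.log(isValidJavascriptUriBody("x={}"));               // false: braces
console.log(isValidJavascriptUriBody("alert(%22hi%22)"));    // true: quotes are escaped
```

Browsers are far more lenient than this check, so it only answers the question of what the RFC allows, not what works in practice.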

In some respects, this seems quite permissive: it’s important to note that + is valid (and hence it should not be ‘unescaped’ when decoding, as a space).
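The difference between the two decoding conventions is easy to see in JavaScript itself. The snippet below is only an illustration with a made-up sample string: `decodeURIComponent` applies plain %hh decoding and leaves + alone, while `URLSearchParams` applies the form/query-string convention and turns + into a space.

```javascript
// Sample body with a literal "+" and some %hh escapes (illustrative only).
const encoded = "alert(1+2)%3B%20alert(%27a%20b%27)";

// URI-style decoding: "+" is an ordinary character.
const viaUriDecoding = decodeURIComponent(encoded);

// Query-string (form) decoding: "+" is read as a space, which corrupts the code.
const viaFormDecoding = new URLSearchParams("code=" + encoded).get("code");

console.log(viaUriDecoding);  // alert(1+2); alert('a b')
console.log(viaFormDecoding); // alert(1 2); alert('a b')
```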

But in other respects, it seems too restrictive, especially for braces and brackets: I understand that they are normally used unescaped and browsers have no problems with them.

And what about spaces? Like braces, they are disallowed by the RFC, but I see no problem with them in this kind of URI. However, I see that in most bookmarklets they are escaped as “%20”. Is there any (empirical or theoretical) explanation for this?

I still don’t know if there are some standard functions to make this escape/unescape (in mainstream languages) or some sample code.
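One candidate, at least in JavaScript itself, is the built-in `encodeURI`. As far as I can tell it leaves exactly the RFC 2396 reserved and unreserved characters alone, plus #, and percent-encodes everything else (spaces, braces, brackets, quotes, % and so on). The sketch below escapes # by hand on top of that; `toJavascriptUri` is a hypothetical name, and I am not claiming this is how bookmarklet tools actually do it.

```javascript
// Hypothetical helper: build a javascript: URI from plain source code,
// escaping only what the RFC 2396 character set does not allow.
function toJavascriptUri(source) {
  // encodeURI keeps "#" (it treats it as the fragment delimiter),
  // so that one is escaped separately.
  return "javascript:" + encodeURI(source).replace(/#/g, "%23");
}

const source = "if (a > b) { alert('50% + #1'); }";
const uri = toJavascriptUri(source);
console.log(uri);

// Plain %hh decoding restores the original code, "+" included.
const roundTrip = decodeURIComponent(uri.slice("javascript:".length));
console.log(roundTrip === source);
```

Two caveats: the result still has to be HTML-escaped (the & in particular) before going into an attribute, and `encodeURI` does not touch a leading /, which the RFC forbids at the start of an opaque part.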
