URL objects
The built-in URL class provides a convenient interface for creating and parsing URLs. No networking method strictly requires a URL object; strings are good enough. So technically we don’t have to use URL. But sometimes it can be really helpful.
Creating a URL
The syntax to create a new URL object is new URL(url, [base]):
- url – the full URL or only a path (if base is set, see below),
- base – an optional base URL: if set and the url argument has only a path, then the URL is generated relative to base.

For example, new URL('/profile/admin', 'https://javascript.info') and new URL('https://javascript.info/profile/admin') create the same URL.

We can easily create a new URL based on a path relative to an existing URL: if url is https://javascript.info/profile/admin, then new URL('tester', url) gives https://javascript.info/profile/tester.

The URL object immediately allows us to access its components, so it’s a nice way to parse a URL. For example, for https://javascript.info/url the protocol property is https:, host is javascript.info and pathname is /url.

Here’s the cheatsheet for URL components:
- href is the full URL, same as url.toString(),
- protocol ends with the colon character :,
- search – a string of parameters, starts with the question mark ?,
- hash starts with the hash character #,
- there may also be username and password properties if HTTP authentication is present: http://login:password@site.com (rarely used).
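The points above can be sketched in a few lines. This is a minimal illustration, not a definitive reference: it uses console.log instead of alert so that it also runs outside the browser (for instance in Node.js), and the site.com address with port, query and hash is a made-up sample chosen only to show every component.

```javascript
// Two ways to build the same URL: path + base, or the full string.
const url1 = new URL('/profile/admin', 'https://javascript.info');
const url2 = new URL('https://javascript.info/profile/admin');

console.log(url1.href);               // https://javascript.info/profile/admin
console.log(url1.href === url2.href); // true

// A relative path is resolved against an existing URL object.
const tester = new URL('tester', url2);
console.log(tester.href);             // https://javascript.info/profile/tester

// Components are available as properties right away.
// (site.com:8080 is an illustrative address, not a real endpoint.)
const parsed = new URL('https://site.com:8080/path/page?p1=v1&p2=v2#hash');
console.log(parsed.protocol); // https:
console.log(parsed.host);     // site.com:8080
console.log(parsed.hostname); // site.com
console.log(parsed.port);     // 8080
console.log(parsed.pathname); // /path/page
console.log(parsed.search);   // ?p1=v1&p2=v2
console.log(parsed.hash);     // #hash
console.log(parsed.href === parsed.toString()); // true
```

Note that host includes the port while hostname does not, which is easy to mix up when comparing origins.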
SearchParams “?…”
Let’s say we want to create a URL with given search params, for instance, https://google.com/search?query=JavaScript.

We can provide them in the URL string: new URL('https://google.com/search?query=JavaScript').

…But parameters need to be encoded if they contain spaces, non-latin letters, etc. (more about that below).

So there’s a URL property for that: url.searchParams, an object of type URLSearchParams.

It provides convenient methods for search parameters:
- append(name, value) – add the parameter by name,
- delete(name) – remove the parameter by name,
- get(name) – get the parameter by name,
- getAll(name) – get all parameters with the same name (that’s possible, e.g. ?user=John&user=Pete),
- has(name) – check for the existence of the parameter by name,
- set(name, value) – set/replace the parameter,
- sort() – sort parameters by name, rarely needed,
- …and it’s also iterable, similar to Map.

An example with parameters that contain spaces and punctuation marks: after url.searchParams.set('q', 'test me!') on https://google.com/search, the URL becomes https://google.com/search?q=test+me%21.
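As a rough sketch of the methods listed above, the snippet below builds a query step by step. The parameter names (q, tbs, user) and their values are only illustrative, and console.log stands in for alert so the code can run in Node.js as well as in the browser.

```javascript
const url = new URL('https://google.com/search');

// Values are supplied unencoded; encoding happens when the URL is serialized.
url.searchParams.set('q', 'test me!');
console.log(url.href); // https://google.com/search?q=test+me%21

url.searchParams.set('tbs', 'qdr:y'); // the colon becomes %3A
console.log(url.href); // https://google.com/search?q=test+me%21&tbs=qdr%3Ay

// append() allows several parameters with the same name.
url.searchParams.append('user', 'John');
url.searchParams.append('user', 'Pete');
console.log(url.searchParams.getAll('user')); // [ 'John', 'Pete' ]

// get() returns the decoded value, has() checks for presence.
console.log(url.searchParams.get('q'));   // test me!
console.log(url.searchParams.has('tbs')); // true

url.searchParams.delete('tbs');

// Iterate over the remaining parameters, similar to Map.
const pairs = [];
for (const [name, value] of url.searchParams) {
  pairs.push(`${name}=${value}`);
}
console.log(pairs); // [ 'q=test me!', 'user=John', 'user=Pete' ]
```

Reading values back through get or iteration always yields the decoded form, so there is no need to decode them by hand.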
Encoding
There’s a standard, RFC 3986, that defines which characters are allowed in URLs and which are not.

Those that are not allowed must be encoded, for instance non-latin letters and spaces: they are replaced with their UTF-8 codes, prefixed by %, such as %20 (a space can be encoded by +, for historical reasons, but that’s an exception).

The good news is that URL objects handle all that automatically. We just supply all parameters unencoded, and then convert the URL to a string. For example, creating a URL from https://ru.wikipedia.org/wiki/Тест and setting the parameter key to ъ gives https://ru.wikipedia.org/wiki/%D0%A2%D0%B5%D1%81%D1%82?key=%D1%8A.

As you can see, both Тест in the URL path and ъ in the parameter are encoded.

The URL became longer, because each Cyrillic letter is represented with two bytes in UTF-8, so there are two %.. entities.
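The automatic encoding described above can be checked with a short sketch. The Wikipedia address and the parameter name key are used only as an illustration; TextEncoder is used here merely to confirm the two-bytes-per-letter claim, and console.log replaces alert so the snippet also runs in Node.js.

```javascript
// Path and parameter are given unencoded.
const url = new URL('https://ru.wikipedia.org/wiki/Тест');
url.searchParams.set('key', 'ъ');

// Converting to a string yields the percent-encoded form.
console.log(url.href);
// https://ru.wikipedia.org/wiki/%D0%A2%D0%B5%D1%81%D1%82?key=%D1%8A

// Each Cyrillic letter takes two bytes in UTF-8, hence two %.. entities.
const bytes = new TextEncoder().encode('ъ');
console.log(bytes.length); // 2

// The components can be decoded back when needed.
console.log(decodeURIComponent(url.pathname)); // /wiki/Тест
console.log(url.searchParams.get('key'));      // ъ
```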
Encoding strings
In the old days, before URL objects appeared, people used strings for URLs.

As of now, URL objects are often more convenient, but strings can still be used as well. In many cases using a string makes the code shorter.

If we use a string though, we need to encode/decode special characters manually.

There are built-in functions for that:
- encodeURI – encodes a URL as a whole.
- decodeURI – decodes it back.
- encodeURIComponent – encodes a URL component, such as a search parameter, or a hash, or a pathname.
- decodeURIComponent – decodes it back.

A natural question is: “What’s the difference between encodeURIComponent and encodeURI? When should we use each?”

That’s easy to understand if we look at a URL split into its components, for example https://site.com:8080/path/page?p1=v1&p2=v2#hash.

As we can see, characters such as :, ?, =, &, # are allowed in a URL.

…On the other hand, if we look at a single URL component, such as a search parameter, these characters must be encoded so that they do not break the formatting.
- encodeURI encodes only characters that are totally forbidden in a URL.
- encodeURIComponent encodes the same characters and, in addition to them, the characters #, $, &, +, ,, /, :, ;, =, ? and @.

So, for a whole URL we can use encodeURI: it encodes, for instance, Cyrillic letters in the path and leaves the URL structure intact.

…While for URL parameters we should use encodeURIComponent instead: encodeURIComponent('Rock&Roll') gives Rock%26Roll, so the search string becomes q=Rock%26Roll.

Compare it with encodeURI: encodeURI('Rock&Roll') returns Rock&Roll unchanged.

As we can see, encodeURI does not encode &, as this is a legitimate character in a URL as a whole.

But we should encode & inside a search parameter, otherwise we get q=Rock&Roll, which is actually q=Rock plus some obscure parameter Roll. Not as intended.

So we should use only encodeURIComponent for each search parameter, to correctly insert it in the URL string. The safest approach is to encode both name and value, unless we’re absolutely sure that they contain only allowed characters.

Note that encodeURI and URL objects do not always produce the same result. For example, IPv6 addresses are encoded differently: encodeURI escapes the square brackets, which is not correct for an IPv6 host.

// valid url with IPv6 address
let url = 'http://[2607:f8b0:4005:802::1007]/';
alert(encodeURI(url)); // http://%5B2607:f8b0:4005:802::1007%5D/
alert(new URL(url)); // http://[2607:f8b0:4005:802::1007]/
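To make the difference between encodeURI and encodeURIComponent concrete, here is a minimal sketch. The site.com address and the helper buildQuery are hypothetical, introduced only for illustration; console.log replaces alert so the code also runs in Node.js.

```javascript
// Whole URL: encodeURI keeps structural characters such as : / ? & = #
const whole = encodeURI('http://site.com/привет');
console.log(whole); // http://site.com/%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82

// Single parameter: encodeURIComponent also encodes &, =, ? and friends.
const music = 'Rock&Roll';
const good = `https://google.com/search?q=${encodeURIComponent(music)}`;
const bad = `https://google.com/search?q=${encodeURI(music)}`;
console.log(good); // https://google.com/search?q=Rock%26Roll
console.log(bad);  // https://google.com/search?q=Rock&Roll

// Parsing the results shows why the second form is wrong.
console.log(new URL(good).searchParams.get('q'));   // Rock&Roll
console.log(new URL(bad).searchParams.get('q'));    // Rock
console.log(new URL(bad).searchParams.has('Roll')); // true

// Hypothetical helper: encode both name and value of every parameter.
function buildQuery(params) {
  return Object.entries(params)
    .map(([name, value]) =>
      `${encodeURIComponent(name)}=${encodeURIComponent(value)}`)
    .join('&');
}

const query = buildQuery({ q: 'Rock&Roll', 'a b': '1=2' });
console.log(query); // q=Rock%26Roll&a%20b=1%3D2

// decodeURIComponent reverses the encoding of a single component.
console.log(decodeURIComponent('Rock%26Roll')); // Rock&Roll
```

Unlike url.searchParams, encodeURIComponent encodes a space as %20 rather than +; both forms are decoded correctly by URLSearchParams.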