Perhaps the most popular way to pass data between web pages is via querystrings. They are used both to pass data to a new pop-up window and to navigate between pages. (Side note: A querystring is the part of the URL that occurs after the ?. So in http://localhost/myWeb?id=3&name=Tim, id=3&name=Tim is the querystring. The querystring provides name-value pairs in the form of ?name1=value1&name2=value2...)
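To make the name-value structure concrete, here is a minimal sketch that splits the sample querystring above into its pairs. It assumes System.Web.HttpUtility is available to the project and uses HttpUtility.ParseQueryString; the variable names are only illustrative.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

// The querystring from the example URL, without the leading '?'.
NameValueCollection pairs = HttpUtility.ParseQueryString("id=3&name=Tim");

// Print each name with its value.
foreach (string key in pairs.AllKeys)
{
    Console.WriteLine(key + " = " + pairs[key]);
}
```

Run as a small console program, this should list id with the value 3 and name with the value Tim.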
While this works great for simple alphanumerics, passing special characters in the URL can be a problem, especially across different browsers.
While some may argue that querystring values should only contain simple IDs, there are legitimate benefits to being able to pass special characters. For example:
Fortunately, there is a solution for handling special characters. .NET lets us encode and decode URL values using System.Web.HttpUtility.UrlEncode and HttpUtility.UrlDecode. (Note that this is not HtmlEncode, which escapes text for HTML markup, turning & into &amp; for example, and does not make a value safe for a URL.) UrlEncode replaces problematic characters with URL-friendly equivalents.
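As a rough illustration of the round trip, the sketch below encodes a value before appending it to a querystring and decodes it again. The sample value and variable names are made up for the example; only UrlEncode and UrlDecode come from the text above.

```csharp
using System;
using System.Web;

// Hypothetical value containing characters that would otherwise break the querystring.
string rawName = "Tim & Sons #1";

// Encode only the value, then append it to the URL.
string encoded = HttpUtility.UrlEncode(rawName);
string url = "http://localhost/myWeb?id=3&name=" + encoded;

// Decoding the value restores the original text.
string decoded = HttpUtility.UrlDecode(encoded);

Console.WriteLine(url);
Console.WriteLine(decoded == rawName); // True
```

Note that only the value is encoded, not the whole URL, so the ? and & separators that structure the querystring are left intact.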
The following table shows what UrlEncode translates:
| ASCII Dec | ASCII Hex | Character | UrlEncode |
| --- | --- | --- | --- |
| 32 | 20 | (space) | + |
| 34 | 22 | " | %22 |
| 35 | 23 | # | %23 |
| 36 | 24 | $ | %24 |
| 37 | 25 | % | %25 |
| 38 | 26 | & | %26 |
| 43 | 2B | + | %2b |
| 44 | 2C | , | %2c |
| 47 | 2F | / | %2f |
| 58 | 3A | : | %3a |
| 59 | 3B | ; | %3b |
| 60 | 3C | < | %3c |
| 61 | 3D | = | %3d |
| 62 | 3E | > | %3e |
| 63 | 3F | ? | %3f |
| 64 | 40 | @ | %40 |
| 91 | 5B | [ | %5b |
| 92 | 5C | \ | %5c |
| 93 | 5D | ] | %5d |
| 94 | 5E | ^ | %5e |
| 96 | 60 | ` | %60 |
| 123 | 7B | { | %7b |
| 124 | 7C | \| | %7c |
| 125 | 7D | } | %7d |
| 126 | 7E | ~ | %7e |
While alpha-numerics aren't affected, these special characters aren't encoded either:
| ASCII Dec | ASCII Hex | Character |
| --- | --- | --- |
| 95 | 5F | _ |
| 45 | 2D | - |
| 46 | 2E | . |
| 39 | 27 | ' |
| 40 | 28 | ( |
| 41 | 29 | ) |
| 42 | 2A | * |
| 33 | 21 | ! |
Side note 1: You can see the full ASCII tables online.
Side note 2: You can generate these tables with a simple loop like so:
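The original listing is not reproduced here, so the following is only a sketch of what such a loop might look like. The BuildRows helper and its changed parameter are hypothetical names, and the exact rows may differ between framework versions.

```csharp
using System;
using System.Collections.Generic;
using System.Web;

// Builds one table row per non-alphanumeric printable ASCII character.
// changed == true  -> characters that UrlEncode rewrites
// changed == false -> characters that UrlEncode leaves alone
static List<string> BuildRows(bool changed)
{
    var rows = new List<string>();
    for (int code = 32; code < 127; code++)
    {
        char c = (char)code;
        if (char.IsLetterOrDigit(c)) continue; // alphanumerics are never affected

        string character = c.ToString();
        string encoded = HttpUtility.UrlEncode(character);
        if ((encoded != character) == changed)
        {
            // Decimal code, hex code, the character itself, and its encoded form.
            rows.Add(string.Format("| {0} | {0:X2} | {1} | {2} |", code, character, encoded));
        }
    }
    return rows;
}

foreach (string row in BuildRows(changed: true))
{
    Console.WriteLine(row);
}
```

Calling BuildRows(changed: false) instead lists the special characters that pass through unencoded.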
Essentially UrlEncode replaces many problematic characters with "%" + their ASCII Hex equivalents.
Most of the remaining special characters don't pose a problem, except for the apostrophe, which can cause cross-site scripting warnings. One solution is to replace it with a unique token value, such as "%27". Note that this is a reasonable token: "27" is the ASCII hex code for the apostrophe, so it follows the pattern of the other encodings. We could then write our own Encode and Decode methods that first apply UrlEncode and then replace the apostrophe with the token value. These methods could be abstracted to their own utility class:
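The original class is not shown here, so this is one possible sketch of such a utility rather than the author's implementation. The class name QueryStringCodec and the constant name are assumptions made for illustration.

```csharp
using System.Web;

// Hypothetical utility class; the name and members are illustrative only.
public static class QueryStringCodec
{
    // "%27" is the percent-escape for the apostrophe (ASCII hex 27).
    private const string ApostropheToken = "%27";

    public static string Encode(string value)
    {
        if (string.IsNullOrEmpty(value)) return value;

        // Apply the standard encoding first, then replace any apostrophe it left in place.
        return HttpUtility.UrlEncode(value).Replace("'", ApostropheToken);
    }

    public static string Decode(string value)
    {
        if (string.IsNullOrEmpty(value)) return value;

        // Because the token is the standard escape, UrlDecode restores the apostrophe itself.
        return HttpUtility.UrlDecode(value);
    }
}
```

With this sketch, QueryStringCodec.Encode("O'Brien") yields "O%27Brien", and passing that result to QueryStringCodec.Decode returns the original text. Choosing the standard escape as the token is what keeps Decode a thin wrapper; a non-standard token would need its own reverse replacement.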
HttpUtility.UrlDecode("http://MyHome.ca")
Offhand I'm not sure - although I thought URLs were case-insensitive. You
could always just try it out, just make a quick method to test different
calls to HttpUtility.
Do you have any idea why a space (' ') is represented by a plus ('+')
instead of '%20'?
When a space is included in QueryString parameter, it is "slightly more
correct" to encode it as a + instead of %20
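For comparison, a small sketch of the two spellings of a space. It assumes Uri.EscapeDataString as the percent-style counterpart, which is not mentioned in the thread; both forms decode to the same text.

```csharp
using System;
using System.Web;

string plusStyle = HttpUtility.UrlEncode("a b");    // space becomes '+'
string percentStyle = Uri.EscapeDataString("a b");  // space becomes "%20"

Console.WriteLine(plusStyle);    // a+b
Console.WriteLine(percentStyle); // a%20b

// UrlDecode accepts either spelling and returns the same text.
Console.WriteLine(HttpUtility.UrlDecode(plusStyle) == HttpUtility.UrlDecode(percentStyle)); // True
```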
Is there anything in C# to validate that a string has been UrlEncode'd? I
suppose you could UrlEncode the suspect string and then replace '%25' with
'%' and then compare the before and after. If they are equal the string is
UrlEncoded. If not, the suspect string was in fact not UrlEncoded.
I don't think so. You could do something like what you've suggested to give
a probable answer. But, I think the problem is that you could have a
literal string that coincidentally looked like it had been encoded, and
there'd be no way to tell. (You'd have a similar problem in any
escaped-like literal string). I could be missing something, but I don't
think this is possible.
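A rough sketch of the check suggested above, with the ambiguity from the reply made visible. LooksUrlEncoded is a hypothetical helper name. At best it reports that the string contains nothing UrlEncode would still change (apart from %), so plain alphanumeric text also passes.

```csharp
using System;
using System.Web;

// Re-encode the suspect string, undo the double-escaping of '%',
// and see whether anything else changed.
static bool LooksUrlEncoded(string suspect)
{
    string reEncoded = HttpUtility.UrlEncode(suspect).Replace("%25", "%");
    return reEncoded == suspect;
}

Console.WriteLine(LooksUrlEncoded("a%26b")); // True
Console.WriteLine(LooksUrlEncoded("a&b"));   // False

// Ambiguous: this is True whether the text is an encoded "50%" or a literal "50%25".
Console.WriteLine(LooksUrlEncoded("50%25")); // True
```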