The World’s Leading Microsoft .NET Magazine
   
 
timstall


                                                           

Using HttpUtility.UrlEncode to Encode your QueryStrings

posted Wednesday, 23 February 2005

Perhaps the most popular way to pass data between web pages is via querystrings. Querystrings are used both to pass data to a new pop-up window and to navigate between pages. (Side note: A querystring is the part of the URL that occurs after the ?. So in http://localhost/myWeb?id=3&name=Tim, id=3&name=Tim is the querystring. The querystring provides name-value pairs in the form of ?name1=value1&name2=value2...)

While this works great for simple alphanumerics, passing special characters in the URL can be a problem, especially across different browsers.

  • An ampersand would split the name-value pairs. If you want to pass the value "A&B", but the & indicates a new name-value pair, then the value will be truncated to just "A". For example, in "id=A&B", getting the querystring "id" will return just "A", and B will be parsed as a separate item.
  • Apostrophes and greater-than or less-than signs may be interpreted as a cross-site scripting attack by some security plug-ins. As a result, these plug-ins may block the entire page.
  • Other special characters (like slash or space) may be lost or distorted when sent in a URL.
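
To make the ampersand problem concrete, here is a minimal sketch. It uses HttpUtility.ParseQueryString as a stand-in for what a page would see through Request.QueryString (that substitution is an assumption made for the demo), and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

// Hypothetical demo class; not part of the original post.
public static class AmpersandDemo
{
    // Returns the value stored under "id" after parsing the given querystring.
    public static string GetId(string queryString)
    {
        NameValueCollection pairs = HttpUtility.ParseQueryString(queryString);
        return pairs["id"];
    }

    public static void Main()
    {
        // The raw ampersand ends the "id" value early.
        Console.WriteLine(GetId("id=A&B"));    // A

        // The encoded ampersand (%26) survives as part of the value.
        Console.WriteLine(GetId("id=A%26B"));  // A&B
    }
}
```

The second call shows where we are headed: once the & is encoded, the full value comes back intact.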

While some may argue that querystring values should only contain simple IDs, there are legitimate benefits to being able to pass special characters. For example:

  • Legacy Systems - The client's legacy system could include & or ' in the primary key.
  • Performance - You could be returning a value (such as a name like "O'reilly" or "Johnson & Sons") from a pop-up control. Just passing the id would require re-hitting the database. Therefore you could pass the name as well to help performance.

Fortunately there is a solution for handling special characters. .NET lets us encode and decode the URL using System.Web.HttpUtility.UrlEncode and HttpUtility.UrlDecode. (Note that this is not HtmlEncode, which escapes text for HTML. HtmlEncode turns & into &amp;, which still contains a raw ampersand and so doesn't help in a querystring. We want the Url versions.) UrlEncode replaces problematic characters with URL-friendly equivalents.
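
As a rough sketch of the round trip, the snippet below encodes one value, decodes it again, and prints the HtmlEncode result for contrast. The "name" key and the EncodeDemo class are placeholders chosen for the example.

```csharp
using System;
using System.Web;

// Hypothetical demo class; not part of the original post.
public static class EncodeDemo
{
    // Builds a one-pair querystring whose value is URL-encoded.
    public static string BuildQuery(string value)
    {
        return "name=" + HttpUtility.UrlEncode(value);
    }

    public static void Main()
    {
        string original = "Johnson & Sons";
        string query = BuildQuery(original);
        Console.WriteLine(query);  // name=Johnson+%26+Sons

        // Decoding the value portion gives back the original text.
        string decoded = HttpUtility.UrlDecode(query.Substring("name=".Length));
        Console.WriteLine(decoded == original);  // True

        // HtmlEncode is the wrong tool here: the output still has a raw ampersand.
        Console.WriteLine(HttpUtility.HtmlEncode(original));  // Johnson &amp; Sons
    }
}
```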

The following table shows what UrlEncode translates:

ASCII Codes
Dec   Hex   Character   UrlEncode
32    20    (space)     +
34    22    "           %22
35    23    #           %23
36    24    $           %24
37    25    %           %25
38    26    &           %26
43    2B    +           %2b
44    2C    ,           %2c
47    2F    /           %2f
58    3A    :           %3a
59    3B    ;           %3b
60    3C    <           %3c
61    3D    =           %3d
62    3E    >           %3e
63    3F    ?           %3f
64    40    @           %40
91    5B    [           %5b
92    5C    \           %5c
93    5D    ]           %5d
94    5E    ^           %5e
96    60    `           %60
123   7B    {           %7b
124   7C    |           %7c
125   7D    }           %7d
126   7E    ~           %7e

While alpha-numerics aren't affected, these special characters aren't encoded either:

ASCII Codes
Dec   Hex   Character
95    5F    _
45    2D    -
46    2E    .
39    27    '
40    28    (
41    29    )
42    2A    *
33    21    !

Side note 1: You can see the full ASCII tables online.

Side note 2: You can generate these tables with a simple loop like so:

// Print each 7-bit ASCII character next to its UrlEncode output.
for (int i = 0; i < 128; i++)
{
    string s = ((char)i).ToString();
    Console.WriteLine(i + " Char: [" + s + "], UrlEncode: [" + System.Web.HttpUtility.UrlEncode(s) + "]");
}

Essentially UrlEncode replaces many problematic characters with "%" + their ASCII Hex equivalents.
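
To illustrate the pattern, here is a small sketch that builds the "%" plus hex form by hand. PercentHex is a hypothetical helper written for this example, not how UrlEncode is implemented internally, and it only makes sense for single-byte ASCII characters.

```csharp
using System;

// Hypothetical demo class; not part of the original post.
public static class HexDemo
{
    // Returns "%" followed by the character's two-digit lowercase hex code.
    // Assumes c is a 7-bit ASCII character.
    public static string PercentHex(char c)
    {
        return "%" + ((int)c).ToString("x2");
    }

    public static void Main()
    {
        Console.WriteLine(PercentHex('&'));  // %26
        Console.WriteLine(PercentHex('/'));  // %2f
        Console.WriteLine(PercentHex('~'));  // %7e
    }
}
```

These match the UrlEncode column in the table above. The space is the exception, since UrlEncode writes it as + rather than %20.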

Most of these remaining special characters don't pose a problem, except for the apostrophe, which can cause cross-site scripting warnings. One solution is to replace it with a unique token value, such as "%27". Note that we pick a reasonable token: "27" is the ASCII Hex for the apostrophe, and it follows the pattern of the other encodings. We can then write our own Encode and Decode methods that first apply UrlEncode and then replace the apostrophe with the token value. These methods could be abstracted to their own utility class:

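// Token that stands in for the apostrophe; "27" is its ASCII hex code.
private const string _strApostropheEncoding = "%27";

// UrlEncode the value, then encode any apostrophe that UrlEncode left alone.
public static string UrlFullEncode(string strUrl)
{
    if (strUrl == null)
        return "";
    strUrl = System.Web.HttpUtility.UrlEncode(strUrl);
    return strUrl.Replace("'", _strApostropheEncoding);
}

// Reverse of UrlFullEncode: restore the apostrophe, then UrlDecode the rest.
public static string UrlFullDecode(string strUrl)
{
    if (strUrl == null)
        return "";
    strUrl = strUrl.Replace(_strApostropheEncoding, "'");
    return System.Web.HttpUtility.UrlDecode(strUrl);
}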
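
Here is a minimal sketch of how the two methods might be wrapped and called. The UrlHelper class name and the sample value are assumptions made for the example. If your version of UrlEncode already turns the apostrophe into %27 on its own, the extra Replace simply has nothing left to do, and the result is the same.

```csharp
using System;
using System.Web;

// Hypothetical utility class wrapping the two methods from the post.
public static class UrlHelper
{
    private const string ApostropheEncoding = "%27";

    public static string UrlFullEncode(string url)
    {
        if (url == null) return "";
        return HttpUtility.UrlEncode(url).Replace("'", ApostropheEncoding);
    }

    public static string UrlFullDecode(string url)
    {
        if (url == null) return "";
        return HttpUtility.UrlDecode(url.Replace(ApostropheEncoding, "'"));
    }

    public static void Main()
    {
        string name = "O'reilly & Sons";

        // Apostrophe, spaces, and ampersand are all made URL-safe.
        string encoded = UrlFullEncode(name);
        Console.WriteLine(encoded);  // O%27reilly+%26+Sons

        // Decoding restores the original value.
        Console.WriteLine(UrlFullDecode(encoded) == name);  // True
    }
}
```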
 

 





1. Isabelle left...
Tuesday, 25 April 2006 1:22 pm :: http://MyHome.ca

HttpUtility.UrlDecode("http://MyHome.ca")

When you use HttpUtility.UrlDecode, does it preserve upper-case letters, or does it decode the URL to lower case?


2. Tim Stall left...
Wednesday, 26 April 2006 4:54 pm

Offhand I'm not sure - although I thought URLs were case-insensitive. You could always just try it out, just make a quick method to test different calls to HttpUtility.


3. Doug left...
Thursday, 14 September 2006 12:30 pm

Do you have any idea why a space (' ') is represented by a plus ('+') instead of '%20'?


4. Adam Vandenberg left...
Thursday, 7 December 2006 11:31 am :: http://adamv.com/

When a space is included in a QueryString parameter, it is "slightly more correct" to encode it as a + instead of %20.


5. Jeff left...
Thursday, 18 October 2007 6:29 pm

Is there anything in C# to validate that a string has been UrlEncode'd? I suppose you could UrlEncode the suspect string and then replace '%25' with '%' and then compare the before and after. If they are equal the string is UrlEncoded. If not, the suspect string was in fact not UrlEncoded.

Any better way?


6. Tim Stall left...
Thursday, 18 October 2007 8:54 pm :: http://timstall.dotnetdevelopersjournal.

I don't think so. You could do something like what you've suggested to give a probable answer. But, I think the problem is that you could have a literal string that coincidentally looked like it had been encoded, and there'd be no way to tell. (You'd have a similar problem in any escaped-like literal string). I could be missing something, but I don't think this is possible.