WebClient.DownloadFile 404 errors with HTML characters in URI?

I am using the WebClient class to download files from a website and have a couple of questions.

1. When the URIs have HTML characters in the URI path (e.g. http://foo.com/path1 & path2.pdf), I get 404 (Not Found) errors. How can I prevent this? I thought HTML characters were safe.
2. When the URIs represent a directory (e.g. http://foo.com/path), I get 403 (Forbidden) errors. I understand why this is happening, but how can I check whether my URI represents a directory without an index page?

0

eft May 21 '09 @ 4:15 am


1 answer

1. HTML-encoded characters are not URL safe. If your data is HTML encoded, use HttpUtility.HtmlDecode first to get a properly formatted URL (e.g. foo.com/page?foo=1&bar=2). If special characters such as ampersands need to appear in the URL without acting as query-string separators, you also need to URL-encode them with HttpUtility.UrlEncode.
2. You cannot. The URI alone does not tell you whether the server treats it as a directory without an index page; you can only make the request and handle the 403 response.
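
As a rough illustration of the two points above, here is a minimal sketch. The class and method names (DownloadSketch, DecodeLink, EncodeQueryValue, TryDownload) are hypothetical and not from the question, and the code assumes the System.Web assembly is available for HttpUtility. It decodes an HTML-encoded link, URL-encodes a value that contains a literal ampersand, and treats a 403 or 404 as "not downloadable" instead of letting the exception escape.

```csharp
using System;
using System.Net;
using System.Web;

public static class DownloadSketch
{
    // Undo HTML encoding picked up from page markup, e.g. "&amp;" becomes "&".
    public static string DecodeLink(string htmlEncodedLink)
    {
        return HttpUtility.HtmlDecode(htmlEncodedLink);
    }

    // Encode a single query-string value so that a literal "&" inside it
    // is not read as a parameter separator.
    public static string EncodeQueryValue(string value)
    {
        return HttpUtility.UrlEncode(value);
    }

    // Attempt the download; return false when the server answers
    // 403 (Forbidden) or 404 (Not Found), rethrow anything else.
    public static bool TryDownload(string url, string localPath)
    {
        try
        {
            using (var client = new WebClient())
            {
                client.DownloadFile(url, localPath);
            }
            return true;
        }
        catch (WebException ex)
        {
            var response = ex.Response as HttpWebResponse;
            if (response != null &&
                (response.StatusCode == HttpStatusCode.Forbidden ||
                 response.StatusCode == HttpStatusCode.NotFound))
            {
                return false;
            }
            throw;
        }
    }
}
```

A caller would run each scraped link through DecodeLink before passing it to TryDownload, and skip the entries for which TryDownload returns false. Note that HttpUtility.UrlEncode is meant for individual values, not whole URLs: applying it to a complete URL would also encode the "://" and "/" characters.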

+2

Randolpho May 21 '09 @ 4:23 am

