WebClient.DownloadFile 404 errors with HTML characters in URI?

I am using the WebClient class to download files from a website and have a couple of questions.

  • When the URIs have HTML characters in the URI path (e.g. http://foo.com/path1 & path2.pdf) I get 404 (Not Found) errors. How can I prevent this? I thought HTML characters were safe?

  • When the URIs represent a directory (e.g. http://foo.com/path) I get 403 (Forbidden) errors. I understand why this is happening, but how can I check whether my URI represents a directory without an index page?



1 answer


  • HTML-encoded characters are not URL-safe. If your data is HTML-encoded, first use HttpUtility.HtmlDecode to get a properly formatted URL (e.g. foo.com/page?foo=1&bar=2). If special characters such as ampersands need to appear in the URL as data, rather than as part of the request URL's own syntax, you also need to URL-encode them with HttpUtility.UrlEncode.
  • You cannot; the URI alone does not tell you.
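
As a rough illustration of the first point, here is a minimal sketch of the decode-then-encode sequence. The helper name BuildFileUrl and the sample file name are hypothetical, and it assumes the file name reached you HTML-encoded (with &amp;). It swaps in WebUtility.HtmlDecode and Uri.EscapeDataString for the HttpUtility calls mentioned above, so that it compiles without a reference to System.Web; treat it as one possible approach, not the only one.

```csharp
using System;
using System.Net;

public static class UrlEncodingSketch
{
    // Hypothetical helper: turns an HTML-encoded file name into a
    // percent-encoded path segment appended to a base URL.
    public static string BuildFileUrl(string baseUrl, string htmlEncodedFileName)
    {
        // Step 1: undo the HTML encoding, e.g. "&amp;" becomes "&".
        string fileName = WebUtility.HtmlDecode(htmlEncodedFileName);

        // Step 2: percent-encode the segment so that spaces and '&'
        // are safe to send as part of the URL path.
        string segment = Uri.EscapeDataString(fileName);

        return baseUrl.TrimEnd('/') + "/" + segment;
    }

    public static void Main()
    {
        // Assumed sample input, modelled on the URL in the question.
        string url = BuildFileUrl("http://foo.com", "path1 &amp; path2.pdf");
        Console.WriteLine(url); // http://foo.com/path1%20%26%20path2.pdf
    }
}
```

The resulting string can then be passed to WebClient.DownloadFile. Uri.EscapeDataString is used here rather than HttpUtility.UrlEncode because the latter writes spaces as "+", which suits query strings better than path segments.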

