How to convert a webpage apostrophe (’) to ASCII 39 in Ruby 1.8.7
That's pretty much it. I am using Nokogiri to clean up a webpage that has & # 8217; characters in it, and I can't figure out how to do the conversion. Here is what I tried:
str.gsub(/’/,"'")
str.gsub("’","'")
str.gsub("ΓÇÖ","'") # that's how it looks when I do a puts
(In the above, there is no space between & # 8217 and ";", but if I don't put the space in, SO converts it to an apostrophe - cruel, cruel irony!)
I'm sure this is covered somewhere, but I can't find a solution here or elsewhere on the internet.
TIA
1 answer
str.gsub("\342\200\231", "'")
should work.
I got this from:
'’'.to_s
=> "\342\200\231"
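The octal escape above is the UTF-8 encoding of code point 8217 (U+2019), and those same three bytes are what a CP437 console renders as ΓÇÖ. A minimal sketch of how to derive the escape for any code point, assuming only core `Array#pack` and `String#unpack` (the variable names are illustrative):

```ruby
# Code point 8217 (U+2019) is the right single quotation mark.
curly = [8217].pack("U")      # encode the code point as UTF-8

# Inspect the raw bytes of the encoded character.
bytes = curly.unpack("C*")    # three bytes: 226, 128, 153

# Format each byte as an octal escape, the form Ruby 1.8.7 prints.
octal = bytes.map { |b| "\\%o" % b }.join

puts bytes.inspect
puts octal
```

The printed escape can then be pasted into a double-quoted string and passed to `gsub`.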
Other characters that can be replaced the same way (from http://ask.metafilter.com/62656/Eliminating-odd-characters-from-web-site ):
"\342\200\176" - "'"
"\342\200\177" - "'"
"\342\200\230" - "'"
"\342\200\231" - "'"
"\342\200\232" - ','
"\342\200\233" - "'"
"\342\200\234" - '"'
"\342\200\235" - '"'
"\342\200\041" - '-'
"\342\200\174" - '-'
"\342\200\220" - '-'
"\342\200\223" - '-'
"\342\200\224" - '--'
"\342\200\225" - '--'
"\342\200\042" - '--'
"\342\200\246" - '...'
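Rather than chaining one `gsub` per character, the replacements can be driven from a lookup table. This is only a sketch using a subset of the entries above (curly quotes, dashes, and the ellipsis). The names `ASCII_PUNCTUATION`, `PUNCTUATION_PATTERN`, and `asciify_punctuation` are made up for illustration. It uses the block form of `gsub` because Ruby 1.8.7 does not accept a hash as the replacement argument.

```ruby
# Hypothetical lookup table: UTF-8 byte sequence => ASCII replacement.
ASCII_PUNCTUATION = {
  "\342\200\230" => "'",    # U+2018 left single quote
  "\342\200\231" => "'",    # U+2019 right single quote
  "\342\200\234" => '"',    # U+201C left double quote
  "\342\200\235" => '"',    # U+201D right double quote
  "\342\200\223" => "-",    # U+2013 en dash
  "\342\200\224" => "--",   # U+2014 em dash
  "\342\200\246" => "..."   # U+2026 ellipsis
}

# One pattern that matches any key; Regexp.union escapes each string.
PUNCTUATION_PATTERN = Regexp.union(*ASCII_PUNCTUATION.keys)

# Replace every matched sequence with its ASCII equivalent in a single pass.
def asciify_punctuation(str)
  str.gsub(PUNCTUATION_PATTERN) { |match| ASCII_PUNCTUATION[match] }
end

puts asciify_punctuation("It\342\200\231s \342\200\234fine\342\200\235 \342\200\224 really\342\200\246")
# prints: It's "fine" -- really...
```

Adding another character is then a one-line change to the table.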