A clearer way to parse a token from a Ruby string
I'm trying to clean something up and am looking for ways to do it. The idea is that instead of using regexes in my rules to parse a string, I would like to use something closer to the route syntax, such as "/something/:searchitem/somethingelse", and then, given a string like "/something/FOUNDIT/somethingelse", get back the result "FOUNDIT".
Here's the example I'm refactoring. Given an input string, say "http://claimid.com/myusername", I want to be able to run it against multiple possible patterns and return "myusername" from whichever one matches.
The data driving it might look like this:
PROVIDERS = [
"http://openid.aol.com/:username",
"http://:username.myopenid.com",
"http://claimid.com/:username",
"http://:username.livejournal.com"]
something_here("http://claimid.com/myusername") # => "myusername"
Is there a good way to match a string like http://claimid.com/myusername against this list and extract the result? Or are there any techniques that make something like this easier? I've been looking through the Rails routing code, since it does something like this, but it's not the easiest code to follow.
Right now I just do it with regexes, but the approach above looks like it would be much easier to read:
PROVIDERS = [
/http:\/\/openid.aol.com\/(\w+)/,
/http:\/\/(\w+).myopenid.com/,
/http:\/\/(\w+).livejournal.com/,
/http:\/\/flickr.com\/photos\/(\w+)/,
/http:\/\/technorati.com\/people\/technorati\/(\w+)/,
/http:\/\/(\w+).wordpress.com/,
/http:\/\/(\w+).blogspot.com/,
/http:\/\/(\w+).pip.verisignlabs.com/,
/http:\/\/(\w+).myvidoop.com/,
/http:\/\/(\w+).pip.verisignlabs.com/,
/http:\/\/claimid.com\/(\w+)/]
url = "http://claimid.com/myusername"
username = PROVIDERS.collect { |provider|
url[provider, 1]
}.compact.first
I think your best bet is to generate the regexes, as Elazar suggested earlier. If you only need to match one field (:username), then something like this will work:
PROVIDERS = [
"http://openid.aol.com/:username/",
"http://:username.myopenid.com/",
"http://:username.livejournal.com/",
"http://flickr.com/photos/:username/",
"http://technorati.com/people/technorati/:username/",
"http://:username.wordpress.com/",
"http://:username.blogspot.com/",
"http://:username.pip.verisignlabs.com/",
"http://:username.myvidoop.com/",
"http://:username.pip.verisignlabs.com/",
"http://claimid.com/:username/"
]
MATCHERS = PROVIDERS.collect do |provider|
parts = provider.split(":username")
Regexp.new(Regexp.escape(parts[0]) + '(.*)' + Regexp.escape(parts[1] || ""))
end
def extract_username(url)
MATCHERS.collect {|rx| url[rx, 1]}.compact.first
end
It's very similar to your own code, only the provider list is much cleaner, which makes it easier to maintain and to add new providers as needed.
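A quick way to try the approach above, as a minimal sketch only. It uses a shortened provider list under different constant names, and it adds two assumptions that are not in the code above: the trailing slash is treated as optional, so the question's example URL (which has none) still matches, and the capture is narrowed to `[^/.]+` so it stops at the next dot or slash.

```ruby
# Shortened, illustrative provider list (same pattern style as above).
PROVIDER_PATTERNS = [
  "http://openid.aol.com/:username/",
  "http://:username.myopenid.com/",
  "http://claimid.com/:username/"
].freeze

# Build one anchored regex per pattern. Making the trailing slash optional
# and capturing with [^/.]+ are assumptions added for this sketch.
USERNAME_MATCHERS = PROVIDER_PATTERNS.map do |pattern|
  prefix, suffix = pattern.chomp("/").split(":username", 2)
  Regexp.new('\A' + Regexp.escape(prefix) + '([^/.]+)' +
             Regexp.escape(suffix.to_s) + '/?\z')
end

# Return the username captured by the first matching pattern, or nil.
def find_username(url)
  USERNAME_MATCHERS.each do |rx|
    name = url[rx, 1]
    return name if name
  end
  nil
end

puts find_username("http://claimid.com/myusername")
puts find_username("http://bob.myopenid.com/")
p find_username("http://example.com/nobody")
```

Anchoring with `\A` and `\z` keeps a pattern from matching in the middle of a longer, unrelated URL.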
How about String#include? or String#index?
url.include? "myuserid"
Or do you need the position as well? If so, you can split the URL.
Third thought: taking your input form with :username, build and compile a Regexp for each such line and use Regexp#match to get the MatchData. If you keep pairs of the Regexp and the index of the :username field, you can pull the value out directly.
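The third idea could be sketched like this. It swaps the "field index" bookkeeping for named capture groups, so the MatchData can be read by field name. The `compile_pattern` helper, the `[^/.]+` capture, and the optional trailing slash are hypothetical choices for illustration, not something taken from the question.

```ruby
# Turn a pattern such as "http://:username.livejournal.com" into a Regexp
# whose :name placeholders become named capture groups.
def compile_pattern(pattern)
  # Regexp.escape leaves ":" and "/" untouched, so placeholders survive it.
  source = Regexp.escape(pattern).gsub(/:(\w+)/) { "(?<#{$1}>[^/.]+)" }
  Regexp.new('\A' + source + '/?\z')
end

rx = compile_pattern("http://:username.livejournal.com")
match = rx.match("http://alice.livejournal.com/")

# MatchData supports lookup by group name.
puts match[:username] if match
```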
I still think regexes are the solution here. However, you need to write code that generates a regex from a route-like string. Sample code:
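class Router
  def initialize(routing_word)
    # Placeholder names such as ":username", in order of appearance
    @routes = routing_word.scan(/:\w+/)
    # Escape the literal text first (Regexp.escape leaves ":" and "/" alone),
    # then turn every placeholder into a capture group
    pattern = Regexp.escape(routing_word).gsub(/:\w+/) { '(\w+)' }
    @regex = Regexp.new('^' + pattern + '$')
  end

  # Returns a hash of placeholder => captured value, or nil if there is no match
  def match(url)
    matches = url.match(@regex)
    return nil unless matches
    @routes.zip(matches.captures).to_h
  end
end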
r = Router.new('|:as|:sa')
puts r.match('|a|b').map {|k,v| "#{k} => #{v}\n"}
Create one Router per route string. Each one returns a hash that maps the placeholder names in the route to the corresponding components of the actual URL.
To recognize a given URL, go through all the routers and find the one that accepts it.
class OpenIDRoutes
  def initialize
    @routes = [
      "http://openid.aol.com/:username/",
      "http://:username.myopenid.com/",
      "http://:username.livejournal.com/",
      "http://flickr.com/photos/:username/",
      "http://technorati.com/people/technorati/:username/",
      "http://:username.wordpress.com/",
      "http://:username.blogspot.com/",
      "http://:username.pip.verisignlabs.com/",
      "http://:username.myvidoop.com/",
      "http://:username.pip.verisignlabs.com/",
      "http://claimid.com/:username/"
    ].map { |x| Router.new(x) }
  end

  # Given a URL, return the result of the first route that fits it (or nil)
  def route(url)
    @routes.each do |r|
      res = r.match(url)
      return res if res
    end
    nil
  end
end

r = OpenIDRoutes.new
# The routes above end in "/", so the URL needs the trailing slash to match
puts r.route("http://claimid.com/myusername/")
I think this is a good, simple implementation of most of what route matching needs to do.