How do I extract JavaScript links in an HTML document?
I am writing a small web spider for a site that uses a lot of JavaScript for links:
<htmlTag onclick="someFunction();">Click here</htmlTag>
where the function looks like:
function someFunction() {
    var _url;
    ...
    // _url constructed, maybe with reference to a value in the HTML doc
    // and/or a value passed as argument(s) to this function
    ...
    window.location.href = _url;
}
What is the best way to evaluate this function on the server side so that I can get the value of _url?
Richard
4 answers
This can get messy, and it depends on several things:
- Where is the link stored? Inside an element, in a JavaScript variable, etc.
- Is the JavaScript function always your own?
One hint that might do the trick is to simply parse your HTML and use a regex to catch http links wherever the onclick="someFunction();" attribute is present.
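As a rough illustration of that hint, here is a minimal sketch in JavaScript (Node.js). The helper name extractOnclickCalls and the sample markup are hypothetical, and it assumes the URL appears literally inside the onclick handler. It cannot recover a URL that someFunction only builds at runtime.

```javascript
// Hypothetical helper: find onclick handlers that call `functionName`
// and collect any absolute http(s) URLs written inside those handlers.
function extractOnclickCalls(html, functionName) {
  // Matches onclick="..." or onclick='...' and captures the attribute value.
  const onclickRe = /onclick\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  // Matches an http(s) URL up to whitespace, a quote, an angle bracket or ")".
  const urlRe = /https?:\/\/[^\s'"<>)]+/g;
  const results = [];
  let match;
  while ((match = onclickRe.exec(html)) !== null) {
    const handler = match[1] !== undefined ? match[1] : match[2];
    // Skip handlers that do not call the function of interest.
    if (!handler.includes(functionName + "(")) continue;
    results.push({ handler: handler, urls: handler.match(urlRe) || [] });
  }
  return results;
}

// Example usage with made-up markup.
const sample = '<span onclick="someFunction(\'http://example.com/page\');">Click here</span>';
const found = extractOnclickCalls(sample, "someFunction");
console.log(found);
```

A regex is enough for a quick spider against one known site, but it is fragile against unusual quoting or handlers attached from script, so a real HTML parser is the safer choice if the markup varies.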