Is there a way to compute XPath from a known value in a web page?

Question

As I understand it, XPath is a way to navigate through elements in an XML document.
The usual direction is XPath → element.
How do you go the other way? That is, how do you calculate the XPath from the known value of an element?

For example, how would you find the XPath of the "faq" link in the Stack Overflow header?
The language is not that important; I'm more interested in the algorithm and/or the libraries and methods that will help me compute the XPath.

0

javascript xpath

David May 23 '09 at 3:51

3 answers

Since XPath can select the nth child element (i.e. /parentelement/child_element[2]), if you can figure out where in the tree the element is, then you should be able to generate the XPath back to it.
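
As a rough sketch of this idea (not taken from any answer here), the function below walks from an element up to the document and emits one step per level, adding a 1-based `[n]` only when a sibling shares the element's name. The name `getElementXPath` is hypothetical. It assumes DOM-style nodes exposing `nodeType`, `localName`, `parentNode`, `previousSibling` and `nextSibling`, and it ignores namespaces.

```javascript
// Sketch only: getElementXPath is a hypothetical name, not an existing API.
// Assumes DOM-style nodes with nodeType, localName, parentNode,
// previousSibling and nextSibling; namespaces are ignored.
function getElementXPath(element) {
    var ELEMENT_NODE = 1; // numeric value of Node.ELEMENT_NODE
    var xpath = "";
    var node = element;
    // Climb towards the root, prepending one location step per element.
    while (node != null && node.nodeType === ELEMENT_NODE) {
        var index = 1;       // 1-based position among same-named siblings
        var hasTwin = false; // true if any sibling shares this element's name
        var sib;
        for (sib = node.previousSibling; sib != null; sib = sib.previousSibling) {
            if (sib.nodeType === ELEMENT_NODE && sib.localName === node.localName) {
                index++;
                hasTwin = true;
            }
        }
        for (sib = node.nextSibling; sib != null && !hasTwin; sib = sib.nextSibling) {
            if (sib.nodeType === ELEMENT_NODE && sib.localName === node.localName) {
                hasTwin = true;
            }
        }
        // Only add [n] when the name alone would be ambiguous.
        xpath = "/" + node.localName + (hasTwin ? "[" + index + "]" : "") + xpath;
        node = node.parentNode;
    }
    return xpath;
}
```

In a browser the result can be sanity-checked by passing it back to `document.evaluate` and confirming that the same element comes out.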

+1

Thanatos May 23 '09 at 3:59

You have not specified a language, which makes it difficult to answer this question. Python's lxml module can do this:

>>> from lxml import etree
>>> a  = etree.Element("a")
>>> b  = etree.SubElement(a, "b")
>>> c  = etree.SubElement(a, "c")
>>> d1 = etree.SubElement(c, "d")
>>> d2 = etree.SubElement(c, "d")

>>> tree = etree.ElementTree(c)
>>> print(tree.getpath(d2))
/c/d[2]
>>> tree.xpath(tree.getpath(d2)) == [d2]
True

Even if you are not using Python, you can find what you need in the module's source code.

+1

SpliFF May 23 '09 at 4:01

Matthew flaschen · Accepted Answer · 2009-05-23T07:40:24+0000

Here's a simple JS function. It only uses previousSibling, nodeType and parentNode, so it should be portable to other languages. However, the result is not readable (to humans), and it won't be particularly robust across page changes, since every step depends on an exact node position.

In my experience XPath is more useful when written by hand. However, you can make an algorithm that generates prettier (if possibly slower) results.

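function getXPath(node)
{
    // The document node itself is the XPath root.
    if (node == document)
        return "/";
    var xpath = "";
    // Walk up the tree, prepending one location step per level.
    while (node != null && node.nodeType != Node.DOCUMENT_NODE)
    {
        // Count the node's 1-based position among all of its siblings.
        var pos = 1, prev;
        while ((prev = node.previousSibling) != null)
        {
            node = prev;
            pos++;
        }
        // node is now the first sibling, which has the same parent.
        xpath = "/node()[" + pos + "]" + xpath;
        node = node.parentNode;
    }
    return xpath;
}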
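
The function above starts from a node, while the question starts from a known value. One way to bridge that gap is a depth-first search that returns the element containing the text, which can then be handed to getXPath. The sketch below assumes the value is the exact (trimmed) content of a single text node. The name `findElementByText` is hypothetical, and the nodes are assumed to expose `nodeType`, `nodeValue`, `firstChild` and `nextSibling`.

```javascript
// Sketch only: findElementByText is a hypothetical helper, not an existing API.
// Assumes DOM-style nodes with nodeType, nodeValue, firstChild and nextSibling.
function findElementByText(root, text) {
    var TEXT_NODE = 3; // numeric value of Node.TEXT_NODE
    // Depth-first walk over root's descendants in document order.
    for (var child = root.firstChild; child != null; child = child.nextSibling) {
        if (child.nodeType === TEXT_NODE) {
            if (child.nodeValue.trim() === text) {
                return root; // root is the node that directly holds the text
            }
        } else {
            var found = findElementByText(child, text);
            if (found != null) {
                return found;
            }
        }
    }
    return null; // no text node matched
}
```

In a browser this might be used as `var link = findElementByText(document.body, "faq");` followed by `getXPath(link)` once `link` is known to be non-null.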
