Need help accessing PHP DOM elements

Hi guys, I have the following HTML structure that I am trying to extract data from:

// Product 1
<div class="productName">
 <span id="product-name-1">Product Name 1</span>
</div>

<div class="productDetail">            
 <span class="warehouse">Warehouse 1, ACT</span>                
 <span class="quantityInStock">25</span>
</div>

// Product 2
<div class="productName">
 <span id="product-name-2">Product Name 2</span>
</div>

<div class="productDetail">            
 <span class="warehouse">Warehouse 2, ACT</span>                
 <span class="quantityInStock">25</span>
</div>


// Product X
<div class="productName">
 <span id="product-name-X">Product Name X</span>
</div>

<div class="productDetail">            
 <span class="warehouse">Warehouse X, ACT</span>                
 <span class="quantityInStock">25</span>
</div>

      

I have no control over the original HTML and, as you can see, each productName div and its companion productDetail div are not wrapped in a common parent element.

I am now using the following PHP code to try and parse the page.

$html = new DOMDocument();
$html->loadHtmlFile('product_test.html');

$xPath = new DOMXPath($html);

$domQuery = '//div[@class="productName"]|//div[@class="productDetail"]';

$entries = $xPath->query($domQuery);

foreach ($entries as $entry) { 
 echo "Detail: " . $entry->nodeValue . "<br />\n";
}

      

Which prints the following:

Detail: Product Name 1
Detail: Warehouse 1, ACT
Detail: 25
Detail: Product Name 2
Detail: Warehouse 2, ACT
Detail: 25
Detail: Product Name X
Detail: Warehouse X, ACT
Detail: 25

      

Now this is close to what I want. But I need to do some processing on each product's name, warehouse and quantity, and I cannot figure out how to split the results into separate product groups. The end result I am after is something like:

Product 1:
Name: Product Name 1
Warehouse: Warehouse 1, ACT
Stock: 25

Product 2:
Name: Product Name 2
Warehouse: Warehouse 2, ACT
Stock: 25 

      

I just can't figure it out, and I can't wrap my head around this DOM stuff, as the node lists don't quite work the same way as a standard array.

If anyone can help or point me in the right direction, I would be very grateful.

1 answer


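This may not be the most efficient way, but you can loop over the productName divs and, for each one, run a second query relative to it to fetch the first productDetail div that follows it: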

$html = new DOMDocument();
$html->loadHtmlFile('test2.php');

$xPath = new DOMXPath($html);

foreach( $xPath->query('//div[@class="productName"]') as $prodName ) { 
  $prodDetail = $xPath->query('following-sibling::div[@class="productDetail"][1]', $prodName);
  if ($prodDetail->length === 0) {
    continue; // no companion productDetail div found; skip this entry
  }
  $prodDetail = $prodDetail->item(0);
  echo "Name: " . $prodName->nodeValue . "<br />\n";
  echo "Detail: " . $prodDetail->nodeValue . "<br />\n";
  echo "----\n";
}

      



prints

Name: 
 Product Name 1
<br />
Detail:             
 Warehouse 1, ACT                
 25
<br />
----
Name: 
 Product Name 2
<br />
Detail:             
 Warehouse 2, ACT                
 25
<br />
----
Name: 
 Product Name X
<br />
Detail:             
 Warehouse X, ACT                
 25
<br />
----
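
To get from there to the per-product groups asked for in the question, one option is to read the two spans inside each productDetail div separately and collect everything into a plain PHP array. The following is only a sketch under a few assumptions: the markup is inlined as a string so the example runs on its own (the original loads a file), the $products array and its keys are names made up for illustration, and the class attributes are assumed to match exactly as shown in the question.

```php
<?php
// Sample markup copied from the question, inlined so the sketch is self-contained.
$markup = <<<'HTML'
<div class="productName">
 <span id="product-name-1">Product Name 1</span>
</div>
<div class="productDetail">
 <span class="warehouse">Warehouse 1, ACT</span>
 <span class="quantityInStock">25</span>
</div>
<div class="productName">
 <span id="product-name-2">Product Name 2</span>
</div>
<div class="productDetail">
 <span class="warehouse">Warehouse 2, ACT</span>
 <span class="quantityInStock">25</span>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xPath = new DOMXPath($doc);

// Hypothetical result structure: one associative array per product.
$products = [];
foreach ($xPath->query('//div[@class="productName"]') as $prodName) {
    // First productDetail div that follows this productName div.
    $detail = $xPath->query('following-sibling::div[@class="productDetail"][1]', $prodName)->item(0);
    if ($detail === null) {
        continue; // no companion detail block; skip this product
    }
    $products[] = [
        'name'      => trim($prodName->textContent),
        // evaluate() with string() returns a plain PHP string, relative to $detail
        'warehouse' => trim($xPath->evaluate('string(span[@class="warehouse"])', $detail)),
        'stock'     => (int) trim($xPath->evaluate('string(span[@class="quantityInStock"])', $detail)),
    ];
}

// Print the groups in the layout requested in the question.
foreach ($products as $i => $p) {
    echo 'Product ' . ($i + 1) . ":\n";
    echo 'Name: ' . $p['name'] . "\n";
    echo 'Warehouse: ' . $p['warehouse'] . "\n";
    echo 'Stock: ' . $p['stock'] . "\n\n";
}
```

Once the data is in $products you can process it like any other array. One caveat: if a productName div has no productDetail div of its own, the following-sibling query will pick up the next product's detail block, so this pairing relies on the page always emitting the two divs together.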

      
