HTML that doesn't parse correctly (but doesn't fail either) #45

@GoogleCodeExporter

Description
I've been using Fizzler with great success, but today I came across some HTML that silently failed to parse correctly.

I was selecting all of the <a> elements and noticed that one was being ignored. Here are the repro steps:

1. Load the HTML from http://pastebin.com/T1Lsr6w6 (this is the "View Source" output for http://www.diapers.com/product/productdetail.aspx?productid=16913).
2. Query the selector "#pdp".
3. Example code (assuming the string html holds the HTML above):

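using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack; // provides the QuerySelector extension method

var doc = new HtmlDocument();
doc.LoadHtml(html); // html holds the page source from step 1
var dom = doc.DocumentNode;
var pdpElement = dom.QuerySelector("#pdp"); // should match the <a id="pdp"> element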


What is the expected output? What do you see instead?
I expect pdpElement to be the HtmlNode for:
<a href="http://c1.diapers.com/images/products/p/pg/pg-256_1z.jpg" class="MagicZoomPlus" id="pdp" title="Pampers Sensitive Thick Baby Wipes Refill 360ct." target="_blank">

Instead, the query doesn't find a match.

What version of the product are you using? On what operating system?
Fizzler 0.9

Please provide any additional information below.

Original issue reported on code.google.com by portman....@gmail.com on 6 Apr 2011 at 7:36
