PageCrawlerExecutor's run() method logic error #2

@yzhang8015


Hi, crawler.
I think there may be a logic error in PageCrawlerExecutor's run() method.
The line " String link = normalizer.normalize(l); " always resolves the link relative to baseUrl, but in practice
each link should be resolved relative to the page it was found on, i.e. the current urlToCrawl.
I changed it to " String link = Utils.normalize(urlToCrawl.link(), l); ", where Utils.normalize is:
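public static String normalize(final String beginUrl, String url) {
    // Decode HTML-escaped ampersands before resolving the link.
    url = url.replaceAll("&amp;", "&");
    // Resolve the (possibly relative) link against the page it was found on.
    return UrlUtils.resolveUrl(beginUrl, url);
}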
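
To illustrate the difference, here is a minimal, self-contained sketch. It uses only java.net.URI from the JDK instead of the project's UrlUtils, and the class name RelativeLinkDemo, the resolve helper, and the example.com URLs are all made up for this example, so treat it as an illustration of the idea rather than the crawler's actual code.

```java
import java.net.URI;

public class RelativeLinkDemo {

    // Hypothetical helper: resolves an href against the URL of the page
    // on which it was found (not the crawler's real API).
    static String resolve(String pageUrl, String href) {
        // Decode HTML-escaped ampersands, as in the normalize() method above.
        String cleaned = href.replace("&amp;", "&");
        return URI.create(pageUrl).resolve(cleaned).toString();
    }

    public static void main(String[] args) {
        String baseUrl = "http://example.com/";                          // crawl start URL
        String currentPage = "http://example.com/docs/guide/index.html"; // page being parsed
        String href = "page2.html?a=1&amp;b=2";                          // relative link on that page

        // Resolved against the base URL: points to the site root.
        System.out.println(resolve(baseUrl, href));
        // prints http://example.com/page2.html?a=1&b=2

        // Resolved against the current page: points next to index.html.
        System.out.println(resolve(currentPage, href));
        // prints http://example.com/docs/guide/page2.html?a=1&b=2
    }
}
```

The two results differ as soon as the crawler leaves the top-level directory, which is why resolving against the page currently being crawled seems to be the right behavior.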

I do not know whether this is really an error, but after changing this code I get the result I wanted.

One more thing: this project is really good. I like it.
