Make sure inline links are fully qualified URLs during scrape process

When scraping documentation pages from the web, we should make sure that any links are converted to their fully qualified form, e.g. going from something like:

```markdown
[migrate your entire database at once](/self-hosted/latest/migration/entire-database/)
```

to 

```markdown
[migrate your entire database at once](https://docs.tigerdata.com/self-hosted/latest/migration/entire-database/)
```

Right now the LLM likes to quote the returned markdown chunks verbatim, so relative links like the former end up showing as weird broken text, whereas the latter render as working links. While we could maybe fix this via prompting as well, I think it's better to just eat the extra tokens in the embedding and make it easier for the LLMs to use.

It'll probably be easier/better to do this manipulation against the HTML source, rather than after we convert it to markdown.
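As a rough illustration of doing the rewrite at the HTML stage, here is a minimal sketch using only the Python standard library. The function name `qualify_links` and the `page_url` parameter are hypothetical, not part of the existing scraper, and the regex assumes quoted `href` attributes on `<a>` tags; a real HTML parser would be more robust.

```python
import re
from urllib.parse import urljoin

# Simplified pattern: captures a quoted href attribute inside an <a ...> tag.
# Assumption: attributes are quoted; unquoted hrefs are not handled.
_HREF_RE = re.compile(
    r'(<a\b[^>]*?\bhref\s*=\s*)(["\'])(.*?)\2',
    re.IGNORECASE | re.DOTALL,
)


def qualify_links(html: str, page_url: str) -> str:
    """Rewrite every <a href> in `html` to an absolute URL.

    `page_url` is the URL the page was scraped from (hypothetical parameter);
    relative and root-relative links are resolved against it.
    """

    def _replace(match: re.Match) -> str:
        prefix, quote, href = match.groups()
        # urljoin leaves already-absolute URLs (https:, mailto:, ...) unchanged.
        return f"{prefix}{quote}{urljoin(page_url, href)}{quote}"

    return _HREF_RE.sub(_replace, html)


if __name__ == "__main__":
    sample = '<a href="/self-hosted/latest/migration/entire-database/">migrate your entire database at once</a>'
    print(qualify_links(sample, "https://docs.tigerdata.com/self-hosted/latest/"))
```

Running this before the HTML-to-markdown conversion means the converter only ever sees absolute URLs, so the markdown chunks need no post-processing.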

Make sure inline links are fully qualified URLs during scrape process #12