visit-website-reworked

Forked from danielsig/visit-website

Reworked plugin with proxy functionality that lets LLMs "visit" websites by returning the links, image URLs, and text content of any web page.

Visit Website Reworked for LM Studio

An upgraded website-reading plugin for LM Studio with stronger reliability on modern websites, better diagnostics, and graceful fallback behavior.

Tools

  • Visit Website: fetches and returns key page data (title, headings, links, images, text content).
  • Download Images: downloads images from remote URLs (or extracts them from a page) and makes them viewable in the chat.

What was problematic before

The previous implementation worked on simple pages, but had major issues on real-world websites:

  • Hard failures on many websites (403, 401, 429, 503) with no recovery path.
  • Frequent anti-bot/WAF blocks (Cloudflare-like pages) ended in immediate tool failure.
  • Weak network resilience (no retry logic for transient failures).
  • Opaque errors (for example, a bare "Forbidden") with almost no context for debugging.
  • Poor behavior on JS-heavy / WAF-protected pages where HTML extraction was insufficient.
  • Image extraction often missed lazy-loaded attributes (data-src, srcset, etc.).
  • Fallback content via r.jina.ai could contain useful markdown image links, but those were not previously extracted as images.
  • Tool naming/semantics caused orchestration confusion in some model flows.

What is improved in this reworked version

Reliability and fallback

  • Added retry logic for direct page fetches.
  • Added automatic fallback via r.jina.ai when direct fetch is blocked or fails.
  • Added block/challenge detection for common anti-bot scenarios.
  • Added broader image extraction support:
    • HTML attributes: src, data-src, data-original, srcset
    • Markdown image links in fallback bodies (important for source: "jina" cases)
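The broader image extraction described above can be sketched roughly as follows. This is a minimal regex-based illustration, not the plugin's actual code: the function names `extractHtmlImageUrls` and `extractMarkdownImageUrls` are hypothetical, and a real implementation would more likely use a proper HTML parser.

```typescript
// Hypothetical helpers; a sketch of the idea, not the plugin's real implementation.

/** Append a trimmed, non-empty value if it is not already in the list. */
function pushUnique(list: string[], value: string): void {
  const v = value.trim();
  if (v !== "" && list.indexOf(v) === -1) list.push(v);
}

/** Collect image URLs from src, data-src, data-original and srcset on <img> tags. */
function extractHtmlImageUrls(html: string): string[] {
  const urls: string[] = [];
  const imgTag = /<img\b[^>]*>/gi;
  let tag: RegExpExecArray | null;
  while ((tag = imgTag.exec(html)) !== null) {
    const attr = /\s(src|data-src|data-original|srcset)\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
    let m: RegExpExecArray | null;
    while ((m = attr.exec(tag[0])) !== null) {
      const value = m[2] !== undefined ? m[2] : m[3];
      if (m[1].toLowerCase() === "srcset") {
        // srcset is a comma-separated list of "url descriptor" pairs.
        // Simplification: URLs that themselves contain commas are not handled.
        for (const candidate of value.split(",")) {
          pushUnique(urls, candidate.trim().split(/\s+/)[0]);
        }
      } else {
        pushUnique(urls, value);
      }
    }
  }
  return urls;
}

/** Collect image URLs from markdown image syntax: ![alt](url "optional title"). */
function extractMarkdownImageUrls(markdown: string): string[] {
  const urls: string[] = [];
  const mdImage = /!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)/g;
  let m: RegExpExecArray | null;
  while ((m = mdImage.exec(markdown)) !== null) {
    pushUnique(urls, m[1]);
  }
  return urls;
}
```

The markdown variant is the one that matters for fallback bodies, where the page arrives as markdown rather than HTML.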

Better observability

  • Visit Website now returns fetch metadata:
    • source (direct or jina)
    • finalUrl
    • statusCode
    • server
  • Errors now include more actionable status/server details.
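As an illustration of the metadata listed above, the shape below uses the four field names from this README. The TypeScript types, the `describeFetchFailure` helper, and the message format are assumptions made for the example, not the plugin's actual output.

```typescript
// Field names come from the README; the types and helper are assumed for illustration.
interface FetchMetadata {
  source: "direct" | "jina";
  finalUrl: string;
  statusCode: number;
  server: string | null; // assumed to hold the Server response header, if any
}

/** Hypothetical helper: turn metadata into an error message with status/server context. */
function describeFetchFailure(meta: FetchMetadata): string {
  const server = meta.server !== null ? meta.server : "unknown";
  return (
    "Fetch failed: HTTP " + meta.statusCode +
    " from " + meta.finalUrl +
    " (source: " + meta.source + ", server: " + server + ")"
  );
}
```

A message built this way tells the reader at a glance whether the failure came from the direct request or the fallback, and which server answered.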

Request quality tuning

  • Header strategy adjusted to reduce unnecessary bot-triggering patterns.
  • Keeps the existing ranking/selection behavior while making extraction more resilient to modern page patterns.

Download Images behavior

  • Tool renamed from View Images to Download Images to better reflect function intent.
  • Intended usage is now explicit: pass remote HTTP(S) image URLs or a websiteURL.
  • Supports normalization of image references (including markdown links and local paths) to reduce parser/tool-call failures in practical assistant workflows.
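One way to picture the normalization of image references is the sketch below. It is a guess at the general idea, not the plugin's code: `normalizeImageRef` and the `ImageRef` type are hypothetical, and treating anything that is not an HTTP(S) URL as a local path is an assumption.

```typescript
// Hypothetical types and helper; the plugin's real normalization may differ.
type ImageRef =
  | { kind: "remote"; url: string }
  | { kind: "local"; path: string };

/** Reduce a model-supplied image reference to a bare URL or path. */
function normalizeImageRef(raw: string): ImageRef {
  let s = raw.trim();
  // Unwrap markdown links or images: [text](target) or ![alt](target "title").
  const md = /^!?\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)$/.exec(s);
  if (md !== null) s = md[1];
  // Strip surrounding angle brackets or quotes, e.g. <https://...> or "https://...".
  s = s.replace(/^[<"']+|[>"']+$/g, "");
  if (/^https?:\/\//i.test(s)) return { kind: "remote", url: s };
  // Assumption: everything else is treated as a local file path.
  return { kind: "local", path: s };
}
```

Accepting these variants up front means a tool call does not fail just because the model wrapped a URL in markdown.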

Installation

1-Click installation from LM Studio Hub:

Just click Run in LM Studio.

Configuration

You can tune:

  • Max Links
  • Max Images
  • Max Content

Use 0 to exclude a section, or -1 for auto defaults where supported. Don't forget to allow the LLM to call the tool.
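The 0 / -1 convention can be read as the small rule sketched below. The `applyLimit` helper and the auto-default value passed to it are hypothetical; the plugin's actual defaults are not stated here.

```typescript
/**
 * Hypothetical helper illustrating the limit convention:
 *   0        -> exclude the section entirely
 *   -1       -> use an automatic default (value assumed by the caller)
 *   positive -> keep at most that many items
 */
function applyLimit<T>(items: T[], limit: number, autoDefault: number): T[] {
  if (limit === 0) return [];
  // Assumption: any negative value is treated like -1 (auto).
  const effective = limit < 0 ? autoDefault : limit;
  return items.slice(0, effective);
}

// Usage with an assumed auto default of 2 links:
const links = ["https://a.test", "https://b.test", "https://c.test"];
const excluded = applyLimit(links, 0, 2);  // section omitted
const auto = applyLimit(links, -1, 2);     // first two links
const capped = applyLimit(links, 1, 2);    // first link only
```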

Usage

Ask the assistant to open a URL and extract page information.
For blocked pages, the plugin now attempts the fallback automatically and reports the fetch source.
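The retry-then-fallback flow can be sketched as below. This is an illustration under stated assumptions, not the plugin's implementation: the `Fetcher` type, `looksBlocked`, `fetchWithFallback`, the challenge phrases, the attempt count, and the choice of which statuses to retry are all hypothetical. The fallback URL is formed by prefixing the target URL with `https://r.jina.ai/`. The fetcher is injected so the logic can be exercised without network access, and a real version would probably wait between attempts.

```typescript
// All names and thresholds below are assumptions made for this sketch.
interface RawResponse { status: number; body: string; server: string | null; }
type Fetcher = (url: string) => Promise<RawResponse>;
interface VisitResult { source: "direct" | "jina"; statusCode: number; body: string; }

const BLOCK_STATUSES = [401, 403, 429, 503];

/** Heuristic block/challenge detection: status code or typical challenge-page text. */
function looksBlocked(res: RawResponse): boolean {
  if (BLOCK_STATUSES.indexOf(res.status) !== -1) return true;
  return /just a moment|attention required|verify you are human/i.test(res.body);
}

async function fetchWithFallback(
  url: string,
  fetcher: Fetcher,
  maxAttempts: number = 3,
): Promise<VisitResult> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const res = await fetcher(url);
      if (!looksBlocked(res)) {
        return { source: "direct", statusCode: res.status, body: res.body };
      }
      // Assumption: only 429 and 503 are worth retrying; other blocks go straight to fallback.
      if (res.status !== 429 && res.status !== 503) break;
    } catch (_err) {
      // Network-level failure: fall through and try again.
    }
  }
  // Direct fetch was blocked or kept failing: retry through the r.jina.ai proxy.
  const proxied = await fetcher("https://r.jina.ai/" + url);
  return { source: "jina", statusCode: proxied.status, body: proxied.body };
}
```

The returned `source` value is what lets the assistant report whether the content came from the site directly or through the fallback.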

Works great with Analyze Images and Duck-Duck-Go Reworked

The LLM can use DuckDuckGo Reworked to find relevant URLs, then Visit Website to extract structured page content and images, and finally Analyze Images to view the content of those images.

Examples