Guide: URL Resolution and Keyword Search

The deepwiki_fetch tool accepts a full URL, an owner/repo pair, or a free-form phrase, rather than requiring an exact Deepwiki URL. This guide explains the internal logic that resolves each input format into a target URL.

This process is primarily handled in src/tools/deepwiki.ts.

The Resolution Flow

The server follows a sequence of steps to interpret the url parameter:

  1. Full URL Check: First, it checks whether the input starts with http:// or https://. If it does, the input is used as is and no further processing is needed.

    if (/^https?:\/\//.test(url)) {
      // Use URL as is
    }
  2. Owner/Repo Format: Next, it checks for the common owner/repo format (e.g., vercel/ai), meaning two non-empty segments separated by a single slash. If this pattern matches, it prepends https://deepwiki.com/ to the input.

    if (/^[^/]+\/[^/]+$/.test(url)) {
      // Already in owner/repo format
      url = `https://deepwiki.com/${url}`;
    }
  3. Keyword Extraction (NLP): If the input is a free-form phrase (e.g., how can i use vercel ai) or a single word, the server uses Natural Language Processing to find the most likely technology or library name. This logic resides in src/utils/extractKeyword.ts.

    • It uses the wink-nlp library to perform Part-of-Speech (POS) tagging.
    • It identifies nouns (NOUN) and proper nouns (PROPN) as potential candidates.
    • It filters out common stop words such as 'how', 'can', 'i', and 'to'.
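    • The first remaining candidate is returned. If no candidate remains, the result is undefined.

    // src/utils/extractKeyword.ts
    import winkNLP from 'wink-nlp';
    // Assumption: the English lite web model; the guide does not name the model package.
    import model from 'wink-eng-lite-web-model';

    const nlp = winkNLP(model);
    const its = nlp.its;
    // Illustrative subset only; the project's full stop-word list is not shown in this guide.
    const stopTerms = new Set(['how', 'can', 'i', 'to']);

    export function extractKeyword(text: string): string | undefined {
      const doc = nlp.readDoc(text);
      const candidates: string[] = [];
      doc.tokens().each((t) => {
        const pos = t.out(its.pos); // Universal POS tag, e.g. 'NOUN'
        const value = t.out(its.normal); // lowercased, normalized token text
        if ((pos === 'NOUN' || pos === 'PROPN') && !stopTerms.has(value)) {
          candidates.push(value);
        }
      });
      // undefined when no noun or proper noun survives the filter
      return candidates[0];
    }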
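  4. GitHub Repository Resolution: Once a keyword is identified, the server queries the GitHub Search API for repositories whose name contains that keyword and takes the first result. Because the query below sets no sort parameter, that is the top hit under GitHub's default best-match ranking, not necessarily the most-starred repository. This step, found in src/utils/resolveRepoFetch.ts, translates a name like ai into a canonical owner/repo string like vercel/ai.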

    // src/utils/resolveRepoFetch.ts
    export async function resolveRepo(keyword: string): Promise<string> {
      const url =
        `https://api.github.com/search/repositories?q=${encodeURIComponent(
          `${keyword} in:name`,
        )}&per_page=1`;
      // ... fetch logic ...
      const { items } = (await res.json()) as { items: { full_name: string }[] };
      return items[0].full_name;
    }
  5. Final URL Construction: The resolved owner/repo string is then used to construct the final https://deepwiki.com/... URL, which the crawler uses as its starting point.
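
The step 4 excerpt reads items[0] directly, so a search with no hits would throw a TypeError there. The following is a minimal sketch of a more defensive variant, not the project's actual code. The names resolveRepoSafe, FetchLike, and MinimalResponse are hypothetical, and the fetch function is passed in as a parameter (an assumption made here so the sketch can run without network access).

```typescript
// Minimal response shape, so the sketch does not depend on DOM or Node typings.
interface MinimalResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Hypothetical injection point for the HTTP call.
type FetchLike = (url: string) => Promise<MinimalResponse>;

// Hypothetical variant of resolveRepo: returns undefined when the search has
// no hits, and throws a descriptive error on a non-2xx response.
async function resolveRepoSafe(
  keyword: string,
  fetchImpl: FetchLike,
): Promise<string | undefined> {
  // Same query shape as the excerpt: match the keyword in repository names,
  // request a single result.
  const query = encodeURIComponent(`${keyword} in:name`);
  const url = `https://api.github.com/search/repositories?q=${query}&per_page=1`;

  const res = await fetchImpl(url);
  if (!res.ok) {
    throw new Error(`GitHub search failed with status ${res.status}`);
  }

  const body = (await res.json()) as { items?: { full_name: string }[] };
  // Optional chaining covers both a missing and an empty items array.
  return body.items?.[0]?.full_name;
}
```

In a runtime that provides a global fetch (such as Node 18 or later), passing fetch as the second argument should satisfy FetchLike. Returning undefined rather than throwing lets the caller decide how to report an input that matched nothing.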

Together, these steps let users pass a full URL, an owner/repo pair, or a plain-language phrase, and the tool resolves each of them to a Deepwiki URL.
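
To show how the five steps fit together, here is a rough end-to-end sketch. It is an illustration under stated assumptions rather than the code in src/tools/deepwiki.ts: resolveDeepwikiUrl and ResolveDeps are hypothetical names, the keyword extractor and repository resolver are injected so the flow can be exercised without wink-nlp or network access, and the trim() call and the error on a missing keyword are additions made here.

```typescript
// Hypothetical dependency bundle. In the real project these roles are played
// by extractKeyword and resolveRepo from the files named in this guide.
interface ResolveDeps {
  extractKeyword: (text: string) => string | undefined;
  resolveRepo: (keyword: string) => Promise<string>;
}

async function resolveDeepwikiUrl(
  input: string,
  deps: ResolveDeps,
): Promise<string> {
  // Assumption: surrounding whitespace is not significant.
  const trimmed = input.trim();

  // Step 1: a full http(s) URL is used as is.
  if (/^https?:\/\//.test(trimmed)) {
    return trimmed;
  }

  // Step 2: owner/repo gets the Deepwiki domain prepended.
  if (/^[^/]+\/[^/]+$/.test(trimmed)) {
    return `https://deepwiki.com/${trimmed}`;
  }

  // Step 3: anything else goes through keyword extraction.
  const keyword = deps.extractKeyword(trimmed);
  if (keyword === undefined) {
    throw new Error(`No keyword found in "${trimmed}"`);
  }

  // Steps 4 and 5: resolve the keyword to owner/repo, then build the URL.
  const ownerRepo = await deps.resolveRepo(keyword);
  return `https://deepwiki.com/${ownerRepo}`;
}
```

Because the checks run in order, the NLP and GitHub lookups are only reached when the cheaper pattern checks fail, so the two structured input formats never trigger a network request.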