How I Found Internal Linking Opportunities With Vector Embeddings

Sep 30, 2025

I felt overwhelmed when I first read Mike King’s article on vector embeddings. The concepts seemed complex, and implementing them for SEO was intimidating. But with Screaming Frog’s new features and Gus Pelogia’s excellent guide, I saw the potential to improve internal link building using this method.

Based on the two resources above, I decided to create a detailed, step-by-step guide to make this process more approachable, even for people unfamiliar with Python or vector embeddings.

In this article, I’ll walk you through how I used vector embeddings to identify internal linking opportunities at scale so you can confidently apply these techniques to your SEO strategy.

Infographic with 10 points on finding internal links with vector embeddings

What you’ll need to get started

To carry out this process, I used the following:

  • Screaming Frog
  • OpenAI API Key
  • Google Sheets or Excel

By the end, I had a comprehensive spreadsheet that included:

  • Every important URL from my site listed in column A (target URL)
  • The URL of each page that links to the target URL (excluding navigation)
  • URLs of the top 5 most closely related pages based on cosine similarity
  • Opportunities where one or more of those 5 URLs are not linking to the target URL
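
For readers curious what "most closely related based on cosine similarity" means in practice, here is a minimal sketch in plain Python. It is not the script used in this guide. The function names, the toy URLs, and the 3-number vectors are all made up for illustration, and real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)

def top_related(target_url, embeddings, k=5):
    """Return the k (url, score) pairs most similar to target_url, best first."""
    target_vec = embeddings[target_url]
    scored = [
        (url, cosine_similarity(target_vec, vec))
        for url, vec in embeddings.items()
        if url != target_url  # a page is never its own related page
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy data (hypothetical): one short vector per URL.
embeddings = {
    "/running-shoes": [0.9, 0.1, 0.0],
    "/trail-shoes": [0.8, 0.2, 0.1],
    "/shoe-care": [0.5, 0.5, 0.2],
    "/pasta-recipes": [0.0, 0.1, 0.9],
}

related = top_related("/running-shoes", embeddings, k=2)
for url, score in related:
    print(f"{url}: {score:.3f}")
```

With the toy data, the two shoe pages rank above the recipe page because their vectors point in nearly the same direction as the target's.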

Example Spreadsheet

This is the example I used in the screenshots below, and it will look something like this:

Example spreadsheet showing internal link opportunities with URLs, related URLs, and missing links highlighted in pink cells.

Pink cells indicate where the related page doesn’t link to the target page.

Step 1: Get an OpenAI API key

I started by heading over to OpenAI’s website, clicked the button to create a new secret key, and copied that API key to use in Screaming Frog.

Screenshot showing the OpenAI website with the option to create a new secret key for the API

Step 2: Set up and Run Screaming Frog

For convenience, I have saved the Screaming Frog custom configuration profile. Download it here. In Screaming Frog, open a custom configuration file by going to File > Open and selecting the .seospider file you want to load. Here is the explainer video.

After loading the configuration file, I opened the API tab, selected OpenAI, and pasted in my API key.

Screenshot of Screaming Frog’s configuration menu with the ‘Custom JavaScript’ option selected

Once connected, I switched over to the “Prompt Configuration” tab in the same window and clicked “+ Add from Library” to select the “Extract embeddings from page content” prompt.

Screenshot of the ‘Add from Library’ window in Screaming Frog with the ‘(ChatGPT) Extract embeddings from page content’ script selected

Note: The API Access feature above pulls embeddings into the API report. However, the script below requires the much smaller file generated by the Custom JavaScript function.

Next, I opened Screaming Frog and followed these steps:

  • Navigated to Configuration > Custom > Custom JavaScript.

Screenshot showing the Custom JavaScript section in Screaming Frog with the Open JavaScript Snippet Editor button indicated.

  • Clicked “Add from Library” and selected “(ChatGPT) Extract embeddings from page content.” This allowed Screaming Frog to extract the data needed for the internal link audit.

Screenshot showing where to paste the OpenAI API key in the Screaming Frog custom JavaScript editor.

  • I edited the custom JavaScript code to include my OpenAI API key, pasting the key I generated in Step 1 into the appropriate section of the code. NOTE: You will have to do this even when using the custom configuration profile linked above.

I ran a quick test on a URL from my target site. When I saw numbers populate in the “Custom Extraction” tab, I knew the setup was working correctly.

NOTE: Troubleshooting insufficient_quota errors for ChatGPT in your Screaming Frog results (provided by Tory Gray)
1: Ensure you have a paid ChatGPT account. This process won't work without a paid account! Log in and manage your paid subscription here.
2: Ensure you have an appropriate ChatGPT API budget available. Log in to your OpenAI account here.
Set monthly API usage limits, as well as budget alerts, here (scroll past the Rate Limits section).
How much do you need? A one-time budget of $10 would roughly cover a small site with 100 pages and blog posts averaging 3200 words.
You can view existing API usage levels here (e.g. to ensure you aren't already over the limit).

Step 3: Export vector embeddings and each inlinks

Export All Internal Links from Screaming Frog

  • I saved the crawl once it was completed.
  • I exported the “All Inlinks” data from Screaming Frog. This file contains every internal link on the site and can be quite large. For example, my file, all_inlinks.csv, was about 52 MB and represented 1,428 URLs.
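
Because the export holds every link, including navigation links, it helps to see how it can be narrowed to in-content links only. The sketch below is an illustration, not the cleanup script used in this guide. It assumes the export has columns named Type, Source, Destination, and Link Position, and that in-content links are marked "Content"; check these names against your own file. The sample rows and the function name are invented.

```python
import csv
import io
from collections import defaultdict

# Tiny stand-in for all_inlinks.csv (column names are assumptions, see above).
SAMPLE = """Type,Source,Destination,Link Position
Hyperlink,https://example.com/a,https://example.com/b,Content
Hyperlink,https://example.com/a,https://example.com/c,Navigation
Hyperlink,https://example.com/c,https://example.com/b,Content
Image,https://example.com/a,https://example.com/logo.png,Header
"""

def content_inlinks(csv_file):
    """Map each destination URL to the set of pages linking to it from body content."""
    inlinks = defaultdict(set)
    for row in csv.DictReader(csv_file):
        if row["Type"] != "Hyperlink":
            continue  # skip images, scripts, stylesheets, and so on
        if row["Link Position"] != "Content":
            continue  # skip navigation, header, and footer links
        inlinks[row["Destination"]].add(row["Source"])
    return dict(inlinks)

inlinks = content_inlinks(io.StringIO(SAMPLE))
for destination, sources in inlinks.items():
    print(destination, "<-", sorted(sources))
```

To run it on a real export, pass an open file instead of the sample string, for example `open("all_inlinks.csv", newline="", encoding="utf-8")`. Reading row by row keeps memory use modest even for a large file.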

Screenshot showing how to export the ‘All Inlinks’ data from Screaming Frog

Export vector embeddings from Screaming Frog

  • I exported the Custom JavaScript results where the vector embeddings are located.

Screenshot showing data imported into Google Sheets from the ‘all_inlinks.csv’ file

Step 4: Run both files through the Cleanup & Formatting Script

I opened this convenient Python script on Google Colab, which was created by Britney Muller, and made a copy to use.

  • I pressed the “Play” button
  • Scrolled down to “Choose Files”
  • Uploaded my first CSV file (all_inlinks.csv) and let it process.
  • Uploaded my second file (custom_javascript_all.csv) when prompted.
  • Accepted the Save File messages once they were done processing. You should get an XLSX version and a CSV version.

Note: The original version of this guide was about three times as long, comprised mostly of formatting and data cleanup. Britney’s script automates ALL of this, saving you several hours of tedious work.

Troubleshooting errors

Sometimes, an issue pops up that keeps the script from working. But don’t worry, clicking “Explain Error” will typically guide you to the fix.

The explanation of the error helped me figure out that I needed to open the CSV file and look for irregularities in the Embeddings column. It turned out there was a blank cell.

Other examples of what might cause errors during this stage are:

  • Extra columns
  • The wrong file name
  • The wrong column names

For example, I encountered a blank cell in the “Embeddings” column that caused an error. I simply deleted that row, exported the cleaned file as a CSV again, refreshed the Google Colab notebook, and retried.
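
A quick pre-check can surface blank or malformed cells before uploading. The following is a minimal sketch, not part of the Colab script. It assumes the export has an Address column for the URL and an Embeddings column holding comma-separated numbers; the sample data and the function name are hypothetical.

```python
import csv
import io

# Tiny stand-in for custom_javascript_all.csv (column names are assumptions).
SAMPLE = '''Address,Embeddings
https://example.com/a,"0.12,-0.03,0.44"
https://example.com/b,
https://example.com/c,"0.10,oops,0.20"
'''

def find_bad_rows(csv_file, column="Embeddings"):
    """Return (line_number, address, reason) for rows whose embedding cell is unusable."""
    problems = []
    reader = csv.DictReader(csv_file)
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        cell = (row.get(column) or "").strip()
        if not cell:
            # Also triggers when the column itself is missing or misnamed.
            problems.append((line_no, row.get("Address", ""), "blank"))
            continue
        try:
            [float(value) for value in cell.split(",")]
        except ValueError:
            problems.append((line_no, row.get("Address", ""), "not numeric"))
    return problems

problems = find_bad_rows(io.StringIO(SAMPLE))
for line_no, address, reason in problems:
    print(f"line {line_no}: {address} ({reason})")
```

Because a missing column is reported as blank on every row, the same check also hints at the wrong-column-name case listed above.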

Step 5: Import or Open the Saved File

I opened a new Google Sheet and imported the XLSX file. The script also outputs a CSV, but I prefer the XLSX version because it preserves conditional formatting.

If you’re working from Excel or another spreadsheet program on your computer, simply open the XLSX file.

Validate the data

I double-checked a few entries manually to ensure everything was accurate. Now, I have a complete list that shows each target URL in column A, the top 5 related URLs, and whether those URLs are linking back to the target URL.

My final spreadsheet looked like this, with “Exists” or “Not Found” indicating whether each related URL was linking back to the target URL:
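
The “Exists” / “Not Found” check itself is simple set membership, which also makes it easy to spot-check by hand. Here is a minimal sketch of the idea, assuming you already have a mapping from each target URL to the pages that link to it. The function name and sample URLs are made up, and this is not the script's actual code.

```python
def link_status(target_url, related_urls, inlinks):
    """Label each related URL by whether it already links to target_url."""
    sources = inlinks.get(target_url, set())
    return {
        url: ("Exists" if url in sources else "Not Found")
        for url in related_urls
    }

# Hypothetical data: pages that currently link to /running-shoes.
inlinks = {"/running-shoes": {"/trail-shoes", "/home"}}

status = link_status("/running-shoes", ["/trail-shoes", "/shoe-care"], inlinks)
for url, label in status.items():
    print(f"{url}: {label}")
```

Each “Not Found” entry corresponds to a pink cell: a closely related page that does not yet link to the target.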

Step 6: Build internal links

Now comes the final and most actionable part: building those internal links.

  • Identify the opportunities: I used the pink cells as indicators of where internal links were missing. Each pink cell represented a related page that wasn’t linking to the target URL, even though it should.
  • Add the links: I went to each related page (from the pink cells) and edited the content to include a relevant internal link to the target URL. I made sure to use descriptive anchor text that aligns with the content on the target page.
  • Prioritize: I started with the highest-priority pages first, such as those with the most traffic.

Concluding thoughts: Create a cohesive internal linking structure with vector embeddings

Take the time to build, analyze, and refine your internal link structure. This step-by-step guide transformed my internal linking process into a data-driven strategy with the power of vector embeddings. The effort will pay off in improved rankings, better user experience, and ultimately, more organic traffic. It also improves SEO performance by ensuring your most valuable pages are connected in a way that search engines and your users understand.

After running this process on a client’s site, I was surprised. I thought we’d done a great job at internal linking, but there were hundreds of opportunities we’d missed. And I don’t just mean that the keyword we want to link from appears on the page. I mean opportunities to link pages that search engines would see as highly relevant to each other. In doing so, I was able to disambiguate closely related concepts and fix a few unnoticed keyword cannibalization issues as well.

Links to templates and resources

The author's views are entirely their own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.
