Screaming Frog and its Artificial Intelligence API (OpenAI, Gemini, and Ollama)

Screaming Frog is one of the most widely used tools for On-Page SEO optimization of a website. However, it is gradually expanding its functionality into other SEO areas. Thanks to its GA4 API integration, we can see how users interact with our website; with the Search Console API we can view organic traffic data; and with the Ahrefs or Moz APIs we can also dig into the Off-Page side.

But that’s not all: with version 20.0, Screaming Frog recently introduced the ability to connect to artificial intelligence models and perform granular analysis of our pages.

How this new system works

In case you don’t know the tool, Screaming Frog is a crawler: a program that scans our website just as Googlebot would, with the aim of reporting on our site’s On-Page SEO.

With the new version, published in May 2024, we can execute custom JavaScript snippets to perform tailored extractions based on the content of the site we analyze. And not only that: within these JS customizations, we can connect to different AI models to analyze our content.

[Image: Screaming Frog custom JavaScript configuration screen]

It’s worth noting that this functionality offers 2 options: we can use a library of predefined prompts with ready-made actions, or we can write our own scripts directly.

In this article, we will see both possibilities, with several examples of each.

Nevertheless, the first thing is to learn how to configure this functionality, so we will quickly see this point.

Extraction Configuration

Configuring this tool is very simple. On one hand, we need an API key for the AI model we are going to use; on the other, we need a Screaming Frog license, as the free version is not sufficient.

After that, we just have to open the tool and, under Spider > Rendering, enable JavaScript rendering from the dropdown at the top.

Then, go to Settings > Custom and select Custom JavaScript.

At this moment, a screen will open, as seen below, to add our code:

Now, if you look at the image above, on the far right you’ll find the 2 options I mentioned at the beginning: we can execute predefined prompts via the Add from Library option, or create our own custom code via Add.

Let’s start with the default options.

Executing JS Library Prompts

To start executing default prompts, we must click on Add from Library.

And after that, we will see all the available options:

Some of the default options include creating alternative texts for images (alt text), understanding search intent, extracting language or article sentiment, among others.

From all of these, let’s start with the prompt that extracts search intent with ChatGPT. To do this, simply select this option and click Insert.

Now, our custom extraction will appear loaded and almost ready to execute. I say almost because remember that we still have to include our API key.

To add the API key, we must click on the button with the pencil icon, and we will see the code to execute:

If you don’t have a technical background, it might scare you a bit, but in reality we only have to edit 3 values:

  • The API key.
  • The language model.
  • The temperature.

Let’s see it step by step.

We start with the OpenAI API key; in case you don’t have one, I’ll explain how to get it. Create an OpenAI account (openai.com) and, in the left sidebar, select the API Keys option.

In the screen that opens, click Create new secret key, give your key a name, and assign it to a project:

As soon as it is created, your API key will appear; copy it:

Once obtained, we just have to go to our Custom JavaScript to include it.

I take the opportunity to tell you that if you prefer to use Google’s AI, you must generate the Gemini API from aistudio.google.com/app/apikey, as I show you below:

Returning to our prompt: in addition to entering the API key, you can also change the AI model if you prefer to use a specific one.

Keep in mind that, as of late 2024, GPT-4o is the most advanced multimodal model, priced at $2.50 / 1M input tokens. In contrast, GPT-4o mini is a lighter but very cost-effective alternative ($0.15 / 1M input tokens), cheaper even than GPT-3.5 Turbo, which makes it quite interesting depending on the task at hand.

Finally, the last parameter we can adjust is the temperature: the level of randomness or creativity we want in the generated responses. To give you an idea, a value close to 0 produces more predictable responses, while a value close to 1 produces more creative or unexpected ones.

Once we apply our 3 configurations (API, model, and temperature), our script is ready to run on the entire website.
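Putting the three together, the editable portion of the script boils down to values like these (placeholders shown; I present them as standalone constants for clarity, whereas in the library script the model and temperature actually sit inside the request body):

```javascript
// The three values we need to edit before running the script
const OPENAI_API_KEY = 'API';   // 1) your API key (placeholder value)
const model = 'gpt-4o';         // 2) the language model to use
const temperature = 0.7;        // 3) randomness: 0 = predictable, 1 = creative
```

Everything else in the snippet can stay exactly as Screaming Frog ships it.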

The good thing is that Screaming Frog has included a very welcome feature: the ability to test the prompt before launching it on every URL.

Just below the code editor, there is an option to run a test against a single URL, so we avoid wasting money if something doesn’t work.

Once tested and confirmed, we can execute it on the entire website.

And the result, as you can see below, is not bad at all, as it was able to differentiate informational intents from commercial ones.

Certainly the result seems positive. However, I must say that I find some of the default prompts somewhat weak.

Case in point: the script that automatically generates alternative text for images leaves something to be desired:

Fortunately, we can edit these codes to make the adjustments we want and even create our own, totally personalized ones.

Let’s do it.

Creating Custom AI JS Scripts

To create our own scripts, let’s go back to the code I showed you earlier.

And if you look, I’ve now highlighted a specific section in green: the one where we must include our prompt.

Leaving aside variables and other programming details, simply keep in mind that the prompt you want to use must go between the single quotes following the word question:

const question = 'Prompt';

To better understand this and familiarize ourselves with it, let’s look at a couple of examples.

Practical Case 1: Meta Generation

As I mentioned before, one of Screaming Frog’s strengths is that we can use its APIs to go beyond On-Page SEO. And of all of them, Search Console provides especially valuable information.

Thanks to it, we can see the traffic we receive, along with our rankings and CTR (among other data). We can therefore look for the URLs with the lowest CTR and generate new metas with AI.

To do this, we can create a prompt like the one shown below (the full script is included at the end of this article):

Then, we just have to execute it along with the Search Console API and we’re done.

The key here is to sort by ascending CTR to surface the URLs with the worst click-through rate, and we would have the result:

I’ll also mention that we can export the results to Excel to work more comfortably from this point on.

I also advise filtering by position so that, at a minimum, you only consider URLs ranking in roughly position 12 or better; if a URL ranks deeper than that, optimizing its metas will be of little use.

Practical Case 2: Email Extraction

I use Screaming Frog to extract email addresses from websites I’ve found for outreach, with the aim of contacting them. For that, I apply custom extractions with REGEX (regular expressions), like the ones shown below:
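For illustration, a no-AI Custom JavaScript snippet can pull emails with a regular expression along these lines (a simplified pattern I’m sketching here, not necessarily the exact one from my setup):

```javascript
// Simplified email pattern; real-world addresses can be more varied
const emailPattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return the deduplicated addresses found in a block of text
function extractEmails(text) {
  return [...new Set(text.match(emailPattern) || [])];
}

// Inside a Screaming Frog snippet you would run it on the rendered page:
// return seoSpider.data(extractEmails(document.body.innerText).join(', '));
```

This runs locally on each crawled page, so it costs nothing in API tokens.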

However, if you are not familiar with these kinds of expressions, we can now create a script that extracts these email accounts for us.

To do this, we configure a prompt like the following (also reproduced in full at the end of this article):

And once configured, we have to run Screaming Frog in list mode and paste our URLs:

As a result, we obtain a list of emails for each website:

Practical Case 3: Structured Data Generation

Structured data is becoming increasingly important as it helps Google better understand what a page is about. The problem is that there are more and more types, and I personally don’t know them all.

So, we can create a prompt that analyzes our content and proposes schema ideas.

To do this, we can use a prompt like the one I show you below:

And if we now test it on a page where I talk about a course, we see that it proposes several schemas, including the course type:
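For context, a minimal Course schema of the kind the model might suggest looks like this (the names and values here are illustrative, not taken from the actual page):

```javascript
// Minimal Course markup in JSON-LD (illustrative values only), as it would
// be embedded in a <script type="application/ld+json"> tag on the page
const courseSchema = {
  '@context': 'https://schema.org',
  '@type': 'Course',
  name: 'Google Analytics 4 Course',
  description: 'Learn Google Analytics 4 step by step.',
  provider: {
    '@type': 'Organization',
    name: 'Example Academy'
  }
};
```

Once generated, markup like this can be validated with Google’s Rich Results Test before publishing.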

Exploiting AI APIs

Now that we know how to include custom extractions with AI, I’ll mention that you can also add JS scripts from Screaming Frog’s settings > Connect to API > AI menu.

From this menu, we can choose between OpenAI, Gemini, and Ollama, enter our API key, and add the prompts to execute.

However, here the API key is entered directly in the Account Information section.

And once entered, we go to Alert Settings to add, from the library, the custom extractions we want to execute.

This way, we can configure, for example, the OpenAI and Gemini APIs to execute the same command and compare results, as I show you below:

Likewise, we can also create our own extractions, much more easily than before.

Look at the image below: to create a custom extraction now, there is no code to touch; we just fill in the fields for the extraction name, type, and AI model, plus the prompt itself.

In this case, as an example, we’re going to create a prompt that analyzes the content of a URL and proposes a paragraph with an internal link to a page whose PageRank we want to reinforce. Of course, only if it applies, fits naturally, and is likely to be clicked.
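A sketch of what such a prompt could look like (the wording and target URL are illustrative assumptions on my part, not the exact prompt used here):

```javascript
// Illustrative internal-linking prompt; the target URL is a placeholder,
// not a real page from the article
const question = 'Analyze the content of this page. If, and only if, it fits naturally ' +
  'and a reader would plausibly click it, propose one short paragraph that includes an ' +
  'internal link to https://example.com/target-page with descriptive anchor text. ' +
  'Otherwise, respond with Not applicable';
```

Asking for an explicit "Not applicable" answer makes the results easy to filter afterwards in the crawl export.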

After that, we accept and execute.

As the results show, in some cases creating such a link does not apply, while in others it does.

For example, for a URL where we talk about Google Analytics 4, it offers the following result:

The provided content mainly focuses on a course about Google Analytics 4, which includes a section on linking Google Analytics and Google Search Console. Since the course specifically mentions linking with Google Search Console, including a link to a tutorial about Google Search Console could be relevant and useful for users interested in learning more about this tool. Here’s a sentence that incorporates the link naturally:

«To get more details on how to make the most of Google Search Console after linking it with GA4, you can consult this [detailed tutorial](https://sergiocanales.com/google-search-console/) about Google Search Console.»

We can therefore validate it as a more than acceptable proposal for working on our internal linking.

Creating Our Own Library

Imagine you really liked the previous prompt and want to have it always available. We can save it from Add from Library, by clicking on User and then on the plus (+) button.

In fact, we can go a step further and export or import these customizations in JSON format to carry them to another Screaming Frog installation, or even edit them.

This way, little by little, we will create our library of custom commands tailored to the needs of our projects.

Final Notes

As we have seen, combining Screaming Frog with AI yields a quite powerful tool for processing our content and cutting down on analysis we previously did manually.

In terms of price, the cost is quite low, and a big plus is the ability to test our scripts before launching them across all of a project’s URLs.

So, I invite you to explore this tool and create your own customizations since what really shines here is each person’s creativity.

Screaming Frog Prompts

Meta titles and description prompt generator

				
// Custom extraction to generate meta titles and meta descriptions

const OPENAI_API_KEY = 'API';
const question = 'Analyze the H1 and the first paragraph of the provided URL. Based on the content, generate: 1) A compelling meta title (maximum 60 characters) that grabs attention in spanish. 2) A descriptive meta description in spanish (maximum 150 characters) that encourages clicks and highlights the key value. Here are some emojis you can use  《 , 》, ▷,➡️ , ✅, |⮕';

const userContent = document.body.innerText;

function chatGptRequest() {
    return fetch('https://api.openai.com/v1/chat/completions', {
        method: 'POST',
        headers: {
            'Authorization': `Bearer ${OPENAI_API_KEY}`,
            'Content-Type': 'application/json',
        },
        body: JSON.stringify({
            model: 'gpt-4o',
            messages: [
                {
                    role: 'user',
                    content: `${question} ${userContent}`
                }
            ],
            temperature: 0.7
        })
    })
    .then(response => {
        if (!response.ok) {
            return response.text().then(text => { throw new Error(text); });
        }
        return response.json();
    })
    .then(data => data.choices[0].message.content.trim());
}

return chatGptRequest()
    .then(result => seoSpider.data(result))
    .catch(error => seoSpider.error(error));

Extract email from URL

				
// Email extraction from page content using ChatGPT
//
// This script demonstrates how JavaScript Snippets can communicate with 
// APIs, in this case ChatGPT.
// 
// This script also shows how the Spider will wait for JavaScript Promises to
// be fulfilled i.e. the fetch request to the ChatGPT API when fulfilled
// will return the data to the Spider.
// 
// IMPORTANT:
// You will need to supply your API key below, which will be stored
// as part of your SEO Spider configuration in plain text. Also be mindful,
// if sharing this script, that you will be sharing your API key too unless
// you delete it before sharing.
// 
// Also be aware of API limits when crawling large web sites with this snippet.
//

const OPENAI_API_KEY = 'API';
const question = 'Analyze the content of the provided website and extract any email address explicitly mentioned in the text or source code. Return only the email address if found. If no email is available, respond with No email address found';

const userContent = document.body.innerText;
    
function chatGptRequest() {
    return fetch('https://api.openai.com/v1/chat/completions', {
        method: 'POST',
        headers: {
            'Authorization': `Bearer ${OPENAI_API_KEY}`,
            "Content-Type": "application/json",
        },
        body: JSON.stringify({
            "model": "gpt-4o",
            "messages": [
                {
                    role: "user",
                    content: `${question} ${userContent}`
                }
            ],
            "temperature": 0.7
        })
    })
    .then(response => {
        if (!response.ok) {
            return response.text().then(text => {throw new Error(text)});
        }                
        return response.json();
    })
    .then(data => {
        return data.choices[0].message.content.trim();
    });
}


return chatGptRequest()
    .then(intent => seoSpider.data(intent))
    .catch(error => seoSpider.error(error));

Schema prompt suggestion from content

				
// Custom extraction to suggest structured data types based on page content

const OPENAI_API_KEY = 'API';
const question = 'Analyze the content of the provided URL and suggest the most appropriate structured data types (e.g., Schema.org, JSON-LD, Microdata, RDFa) to implement based on the page purpose and content. If none apply, respond with No structured data found';

const userContent = document.body.innerText;
    
function chatGptRequest() {
    return fetch('https://api.openai.com/v1/chat/completions', {
        method: 'POST',
        headers: {
            'Authorization': `Bearer ${OPENAI_API_KEY}`,
            "Content-Type": "application/json",
        },
        body: JSON.stringify({
            "model": "gpt-4o",
            "messages": [
                {
                    role: "user",
                    content: `${question} ${userContent}`
                }
            ],
            "temperature": 0.7
        })
    })
    .then(response => {
        if (!response.ok) {
            return response.text().then(text => {throw new Error(text)});
        }                
        return response.json();
    })
    .then(data => {
        return data.choices[0].message.content.trim();
    });
}
return chatGptRequest()
    .then(intent => seoSpider.data(intent))
    .catch(error => seoSpider.error(error));

Sergio Canales
Industrial Engineer and Expert in Digital Niches

Currently makes a living from his blogs and digital projects, with over a decade of experience as an engineer and developer. His passion for creating and monetizing websites led him to transform his career and achieve earnings of over €4,000 per month.

Thanks to the SEO Mentorship from Dean Romero, Jesús Roldán, and Dani Llamazares, he learned to identify profitable niches and optimize them for sustainable growth.

He collaborates on various projects alongside his mentors, constantly learning and refining strategies every day.
