Pillars for doing a Basic SEO Audit. On Page and Off Page.


First of all, let's start with the basics. An SEO audit is generally divided into two parts: the On Page part and the Off Page part.

If you know anything about SEO, you can surely tell these two pillars apart, but in case you can't, here is a quick rundown.

On Page: Deals with everything related to the website itself, covering technical SEO aspects (indexability, crawling, WPO, sitemap, etc.) and content (meta tags, headers, web architecture, keywords, etc.).
Off Page: Deals with factors external to our website (incoming links, authority, link profile, etc.).
Now that we know what On Page and Off Page are, let's analyze the different aspects that must be taken into account in each of these branches to make sure our project follows the recommended guidelines.

On-Page Part of an SEO Audit
Architecture
A good SEO-focused architecture is one of the main pillars of a successful SEO strategy. It is very important that all the pages we are most interested in ranking are as accessible as possible, within a logical and functional structure. A bad architecture can wreck an entire SEO project, since it harms the crawlability of the site and therefore its indexability and overall rankings.

There is no single recipe for what an architecture should look like, since each project has its own needs. But essentially it is about creating a clear URL structure that allows for easy navigation through the website and lets you reach all the important pages in as few clicks as possible. In any case, we have a post with tips for SEO architecture that we recommend you read.

A simple architecture might look like this:
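For example, a small online shop (a purely illustrative, hypothetical structure) could be organized like this:

```
example.com/
├── /shoes/
│   ├── /shoes/running/
│   └── /shoes/casual/
├── /accessories/
│   └── /accessories/belts/
└── /blog/
    └── /blog/how-to-choose-running-shoes/
```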


Obviously, each project is unique and the architecture will always be customized according to its needs.

Crawling
Well, since we mentioned crawling in the previous point, let's talk about it. As you may already know, Google has a series of robots (crawlers) that are responsible for crawling websites, their URLs and their content. But they crawl relatively blindly, discovering new pages only as they follow links. That is why it is important that all pages on the website are easily accessible: that way they will also be easily crawlable.

We must make the task as easy as possible for crawlers. That is why one of our recommendations is not to have more URLs than necessary and to avoid generating pages that serve no function. An excess of URLs can cause crawling problems, mainly because of what is known as Crawl Budget.

Although it is true that the concept of Crawl Budget has changed over the years, let's explain what it is. Crawlers have a limited set of resources to crawl the millions of websites that exist on the Internet, so they assign a certain budget to each website: the higher it is, the more Google will crawl. If Google does not consider the website relevant enough, the number of URLs it crawls may be lower. That is why we generally try to optimize the number of URLs on a website and make sure Google crawls the ones that really matter to us.

However, John Mueller has insisted that crawl budget does not matter as much as previously believed, and that unless we have a giant website there should not be any problems with it.



In conclusion, it is important to make sure that the pages that matter most to us are easy to crawl and that the crawler does not need many hops to reach them; that should be enough.

To find out how often and how many URLs Google is crawling on our site, we can use the Google Search Console crawl stats report, which shows a graph of the number of URLs Googlebot has crawled in the last 90 days.


Another tip for analyzing crawl quality is to use Screaming Frog, which shows the Crawl Depth of each URL, that is, how many links the crawler had to follow to reach it. We can cross-reference this information with the URLs that matter most to the project and make sure those pages are easy to reach.
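If you prefer to check this yourself, the idea behind Crawl Depth can be reproduced with a small script. The sketch below is not how Screaming Frog works internally; it simply does a breadth-first crawl of internal links (using requests and BeautifulSoup, with a placeholder start URL) and records how many clicks each URL sits from the home page:

```python
# Minimal crawl-depth sketch: breadth-first crawl of internal links,
# recording how many clicks each URL is from the start page.
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

START_URL = "https://www.example.com/"  # placeholder start URL
MAX_PAGES = 200                         # keep the sketch small


def crawl_depths(start_url, max_pages=MAX_PAGES):
    domain = urlparse(start_url).netloc
    depths = {start_url: 0}             # URL -> clicks from the start page
    queue = deque([start_url])
    while queue and len(depths) < max_pages:
        url = queue.popleft()
        try:
            response = requests.get(url, timeout=10)
        except requests.RequestException:
            continue
        soup = BeautifulSoup(response.text, "html.parser")
        for link in soup.find_all("a", href=True):
            absolute = urljoin(url, link["href"]).split("#")[0]
            # stay on the same domain and skip URLs we have already seen
            if urlparse(absolute).netloc == domain and absolute not in depths:
                depths[absolute] = depths[url] + 1
                queue.append(absolute)
    return depths


if __name__ == "__main__":
    for url, depth in sorted(crawl_depths(START_URL).items(), key=lambda item: item[1]):
        print(depth, url)
```

Sorting the output by depth makes it easy to spot important pages that sit too many clicks away from the home page.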

Disallow
Within crawling, there is a directive that is important to mention: the disallow directive, which we can find in a website's robots.txt file.

URLs and paths covered by a disallow directive will not be crawled by the robot. This is very useful when we have large sections of the website that we do not want Google to crawl because they serve no SEO purpose. But we must be very careful with how we apply this directive, because if we block access to important pages we can wreck the entire project.
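A quick way to verify that a disallow rule does what you expect is Python's built-in robots.txt parser. The rules and URLs below are hypothetical examples; the point is simply to confirm that important pages are not accidentally blocked:

```python
# Minimal sketch: check hypothetical disallow rules before relying on them.
from urllib.robotparser import RobotFileParser

# Example robots.txt content (hypothetical):
#   User-agent: *
#   Disallow: /internal-search/
#   Disallow: /cart/
parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /internal-search/",
    "Disallow: /cart/",
])

# Make sure important pages are NOT accidentally blocked.
for url in [
    "https://www.example.com/shoes/running/",             # should stay crawlable
    "https://www.example.com/internal-search/?q=shoes",   # intentionally blocked
]:
    status = "allowed" if parser.can_fetch("Googlebot", url) else "blocked by disallow"
    print(url, "->", status)
```

Note that urllib.robotparser implements the basic robots.txt rules, so complex wildcard patterns that Google supports may not be evaluated identically; treat it as a quick sanity check, not a full simulation.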

https://youtu.be/GIVu8RLyDFU?si=D-tTiymrMUAnbgbY
Personally, knowing what we mentioned above, that crawling problems only really appear when there are a lot of URLs, I hardly use the disallow directive anymore because I consider it a wasted effort. But it is important to know its applications and uses, because it may be needed at any time.

Indexability
Indexability goes hand in hand with crawling; in fact, some people confuse the two terms.

If a page is indexable, it can be listed in search engines. If the page is not indexable, users will not be able to reach it through the organic channel, which is a disaster for SEO.

To ensure that our pages are indexed correctly, several factors must be taken into account (see the small check sketch after this list):

The page must be crawlable: the search engine must be able to find it and read its content. This is where people often confuse the terms crawlability and indexability.
The page must respond correctly: it must load and display without any problems. In slightly more technical terms, it must return a 200 (OK) status code.
The page must not have a noindex meta tag.
The page must be canonical (explanation of the canonical tag here). This means that the page's canonical tag must not point to any URL other than the page itself, also known as a self-referential canonical.
A URL can be crawled but not indexed. For example, when a page has a noindex meta tag, Google will crawl it but not show it in search results.
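As a rough illustration, the checklist above can be automated for a single URL. This is only a sketch (it uses requests and BeautifulSoup, a placeholder URL, and does not cover crawlability in robots.txt or JavaScript rendering):

```python
# Minimal indexability sketch for one URL: status code, noindex meta tag,
# and self-referential canonical.
import requests
from bs4 import BeautifulSoup


def indexability_report(url):
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.text, "html.parser")

    # Look for <meta name="robots" content="noindex, ...">
    robots_meta = soup.find("meta", attrs={"name": "robots"})
    has_noindex = bool(robots_meta and "noindex" in robots_meta.get("content", "").lower())

    # Check that the canonical tag, if present, points back to the URL itself
    canonical_tag = soup.find("link", rel="canonical")
    canonical = canonical_tag["href"].rstrip("/") if canonical_tag and canonical_tag.get("href") else None
    self_canonical = canonical is None or canonical == url.rstrip("/")

    return {
        "status_code": response.status_code,            # should be 200 (OK)
        "has_noindex": has_noindex,                     # should be False
        "self_referential_canonical": self_canonical,   # should be True
    }


print(indexability_report("https://www.example.com/shoes/running/"))  # hypothetical URL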

To find out the indexability status of your website, we recommend using the “Pages” feature in the indexing section of the Google Search Console tool.


Google Search Console Coverage Report
You can also paste the URL into the Google Search Console URL inspector, which will tell you whether or not it is indexed. If it is not, you will have the option to ask Google to index the page: Google will test the URL and, if it is indexable, it will index it.


Another quick way to find out whether a URL you are interested in is indexed is simply to search Google for the URL with site: in front of it. Example: site:seocom.agency


Sitemaps and Robots.txt
Sitemap
Improve your SEO!
Related to both crawling and indexability is the sitemap. This is a file that contains (or should contain), in a structured way, all the URLs that we want Google to crawl and index, that is, all the URLs we want to rank in a project. There is no more mystery to it: crawlers review this file, and it is very useful for ensuring that our URLs get crawled.

The way a sitemap is structured is actually fairly free. You can put all the URLs in a single file in no particular order, or, to be more organized, split it by language, by section of the website, and so on. We usually do this when the website is very large, but crawlers do not really care: they will crawl the URLs either way.
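To make the structure concrete, here is a minimal sketch that generates a valid sitemap file with Python's standard library. The URLs are hypothetical; in a real project the list would normally come from the CMS or an SEO plugin:

```python
# Minimal sketch: write a sitemap.xml containing a handful of URLs.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical list of URLs we want crawled and indexed
urls = [
    "https://www.example.com/",
    "https://www.example.com/shoes/running/",
    "https://www.example.com/blog/how-to-choose-running-shoes/",
]

urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
for url in urls:
    entry = ET.SubElement(urlset, "url")
    ET.SubElement(entry, "loc").text = url

# Produces <urlset><url><loc>...</loc></url>...</urlset> in sitemap.xml
ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```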