Every once in a while it is a great idea to do an internal link audit of your website(s). The main reason I personally do this is to make sure we don’t have any broken external links. It’s one of the variables we don’t have any control over; the other webmaster does what they want with their page or site.

I’ve also noticed that over time you can get lazy and sloppy with formatting the website for the web crawlers. It happens to the best of us. It’s extremely important to make sure you have the basic on-page SEO set up correctly.

On-page SEO consists of things like correctly formatted h1 and h2 heading tags, no missing meta descriptions or titles, and no broken outbound or internal links. Running an audit is also a good way to understand how the bots view your website, and why we structure a website this way.


Step 1: Download & Install

Jump over to Screaming Frog’s home page and download the correct file for your OS. They offer a free and a paid version, so make sure you don’t get confused and pay for it just yet. If you end up really enjoying the tool and using it often, then I would suggest upgrading to the paid version for the extra features. For most people doing a basic website audit, the free one will do what we need.

Installing & Updating Java:

You might be asked to update Java; this is normal and to be expected.

Installing the Spider Software:

Go ahead and install Screaming Frog, and open it after it’s finished installing.

Step 2: Scanning The Internal Pages

Once Screaming Frog is running, enter the root domain of your website (e.g. http://domain.com/) into the Enter URL to spider text box, then hit Start to begin the crawl.

Step 3: Auditing and Fixing Errors

Internal Link Report:

We are looking for broken internal and external links to fix. You can usually find some other issues as well while running this report. Sort by Status Code and look for any 4XX client errors. This wiki post has more specific info on each error code and what it means. If you have a 404 error code on a page that you moved or changed the title of, then create a custom redirect for that page. If someone ends up going to the old URL, you want the traffic to get sent to the correct page.

4xx’s = Dead, fix it.
3xx’s = A redirection
2xx’s = Live, all is well.
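
If you export the report and want to sort through it outside the tool, the three buckets above can be sketched in a few lines of Python. This is only a rough sketch: the classify_status helper and the sample rows are made up for illustration and are not part of Screaming Frog.

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code to one of the buckets listed above."""
    if 200 <= code < 300:
        return "live"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "broken"
    return "other"  # e.g. 5xx server errors, not covered in the list above


# Hypothetical (URL, status code) pairs, as they might appear in a crawl export.
crawl_rows = [
    ("http://domain.com/", 200),
    ("http://domain.com/old-page/", 301),
    ("http://domain.com/missing/", 404),
]

# Keep only the URLs that need fixing.
broken = [url for url, code in crawl_rows if classify_status(code) == "broken"]
print(broken)
```

Anything that lands in the "broken" bucket is a candidate for a redirect or a link fix.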

External Link Report:

We can take a look at all the external links pointing out from our site under the External tab. If you find that one of the pages you are linking out to is dead, then go ahead and find a replacement. When looking at a 4xx code on an outbound link, you can see which of your pages contain that link in the Inlinks tab.

This is how some people like to go about getting backlinks from others. They email the site owner to let them know there is a dead link on a post, and that they have a “perfect” article for the replacement. It can be a good tactic depending on what niche you are in; just don’t be pushy when you email them.

Header Tags:

Having multiple h1’s is almost always a bad idea, and Google can penalize you because of it. Best practice would be to remove any extra h1 titles if you find any, so there is only one per page. In our example, I happened to set two h1 tags on the home page #oops.

I never noticed it until now, and it may be why the exact match search for PDXWeb Design is on page 3 at the moment. All I had to do was change that tag to an h2, and all is well now. If any large movements in the search results happen within the next few days, it’s likely from an automatic penalty getting lifted.

I’m sure there are some legitimate reasons for having two h1 tags on one page, but they don’t come up often.
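
If you want to double check a single page by hand, counting h1 tags is easy to sketch with Python’s built-in html.parser module. The H1Counter class and the sample markup below are illustrative assumptions, not anything Screaming Frog produces.

```python
from html.parser import HTMLParser


class H1Counter(HTMLParser):
    """Counts opening <h1> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.h1_count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag == "h1":
            self.h1_count += 1


def count_h1(html: str) -> int:
    parser = H1Counter()
    parser.feed(html)
    return parser.h1_count


# Hypothetical home page markup with the same mistake described above.
sample_html = (
    "<html><body>"
    "<h1>Site Name</h1>"
    "<h1>Welcome</h1>"
    "<h2>Services</h2>"
    "</body></html>"
)

print(count_h1(sample_html))  # a count above 1 means extra h1 tags to clean up
```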

Image Size and Count Report:

Image tab: All looked well; we just have some super large image files because of all the screenshots. The one thing I did notice was the social share icons. Digg and Reddit have a decently large file size for their images, 1000 KB+, and they show up on every single post page. I’ve noticed these two are not all that commonly used for sharing blog posts, so the icons could be removed.

Removing them will increase the speed of pages slightly by eliminating unnecessary calls to the server when a page is loaded. If you see other images that show up often and are large or being re-scaled, then go ahead and take a few minutes to address those problems.
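
To pick out the worst offenders from a longer list, something like the sketch below could work. The rows, the size limit, and the "shows up often" cutoff are all assumptions for the sake of the example; plug in your own numbers.

```python
# Hypothetical rows: (image URL, size in bytes, number of pages using it).
images = [
    ("http://domain.com/img/digg.png", 1_100_000, 40),
    ("http://domain.com/img/reddit.png", 1_050_000, 40),
    ("http://domain.com/img/logo.png", 12_000, 45),
    ("http://domain.com/img/screenshot-1.png", 900_000, 1),
]

SIZE_LIMIT_BYTES = 100 * 1024  # assumed cutoff for a "large" image
MIN_PAGES = 5                  # assumed cutoff for "shows up often"


def heavy_repeated_images(rows, size_limit=SIZE_LIMIT_BYTES, min_pages=MIN_PAGES):
    """Return images that are both large and reused on many pages,
    ordered by total bytes served (size times page count), biggest first."""
    flagged = [r for r in rows if r[1] > size_limit and r[2] >= min_pages]
    return sorted(flagged, key=lambda r: r[1] * r[2], reverse=True)


for url, size, pages in heavy_repeated_images(images):
    print(f"{url}: {size / 1024:.0f} KB on {pages} pages")
```

The one-off screenshot is skipped here because it only loads on a single page, while the repeated icons float to the top.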

Meta & Link Elements:

The Directives tab will show the meta robots, canonical, and rel=“next” and rel=“prev” link elements.

So in this example we can take a look at all of the no-indexed pages we have on the site. Make sure you did not accidentally no-index a page you want to show up in the search engine results. With category pages on WordPress, it’s generally best practice to no-index those to prevent duplicate content issues.
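
For a quick spot check of one page, here is a minimal sketch that looks for a noindex directive in the meta robots tag, again using only Python’s standard library. The class name, helper, and sample markup are hypothetical, and it only reads the meta tag, not any HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def is_noindexed(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical category page and post page.
category_page = '<head><meta name="robots" content="noindex, follow"></head>'
post_page = '<head><meta name="robots" content="index, follow"></head>'

print(is_noindexed(category_page), is_noindexed(post_page))
```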

Site Overview:

The Overview tab is also super helpful for spotting other SEO-related problems. I don’t see a ton of problems in the overview, except that one of our page titles is flagged as too long. I’ll shorten that title to make it easier to read and keep the spiders happy. You can find more information about the tabs and the Screaming Frog software here.
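
If you keep a list of your page titles somewhere, the same too-long check is simple to sketch. The 60 character limit below is an assumption for illustration; use whatever threshold your tool is configured with. The URLs and titles are made up as well.

```python
MAX_TITLE_CHARS = 60  # assumed limit, not taken from the tool's settings


def long_titles(pages, limit=MAX_TITLE_CHARS):
    """Return (url, title length) for every title longer than the limit."""
    return [(url, len(title)) for url, title in pages if len(title) > limit]


# Hypothetical (URL, page title) pairs.
pages = [
    ("http://domain.com/", "Home"),
    (
        "http://domain.com/audit/",
        "How To Run A Complete Internal Link And On Page SEO Audit Of Your Website",
    ),
]

for url, length in long_titles(pages):
    print(f"{url}: title is {length} characters long")
```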