Today’s mini case study: spam 60,000 blog pages twice to test whether paid private proxies are worth the money. The total list is roughly 60k unique URLs, split into five shorter lists. Each list was run twice through ScrapeBox’s fast poster, so the only variable that changes between runs is the type of proxy used.
Freshly scraped Google-passed proxies vs. semi-private proxies from Blazing SEO. Which will post more?
Blazing SEO Private HTTP/HTTPS/SOCKS Supported Proxies
We are going to use Blazing SEO proxies for this test, as they allow unlimited bandwidth and threads. I believe I came across them on the GSA SER forum; they are newer but have worked well for me. Their main features are no limits on bandwidth/threads, instant proxy replacement once per month, and user:pass authorization if you are on a dynamic IP. If you go with Blazing SEO, I recommend choosing US locations. That way you can order proxies in a few different cities for posting, so all of your IP traffic isn’t coming from one location. The US-located IPs are also the cheapest per proxy.
Semi-private means 2 or 3 other users (like yourself) will also be using these IPs for SEO campaigns. There is a small chance the other users on an IP could get it banned on a domain. If you want a 100% private proxy, they only cost a few dollars more. Rotating proxies are also nice because they switch out their IP address every 10+ minutes, giving you a really fresh supply of proxies for whatever you are doing. For the average ScrapeBox user, though, backconnect/rotating proxies are overkill. They do offer a free two-day trial if you want to see whether they perform better.
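If you want to use user:pass-authenticated proxies like these outside of ScrapeBox, here is a minimal sketch of wiring one into Python’s stdlib urllib. The host, port, and credentials below are placeholders for illustration, not real proxy details.

```python
# Sketch: routing requests through a user:pass-authenticated proxy,
# the kind semi-private/private providers hand out. All proxy
# details below are made-up placeholders.
import urllib.request

def proxy_url(host, port, user, password):
    """Format an authenticated HTTP proxy URL."""
    return f"http://{user}:{password}@{host}:{port}"

url = proxy_url("203.0.113.10", 8080, "myuser", "mypass")
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"http": url, "https": url})
)
# opener.open("https://example.com", timeout=10)  # would route via the proxy
print(url)
```

The actual request is left commented out since it needs a live proxy; the point is the `user:pass@host:port` URL shape that providers with IP-auth alternatives expect.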
The Google Passed Proxies
Bonus: Scrapebox Guide For Better Proxy Harvesting Results
Keywords to Search for sources in Google:
ssl list filetype:txt
free daily proxy sites
daily proxies list
fresh new proxies
proxy list 2017 txt
new proxy sites 2017
new proxy every day
fresh unblocked proxy
If you are struggling to find pages that are easy to scrape the IPs out of, append filetype:txt to the search, as in the first example keyword in this list.
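Once a search like the ones above surfaces a plain-text proxy page, extracting the IP:port pairs is a simple pattern match. A minimal sketch (the sample page text is invented for illustration):

```python
# Sketch: pulling unique ip:port pairs out of a plain-text proxy
# list page, like those the filetype:txt searches above find.
import re

PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{2,5})\b")

def extract_proxies(text):
    """Return unique ip:port strings found in a page of text, in order."""
    seen = []
    for ip, port in PROXY_RE.findall(text):
        proxy = f"{ip}:{port}"
        if proxy not in seen:
            seen.append(proxy)
    return seen

page = "fresh list 198.51.100.7:3128\nsome junk\n198.51.100.7:3128 203.0.113.44:8080"
print(extract_proxies(page))  # ['198.51.100.7:3128', '203.0.113.44:8080']
```

This mirrors what ScrapeBox’s proxy harvester does internally when you point it at a custom source URL.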
Adding Proxy Sources In Scrapebox
The above sources that are checked are the ones used for this test. You are welcome to use these, but I would recommend you create a list of your own. The sources that come default in ScrapeBox can pull some decent results, but I’ve found that creating a custom list saves an enormous amount of time. Just remember to check how many Google-passed proxies come from each new source you add, and remove any source that isn’t worth spending resources on.
Note: For more information on scraping free proxies for web scraping, read our post on building a list of proxy servers.
Filtering Out Slow Internet Protocol Addresses
For the Google-passed proxy test, all proxies were scraped from a couple of proxy blogs found with a few Google searches (listed above). All of these scraped free proxies had the slow ones removed, so we are less likely to hit a timeout on a result we would have gotten with a quicker proxy. You can find this feature under filter/remove by speed. For each new list tested, we scraped fresh proxies to weed out any banned ones.
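The filter/remove-by-speed step can be sketched in Python as timing a test fetch through each proxy and keeping the fast ones. `check_speed` is an assumed helper name, and the timings in the example are pre-measured fake values so the filtering logic can be shown without live proxies.

```python
# Sketch: dropping slow proxies before a run, mirroring ScrapeBox's
# filter/remove by speed. Proxy addresses and timings are made up.
import time
import urllib.request

def check_speed(proxy, test_url="http://example.com", timeout=5):
    """Return seconds to fetch test_url via proxy, or None on failure."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": f"http://{proxy}"})
    )
    start = time.monotonic()
    try:
        opener.open(test_url, timeout=timeout)
    except OSError:
        return None
    return time.monotonic() - start

def filter_by_speed(timings, max_seconds=3.0):
    """Keep proxies whose measured time is under max_seconds."""
    return [p for p, t in timings.items() if t is not None and t <= max_seconds]

# Example with pre-measured timings in seconds; None means the check failed.
timings = {"198.51.100.7:3128": 1.2, "203.0.113.44:8080": 7.9, "192.0.2.5:80": None}
print(filter_by_speed(timings))  # ['198.51.100.7:3128']
```

In a real run you would build `timings` by calling `check_speed` on each scraped proxy, ideally from a thread pool since most free proxies will time out.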
The Final Results
Scraped free proxies: posted 4608
Blazing SEO Proxies: posted 5830 <<< Winner!
The List Breakdown
List 5 was an expanded list of a couple of domains that were auto-approving. I’ve been running that list for a while using my semi-private proxies while testing GSA Captcha Breaker. It’s possible those IPs got banned on one of those domains from blasting them with hundreds of comments.
Overall, though, we had a 21% gain in successful postings from spending just a few dollars on proxies. There are some downsides, but looking at the larger picture: if you are looking to get into automated link building, proxies will be well worth the money spent.
I would use a combination of private and scraped proxies, reserving the paid ones for posting and other sensitive automation tasks. With how fast Google can ban an IP, it’s nice to have a large supply to burn when doing an extensive web scraping campaign.