Netpeak Checker 3.0: SERP Scraping. Brief Overview

Folks, the Netpeak Software team is glad to share great news: Netpeak Checker 3.0, which we worked on for more than a year, has been released. In this post, I'll tell you about the new features of Netpeak Checker and describe how the tool has changed over the last year.

Spoiler: now you can solve many more tasks with its help ;)

1. SE (Search Engines) Scraper

With a new tool called 'SE Scraper', Netpeak Checker 3.0 has become much more versatile than before. It helps you get Google, Bing, Yahoo, and Yandex search results in a structured table with a lot of useful data.

1.1. What Data You Can Get:

  • URL → page address.
  • Position → indicates URL position in corresponding SERP.
  • Snippet type → it can be a regular result, video, image, news, or sitelink.
  • Title → page title in SERP snippet. Can be shown differently depending on the search query.
  • Description → description of the page shown under its Title in the SERP snippet. Like the Title, the description can vary depending on the search query.
  • Highlighted text → indicates text marked as bold in SERP snippet: this way search engines highlight exact match for the query or similar terms (synonyms).
  • Sitelinks → texts (anchors) used for sitelink extensions in the SERP snippet of a particular search result.
  • Rating (review snippet) → indicates rating of the page, if it's shown in SERP snippet.
  • Featured snippet → indicates if the URL is mentioned in a special featured snippet block at the top of the search results page.
  • Host → indicates the host of the page from the SERP. For instance, if the initial URL is [https://subdomain.domain.com/page.html], the cell will contain the value [subdomain.domain.com].
  • Query and Search Engine → these parameters help you navigate because the table can contain data from different search queries and search engines.
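
To make the 'Host' column concrete, here is a minimal Python sketch of how a host can be derived from a URL. It only illustrates the idea and is not how Netpeak Checker itself is implemented; the function name `host_of` is hypothetical.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the host part of a URL (hypothetical helper for illustration)."""
    # urlsplit().hostname is lowercased and excludes the port; it is None
    # when the string has no network location, hence the fallback.
    return urlsplit(url).hostname or ""


print(host_of("https://subdomain.domain.com/page.html"))  # subdomain.domain.com
```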

2. 'Search Engines' Settings

Netpeak Checker is now optimized for local SEO and regional SERP research. The program settings have a new 'Search engines' item that affects both the standard search result parameters (Google / Bing / Yahoo / Yandex SERP) and the new 'SE Scraper' tool.

Netpeak Checker 3.0: Tab with 'Search engines' settings

Let's take a closer look at the implemented settings:

  • Google:
    • geolocation (city / region)
    • country
    • language
    • date and time (past hour, past 24 hours, past week, past month or past year)
  • Bing:
    • 'country' or 'country / language' parameter
    • date and time (past 24 hours, past week or past month)
  • Yahoo:
    • date and time settings similar to Bing's
  • Yandex:
    • geolocation
    • language
    • date and time (past 24 hours, past week, past month or past year)

3. New 'Proxy Anti-Ban' Algorithm

This new feature improves work with search engines. Proxies let you receive data from Alexa, Google SERP, Bing SERP, Yahoo SERP, Yandex SERP, Wayback Machine, Facebook, Pinterest, and On-Page parameters, and they reduce the number of Captchas because requests are made from different IP addresses.

We figured out how to minimize the chance of getting your proxy banned →

  • We've determined the time delays and the number of proxies that can be used simultaneously for each service while keeping data scraping reliable;
  • We've improved the proxy switching function in case of a ban from a particular service and introduced the conditions for further unbanning;
  • If all proxies are excluded, at the end of the analysis you will see a notification about which services were stopped and why.
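
As a rough illustration of the ideas above (per-service delays, switching away from a banned proxy, and conditions for unbanning), here is a minimal Python sketch. It is not Netpeak Checker's actual algorithm: the `ProxyPool` class, its method names, and the delay and cooldown values are all hypothetical.

```python
import time


class ProxyPool:
    """Hypothetical proxy rotation for one service (illustration only)."""

    def __init__(self, proxies, min_delay, ban_cooldown):
        self.proxies = list(proxies)
        self.min_delay = min_delay        # assumed pause between uses of one proxy, seconds
        self.ban_cooldown = ban_cooldown  # assumed time before a banned proxy is retried
        self._last_used = {}
        self._banned_at = {}

    def acquire(self, now=None):
        """Return the first usable proxy, or None if none is usable right now."""
        now = time.monotonic() if now is None else now
        for proxy in self.proxies:
            banned_at = self._banned_at.get(proxy)
            if banned_at is not None:
                if now - banned_at < self.ban_cooldown:
                    continue                 # still excluded
                del self._banned_at[proxy]   # unban condition met
            last = self._last_used.get(proxy)
            if last is not None and now - last < self.min_delay:
                continue                     # respect the per-service delay
            self._last_used[proxy] = now
            return proxy
        return None

    def report_ban(self, proxy, now=None):
        """Exclude a proxy after the service has banned it."""
        self._banned_at[proxy] = time.monotonic() if now is None else now

    def all_excluded(self, now=None):
        """True if every proxy is currently banned for this service."""
        now = time.monotonic() if now is None else now
        return all(
            proxy in self._banned_at
            and now - self._banned_at[proxy] < self.ban_cooldown
            for proxy in self.proxies
        )
```

In this sketch, a caller would stop querying the service and report the reason once `all_excluded()` returns True, which mirrors the notification described in the last bullet.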

4. Custom Filter and Parameter Templates

Now you can save your own or choose default templates for parameters and filters in Netpeak Checker as well as in Netpeak Spider.

Netpeak Checker 3.0: Parameter templates

4.1. Parameter Templates

Choose the necessary parameters and save a template with the name you choose or use one of the default templates:

  • All free → connection with paid services is not required.
  • Link building → for evaluating link sources we chose basic On-Page parameters, data from Alexa, Whois, Google, Facebook as well as backlink parameters from the free version of Mozscape that can optionally be replaced by any paid analog.
  • Dropped domains: basic → the most basic parameters that will allow you to find dropped domains more effectively.
  • Dropped domains: extended → after applying a basic set of parameters, you can go on to a more detailed analysis.
  • Contacts search → the server response code (to make sure that the website is accessible), Title, Description, all social network settings, all available email address parameters (both on the page and in Whois-records).

4.2. Filter Templates

Netpeak Checker is made for various kinds of research, benchmarking, and so on. You most likely apply the same filters to your results again and again, so now you can create your own filter templates and apply them in one click.

5. Filtration and Search

5.1. Updated 'URL Explorer'

Now the 'URL Explorer' tab shows the filter conditions applied to the data and the ratio of filtered URLs to the total number of results (as a percentage).

Netpeak Checker 3.0: Tab with applied filter

You can open the filter settings window in three ways:

  • press the 'Filter' button on the dashboard;
  • via the 'Analysis' item → 'Set filter...' in the main menu;
  • with the Ctrl+F keyboard shortcut.

5.2. 'Filter by Value' Function

This new function lets you quickly filter results based on the data in selected cells. For example, suppose you've analyzed 1 million URLs and some of them return 301/302 redirects. To find all pages with a redirect, select a cell with the relevant response code and choose the 'Filter by value' item in the context menu:

Netpeak Checker 3.0: Filter by Value
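
Conceptually, 'Filter by value' keeps only the rows whose cell in the chosen column equals the selected value. The sketch below shows that idea in Python on a tiny in-memory table, together with the share of filtered rows; the function names, column names, and sample rows are hypothetical and are not taken from the program.

```python
def filter_by_value(rows, column, value):
    """Keep rows whose `column` cell equals `value` (illustrative helper)."""
    return [row for row in rows if row.get(column) == value]


def filtered_share(filtered, total):
    """Share of filtered rows among all rows, as a percentage."""
    return 100.0 * len(filtered) / len(total) if total else 0.0


# Hypothetical sample data standing in for an analyzed URL table.
rows = [
    {"url": "https://example.com/", "status_code": 200},
    {"url": "https://example.com/old", "status_code": 301},
    {"url": "https://example.com/tmp", "status_code": 302},
    {"url": "https://example.com/moved", "status_code": 301},
]

redirects_301 = filter_by_value(rows, "status_code", 301)
print(len(redirects_301), filtered_share(redirects_301, rows))  # 2 50.0
```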

5.3. Quick Search in Table

Quick search is implemented in all tables of the program (the 'All Results' and 'URL Explorer' tabs in the main table, as well as the 'SE Scraper' tool) – just press Ctrl+E and enter a search query:

Netpeak Checker 3.0: Quick search in table

6. Optimization of RAM Consumption (×2)

In the new version of Netpeak Checker, we applied our own latest developments to optimize RAM consumption. As a result, RAM consumption has dropped by more than half.

7. Interface and Usability

7.1. Parameters:

  • The 'Parameters' tab has been moved to the right sidebar;
  • Choose a parameter or service and you will see its extended description in the info panel below;
  • Clicking a parameter scrolls the current table to it, and clicking a service scrolls the current table to the first parameter of that service.

7.2. Adding List of URLs for Analysis:

  • manually
  • from a file (it can be a file with .txt, .xlsx, .csv, .xml, .nspj and .ncpj extensions)
  • from XML Sitemap (XML Sitemap, XML Sitemap index, TXT Sitemap, and their gzip-archives)
  • from clipboard
  • Drag and Drop

7.3. Saving the Current List of URLs to a TXT File

7.4. Data Rechecking in 'All Results' and 'URL Explorer' tabs:

  • recheck values → starts rechecking all values you've highlighted
  • recheck rows → rechecking all parameters for the selected URLs
  • recheck columns → rechecking selected parameter for all URLs in the current table

Netpeak Checker 3.0: Context menu

7.5. Improved Reports

  • Sorting, grouping, quick search, column order, attached columns, and filtering are all reflected in the exported file.
  • Now if the results don't fit in one file, Netpeak Checker will automatically split the data into several files and number them.
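
The automatic splitting of large exports can be pictured as chunking the rows and numbering the resulting files. Below is a minimal Python sketch of that idea; the row limit, the file naming scheme, and the function name `split_export` are assumptions made for illustration, not the program's actual behavior.

```python
def split_export(rows, max_rows, base_name="report", extension="xlsx"):
    """Split rows into numbered export files of at most max_rows rows each.

    Returns a list of (file_name, rows_chunk) pairs. The naming scheme
    ("report_1.xlsx", "report_2.xlsx", ...) is hypothetical.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    chunks = [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]
    if len(chunks) <= 1:
        # Everything fits in one file, so no numbering is needed.
        return [(f"{base_name}.{extension}", list(rows))]
    return [
        (f"{base_name}_{number}.{extension}", chunk)
        for number, chunk in enumerate(chunks, start=1)
    ]


for name, chunk in split_export(list(range(5)), max_rows=2):
    print(name, chunk)
```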

8. Verifying Proxy Operation Status

We've developed special functionality to verify a single proxy or the entire proxy list right in the program settings. You can load your proxy list and check whether each proxy can access the Internet, Google, Bing, Yahoo, and Yandex.

In addition to checking the availability, you can also find out the response code, the response time and the country the corresponding proxy server belongs to.

Netpeak Checker 3.0: Settings of the list of proxies

9. Limits and Balance Display

Now the paid services settings show your remaining limits for the following services: Serpstat, Ahrefs, Majestic, SEMrush, and anti-captcha.com.

10. Changes in Analyzed Parameters

New parameters were implemented:

  • SQI parameter from Yandex.Webmaster → Site Quality Index that shows how useful a site is according to Yandex.
  • 'Language' On-Page parameter → defines the language of the target page in the ISO 639-1 format.
  • Host parameters for Yandex SERP → 'Indexed URL' (the number of pages of the site indexed by Yandex), 'Indexing' (whether the target host is indexed by Yandex) and 'Address' (the physical address of the organization that appears in the snippet on the Yandex search results page).

Note that now when exporting a table to XLSX format, you can see hints with a description of each parameter in the column name.

The following services were also completely removed:

  • LinkedIn → the API for receiving data on-demand was closed;
  • Twitter → similar situation;
  • StumbleUpon → service moved to Mix.com (we are still thinking over the need to integrate with this service).

11. Special Status Codes

We've introduced special notations for the most common situations that accompany a status code (the server response code). They are appended to the status code with the & symbol:

  • Disallowed → URL is blocked in robots.txt file.
  • Canonicalized → URL contains Canonical tag pointing to another URL (note that if the page has Canonical pointing to itself, you won't see this notation).
  • Refresh Redirected → URL contains Refresh tag (in HTTP response headers or Meta Refresh in the <head> section of the document) pointing to another URL.
  • Noindex / Nofollow → URL contains instructions not to index and/or follow links accordingly (instructions can be placed in HTTP response headers or in the <head> section of the document).

So if a page returns a 200 server response code but is blocked in the robots.txt file, you will now see '200 OK & Disallowed' in the 'Status Code' column:

Netpeak Checker 3.0: Status Codes
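
To show how such a combined value can be put together, here is a small Python sketch that joins a status code with its notations using the & symbol. The function name `status_label` is hypothetical, and joining several notations in a row is an assumption, since the example above shows only one.

```python
from http import HTTPStatus


def status_label(code, notations=()):
    """Build a label such as '200 OK & Disallowed' (illustrative helper)."""
    # HTTPStatus supplies the standard reason phrase for the code.
    base = f"{code} {HTTPStatus(code).phrase}"
    return " & ".join([base, *notations])


print(status_label(200, ["Disallowed"]))  # 200 OK & Disallowed
```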

12. Other Improvements

  • 'Restart' function (F6 key) was added. When you click on the corresponding button, all data will be cleared and analysis of parameters for all added URLs will restart.
  • Now your new project won't replace the old one while saving. Instead you will be asked to save a separate project with a new name (Ctrl+S).
  • If you don't have time to fill in the project name, then you can rely on the new 'Quick save' function (Ctrl+Shift+S), which will fill the name for you and save the project without any further questions :)
  • Monitoring memory limit → the program checks the amount of free RAM and disk space: at least 128 MB of each must be available for the program to run. If the limit is reached, the analysis stops and the data remains intact.
  • Now in case of redirection Netpeak Checker follows it and enters the values of the final page (not the initial one as it was before) for all On-Page parameters taken from the source code. The original page retains only parameters received from HTTP headers.
  • The 'Open URL in service' item has been added to the context menu, allowing you to open the selected URL in Google services, Bing, Yahoo, Yandex, Serpstat, Majestic, Open Site Explorer (Mozscape), Ahrefs, Google PageSpeed, Mobile Friendly Test, Google Cache, Wayback Machine (Web Archive), W3C Validator, or all at once.
  • The 'Open robots.txt' item has also been added to the context menu. It (by sheer coincidence) opens the robots.txt file in the root folder of the selected host.
  • We've implemented highlighting links in the table. Now you will not mix up URLs and plain text in cells.
  • When you click on any cell in the table, the same table will be shown in the 'Info' panel with the analyzed parameters, only vertically and specifically for this URL. Previously, this panel showed the full contents of the cell and it wasn't always useful.
  • We've completely removed a separate bar with grouping by services in the main table. Instead of it, column name indicates the service and the 'target' (for example, URL or Host).
  • Multi-window → it is now possible to open several Netpeak Checker windows directly from the 'Project' menu → 'New Window' (Ctrl+Shift+N) and run separate analyses.
  • We've improved saving the positions of all windows and panels → and if something goes wrong, you can always reset their positions using the 'Interface' menu → 'Reset all positions', 'Reset window items', or 'Reset panel positions'.
  • Since we had lots of requests from our users, you no longer need admin rights to install, update and run Netpeak Software programs.
  • An option to configure the User Agent and use basic authentication when analyzing On-Page parameters has been added.
  • We've significantly improved the algorithm for manually adding a list of URLs → now the program recognizes a URL with a path but without a protocol (for example, [example.com/category]). It also detects several URLs in one line (in this case all URLs except the first one need a protocol). And we've implemented escaping #! for Ajax.
  • Now you have an option for manual and automatic (with the help of anti-captcha.com) Google reCAPTCHA solving.
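
The rule for manually added URLs (the first URL in a line may omit the protocol, later ones need it) can be sketched in Python as follows. This is only an illustration under stated assumptions: URLs in a line are taken to be separated by whitespace, `http` is used as the default protocol, and later entries without a protocol are skipped. None of these details are confirmed for Netpeak Checker, and the function name `parse_url_line` is hypothetical.

```python
import re

_HAS_PROTOCOL = re.compile(r"^https?://", re.IGNORECASE)


def parse_url_line(line, default_scheme="http"):
    """Extract URLs from one line of manual input (illustrative helper)."""
    urls = []
    for index, token in enumerate(line.split()):
        if _HAS_PROTOCOL.match(token):
            urls.append(token)
        elif index == 0:
            # Only the first entry may omit the protocol.
            urls.append(f"{default_scheme}://{token}")
        # Assumption: later entries without a protocol are ignored.
    return urls


print(parse_url_line("example.com/category https://example.org/page"))
```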

In a Nutshell

In version 3.0, Netpeak Checker not only got a powerful new built-in tool for scraping SERPs, but also became a faster, more reliable, and more useful instrument for the most interesting and complicated tasks in digital marketing. It can be a real lifesaver for link builders, SEO and PPC specialists, webmasters, marketing teams, bloggers, and sales managers.

You can find more detailed information about all the new features and the tasks that Netpeak Checker solves on the Netpeak Software blog.
