Is MetaSeeker a screen scraper or a web scraper?

There is no authority that defines what a screen scraper and a web scraper are. The following description is based on my own understanding of the difference between them. I think a screen scraper is integrated with a Web browser engine and a web scraper is not. So MetaSeeker belongs to the former type.

Because it is integrated with a Web browser engine, a screen scraper can support all standard page authoring technologies. The scraper extracts Web data in WYSIWYG style, i.e. What You See Is What You Get. Here is an example. In an HTML document, the character string <br> means a line break. The screen scraper doesn't care what the source code of the HTML document looks like. What it sees is a line break code, e.g. 0x0a0x0d, which the browser engine produces from the string <br>. As a result, the code 0x0a0x0d will appear in the results.
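
To make the <br> example concrete, here is a minimal sketch in Python. It is not how MetaSeeker works internally; it only imitates, with the standard html.parser module, the translation step that a browser engine would perform. The names LINE_BREAK, RenderedText and rendered_text are my own, and the "\n" value is an assumption, since the actual line break code depends on the engine.

```python
from html.parser import HTMLParser

# Assumption for illustration: the engine emits "\n" for a line break.
# A real browser engine may emit a different sequence.
LINE_BREAK = "\n"


class RenderedText(HTMLParser):
    """Collects the text a reader would see, turning <br> into a line break."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # The tag itself never reaches the output, only its visible effect.
        if tag == "br":
            self.parts.append(LINE_BREAK)

    def handle_data(self, data):
        self.parts.append(data)


def rendered_text(html):
    """Return the visible text of an HTML fragment (hypothetical helper)."""
    parser = RenderedText()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


print(repr(rendered_text("<p>first line<br>second line</p>")))
```

The result contains a line break character between the two lines and no trace of the string <br>, which is the behaviour described above for a screen scraper.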

There are many ways to implement a web scraper. One implementation may use regular expressions, matching the HTML document against a series of regular expressions to find particular data snippets. So what it sees is the HTML document's source code. As a result, it naturally extracts the string <br> instead of the code 0x0a0x0d. Another implementation may use programming libraries that manipulate the DOM. This implementation is very similar to a screen scraper in how it handles the HTML DOM. But it is complicated to integrate other engines, e.g. JavaScript engines, into it, and without them its capability is sharply limited.
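
The regular expression approach can be sketched as follows. This is only an illustration under my own assumptions: the sample HTML, the pattern ADDR_PATTERN and the helper extract_with_regex are hypothetical and do not come from any particular scraper. It shows that a scraper working on the source code returns the markup as it was written.

```python
import re

# Hypothetical sample page fragment, used only for this illustration.
html = '<div class="addr">12 Main St<br>Springfield</div>'

# Hypothetical pattern: capture everything inside the "addr" block.
ADDR_PATTERN = re.compile(r'<div class="addr">(.*?)</div>', re.DOTALL)


def extract_with_regex(source):
    """Return the raw source between the tags, or None if nothing matches."""
    match = ADDR_PATTERN.search(source)
    return match.group(1) if match else None


snippet = extract_with_regex(html)
print(repr(snippet))  # the string <br> is still part of the snippet
```

Because the pattern is matched against the source text, the extracted snippet still contains the string <br>. Any translation into a line break has to be done afterwards by the program that consumes the result.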

To run MetaSeeker properly and handle its results correctly, you must keep this characteristic of MetaSeeker in mind. Please click the taxonomy term Result Files to read all articles on how to manipulate results.
