key

If a property is decorated with the key attribute, it must exist on the target page. Otherwise, the page is treated as unrecognizable or the extracted result is discarded.

There are two types of keys:

  1. validation key: acts as a data schema recognition rule. If a property with this attribute cannot be found on the target page, the page is marked as unrecognizable and the status of the SpiderClue is changed to unknownschema;
  2. data key: validates the extracted results. If a property with this attribute cannot be found on the target page, the extracted results are discarded, but the extraction task still finishes successfully and the status of the SpiderClue is changed to extracted.

The former covers the latter. If a property has no key set and is not found on the target page, it takes the reserved word geometa_NAV as its value in the result file.
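
The rules above can be sketched as a small decision function. This is only an illustration, not the tool's actual implementation: the names Prop and resolve, the encoding of the key as the strings "validation" and "data", and the dictionary used to represent the target page are all hypothetical. It also assumes that "covers" means the validation key is checked first, and that a normal successful extraction ends in the extracted status as well.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

NAV = "geometa_NAV"  # reserved word named in the text


@dataclass
class Prop:
    # Hypothetical model of a property definition.
    name: str
    key: Optional[str] = None  # "validation", "data", or None (assumed encoding)


def resolve(props: List[Prop], page: Dict[str, str]) -> Tuple[str, Optional[Dict[str, str]]]:
    """Return (status, result) for one page.

    `page` maps property names to the values found on the target page;
    a property that was not found is simply absent from the dict.
    """
    missing = [p for p in props if p.name not in page]

    # A missing validation key means the page schema is not recognized.
    if any(p.key == "validation" for p in missing):
        return "unknownschema", None

    # A missing data key discards the results, but the task still succeeds.
    if any(p.key == "data" for p in missing):
        return "extracted", None

    # Properties without a key that are missing get the reserved word.
    result = {p.name: page.get(p.name, NAV) for p in props}
    return "extracted", result


props = [Prop("title", "validation"), Prop("price", "data"), Prop("note")]
print(resolve(props, {"title": "Example", "price": "10"}))
```

In this sketch, a page that lacks title yields unknownschema, a page that lacks only price yields extracted with no result, and a page that lacks only note yields a result in which note is geometa_NAV.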