Compared with V2.0, MetaStudio V3.1.0 implements the proprietary FreeFormat technology. As a result, one more type of bucket, the FreeFormat bucket, is supported, which improves the compatibility of Web data extraction rules and the precision of data extraction. Both types of bucket, FreeFormat and ListBucket, are explained in the following chapter, but the FreeFormat bucket is recommended because it covers all the capabilities of ListBucket. Use the FreeFormat bucket when defining new data schemas. ListBucket, in contrast, is used to edit and maintain legacy data schemas generated by previous versions of MetaStudio.
At the end of each section, there are exercises that let readers try MetaStudio in practice. The goal of the exercises is to extract commodity information from Alibaba.
Tips: This chapter explains how to define a data schema from scratch. There is, however, a shortcut for generating the data and clue extraction instruction files. The MetaCamp server is a repository where data schemas are shared among MetaSeeker users and protected by access control. Before defining a new data schema, you can search for a shared data schema for the same target. If one is found, it can be loaded by initiating the Load operation on the Schema List work board of MetaStudio. If this approach is taken, the next section, Load a sample page, can be skipped. For how to load a data schema, please refer to Facilities#Schema list.