About Smart Data Crawler

Data Process

Flow

  1. When a user accesses the product page, the JavaScript sends a request to the Virtusize Server for the product info.
  2. If the product is not in the Virtusize Server, the request is forwarded to the AutoFeed Server.
  3. The AutoFeed Server starts the web crawler.
  4. The crawler gets the product page information.
  5. AutoFeed analyzes the information and transforms it into structured measurement data.
  6. If the product type cannot be detected, a request is sent to the image AI.
  7. If the image AI cannot detect the product type, the Virtusize QA team handles this part.
  8. The product type mapping is returned.
  9. The AutoFeed Server returns the measurement data to the Virtusize Server.
  10. Because the Virtusize Server now has the data, the Virtusize button is shown when the next user visits the product page.
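
The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual Virtusize or AutoFeed implementation: the names lookup_product, detect_product_type, store, and crawl are hypothetical, the store is a plain dictionary, and the crawl is run inline for simplicity. It shows the two fallbacks described above: the first request for an unknown product gets no data while later requests do, and product type detection tries each detector in order before giving up.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical shape of the structured measurement data (name -> value).
Measurements = Dict[str, float]


def detect_product_type(
    page: str,
    detectors: Iterable[Callable[[str], Optional[str]]],
) -> Optional[str]:
    """Try each detector in order (e.g. page analysis, then image AI).

    Returns None when no detector succeeds, which corresponds to
    handing the product over to the QA team (step 7).
    """
    for detect in detectors:
        product_type = detect(page)
        if product_type is not None:
            return product_type
    return None


def lookup_product(
    product_id: str,
    store: Dict[str, Measurements],
    crawl: Callable[[str], Optional[Measurements]],
) -> Optional[Measurements]:
    """Return stored measurement data, or trigger a crawl for unknown products.

    An unknown product yields None for the current request (no button),
    but the crawl result is stored so the next request finds it (step 10).
    """
    if product_id in store:
        return store[product_id]
    data = crawl(product_id)  # steps 2-9, assumed to be delegated to AutoFeed
    if data is not None:
        store[product_id] = data
    return None
```

In this sketch, two consecutive calls with the same product ID behave like two consecutive visitors: the first call returns None and fills the store, and the second call returns the stored measurements.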

Before the Integration

Confirm the Product ID

The Product ID is set on the product page and is used by the crawler to access the product page.

  1. The Page URL can be determined by the Product ID

For example, if the Product ID is A001 and the URL is http://example.com/item/A001.html, the product URL can be derived from the Product ID and the crawler can work properly.

  2. The Page URL cannot be determined by the Product ID

For example, if the Product ID is A001 and the URL is http://example.com/item/jacket.html, the product URL cannot be derived from the Product ID. In this case, a redirector is needed.

The store should implement the redirector as follows:

  • The crawler accesses http://example.com/redirect?product-id=A001.
  • The redirector looks up the real product page URL, for example http://example.com/item/jacket.html.
  • The redirector returns an HTTP 301 response whose Location header is that URL, so that the crawler is redirected to http://example.com/item/jacket.html.
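
A redirector along these lines could look like the following Python sketch, which uses only the standard library. It is a minimal illustration, not a required implementation: the PRODUCT_PAGES mapping, the resolve helper, and the Redirector class are hypothetical names, and a real store would look the URL up in its own catalog instead of a hard-coded dictionary. Returning 404 for unknown IDs is also an assumption of this sketch.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Hypothetical mapping from Product ID to the real product page URL.
PRODUCT_PAGES = {"A001": "http://example.com/item/jacket.html"}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Map a request path such as /redirect?product-id=A001 to (status, location)."""
    parsed = urlparse(path)
    if parsed.path != "/redirect":
        return 404, None
    ids = parse_qs(parsed.query).get("product-id", [])
    target = PRODUCT_PAGES.get(ids[0]) if ids else None
    if target is None:
        return 404, None  # unknown or missing Product ID
    return 301, target


class Redirector(BaseHTTPRequestHandler):
    """Answers GET requests with a 301 and a Location header when the ID is known."""

    def do_GET(self) -> None:
        status, location = resolve(self.path)
        self.send_response(status)
        if location is not None:
            self.send_header("Location", location)
        self.end_headers()
```

To try it locally, the handler can be served with http.server.HTTPServer(("127.0.0.1", 8000), Redirector).serve_forever(). A request to /redirect?product-id=A001 is then answered with a 301 pointing at http://example.com/item/jacket.html.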

Master Data

If the store has any master data (category tree, measurement guide, etc.), please share it with the Virtusize Account Manager.