The Images addon downloads images from extracted image URLs and stores them in Amazon S3. The addon is enabled by setting IMAGES_STORE and defining two item fields:

  • image_urls, with type image, which is used to annotate image URLs in the template. This is the source field from which the addon reads the URLs of the images to download.
  • images, where the addon saves information about each stored image, including its Amazon S3 path relative to the IMAGES_STORE setting and the original image URL.


These field names are the defaults, but they can be overridden with the IMAGES_URLS_FIELD and IMAGES_RESULT_FIELD settings. The source and target fields defined by these two settings do not need to be different: you can point both at the same field, which saves you from defining an additional field in the item. In that case the addon simply overwrites the previously extracted data with the data it generates (a dict that already includes the original URL).
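The effect of sharing one field can be illustrated with a minimal sketch. This is not the addon's code: the store_result helper, the sample item, and the "url" and "path" key names are hypothetical, chosen only to show how the result replaces the extracted URLs when the source and target fields coincide.

```python
# Illustrative sketch only: mimics the overwrite behaviour described above.
# store_result, the sample item and the "url"/"path" keys are assumptions.

DEFAULTS = {
    "IMAGES_URLS_FIELD": "image_urls",
    "IMAGES_RESULT_FIELD": "images",
}


def store_result(item, result, settings=None):
    """Return a copy of item with result written to the target field."""
    merged = {**DEFAULTS, **(settings or {})}
    updated = dict(item)
    updated[merged["IMAGES_RESULT_FIELD"]] = result
    return updated


# Hypothetical extracted item and hypothetical addon output.
item = {"name": "Red chair", "image_urls": ["http://example.com/chair.jpg"]}
result = {"url": "http://example.com/chair.jpg", "path": "full/chair.jpg"}

# Default settings: URLs and stored-image data live in separate fields.
separate = store_result(item, result)

# Same field for source and target: the extracted URLs are replaced.
shared = store_result(item, result, {"IMAGES_RESULT_FIELD": "image_urls"})
```

With the defaults, the item keeps image_urls and gains images. With a shared field, image_urls ends up holding the generated data, and the original URL remains available inside it.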

Settings:

  • IMAGES_STORE - the complete Amazon S3 base path (in the format s3://<bucket name>/<base path>/) where the images should be stored
  • IMAGES_MIN_WIDTH - images narrower than this width (in pixels) are ignored (default value is 0)
  • IMAGES_MIN_HEIGHT - images shorter than this height (in pixels) are ignored (default value is 0)
  • IMAGES_EXPIRES - when an image is already in the store, update it only if it is older than the given number of days (default value is 90)
  • IMAGES_URLS_FIELD - the item field from which the addon reads the image URLs to download and store (default value is image_urls)
  • IMAGES_RESULT_FIELD - the item field where the addon saves the stored image information (default value is images)
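As a rough sketch of how these settings fit together, the values can be written as a Python dict and checked with a few small helpers. The bucket name, base path, and threshold values are hypothetical, and split_store, is_large_enough, and needs_refresh are illustrative helpers that restate the rules listed above; they are not part of the addon.

```python
from urllib.parse import urlparse

# Hypothetical example values; only the setting names come from the addon.
settings = {
    "IMAGES_STORE": "s3://my-bucket/images/",
    "IMAGES_MIN_WIDTH": 100,
    "IMAGES_MIN_HEIGHT": 100,
    "IMAGES_EXPIRES": 90,
    "IMAGES_URLS_FIELD": "image_urls",
    "IMAGES_RESULT_FIELD": "images",
}


def split_store(store):
    """Split s3://<bucket name>/<base path>/ into (bucket, base_path)."""
    parsed = urlparse(store)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("IMAGES_STORE must look like s3://<bucket name>/<base path>/")
    return parsed.netloc, parsed.path.lstrip("/")


def is_large_enough(width, height, settings):
    """Images smaller than either minimum are ignored."""
    return (width >= settings.get("IMAGES_MIN_WIDTH", 0)
            and height >= settings.get("IMAGES_MIN_HEIGHT", 0))


def needs_refresh(age_days, settings):
    """A stored image is updated only once it is older than IMAGES_EXPIRES days."""
    return age_days > settings.get("IMAGES_EXPIRES", 90)


bucket, base_path = split_store(settings["IMAGES_STORE"])
```

Validating IMAGES_STORE up front like this is a design choice for the sketch; it catches a missing s3:// scheme or bucket name before any download is attempted.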


You will also need to provide the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY settings, as shown in the screenshot below, so that the addon can upload the images to your Amazon S3 storage.
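If the settings are assembled in code rather than entered through the interface, one possible approach is to read the credentials from the environment and check that both are present. This is a sketch under assumptions: the helper names are hypothetical, and reading from environment variables of the same names is a convention used here for illustration, not something the addon requires.

```python
import os

REQUIRED_AWS_SETTINGS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def aws_settings_from_env(environ=None):
    """Collect the two AWS settings from a mapping (os.environ by default)."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, "") for name in REQUIRED_AWS_SETTINGS}


def missing_aws_settings(settings):
    """Return the names of required AWS settings that are absent or empty."""
    return [name for name in REQUIRED_AWS_SETTINGS if not settings.get(name)]


# Hypothetical usage with placeholder values instead of real credentials.
example = aws_settings_from_env({
    "AWS_ACCESS_KEY_ID": "EXAMPLE_KEY_ID",
    "AWS_SECRET_ACCESS_KEY": "EXAMPLE_SECRET",
})
```

Keeping the credentials out of the settings file itself avoids committing secrets to version control.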



For more details, refer to the Scrapy Images Pipeline, on which the Images addon is based.