Things are going smoothly for my Summer of Code. Over the summer, the add-on system shifted its focus from user-friendliness towards being precise and fitting into the existing framework, so I have dropped the stretch goal of a command-line helper tool. Instead, I will try to integrate spider callbacks into the add-on system: spiders will be able to implement the add-on interface as well, and will be called back to update settings or to check the final configuration.

The base Addon class

As I said, things are just humming along right now, which means I don’t really have too much to blog about. So I will use this post to introduce another feature of my PR: a base Addon class that developers can (but don’t have to) use to ease some common add-on tasks, such as inserting a single component into the settings or exporting some configuration. Again, I’m hoping that I can reuse some of this for Scrapy’s docs.

The Addon base class provides three convenience methods:

  • basic settings can be exported via export_basics(),
  • a single component (e.g. an item pipeline or a downloader middleware) can be inserted into Scrapy’s settings via export_component(), and
  • the add-on configuration can be exposed into Scrapy’s settings via export_config().

By default, the base add-on class will expose the add-on configuration into Scrapy’s settings namespace, in caps and with the add-on name prepended. It is easy to write your own functionality while still being able to use the convenience methods: just override update_settings().
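To make the default behaviour and the override pattern a bit more concrete, here is a rough, self-contained sketch. SketchAddon is a hypothetical stand-in I wrote for this post, not the class from the PR; the (config, settings) method signatures, the plain dict used as settings, and the example add-on are all assumptions for illustration.

```python
class SketchAddon:
    """Hypothetical stand-in for the base Addon class (not the PR's code)."""

    name = "addon"

    def export_config(self, config, settings):
        # Expose every config entry as <ADDON NAME>_<KEY>, in caps.
        for key, value in config.items():
            settings[f"{self.name}_{key}".upper()] = value

    def update_settings(self, config, settings):
        # Assumed default: just expose the add-on configuration.
        self.export_config(config, settings)


class RetryTuningAddon(SketchAddon):
    """Made-up add-on that keeps the default export and adds its own logic."""

    name = "retrytuning"

    def update_settings(self, config, settings):
        # Reuse the convenience method ...
        self.export_config(config, settings)
        # ... and then do something custom on top of it.
        if config.get("aggressive"):
            settings["RETRY_TIMES"] = 10


settings = {}  # a plain dict standing in for Scrapy's settings object
RetryTuningAddon().update_settings({"aggressive": True}, settings)
print(settings)
```

The point is only that overriding update_settings() does not lock you out of the helpers: the subclass can call them in whatever order it likes.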

Each of the three methods can be configured via some class attributes:

Exporting basic settings via export_basics()

The class attribute basic_settings is a dictionary of settings that will be exported with addon priority.
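As an illustration of what exporting with addon priority could look like, here is a minimal sketch. Everything in it is a stand-in: SketchSettings only mimics the idea that a write succeeds when its priority is at least as high as the stored one, the numeric priorities are illustrative values I picked, and the export_basics(settings) signature is my assumption.

```python
ADDON_PRIORITY = 15    # illustrative value, assumed to sit below project
PROJECT_PRIORITY = 20  # illustrative value


class SketchSettings:
    """Tiny stand-in for a priority-aware settings object."""

    def __init__(self):
        self._values = {}  # name -> (value, priority)

    def set(self, name, value, priority):
        current = self._values.get(name)
        # Only overwrite if nothing is stored yet or the new priority
        # is at least as high as the stored one.
        if current is None or priority >= current[1]:
            self._values[name] = (value, priority)

    def get(self, name, default=None):
        entry = self._values.get(name)
        return entry[0] if entry is not None else default


class SketchAddon:
    """Hypothetical stand-in for the base Addon class."""

    basic_settings = None

    def export_basics(self, settings):
        # Export every entry of basic_settings with addon priority.
        for name, value in (self.basic_settings or {}).items():
            settings.set(name, value, ADDON_PRIORITY)


class PoliteCrawlAddon(SketchAddon):
    basic_settings = {"DOWNLOAD_DELAY": 2.0, "ROBOTSTXT_OBEY": True}


settings = SketchSettings()
settings.set("DOWNLOAD_DELAY", 0.5, PROJECT_PRIORITY)  # set by the project
PoliteCrawlAddon().export_basics(settings)

print(settings.get("DOWNLOAD_DELAY"))  # the project value is kept
print(settings.get("ROBOTSTXT_OBEY"))  # the add-on value is used
```

In this sketch the add-on fills in values nobody else has set, while anything the project configured explicitly stays untouched.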

Inserting a single component via export_component()

  • The component to be exported is read from the component class attribute. It can be either a path to a component or a Python object.
  • The type of the component is read from the component_type class attribute. It should match the name of the setting associated with that component type, e.g. ITEM_PIPELINES or DOWNLOADER_MIDDLEWARES.
  • The order of the component is read from component_order. This attribute only applies to ordered components, e.g. item pipelines or middlewares.
  • The key of the component is read from component_key. This only applies to unordered components such as download handlers.
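The four attributes above could interact roughly like this. Again, this is a sketch under assumptions rather than the PR's code: the stand-in class, the export_component(settings) signature, the plain dict used as settings, the rule that a set component_key marks an unordered component, and the myproject.* paths are all made up for illustration.

```python
class SketchAddon:
    """Hypothetical stand-in for the base Addon class."""

    component = None
    component_type = None
    component_order = 0
    component_key = None

    def export_component(self, settings):
        if self.component is None or self.component_type is None:
            return  # nothing to insert
        registry = settings.setdefault(self.component_type, {})
        if self.component_key is not None:
            # Unordered component (e.g. a download handler): key -> component
            registry[self.component_key] = self.component
        else:
            # Ordered component (e.g. a pipeline): component -> order
            registry[self.component] = self.component_order


class ValidationAddon(SketchAddon):
    component = "myproject.pipelines.ValidationPipeline"  # made-up path
    component_type = "ITEM_PIPELINES"
    component_order = 300


class SftpAddon(SketchAddon):
    component = "myproject.handlers.SftpDownloadHandler"  # made-up path
    component_type = "DOWNLOAD_HANDLERS"
    component_key = "sftp"


settings = {}  # a plain dict standing in for Scrapy's settings object
ValidationAddon().export_component(settings)
SftpAddon().export_component(settings)
print(settings)
```

Note how the two kinds of component end up in differently shaped dictionaries: ordered ones map the component to its order, unordered ones map a key to the component.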

Exposing the add-on configuration into Scrapy’s settings via export_config()

  • The prefix to be used for the global settings is read from settings_prefix. If that attribute is None, the add-on name will be used.
  • The default configuration will be read from default_config.
  • Specific setting name mappings for single configuration entries can be set in the config_mapping dictionary.
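Here is a small sketch of how the prefix, the defaults, and the name mapping might combine. As before, SketchAddon is a hypothetical stand-in, and the export_config(config, settings) signature, the order in which defaults and user config are merged, and the example add-on are my assumptions.

```python
class SketchAddon:
    """Hypothetical stand-in for the base Addon class."""

    name = None
    settings_prefix = None
    default_config = None
    config_mapping = None

    def export_config(self, config, settings):
        # Fall back to the add-on name if no explicit prefix is given.
        prefix = self.name if self.settings_prefix is None else self.settings_prefix
        # Start from the defaults, then let the user's config win.
        merged = dict(self.default_config or {})
        merged.update(config)
        mapping = self.config_mapping or {}
        for key, value in merged.items():
            if key in mapping:
                setting_name = mapping[key]  # explicit mapping wins
            else:
                setting_name = f"{prefix}_{key}".upper()
            settings[setting_name] = value


class CacheAddon(SketchAddon):
    name = "httpcache"
    default_config = {"dir": "httpcache", "expiration_secs": 0}
    config_mapping = {"ignore_codes": "HTTPCACHE_IGNORE_HTTP_CODES"}


settings = {}  # a plain dict standing in for Scrapy's settings object
user_config = {"expiration_secs": 3600, "ignore_codes": [500, 503]}
CacheAddon().export_config(user_config, settings)
print(settings)
```

The mapping is handy when the natural config key (ignore_codes) does not line up with the setting name a component already expects.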