The base add-on class

Things are going smoothly for my Summer of Code. Over the summer, the add-on system shifted its focus from user-friendliness towards being precise and fitting into the existing framework, so I have dropped the stretch goal of a command-line helper tool. Instead, I will try to integrate spider callbacks into the add-on system: spiders will be able to implement the add-on interface as well, and will be called back to update settings or to check the final configuration.


The add-on system in action

In my earlier posts, I talked mostly about the motivation for and the internal implementation of Scrapy’s add-on system. Here, I want to show what the add-on framework looks like in action, i.e. how it actually affects the users’ and developers’ experience. We will see how users can configure built-in and third-party components without worrying about Scrapy’s internal structure, and how developers can check and enforce requirements for their extensions. This blog entry will therefore probably feel a little like a documentation page, and indeed I hope that I can reuse some of it for the official Scrapy docs.


Meet the Add-on Manager

Previously, I introduced the concept of Scrapy add-ons and how it will improve the experience of both users and developers. Users will have a single entry point for enabling and configuring add-ons without being required to learn about Scrapy’s internal settings structure. Developers will gain better control over enforcing and checking proper configuration of their Scrapy extensions. In addition to their extension, they can provide a Scrapy add-on. An add-on is any Python object that provides the add-on interface. The interface, in turn, consists of a few descriptive variables (name, version, …) and two callbacks: one for enforcing configuration, called before the initialisation of Scrapy’s crawler, and one for post-init checks, called immediately before crawling begins. This post describes the current state of, and the issues with, the implementation of add-on management in Scrapy.
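To make the shape of that interface a little more concrete, here is a minimal sketch in plain Python. It is not the actual implementation: the method names `update_settings` and `check_configuration`, the `run_addons` helper, the `EXAMPLE_*` setting names, and the use of an ordinary dict in place of Scrapy's settings object are all assumptions made for illustration.

```python
class ExampleAddon:
    """Hypothetical add-on: two descriptive variables and two callbacks."""

    # Descriptive variables (names taken from the post: name, version).
    name = "example-addon"
    version = "1.0"

    def update_settings(self, config, settings):
        # Assumed name for the first callback: enforce configuration
        # before the crawler is initialised.
        settings["EXAMPLE_ENABLED"] = True
        settings["EXAMPLE_DELAY"] = config.get("delay", 1.0)

    def check_configuration(self, config, settings):
        # Assumed name for the second callback: post-init check,
        # run immediately before crawling begins.
        if not settings.get("EXAMPLE_ENABLED"):
            raise ValueError("example-addon was disabled by another component")


def run_addons(addons, configs, settings):
    """Hypothetical driver that calls both callbacks in the described order."""
    for addon in addons:
        addon.update_settings(configs.get(addon.name, {}), settings)
    # ... crawler initialisation would happen here ...
    for addon in addons:
        addon.check_configuration(configs.get(addon.name, {}), settings)
    return settings


# A plain dict stands in for the settings object in this sketch.
settings = run_addons(
    [ExampleAddon()],
    {"example-addon": {"delay": 2.5}},
    {},
)
```

Because an add-on is just "any Python object that provides the interface", a class instance is only one option here; a module exposing the same attributes and functions would fit the description equally well.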


Towards an Add-on Framework

Last time, we learned that most Scrapy extension hooks are controlled via dictionary-like settings variables. We allowed updating these settings from different places without having to worry about order by extending Scrapy’s priority-based settings system to dictionaries. The corresponding pull request is ready for final review by now and includes complete tests and documentation. Now that this is (almost) out of the way, how can we “[improve] both user and developer experience by implementing a simplified interface to managing Scrapy extensions”, as I promised in my initial blog post?
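The idea of order-independent, priority-based updates to a dictionary-like setting can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from the pull request: the `PrioritizedDict` class, the numeric priorities, and the middleware paths are all hypothetical.

```python
class PrioritizedDict:
    """Toy dict in which every key remembers the priority it was set with."""

    def __init__(self):
        self._entries = {}  # key -> (value, priority)

    def set(self, key, value, priority):
        current = self._entries.get(key)
        # A write only takes effect if its priority is at least as high
        # as the priority of the value already stored for that key.
        if current is None or priority >= current[1]:
            self._entries[key] = (value, priority)

    def update(self, values, priority):
        for key, value in values.items():
            self.set(key, value, priority)

    def as_dict(self):
        return {key: value for key, (value, _) in self._entries.items()}


middlewares = PrioritizedDict()
# Hypothetical project-level update with a higher priority (20).
middlewares.update({"myproject.RetryMiddleware": 500}, priority=20)
# Hypothetical add-on-level update with a lower priority (10): it may add
# new keys but cannot override the project's value for an existing key.
middlewares.update(
    {"myproject.RetryMiddleware": 550, "addon.CacheMiddleware": 900},
    priority=10,
)
```

Applying the two updates in the opposite order leaves the same values in place, which is the property that lets different components update one setting without coordinating who goes first.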
