This week I:
- Implemented crawling logic for mapping vulnerability identifiers to equivalent vulnerabilities that the patch-finder can parse. This mainly involved implementing a spider that handles various content types; so far I have implemented parsing logic for three of them: text/html, application/json, and text/plain.
- Wrote utilities used by the above-mentioned spider.
- Manually tested the spider on DSAs, RHSAs, and GLSAs (Debian, Red Hat, and Gentoo security advisories).
- Wrote unit tests for most of the above functionality.
- Looked into Scrapy's Crawler API, its interaction with Twisted (the asynchronous networking framework Scrapy is built on), and Scrapyd. The goal was to analyse what a multiprocessing implementation would look like and how it would work with Scrapy.
- Learnt about database terminology and concepts such as Object-Relational Mapping (ORM), Data Access Objects (DAO), and Database Abstraction Layers (DAL).
- Looked into various Python DAL and ORM projects such as PonyORM, SQLAlchemy and PyDAL.
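The content-type dispatch in the spider can be sketched as a mapping from the response's Content-Type header to a parser. In a Scrapy spider this would live in the `parse` callback, but the dispatch logic itself is framework-independent; the `parse_*` helpers below are hypothetical stand-ins for the real parsers, not code from the project.

```python
# Sketch of content-type dispatch for a patch-finding spider.
# The parse_* helpers are hypothetical stand-ins for the real parsers.
import json
from email.message import Message


def content_type(header_value: str) -> str:
    """Extract the bare MIME type from a Content-Type header
    (e.g. 'text/html; charset=utf-8' -> 'text/html')."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type()


def parse_html(body: str) -> str:
    # Stand-in: real code would extract links/identifiers from the HTML.
    return "parsed html"


def parse_json(body: str) -> dict:
    return json.loads(body)


def parse_plain(body: str) -> list:
    return body.splitlines()


PARSERS = {
    "text/html": parse_html,
    "application/json": parse_json,
    "text/plain": parse_plain,
}


def dispatch(header_value: str, body: str):
    """Pick a parser based on the Content-Type header; skip unknown types."""
    parser = PARSERS.get(content_type(header_value))
    return parser(body) if parser else None
```

For example, `dispatch("application/json; charset=utf-8", body)` routes to the JSON parser even though the header carries a charset parameter.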
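On the multiprocessing question: Twisted's reactor cannot be restarted within a process, so a common pattern is to give each crawl its own OS process. The sketch below shows only that process-per-job skeleton; `run_crawl` is a hypothetical stand-in that a real version would replace with a `scrapy.crawler.CrawlerProcess` running the spider.

```python
# Sketch: one OS process per crawl job, a common workaround for
# Twisted's non-restartable reactor. run_crawl is a hypothetical
# stand-in; real code would create a CrawlerProcess, call
# .crawl(SpiderClass, vuln_id=vuln_id) and then .start().
from multiprocessing import Process, Queue


def run_crawl(vuln_id: str, results: Queue) -> None:
    # A real implementation would run a Scrapy crawl here
    # and push its scraped output onto the queue.
    results.put((vuln_id, f"crawled {vuln_id}"))


def crawl_all(vuln_ids):
    """Run one process per vulnerability identifier and collect results."""
    results: Queue = Queue()
    procs = [Process(target=run_crawl, args=(v, results)) for v in vuln_ids]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return dict(results.get() for _ in vuln_ids)


if __name__ == "__main__":
    print(crawl_all(["DSA-4444-1", "RHSA-2019:1234"]))
```

Scrapyd takes a similar approach at the service level, launching each scheduled crawl as a separate process.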
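To make the DAO terminology concrete, here is a minimal Data Access Object over Python's built-in sqlite3: the rest of the program talks only to `VulnerabilityDAO` and never writes SQL itself. An ORM such as SQLAlchemy or PonyORM effectively generates this layer for you. The table and class names here are illustrative, not from the project.

```python
# Minimal DAO sketch over sqlite3 (stdlib). Table/class names are
# illustrative, not taken from the actual project.
import sqlite3


class VulnerabilityDAO:
    """Hides all SQL behind plain Python methods (the DAO pattern)."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS vulns "
            "(vuln_id TEXT PRIMARY KEY, equivalent TEXT)"
        )

    def save(self, vuln_id: str, equivalent: str) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO vulns VALUES (?, ?)", (vuln_id, equivalent)
        )
        self.conn.commit()

    def find(self, vuln_id: str):
        row = self.conn.execute(
            "SELECT equivalent FROM vulns WHERE vuln_id = ?", (vuln_id,)
        ).fetchone()
        return row[0] if row else None
```

With an ORM, the same mapping is expressed as a Python class bound to a table, and queries become method calls on a session rather than hand-written SQL.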