Hi,
This week I:
- Implemented crawling logic for mapping vulnerability identifiers to equivalent vulnerabilities that the patch-finder can parse. This mainly involved writing a spider that handles multiple content types; so far I have implemented parsing logic for three: text/html, application/json and text/plain (a rough sketch of this dispatch follows after this list).
- Wrote utilities used by the above-mentioned spider.
- Manually tested the spider on DSAs[1], RHSAs[2] and GLSAs[3].
- Wrote unit tests for most of the above functionality (a sketch of how the spider can be tested offline also follows after this list).
- Looked into Scrapy's Crawler API[4], its interaction with the underlying networking framework Twisted[5], and Scrapyd[6]. This was to analyse what a multiprocessing implementation would look like and how it would function with Scrapy (see the multiprocessing sketch after this list).
- Learnt about database terminology and concepts such as Object-Relational Mapping (ORM), Data Access Objects (DAOs) and Database Abstraction Layers (DALs).
- Looked into various Python DAL and ORM projects such as PonyORM[7], SQLAlchemy[8] and PyDAL[9]; a small ORM sketch is included at the end of this list.
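
For reference, here is a minimal sketch of what the content-type dispatch could look like. Everything in it (the VulnSpider name, the start URL, the CVE regex and the assumed JSON shape) is illustrative rather than the actual patch-finder code:

    import json
    import re

    import scrapy


    class VulnSpider(scrapy.Spider):
        """Illustrative spider that dispatches on the response's Content-Type."""

        name = "vuln_spider"
        start_urls = ["https://security-tracker.debian.org/tracker/DSA-4444-1"]

        def parse(self, response):
            # Scrapy exposes headers as bytes; normalise and drop any charset suffix.
            content_type = (
                response.headers.get("Content-Type", b"").decode().split(";")[0].strip()
            )
            if content_type == "text/html":
                yield from self.parse_html(response)
            elif content_type == "application/json":
                yield from self.parse_json(response)
            elif content_type == "text/plain":
                yield from self.parse_plain(response)

        def parse_html(self, response):
            # Pull candidate vulnerability IDs out of anchor text via XPath.
            for vuln_id in response.xpath("//a/text()").re(r"CVE-\d{4}-\d{4,}"):
                yield {"vuln_id": vuln_id}

        def parse_json(self, response):
            # The JSON layout is source-specific; a flat list of IDs is assumed here.
            data = json.loads(response.text)
            for vuln_id in data.get("cve_ids", []):
                yield {"vuln_id": vuln_id}

        def parse_plain(self, response):
            for vuln_id in re.findall(r"CVE-\d{4}-\d{4,}", response.text):
                yield {"vuln_id": vuln_id}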
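
And a quick illustration of how such a spider can be unit-tested offline, by faking the response object (VulnSpider and the test values are the hypothetical ones from the sketch above):

    from scrapy.http import HtmlResponse, Request


    def fake_response(url, body, content_type):
        # Build a response by hand so parse() can be exercised without a network.
        return HtmlResponse(
            url=url,
            request=Request(url=url),
            body=body.encode(),
            headers={"Content-Type": content_type},
        )


    def test_parse_html_extracts_cve_ids():
        spider = VulnSpider()  # the spider sketched above
        response = fake_response(
            "https://example.com/advisory",
            '<html><body><a href="#">CVE-2019-1234</a></body></html>',
            "text/html",
        )
        assert {"vuln_id": "CVE-2019-1234"} in list(spider.parse(response))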
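
On the multiprocessing question: my current understanding is that Scrapy runs on a single Twisted reactor, which cannot be restarted within a process, so one way around that is to give each crawl its own OS process. A rough sketch of that idea (DummySpider and the URLs are placeholders):

    import multiprocessing

    import scrapy
    from scrapy.crawler import CrawlerProcess


    class DummySpider(scrapy.Spider):
        # Placeholder; the real patch-finder spider would go here.
        name = "dummy"

        def parse(self, response):
            yield {"url": response.url}


    def run_crawl(url):
        # Each CrawlerProcess starts its own Twisted reactor, and a reactor
        # cannot be restarted, so one crawl per OS process sidesteps that limit.
        process = CrawlerProcess(settings={"LOG_ENABLED": False})
        process.crawl(DummySpider, start_urls=[url])
        process.start()  # blocks until the crawl finishes


    if __name__ == "__main__":
        urls = ["https://example.com/a", "https://example.com/b"]
        workers = [multiprocessing.Process(target=run_crawl, args=(u,)) for u in urls]
        for w in workers:
            w.start()
        for w in workers:
            w.join()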
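
Finally, as a concrete example of what an ORM buys us, a minimal SQLAlchemy (1.4+) sketch; the Patch model and its columns are hypothetical, since the actual schema is still to be decided:

    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.orm import declarative_base, sessionmaker

    Base = declarative_base()


    class Patch(Base):
        # Hypothetical table mapping a vulnerability ID to a found patch URL.
        __tablename__ = "patches"

        id = Column(Integer, primary_key=True)
        vuln_id = Column(String, index=True)  # e.g. "CVE-2019-1234"
        patch_url = Column(String)


    engine = create_engine("sqlite:///patches.db")
    Base.metadata.create_all(engine)

    Session = sessionmaker(bind=engine)
    with Session() as session:
        session.add(Patch(vuln_id="CVE-2019-1234",
                          patch_url="https://example.com/fix.patch"))
        session.commit()
        # Queries go through Python objects rather than hand-written SQL:
        found = session.query(Patch).filter_by(vuln_id="CVE-2019-1234").all()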
Cheers,
Jaskaran