Hi,
This week I:
- Implemented crawling logic for mapping vulnerability identifiers to equivalent vulnerabilities that the patch-finder can parse. This mainly involved writing a spider that handles multiple content types; so far I have implemented parsing logic for three: text/html, application/json and text/plain (a rough sketch of this dispatch follows after this list).
- Wrote utilities used by the above-mentioned spider.
- Manually tested the spider on DSAs[1], RHSAs[2] and GLSAs[3].
- Wrote unit tests for most of the above functionality (a sketch of how the spider can be tested offline also follows after this list).
- Looked into Scrapy's Crawler API[4], its interaction with the underlying networking framework Twisted[5], and Scrapyd[6]. This was to analyse what a multiprocessing implementation would look like and how it would function with Scrapy (see the multiprocessing sketch after this list).
- Learnt about database terminology and concepts such as Object-Relational Mapping (ORM), Data Access Objects (DAOs) and Database Abstraction Layers (DALs).
- Looked into various Python DAL and ORM projects such as PonyORM[7], SQLAlchemy[8] and PyDAL[9]; a small ORM sketch is included at the end of this list.
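
For reference, here is a minimal sketch of what the content-type dispatch could look like. Everything in it (the VulnSpider name, the start URL, the CVE regex and the assumed JSON shape) is illustrative rather than the actual patch-finder code:

    import json
    import re

    import scrapy


    class VulnSpider(scrapy.Spider):
        """Illustrative spider that dispatches on the response's Content-Type."""

        name = "vuln_spider"
        start_urls = ["https://security-tracker.debian.org/tracker/DSA-4444-1"]

        def parse(self, response):
            # Scrapy exposes headers as bytes; normalise and drop any charset suffix.
            content_type = (
                response.headers.get("Content-Type", b"").decode().split(";")[0].strip()
            )
            if content_type == "text/html":
                yield from self.parse_html(response)
            elif content_type == "application/json":
                yield from self.parse_json(response)
            elif content_type == "text/plain":
                yield from self.parse_plain(response)

        def parse_html(self, response):
            # Pull candidate vulnerability IDs out of anchor text via XPath.
            for vuln_id in response.xpath("//a/text()").re(r"CVE-\d{4}-\d{4,}"):
                yield {"vuln_id": vuln_id}

        def parse_json(self, response):
            # The JSON layout is source-specific; a flat list of IDs is assumed here.
            data = json.loads(response.text)
            for vuln_id in data.get("cve_ids", []):
                yield {"vuln_id": vuln_id}

        def parse_plain(self, response):
            for vuln_id in re.findall(r"CVE-\d{4}-\d{4,}", response.text):
                yield {"vuln_id": vuln_id}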
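
And a quick illustration of how such a spider can be unit-tested offline, by faking the response object (VulnSpider and the test values are the hypothetical ones from the sketch above):

    from scrapy.http import HtmlResponse, Request


    def fake_response(url, body, content_type):
        # Build a response by hand so parse() can be exercised without a network.
        return HtmlResponse(
            url=url,
            request=Request(url=url),
            body=body.encode(),
            headers={"Content-Type": content_type},
        )


    def test_parse_html_extracts_cve_ids():
        spider = VulnSpider()  # the spider sketched above
        response = fake_response(
            "https://example.com/advisory",
            '<html><body><a href="#">CVE-2019-1234</a></body></html>',
            "text/html",
        )
        assert {"vuln_id": "CVE-2019-1234"} in list(spider.parse(response))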
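
On the multiprocessing question: my current understanding is that Scrapy runs on a single Twisted reactor, which cannot be restarted within a process, so one way around that is to give each crawl its own OS process. A rough sketch of that idea (DummySpider and the URLs are placeholders):

    import multiprocessing

    import scrapy
    from scrapy.crawler import CrawlerProcess


    class DummySpider(scrapy.Spider):
        # Placeholder; the real patch-finder spider would go here.
        name = "dummy"

        def parse(self, response):
            yield {"url": response.url}


    def run_crawl(url):
        # Each CrawlerProcess starts its own Twisted reactor, and a reactor
        # cannot be restarted, so one crawl per OS process sidesteps that limit.
        process = CrawlerProcess(settings={"LOG_ENABLED": False})
        process.crawl(DummySpider, start_urls=[url])
        process.start()  # blocks until the crawl finishes


    if __name__ == "__main__":
        urls = ["https://example.com/a", "https://example.com/b"]
        workers = [multiprocessing.Process(target=run_crawl, args=(u,)) for u in urls]
        for w in workers:
            w.start()
        for w in workers:
            w.join()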
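
Finally, as a concrete example of what an ORM buys us, a minimal SQLAlchemy (1.4+) sketch; the Patch model and its columns are hypothetical, since the actual schema is still to be decided:

    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.orm import declarative_base, sessionmaker

    Base = declarative_base()


    class Patch(Base):
        # Hypothetical table mapping a vulnerability ID to a found patch URL.
        __tablename__ = "patches"

        id = Column(Integer, primary_key=True)
        vuln_id = Column(String, index=True)  # e.g. "CVE-2019-1234"
        patch_url = Column(String)


    engine = create_engine("sqlite:///patches.db")
    Base.metadata.create_all(engine)

    Session = sessionmaker(bind=engine)
    with Session() as session:
        session.add(Patch(vuln_id="CVE-2019-1234",
                          patch_url="https://example.com/fix.patch"))
        session.commit()
        # Queries go through Python objects rather than hand-written SQL:
        found = session.query(Patch).filter_by(vuln_id="CVE-2019-1234").all()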
Cheers,
Jaskaran