Hi Mo,

Quoting Mo Zhou (2019-12-26 16:31:34)
> On Thu, Dec 26, 2019 at 02:59:25PM +0100, Bernd Zeimetz wrote:
> > So in my opinion the option we should implement is a (mostly)
> > automated license check. There are various tools listed on the wiki
> > page, but there are also commercial tools out there which do that
> > task. Although I know it will sound completely wrong in the ears of
> > some of readers here, I think asking one of the companies if they'd
> > sponsor their tools to examine the new queue sounds like a very good
> > idea to me, if it helps the ftp team to be faster. At the same time
> > we'll get hints to license violations from our upstreams...
>
> Very long time ago I had a bold idea to formularize the license
> checking, a somewhat repetitive process into a machine learning
> problem. However, experience indicates that there are an amount of
> special cases hard to be modeled in a math system. Plus, the NEW
> reviewing process is not fault-tolerant, which definitely further
> increased the difficulty to automate ftp-master's work with
> "artificial intelligence".
>
> As a trainee, I ended up browsing every single file manually.

If you have any notes on that thought process - even vague scribblings -
then I would appreciate them as inspiration for my work on licensecheck.

(I don't plan to implement machine learning in licensecheck - that is
way above my skills - but I imagine that your notes, if any, might help
improve how it recognizes and categorizes copyright and licensing
statements in free-form texts).

 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private