
Re: A policy on use of AI-generated content in Debian



On 5/3/24 12:10, Stefano Zacchiroli wrote:
On that front, useful "related work" are the policies that scientific
journals and conferences (which are exposed *a lot* to this, given their
main activity is vetting textual documents) have put in place about
this.
Indeed. Here are some examples:
Nature: https://www.nature.com/nature-portfolio/editorial-policies/ai
ICML: https://icml.cc/Conferences/2023/llm-policy
CVPR: https://cvpr.thecvf.com/Conferences/2024/ReviewerGuidelines
          https://cvpr.thecvf.com/Conferences/2024/AuthorGuidelines

Some points in addition to the two from Stefano:
1. Nature does not allow an LLM to be listed as an author.
2. CVPR holds authors who use an LLM responsible for all of the LLM's
    faults.
3. CVPR agrees that paper reviewers who skip their work by handing it
    to an LLM are harming the community.
The general policy usually contains two main points (paraphrased below):

(1) You are free to use AI tools to *improve* your content, but not to
     create it from scratch for you.
Polishing language is the case where I find LLMs most useful. But in fact,
as an author, when I really care about the quality of what I wrote,
I find state-of-the-art LLMs (such as ChatGPT4) poor at logic and
poor at understanding my deeper insights. In the end, they serve me
only as a smart language tutor.
(2) You need to disclose the fact you have used AI tools, and how you
     have used them.
Yes, it is commonly encouraged to acknowledge the use of AI tools.
Exactly as in your case, Tiago, people managing scientific journals and
conferences have absolutely no way of checking if these rules are
respected or not. (They have access to large-scale plagiarism detection
tools, which is a related but different concern.) They just ask people
to *state* they followed this policy upon submission, but that's it.
If a cheater who uses an LLM is lazy enough not to edit the LLM output
at all, you will find it very easy to tell on your own whether a chunk
of text was produced by an LLM. For example, I used ChatGPT basically
every day in March, and its answers always felt like they were organized
in the same format. No human answers questions in the same boring format
all the time.
If your main concern is people using LLMs or the like in some of the
processes you mention, a checkbox requiring such a statement upon
submission might go a longer way than a project-wide statement (which
will sit in d-d-a unknown to n-m applicants a few years from now).
In the long run, there is no way to enforce a ban on the use of AI
across this project. What is doable, from my point of view, is to confirm
that a person acknowledges the issues, potential risks, and implications
of using AI tools, and to hold people who use AI responsible for the
AI's faults.

After all, it's easy to identify one's intention in using AI: it is
either good or bad. If NM applicants can easily get the answer to an
NM question from an AI, maybe it is time to refresh the question? After
all, nobody can stop people from learning from AI outputs when they need
suggestions or reference answers, and they are responsible for the wrong
answer if the AI is wrong.

Apart from deliberately bad acts carried out with AI, one thing that seems
benign but harms the community is slacking off and skipping important
work by handing it to AI. But still, this can be covered by a single rule
as well: "Let the person who uses AI be responsible for the AI's faults."

Simple, and doable.

