Feedback requested

I would appreciate feedback on the following mockups, which I present without commentary since the point is to see if they make sense to you.

New tab on Image Copyright and License screen:

New tab on all other media types (text, audio, video):

  1. Does this make sense to you?
  2. Do you think this will make sense to the people you train?

It seems sensible to me to have an AI tab. It’s also easy for training.

It seems sensible to me. It is something else we’ll have to train for and it would be good to have more details about the pros and cons in black and white available somewhere for trainers.

Both pages were clear to me.
My only question: The second page has the words, “The copyright holder of this image”
Should it say “The copyright holder of this text, audio or video”?

I think the pages will make sense to those I train, but without more information available, they could have some unfounded fears regarding AI programs learning from their language and media.

I appreciate the thought put into this and I agree with Greg’s comment. Perhaps the wording could be ‘The copyright holder of this work’?

In my experience people I train do find it difficult to understand copyright and licensing and I wonder if I need to allow more time during Bloom workshops to explain the theory. Adding AI permissions will require further explanation but I do feel it is important.

While I am in favour of this new tab, I wonder how many people (even some I try to train!) will just ignore it. What happens if the AI tab is not completed? Will it be flagged as incomplete or will it default to the third option as the answer?

To be more precise, I wonder if you need to say something like “lawmakers and courts in many jurisdictions”.
“Anyway” is a standard adverb that means “in any case” or “regardless” and is acceptable in formal writing and speech. For example, “I felt tired, but decided to go to the party anyway”. “Anyways” is an informal, colloquial, or archaic version of “anyway” that is generally only accepted in informal writing or colloquial speech.
Instead of “listening to recordings” it might be better to say, “analyzing recordings”. It’s more accurate and also covers video recordings.

Everyone: thanks for the feedback!

James:

While I am in favour of this new tab, I wonder how many people (even some I try to train!) will just ignore it. What happens if the AI tab is not completed? Will it be flagged as incomplete or will it default to the third option as the answer?

Yeah, actually I would hope that most people do ignore it – for text at least. In many languages, Bloom books represent the best set of non-Bible parallel texts available and, in my opinion, the benefits of having your language participate in the AI revolution outweigh the costs. We don’t need yet one more thing that society offers to large languages that it doesn’t offer to small ones. But most communities will find this unfamiliar territory. To allow for the questions on this screen going unanswered, the default will be “We don’t know their opinion”.

Should you add this to what you train people on? Not necessarily. As you know, Bloom is now too large to train people on each capability “just in case” they need it. As a trainer you have to figure out what capabilities are needed and train to that. For example, I would find out if the language communities represented by the trainees want to take on this issue at this point.

It is not vital that they do, because the fact of the matter is that almost every web crawler (OpenAI, Google, Meta, Anthropic, etc.) will ignore whatever you say. If the New York Times can’t prevent being crawled, neither can BloomLibrary.org. I can guarantee that unless/until the law changes, they are not going to honor what your book says about AI. SIL’s AI people will, and hopefully other non-profits, but these are actually the people most likely to be doing good with your data. So in saying “No AI Training”, you’re blocking the “good guys”, and not blocking the big corporations.

So why even do this? First, I am loving SIL’s new AI Ethics statement. It may not speak to this particular issue, but we want to be serving in a way consistent with the spirit of the statement. Second, one of the publishers who share their books with us (as CC-BY) does not want AIs training on their material. So we have at least one large group of books that we will need to mark in some way.

Tom: in Bloom-land, we’re writing for people who may be reading the UI in what is their 3rd or 4th strongest language. So we don’t use words like “jurisdictions” and “analyzing”.

Although I think your intention here is ethical and admirable, the Bloom Library relies heavily on Creative Commons licenses. The Creative Commons organization’s position is that AI training should be fair use, and if the courts determine it’s not, it may still be allowed under some CC licenses. Even if a contributor says they do not want their content used for AI training, others do not have to respect that under a CC license. SIL would respect it, but there is no way for the rights/copyright holder to legally enforce it at this time.

I think it is important to remind people that this is a language community decision, as per the SIL ethics statement. So I prefer the mock-ups that mention the language community.