AI Use Policy for Talent Developing Content for O’Reilly
Last updated: October 24, 2023
O’Reilly embraces the potential of generative AI technologies and tools (“GenAI Models”), which can revolutionize the way we and our content creators produce and curate educational content for our users. By generating captivating text, illustrative diagrams, compelling videos, enhanced narration and audio effects, interactive quizzes, and more with remarkable efficiency, GenAI Models are a powerful asset when used properly. We do, however, want to strike a delicate balance between reaping the benefits and mitigating the risks associated with the use of GenAI Models. Those risks include the generation of factually untrue outputs (or “hallucinations”) or biased outputs, which can lead to inaccurate content in our products; vulnerabilities in data security and other privacy concerns; threats to the integrity of our systems and data; leakage of our content and other intellectual property (IP); erosion of our value proposition; and an increased risk that we infringe third-party IP.
With these principles in mind, this policy sets forth our guidelines on the use of GenAI Models to create educational content for our users and to support our business generally. It is important that you review this policy periodically, as we may update our guidelines from time to time to reflect advancements in the technology underlying GenAI Models and changes in legal and regulatory frameworks (including as a result of the outcomes of pending litigation relating to the use of GenAI Models). Note that all references to legal developments or the terms of any GenAI Models are accurate as of the date of this policy.
2. Inputs
“Input” means prompts, text, images, code, or any other materials submitted as input to a GenAI Model.
Review the GenAI Model’s terms on Input: Carefully review, read, and understand the GenAI Model’s legal terms and conditions governing Input as they may include the following potential pitfalls:
- Broad license to train provider models: The terms of some GenAI Models grant the provider broad, ongoing rights to use your Input to improve and train the underlying model. Other GenAI Models offer an option (sometimes through an enterprise license or an on-prem (local) installation of the tool) under which your Input will not be used to train the underlying model and will be used only to train an instance of the GenAI Model accessible solely to users of that enterprise account (in our case, O’Reilly’s account).
- Example: Using OpenAI’s ChatGPT via an API under OpenAI’s API Data Usage Policies does not permit OpenAI to use the Inputs to train or improve OpenAI’s underlying large language models (“LLMs”). However, use of the web UI for ChatGPT does permit OpenAI to use the Inputs to train or improve OpenAI’s LLMs, unless the user opts out by completing OpenAI’s opt-out form.
- Input made available to other users in some GenAI Models: Inputs or a derivative or summary of your Inputs may also be available to other users of the GenAI Model. Please see Exhibit A Example 1 for some examples that illustrate this risk.
Guidelines: Follow these guidelines and best practices when submitting Input to a GenAI Model.
- Trade secrets
- Do not submit any Input containing trade secrets or other sensitive or strategic information.
- Example: Avoid submitting information such as a unique pedagogical approach or entire portions of books or course materials you create for O’Reilly because if such trade secrets or information are disclosed to a GenAI Model, their uniqueness and competitive advantage may be compromised.
- Materials protected by third-party IP rights
- Do not submit materials, including text, images, audio, or video protected by third-party IP rights as Input unless such Input is owned by you or submitting such Input is permitted by the terms of the applicable license to O’Reilly.
- Example: Submitting an entire chapter of a third party’s copyright protected book to a GenAI Model may violate the copyright holder’s exclusive right to reproduce or display the copyright protected work, subjecting you to potential legal liability for copyright infringement.
- Example: Submitting to a GenAI Model data or other content which is licensed from a third party and is subject to a restrictive license permitting only limited and specific use cases by you could breach those license terms, subjecting you to legal liability.
- Confidential information from a third party
- Do not submit a third party’s confidential information as Input.
- Personally identifiable information
- Do not submit any personal information, including any personally identifiable information (PII) such as contact info, SSNs, TINs, or financial information, as Input without first ensuring that the terms of the GenAI Model prohibit the GenAI Model from accessing, storing, or using such PII.
- Objectionable content
- Do not submit any materials that may reasonably contain obscene, hateful, immoral, offensive, controversial, or politically charged terms or similar images as Input.
- Use your judgment
- Do exercise appropriate judgment when submitting Inputs into GenAI Models. If you are unsure whether you should submit certain Inputs to a GenAI Model, we recommend not doing so.
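As one way to put the PII guideline above into practice, the following Python sketch screens a draft prompt for a few obvious PII patterns before you submit it as Input. The pattern list and the function names `find_possible_pii` and `safe_to_submit` are illustrative assumptions, not part of this policy or of any GenAI Model’s tooling, and the patterns are far from exhaustive.

```python
import re

# Hypothetical, non-exhaustive patterns chosen for illustration only.
PII_PATTERNS = {
    "email address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US SSN-like number": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "US phone-like number": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_possible_pii(text: str) -> list[str]:
    """Return the label of every pattern that matches somewhere in `text`."""
    return [label for label, pattern in PII_PATTERNS.items() if pattern.search(text)]


def safe_to_submit(text: str) -> bool:
    """True only when none of the illustrative patterns match.

    A True result means "nothing obvious was found", not "no PII present".
    """
    return not find_possible_pii(text)
```

A match is a signal to stop and remove the information or check the GenAI Model’s terms. The absence of a match still calls for the judgment described above.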
3. Outputs
“Output” means text, images, code, or other materials generated by GenAI Models.
Make sure you own the Output. It is important that you review, read, and understand the GenAI Model’s legal terms and conditions governing Output. As a general rule, you should ensure that the user owns the IP rights in the Output or at the very least has the right to use the Output for commercial purposes. If the terms of the GenAI Model indicate that the user does not own the IP rights in the Output, and you wish to incorporate any such Output in whole or in part into the content you create for O’Reilly, you must obtain approval in advance from your O’Reilly editor before doing so.
Even if you own the Output, you may still infringe. The US Copyright Office issued formal guidance in March 2023 noting that most Outputs based solely on a prompt are not eligible for copyright protection. This effectively makes that Output part of the public domain and free for anyone to use without restriction, unless the Output is a reproduction or derivative of a copyrightable work. However, if the Output is a reproduction or derivative of a copyrightable work, while the GenAI Model may designate you as the owner of the Output, use of the Output may inadvertently lead to plagiarism, copying, or infringement of others’ work. The graphics in Exhibit A Example 2 illustrate how certain defenses to allegations of infringement may apply to Outputs.
Guidelines: Follow these guidelines and best practices when incorporating Output into content that is delivered to our users, including with regard to how you select, craft, and/or modify all elements of your content that are derived from Output.
- Text outputs
- Do enhance and customize Output with your own voice and style.
- Example: When utilizing GenAI Models to produce assessment questions, we expect you to enhance the Output by incorporating your distinctive voice and style. This will create content that is not only engaging and inviting but also specifically tailored to resonate with your intended audience.
- Do use GenAI Models for short excerpts of text.
- Example: It is generally acceptable to use GenAI Models to suggest headlines for longer pieces of text or to generate short social media posts. Ensure that you follow the guidance provided above regarding the appropriate scope of Input to achieve your desired outcomes.
- Do use GenAI Models to enhance course content.
- Example: We encourage you to use GenAI Models to develop analogies that clarify complex concepts, incorporate additional examples to reinforce arguments, or devise interactive learning activities that engage students, but Do not copy the Output verbatim, as outlined in these guidelines. Use the GenAI Models as a platform to think through general ideas to invigorate course content creation and elevate the overall learning experience.
- Do exercise reasonable judgment.
- Example: Be cautious using AI to generate quiz questions based on text or create key takeaways for a chapter. Despite the convenience of GenAI Models, there are potential risks such as introducing factual errors or altering the intended meaning. We expect you to use your judgment to edit and emphasize the most relevant, original, and engaging aspects.
- Do not copy and paste Output verbatim.
- Example: When utilizing a GenAI Model to develop original course content, copying and pasting the Output without modification can inadvertently lead to plagiarism or copyright infringement. As a general rule, aim to use no more than 10% of the unmodified Output in your finished product.
- Image or video outputs
- Do use GenAI Models to generate generic figures to illustrate a concept. Please see Exhibit A Example 3 for examples of generic figures.
- Do modify the image or video Outputs.
- Example: When utilizing GenAI Models to generate illustrations (e.g., for book covers, book interiors, and web pages), focus on making meaningful alterations to the Output or evoking a different aesthetic appeal before incorporating such Output into your finished work. Making minor or imperceptible changes is not sufficient. Exhibit A Example 4 shows an instance where the US Copyright Office noted that the digital enhancements to a comic book character’s lips were too minor and imperceptible to qualify for copyright protection.
- Do use GenAI Models to enhance the quality of video or audio files, as long as you comply with the Input guidelines set out above.
- Example: Feel free to use Adobe Podcast’s AI recording and editing tools to enhance the audio quality in videos produced by O’Reilly.
- Do not copy distinctive features of the Outputs.
- Example: When utilizing GenAI Models to generate illustrations, avoid using Output that incorporates the exact composition, subject matter, or visual style of works that may be subject to copyright protection. Consider drawing inspiration from various sources and infusing your own creativity to ensure that the Output maintains originality.
- Example: Avoid using a character or object’s unique visual appearance and underlying traits without permission, such as Batman’s Batmobile. Character traits are protectable if they are consistently defined by distinctive and widely identifiable traits. For example, James Bond’s cold-bloodedness, use of guns, physical strength, sophistication, and love of martinis are distinctive character traits that are protectable under copyright law.
- General
- Tracking: Do carefully track and document which parts of a work were created by the human author and which were created by a GenAI Model, especially if you aim to own the copyright to your entire work. Save your Output with generic terms (e.g., “Intro to Macroeconomics”) and document which GenAI Model was used to generate that Output.
- Attribution: Some GenAI Models require users to attribute credit to the GenAI Model if the user incorporates any Output into content that is made available to the public. Do inform your O’Reilly editor that proper attribution may be required by the GenAI Model.1
- Use your judgment: Overall, Do exercise appropriate judgment when incorporating Outputs of GenAI Models into content that you make available to O’Reilly.
- Remuneration: We may develop, adapt, or modify your works or create or generate derivative works based on your works, including, for example, quiz questions, translations, or summaries (collectively, “Adaptations”) as further described in and subject to your agreement with us. If we generate any Adaptation via a GenAI Model and use it in our services (e.g., the O’Reilly learning platform), we will compensate you in accordance with and subject to your agreement with us.
- Guidelines for performance and bias issues for Output
- If the Output presents a statement of fact, do verify its accuracy and completeness. Do not solely rely on the Output; undertake rigorous fact-checking and verify the information through reliable sources before dissemination to avoid hallucinations that may be introduced by AI.
- Example: If the Output includes historical facts related to actual individuals, places, or years, historical trends such as percentages, or a reference to a primary or secondary source, it is imperative to verify the accuracy independently.
- When the Output depicts an image or describes a group of people, do check for detectable bias. Bias can manifest in various forms, including gender bias, racial bias, or cultural bias. Do mitigate bias to ensure fair and inclusive outcomes.
- Example: When using a GenAI Model to generate course descriptions, be attentive to any biased language or implicit biases that may discourage or exclude certain groups of learners (e.g., a financial-based course illustrated with a picture of white males). Take the necessary steps to adjust and refine the generated content, ensuring it promotes inclusivity and fairness.
- Be aware of other types of bias that may affect the Output, including through the prompts you use to generate the Output and the modifications you make to it. These include:
- Anchoring bias: the tendency to rely too heavily on the first piece of information received on a topic.
- Sampling bias: the bias that occurs when a sample is not representative of the whole population, leading to skewed results.2
- Attribution/recall bias: the bias in identifying a rationale or cause of an event, influenced by one’s own perceptions and recall of past experiences.
- Omitted-variable bias: the bias that arises when a variable that should be included in the analysis is left out, leading to the wrong attribution of the effect of other variables.
- Confirmation bias: the tendency to search for, interpret and process, and favor information in a way that confirms one’s preexisting beliefs or values.
- Halo effect: the tendency to make specific judgments based on an overall impression of Output.
- If you notice that a GenAI Model consistently generates biased content or exhibits poor performance in a specific domain, do report it to O’Reilly at ai-questions@oreilly.com.
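The 10% guideline for unmodified text Output can be estimated rather than eyeballed. The Python sketch below is one rough way to do that, under stated assumptions: it reads the guideline as “no more than 10% of the Output’s words appear unmodified in the finished product,” and it counts a passage as unmodified only when at least `min_run` consecutive words match. The function name `unmodified_share`, the five-word threshold, and this reading of the guideline are our assumptions for illustration, not definitions from this policy.

```python
from difflib import SequenceMatcher


def unmodified_share(output: str, finished: str, min_run: int = 5) -> float:
    """Estimate the fraction of Output words reused verbatim in the finished text.

    Only in-order runs of at least `min_run` consecutive matching words count,
    so short coincidental overlaps (e.g. "of the") are ignored.
    """
    out_words = output.split()
    fin_words = finished.split()
    if not out_words:
        return 0.0
    # autojunk=False keeps frequent words from being discarded on long texts.
    matcher = SequenceMatcher(a=out_words, b=fin_words, autojunk=False)
    copied = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_run
    )
    return copied / len(out_words)


def within_guideline(output: str, finished: str, limit: float = 0.10) -> bool:
    """True when the estimated verbatim share does not exceed `limit`."""
    return unmodified_share(output, finished) <= limit
```

Because `SequenceMatcher` aligns the two texts in order, passages that were copied and then moved elsewhere may be undercounted. Treat the result as a prompt for closer review, not as proof of compliance.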
4. Special Considerations in Using Open Source GenAI Models
General: Some open source GenAI Models (“OSS GenAI Models”) are governed by licenses that prohibit you from using the OSS GenAI Model for commercial purposes.3 This may mean that you cannot use OSS GenAI Models in your work for O’Reilly. Further, the Output of some OSS GenAI Models may be subject to the terms of a copyleft or network viral open source license, which could taint O’Reilly’s proprietary software and impose obligations on O’Reilly to make certain source code publicly available.
Guidelines: Follow these guidelines and best practices when submitting Input to and using Output of OSS GenAI Models.
- Do consult with your O’Reilly editor in advance regarding the terms of the OSS GenAI Models.
- Do ensure that your usage of Inputs to and Outputs from OSS GenAI Models conforms with the guidelines established in this policy, as O’Reilly will likely not be able to obtain enterprise versions of OSS GenAI Models.
- Do, if possible, download an OSS GenAI Model to your device or premises, and deploy it in a manner where it does not transfer (or require you to transfer) Inputs back to the developer or provider of the OSS GenAI Models.
- Do not use any OSS GenAI Models that produce software Output that is subject to the terms of a copyleft or network viral open source license. Do confirm that the governing license of an OSS GenAI Model does not contain any terms that would impose any such obligations on O’Reilly and ask your O’Reilly editor to obtain advance clearance of the applicable license from ai-questions@oreilly.com.
5. Public Statements and Other Considerations Regarding GenAI Models
General:
- Do not expressly mislead others into believing that O’Reilly is not using GenAI Models or that particular content published or made available by O’Reilly is entirely human-generated.
- Do not share feedback or suggested improvements with a GenAI Model as it relates to use of the GenAI Model for development of O’Reilly content, as GenAI Models may reserve broad rights to use any such feedback, including any applicable O’Reilly IP contained in such feedback. If you wish to provide feedback or suggest improvements to the GenAI Model in the context of the development of O’Reilly content, please contact your editor for guidance.
Any exceptions to the guidelines in this Section 5 must be preapproved in writing by O’Reilly and will be considered on a case-by-case basis. Your editor is your point of contact for requesting all necessary approvals.
6. Violations of This Policy
We will investigate any complaint or alleged violation of the law or this policy and, if necessary, take appropriate corrective action and/or disciplinary action up to and including termination of your engagement.
If you have any questions about this policy, please contact your O’Reilly editor. Please be aware that O’Reilly has prepared this policy as a resource for you but cannot offer our content creators specific legal advice.
Exhibit A: Examples
Example 1
Proceed with caution when crafting or preparing Inputs for a GenAI Model. As shown below, your exact Inputs or a derivative or summary of your Inputs may also be available to other users of the GenAI Model.
| Data regurgitation | Derivatives or summaries of your works |
|---|---|
| Data regurgitation is a phenomenon in which the GenAI Model produces a near or exact replica of an Input as an Output to other users. We note that many GenAI Models are implementing and employing technical measures to prevent regurgitation on a go-forward basis. | Some GenAI Models have the capability to take an Input and generate derivative works as Output that is provided to other users, potentially incorporating core expressive elements from the original Input.4 Additionally, if an Input consists of a textbook, some GenAI Models can currently summarize the textbook on a chapter-by-chapter basis for other users of the GenAI Model. |
Example 2
Dr. Seuss: A work that combined Dr. Seuss’s Oh, the Places You’ll Go! with elements from Star Trek was the subject of a complaint that was ultimately resolved in favor of Dr. Seuss. Output that is a derivative work or reproduction of copyrighted material may infringe the rights of the copyright owner, even if the Output is “owned” by the user of the GenAI Model per the GenAI Model’s terms.
[Figure: side-by-side images captioned “Dr. Seuss Work” and “Infringing Work”]
Disney’s Snow White: Transforming a protected work might allow O’Reilly to make “fair use” of the work, which means avoiding potential copyright liability. For example, the following depiction of Snow White as a hunter may be sufficiently transformative to inform a finding of fair use.
If you’re using AI to “transform” a third-party work for inclusion in content you’re creating for O’Reilly, please explicitly identify these materials to your O’Reilly editor during the development process.
Example 3
Using GenAI Models to generate generic figures like these to illustrate a concept is likely to be an acceptable use, assuming that you’re complying with the rest of this policy.
Example 4
When utilizing GenAI Models to generate illustrations (e.g., for book covers, book interiors, and web pages), focus on making meaningful alterations to the Output or evoking a different aesthetic appeal before incorporating such Output into your finished work. Making minor or imperceptible changes (as shown below) is not sufficient.
[Figure: side-by-side details captioned “Detail before Photoshop” and “Detail after Photoshop”]
1 Consider using the following language as an example of how to provide proper attribution: “Adapted from content generated by [GenAI Model].”
2 Although you won’t likely know whether a GenAI Model’s training data is robust or refreshed, contains all applicable variables, and/or is sufficiently representative, use your reasonable judgment to select and modify Output in a manner that does not espouse any gender, racial, cultural, or preexisting biases.
3 As of the last effective date of this policy, fine-tuned checkpoints for Stability’s LLM (StableLM-Tuned-Alpha) are licensed under the Noncommercial Creative Commons license (CC BY-NC-SA-4.0), in alignment with Stanford’s Alpaca license guidelines.
4 Note that this is just an artist’s claim of an infringing derivative work. Determining whether the Output is a derivative work is a fact-specific analysis determined by a court. Output that copies and reproduces certain facts (which are unprotectable under copyright) is less likely to be infringing; however, Output that gets at the core expression of the author or that could serve as a substitute for the work may be infringing.