Generative AI tools can enhance productivity and creativity, but they also come with risks related to data privacy, information security, and inherent biases. This page explores how these tools handle user data, the potential for inaccurate or misleading content (hallucinations), and the ways biases can affect AI outputs. Knowing these aspects helps you navigate AI tools with a better understanding of their limitations and ethical considerations.
Since GenAI is trained on real-world data, text, and media from the internet, the content it produces may be misleading, factually inaccurate, or outright misinformation (deepfakes, for example). Because it is unknown exactly where the data used to train AI originates and AI cannot reliably specify its sources, its output may not be credible or reliable for academic use. The information provided may be implicitly or explicitly biased, outdated, or a “hallucination.”
AI hallucination: generative AI producing false or fabricated information, including invented sources, even though it is trained on real-world data. IBM (n.d.) examines the various causes of AI hallucinations, indicating that common factors include “overfitting, training data bias/inaccuracy and high model complexity.”
To avoid using or spreading misinformation, verify the accuracy of AI-generated content using reliable sources before including it in your work.
Like other digital tools, generative AI tools collect and store data about their users. Signing up for a generative AI tool allows the company to collect data about you, which can be used to adjust the tool to keep you engaged.
User data may also be sold or given to third parties for marketing or surveillance purposes. When interacting with AI tools, you should be cautious about supplying sensitive information, including personal, confidential, or proprietary information or data.
Check whether you can use your college credentials to sign into apps like Copilot to ensure your data and privacy remain secure.
Registration: Registering for a ChatGPT account requires users to provide their name, email address, and birthday. OpenAI's Privacy Policy states that your personal information may be shared with third parties without further notice to you. Data is stored outside Canada.
Prompts/Forms: Users interact with ChatGPT by asking questions, writing prompts, or uploading files. Do not upload, input, or disclose any personal information, or any information that should not be made public.
Training Data: OpenAI uses data to train ChatGPT on how to respond. This includes anything you write or upload to ChatGPT, so you should not input any information you would not want to be made public. Learn how to opt out of having your data used for training.
Note: Always review the terms and conditions of any application you use, and be aware that these agreements often include a clause allowing the company to modify the terms at any time.
Tell-Tale Signs of AI-Generated Text

Sign | Description
---|---
Generic Style | Repetitive writing style that lacks a unique voice, perspective, or specific details
Missing Context | Lacks detailed information or a nuanced understanding of your specific topic
Missing or Fake Sources | Citations are missing, or the AI tool may fabricate (“hallucinate”) citations
Overuse of Jargon | Overly reliant on certain words that are not commonly used in everyday language
Inconsistencies | Statements may contradict one another or may be completely unrelated to the topic
Outdated Information | Information is not always up to date and might contain inaccuracies
One may think that technology is objective and neutral. Generative AI, however, is trained on real-world data and information, such as images and text scraped from the internet. This information is rife with human biases. Human biases may be embedded in the AI model during its creation, and biases in the datasets used for training can influence how it generates content. Additionally, AI can develop its own biases based on how it interprets the data, and user input may inadvertently guide it toward biased responses.
AI Bias: "also referred to as machine learning bias or algorithm bias, refers to AI systems that produce biased results that reflect and perpetuate human biases within a society" (IBM Data and AI Team, 2023). Some common biases include gender stereotypes and racial discrimination.
Recognizing these factors is essential for critically evaluating AI-generated content. By understanding potential biases in the data, model design, and user input, you can better assess the credibility and accuracy of the AI's output.
Get hands-on experience with algorithm bias in this quick activity by The Artefact Group. Who will win the awards at Millennium Middle School? Will your predictions on the award winners align with the algorithm used by the Most Likely Machine to pick the winners?
Information on this page was adapted, with permission, from "Misinformation, Online Security and More" by Conestoga Library & Learning Services, along with information from "Writing Support" by Seneca Polytechnic Libraries and "Ethical Considerations" by Fleming College.