A collaborative piece by the Office of Curriculum and Instructional Support (CIS) and Academic and Research Computing (OIT).
So, you’re about to dive into the magical world of generative AI tools, and you’re probably thinking, “This is going to make my life so much easier!” And there is a good chance that it will make some things easier. But before you hand over your data (and possibly your soul), take a moment to ask yourself these eight important questions.
1. What data is being collected, and how is it being used?
Before you start typing away, consider this: every keystroke is potentially being scooped up by the AI tool. Yep, even that typo you made on the second try. AI tools often collect user input, and sometimes more than you’d like them to. Knowing what’s being collected—and why—is your first step toward protecting your own privacy (and that of your students).
2. Who owns the data I input into the AI system?
In academic circles, intellectual property isn’t just a phrase; it’s a currency. Before you input your hard-earned data into an AI tool, check whether it will still be yours. Once you provide data to the tool, you need to know whether you retain ownership or whether the platform gains control over it, including the right to use it in other applications or for training. CMU policy prohibits sharing HIPAA, research, or restricted data with generative AI tools.
3. Does the tool comply with privacy regulations (e.g., FERPA)?
If you’re in higher education, staying FERPA-compliant isn’t just a good idea—it’s the law. You need to make sure you’re not sharing any student-identifying information with generative AI tools, even by accident. Spoiler alert: currently, CMU doesn’t have a FERPA-compliant AI system. So, proceed with caution.
4. Can I control what the AI does with the data I input?
We all love a little control, right? Especially when it comes to sensitive data. Before you get too cozy with that AI tool, look for options that allow you to delete or limit access to your data after it’s used. This level of control is crucial for maintaining data privacy and security. Keep in mind that nothing is ever completely erased from the internet, and there are plenty of examples of companies using data collection methods that fall on the wrong side of questionable.
5. How is the data protected from unauthorized access?
Think of your data as a treasure (even if it’s just your grocery list). You don’t want it to fall into the wrong hands. So, ask yourself what measures are in place to prevent hacking, data breaches, or random third-party snoopers, especially when dealing with academic or research data.
6. Is my data being used to train the AI?
Wait, you mean the AI might actually learn from my input? Yes, that’s a thing. Some AI tools gobble up your data to become smarter and faster. While this can improve the AI’s performance, it may also raise concerns about data privacy and the potential for bias in future outputs.
7. What third-party services or partners have access to my data?
Your data doesn’t always stay with just one party. Sometimes, third-party services or partners get in on the action too. Find out who has access and what exactly they’re looking at. No one likes surprise guests at their data party.
8. Are the terms of use, data handling, and privacy policies clearly explained and easily accessible?
We’ve all scrolled through the fine print, clicked ‘Accept,’ and hoped for the best. But when it comes to AI, transparency is key. Many of the questions above should be answered within a tool’s terms of use. Look for clear, plain-English explanations of how your data is handled. If it reads like a legal thriller, it’s time to ask some questions. Or hire a translator. Our friends in OIT are happy to help navigate the complex world of generative AI and cybersecurity.
References:
AI Multiple Research. (2024, January 3). Generative AI data: Importance & 7 methods. AIMultiple. https://research.aimultiple.com/generative-ai-data/
Liu, H., & FTC Staff. (2024, August 20). FTC and DOJ charge Amazon with violating children’s privacy law by keeping kids’ Alexa voice recordings forever and undermining parents’ deletion requests. Federal Trade Commission. https://www.ftc.gov/news-events/news/press-releases/2023/05/ftc-doj-charge-amazon-violating-childrens-privacy-law-keeping-kids-alexa-voice-recordings-forever
McIntosh, A. (2024, September 9). How to ensure FERPA compliance in colleges and universities. Technology Solutions That Drive Education. https://edtechmagazine.com/higher/article/2022/05/how-ensure-ferpa-compliance-colleges-and-universities-perfcon
Tschammer, T. von. (2024, February 2). AI models – data collection and generation. Neural Concept. https://www.neuralconcept.com/post/data-collection-and-generation-for-ai-models
University of Illinois Urbana-Champaign. (n.d.). Privacy considerations for Generative AI – Privacy & Cybersecurity. https://cybersecurity.illinois.edu/policies-governance/privacy-considerations-for-generative-ai/