Many WashU community members create audio and video recordings in research, during meetings, while attending lectures, and in other circumstances. These recordings can be indispensable to a project because they document what was said with perfect fidelity for future reference and analysis. A transcript of the recording is even more helpful, making it easy to search for specific terms, identify emergent codes, and conduct content analysis.
Of course, anyone who has ever transcribed an interview knows it is a painstaking task—manual transcription of an hour-long interview takes several hours for the average person. Multiply that by a few dozen, hundred, or thousand recordings, and manual transcription becomes impractical.
Increasingly, automatic transcription services offer to eliminate this constraint by harnessing the power of artificial intelligence (AI). Susan McGregor of Columbia University’s Data Science Institute stated, “The fact that these AI-powered services exist and can turn a couple of hours of audio into a reasonable written transcript in a matter of minutes is a complete game changer” (Kine 2022).
However, readers should proceed with caution. McGregor also warned, “These run on machine learning, which means that they expose your data to the algorithm that is both transcribing your text and almost certainly using your text and audio to improve the quality of future transcription.” This is especially concerning if the recordings might contain identifiable or protected information (e.g., protected health information). Additionally, the AI transcription model is only as complete as the data used to train it. If a recording contains words that are unfamiliar to the model, results may not be perfectly accurate.
What Are the Concerns?
- Many automatic transcription services work in “the cloud,” and the data involved may be shared with AI services.
- Transcription apps store troves of valuable data in their cloud servers. These repositories of nonpublic information are enticing targets for attackers.
- Transcription applications may have technical vulnerabilities (e.g., the absence of two-factor authentication), which make it easier for attackers to access accounts and the contents of interviews and transcripts.
- New devices can record any phone call or meeting and automatically upload it to the internet for transcription, summary, and analysis, with or without the knowledge or consent of the speaker(s).
- Hiring a human typist for transcription can introduce privacy and security risks unless a contract specifies the company’s nondisclosure obligations and its responsibility to secure the data.
Mitigating the Risks
WashU’s Office of Information Security (OIS) is committed to protecting the WashU community from security events that directly or indirectly (i.e., via a vendor) impact our institution, data, and ability to fulfill our shared missions of teaching, research, and patient care. To that end, the OIS conducts risk assessments of technologies, vendors, and services to identify possible problems, provide risk mitigation guidance, and recommend more secure alternatives when necessary.
WashU community members can also proactively select transcription services that prioritize security. Keep the following practices in mind when considering the best transcription service for your needs:
- What kind of data might appear in your recordings? If there is a possibility that protected health information (PHI) might be included in the recording, opt for a transcription service with which WashU holds a Business Associate Agreement (BAA), such as Landmark Associates, Trint, and Qualtranscribe (as of October 2023).
- Does the transcription service offer two-factor authentication (2FA) or integrate with the WUSTLKey authentication process? If not, someone could log into your account using only a password and download your transcripts. If 2FA is available, be sure to set it up. Also, remember to use long, unique passwords for your accounts, which will prevent multiple accounts from being compromised if one password falls into the wrong hands.
- Don’t leave your data lingering out there in the world. After downloading your transcripts and saving them in an approved storage service, delete your data from the transcription service website. This reduces the risk of unauthorized access to your data if the transcription service experiences a breach.
- Double-check the transcription company’s website to ensure they encrypt your data in transit and at rest. Most transcription services do. If you find one that doesn’t, consider selecting an alternative company.
- Think carefully before adopting technologies that automatically upload recordings to the internet for transcription, summary, and analysis. Those recordings may contain protected, sensitive, and personal information that could easily fall into the wrong hands if handled carelessly. Protect yourself and everyone with whom you interact by keeping security and privacy in mind as you consider new technologies.
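For readers comfortable with a little scripting, the advice above about long, unique passwords can be illustrated with a minimal Python sketch. It uses the standard library's secrets module to draw characters from a cryptographically secure source. The function names, the 20-character default, and the 16-character minimum are assumptions chosen for illustration, not WashU or OIS policy; a reputable password manager accomplishes the same thing without any code.

```python
import secrets
import string

# Characters to draw from: letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

# Illustrative minimum only; this threshold is an assumption, not an OIS rule.
MIN_LENGTH = 16


def generate_password(length: int = 20) -> str:
    """Return a random password built with a cryptographically secure generator."""
    if length < MIN_LENGTH:
        raise ValueError(f"use at least {MIN_LENGTH} characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_unique_passwords(services, length: int = 20) -> dict:
    """Give each named service its own independently generated password."""
    return {name: generate_password(length) for name in services}


if __name__ == "__main__":
    # Hypothetical service names, used only to show one password per account.
    for name, password in generate_unique_passwords(["service-a", "service-b"]).items():
        print(name, password)
```

Because every account gets its own independently generated password, a breach at one transcription service does not hand an attacker the credentials for any other account.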
As always, contact the OIS at firstname.lastname@example.org if you have questions or concerns, or if you are searching for a suitable transcription service. The team is ready and happy to help!
Kine, P. (2022) My Journey Down the Rabbit Hole of Every Journalist’s Favorite App. Politico.
Shelton, M. and Y. Grauer (2022) How Secure are Journalist’s Favorite Transcription Tools? Freedom of the Press Foundation.
Data Classification, WashU Office of Information Security
Generative Artificial Intelligence, WashU Information Technology
HIPAA Business Associates Agreements, WashU Office of Resource Management
Secure Storage and Communication Services, WashU Office of Information Security