
Example minimum standard:

Metadata is essential for access to oral history collections and for the reuse of a collection’s source material.
An important question is: Which metadata needs to be recorded and how?
Information about all kinds of underlying data of oral history interviews is essential for compiling, managing and finding a collection or interview.
There are roughly four types of metadata:

DESCRIPTIVE
TECHNICAL
ADMINISTRATIVE
STRUCTURAL

Descriptive metadata

In oral history, descriptive metadata refers to information about the interview or the subject discussed.

Technical metadata

Technical metadata refers to the technical information that makes up the (digital) data file containing the interview, such as file type, codec, file size and resolution.

Administrative metadata

Administrative metadata ensures that items are categorised and ordered. These metadata refer to information related to issues such as rights management.

Structural metadata

Structural metadata refers to how individual components relate to the whole. Structural metadata is used to influence search results on the Internet or in an internal system, for example by the use of labels (tags) to improve findability.
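
The four types can be pictured as four parts of one record per interview. The sketch below is only an illustration: the field names and the "minimum standard" are hypothetical and not taken from any official standard. It checks which required fields are still missing from a record.

```python
# Hypothetical minimum standard: the fields each metadata part must contain.
# These names are assumptions for illustration, not an official schema.
MINIMUM_STANDARD = {
    "descriptive": ["title", "interviewee", "interviewer", "date", "subject"],
    "technical": ["file_type", "codec", "file_size_bytes"],
    "administrative": ["rights_holder", "access_conditions"],
    "structural": ["collection", "tags"],
}


def missing_fields(record):
    """Return 'part.field' names that the record lacks or leaves empty."""
    missing = []
    for part, fields in MINIMUM_STANDARD.items():
        values = record.get(part, {})
        for name in fields:
            if values.get(name) in (None, "", []):
                missing.append(f"{part}.{name}")
    return missing


# Example record with invented values.
record = {
    "descriptive": {
        "title": "Interview 1",
        "interviewee": "A. Example",
        "interviewer": "B. Example",
        "date": "2023-01-15",
        "subject": "Village life",
    },
    "technical": {"file_type": "wav", "codec": "PCM", "file_size_bytes": 52428800},
    "administrative": {"rights_holder": "Example Archive"},
    "structural": {"collection": "Example collection", "tags": []},
}

print(missing_fields(record))
# prints ['administrative.access_conditions', 'structural.tags']
```

A check like this can be run before an interview is deposited, so that gaps in the metadata are noticed while the information is still easy to obtain.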

Workflow-interview

For a good oral history interview, the following aspects are important:

Approach to persons to be interviewed
Good communication about the research questions and the project, the expectations and the further process with the interviewees
Choosing the setting and place of the interview
Creating trust
Good questionnaire or topic list
Good listening
Dealing with emotions during the interview
Interventions during the interview
Good agreements on privacy
A pleasant ending

Considerations when using speech recognition

When considering using automatic speech recognition (ASR), it is important to realise that some AV sources are more suitable than others. Here are some things to keep in mind. First, the audio quality must be high. This means that voices should be clear, not echoing, and preferably recorded close to the mouth with adequate microphones.

Secondly, ASR works best with monologues. If an audio file is full of people interrupting each other and talking at cross purposes, the results can be confusing to read, as not all software is able to recognise different people by their voices. Ideally, the file should have a separate channel for each speaker.

Finally, ASR is generally not very good at dealing with accents and dialects. Even an accent that you find easy to understand, for example that of a migrant or of someone from a rural area, can cause ASR great difficulty, and accents that are hard for outsiders to follow are harder still.
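
If an interview was recorded in stereo with one speaker per channel, the channels can be separated into two mono files before running ASR. The following is a minimal sketch using only Python's standard `wave` module. It assumes an uncompressed stereo WAV file; the function names and file paths are hypothetical.

```python
import wave


def deinterleave(frames: bytes, sample_width: int) -> tuple[bytes, bytes]:
    """Split interleaved stereo PCM bytes into (left, right) mono bytes."""
    frame_size = 2 * sample_width  # one sample for each of the two channels
    left, right = bytearray(), bytearray()
    for start in range(0, len(frames) - frame_size + 1, frame_size):
        left += frames[start:start + sample_width]
        right += frames[start + sample_width:start + frame_size]
    return bytes(left), bytes(right)


def split_stereo_wav(source: str, left_path: str, right_path: str) -> None:
    """Write each channel of a stereo WAV file to its own mono WAV file."""
    with wave.open(source, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a stereo (2-channel) WAV file")
        width = src.getsampwidth()
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())
    left, right = deinterleave(frames, width)
    for path, data in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(width)
            dst.setframerate(rate)
            dst.writeframes(data)
```

Calling `split_stereo_wav("interview.wav", "interviewer.wav", "interviewee.wav")` would then give one file per speaker, each of which can be transcribed separately.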

ASR software

When using software, bear in mind that you are uploading privacy-sensitive files.

Always read the terms and conditions of an ASR service before deciding whether it meets your privacy requirements.

ASR with Subtitle Edit

As of January 2023 (version 3.6.12), a new automatic speech recognition option has been built into Subtitle Edit.

This version of Subtitle Edit includes two speech recognition features under the Video tab:

Vosk/Kaldi (a somewhat older ASR method)
Whisper (AI-based modern ASR feature)

Brief installation instructions for Subtitle Edit 3.6.12*, to make the program work best for Whisper speech recognition:

* Version 4.0.5 is now available and offers further options for Whisper. The OpenAI, CTranslate2 and WhisperX engines require a separate installation of Python. The Purfview's Faster Whisper (with language model large-v3), CPP and ConstMe engines can be used without Python.

The Advanced option on the Whisper “Audio to text” screen allows additional parameters for the Whisper command line to be specified. Whisper post-processing can be configured via Settings.

Installing Python is a chapter by itself. A simplified way to install Python and Whisper on your computer is given below under the heading: Installing Whisper and Python (Windows) – for advanced users.

MacWhisper

MacWhisper quickly and easily transcribes audio files into text with Whisper, OpenAI's transcription technology.

Transcription is done on your device, so your (sensitive) data does not leave your computer.
Export subtitles in .srt & .vtt. Text export in .csv
Search the entire transcript and highlight words
Play audio and sync with transcripts
Supports 100 different languages
Automatically removes ums, uhs and similar filler words
Supported formats: mp3, wav, m4a and mp4 videos.
Supports Tiny and Base models

The Pro version costs €29 (personal use).

The Pro version uses the Medium and Large models, which often give much better transcription results.

AI Transcriptions by Riverside

Users who do not want to download software to their computer but still want to use AI transcription can use Riverside's transcriber. It transcribes audio and video in 100+ languages with a few clicks, and the AI transcriptions are free of charge.

There are some drawbacks to using it online:

You upload (sensitive) data to an online service.
Transcription times can vary depending on file size, content length and how busy Riverside’s servers are.

Advantages:

Unlimited file upload (MP3, Wav, MP4 and MOV)

Output in Caption – subtitle file (srt) or Text file (txt)

Disadvantage:

Other file formats, such as m4a, must first be converted to a format that the website can read, for example with Convertio.co.

Whisper SteveDigital online

Users who do not want to download software to their computer but still want to use Whisper can use SteveDigital's free service on the Internet.

It converts audio files or YouTube videos into text online with Whisper, OpenAI's transcription technology.

There are some drawbacks to using it online, though:

You upload (sensitive) data to an online service.
At busy times there is a queue, and large files can sometimes take a long time.
The output is a text file (without time codes).

Advantages:

Transcription takes 5-10 seconds per minute of audio
Uses the large model

Installing Whisper and Python (Windows) – for advanced users

Whisper AI uses the Python programming language.

Installing everything on your computer, from Python to the various Whisper models, does require some computer knowledge. The GitHub site has all the necessary files grouped together:

github.com/openai/whisper

Getting everything working on your personal computer is quite complicated. Which programme components to install depends a lot on your computer’s specifications.

However, there is an installation programme developed by TroubleChute that goes through the entire installation process automatically, taking into account your computer’s configuration.

Below is a link to a video that explains step-by-step how to easily install Python and Whisper on your computer (English spoken):

TroubleChute

One-click Whisper install windows install script
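
Once the installation has finished, Whisper is started from the command line. As a rough sketch, the snippet below assembles such a command in Python. The options shown (`--model`, `--language`, `--output_format`, `--output_dir`) are those of the command-line tool in the openai/whisper repository; the helper function, the file name and the chosen defaults are assumptions made for illustration.

```python
import shlex


def whisper_command(audio_path, model="medium", language=None,
                    output_format="srt", output_dir="."):
    """Build the argument list for the `whisper` command-line tool."""
    cmd = [
        "whisper", audio_path,
        "--model", model,                  # e.g. tiny, base, small, medium, large
        "--output_format", output_format,  # e.g. txt, srt, vtt
        "--output_dir", output_dir,
    ]
    if language:
        # Naming the language skips automatic language detection.
        cmd += ["--language", language]
    return cmd


# Hypothetical file name; "nl" selects Dutch.
cmd = whisper_command("interview.wav", language="nl")
print(shlex.join(cmd))

# To actually run the transcription (requires Whisper to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Larger models generally give better results but need more memory and time, so it can be worth trying a short fragment with a smaller model first.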

ASR with Word 365

Automatic speech recognition

Automatic speech recognition with Word in Office 365.

With a Microsoft account, the service can be used online for free.

The disadvantage is that the result is a document without time codes.

Via an option in YouTube Studio, subtitles with time codes can be created.

Instruction document for automatic speech recognition in Word can be downloaded here:

Automatic transcription

Automatic transcription with Word in Office 365.

The service can only be used with an Office 365 premium subscription.

(300 minutes of speech recognition per month)

The result is a document with start times per paragraph. An option in YouTube Studio can be used to turn it into a readable subtitle file with time codes.

The instruction document for automatic transcription in Word can be downloaded here:

ASR with Google Docs

The automatic speech recognition can be used with a Google Account.

The disadvantage is that the result is a document without time codes.

Using an option in YouTube Studio, subtitles with time codes can be created from this.

Instruction document for automatic speech recognition in Google Docs can be downloaded here:

ASR with YouTube

The automatic subtitles can be created with a Google / YouTube account.

Only suitable for video files.

If you want to have an audio file (mp3, wav, ogg, etc.) automatically transcribed, it must first be converted to a video file so that it can be uploaded to YouTube. There are all kinds of free programmes for this. The trick is to load the sound track, show a single arbitrary picture for the entire length of the sound file, and save the result as an mp4 file. The file is then ready for uploading to YouTube.
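
One free programme that can do this is the command-line tool ffmpeg (not named in the text above; it is used here only as an example). The sketch below builds an ffmpeg command that repeats a still image for the length of the audio. The file names are hypothetical, ffmpeg is assumed to be installed, and the image is assumed to have even pixel dimensions, which the chosen video settings require.

```python
import shlex


def still_image_video_command(audio_path, image_path, output_path):
    """Build an ffmpeg command that turns audio plus one image into an mp4."""
    return [
        "ffmpeg",
        "-loop", "1",           # keep repeating the single input image
        "-i", image_path,
        "-i", audio_path,
        "-c:v", "libx264",      # H.264 video
        "-tune", "stillimage",  # encoder tuning for a static picture
        "-pix_fmt", "yuv420p",  # pixel format most players accept
        "-c:a", "aac",          # AAC audio
        "-shortest",            # stop when the audio track ends
        output_path,
    ]


cmd = still_image_video_command("interview.mp3", "picture.jpg", "interview.mp4")
print(shlex.join(cmd))

# To actually create the video (requires ffmpeg to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```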

Instruction document can be downloaded here

ASR for academics

Transcription Portal

Easy to use web-based ASR
Multilingual
Editing possible
Free of charge (academic use)

https://www.phonetik.uni-muenchen.de/apps/oh-portal

The Transcription Portal is an online ASR tool developed and hosted by LMU Munich for academic transcription purposes. The tool is not an ASR service itself, but allows you to process your audio files through many different ASR services. You can then correct and edit the results within the OH-Portal or export them in a file type of your choice.

If you are interested in transcription tools, please check here:

TRANSCRIPTION TOOLS

example

Expanded and in line with the General Data Protection Regulation (GDPR)

Stories in motion

Consent form interviews

In order to allow the interviews to be used, each interviewee must fill in a consent form, giving permission for the (entire) interview to be made available.
If there are excerpts from the interview that the interviewee does not wish to make available, or in which he/she does not wish to be recognisable, the interviewee can indicate this on the consent form.
The interviewee may also indicate that some documents will not be accessible for another ten or twenty years.
Finally, the interviewee may indicate that the interview is only to be made available to registered researchers, for example.

Interview ownership and use agreement

Interviewers who work freelance for a project are regarded as creators for copyright purposes. Interviewers must therefore declare that the interviews they conduct on behalf of an organisation or a specific project become the property of that organisation.

Workflow interviews

A schematic representation of the steps you need to take for an oral history interview, from choosing a person to interview to depositing all the files in an archive.

Transcription tools

There are several specialised tools for manual digital transcription. Although such tools are not strictly necessary (you can also use Microsoft Word or Google Docs), it is useful to know the options, because they make transcribing easier and increase the possibilities for research using your transcriptions.

An advantage of using transcription software, for example, is that the “playing” of the sound and image is combined with “typing” the spoken text. This results in a transcription that is recorded with time codes, i.e. the start and end time of each text fragment (a word, a few words, a sentence or a paragraph) are known. This time alignment makes it possible to search for spoken words and to generate subtitles. In transcriptions made with an ordinary word processor (Notepad, Word, etc.) this time alignment is missing and the result is only text.
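
To make the idea of time alignment concrete, here is a minimal sketch that turns time-coded text fragments into a subtitle file in the SRT format and searches them for a spoken word. The fragment layout (a `start` and `end` time in seconds plus the `text`) is an assumption for illustration, as are the function names and example sentences.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT time code (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Turn time-coded fragments into the text of an SRT subtitle file."""
    blocks = []
    for number, seg in enumerate(segments, start=1):
        times = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{number}\n{times}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def find_word(segments, word):
    """Return the start times of all fragments in which the word occurs."""
    word = word.lower()
    return [seg["start"] for seg in segments if word in seg["text"].lower()]


# Invented example fragments.
segments = [
    {"start": 0.0, "end": 2.5, "text": "Good morning."},
    {"start": 2.5, "end": 5.0, "text": "I grew up in a small village."},
]

print(to_srt(segments))
print(find_word(segments, "village"))  # prints [2.5]
```

Without the start and end times, neither the subtitle file nor the jump to the moment a word was spoken would be possible, which is exactly what is lost when transcribing in an ordinary word processor.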

LOCATION

Make sure that your interviewee feels at ease. An oral history interview should therefore preferably take place in the interviewee’s home. Interviewees should preferably be seated where they usually sit in the house.
Do not sit too far apart, so that you do not have to raise your voice and do not lose eye contact.
Do not sit too close either, or you will invade the interviewee’s personal space and make them feel uncomfortable.
Make sure that light falls on the interviewee’s face.

AUDIO

Be aware of things that can spoil your recording. Turn off the telephone, TV and radio. Listen for ticking clocks, purring cats and anything else that could disturb a recording.
Reduce noise from outside the room. Close doors to other parts of the house and, if there is traffic noise, close windows wherever possible.
Do not, for example, allow the interviewee to sit with rustling paper in hand.
A microphone does not distinguish between sounds and therefore picks up disturbing noises just as loudly as speech.
When recording an interview it is much better to use a separate microphone than the built-in microphone of the video camera or audio equipment.
Make sure that nobody can trip over cords: do not lay them near doorways or across places where people walk.
The closer the microphone is to the sound source, the better the quality of the recording.
If you are using a table microphone, point it at the interviewee and place it as close as possible to the interviewee.
If using a tie-pin microphone, position it about 15 cm below the mouth of the interviewee.
Make sure that the clothes cannot rub against the microphone. Microphones with alligator clips are best for this.
Record and play back a few seconds of audio to make sure everything works before you start the interview.

VIDEO

If you are using video you should also think about how you are going to portray the interviewee.
First of all you don’t want direct light in the picture. Avoid backlighting from windows as this makes it difficult to find the right camera setting for exposure.
If you use artificial light you are more flexible.
You want to see as much of the interviewee’s face as possible, in order to register emotions as well.
Place the camera next to where you are sitting as the interviewer, or just behind your shoulder.
Avoid using auto focus.
Zoom all the way in to adjust the focus manually, then zoom back out to frame the shot.
A close-up may look interesting, but it can put your interviewees out of frame if they move.
A more zoomed-out shot is safer if you are the interviewer doing the recording yourself.
A more zoomed out shot is also interesting because you can see any gestures the interviewee might make.
If there is a second person to operate the camera, they can keep an eye on the image and the sound, so that the interviewer can concentrate on asking the questions.
The person operating the camera wears headphones to check the sound quality.

What are metadata?