The trials and tribulations of trying to record tutorial videos

Photo by Jakob Owens on Unsplash

When we were gearing up to launch the alpha of our product back in October 2021, we envisioned adding tutorials on how to use the app to the website. We also planned to create a set of videos that would not only make those tutorials easier to follow but also open up another channel for spreading the word about the product.

Back in the days of my first tech job, as a “New Media Officer” (a fancy job title, but I just maintained a website) at a hospital in 2010, I used to talk with clinicians about how videos could help their patients understand topics more easily. A study had shown that people learn subjects better from a mix of video and text than from text alone, so I was keen to use this to help sell the idea behind our product.

These videos would be an experiment to see if there’s an easier way to get potential users to understand the value that our product will bring to them, so they are more likely to want to sign up and ultimately give us money.

The first attempt

In order to create videos of a reasonable enough quality I invested a couple of hundred pounds in a Blue Yeti microphone, a Logitech StreamCam, a set of cheap LED panels to light my face and a cheap green screen.

For capturing footage my plan was to run two instances of OBS to capture my screen and a webcam feed of me talking through the scripts we’d written. I’d then take both sets of clips, sync them up (I made sure to clap before each take so I had a point to anchor from) and use Adobe Premiere to edit them together and remove the background from the webcam clips.
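
I lined the claps up by eye in Premiere, but the idea is simple enough to sketch in code. The following is a minimal Python illustration, assuming both recordings captured the clap on their audio tracks and that the audio has already been decoded into lists of samples between -1.0 and 1.0. The function names and the 0.8 threshold are hypothetical and only here for illustration.

```python
from typing import Sequence


def find_clap_time(samples: Sequence[float], sample_rate: int,
                   threshold: float = 0.8) -> float:
    """Return the time (in seconds) of the first sample loud enough to be the clap."""
    for index, value in enumerate(samples):
        if abs(value) >= threshold:
            return index / sample_rate
    raise ValueError("no sample reached the clap threshold")


def sync_offset(screen_audio: Sequence[float], webcam_audio: Sequence[float],
                sample_rate: int, threshold: float = 0.8) -> float:
    """Seconds to shift the webcam clip later so both claps line up.

    A negative result means the screen clip is the one that needs shifting later.
    """
    screen_clap = find_clap_time(screen_audio, sample_rate, threshold)
    webcam_clap = find_clap_time(webcam_audio, sample_rate, threshold)
    return screen_clap - webcam_clap
```

If the clap lands two seconds into the screen recording and half a second into the webcam recording, the webcam clip needs to start a second and a half later on the timeline for the two to match.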

I’d chosen to record separate clips of the screen and the webcam instead of compositing them within OBS as I felt this would give me more flexibility while I figured out how I wanted the video to look.

Unfortunately the visual aspect of the video was the biggest issue I had to deal with. I had overestimated my presentation skills by a lot, so the first set of takes took a long time and looked terrible.

It turns out that while I’m naturally a very chatty person, I cannot read from a script in the same manner. I think this is because I tend to speak off the cuff, so when I’m presented with a set of words I have to say I trip over myself constantly and don’t emote as well as I normally would, which makes it all look very robotic and disingenuous.

The first attempt at a single six-minute video (there were four I had to do) took me four hours to record and two hours to edit. This meant that the videos would become a blocker for the website launch, so we shelved them in order to move the launch forward.

The second attempt

After launching the website I took some time to think about how I could optimise the video-making process and decided that recording the actions on screen and recording myself talking to camera could be handled separately.

That way I could mess up the script as much as I wanted and then record the actions needed for the final edited webcam clip. To record the actions on screen I would play back the webcam clip so I could ensure the timings were right when editing.

This approach did shorten the time it took to produce a video, as there was less need to reset everything if I messed up, and I took pauses between sentences to increase my chances of not having to redo entire segments. I still wasn’t happy with the way the webcam footage came out, though.

It was clear that I was reading a script just below the camera and I again looked very robotic and disingenuous. This time, because I had edited together a bunch of different takes, I also ended up with some very jumpy footage, as I would move about a lot between those takes.

This brought the time to produce a video down to about four hours for recording and editing combined, but the quality of the video still wasn’t up to the standard I wanted.

The third and final attempt

I was ‘fortunate’ enough to have spent 2021 working a lot so in December I had three weeks of holiday to take and I was able to sink time into tackling the problem of how to efficiently create the tutorial videos.

The first step I took was to revisit the value that having me appear in the videos actually gave the user. I had originally wanted to have a human element present as I felt this gave the videos more personality, but as I couldn’t capture any personable footage I decided that a disembodied voice would work well enough in order to get the videos published.

To record the audio I used the same approach as before, reading the script sentence by sentence with pauses in between. Because there was no video footage to jump about this time it actually worked well, and I was able to get the scripts recorded in about 20 minutes per script instead of the hours it took before.

I then recorded the actions on screen using the same method as before by playing the recorded script back and doing the actions in time to what was being said. There were no time savings here compared to the previous approach.

The time to edit improved as well, because I left gaps between the different sections of the script in the audio file. This let me easily align the video to the audio, create clips that combined the script audio with the screen capture, and use those clips when building the final video.
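
I found those gaps by looking at the waveform in the editor, but the same idea could be automated. Here is a minimal Python sketch, assuming the narration has already been decoded into a list of samples between -1.0 and 1.0. The function name, the 0.05 silence level and the one-second minimum gap are all hypothetical values picked for illustration.

```python
from typing import List, Sequence, Tuple


def find_gaps(samples: Sequence[float], sample_rate: int,
              silence_level: float = 0.05,
              min_gap_seconds: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of silent stretches long enough to be section breaks."""
    gaps = []
    run_start = None  # index where the current silent run began, if any

    for index, value in enumerate(samples):
        if abs(value) < silence_level:
            if run_start is None:
                run_start = index
        elif run_start is not None:
            # Sound has resumed, so keep the silent run only if it was long enough.
            if (index - run_start) / sample_rate >= min_gap_seconds:
                gaps.append((run_start / sample_rate, index / sample_rate))
            run_start = None

    # Handle silence that runs to the end of the recording.
    if run_start is not None:
        if (len(samples) - run_start) / sample_rate >= min_gap_seconds:
            gaps.append((run_start / sample_rate, len(samples) / sample_rate))

    return gaps
```

Each returned pair marks a stretch of silence, so the speech between one gap’s end and the next gap’s start is one section of the script to pair up with its screen capture. Short pauses between sentences fall under the minimum gap length and are ignored.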

This allowed me to reduce the time taken to create a video to about 90 minutes once I had streamlined the process: recording the audio for all the videos first, then capturing the screens one after another (as the scripts built up an example sequentially) and finally editing them.
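
To put the three attempts side by side, here is a small Python sketch of the arithmetic. It uses the rough per-video times mentioned above and assumes each approach had been used for all four videos, which is a simplification as the first two approaches were abandoned early.

```python
VIDEO_COUNT = 4  # the number of tutorial videos in the series

# Approximate hours per video (recording plus editing) for each approach.
HOURS_PER_VIDEO = {
    "first attempt": 4 + 2,   # four hours recording, two hours editing
    "second attempt": 4,      # about four hours in total
    "third attempt": 1.5,     # about 90 minutes in total
}


def total_hours(hours_per_video: dict, video_count: int = VIDEO_COUNT) -> dict:
    """Return the estimated hours needed to produce every video with each approach."""
    return {name: hours * video_count for name, hours in hours_per_video.items()}


if __name__ == "__main__":
    for name, hours in total_hours(HOURS_PER_VIDEO).items():
        print(f"{name}: {hours} hours for {VIDEO_COUNT} videos")
```

On those rough numbers the full series goes from around 24 hours with the first approach, to 16 with the second, to about 6 with the third.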

The result

The videos didn’t end up as we’d originally envisioned but they do the job, and it’s better to have them published so we can start seeing the impact they have on users being able to learn how to use the product than to seek perfection.

I am happy with the final result though as I think the process I landed on worked well enough that I wouldn’t feel too worried if I had to create more videos in the future.

After publishing the videos the next challenge was figuring out what to put in the description box of YouTube when I uploaded the videos but for the most part I was able to re-use the script to do this.

I’ve created a playlist on YouTube for the tutorials that you can watch if you’d like to learn more about how to build an interactive User Journey Map in order to build Shared Understanding easier with

The first video in the series of videos I created to help users learn more about

Lessons learned

The biggest lesson learned was probably something I already knew but forgot in the panic of getting to grips with a medium I didn’t have much experience in: perfection is the enemy of progress.

I think I’ve spent so much time consuming highly polished videos from content creators with teams behind them that my idea of what was valuable was skewed a little, so I’m glad I took the time to re-evaluate and re-focus.

I have a newfound respect for those in front of the camera with the ability to read a script seamlessly in order to create a stream of content. I’m sure I’ll improve as I do it more but the level that those who publish content daily are on is impressive.

Recording the videos gave me an interesting insight into the way that I communicate. I have a habit of starting sentences strong and clear before trailing off at the end. This is very similar to my handwriting, so I have to work hard to keep things clear throughout.

This insight led me to figure out how to talk in a consistent manner and how to bring some emotion into the recordings without sacrificing that clarity. This is something I’m hoping I can keep practicing in my day job too, as I spend a lot of time building understanding amongst parties, so clarity is important.



Colin Wren


Currently building Interested in building shared understanding, Automated Testing, Dev practises, Metal, Chiptune. All views my own.