# YouTube’s Automatic Closed-Captioning of Mathematical Speech (Part 2)

Last semester, as I spent untold hours editing the closed captioning that YouTube automatically generated for the math videos on my channel, I got a crash course on the capabilities and limitations of this system. This was perhaps not legally necessary; it was extra work that I took on because a student with a hearing impairment was enrolled in my class, and I wanted to ensure that the review videos that I provide to my students were accessible to him as well.

I think the resources that my university offers to ensure that instructors can reach all students, and not just those without audio/visual impairments, are fairly typical. After discussions with the cognizant people at my university, I’ve reached a few conclusions:

• Mostly by accident, my videos are ADA compliant since I made the decision to both write out the solutions and talk through them.
• While the automatic closed-captioning provided by YouTube may be minimally compliant with ADA, I’m not sure that a student with a hearing impairment could always follow the transcriptions due to a number of errors.
• Aside from punctuation, capitalization, and the occasional homonym (e.g., right vs. write), YouTube does a pretty good job at transcribing ordinary speech.
• Naturally, YouTube’s automated closed-captioning is not to blame when I don’t enunciate clearly, go down a rabbit trail of thought and then have to backtrack, use poor grammar, make an outright mistake, etc.
• However, YouTube seems to have a lot of difficulty providing automatic closed-captioning of mathematical speech.

Fixing these transcription errors took an awful lot of time. I don’t want to know how many hours I devoted to fixing the 120 or so videos (each video is about 3-10 minutes long) recorded so that my hearing-impaired student could have full access to my class. About halfway into this project of fixing the closed-captioning errors, I started writing down some of the closed-captioning errors. I wish I had thought to do this near the start, but oh well.

Phonetically, I can understand why most of these errors were made. But these mistakes really shouldn’t have happened. Here are my favorite howlers that I recorded, showing both what I said and what YouTube thought I said.

• “931,147,496” became “930 1,000,000 147,000 496”
• “$A \cap C$,” pronounced “$A$ intersect $C$,” became “A inner sexy”
• “arithmetic” became “rhythm sick”
• “capital $X$” became “Catholics”
• “cardinality” became “carnality”
• “divisible by 5” became “visited his wife live” (I have no idea how that happened)
• “$e^x$” became “eat ooh the x”
• “for succinctness” became “force the sickness”
• “$n \choose n$,” pronounced “$n$ choose $n$,” became “and shoes and”
• “set containing” became “second taining”
• “$\sqrt{2}$” became “squirt tuna”
• “two ways in” became “too wasted”
• “what $f(3)$,” pronounced “what $f$ of 3,” became “whateva 3”
• “$x \in B$,” pronounced “$x$ is in $B$,” became “sexism be”
• “$x \in B \cap C$,” pronounced “$x$ is in $B$ and $C$,” became “x is Indiana see”
• “$x \in C$,” pronounced “$x$ is in $C$,” became “excellency”

Here’s the complete list of howlers that I recorded for posterity. If I’ve learned nothing else, it’s that I need to be more proactive about ensuring the mathematical accuracy of closed-captioning for my YouTube videos.

| What I said | What YouTube heard |
| --- | --- |
| 4 | for |
| 857 | a 50 7 |
| 1232 | 1230 two |
| 4761 | 4760 1 |
| 19,999 | 19,000 999 |
| 46,376 | 40 6376 |
| 123,552 | 120 3,552 |
| 5,565,120 | five million 565,000 120 |
| 931,147,496 | 930 1,000,000 147,000 496 |
| $(2,\emptyset)$ | 2d sent |
| $(20,8)$ | 28 |
| $[1,2]$ | one too |
| $12 \choose 4$ | 12 juice 4 |
| $16 \choose 8$ | 16 choosing |
| $3 + 1 = 4$ | surplus one mix for |
| $4 \choose 0$ | 4 2 0 |
| $4 \choose k$ | four twos k |
| $49 \choose 5$ | 49 she’s 5 |
| $50 \choose 6$ | 52 six |
| $8 \choose 2$ | a choose to |
| $A \cap C$ | A inner sexy |
| $A \cap D$ | a intersecting |
| $A \cup B$ | a you be |
| $A \cup C$ | a UNC |
| $A \cup C$ | a you will see |
| a proof | approved |
| $A^c$ | a compliment |
| $a_i$ | asa by |
| all multiples of | almost visit |
| an element of $A$ | known the debate |
| an element of $A$ | normal today |
| and divisible | and as above |
| and positive 50 | + + 50 |
| and tens | intense |
| and would let this be 3 | andrew lippa p3 |
| arithmetic | earth to |
| arithmetic | rhythm sick |
| $A$s | ace |
| $B$ but not $C$ | be but not si |
| $B \cap C$ | b in a sexy |
| $B$ if | beef |
| bijection | bi CH action |
| bijection | bite jection |
| bijection | by dejection |
| bijection | by ejection |
| bijection | by jection |
| bijection | by Junction |
| both sets | both says |
| capital X | Catholics |
| cardinality | carnality |
| Cartesian | car to shull |
| codomain | code Amin |
| coordinate | cordon |
| coordinate | court |
| coordinates | corners |
| coordinates have | cort in sap |
| cosine | cosign |
| disjoint | destroyed |
| divisible by 5 | visited his wife live |
| $e^x$ | eat ooh the x |
| element of A | illness of A |
| element of A | mellow today |
| element $x$ that | Windex |
| elements of | us |
| empty | MQ |
| $\emptyset$ | descent |
| $\emptyset$ | intercept |
| equal | able |
| exponent | x1 |
| factored | acted |
| factorial | fact welders |
| fill in | film |
| flipping four coins | philippine for coins |
| for succinctness | force the sickness |
| hence in | Hanson |
| $i$ | eye |
| $i$ | aye |
| If I divide by 15 | If I / 15 |
| in $A$ | nae |
| in there | a bear |
| infinite | if an |
| infinite | imp an |
| infinite | infant |
| into five | in 2 5 |
| $i$s | ice |
| $j \choose r$ | j choose arms |
| $k$th | cave |
| $k$th | kate |
| likewise | lakh wise |
| $n \choose n$ | and shoes and |
| $n$th row | nth throw |
| one-to-one | 121 |
| onto | on 2 |
| $r \choose r$ | our shoes are |
| $r$ to | art at |
| $r$ to | already |
| $\mathbb{R}^2$ | are too |
| $\mathbb{R}^2$ | our too |
| $r$’s | hours |
| same row | samro |
| second coordinate | sec cornered |
| set containing | second inning |
| set containing | second taining |
| set containing | seconds hanging |
| set containing | secretary |
| set containing 1 | second anyone |
| since $A$ has | say has |
| sixth one | six-month |
| square | swear |
| $\sqrt{2}$ | score 2 |
| $\sqrt{2}$ | squirt of tuna |
| team A | teammate |
| term in it | terminate |
| than zero | gloves are off |
| that’s chosen | that’s Showzen |
| then $x$ | the next |
| therefore | there for |
| this entry | in the century plus |
| to the $k$ | decay |
| two are | to are |
| two ways in | too wasted |
| union | you need |
| up here | pier |
| what $f(3)$ | whateva 3 |
| will be 4 | will before |
| with $n=4$ | finials 4 |
| would subtract | was attract |
| writing | riding |
| $x$ is | extras |
| $x$ is in | exiting |
| $x$ is in $A$ | x as a native |
| $x$ is in $A$ | x is nay |
| $x$ is in $B$ | sexism be |
| $x$ is in $B$ and $C$ | x is Indiana see |
| $x$ is in $C$ | excellency |
| $x$ is in $C$ | X’s and see |
| $x_2$ | next to |
| $x_2$ | text too |
| $x$-coordinate | export |
| $y$ | why |
| $y$ | wine |
| $y$ is greater than or | wider |
| $y$s | wise |