Computational Sound study group project
The goal is to build a poor man's VU meter for an audio waveform.

Here I’m using King George VI’s speech to his people, 1939 (as chronicled in the movie The King's Speech).
This speech is available at the Internet Archive in two parts: 1, 2. It is embedded here in this notebook (because I’m not sure there’s a direct download URL):
In[]:=
source1 = [embedded audio, part 1, duration 03:28, data in notebook];
source2 = [embedded audio, part 2, duration 02:07, data in notebook];
In[]:=
kingEdwardVI1939Speech = AudioJoin[source1, source2];
speechDuration = UnitConvert[Duration[kingEdwardVI1939Speech], "Milliseconds"];
speechLengthMs = QuantityMagnitude[speechDuration];

(* terrific functionality here; pauses are {start, end} pairs in units of seconds *)
kingEdwardsPausesS = AudioIntervals[kingEdwardVI1939Speech, "VoiceInactivity"];

(* just in case you want to hear it without the solemn pauses: *)
kingEdwardVI1939SpeechWithoutPauses = AudioDelete[kingEdwardVI1939Speech, kingEdwardsPausesS];

(* predicate returns True during silence between words: *)
kingIsSilentQ[ms_] := AnyTrue[kingEdwardsPausesS, #[[1]] <= (ms/1000.0) <= #[[2]] &];

(* gate: 0.0 during a pause, 1.0 while the king is speaking *)
kingIsSpeakingPassFilter[ms_] := If[kingIsSilentQ[ms], 0.0, 1.0];

Print["King George VI's speech lasted ", Duration[kingEdwardVI1939Speech], " (", speechDuration, ")"]
King George VI's speech lasted 5 min 34.8898 s (334890. ms)
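To make the pause predicate concrete, here is a minimal sketch on made-up data. The intervals in `examplePausesS` and the names `exampleSilentQ` and `examplePauseInterval` are hypothetical and not part of the notebook; the test is the same `AnyTrue` check used above. The last line shows a possible alternative that folds all pauses into a single `Interval` and queries it with `IntervalMemberQ`.
In[]:=
(* hypothetical pauses, as {start, end} pairs in seconds *)
examplePausesS = {{1.2, 1.8}, {3.0, 3.5}};

(* same test as kingIsSilentQ: argument in milliseconds, pauses in seconds *)
exampleSilentQ[ms_] := AnyTrue[examplePausesS, #[[1]] <= (ms/1000.0) <= #[[2]] &];

exampleSilentQ /@ {500, 1500, 3200, 4000}
(* {False, True, True, False} *)

(* alternative: one Interval holding every pause, queried in seconds *)
examplePauseInterval = Interval @@ examplePausesS;
IntervalMemberQ[examplePauseInterval, #] & /@ {0.5, 1.5, 3.2, 4.0}
(* {False, True, True, False} *)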
In[]:=
frameRate = 30;
{rawAudioLevels, audioLevelIntervalMs} =
  With[
    {frameInterval = Quantity[1000.0/frameRate, "Milliseconds"]},
    {rawAudioLevels = AudioLocalMeasurements[kingEdwardVI1939Speech, "RMSAmplitude",
        PartitionGranularity -> frameInterval]["Values"]},
    (* I don't get how the audio level intervals are computed by Mathematica,
       it is half the `PartitionGranularity` specified... *)
    {audioLevelInterval = speechDuration/Length[rawAudioLevels]},
    {audioLevelIntervalMs = QuantityMagnitude[audioLevelInterval]},
    Print["frameInterval: ", QuantityMagnitude[frameInterval],
      ", audioLevelInterval: ", QuantityMagnitude[audioLevelInterval]];
    {rawAudioLevels, audioLevelIntervalMs}];
frameInterval: 33.3333, audioLevelInterval: 16.6662
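The factor of two printed above would be consistent with consecutive partitions overlapping by half, though that is an assumption and not something this notebook verifies. The sketch below is one way to check it on a hypothetical test tone (`testAudio`, `frame`, `overlappingLevels`, and `nonOverlappingLevels` are illustrative names). It assumes the `{size, offset}` form of `PartitionGranularity` is honored by `AudioLocalMeasurements`; if it is, setting the offset equal to the size should yield roughly one value per video frame.
In[]:=
(* hypothetical test signal: 2 seconds of a 440 Hz sine *)
testAudio = AudioGenerator[{"Sin", 440}, 2];
frame = Quantity[1000.0/30, "Milliseconds"];

(* partition size only, as in the cell above *)
overlappingLevels = AudioLocalMeasurements[testAudio, "RMSAmplitude",
    PartitionGranularity -> frame]["Values"];

(* assumed {size, offset} form: offset equal to size, so partitions do not overlap *)
nonOverlappingLevels = AudioLocalMeasurements[testAudio, "RMSAmplitude",
    PartitionGranularity -> {frame, frame}]["Values"];

(* compare how many measurements each setting produces *)
{Length[overlappingLevels], Length[nonOverlappingLevels]}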
In[]:=
scaledAudioLevels = With[
  {noiseThreshold = 0.02,
   (* pass only actual speech, otherwise clamp to 0.0 for the background noise present
      in this recording (otherwise there's a little black line flickering at the bottom
      of the video during the pauses) *)
   audioSampleTimes = ((Range[Length[rawAudioLevels]] - 1.0)/Length[rawAudioLevels])*speechLengthMs
      + 0.5*audioLevelIntervalMs},
  (kingIsSpeakingPassFilter /@ audioSampleTimes)*rawAudioLevels/Max[rawAudioLevels]
];
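The arithmetic in this cell can be followed on a tiny example. Each of the n levels gets a timestamp at the center of its interval, (i - 1)/n times the total length plus half an interval, and the gated levels are then divided by the maximum so the tallest bar has height 1. All values below (`exampleLevels`, `exampleGate`, a total length of 100 ms) are made up for illustration.
In[]:=
(* hypothetical data: 4 RMS levels covering 100 ms *)
exampleLevels = {0.1, 0.4, 0.2, 0.05};
exampleLengthMs = 100.0;
exampleIntervalMs = exampleLengthMs/Length[exampleLevels];  (* 25. *)

(* center time of each interval, in milliseconds *)
exampleTimes = ((Range[Length[exampleLevels]] - 1.0)/Length[exampleLevels])*exampleLengthMs
   + 0.5*exampleIntervalMs
(* {12.5, 37.5, 62.5, 87.5} *)

(* hypothetical gate: the third sample falls inside a pause *)
exampleGate = {1.0, 1.0, 0.0, 1.0};

(* gate the levels, then normalize by the loudest level *)
exampleGate*exampleLevels/Max[exampleLevels]
(* {0.25, 1., 0., 0.125} *)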
In[]:=
video = With[
  {slides = Graphics[{If[# > 0.0, Black, White], Rectangle[{1, 0}, {1.5, #}]},
      PlotRange -> {{0, 2.5}, {0, 1.5}}] & /@ scaledAudioLevels},
  {videoNoSound = SlideShowVideo[slides, DefaultDuration -> speechDuration, FrameRate -> frameRate]},
  VideoCombine[{videoNoSound, kingEdwardVI1939Speech}]
]
Out[]=