A Matlab Toolbox for Extracting Musical Features from Audio Files: Chinese Translation (3)

2019-06-17 11:24

• The audio file can first be loaded using the miraudio function, which can perform diverse operations such as resampling, automated trimming of the silence at the beginning and/or at the end of the sequence, extraction of a given subsequence, centering, normalization with respect to RMS energy, etc.

a = miraudio('myfile', 'Sampling', 11025, ...
             'Trim', 'Extract', 2, 3, ...
             'Center', 'Normal') (33)
mirspectrum(a) (34)

• Batch analyses of audio files can be carried out by simply replacing the name of the audio file with the keyword 'Folder'.

mirspectrum('Folder') (35)

• Any vector v computed in Matlab can be converted into a waveform using, once again, the miraudio function, by specifying a sampling rate.

a = miraudio(v, 44100) (36)
mirspectrum(a) (37)

• Any feature extraction can be based on the result of a previous computation. For instance, the autocorrelation of a spectrum curve can be computed as follows:

s = mirspectrum(a) (38)
as = mirautocor(s) (39)

• Products of curves [10] can be computed easily:

mirautocor(a) * mirautocor(s) (40)

In this particular example, the waveform autocorrelation mirautocor(a) is automatically converted to the frequency domain in order to be combined with the spectrum autocorrelation mirautocor(s).
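The idea behind combining a waveform autocorrelation with a spectrum autocorrelation can be illustrated in base Matlab, without MIRtoolbox. The following is only a minimal sketch under stated assumptions, not the toolbox's implementation: the test tone, the sampling rate, the lag ranges and all variable names (fgrid, acSpec, acWave, acWaveOnF, combined) are hypothetical choices made for the illustration. Time lags are mapped to frequencies with f = fs / lag, so that both curves share one frequency axis before being multiplied.

```matlab
% Harmonic test tone with partials at 220, 440, 660 and 880 Hz (assumed values).
fs = 8000;                          % sampling rate in Hz
t  = (0:fs-1)' / fs;                % one second of samples
v  = sin(2*pi*220*t) + 0.5*sin(2*pi*440*t) ...
   + 0.3*sin(2*pi*660*t) + 0.2*sin(2*pi*880*t);

fgrid = (100:1000)';                % common frequency axis in Hz

% Spectrum autocorrelation: with a one-second signal, bin k+1 is k Hz,
% so a lag of n bins is a frequency lag of n Hz.
S = abs(fft(v));
S = S(1:fs/2);
acSpec = zeros(size(fgrid));
for i = 1:numel(fgrid)
    lag = fgrid(i);
    acSpec(i) = sum(S(1:end-lag) .* S(1+lag:end));
end

% Waveform autocorrelation over time lags of 1..80 samples.
lags = (1:80)';
acWave = zeros(size(lags));
for i = 1:numel(lags)
    acWave(i) = sum(v(1:end-lags(i)) .* v(1+lags(i):end));
end

% Convert time lags to frequencies (f = fs / lag) and resample the curve
% onto the common axis; flipud makes the frequencies increasing for interp1.
acWaveOnF = interp1(flipud(fs ./ lags), flipud(acWave), fgrid);

% Product of the two curves, both now indexed by frequency.
combined = acWaveOnF .* acSpec;
[~, k] = max(combined);
f0 = fgrid(k);                      % frequency of the strongest common peak
```

In this sketch the spectrum autocorrelation peaks at multiples of the partial spacing, while the waveform autocorrelation peaks at the period and its multiples (sub-multiples in frequency); the product retains only the peak the two curves share.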

4. MIRTOOLBOX COMPARISON TO MARSYAS

Marsyas is a framework written in C++ and Java for prototyping and experimentation with computer audition applications [1]. It provides a general architecture for connecting audio, sound files, signal processing blocks and machine learning. The architecture is based on dataflow programming, where computation is expressed as a network of processing nodes/components connected by a number of communication channels/arcs. Users can build their own dataflow network using a scripting language at run-time. Marsyas provides a framework for building applications rather than a set of applications [1]. Marsyas executables operate either on individual sound files or on collections, which are simple text files that contain lists of sound files. In general, collection files should contain sound files with the same sampling rate, as Marsyas does not perform automatic sampling-rate conversion (except between 44100 Hz and 22050 Hz). The results of feature extraction processes are stored by Marsyas as text files that can be used later in the Weka machine learning environment. In parallel, Marsyas integrates some basic machine learning components.

MIRtoolbox also offers the possibility of articulating processes one after the other in order to construct complex computations, using a simple and adaptive syntax. Contrary to Marsyas, though, MIRtoolbox does not offer real-time capabilities. On the other hand, its object-based architecture (paragraph 4.2) enables a significant simplification of the syntax. MIRtoolbox can also analyse folders of audio files, and can deal with folders containing files of varying sampling rates without having to perform any conversion. The data computed by MIRtoolbox can be further processed directly in the Matlab environment with the help of other toolboxes, or can be exported into text files.
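One reason why files of varying sampling rates need not be converted can be sketched in base Matlab: once each spectrum carries its own frequency axis in Hz, a feature such as the spectral peak is directly comparable across files. The example below is an illustration under stated assumptions and does not reproduce MIRtoolbox internals; the two sampling rates, the synthetic 440 Hz tone and the variable names (rates, peakHz) are hypothetical.

```matlab
% Two synthetic "recordings" of the same 440 Hz tone at different sampling rates.
rates  = [11025 22050];             % sampling rates in Hz (assumed values)
peakHz = zeros(size(rates));

for i = 1:numel(rates)
    fs = rates(i);
    t  = (0:fs-1)' / fs;            % one second of samples
    x  = sin(2*pi*440*t);

    S = abs(fft(x));
    S = S(1:floor(fs/2));           % keep non-negative frequencies only

    % Frequency of each bin in Hz, derived from this file's own sampling rate.
    f = (0:numel(S)-1)' * fs / numel(x);

    [~, k] = max(S);
    peakHz(i) = f(k);               % peak location expressed in Hz
end
```

Because the peak is expressed in Hz rather than in bin indices, both entries of peakHz refer to the same physical frequency even though the two signals have different lengths and bin counts.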

5. AVAILABILITY OF THE MIRTOOLBOX

Following our first Matlab toolbox, called MIDItoolbox [18], dedicated to the analysis of symbolic representations of music, the MIRtoolbox is offered for free to the research community. It can be downloaded from the following URL:


http://www.cc.jyu.fi/~lartillo/mirtoolbox

6. ACKNOWLEDGMENTS

This work has been supported by the European Commission (NEST project "Tuning the Brain for Music", code 028570). The development of the toolbox has benefitted from productive collaborations with the other partners of the project, in particular Tuomas Eerola, Jose Fornari, Marco Fabiani, and students of our department.


7. REFERENCES

[1] G. Tzanetakis and P. Cook, "Marsyas: A framework for audio analysis," Organised Sound, vol. 4, no. 3, 2000.

[2] M. Slaney, "Auditory toolbox version 2," Tech. Rep., Interval Research Corporation, 1998-010, 1998.

[3] I. Nabney, "NETLAB: Algorithms for pattern recognition," Springer Advances in Pattern Recognition Series, 2002.

[4] J. Vesanto, "Self-Organizing Map in Matlab: the SOM Toolbox," in Proceedings of the Matlab DSP Conference, 1999, pp. 35–40.

[5] G. Tzanetakis and P. Cook, "Multifeature audio segmentation for browsing and annotation," in Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1999.

[6] E. Pampalk, A. Rauber, and D. Merkl, "Content-based organization and visualization of music archives," in Proceedings of the 10th ACM International Conference on Multimedia, 2002, pp. 570–579.

[7] E. Terhardt, "On the perception of periodic sound fluctuations (roughness)," Acustica, vol. 30, no. 4, pp. 201–213, 1974.

[8] P. Boersma, "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," IFA Proceedings, vol. 17, pp. 97–110, 1993.

[9] T. Tolonen and M. Karjalainen, "A computationally efficient multipitch analysis model," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 6, pp. 708–716, 2000.

[10] G. Peeters, "Music pitch representation by periodicity measures based on combined temporal and spectral representations," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2006.

[11] L. Rabiner and B. H. Juang, Fundamentals of Speech Recognition, Prentice-Hall, 1993.

[12] E. Gomez, "Tonal description of polyphonic audio for music content processing," INFORMS Journal on Computing, vol. 18, no. 3, pp. 294–304, 2006.

[13] C. Krumhansl, Cognitive Foundations of Musical Pitch, Oxford University Press, 1990.


[14] C. Krumhansl and E. J. Kessler, "Tracing the dynamic changes in perceived tonal organization in a spatial representation of musical keys," Psychological Review, vol. 89, pp. 334–368, 1982.

[15] P. Toiviainen and C. Krumhansl, "Measuring and modeling real-time responses to music: The dynamics of tonality induction," Perception, vol. 32, no. 6, pp. 741–766, 2003.

[16] P. Toiviainen and J. S. Snyder, "Tapping to Bach: Resonance-based modeling of pulse," Music Perception, vol. 21, no. 1, pp. 43–80, 2003.

[17] J. Foote and M. Cooper, "Media segmentation using self-similarity decomposition," in Proceedings of SPIE Storage and Retrieval for Multimedia Databases, 2003, number 5021, pp. 167–175.

[18] T. Eerola and P. Toiviainen, "MIR in Matlab: The Midi Toolbox," in Proceedings of 5th International Conference on Music Information Retrieval, 2004, pp. 22–27.

[19] P. N. Juslin, "Emotional communication in music performance: A functionalist perspective and some data," Music Perception, vol. 14, pp. 383–418, 1997.

[20] K. R. Scherer and J. S. Oshinsky, "Cue utilization in emotion attribution from auditory stimuli," Motivation and Emotion, vol. 1
