A multimodal deep learning algorithm for polyphonic music applied to music sentiment analysis and generation

Qidi Sun, Hyuntae Kim

Abstract


Polyphonic music (PM) is music in which several melody lines are performed simultaneously. Integrating PM with Sentiment Analysis (SA) and music composition involves analysing and generating pieces in which several melodies sound at once. Advanced techniques, usually centred on Deep Learning (DL) methods, have been employed for this purpose. This study aims to provide a framework for performing and improving SA of PM. First, an analysis system is built that uses DL methods to improve the accuracy and sensitivity of sentiment detection in PM. The research examines the role of audio features such as Mel Frequency Cepstral Coefficients (MFCC) and Chroma. It also investigates whether dimensionality reduction approaches such as Stacked Autoencoders (SAE) improve PM-SA models and help with computationally demanding workloads. The proposed SA system, the multimodal deep learning (MDL) framework, is evaluated against traditional techniques. Accuracy, precision, recall, and F1-score are used to assess the MDL framework's ability to detect and classify sentiment states in PM.
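
As an illustration of the four evaluation measures named in the abstract, the following is a minimal sketch of how accuracy and macro-averaged precision, recall, and F1-score might be computed for a multi-class sentiment classifier. The function name `classification_metrics`, the choice of macro averaging, and the labels "happy" and "sad" are assumptions made for this example; the abstract does not state how the authors computed or averaged their scores.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Hypothetical helper for illustration only; macro averaging is an
    assumption, not something stated in the abstract.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per label.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Assumed example labels; not taken from the study's dataset.
    truth = ["happy", "happy", "sad", "sad"]
    predicted = ["happy", "sad", "sad", "sad"]
    print(classification_metrics(truth, predicted))
```

Macro averaging weights every sentiment class equally, which may be preferable when classes are imbalanced; a weighted or micro average would be an equally plausible choice.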


Keywords


sentiment analysis; polyphonic music; accuracy; deep learning; classification and performance; Mel frequency cepstral coefficients

DOI: https://doi.org/10.32629/jai.v7i5.1615


Copyright (c) 2024 Qidi Sun, Hyuntae Kim

License URL: https://creativecommons.org/licenses/by-nc/4.0/