Nguyễn Hoàng Bảo Đại, 27, is the initiator of an AI model that can generate ten melodies in one second. Photo courtesy of Nguyễn Hoàng Bảo Đại
Nguyễn Hoàng Bảo Đại, 27, is the third Vietnamese person to be honoured as a Google Developer Expert in the field of Machine Learning. He is also the creator of an AI model that can generate ten melodies in one second. Đại talks with Lương Hương about the project.
Inner Sanctum: Could you please introduce yourself?
My name is Nguyễn Hoàng Bảo Đại. I’m working as a natural language processing research scientist and attending a graduate programme at the National University of Singapore.
I was introduced to music at a very young age but only took it seriously at the age of 12, when I started taking piano lessons. At first, I didn’t want to take up music because I thought it was not a subject for boys, but I kept learning the piano to satisfy my father’s wishes. However, after being exposed to music for a long time and experiencing many things, I unconsciously fell for it. Music has now become an integral part of my life, making me happy and inspiring me in everything I do.
After finishing high school, I wanted to apply to the composition department of the HCM City Conservatory of Music, but my family did not encourage me to follow this path. My teachers also advised me to pursue maths or another natural science subject, at which I was pretty good at school.
In the end, I decided to apply for Information Technology like my best friend. I was really disoriented at that time and just wanted to go to university with a friend whom I could talk to and sing with. Unfortunately, it didn’t work out: I passed the entrance exam but my friend didn’t, so I had to study by myself. However, it was also the beginning of my career in Information Technology as well as in Artificial Intelligence.
Inner Sanctum: How did the idea of music-writing AI, still a new concept in Việt Nam, come to you?
The idea originated from my habit of writing music. I have always spent a lot of time writing melodies because I think that only good melodies make beautiful songs. Sometimes, I finish writing a melody but then abandon it because I’m not totally satisfied. I thought that if I could shorten the time needed to write melodies, the whole process would become faster and songwriters could release their works sooner.
At the same time, AI has been applied in various fields, so I started thinking about applying it to artistic creation, and specifically to assisting Vietnamese songwriters.
After two years, my current AI model can write a piece of music that lasts for 10 seconds, 5 minutes, or even longer. It takes a prime melody, which is a short piece of music, as input to capture the stylistic intent, then creates melodies exactly as I want.
For example, if I provide it with a gentle, mellow melody, I will get 10 soft, mellow songs in a lyrical ballad style. If I provide a piece of music that has a slightly faster tempo, I will get 10 songs with a dynamic vibe. Moreover, when I lack ideas, I can ask the AI model to generate songs on its own without any prime melody.
Inner Sanctum: What were your difficulties as a pioneer in the field in Việt Nam and how have you overcome them?
The first difficulty in building this AI model was a shortage of data, which I believe all AI engineers face. As I want to build a model that can generate music in the Vietnamese pop ballad style, I chose a training method based on supervised learning, which requires a lot of data. However, I cannot use the available music of Vietnamese artists because it is not my desired input. I can find and collect suitable data online, but it is insufficient and difficult to standardise because each source has its own format.
The second difficulty I encountered was human resources. I have carried out the whole project myself, as I consider it a personal project and have managed it on my own limited budget. I often hesitate to invite others to help with complicated issues because I can offer no guarantee of results and have no funding to support them.
Therefore, I decided to solve the problems myself. To get files in the desired format, I have to sit at a MIDI keyboard every day, play, and save each recording. Many a little makes a mickle, and after more than a year I think I have accumulated enough data to train the AI model.
Inner Sanctum: Are you now satisfied with the melodies generated by your AI model?
It has met about 80 per cent of my expectations. The composing speed is also impressive, as it needs only one second to generate 10 different melodies. This capacity has saved me a great deal of time in composing and finalising music.
I have written a lot of songs with the help of AI, but it takes many further stages to bring one to market, such as writing the lyrics, writing the arrangement, finding the singer, mixing, mastering and even PR/marketing, so for now they mainly serve my personal needs.
Inner Sanctum: What do you think of the possibility that music-writing AI will become increasingly popular and replace songwriters in the future?
I see great potential for AI applications in the music industry. Music marketing is currently a hot field. All brands want catchy music for their promotions and advertisements.
The production of promotional music includes a time-consuming phase, similar to my own composing stage: the client listens to the demos and chooses the best version. I believe this process takes up a lot of the composer's time because they have to keep creating music until it meets the customer's request. I think that using AI to write music and make proposals is the fastest and most effective approach because it will save labour and time for both sides.
In addition, music-writing AI will also be applied to shorten the production time of a music project. Currently, my AI model supports the process of composing pop tunes so that producers can spend more time on the rest of the project and create better final products.
However, I don't think it's possible for AI to replace musicians, or at least it will take a long time to happen. Each songwriter has his or her own musical style. As long as audiences still demand that style, AI will not replace the songwriter. Moreover, the training data is still human-generated, so AI still depends on people. VNS