As seen in the updated mindmap below, there is a lot going on internally when making music, but there is also the emotional influence on the listener, whether intended or not, by the maker. This makes for a more complete picture of the research field and includes all the different parts that make music in itself an interesting thing to study. Looking into the technical details of building an artificial music generator, there is a part that analyses data and captures learned features (the things melodic and song structure are made up of), and the actual generator part that uses those learned features and rules to generate music. In a setup with GANs, the generator can produce new data, while the discriminator is fed this output alongside real data to discriminate between the two. The generator is thereby attuned to produce different kinds of samples and learns ever better what the difference with the original data set is. The sequential aspect of music ma...
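To make that generator/discriminator interplay concrete, here is a minimal sketch of one GAN training step, assuming PyTorch and toy fully-connected networks over flattened piano-roll bars; the dimensions and network shapes are made up for illustration, and MidiNet's actual CNN architecture differs:

```python
import torch
import torch.nn as nn

# Hypothetical shapes: one bar as a 128-pitch x 16-step piano roll, flattened.
BAR_DIM, NOISE_DIM = 128 * 16, 100

# Toy stand-ins for the real networks; MidiNet uses CNNs instead of MLPs.
G = nn.Sequential(nn.Linear(NOISE_DIM, 512), nn.ReLU(),
                  nn.Linear(512, BAR_DIM), nn.Sigmoid())
D = nn.Sequential(nn.Linear(BAR_DIM, 512), nn.LeakyReLU(0.2),
                  nn.Linear(512, 1))  # raw logit: real vs. generated

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_bars):
    batch = real_bars.size(0)
    # 1) Discriminator: tell real bars apart from generated ones.
    fake_bars = G(torch.randn(batch, NOISE_DIM)).detach()
    loss_d = bce(D(real_bars), torch.ones(batch, 1)) + \
             bce(D(fake_bars), torch.zeros(batch, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # 2) Generator: fool the discriminator into labelling fakes as real.
    loss_g = bce(D(G(torch.randn(batch, NOISE_DIM))), torch.ones(batch, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

# Usage with random stand-in data in place of real piano-roll bars:
print(train_step(torch.rand(32, BAR_DIM)))
```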
I have looked into several directions; the CNN & GAN combination is still the most interesting to me. Referring to the MidiNet paper, I want to see whether I'm able to find certain restrictions on the generation model that enhance how much humans perceive the output as music they would want to hear. The so-called Rencon is an event where researchers gather to evaluate their results regarding music generation algorithms. This paper describes the different Rencon events, where mainly classical music is generated and evaluated. Their findings do not provide the golden standard for evaluating generative models that I hoped for, so I'll have to look in another direction for a solution there. After looking on the internet I found another paper depicting a Turing test for generative music algorithms, also called Rencon. It might prove useful, but for now I'll let it rest. Another approach was the human hearing model. I haven't really found any mathematics that describ...
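Until a better evaluation standard turns up, the Turing-test idea at least reduces to a simple statistic: how often listeners label a generated piece as human-made, and whether that rate is distinguishable from chance. A minimal sketch using only the standard library; the vote counts are made up:

```python
from math import comb

def binom_p_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test: probability of an outcome at least
    as unlikely as k 'human' votes out of n, if listeners guess at chance."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    return sum(pr for pr in pmf if pr <= pmf[k] + 1e-12)

# Hypothetical listening-test result: 54 of 100 judgements called the
# generated piece human-made.
k, n = 54, 100
print(f"'human' rate: {k/n:.2f}, "
      f"p-value vs. chance: {binom_p_two_sided(k, n):.3f}")
```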
Good gracious, I just found some more data: the Nottingham Database, a collection of ABC formatted music files. This format can be converted to MIDI and vice versa (see the conversion sketch below). Of course I'm facing a problem: the specific data I want, MIDI covering a lot of different genres, is not widely available. Therefore I have a couple of options:

- Train on actual MP3/OGG/WAV/FLAC music files, which is going to take forever, although the FMA data set offers 30s samples of its whole collection of songs.
- NSynth, a collection of single instruments, which is mostly suitable for synthesizing instruments and not especially for generating songs/music.
- The Nottingham Database, an ABC formatted database.

The most suitable solution comes in the ABC formatted database; there is more of it to find and I'm currently tracking down more data. However, there are some caveats along the way. I have found papers using all of these data sets, therefore it is very likely t...
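For the ABC to MIDI round trip mentioned above, here is a minimal sketch assuming the music21 library; the file names are hypothetical, and Nottingham ABC files typically bundle many tunes per file:

```python
from music21 import converter, stream

# Hypothetical file name; adjust to wherever the ABC files live.
parsed = converter.parse('nottingham/jigs.abc')

# A multi-tune ABC file parses into an Opus; a single tune into a Score.
tunes = parsed.scores if isinstance(parsed, stream.Opus) else [parsed]

for i, tune in enumerate(tunes):
    tune.write('midi', fp=f'jigs_{i:03d}.mid')  # one MIDI file per tune

# And back again: a MIDI file parses into a Score that can be re-exported.
roundtrip = converter.parse('jigs_000.mid')
```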