Abstract
State-of-the-art automatic drum transcription (ADT) approaches utilise deep learning methods reliant on time-consuming manual annotations and require congruence between training and testing data. When these conditions do not hold, they often fail to generalise. We propose a game approach to ADT, termed player vs transcriber (PvT), in which a player model aims to reduce the transcription accuracy of a transcriber model by manipulating training data in two ways. First, existing data may be augmented, allowing the transcriber to be trained using recordings with modified timbres. Second, additional individual recordings from sample libraries are included to generate rare combinations. We present three versions of the PvT model: AugExist, which augments pre-existing recordings; AugAddExist, which adds additional samples of drum hits to the AugExist system; and Generate, which generates training examples exclusively from individual drum hits from sample libraries. The three versions are evaluated alongside a state-of-the-art deep learning ADT system using two evaluation strategies. The results demonstrate that including the player network improves ADT performance and suggest that this is due to improved generalisability. The results also indicate that although the Generate model achieves relatively low results, it is a viable choice when annotations are not accessible.
Original language | English |
---|---|
Publication status | Published (VoR) - 27 Sept 2018 |