Great article, thanks.
I suppose it should also be said, even if it rather goes without saying, that questions about plagiarism in art are very difficult.
All works of art owe their existence to works that came before. No art is purely original. Standards of plagiarism are always somewhat arbitrary: they depend on human judgment and are very difficult to quantify.
Your concerns about exploiting data are critically important, but another problem that arises for me is removing human judgment from questions that are inherently more qualitative than quantitative.
The question becomes something like: do we want machines deciding which music is plagiarism and which is not, especially if human beings aren’t able to decide?
I think the answer for me is no, because that could end up restricting and hobbling art and artists.