Point taken, Dominik. You wrote:
One has two files. The first is the diplomatic transcription (karmma, vindu, adhiṣṭāna). The second is whatever one wants it to be, but it's interpretative or normalized.
I think there is another reason, in addition to all the reasons you gave, for doing what you suggest (i.e. the "first is the diplomatic transcription" and only then does one create a "normalized" file): deciding what's normal is sometimes a judgement call. There may be more than one norm. For example:
The Monier-Williams dictionary has pattra and chattra, but Apte's dictionary has patra and chatra.
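To make the point concrete, here is a rough sketch of how a single diplomatic file could feed more than one normalized file. Everything in it is an assumption for illustration: the NORMS table, the norm labels and the normalize function are hypothetical names, and only the pattra/patra and chattra/chatra spellings come from the dictionaries mentioned above.

```python
# Hypothetical spelling tables, one per norm. Only the pattra/patra and
# chattra/chatra pairs are taken from the message; the structure is assumed.
NORMS = {
    "monier-williams": {"patra": "pattra", "chatra": "chattra"},
    "apte": {"pattra": "patra", "chattra": "chatra"},
}


def normalize(tokens, norm):
    """Return a new list with each token respelled according to the chosen norm.

    Tokens that the table does not mention are passed through unchanged,
    and the diplomatic input list itself is never modified.
    """
    table = NORMS[norm]
    return [table.get(token, token) for token in tokens]


# The diplomatic transcription stays as the manuscript has it.
diplomatic = ["pattra", "chatra"]

# Two different "normalized" files derived from the same diplomatic one.
mw_normalized = normalize(diplomatic, "monier-williams")
apte_normalized = normalize(diplomatic, "apte")
```

Because the diplomatic list is left untouched, the choice of norm can be revisited later without going back to the manuscript.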
Harry Spier