Point taken, Dominik. You wrote:
 One has two files.  The first is the diplomatic transcription (karmma, vindu, adhiṣṭāna).  The second is whatever one wants it to be, but it's interpretative or normalized.

I think there is another reason, in addition to all the ones you gave, for making "the diplomatic transcription" first and only then creating a "normalized" file: deciding what is normal is sometimes a judgement call. There may be more than one norm. For example:
The Monier-Williams dictionary has pattra and chattra, but Apte's dictionary has patra and chatra.
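
To make this concrete, here is a minimal sketch in Python of how one diplomatic file could feed more than one normalized file. The names NORMS and normalize, and the two sample readings, are hypothetical and only for illustration. The spellings are the ones from the dictionary example above.

```python
# Hypothetical normalization tables, one per norm.
# Each maps a variant spelling to the spelling that norm prefers.
NORMS = {
    "monier-williams": {"patra": "pattra", "chatra": "chattra"},
    "apte": {"pattra": "patra", "chattra": "chatra"},
}


def normalize(tokens, norm):
    """Return a normalized copy of tokens; the diplomatic list is not modified."""
    table = NORMS[norm]
    return [table.get(token, token) for token in tokens]


# Assumed diplomatic readings: recorded exactly as found, so they stay mixed.
diplomatic = ["pattra", "chatra"]

print(normalize(diplomatic, "monier-williams"))  # ['pattra', 'chattra']
print(normalize(diplomatic, "apte"))             # ['patra', 'chatra']
print(diplomatic)                                # ['pattra', 'chatra']
```

Because the diplomatic readings are never overwritten, the choice of norm can be changed or revisited later without going back to the manuscript.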

Harry Spier