A new theory of data representation involving "formats", which are based on three recurrent and prominent data-structuring concepts, is introduced. In a mathematically rigorous way, a notion of "equivalent" information capacity is defined and shown to be natural in a wide range of contexts. A normal form is introduced, and each equivalence class of formats is shown to have a unique representative in normal form. Finally, a natural way of comparing the information capacity of (non-equivalent) formats is formalized and studied.