We propose DIInCX, an approach for discovering implicit semantic integrity constraints (SIC) from XML instances. DIInCX is a process composed of three phases: Preprocessing, Discovering and Conversion. Our motivation is to improve XML semantic data integration or XML information extraction systems by complementing their resulting XML schemata with SIC rules that a human user cannot explicitly perceive. We validate the approach through experiments showing that the discovered SIC rules are valid, human-readable and simple to implement, because they are based on simple restriction conditions.