One difficulty in analyzing large datasets to derive usable information is the risk of accidentally including personal or discriminatory data, which can make the results unusable under various laws. The authors propose a method for extracting usable data from a large dataset through the use of various theorems and algorithms. They demonstrate this methodology on two large datasets, compare the results of different methods for extracting the "clean" information, and analyze the viability of each method. Their methodology appears to achieve its goals: the extracted information is discrimination-free and privacy-preserving, with no significant loss of the desired information, and it is obtained in a reasonable amount of time.