首页> 外文会议>Annual Midwest Instruction and Computing Symposium >Item Retention Improvements to Dropsort, a Lossy Sorting Algorithm
【24h】

Item Retention Improvements to Dropsort, a Lossy Sorting Algorithm

机译:物品保留改进液滴,损失分选算法

获取原文

摘要

This paper presents the Dropsort algorithm and suggests improvements. Dropsort is a simple comparison sorting algorithm that performs in θ(n) time by simply removing those elements that are not already in the correct order. Because of this, it is not a true sorting algorithm as the output does not contain all the elements of the input. In other words it is a lossy "sorting algorithm" that performs far more quickly than true sorting algorithms. Much of the original data is lost when using this algorithm unless the numbers to sort are in a mostly ordered format already. If more than a few elements are far from their correct position nearly all of the elements are removed. Two major improvements are proposed that greatly broaden the applicability of the Dropsort algorithm. In the first improvement, a concept of recent memory is added to the algorithm. For example, if several elements are consecutively dropped, the preceding element is assumed to be far from ordered and is removed instead. This is shown to increase the usefulness of Dropsort by increasing the number of retained elements. With a specific type of introduced error, 10,000% more of the original elements are retained over standard Dropsort. In the second improvement, each element is compared to not only determine if it is in sorted order, but if the next element is in order as well. Only if both elements are in order is the item kept. It is important to note that neither of these suggested improvements greatly increase the complexity or execution time of the Dropsort algorithm. Finally, the applications and limitations of these improvements and Dropsort in general are explored. This improved algorithm has many applications to very quickly sort large data sets when specific elements are not as important as the assurance that the list is sorted. This could be the case when a sorted list is critical but the source providing the pre-sorted input is not trustworthy. Also, applications in fields with large data sets and where speed is of utmost importance are discussed.
机译:本文介绍了Dropsort算法,并提出了改进。 Dropsort是一种简单的比较分类算法,它通过简单地删除尚未以正确顺序的那些元素执行θ(n)时间。因此,由于输出不包含输入的所有元素,因此不是真正的排序算法。换句话说,它是一个有损的“排序算法”,它比真正的排序算法更快地执行。使用此算法时,除非要排序的数字以大多数有序格式,否则大部分原始数据都会丢失。如果几乎删除了几乎少数元素,则几乎所有元素都被删除。提出了两项​​重大改进,大大拓宽了Dropsort算法的适用性。在第一个改进中,将近期内存的概念添加到算法中。例如,如果连续丢弃若干元素,则假设前面的元素远离有序并被删除。这显示出通过增加保留元件的数量来增加液滴的有用性。具有特定类型的引入误差,10,000%的原始元素保留在标准Dropsort上。在第二个改进中,将每个元素进行比较,而不仅可以确定它是否处于排序顺序,而且如果下一个元素也是顺序的。仅当两个元素都是顺序的,只能保留项目。值得注意的是,这些建议的改进都没有大大提高Dropsort算法的复杂性或执行时间。最后,探讨了这些改进和液滴的应用和限制。当特定元素与列表排序的保证不那么重要时,这种改进的算法有许多应用程序非常快速地排序大数据集。当排序列表至关重要时,这可能是这种情况,但提供预先排序输入的源不值得信赖。此外,讨论了具有大数据集的字段中的应用程序以及速度最重要的位置。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号