Efficiency of data structures for detecting overlaps in digital documents


Abstract

This paper analyses the efficiency of different data structures for detecting overlap in digital documents. Most existing approaches use some hash function to reduce the space requirements for their indices of chunks. Since a hash function can produce the same value for different chunks, false matches are possible. In this paper we propose an algorithm that can be used for eliminating those false matches. This algorithm uses a suffix tree structure, which is space consuming. We define a modified suffix tree that only considers chunks starting at the beginning of words and we show how the algorithm can work on this structure. We can alternatively reduce space requirements of a suffix tree by converting it to a directed acyclic graph. We show that suffix link information can be preserved in this new structure and the matching statistics algorithm still works with those modifications that we propose.
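The ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes whitespace-separated words as the unit of comparison, uses an uncompressed trie of word-start suffixes in place of a real suffix tree, and computes matching statistics by restarting from the root at every position rather than following suffix links. All function names (`char_sum_hash`, `build_word_suffix_trie`, `occurs`, `matching_statistics`) are hypothetical, and the hash is deliberately weak so that a collision is easy to reproduce.

```python
from typing import Dict, List

Trie = Dict[str, "Trie"]


def char_sum_hash(chunk: List[str], buckets: int = 1024) -> int:
    """Deliberately weak chunk hash (assumption): sum of character codes.
    Any two chunks made of the same characters collide."""
    return sum(ord(c) for c in " ".join(chunk)) % buckets


def build_word_suffix_trie(words: List[str]) -> Trie:
    """Insert every suffix that starts at a word boundary.
    Quadratic in the number of words; a real suffix tree would be compressed."""
    root: Trie = {}
    for start in range(len(words)):
        node = root
        for word in words[start:]:
            node = node.setdefault(word, {})
    return root


def occurs(trie: Trie, chunk: List[str]) -> bool:
    """Exact check: does this word sequence occur in the indexed document?"""
    node = trie
    for word in chunk:
        if word not in node:
            return False
        node = node[word]
    return True


def matching_statistics(trie: Trie, query: List[str]) -> List[int]:
    """For each word position in the query, the length (in words) of the
    longest run starting there that also occurs in the indexed document."""
    stats = []
    for start in range(len(query)):
        node, length = trie, 0
        for word in query[start:]:
            if word not in node:
                break
            node = node[word]
            length += 1
        stats.append(length)
    return stats


indexed = "please listen to the quick brown fox".split()
trie = build_word_suffix_trie(indexed)

real_chunk = ["listen", "to"]
false_chunk = ["silent", "to"]  # same characters, so the weak hash collides

# The hash alone reports both chunks as candidate matches.
print(char_sum_hash(real_chunk) == char_sum_hash(false_chunk))
# The exact structure keeps the true match and rejects the false one.
print(occurs(trie, real_chunk), occurs(trie, false_chunk))
# Overlap lengths at each word start of a suspicious document.
print(matching_statistics(trie, "a quick brown fox sleeps".split()))
```

In this sketch the hash index would only nominate candidates, and the trie lookup plays the role of the exact check that discards false matches. Restricting the index to word-start suffixes reduces the number of inserted suffixes from one per character to one per word, which is the space saving the abstract describes for the modified suffix tree.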


Bibliographic record

Source
Proceedings of the 24th Australasian conference on Computer science | 2001 | pp. 140-147 | 8 pages
Conference location: Gold Coast (AU)
Authors
Krisztian Monostori; Arkady Zaslavsky; Heinz Schmidt
Author affiliation

Monash University, Melbourne, Australia;

Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology

