Table Header Detection Using Global Machine Learning Features from Orthogonal Rows and Columns
Abstract
A method, system, and computer-usable medium for detecting headers in documents such as PDF and HTML files. The files are converted to a two-dimensional array or table having orthogonal rows and columns, and either the rows or the columns are determined to include headers. To determine whether rows include headers, a pairwise comparison is performed, for each row in the array or table, on each cell of each column orthogonal to that row. The pairwise comparison scores are summed for each orthogonal column, and the sum across all columns orthogonal to the row gives a score for that row. Row scores are evaluated relative to one another to determine the likelihood that a row contains headers. To determine whether columns have headers, a similar calculation is performed between the columns and their orthogonal rows.
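
The row-scoring procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how two cells are compared, so the `cell_kind` and `dissimilarity` helpers below are hypothetical stand-ins (a coarse text/number/empty check), and the reading that each cell in a row is compared against the other cells of its column is likewise an assumption. The function names `row_scores` and `column_scores` are illustrative and do not come from the patent.

```python
def cell_kind(cell: str) -> str:
    """Hypothetical coarse cell type: 'empty', 'number', or 'text'."""
    text = cell.strip()
    if not text:
        return "empty"
    try:
        float(text.replace(",", ""))  # tolerate thousands separators
        return "number"
    except ValueError:
        return "text"


def dissimilarity(a: str, b: str) -> int:
    """Hypothetical pairwise comparison: 1 if the cells differ in kind, else 0."""
    return int(cell_kind(a) != cell_kind(b))


def row_scores(table: list[list[str]]) -> list[int]:
    """Score each row by comparing its cells with the rest of their columns.

    For row r and each orthogonal column c, the cell (r, c) is compared with
    every other cell in column c. The comparisons are summed per column, and
    the column sums are added up to give the score of row r.
    """
    n_rows = len(table)
    n_cols = len(table[0])
    scores = []
    for r in range(n_rows):
        total = 0
        for c in range(n_cols):
            column_sum = sum(
                dissimilarity(table[r][c], table[other][c])
                for other in range(n_rows)
                if other != r
            )
            total += column_sum
        scores.append(total)
    return scores


def column_scores(table: list[list[str]]) -> list[int]:
    """Apply the same calculation to columns by transposing the table."""
    transposed = [list(col) for col in zip(*table)]
    return row_scores(transposed)


# Made-up example table, not taken from the patent.
table = [
    ["Name", "Age", "Salary"],
    ["Alice", "34", "52,000"],
    ["Bob", "41", "61,500"],
    ["Carol", "29", "48,250"],
]

print(row_scores(table))     # [6, 2, 2, 2]
print(column_scores(table))  # [6, 3, 3]
```

In this toy table the first row and the first column score highest relative to their peers, which is the kind of relative evaluation the abstract describes. A real system would presumably use richer comparison features and a learned decision rule rather than a single type check.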