Mathematics and Scientific Markup


Abstract

The development of e-Science (cyberScience, Grid, etc.) is starting to become a reality with formalised data resources, services on demand, domain-specific search engines, digital repositories, etc. Increasingly, STM information will be contained in compound XML documents representing scientific communication (articles, theses, repository entries, etc.). In physical sciences such as chemistry, materials science, engineering, physics and earth sciences, these "datuments" [1] normally contain hypertext, graphics, tables, graphs and numerical data, and mathematical objects and relationships. In addition, they may contain domain-specific content such as chemical formulae and reactions, thermodynamic and mechanical properties, and electric, magnetic and optical properties. Among the domain-specific languages, CML (Chemical Markup Language) is the oldest and broadest, and is now being actively used for publishing by the Royal Society of Chemistry (Project Prospect [2]), which gives an idea of what chemistry in datuments can look like. CML has had to develop the domain-specific objects (molecules, atoms, bonds, spectra, crystallography, etc.) and the relationships between them. However, due to the text-based nature of early XML, it has also had to design and implement a domain-independent infrastructure which can support much of physical science. Originally called STMML [3], it supports data types (float, integer, complex, etc.), data structures (arrays, lists, matrices, etc.), geometrical concepts (points, planes, lines, etc.) and scientific units of measurement. In addition, CML bases much of its flexibility on user-created dictionaries (ontologies) which are hyperlinked from objects in the datuments. It is now clear that the domain-independent parts of CML (and by extension some other markup languages in physical science) are loosely isomorphic with approaches in MathML and OMDoc.
If a synthesis can be found, then CML can happily forget about the "non-chemistry", knowing that the mathematical and physical science community has a general way forward. In easiest-first order, the following are suggested: (1) Mathematical variables and equations in chemical documents. An obvious challenge is that the variables represent types, often physical quantities (but also chemical objects such as atomTypes). This would be one of the first areas to explore with publishers. (2) Graphs and tables. A high proportion of graphs are functions of one or more dependent variables against one or more independent variables, currently supported by >. (3) Dictionaries. The CML dictionaries and OMDoc content dictionaries seem fairly similar in approach. (4) Mathematical relationships. A large area of physical science is based on theoretically and experimentally validated relationships which have been proved over many years (e.g. Maxwell's equations in thermodynamics). Often a quantity can be most easily determined by measuring different ones and transforming them. However, most transformations are currently hidden in procedural, non-portable code, and it would be an exciting challenge to create a self-consistent declarative model of parts of physical science. It would be very exciting to have a discovery engine which could, on demand, decide which quantities were deducible from which (with similarity to theorem proving). A major challenge for distributed mathematics and science is discovery through search engines. These currently work on "free text" and are optimised to recognise strings. In a few cases domain-specific canonicalisations can be used (e.g. our Google InChI [4] transforms a molecular graph into a string which is recognised by search engines). However, most cases require mathematical operations (arithmetic, transformations, subgraph-matching, etc.). How - and where - can these be performed?
A new generation of domain-independent and domain-specific indexing and searching tools needs to be developed. Recently CML has had to evolve a grammar to support fuzzy c
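
The "discovery engine" idea in point (4) of the abstract, deciding which quantities are deducible from which, can be pictured as forward chaining over declarative relationships. The sketch below is only an illustration under stated assumptions: the function `deducible`, the `RELATIONS` table and the quantity names are all hypothetical and are not part of CML, STMML, MathML or OMDoc. Each relationship is reduced to "these inputs determine that output", with no arithmetic or units attached.

```python
# Hypothetical illustration: none of these names come from CML, STMML,
# MathML or OMDoc. A relationship is modelled only as (inputs, output).
from typing import FrozenSet, Iterable, List, Set, Tuple

Relation = Tuple[FrozenSet[str], str]


def deducible(known: Iterable[str], relations: List[Relation]) -> Set[str]:
    """Return every quantity reachable from `known` by forward chaining."""
    derived = set(known)
    changed = True
    while changed:
        changed = False
        for inputs, output in relations:
            # Fire a relationship once all of its inputs are available.
            if output not in derived and inputs <= derived:
                derived.add(output)
                changed = True
    return derived


# Toy relationships, assumed purely for illustration.
RELATIONS: List[Relation] = [
    (frozenset({"mass", "volume"}), "density"),
    (frozenset({"mass", "molar_mass"}), "amount"),
    (frozenset({"amount", "temperature", "volume"}), "pressure"),
]

if __name__ == "__main__":
    measured = {"mass", "volume", "molar_mass"}
    print(sorted(deducible(measured, RELATIONS)))
```

With only mass, volume and molar mass measured, the sketch derives density and amount but not pressure, because temperature is missing. A real engine of the kind the abstract envisages would also need the equations themselves, units and dictionary references; this sketch covers only the reachability question.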


Bibliographic details

Source: Symposium on Computer Algebra Systems and Automated Deduction Systems, 2007, 2 pages
Author: Peter Murray-Rust
Original format: PDF
Chinese Library Classification: computing technology, computer technology

Similar literature

1. Earth Science Markup Language (ESML): a solution for scientific data-application interoperability problem [J]. Rahul Ramachandran, Sara Graves, Helen Conover. Computers & Geosciences. 2004, Issue 1
2. STMML. A markup language for scientific, technical and medical publishing [J]. Henry S. Rzepa, Peter Murray-Rust. Data Science Journal. 2002, Issue 2002
3. Using MathML Parallel Markup Corpora for Semantic Enrichment of Mathematical Expressions [J]. Minh-Quoc Nghiem, Giovanni Yoko Kristianto, Akiko Aizawa. IEICE Transactions on Information and Systems. 2013, Issue 8
4. Using Mathematical and Scientific Markup as an Approach to Model Specification [C]. Joseph B. Collins. Grand Challenges in Modeling Simulation Conference. 2008
5. The learning and use of markup languages: An experimental investigation of the impacts of user interface, help system, and user background on learning a markup language [D]. Hsu, Jeffrey. 2000
6. How to Eliminate Uncertainty in Clinical Medicine – Clues from Creation of Mathematical Models Followed by Scientific Data Mining [O]. Yoshihiro Asano. 2018
7. OMDoc: an open markup format for mathematical documents (version 1.2) [O]. Michael Kohlhase. 2006
