This paper proposes a clause-based extractive summarization algorithm that ranks and extracts semantic clauses from the original document. Discourse structure relations are useful for identifying the semantically important parts of a source document. We segment the document into clauses, evaluate the importance of each clause based on its semantic relations, rank and extract the clauses coarsely, and then use graph rank to refine the extracted clauses. This approach produces a more concise summary with more information and less redundancy. The research reaches the following results: 1) compared with other summarization algorithms operating at different granularities, clause-based summarization achieves a higher recall score; and 2) different discourse relations differ in importance.
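The pipeline described above (segment into clauses, score by discourse relation, extract coarsely, refine by graph rank) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the punctuation-based segmenter, the cue words and weights in `RELATION_WEIGHTS`, the Jaccard word-overlap similarity, and the TextRank-style power iteration standing in for the graph rank step, as well as all function names and the parameters `coarse_k` and `final_k`.

```python
import re

# Hypothetical cue words with hypothetical relation weights (not from the paper).
RELATION_WEIGHTS = {"because": 1.5, "therefore": 1.4, "but": 1.2, "although": 0.8}


def split_clauses(text):
    """Naive clause segmentation on punctuation; a stand-in for a real discourse segmenter."""
    parts = re.split(r"[.;,!?]+", text)
    return [p.strip() for p in parts if p.strip()]


def tokens(clause):
    """Lowercased word tokens of a clause."""
    return re.findall(r"[a-z']+", clause.lower())


def relation_score(clause):
    """Coarse importance: the largest weight among cue words present, 1.0 if none."""
    weights = [RELATION_WEIGHTS[w] for w in tokens(clause) if w in RELATION_WEIGHTS]
    return max(weights, default=1.0)


def similarity(a, b):
    """Jaccard word overlap between two clauses (an assumed edge weight)."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def graph_rank(clauses, damping=0.85, iterations=50):
    """TextRank-style power iteration over the clause similarity graph."""
    n = len(clauses)
    if n == 0:
        return []
    sim = [[similarity(clauses[i], clauses[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to the edge weight.
            incoming = sum(scores[j] * sim[j][i] / out_sum[j]
                           for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(text, coarse_k=4, final_k=2):
    """Coarse extraction by relation score, then refinement by graph rank."""
    clauses = split_clauses(text)
    # Coarse step: keep the coarse_k clauses with the highest relation score.
    coarse = sorted(range(len(clauses)), key=lambda i: -relation_score(clauses[i]))[:coarse_k]
    candidates = [clauses[i] for i in sorted(coarse)]
    # Refinement step: keep the final_k candidates with the highest graph rank.
    ranks = graph_rank(candidates)
    best = sorted(range(len(candidates)), key=lambda i: -ranks[i])[:final_k]
    # Emit the selected clauses in their original document order.
    return [candidates[i] for i in sorted(best)]
```

Calling `summarize(document_text, coarse_k=4, final_k=2)` returns at most `final_k` clauses in document order. In a fuller system the cue-word lookup would presumably be replaced by a discourse parser that labels the relation holding between clauses.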