Automatic Summarization of Open-Domain Podcast Episodes

Song, Kaiqiang; Li, Chen; Wang, Xiaoyang; Yu, Dong; Liu, Fei

arXiv:2011.04132 (cs)

[Submitted on 9 Nov 2020 (v1), last revised 12 Nov 2020 (this version, v2)]

Abstract: We present implementation details of our abstractive summarizers that achieve competitive results on the Podcast Summarization task of TREC 2020. A concise textual summary that captures important information is crucial for users deciding whether to listen to a podcast. Prior work focuses primarily on learning contextualized representations. Instead, we investigate several less-studied aspects of neural abstractive summarization, including (i) the value of selecting important segments from transcripts to serve as input to the summarizer; (ii) striking a balance between the amount and quality of training instances; (iii) the appropriate summary length and start/end points. We highlight the design considerations behind our system and offer key insights into the strengths and weaknesses of neural abstractive systems. Our results suggest that identifying important segments from transcripts to use as input to an abstractive summarizer is advantageous for summarizing long documents. Our best system achieves a quality rating of 1.559 as judged by NIST evaluators, an absolute increase of 0.268 (+21%) over the creator descriptions.
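The reported gain can be checked with a little arithmetic. The sketch below is illustrative only and is not from the paper: it assumes the +21% figure is the absolute increase divided by the creator-description rating, and that this baseline rating equals the system rating minus the absolute increase. The function name `relative_gain` is hypothetical.

```python
def relative_gain(system_score, absolute_gain):
    """Return (baseline, relative gain) implied by a score and its absolute gain.

    Assumption (not stated in the abstract): the baseline is the system score
    minus the absolute gain, and the percentage is measured against that baseline.
    """
    baseline = system_score - absolute_gain
    return baseline, absolute_gain / baseline


# Figures taken from the abstract: rating 1.559, absolute increase 0.268.
baseline, rel = relative_gain(1.559, 0.268)
print(f"implied baseline = {baseline:.3f}, relative gain = {rel:.1%}")
```

Under these assumptions the implied rating of the creator descriptions is about 1.291, and the relative gain rounds to the 21% quoted in the abstract.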

Comments:	TREC 2020
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2011.04132 [cs.CL]
	(or arXiv:2011.04132v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2011.04132

Submission history

From: Fei Liu
[v1] Mon, 9 Nov 2020 01:31:05 UTC (4,859 KB)
[v2] Thu, 12 Nov 2020 17:34:35 UTC (4,861 KB)
