Computer Science > Computation and Language
[Submitted on 17 Jul 2024]
Title:A Novel Dependency Framework for Enhancing Discourse Data Analysis
View PDF HTML (experimental)Abstract:The development of different theories of discourse structure has led to the establishment of discourse corpora based on these theories. However, the existence of discourse corpora established on different theoretical bases creates challenges when it comes to exploring them in a consistent and cohesive way. This study has as its primary focus the conversion of PDTB annotations into dependency structures. It employs refined BERT-based discourse parsers to test the validity of the dependency data derived from the PDTB-style corpora in English, Chinese, and several other languages. By converting both PDTB and RST annotations for the same texts into dependencies, this study also applies ``dependency distance'' metrics to examine the correlation between RST dependencies and PDTB dependencies in English. The results show that the PDTB dependency data is valid and that there is a strong correlation between the two types of dependency distance. This study presents a comprehensive approach for analyzing and evaluating discourse corpora by employing discourse dependencies to achieve unified analysis. By applying dependency representations, we can extract data from PDTB, RST, and SDRT corpora in a coherent and unified manner. Moreover, the cross-linguistic validation establishes the framework's generalizability beyond English. The establishment of this comprehensive dependency framework overcomes limitations of existing discourse corpora, supporting a diverse range of algorithms and facilitating further studies in computational discourse analysis and language sciences.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.