A web-based platform for annotating and querying knowledge graphs
Saṅgrāhaka is a web-based platform for annotating and querying knowledge graphs. It supports structured interaction with graph data, including entity and relation annotation, and provides a natural-language querying interface backed by Neo4j.
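As a rough illustration of how a templatized natural-language question could map onto a Neo4j (Cypher) query, here is a minimal Python sketch. The `QueryTemplate` class, the template wording, and the graph schema (a `name` property on nodes) are assumptions made for illustration and are not taken from Saṅgrāhaka's actual implementation.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryTemplate:
    """Hypothetical pairing of a question pattern with a parameterized Cypher query."""

    question: str  # natural-language pattern with {slot} placeholders
    cypher: str    # Cypher query that refers to the same slots as $parameters

    def slots(self):
        # Slot names are the identifiers inside curly braces in the question.
        return re.findall(r"\{(\w+)\}", self.question)

    def render(self, **values):
        # Refuse to render if any slot in the question has no value.
        missing = [s for s in self.slots() if s not in values]
        if missing:
            raise ValueError(f"missing slot values: {missing}")
        question = self.question.format(**values)
        params = {s: values[s] for s in self.slots()}
        # The Cypher text is returned unchanged; values travel as parameters,
        # so user input is never spliced into the query string.
        return question, self.cypher, params


# Assumed example template; the schema and wording are illustrative only.
TEMPLATE = QueryTemplate(
    question="Which entities are related to {entity} by {relation}?",
    cypher=(
        "MATCH (a {name: $entity})-[r]->(b) "
        "WHERE type(r) = $relation "
        "RETURN b.name"
    ),
)

if __name__ == "__main__":
    q, cypher, params = TEMPLATE.render(entity="X", relation="HAS_PROPERTY")
    print(q)
    print(cypher)
    print(params)
```

The returned Cypher string and parameter dictionary could then be passed to a Neo4j client. Keeping the slot values as query parameters, rather than formatting them into the Cypher text, is a deliberate choice in this sketch.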
Knowledge bases (KBs) are an important resource in a number of natural language processing (NLP) and information retrieval (IR) tasks, such as semantic search and automated question answering. They are also useful for researchers trying to gain information from a text. However, the state of the art in Sanskrit NLP does not yet allow automated construction of knowledge bases, because the necessary tools and methods are either unavailable or insufficiently accurate. Thus, in this work, we describe our efforts on manual annotation of Sanskrit text for the purpose of knowledge graph (KG) creation. We choose the chapter Dhānyavarga from Bhāvaprakāśanighaṇṭu of the Ayurvedic text Bhāvaprakāśa for annotation. The constructed knowledge graph contains 410 entities and 764 relationships. Since Bhāvaprakāśanighaṇṭu is a technical glossary text that describes various properties of different substances, we develop an elaborate ontology to capture the semantics of the entity and relationship types present in the text. To query the knowledge graph, we design 31 query templates that cover most of the common question patterns. For both manual annotation and querying, we customize the Sangrahaka framework that we developed previously. The entire system, including the dataset, is available from https://sanskrit.iitk.ac.in/ayurveda. We hope that the knowledge graph we have created through manual annotation and subsequent curation will help in the development and testing of NLP tools in the future, as well as in the study of the Bhāvaprakāśanighaṇṭu text.
@inproceedings{terdalkar2023semantic,
  title={Semantic Annotation and Querying Framework based on Semi-structured Ayurvedic Text},
  author={Terdalkar, Hrishikesh and Bhattacharya, Arnab and Dubey, Madhulika and Ramamurthy, S and Singh, Bhavna Naneria},
  booktitle={Proceedings of the Computational {S}anskrit {\&} Digital Humanities: Selected papers presented at the 18th World {S}anskrit Conference},
  month=jan,
  year={2023},
  address={Canberra, Australia (Online mode)},
  publisher={Association for Computational Linguistics},
  url={https://aclanthology.org/2023.wsc-csdh.11},
  pages={155--173},
}
NYCIKS ’23
Āyurjñānam: Exploring Āyurveda using Knowledge Graphs
The Bṛhat-Trayī, consisting of Carakasaṃhitā, Suśrutasaṃhitā, and Aṣṭāṅgahṛdaya, is an encyclopaedic reference set in Āyurveda. However, the need for simpler texts led to the emergence of the Laghu-Trayī, which includes Mādhavanidāna, Śārṅgadharasaṃhitā, and Bhāvaprakāśa. Authored by Ācārya Bhāvamiśra in the 16th century CE, Bhāvaprakāśa is a comprehensive work focused on medicine. The classification system of varga in its nighaṇṭu section, Bhāvaprakāśanighaṇṭu, categorizes substances based on type, origin, and medicinal properties. This valuable resource assists practitioners and researchers in Āyurveda. We present this information in an accessible manner to promote wider utilization of this knowledge. We create a robust ontology to capture the semantic information of medicinal substances, design user-friendly interfaces for efficient annotation and curation, perform meticulous manual annotation of Bhāvaprakāśanighaṇṭu, and construct an accurate knowledge graph from three of its chapters. The system is accessible at https://sanskrit.iitk.ac.in/ayurveda/.
@misc{terdalkar2023ayurjnanam,
  title={{Āyurjñānam}: Exploring {Āyurveda} using Knowledge Graphs},
  author={Terdalkar, Hrishikesh and Deulgaonkar, Vishakha and Bhattacharya, Arnab},
  note={Presented at the National Youth Conference on Indian Knowledge Systems 2023},
  year={2023},
  url={https://sanskrit.iitk.ac.in/ayurveda/},
}
2021
ESEC/FSE ’21
Sangrahaka: A Tool for Annotating and Querying Knowledge Graphs
In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece, 2021
We present a web-based tool, Sangrahaka, for annotating entities and relationships from text corpora towards construction of a knowledge graph and subsequent querying using templatized natural language questions. The application is language and corpus agnostic, but can be tuned for specific needs of a language or a corpus. The application is freely available for download and installation. Besides having a user-friendly interface, it is fast, supports customization, and is fault tolerant on both the client and server side. It outperforms other annotation tools on an objective evaluation metric. The framework has been successfully used in two annotation tasks.
@inproceedings{terdalkar2021sangrahaka,
  title={Sangrahaka: A Tool for Annotating and Querying Knowledge Graphs},
  author={Terdalkar, Hrishikesh and Bhattacharya, Arnab},
  booktitle={Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
  year={2021},
  publisher={Association for Computing Machinery},
  url={https://doi.org/10.1145/3468264.3473113},
  doi={10.1145/3468264.3473113},
  pages={1520--1524},
  location={Athens, Greece},
}