Saṅgrāhaka

A web-based platform for annotating and querying knowledge graphs

Saṅgrāhaka supports structured interaction with graph data, including entity and relation annotation, and provides a natural-language querying interface backed by Neo4j.

The platform has been used for building and exploring Sanskrit and Ayurvedic knowledge graphs, and served as the basis for the Natural Language Interface for Knowledge Graph Querying project.

References

2023

WSC ’23
Semantic Annotation and Querying Framework based on Semi-structured Ayurvedic Text

Hrishikesh Terdalkar, Arnab Bhattacharya, Madhulika Dubey, and 2 more authors

In Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference, Jan 2023

Knowledge bases (KB) are an important resource in a number of natural language processing (NLP) and information retrieval (IR) tasks, such as semantic search, automated question-answering etc. They are also useful for researchers trying to gain information from a text. Unfortunately, however, the state-of-the-art in Sanskrit NLP does not yet allow automated construction of knowledge bases due to unavailability or lack of sufficient accuracy of tools and methods. Thus, in this work, we describe our efforts on manual annotation of Sanskrit text for the purpose of knowledge graph (KG) creation. We choose the chapter Dhānyavarga from Bhāvaprakāśanighaṇṭu of the Ayurvedic text Bhāvaprakāśa for annotation. The constructed knowledge graph contains 410 entities and 764 relationships. Since Bhāvaprakāśanighaṇṭu is a technical glossary text that describes various properties of different substances, we develop an elaborate ontology to capture the semantics of the entity and relationship types present in the text. To query the knowledge graph, we design 31 query templates that cover most of the common question patterns. For both manual annotation and querying, we customize the Sangrahaka framework previously developed by us. The entire system including the dataset is available from https://sanskrit.iitk.ac.in/ayurveda. We hope that the knowledge graph that we have created through manual annotation and subsequent curation will help in development and testing of NLP tools in future as well as studying of the Bhāvaprakāśanighaṇṭu text.
@inproceedings{terdalkar2023semantic,
  title = {Semantic Annotation and Querying Framework based on Semi-structured Ayurvedic Text},
  author = {Terdalkar, Hrishikesh and Bhattacharya, Arnab and Dubey, Madhulika and Ramamurthy, S and Singh, Bhavna Naneria},
  booktitle = {Proceedings of the Computational {S}anskrit {\&} Digital Humanities: Selected papers presented at the 18th World {S}anskrit Conference},
  month = jan,
  year = {2023},
  address = {Canberra, Australia (Online mode)},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2023.wsc-csdh.11},
  pages = {155--173},
}

NYCIKS ’23
Āyurjñānam: Exploring Āyurveda using Knowledge Graphs

Hrishikesh Terdalkar, Vishakha Deulgaonkar, and Arnab Bhattacharya

2023

Presented at the National Youth Conference on Indian Knowledge Systems 2023

Best Poster Award

The Bṛhat-Trayī, consisting of Carakasaṃhitā, Suśrutasaṃhitā, and Aṣṭāṅgahṛdaya, is an encyclopaedic reference set in Āyurveda. However, the need for simpler texts led to the emergence of the Laghu-Trayī that includes Mādhavanidāna, Śārṅgadharasaṃhitā, and Bhāvaprakāśa. Authored by Ācārya Bhāvamiśra in the 16th century CE, Bhāvaprakāśa is a comprehensive work focused on medicine. The classification system of varga in its nighaṇṭu section, Bhāvaprakāśanighaṇṭu, categorizes substances based on type, origin, and medicinal properties. This valuable resource assists practitioners and researchers in Āyurveda. We present this information in an accessible manner to promote wider utilization of this knowledge. We create a robust ontology to capture the semantic information of medicinal substances, design user-friendly interfaces for efficient annotation and curation, perform meticulous manual annotation on Bhāvaprakāśanighaṇṭu, and construct an accurate knowledge graph from three chapters of Bhāvaprakāśanighaṇṭu. The system is accessible at https://sanskrit.iitk.ac.in/ayurveda/.
@misc{terdalkar2023ayurjnanam,
  title = {{Āyurjñānam}: Exploring {Āyurveda} using Knowledge Graphs},
  author = {Terdalkar, Hrishikesh and Deulgaonkar, Vishakha and Bhattacharya, Arnab},
  note = {Presented at the National Youth Conference on Indian Knowledge Systems 2023},
  year = {2023},
  url = {https://sanskrit.iitk.ac.in/ayurveda/},
}

2021

ESEC/FSE ’21
Sangrahaka: A Tool for Annotating and Querying Knowledge Graphs

Hrishikesh Terdalkar and Arnab Bhattacharya

In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece, 2021

Best Software Award at the 57th Convocation, IITK

We present a web-based tool Sangrahaka for annotating entities and relationships from text corpora towards construction of a knowledge graph and subsequent querying using templatized natural language questions. The application is language and corpus agnostic, but can be tuned for specific needs of a language or a corpus. The application is freely available for download and installation. Besides having a user-friendly interface, it is fast, supports customization, and is fault tolerant on both client and server side. It outperforms other annotation tools in an objective evaluation metric. The framework has been successfully used in two annotation tasks.
@inproceedings{terdalkar2021sangrahaka,
  title = {Sangrahaka: A Tool for Annotating and Querying Knowledge Graphs},
  author = {Terdalkar, Hrishikesh and Bhattacharya, Arnab},
  booktitle = {Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
  year = {2021},
  publisher = {Association for Computing Machinery},
  url = {https://doi.org/10.1145/3468264.3473113},
  doi = {10.1145/3468264.3473113},
  pages = {1520--1524},
  location = {Athens, Greece},
}