Non-Traditional Research Outputs in Linguistics in the Digital Era: A Case Study of VerbInd, a Corpus-Based Database of Indonesian Verbs

Authors

Gede Primahadi Wijaya Rajeg, Universitas Udayana

DOI:

https://doi.org/10.24843/JH.2026.v30.i01.p05

Keywords:

digital humanities, computational linguistics, corpus linguistics, quantitative linguistics, linguistic database, lexical database

Abstract

The 21st-century research ecosystem highlights the importance of non-traditional outputs such as datasets and software, alongside articles and books. In the Humanities, digital technologies enable data to be presented more openly through online databases. This paper introduces VerbInd, an open-access online database of morphologically complex verbs in Indonesian, built from corpus texts and processed through computational morphological parsing with further manual verification and editing. The paper demonstrates the use of VerbInd in quantitative research on Indonesian verbal morphology and morphosyntax (particularly voice alternation), resulting in traditional outputs, namely book chapters. It also briefly compares VerbInd with the sixth edition of the online Kamus Besar Bahasa Indonesia (KBBI), emphasizing their potential complementarity as linguistic research resources.

Author Biography

Gede Primahadi Wijaya Rajeg, Universitas Udayana

Gede Primahadi Wijaya Rajeg lectures in the Bachelor of English Literature program and the Linguistics postgraduate (doctoral and master's) programs in the Faculty of Humanities, Udayana University. He holds a PhD in Linguistics (2019) from Monash University, Australia, and conducted postdoctoral research (2023–2025) at the University of Oxford, UK, funded by the Arts and Humanities Research Council (AH/W007290/1). His research interests include Cognitive Linguistics, usage-based Construction Grammar, Corpus Linguistics, Lexicology and Lexicography, Austronesian languages, Documentary Linguistics, Data Science, and Digital Humanities. For further details, see his ORCID page (https://orcid.org/0000-0002-2047-8621).

Published

2026-02-28

How to Cite

Rajeg, G. P. W. (2026). Luaran Penelitian Non-Tradisional dalam Linguistik di Era Digital: Studi Kasus VerbInd Pangkalan Data Verba Bahasa Indonesia Berbasis Korpus. Humanis, 30(1), 62–80. https://doi.org/10.24843/JH.2026.v30.i01.p05