Non-Traditional Research Outputs in Linguistics in the Digital Era: A Case Study of VerbInd, a Corpus-Based Database of Indonesian Verbs
DOI:
https://doi.org/10.24843/JH.2026.v30.i01.p05
Keywords:
digital humanities, computational linguistics, corpus linguistics, quantitative linguistics, linguistic database, lexical database
Abstract
The 21st-century research ecosystem highlights the importance of non-traditional outputs such as datasets and software, alongside articles and books. In the Humanities, digital technologies enable data to be presented more openly through online databases. This paper introduces VerbInd, an open-access online database of morphologically complex verbs in Indonesian, built from corpus texts and processed through computational morphological parsing with further manual verification and editing. The paper demonstrates the use of VerbInd in quantitative research on Indonesian verbal morphology and morphosyntax (particularly voice alternation), resulting in traditional outputs, namely book chapters. It also briefly compares VerbInd with the sixth edition of the online Kamus Besar Bahasa Indonesia (KBBI), emphasizing their potential complementarity as linguistic research resources.