Please use the following text to cite this item or export to a predefined format:
Ainara Estarrona; Izaskun Etxeberria; Ricardo Etxepare; Ander Soraluze and Manuel Padilla-Moyano, 2026,
BIM/SAHCOBA corpus: Syntactically Annotated Historical Corpus in Basque, Dspace HiTZ Zentroa,
https://hdl.handle.net/20.500.14614/43.
| dc.contributor.author | Ainara Estarrona |
| dc.contributor.author | Izaskun Etxeberria |
| dc.contributor.author | Ricardo Etxepare |
| dc.contributor.author | Ander Soraluze |
| dc.contributor.author | Manuel Padilla-Moyano |
| dc.date.accessioned | 2026-06-22T08:32:35Z |
| dc.date.available | 2026-06-22T08:32:35Z |
| dc.date.issued | 2026-06-17 |
| dc.description | Basque in the Making (BIM): A Historical Look at a European Language Isolate and Syntactically Annotated Historical Corpus in Basque (SAHCOBA) are two projects for building a morphosyntactically annotated historical corpus of Basque. The corpus will include both part-of-speech and syntactic annotation, along with a rich metadata structure. Our database will allow us to search the annotated corpus by word, lemma, grammatical category, sequence of grammatical categories, and specific structural configuration. The BIM project aims to collect the most significant works from the 15th century to the mid-18th century (Archaic and Old Basque), while the SAHCOBA project aims to extend the corpus from the mid-18th century to the mid-20th century (Early and Late Modern Basque), when standard Basque appeared. BIM and SAHCOBA are interdisciplinary projects that bring together experts in Linguistics and Natural Language Processing. |
| dc.identifier.uri | https://hdl.handle.net/20.500.14614/43 |
| dc.language.iso | Basque |
| dc.publisher | HiTZ (University of the Basque Country) |
| dc.relation.isreferencedby | https://doi.org/10.1093/llc/fqab066 |
| dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
| dc.rights.label | PUB |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ |
| dc.source.uri | https://bim.ixa.eus/ |
| dc.subject | Digital Humanities |
| dc.subject | historical corpus |
| dc.subject | Basque |
| dc.subject | diachronic syntax |
| dc.title | BIM/SAHCOBA corpus: Syntactically Annotated Historical Corpus in Basque |
| dc.type | corpus |
| local.contact.person | Ainara Estarrona ainara.estarrona@ehu.eus HiTZ center (University of the Basque Country) |
| local.demo.uri | https://bim.ixa.eus/ |
| local.files.count | 1 |
| local.files.size | 2926904 |
| local.has.files | yes |
| local.size.info | 600000 tokens |
| local.sponsor | nationalFunds RTI2018-098082-J-I00 Ministerio de Ciencia Innovación y Universidades (MICINN) SAHCOBA: Syntactically Annotated Historical Corpus in Basque (MICINN) |
| local.sponsor | nationalFunds ANR-17-CE27-0011 Agence Nationale de la Recherche (ANR) Basque in the Making: A Historical Look at a Language Isolate – BIM |
| metashare.ResourceInfo#ContentInfo.mediaType | text |
This item is publicly available and licensed under Creative Commons - Attribution 4.0 International (CC BY 4.0).
Files in this item
- Name: BZENTROA-BIM.zip
- Size: 2.79 MB
- Format: application/zip
- Description:
- MD5: f7f1fb550c7c5a0d5548e114fecb2dad
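
Because the record publishes both a byte count (2926904) and an MD5 digest for the archive, a download can be checked against them before use. The following is a minimal sketch, not part of the original record: the function names `md5_of_file` and `verify_download` are illustrative, and the script assumes the archive was saved locally as `BZENTROA-BIM.zip` in the current directory.

```python
import hashlib
from pathlib import Path

# Values copied from the record above.
EXPECTED_MD5 = "f7f1fb550c7c5a0d5548e114fecb2dad"
EXPECTED_SIZE = 2926904  # bytes, about 2.79 MB when divided by 1024 * 1024


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True only if both the file size and the MD5 digest match."""
    path = Path(path)
    size_ok = path.stat().st_size == expected_size
    md5_ok = md5_of_file(path) == expected_md5.lower()
    return size_ok and md5_ok


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    archive = Path("BZENTROA-BIM.zip")
    if archive.exists():
        print("OK" if verify_download(archive) else "MISMATCH")
    else:
        print(f"{archive} not found; download it from the item page first.")
```

An MD5 match only guards against a corrupted or truncated transfer; it is not a security guarantee.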