• circuitfarmer@lemmy.sdf.org
    19 days ago

    A great many languages are either entirely undocumented or severely underdocumented. One part of my job is collecting data for such languages. Another part is more traditional computational linguistics, which in my case is primarily corpus analysis (still a relatively common step in developing model training data).