Taalmaster

    CEFR Rating

    When importing a new resource, we look up the frequency and rank of each word in it. For example, the word "the" in English has a frequency of 0.0537. This means that about 5% of all words in an average text are "the"! This also makes "the" the number one English word (rank 1). From this rank, we estimate a CEFR level for each word. We start at A1 with really common words and then put words into higher categories (A2, B1, etc.) the lower their frequency gets. So, for example, words like "comprise" or "assess" land in the C2 level. From these word levels, we can compute a frequency distribution, i.e. the share of the text's words that falls into each CEFR level. This tells us how the text is made up in terms of CEFR words.
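To make the rank-to-level idea concrete, here is a minimal sketch in Python. The rank cut-offs, the function names and the example ranks (apart from "the" at rank 1) are made up for illustration; they are not the values or the code Taalmaster actually uses.

```python
from collections import Counter

# Hypothetical rank cut-offs (assumption): a word whose rank is at or below
# the bound gets that level. Rank 1 is the most frequent word.
RANK_BOUNDS = [(1000, "A1"), (2000, "A2"), (4000, "B1"), (8000, "B2"), (16000, "C1")]


def cefr_level(rank):
    """Map a frequency rank to an estimated CEFR level."""
    for bound, level in RANK_BOUNDS:
        if rank <= bound:
            return level
    return "C2"  # anything rarer than the last bound


def cefr_distribution(words, ranks):
    """Return the share of words per CEFR level.

    Words that are missing from `ranks` are skipped.
    """
    levels = [cefr_level(ranks[w]) for w in words if w in ranks]
    total = len(levels)
    if total == 0:
        return {}
    return {level: n / total for level, n in Counter(levels).items()}


# Illustrative ranks only; "the" at rank 1 is the one value taken from the text.
example_ranks = {"the": 1, "cat": 1500, "assess": 20000}
distribution = cefr_distribution(["the", "cat", "the", "assess"], example_ranks)
print(distribution)
```

With these made-up ranks, half of the words fall into A1, a quarter into A2 and a quarter into C2.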

    We then sprinkle some AI into the mix: it gives us back the overall CEFR rating of the text, and it also takes the grammatical structures used into account.