EduLingua 8/1 (2022)  1 
 
 
Cohesion in Finnish EFL essays: Digital analyses and 
observations on the use of online sources  
Marja-Leena Niitemaa¹
School of Languages and Translation Studies, Department of English, University of Turku, 
Finland 
DOI: 10.14232/edulingua.2022.1.1 
 
The study investigated cohesion in Finnish upper-secondary school EFL learners’ essays (N=46). 
Cohesive devices were digitally identified using TAACO 2.0.4, and robust correlations were run to 
examine how the devices related to human-rated holistic essay quality. The analyses found that the two 
most important predictors of writing quality were the use of modifying adverbs and adverbials as 
referential devices across paragraphs, and a wide array of connectives to organise the text. Further, the 
writing sessions were video-recorded to examine the role of consulting digital sources in cohesion-
building. The recorded data suggested that consulting online dictionaries and informational pages assisted 
cohesion-building if the writer possessed adequate vocabulary knowledge and computer skills and knew 
how to exploit the sources efficiently. Pedagogically, the findings indicated that learners need more 
instruction and practice not only on writing cohesive texts but also on how to search for information and 
lexis effectively.  
 
Keywords: cohesion, EFL essays, online dictionaries, video-recorded data, writing process 
1. Introduction 
In written language, cohesive features are particularly important for readers with insufficient
vocabulary or background knowledge (cf. McNamara, Kintsch, Songer, & Kintsch, 1996),
while in speech, cohesive markers can decrease misunderstandings between first (L1) and
second language (L2) speakers (Crossley, Salsbury, & McNamara, 2010). Although cohesive
devices are crucial in all communication (e.g., Lintunen, Mutta, & Peltonen, 2020; Council of
Europe, 2018b), learning to employ them is one of the major challenges for learners of English
as a foreign language (EFL).
In Finland, the L2 curricula are based on the guidelines of the Common European 
Framework of Reference for Languages (CEFR, Council of Europe, 2018a). Overall, CEFR 
emphasizes the ability to produce clear, detailed text on a variety of topics. Accordingly, 
Finnish upper-secondary school students should reach the CEFR level B2 in EFL written
production by the national school-leaving examination. Regarding cohesion, B2 descriptors 
highlight the ability to employ cohesive devices to produce clear, coherent text following the 
established conventions of the genre. This includes, e.g., structuring texts in paragraphs, using 
                                                     
¹ Author’s e-mail: maleni@utu.fi; https://orcid.org/0000-0002-2822-2735
 
EduLingua 8(1), pp. 1-16 (2022) 
ISSN 2415-945X 
2 Niitemaa: Cohesion in Finnish EFL essays 
 
 
contextually appropriate lexis, avoiding errors disrupting readability, and linking clauses and 
paragraphs with a wide range of connectives (Council of Europe, 2018a). All these skills take
EFL learners a long time to develop. Moreover, language teachers often consider cohesion one
of the most demanding issues to teach; for example, Finnish learners often forget to write EFL
essays in paragraphs, although they do so in L1.
Better understanding of EFL writing processes may offer new insights into approaching 
cohesion in the classroom. The present study sets out to examine which cohesive features
characterize Finnish upper-secondary school EFL learners’ essays and how the CEFR 
descriptions for cohesion are met at level B2, i.e., the level that Finnish students are expected to 
reach before taking the national school-leaving examination in English. Another aim is to 
examine the role of consulting online dictionaries and other digital sources in cohesion-building. 
For these purposes, cohesive devices are digitally identified and compared against the human-
rated holistic writing scores and the CEFR requirements at level B2, and the authentic writing 
sessions are video-recorded to examine consulting the Internet during writing. Further, the rate 
of successful consultations of digital sources is compared against the writer’s vocabulary 
knowledge. 
2. Literature review 
Cohesion commonly denotes linguistic signposts that help readers notice relationships between 
the ideas and information presented in texts. In the framework of Halliday and Hasan (1976), 
cohesive features are categorized into referential devices (e.g., demonstrative reference by
this/these, similarly/otherwise); substitution (replacing words instead of repeating them); ellipsis 
(omitting words); connective links within and between clauses and paragraphs; and lexical 
cohesion, i.e., using different but semantically related words and collocations. Recent studies 
have employed digital tools to identify cohesive features in tertiary-level EFL students’ texts 
and to examine how such features relate to holistic writing quality and text organization, and 
moreover, investigated the writing process in real time to develop teaching and feedback 
practices, which could help writers produce cohesive texts. 
2.1 Cohesion in relation to essay quality and text organization 
This subsection introduces four studies using digital analyses to examine cohesion in EFL texts. 
The first two focus on the longitudinal development in using cohesive devices in relation to 
human rating, while the last two examine the role of elaboration in teaching cohesion. To 
facilitate comparison between these and the present results, the review is limited to studies using 
different versions of TAACO, and the titles for the cohesive devices are italicized. Examples of 
cohesive features in the present data are provided below in section 4.2. 
Crossley, Kyle and McNamara (2016b) examined three sets of 30-minute descriptive 
essays written by university students attending English for Academic Purposes courses within a 
semester. The analyses conducted by TAACO 1.0 (Crossley, Kyle, & McNamara, 2016a) 
showed that 44% of the variance of holistic essay ratings was collectively explained by 
Function words across paragraphs and sentences, and Pronouns across paragraphs and the 
whole text. Regarding text organization, the best predictors were Function words at the 
paragraph and text levels and Pronouns and Coordinating conjuncts between sentences 
explaining 36% of the variance of the scores. In these analyses Function words across 
paragraphs appeared to be the best single predictor. To explain this, the researchers suggested 
that human raters may show bias for organizational devices in L2 writing, since content may not 
be as rich as in L1 essays. As for longitudinal development, the analyses reported significant
growth over a semester for about half of the cohesive features examined, such as Nouns and
Synonyms across paragraphs. However, the increased occurrences did not necessarily correlate
with human ratings.
Kim and Crossley (2018) analyzed the joint effect of lexical, syntactic, and cohesive 
variables on 30-minute argumentative essays and 20-minute source-based texts using structural 
equation modelling (SEM). The essays were written by tertiary-level students with diverse L1 
backgrounds. Four indices (TAACO 1.0) were employed to examine cohesion: Overlap of lexis 
across sentences and paragraphs, and the incidence of Positive and Negative logical 
connectives. These devices are thought to increase readability via referential links. Overlap, i.e., 
repetition of the same lexical items, helps readers notice connections between the ideas 
presented, while connectives provide explicit links to them. 
The results also showed that sentence-level Overlap of lexis correlated positively with
the scores of source-based texts but not with the argumentative essays, while Overlap of lexis 
across paragraphs correlated positively with the scores of both types of writing and thus, was 
selected to measure cohesion in the SEM analysis. This index together with lexical 
sophistication and syntactic complexity explained 82% of the variance of the scores of both 
argumentative and source-based essays. However, lexical sophistication and syntactic 
complexity explained a greater share of the variance compared to referential cohesion. 
Two studies, Crossley and McNamara (2016), and Crossley, Kyle and Dascalu (2019), 
examined whether elaboration can be used to draw EFL writers’ attention to cohesion during 
writing. Both studies used the same essays written by American university students. The
texts were written on computers, but using notes or the Internet was not allowed. When ready,
the participants were to spend 15 minutes elaborating the main ideas by adding two more
paragraphs. The same procedure was repeated with a new prompt so that each writer produced
two original essays and two elaborated versions. Next, an expert was asked to manipulate the 
texts and increase referential cohesion via lexical overlap across the text segments. Finally, the 
original essays, elaborated texts and manipulated versions were digitally analysed for cohesion. 
In the earlier study from 2016, all the versions were analyzed for Lexical overlap across 
sentences and paragraphs using TAACO 1.0. The analyses indicated that the elaborated essays 
did not score significantly higher than the original ones, whereas the essays with expert-added 
cohesion scored higher than the original and elaborated versions. The index measuring Lexical 
overlap between paragraphs was a strong predictor of essay scores. Analysing the same texts, 
the study from 2019 employed measures of Semantic similarity provided by TAACO 2.0. 
Contrary to the earlier results, the elaborated versions scored higher than the original
ones, but the expert-added essays still scored the highest. Semantic similarity (word2vec) across 
paragraphs appeared to be an important predictor of the essay ratings. 
In sum, the TAACO analyses indicate that higher-rated writing is associated with two types of
cohesive features: referential cohesion, i.e., the use of pronouns and lexical repetition, and
organizational tools, such as connectives and function words. These devices are thought to
enhance comprehensibility of the text. The researchers suggest, however, that different textual 
genres may require different cohesive means. 
2.2 Cohesion-building in the writing process 
Employing different methods, the following three studies examine EFL learners’ text production 
processes in order to improve writing assessment, teaching practices and feedback. The essays 
analysed in these studies were written on computers without access to the Internet. 
Bowen and Thomas (2020) investigated in real time how L1 and L2 students developed 
clauses, added information, and used lexis to refer backward and forward across the paragraphs. 
The participants were three L1 and three L2 students (Chinese) at a British university, the latter 
scoring intermediate points on the International English Language Testing System. Using 
keystroke logging (Inputlog), the researchers analysed the writing of essays, a task which
involved describing the data presented in charts, responding to the prompt, and revising the text
by adding information. The data was then used to examine how the text evolved from one section
to another. The findings suggested that the writers could produce informationally dense and 
interconnected texts when revising but that L1 and L2 students employed different cohesive 
means: L1 writers preferred substitution and modification of complex noun groups, while L2 
students relied on coordinating conjunctions and demonstrative pronouns. 
Abdel Latif (2021) analysed the basic components of text production and composing 
strategies among a group of 30 university students with Arabic as L1 and intermediate 
proficiency in English. The participants were asked to write an argumentative essay, as this text 
type requires more textual organization compared to other genres. The researcher employed 
think-aloud data to investigate how the writing process evolved. The participants were first 
trained on how to verbalize and record their concurrent actions during writing. The transcribed 
think-aloud data showed that cohesion emerged via adding, changing, or deleting words and 
phrases, in other words, using referential devices, substitution, ellipsis, and connectors. The 
findings also suggested that such revisions were associated with the writers’ linguistic 
resources, as proficient writers were able to monitor and revise their texts, whereas less skilled 
writers had difficulties in finding alternative choices for words and expressions and needed to 
verify L2 meanings using L1. 
Lyashevskaya, Panteleeva and Vinogradova (2021) examined features of lexical, 
morphological, syntactic, and discursive complexity in EFL texts written by Russian university 
students. The aim was to develop a digital feedback application to provide recommendations for 
revising the text and alerting writers to L1 interference. For this purpose, the researchers analysed
over 3000 essays that described graphical material and expressed opinions on social and
cultural problems. The texts were then divided into the best and non-best essays as assessed by
human raters. The findings indicated that discursive complexity emerged from three cohesive 
features: the use of discourse-organizing nouns, i.e., semantically unspecific abstract nouns like 
fact, issue, and argument referring to information given in different text sections, multi-word 
connectors, e.g., on the other hand, or in my opinion, and single-word connectives to 
communicate addition, cause and effect, clarification, or contrast such as moreover or 
consequently. Overall, the best essays contained more discourse-organizing nouns and diverse 
linking tools compared to the non-best essays.  
To sum up, the findings suggest that writing quality is strongly associated with cohesion. 
Research on various aspects of the writing process and findings on text complexity indicate that 
higher-scoring texts progress logically from one topic to another and employ multiple types of 
cohesive devices, while lower-scoring texts demonstrate difficulties in connecting ideas across 
paragraphs, although the learners manage writing at the sentence level. 
3. The present study 
The present study aims to contribute to cohesion research in two respects. Firstly, we examine 
essays written by upper-secondary school learners, while previous research mostly focuses on
writing at the tertiary level. Secondly, the participants are allowed to use the Internet during
writing, as there is an overall scarcity of examinations simulating authentic writing sessions 
with free access to online dictionaries and webpages. 
Subsection 4.1 reports on how digitally identified cohesive features are connected to 
holistic writing quality in Finnish secondary-school EFL essays, and how the CEFR 
expectations for cohesion at level B2 are met (RQ 1). Subsection 4.2 focuses on the role of
consulting the Internet in cohesion-building. As employing cohesive devices may be connected 
to EFL learners’ lexical development (e.g., Crossley et al., 2016b) and linguistic resources (e.g., 
Abdel Latif, 2021), successful use of digital sources is compared against the writer’s receptive 
vocabulary knowledge (RQ 2). The research questions are formulated as follows: 
RQ 1. Which cohesive features characterize Finnish upper-secondary school EFL 
essays? How do the essays fulfill the CEFR expectations for cohesion at level B2? 
RQ 2. Under what conditions can using digital sources enhance cohesion? 
3.1 Participants 
The participants were 46 students, aged 16−17 with Finnish as L1, at a typical Finnish 
municipally maintained upper-secondary school. They volunteered to take lexical knowledge
tests and complete writing tasks during the first two academic years. Previously they had studied
L2 English for circa 600 lessons (45 min.) in general basic education and circa 120‒150 lessons
(75 min.) during the first and second year at the upper-secondary level. To persuade the students
to take all the tests, they were offered a credit of one course in English. It was also agreed that 
the test performance would not affect their English grades and that the tests would be conducted 
during school hours. The present examination is based on the data from the second year. 
3.2 Essays and scoring 
The participants were asked to write an essay with 150‒250 words on “My second school year”. 
They were encouraged to regard this task as an additional opportunity to train their writing skills 
before the national high-stakes examination in the following year. The prompts included, e.g.,
discussing their academic success and expectations regarding the upcoming school-leaving
examinations as well as their future plans for the tertiary level. To limit priming effects, the
prompts were given in L1 using 3−5 bullet points. The writing time was 60 minutes, as the
participants were expected to use online sources to search for lexis and information and to 
revise their texts using editing functions. They were also informed that the essays would be 
checked for plagiarism. 
The essays were rated on a four-point scale from 0 to 99 points (e.g., 80−82−85−88 at the 
upper-intermediate level) accounting for the content and structure of the text, lexical richness 
and accuracy, and the candidate’s ability to communicate the message clearly. The same criteria 
are used in the national Matriculation Examination. The raters were twenty-eight teacher 
trainees finishing their studies at the Faculty of Education at a Finnish university as a part of 
curricular training. The trainees first assessed the essays on their own and then discussed the 
assessments in small groups. They were, however, encouraged to give the scores independently. 
The interrater reliability (Cronbach’s alpha) was strong, ranging from 96% to 99%.
Before running the automated analyses, the essays were cleaned for spelling errors that 
changed the word meaning, e.g., taught instead of thought. The raters also used the cleaned 
version, as the purpose was to draw attention to the structure and cohesiveness of the texts 
instead of accuracy. 
3.3 Recording of the writing session 
A freely downloadable video-recording software, CamStudio, was installed on the computers to
record the individual writing processes. The participants were first shown how to switch on the
software and then asked to start writing as usual. The recording is unobtrusive for the writer. We
examined the recorded data to monitor which online sources were used, how many times each
source was consulted, what was searched, how the writers used the findings, and whether the
search results suited the context. However, due to space problems in the computer
rooms, we were able to record only thirty-one students out of forty-six. 
3.4 Vocabulary knowledge 
To examine the use of online sources in relation to vocabulary knowledge, the participants were 
assessed for receptive vocabulary knowledge using the revised version of the Vocabulary Levels 
Test (the VLT; Schmitt, Schmitt, & Clapham, 2001; Nation, 1983) with 30 items in
the 2nd, 3rd, 5th, and 10th thousand frequency bands (maximum 120 points). Each frequency band
consists of ten item groups with six words and three definitions. The task is to match the words 
to the definitions. The VLT is commonly considered a reliable and valid measure of receptive 
vocabulary (e.g., Meara, 2009; Read, 2000). The test sections are scalable, i.e., the knowledge 
of rare words implies knowledge of more familiar words. The test administration coincided with 
the writing session. The rationale for using the VLT in a cohesion study is that receptive 
vocabulary size is closely connected to an EFL learner’s ability to use English in multiple ways
(Alderson, 2005; Schmitt et al., 2001) and thus, may also interact with cohesion (Crossley et al., 
2016b). 
3.5 Automated analysis 
To analyze the essays for cohesive features, the present study employed TAACO 2.0.4 
(Crossley, Kyle, & Dascalu, 2019). It is a freely accessible tool designed to detect cohesion 
across sentences (local cohesion), paragraphs (global cohesion) and the entire text (text 
cohesion). TAACO also provides diagnostic output files to show how the words in the 
individual essays are tagged. For detailed information on calculation of the indices, please see 
https://www.linguisticanalysistools.org/taaco.html. 
The present study analysed 123 indices: 
• 60 indices at the sentence level: 48 lexical overlap indices, 2 semantic overlap indices, 
the incidence of 10 types of connectives 
• 50 indices at the paragraph level: 48 lexical overlap indices and 2 semantic overlap 
indices 
• 13 indices at the text level: measuring type-token ratios, determiners, demonstratives, 
pronoun to noun ratio and pronoun density 
Lexical overlap calculates repetition of words across sentences and paragraphs, while Semantic 
overlap measures repetition of semantically related words such as synonyms for nouns and 
verbs. At the text level, cohesion is measured by the incidence of determiners, demonstratives 
and type-token ratios (TTR) of various parts of speech. The latter, lexical diversity, signifies 
referential cohesion. 
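
To make these two families of measures concrete, the following minimal Python sketch computes a type-token ratio and a simple overlap score between adjacent paragraphs. It is an illustration only and not TAACO's implementation: the tokenizer, the function names, the example paragraphs, and the normalisation (shared word types divided by the word types of the earlier segment) are assumptions made for this example, whereas TAACO's own indices are computed separately for parts of speech and in several normed variants.

```python
import re


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Number of distinct word forms divided by the number of tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def adjacent_overlap(segments):
    """Mean share of word types in each segment that recur in the next one."""
    shares = []
    for current, following in zip(segments, segments[1:]):
        current_types = set(tokenize(current))
        following_types = set(tokenize(following))
        if current_types:
            shared = current_types & following_types
            shares.append(len(shared) / len(current_types))
    return sum(shares) / len(shares) if shares else 0.0


# Invented example paragraphs (not taken from the study data).
paragraphs = [
    "My second school year was busy.",
    "The year was also rewarding.",
]

print(type_token_ratio(" ".join(paragraphs)))
print(adjacent_overlap(paragraphs))
```

In this toy example, the words year and was recur in the second paragraph, so the overlap score rises with repetition, while the type-token ratio falls as more tokens are repeated.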
3.6 Statistical analyses 
Robust bootstrapped correlations (Larson-Hall, 2016, 213‒214) were computed to examine the 
strength of connections between TAACO indices and essay ratings. Robust tests provide 
confidence intervals (CI), which indicate that the actual correlation coefficient is within the 
bounds of the CI with a probability of 95%. Although wider in smaller samples, the CI indicates 
that the correlation is statistically significant if it does not pass through zero. The values for the 
effect (R²) are interpreted according to Plonsky and Oswald’s (2014) guidelines: R² = 0.06 is
small, R² = 0.16 is medium and R² = 0.36 is large. Robust multiple linear regression analyses
were run to identify which cohesive indices were predictive of human-rated essay scores.
Robust tests do not assume normal distribution, but the assumptions of regression, including
linearity, normal distribution of errors, homogeneity of variances and absence of
multicollinearity, were checked (Larson-Hall, 2016, 251). The analyses were conducted using SPSS, version 27.
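
The logic of a bootstrapped confidence interval and of the effect-size thresholds can be sketched in a few lines of Python. The sketch below is a simplified illustration under stated assumptions, not the procedure run in SPSS: it uses a plain percentile bootstrap rather than a bias-corrected and accelerated (BCa) interval, the function names are hypothetical, and the index values and essay scores are invented for the example.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def bootstrap_ci(xs, ys, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for r (a simplification; not BCa)."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_resamples):
        # Resample essay/score pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_x = [xs[i] for i in idx]
        sample_y = [ys[i] for i in idx]
        # Skip degenerate resamples in which one variable is constant.
        if len(set(sample_x)) > 1 and len(set(sample_y)) > 1:
            estimates.append(pearson_r(sample_x, sample_y))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * len(estimates))]
    upper = estimates[int((1 + level) / 2 * len(estimates)) - 1]
    return lower, upper


def effect_size_label(r_squared):
    """Thresholds as cited in the text from Plonsky and Oswald (2014)."""
    if r_squared >= 0.36:
        return "large"
    if r_squared >= 0.16:
        return "medium"
    if r_squared >= 0.06:
        return "small"
    return "below small"


# Invented data for illustration only (not the study data).
index_values = [0.2, 0.5, 0.1, 0.9, 0.7, 0.4, 0.8, 0.3, 0.6, 1.0]
essay_scores = [62, 75, 60, 90, 82, 70, 88, 68, 79, 93]

r = pearson_r(index_values, essay_scores)
lower, upper = bootstrap_ci(index_values, essay_scores)
print(f"r = {r:.2f}, R2 = {r * r:.2f} ({effect_size_label(r * r)})")
print(f"95% CI [{lower:.2f}, {upper:.2f}]")
# The correlation counts as significant if the interval excludes zero.
print("significant" if lower > 0 or upper < 0 else "not significant")
```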
4. Results 
The digital analyses were based on 46 essays comprising 10,835 words. On the Finnish national
assessment scale (cf. 3.2), the mean score approached the upper-intermediate level (mean
77.7; SD 9.2). Regarding the CEFR descriptors (Council of Europe, 2018a), the essay scores 
ranged between C1 and B1 so that 15% reached the level C1, 46% were at the level B2, while 
39% remained at B1. Regarding the VLT results, the mean score (77.48) was 65% of the 
maximum points, which is on par with multiple results among the same age group (e.g., Peters, 
2018). However, the standard deviation (SD 30.18) was exceptionally wide, as 30% of the 
participants scored 50% or less, while 33% of them scored 80% or more. 
4.1 Which cohesive features characterize Finnish upper-secondary school EFL essays? How do 
the essays fulfill the CEFR expectations for cohesion at level B2? (RQ 1) 
The essays were analysed for the 123 indices introduced above (cf. section 3.5), and robust
bootstrapped correlations (cf. section 3.6) were computed to examine whether the essay scores
(dependent variable) were related to the cohesive features (independent variables). The analyses
showed that 28 (roughly 23%) variables were significantly correlated with the essay scores
including 25 paragraph-level measures and three sentence-level measures, while text-level 
features, such as the incidence of determiners and demonstratives, did not correlate significantly 
with writing quality. 
Fifteen indices (cf. Appendix) with at least medium effect sizes (R² ≥ .160) were chosen
for further analyses. After checking for multicollinearity (r ≤ .700), a series of multiple linear
regression analyses with two independent variables were conducted using robust tests. The
calculations were checked for outliers, homogeneity of variances, and normality and
independence of residuals. The results indicated that two measures, Adverb 2 paragraph normed
and the incidence of Conjunctions and/but, collectively yielded the best significant model
(F(3, 42) = 14.008, p < .001, R² = .37). The coefficients are provided in Table 1. The confidence
intervals, which exclude zero, indicate that both associations are statistically significant. The
other combinations with two independent variables yielded effect sizes from 30% to 34%,
whereas the analyses with three independent variables did not yield significant equations.
Table 1. Coefficients in the robust test. N = 46
                                                             BCa 95% Confidence Interval
                              B           Sig. (2-tailed)    Lower        Upper
(Constant)                    84.583      <.001              77.32        92.04
Adverb 2 paragraphs normed    3.359       <.001              1.86         4.83
Conjunctions and/but          ‒256.958    .001               ‒380.06      ‒133.05
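
Read as a regression equation, the coefficients in Table 1 give a predicted essay score of 84.583 + 3.359 × (adverb overlap index) ‒ 256.958 × (and/but incidence). The short Python sketch below simply evaluates this equation. The function name and the two input values are hypothetical and chosen only to show the direction of the effects; the scale of the TAACO indices is not restated here, so the inputs should not be read as typical values from the data.

```python
# Unstandardized coefficients (B) as reported in Table 1.
INTERCEPT = 84.583
B_ADVERB_OVERLAP = 3.359   # Adverb 2 paragraphs normed
B_AND_BUT = -256.958       # incidence of the conjunctions and/but


def predicted_score(adverb_overlap, and_but_incidence):
    """Evaluate the two-predictor regression equation from Table 1."""
    return (INTERCEPT
            + B_ADVERB_OVERLAP * adverb_overlap
            + B_AND_BUT * and_but_incidence)


# Hypothetical index values, for illustration only.
baseline = predicted_score(adverb_overlap=2.0, and_but_incidence=0.05)
more_and_but = predicted_score(adverb_overlap=2.0, and_but_incidence=0.06)

print(round(baseline, 2))
print(round(more_and_but - baseline, 2))  # negative: more and/but, lower score
```

The negative sign of the second coefficient is what underlies the interpretation that heavy reliance on and/but is associated with lower ratings.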
 
To answer RQ 1, the digital analyses indicated that the best predictors of writing quality were 
using a rich array of adverbs and adverbials as referential devices across paragraphs and a wide 
range of connective devices. In terms of the CEFR descriptors for cohesion, EFL writers at level 
C1 knew how to employ referential and lexical cohesion, substitution, and a wide range of 
connectives, B2 writers were able to avoid errors disrupting readability but used fewer types of 
connectives, while B1 writers overused the conjunctions and/but and often forgot to structure 
the text in paragraphs, which diminished the clarity of the text. 
4.2 Under what conditions can using online affordances enhance cohesion? (RQ 2) 
Based on the video-recorded data of authentic writing sessions of thirty-one upper-secondary 
school EFL students (cf. 3.3), this subsection reports how the participants employed online
dictionaries and informational sites, what they searched, what they found, and what problems 
they encountered during consultations. We also discuss how the online findings were related to 
the cohesive indices identified by TAACO and what role the writer’s lexical knowledge played 
in searching for words, expressions, and information online. 
Number of consultations, sources, and the search language 
The writers consulted online sources 341 times (Table 2). Eight writers conducted from one to 
five queries, thirteen consulted from six to fifteen times and nine searched from sixteen to twenty-six
times, which was the maximum number of individual queries. Those who queried frequently 
returned to the same items several times. One participant did not consult any online sources. 
The students employed from one to four different sources. These included freely available 
multilingual dictionaries, translation tools and informational sites like Wikis and home pages of 
educational institutions. The most frequently consulted sources were Sanakirja.org² (174
queries) and Google Translate (140 queries). Most writers relied on a single source: fourteen
chose Sanakirja.org and six used Google Translate, while four writers employed both.
One writer consulted EUdict.com (four queries), while five writers employed different
combinations of dictionaries, translation tools and informational sites, e.g., Wikipedia or home
pages of educational institutions (86 queries). No expert-constructed learners’ dictionaries were
consulted. Roughly 91% of the consultations were conducted from L1 to L2. Sixteen students 
searched only from L1 to L2, while fourteen also used L2 when crosschecking meanings or 
consulting informational sites. 
Table 2. Sources, consultations, and the search language
Number of    Sources           Number of all      Number of successful   L1 -> L2     L2 -> L1
writers**    consulted*        consultations      consultations
14           D                 138                107                    126          12
6            GT                70                 54                     69           1
4            D + GT            43 [33 + 10]       24                     39           4
1            OD                4                  4                      3            1
1            GT + K            11 [6 + 5]         10                     10           1
1            OD + GT           26 [4 + 22]        20                     26           0
1            GT + I            19 [13 + 6]        17                     19           0
1            D + GT + I        22 [1 + 19 + 2]    20                     12           10
1            D + OD + I + I    8 [2 + 1 + 5]      8                      5            3
Total 30**   1 to 4 sources    341                264 (77%)              309 (91%)    32 (9%)
*D = Sanakirja.org; GT = Google Translate; K = Kaannos.com; OD = Other dictionaries; I =
Informational sites; **One of the 31 recorded participants did not consult any online sources.
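
The totals and percentages in the bottom row of Table 2 can be recomputed from the individual rows. The following Python sketch does so; the row values are copied from Table 2, while the variable and function names are hypothetical and introduced only for this check.

```python
# (writers, sources, all consultations, successful, L1->L2, L2->L1),
# copied row by row from Table 2.
ROWS = [
    (14, "D", 138, 107, 126, 12),
    (6, "GT", 70, 54, 69, 1),
    (4, "D + GT", 43, 24, 39, 4),
    (1, "OD", 4, 4, 3, 1),
    (1, "GT + K", 11, 10, 10, 1),
    (1, "OD + GT", 26, 20, 26, 0),
    (1, "GT + I", 19, 17, 19, 0),
    (1, "D + GT + I", 22, 20, 12, 10),
    (1, "D + OD + I + I", 8, 8, 5, 3),
]


def column_totals(rows):
    """Sum the five numeric columns (the source labels are skipped)."""
    return tuple(sum(row[i] for row in rows) for i in (0, 2, 3, 4, 5))


def percentage(part, whole):
    """Share of the whole, rounded to the nearest full percent."""
    return round(100 * part / whole)


writers, consultations, successful, l1_to_l2, l2_to_l1 = column_totals(ROWS)

print(writers, consultations)                 # 30 341
print(percentage(successful, consultations))  # 77
print(percentage(l1_to_l2, consultations))    # 91
print(percentage(l2_to_l1, consultations))    # 9
```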
 
After searching, the writers took one of the following five actions: they found the item and used
it correctly; found the item but used it incorrectly; hesitated and decided not to use a word
unknown to them; did not find the item and replaced it with something else; or did not find the
item and discarded the topic.

² The information provided by Sanakirja.org, IlmainenSanakirja, and the translation tool
Kaannos.com is based on Wiktionary articles.
What was searched frequently? 
The most frequently searched item was the official term for the Finnish national school-leaving 
examination (matriculation examination) searched by twenty students. Fifteen students queried 
the noun (school) subject. Eleven queries were made for tertiary-level schools such as 
polytechnic and university of applied sciences. 
How did the consultations succeed? 
Although consulting the references was not without problems, 77% of the queries were 
conducted successfully. Thirteen participants found matriculation examination either in 
Sanakirja.org, or later by chance in Wikipedia when searching for something else. The word 
seemed unknown to the majority, which manifested as hesitation in the recorded data. For 
example, three students replaced it with final test or the verb graduate. Four queries were
unsuccessful: one writer did not find the item, another chose baccalaureate but used it as a verb,
while two writers accepted the literal translation from L1 (student writings*) suggested by Google
Translate. The noun (school) subject was successfully found by eleven students in
Sanakirja.org. Four students chose the literal translation from L1 (substance, material) but
changed the word after consulting other sources. However, three of them also used the 
inappropriate words in their texts. Instead of searching for subject, five students replaced it with 
specific terms, such as history or physics. Twenty-two students searched for adjectives to 
characterize school subjects or examinations. The most frequently queried adjectives were 
compulsory (3 queries), mandatory (7 queries), obligatory (1 query), and optional (6 queries). 
Apart from one case, these searches were successful. 
What was not searched? 
The verbs collocating with matriculation examination or subject were not searched. The verb 
choice was strongly affected by L1 interference, as in Finnish, you “write an examination” and 
“read a subject”. When consulting Wikipedia for matriculation examination, one student found 
the collocate take, but also used the incorrect combination in another sentence. Regarding the 
collocating verb for subject, Google Translate suggested read, e.g., read subjects* or read
mathematics*. Connective devices were searched rarely. One successful query was made for 
each of the following connectives: although, compared to, even though, firstly, however, 
nevertheless, and unfortunately. One student searched for the combination on one hand without 
finding it.  
Problems in searching 
The recordings revealed three major problems: ineffective use of online sources, lack of basic 
computer skills and inadequate vocabulary knowledge. 
Firstly, most participants tended to rely on one source without crosschecking and 
evaluating the information in the entries. Neither did they consult definitions and examples, 
which would have provided collocates and information on register. For example, the Finnish 
counterpart for (school) subject is a polysemic word covering such meanings as matter, material 
and substance. The suggestions in Sanakirja.org are provided in decontextualized word lists, 
which resulted in choosing *material or *substance. The latter was also the primary suggestion 
by Google Translate. Moreover, the users of Google Translate often relied on the suggested 
literal translations. For example, aiming to express “I am going to take the matriculation 
examination in the fall,” one student copied the suggested phrase “I am going to write a fall in 
the matriculation examination.” Another student intended to communicate his future plans, 
which Google Translate formulated as “I read myself a building engineer.” There were only a few 
cases in which changing the search term helped the writers to find the appropriate target. 
Secondly, some participants lacked basic computer skills. Instead of using the copy-and-
paste function, some students navigated back and forth between the source and the task copying 
one word at a time, while more skilled students had the primary source open in its own 
window next to the writing task, allowing quick navigation between the task and the sources. 
Regarding misspelt words, most writers corrected the errors when flagged. A few students 
corrected only some of the spelling errors, while two students ignored them all. However, the 
writers rarely noticed errors that changed the word meaning as the proof-reading function did 
not flag them, e.g., chance instead of change, or curses instead of courses. At times, the 
dictionary did not provide any suggestions due to spelling errors. Some spelling errors occurred 
even in L1 queries. 
Thirdly, the students with low vocabulary knowledge could not evaluate the words and 
phrases suggested in the sources, as demonstrated in the examples above. Regarding the VLT 
results (cf. the beginning of Section 4), the examples of incorrect choices originated from 
writers scoring less than 64% in the VLT. In contrast, the participants scoring 80% or 
more seemed to know most of the topic-related key words without having to search for them, 
but when they searched, they knew how to crosscheck the findings. 
Lastly, some students stopped querying if the item was not found immediately. This may 
indicate low motivation due to the problems experienced when consulting the sources. 
Moreover, the participants knew that their language teacher would not grade the essays. 
To answer RQ 2, the video-recordings suggested that consulting online dictionaries and 
informational pages may help to enhance cohesion if the writer possesses adequate vocabulary 
knowledge and computer skills and knows how to exploit the sources efficiently. Regarding the 
framework of Halliday and Hasan (1976), this indicates that consulting online sources facilitates 
using substitution and lexical cohesion. Substitution allows the writer to exploit a greater range 
of topic-related lexis, for example, to replace the noun subject with more specific terms such as 
social studies or psychology. Regarding lexical cohesion, higher-scoring writers searched or 
checked semantically related words when presenting information (curriculum, examination, 
mathematics, matriculation examination, pronunciation, skills) as well as adjectives to specify 
meanings (advanced, basic, complicated, compulsory, optional, oral). 
Regarding the role of connectives and demonstrative reference in cohesion-building, the 
overuse of conjunctions and/but correlated negatively with writing quality and made the texts 
resemble spoken language, e.g., “And I have good memories.” or “And the second year’s 
courses are a little bit harder”. In contrast, demonstrative reference, such as using adverbs as 
modifiers across paragraphs (especially, definitely, hopefully, luckily, nearly, personally, 
probably) was a positive predictor of essay scores. However, higher-scoring writers, who 
employed a wide range of connectors and adverbs, did not need to search for them. 
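The overuse of and/but noted above can be made concrete with a simple count. The following is a minimal sketch, not the TAACO procedure used in the study: the function name `conjunction_profile`, the tokenisation rule, and the sentence-splitting rule are all assumptions made for illustration. It reports the incidence of the two conjunctions per 100 words and the number of sentences that begin with one of them, using the two example sentences quoted above (with a plain apostrophe).

```python
import re


def conjunction_profile(text, conjunctions=("and", "but")):
    """Return incidence per 100 words and the count of sentence-initial uses.

    Hypothetical helper for illustration; TAACO computes its own indices.
    """
    # Crude tokenisation: runs of letters and apostrophes, lower-cased.
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Split after sentence-final punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    total = sum(1 for w in words if w in conjunctions)

    initial = 0
    for sentence in sentences:
        first = re.match(r"\W*([A-Za-z']+)", sentence)
        if first and first.group(1).lower() in conjunctions:
            initial += 1

    per_100 = 100.0 * total / len(words) if words else 0.0
    return {"per_100_words": per_100, "sentence_initial": initial}


sample = ("And I have good memories. "
          "And the second year's courses are a little bit harder.")
profile = conjunction_profile(sample)
print(profile)
```

A per-100-words rate keeps essays of different lengths comparable, while the sentence-initial count targets the speech-like pattern illustrated by the quoted examples.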
5. Discussion 
The present study on cohesion in EFL writing was grounded in the seminal framework of 
Halliday and Hasan (1976) presenting five types of cohesive means: employing referential 
devices, replacing words instead of repeating them, omitting words, linking text segments with 
connectives, and using different but semantically related words and collocations. Multiple 
research results have shown that such features facilitate reading comprehension and assist 
noticing relationships between the ideas and information presented in the text, and that writing 
quality is strongly related to the writer’s ability to produce cohesive text.  
The study aimed, firstly, to ascertain which cohesive devices Finnish upper-secondary 
school EFL students employed in their compositions and to analyse how these devices were associated 
with human-rated writing quality. The present findings aligned with the results attained among 
tertiary-level students (e.g., Crossley et al., 2016b; Kim & Crossley, 2018) in that features of 
paragraph-level cohesion were closely related to writing quality. In terms of the framework of 
Halliday and Hasan (1976), higher-scoring Finnish upper-secondary school writers employed a 
rich array of adverbs and adverbials as referential devices across paragraphs and linked 
sentences and paragraphs using a wide range of connectives, i.e., both younger and older EFL 
students tended to employ cohesive devices related to textual organization. In contrast to 
Crossley et al. (2016b), no connection was found between writing scores and the type-token 
ratios or the use of pronouns and determiners. The demonstratives are particularly problematic 
for Finnish EFL learners, as the Finnish language has no articles. However, the 
correlation between the incidence of articles and essay quality in the Finnish essays 
approached significance, suggesting gradual development towards using the right types of 
articles in the right places. Without digital analyses, such a subtle change would have been 
difficult to detect. 
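For readers unfamiliar with the indices mentioned in this paragraph, the following minimal sketch shows how a type-token ratio and an article incidence score can be computed for a text. It is an illustration under simplifying assumptions, not the TAACO implementation: the function name `lexical_indices`, the tokenisation rule, and the per-100-words normalisation are chosen here for clarity only.

```python
import re

# English articles counted by this illustrative helper.
ARTICLES = {"a", "an", "the"}


def lexical_indices(text):
    """Return a simple type-token ratio and articles per 100 words.

    Hypothetical helper; real tools lemmatise and normalise differently.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {"ttr": 0.0, "articles_per_100": 0.0}

    # Type-token ratio: distinct word forms divided by all word forms.
    ttr = len(set(tokens)) / len(tokens)
    n_articles = sum(1 for t in tokens if t in ARTICLES)
    return {"ttr": ttr, "articles_per_100": 100.0 * n_articles / len(tokens)}


idx = lexical_indices("The exam is hard but the exam is fair")
print(idx)
```

In this toy sentence, six distinct word forms occur among nine tokens, and two of the tokens are articles. Essay-level values of this kind are what would then be correlated with the holistic scores.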
In comparison with the CEFR descriptions for cohesion (Council of Europe, 2018a), 
approximately 60% of the essays had reached at least the level B2, which is the stage that 
Finnish EFL students are expected to reach in English by the national school-leaving 
examination. This means that over half of the writers structured their text in paragraphs, avoided 
errors disrupting readability, and at least most of the time employed contextually appropriate 
lexis. The writers at level C1 knew how to employ lexical cohesion and substitution, and also 
employed more referential devices and a wider range of connectives than the students at 
level B2. The students remaining at level B1 overused the conjunctions and/but and often forgot 
to structure the text in paragraphs, which diminished the clarity of the text, and moreover, wrote 
in a rather informal style.  
With regard to the second research question, recent studies suggest that real-time 
observations are fundamental for understanding EFL students’ writing processes, as they 
may assist teachers in introducing and instructing cohesion in the classroom (e.g., Bowen & 
Thomas, 2020). Following this line of research, the present study employed video-recordings to 
investigate how cohesion emerged during essay writing.  
The present observations concurred with previous results in that cohesion-building is 
closely connected to the writers’ linguistic resources. The recorded data demonstrated that using 
online reference sources facilitated cohesion-building on the condition that the writer possessed 
adequate lexical knowledge and digital skills to search for lexis in online sources. For example, 
the present analyses indicated that writers scoring less than 64% in the VLT did not benefit from 
consulting online dictionaries and other reference sources in cohesion-building. In contrast, the 
participants scoring 80% or more in the VLT seemed to master most of the topic-related words 
without having to search for them, but when they searched, they were able to choose the 
appropriate option and crosscheck the findings. Higher-scoring writers also managed to add, 
change, and delete words to avoid repetition, whereas lower-scoring writers had difficulty in 
finding alternative lexis even if they had a chance to search for it. Efficient consultations were 
particularly beneficial to finding semantically related lexis. In this respect, the results supported 
previous observations conducted among adult students and based on different methodologies 
(Bowen & Thomas, 2020; Abdel Latif, 2021; Lyashevskaya et al., 2021) and aligned with 
earlier findings on translation tasks (Mutta et al., 2014) as well as on indirect writing tasks 
(Niitemaa & Pietilä, 2018). 
6. Conclusion 
As regards limitations of the present study, the small sample size restricts generalizing the 
findings on the use of cohesive features in EFL essays. However, the present findings seem to 
concur with recent results on cohesion-building. In the case of the video-recordings, the sample 
was even smaller due to the scarcity of time and space in the computer room. Moreover, it might 
have been worthwhile to survey the students’ individual experiences of the benefits and 
problems of accessing online affordances when writing. In relation to pedagogical implications, 
the findings demonstrate, firstly, that EFL learners need to be taught which properties different 
online dictionaries and translation tools provide and how they function. Although EFL learners 
encounter English in multiple digital environments in their free time, they mostly participate in 
activities which do not require consulting online dictionaries (Niitemaa, 2020). Thus, EFL 
students also need opportunities to practise using online reference sources in the classroom. 
Furthermore, emphasizing the individual nature of the writing processes, the present 
observations point towards searching for the optimal ratio between automated and person-to-
person feedback.  
In the context of observational research, future studies could benefit from recent 
developments in keystroke logging such as GenoGraphiX-LOG 2.0 (www.ggxlog.net). This 
tool stores the whole writing process and total navigation as a log file, calculates the writing 
bursts, additions, insertions, and deletions, and moreover, provides graphs of the writer’s 
actions. For now, the tool has been used at the tertiary level to investigate, e.g., the types of 
collocations produced in different languages, and for feedback in consultations between the 
student and the teacher. Another area calling for future research is to examine more closely how 
lexical competence develops from recognition skills towards a large associative lexical network 
allowing productive use of lexis, e.g., for cohesion-building. In this regard, researchers could 
approach productive use of lexis from the perspective of lexical networks (Meara, 2009; 
Sigman & Cecchi, 2002), comparing differences between cohesion in texts written by learners 
with large but loosely organised lexicons and by learners with similar-sized but more densely 
organised lexicons.  
References 
Abdel Latif, M. M. (2021). Remodelling writers’ composing processes: Implications for 
writing assessment. Assessing Writing, 50, 1‒16. 
https://doi.org/10.1016/j.asw.2021.100547 
Bowen, N. E. J. A., & Thomas, N. (2020). Manipulating texture and cohesion in academic 
writing: A keystroke logging study. Journal of Second Language Writing, 50, 1‒15. 
https://doi.org/10.1016/j.jslw.2020.100773 
CamStudio (n.d.). https://camstudio.en.softonic.com/?ex=DSK-347.0  
Council of Europe (2018a). Common European Framework of Reference for languages: 
Learning, Teaching, Assessment. Companion volume with new descriptors. Language 
Policy Programme. Education Policy Division Education Department. 
https://rm.coe.int/cefr-companion-volume-with-new-descriptors-2018/1680787989 
Council of Europe (2018b). Descriptors of competences for democratic culture. Reference 
Framework of Competences for Democratic Culture. Vol. 2.  
https://www.coe.int/en/web/campaign-free-to-speak-safe-to-learn/reference-framework-of-competences-for-democratic-culture 
Crossley, S. A., Kyle, K., & McNamara, D. S. (2016b). The development and use of cohesive devices 
in L2 writing and their relations to judgments of essay quality. Journal of Second 
Language Writing, 32, 1–16. https://doi.org/10.1016/j.jslw.2016.01.003 
Crossley, S., & McNamara, D. (2016). Say more and be more coherent: How text elaboration 
and cohesion can increase writing quality. Journal of Writing Research, 7, 351‒370. 
https://doi.org/10.17239/jowr-2016.07.03.02 
Crossley, S.A., Kyle, K., & Dascalu, M. (2019). The tool for the automatic analysis of cohesion 
2.0: Integrating semantic similarity and text overlap. Behavior Research Methods, 51, 
14‒27. https://doi.org/10.3758/s13428-018-1142-4 
Crossley, S.A., Kyle, K., & McNamara, D. (2016a). The tool for the automatic analysis of text 
cohesion (TAACO): Automatic assessment of local, global, and text cohesion. Behavior 
Research Methods, 48, 1227‒1237. https://doi.org/10.3758/s13428-015-0651-7 
Crossley, S.A., Salsbury, T., & McNamara, D.S. (2010). The development of semantic relations 
in second language speakers: A case for Latent Semantic Analysis. Vigo International 
Journal of Applied Linguistics, 7, 55‒74. 
Halliday, M., & Hasan, R. (1976). Cohesion in English. Longman. 
Kim, M., & Crossley, S.A. (2018). Modeling second language writing quality: A structural 
equation investigation of lexical, syntactic, and cohesive features in source-based and 
independent writing. Assessing Writing, 37, 39‒56. 
https://doi.org/10.1016/j.asw.2018.03.002 
Larson-Hall, J. (2016). A guide to doing statistics in second language research using SPSS 
and R (2nd ed.). Routledge. 
Lintunen, P., Mutta, M., & Peltonen, P. (2020). Fluency in L2 learning and use. Multilingual 
Matters. https://doi.org/10.21832/9781788926317 
Lyashevskaya, O., Panteleeva, I., & Vinogradova, O. (2021). Automated assessment of learner 
text complexity. Assessing Writing, 49, 1‒16. https://doi.org/10.1016/j.asw.2021.100529 
McNamara, D., Kintsch, E., Songer, N., & Kintsch, W. (1996). Are good texts always better? 
Interactions of text coherence, background knowledge, and levels of understanding in 
learning from text. Cognition and Instruction, 14, 1‒43. 
http://www.jstor.org/stable/3233687 
Meara, P. (2009). Connected words: Word associations and second language vocabulary 
acquisition. John Benjamins. 
Mutta, M., Pelttari, S., Salmi, L., Chevalier, A., & Johansson, M. (2014). Digital literacy in 
academic language learning contexts: Developing information-seeking competence. In J. 
Pettes Guikema & L. Williams (Eds.), Digital literacies in foreign and second language 
education. CALICO Monograph Series, Vol. 12. Texas State University: Computer 
Assisted Language Instruction Consortium (CALICO), 227‒244. 
https://www.researchgate.net/publication/279840004_Digital_Literacy_in_Academic_Language_Learning_Contexts_Developing_Information-Seeking_Competence 
Nation, P. (1983). Testing and teaching vocabulary. Guidelines, 5, 12‒25. 
Niitemaa, M. L., & Pietilä, P. (2018). Vocabulary skills and online dictionaries: A study on EFL 
learners' receptive vocabulary knowledge and success in searching electronic sources for 
information. Journal of Language Teaching and Research, 9(3), 453‒462. 
Niitemaa, M. L. (2020). Informal acquisition of L2 English vocabulary. Exploring the 
relationship between online out-of-school exposure and words at different frequency 
levels. Nordic Journal of Digital Literacy, 2, 86‒105. https://doi.org/10.18261/issn.1891-
943x-2020-02-02 
Peters, E. (2018). The effect of out-of-class exposure to English language media on learners’ 
vocabulary knowledge. ITL International Journal of Applied Linguistics, 169, 142−167. 
https://doi.org/10.1075/itl.00010.pet 
Plonsky, L., & Oswald, F. (2014). How big is “big”? Interpreting effect sizes in L2 
research. Language Learning, 64, 878–912. https://doi.org/10.1111/lang.12079 
Read, J. (2000). Assessing vocabulary. Cambridge University Press. 
Ryshina-Pankova, M. (2015). A meaning-based approach to the study of complexity in L2 
writing: The case of grammatical metaphor. Journal of Second Language Writing, 29, 
51−63. http://dx.doi.org/10.1016/j.jslw.2015.06.005 
Schmitt, N., Schmitt, D., & Clapham, C. (2001). Developing and exploring the behaviour of two 
new versions of the vocabulary levels test. Language Testing, 18, 55‒88. 
https://doi.org/10.1191/026553201668475857 
Sigman, M., & Cecchi, G.A. (2002). Global organisation of lexicon. PNAS, 99, 1742‒1747. 
https://langev.com/author/msigman. 
TAACO. https://www.linguisticanalysistools.org/taaco.html 
Usoof, H., Leblay, C., & Caporossi, G. (2020). GenoGraphiX-Log version 2.0 user guide. Les 
Cahiers Du GERAD, 2020 68, 1-63. https://www.gerad.ca/en/papers/G-2020-68