AI-driven computational study of Atomic Layer Deposition growth of gallium nitride Department of Mechanical and Materials engineering Master's thesis Author: Elli Virtanen 26.8.2025 Turku The originality of this thesis has been checked in accordance with the University of Turku quality assurance system using the Turnitin Originality Check service. Master's thesis Subject: Materials engineering Author: Elli Virtanen Title: AI-driven computational study of Atomic Layer Deposition growth of gallium nitride Supervisor: Assoc. Prof. Milica Todorović Number of pages: 76 Date: 26.8.2025 Semiconductors are a class of materials used in numerous fields in modern technology. Some of the applications of semiconductors include power and optoelectronics, renewable energy systems and electric vehicles. Power devices are used for conversion and control of electrical power, which is more and more relevant with the rise of electrification, and increased utilization of renewable energy sources. As a more efficient alternative to traditional Si-based semiconductor devices, wide band gap semiconductor materials are offering great promise due to their higher power levels, frequencies, and operating temperatures. Gallium nitride (GaN) is a such wide band gap semiconductor material, used for applications in optoelectronics and power electronics. To ensure sufficient quality and performance, these high-power and high-temperature devices require manufacturing processes to be controllable in the atomic scale, which can be achieved by depositing thin films one atom layer at a time via a process called atomic layer deposition (ALD). In ALD, small molecules called precursors are used to facilitate surface growth on a substrate via sequential adsorption on the surface sites. Optimizing the ALD process requires fundamental knowledge of the chemical interactions and spatial arrangements taking place between the precursor and surface during adsorption. There can be numerous possible precursors and surface reactions for any given material, which is why utilizing computational methods, such as density functional theory (DFT), and Bayesian optimization machine learning, is a viable tool to accelerate the research and optimize the deposition process. The focus of this work was to study the adsorption of a single ALD precursor molecule gallium trichloride (GaCl3) onto a GaN(0001) surface, using DFT to determine the adsorption energies of different molecule-surface configurations, combined with a Bayesian optimization structure search (BOSS), a machine learning method. The function of BOSS was to sample the configurational structure space, and identify the molecule configurations that minimize the adsorption energy of the system. These minimum energy structures were then further analysed with computational methods to determine their electronic properties and adsorption mechanism. The BOSS study found five unique local minima configurations for the molecule-surface system in the desired ALD temperature range. In one of the five cases the adsorbate remained as GaCl3, while in the rest of the four cases the precursor dissociated into fragments. The global minimum configuration consisted of a GaCl fragment adsorbed on the nitrogen top site of the GaN(0001) surface, with two Cl atoms dissociated to separate gallium top and bridge sites. The global minimum structure had an adsorption energy of -7.24 eV, indicating the adsorption was very energetically favourable. The adsorption energy values for the other four local minima ranged from -6.84 to -6.18 eV. The bonding between the adsorbate and surface was further analysed with atom-projected density of states (pDOS), and Mulliken partial charges. pDOS indicated that GaCl3 will chemisorb to the GaN(0001) surface, which is also suggested by the large negative magnitude of the adsorption energy values. Mulliken analysis indicated that there is significant charge transfer between the adsorbed molecule and the surface, but further research is needed to locate the exact surface atoms participating in the bonding. Key words: gallium nitride, gallium trichloride, atomic layer deposition, density functional theory, Bayesian optimization Diplomityö Tutkinto-ohjelma, oppiaine: Materiaalitekniikka Tekijä: Elli Virtanen Otsikko: AI-driven computational study of Atomic Layer Deposition growth of gallium nitride Ohjaaja: Apulaisprof. Milica Todorović Sivumäärä: 76 Päivämäärä: 26.8.2025 Puolijohteet ovat luokka materiaaleja, joita käytetään lukuisissa nykyteknologian sovelluksissa, kuten opto- ja tehoelektroniikassa, uusiutuvan energian teknologioissa ja sähköajoneuvoissa. Tehoelektroniikalla on tärkeä rooli sähkötehon muuntamisessa ja kontrolloimisessa, mikä on erityisen tärkeää sähköistämisen ja uusiutuvien energialähteiden käytön yleistyessä yhteiskunnassa. Leveän energiavyön puolijohteet ovat tehokkaampi vaihtoehto perinteisille piipohjaisille tehoelektroniikkalaitteille niiden teho-, taajuus- ja lämpöominaisuuksien vuoksi. Galliumnitridi (GaN) on leveän energiavyön puolijohteisiin kuuluva materiaali, jota käytetään muun muassa opto- ja tehoelektroniikan sovelluksissa. Yhä haastavampien käyttöolosuhteiden vuoksi tehoelektroniikassa käytettävien puolijohdekomponenttien on oltava laadultaan ja ominaisuuksiltaan optimoituja, minkä vuoksi niiden valmistusmenetelmien on oltava hallittavissa atomitasolta lähtien. Yksi tällainen menetelmä puolijohdekalvojen valmistukseen on atomikerroskasvatus (atomic layer deposition, ALD), jossa materiaalia kerrostetaan yksi atomikerros kerrallaan halutun paksuuden saavuttamiseksi. ALD- prosessissa hyödynnetään lähtöaineina pienmolekyylejä, jotka adsorptoituvat halutulle pinnalle. Yksityiskohtainen käsitys lähtöainemolekyylin ja pinnan välisistä kemiallisista vuorovaikutuksista ja niiden avaruudellisesta asettumisesta on välttämätöntä ALD-prosessin optimoinnissa. Mahdollisten eri lähtöaineiden ja niiden pintareaktioiden suuresta määrästä johtuen laskennallisten menetelmien, kuten tiheysfunktionaaliteorian (density functional theory, DFT) ja koneoppimismetodeihin kuuluvan Bayesialaisen optimoinnin, käyttöönotto on hyödyllinen ratkaisu kokeellisten menetelmien lisäksi tutkimuksen nopeuttamiseksi ja kerrostumisprosessin optimoimiseksi. Tämän opinnäytetyön tarkoituksena oli tutkia yksittäisen ALD-lähtöainemolekyylin, galliumtrikloridin (GaCl3), adsorptiota GaN(0001)-pinnalle käyttämällä tiheysfunktionaaliteoriaa eri molekyyli-pinta- konfiguraatioiden adsorptioenergioiden määrittämiseen, sekä Bayesialaista optimointia hyödyntävää BOSS-koneoppimismenetelmää adsorptionenergian minimoivien konfiguraatioiden löytämiseen. Löydettyjä minimienergiarakenteita analysoitiin lopuksi laskennallisilla menetelmillä niiden elektronisten ominaisuuksien ja adsorptiokäyttäytymisen määrittämiseksi. Tärkeimpinä tämän työn tuloksina BOSS-menetelmän avulla löydettiin viisi ainutlaatuista lokaalia minimikonfiguraatiota molekyyli-pinta-systeemille tarkastellulla ALD-lämpötila-alueella. GaCl3 dissosioitui adsorption seurauksena neljässä tapauksessa viidestä minimikonfiguraatiosta. Globaali minimikonfiguraatio koostui typpiatomin kohdalle adsorptoituneesta GaCl-fragmentista, sekä kahdesta dissosioituneesta klooriatomista galliumatomin ja galliumsillan kohdalla. Tämän minimikonfiguraation adsorptioenergia oli -7.24 eV. Neljän muun havaitun minimin adsorptioenergiat olivat -6.84, -6.82, -6.45 ja -6.18 eV. Sidosmuodostusta ja varauksen jakautumista adsorptoituneiden fragmenttien ja pinnan välillä tutkittiin atomiprojektoidun tilatiheyden, sekä Mullikenin osittaisvarausanalyysin avulla. Tilatiheyksien perusteella GaCl3 sitoutui GaN(0001) pinnalle kemisorption kautta, ja osittaisvaraukset osoittivat, että pinnan atomit luovuttavat varausta adsorptoituneelle molekyylille, mutta varauksensiirron tarkemmat yksityiskohdat vaativat vielä lisää tutkimusta. Avainsanat: galliumnitridi, galliumtrikloridi, atomikerroskasvatus, tiheysfunktionaaliteoria, Bayesialainen optimointi Table of contents 1 Introduction 9 1.1 Research objectives 13 1.2 Thesis structure 14 2 Theoretical background 15 2.1 Gallium nitride 15 2.1.1 Properties and applications 15 2.1.2 Structure of GaN 16 2.2 Atomic Layer Deposition 18 2.3 Principles of density functional theory 21 2.4 Computational modelling of surface adsorption 26 2.4.1 Model preparation 26 2.4.2 Adsorption energy 28 2.5 Bayesian Optimization Structure Search (BOSS) 30 2.5.1 Bayesian optimization (BO) 30 2.5.2 Gaussian process (GP) 30 3 Methodology 34 3.1 Construction of DFT simulation models 34 3.1.1 Adsorbate molecule (GaCl3) 34 3.1.2 Semiconductor bulk (GaN) and surface (GaN(0001)) 35 3.2 BOSS model 36 3.2.1 Working principle of BOSS in practice 36 3.2.2 BOSS parameters 38 3.2.3 BOSS models 38 3.3 DFT calculations for electronic structure analysis 39 4 Results 41 4.1 Simulation model validation and convergence tests 41 4.1.1 Precursor adsorbate molecule (GaCl3) 41 4.1.2 Bulk semiconductor (GaN) 43 4.1.3 Semiconductor surface (GaN(0001)) 45 4.2 BOSS preliminary studies 46 4.3 6D BOSS simulation 49 4.4 Found local minima structures 55 4.4.1 Local minima before relaxation 55 4.4.2 Local minima after relaxation 57 4.5 Electronic structure properties of local minima 62 4.5.1 Projected density of states 62 4.5.2 Mulliken charges 66 5 Conclusions 69 References 72 Appendix 76 Acknowledgements I want to sincerely thank my supervisor Assoc. Prof. Milica Todorović for patiently providing guidance and feedback through my extensive thesis journey. She has shared with me so many valuable insights and knowledge about the realm of materials science, and I am grateful to have learned so much during my thesis project. I also want to thank Dr. Ransell D’Souza for always happily helping me navigate the technical side of BOSS, as well as my friends and family for their endless support and interest in my work. Abbreviations AES adsorption energy surface ALD atomic layer deposition BB building block BL bilayer BO Bayesian optimization BOSS Bayesian Optimization Structure Search CB conduction band DB dangling bond DFT density functional theory DOF degree of freedom DOS density of states eLCB exploratory lower confidence bound ECR electron counting rule GaN gallium nitride GGA generalized gradient approximation GP Gaussian process LDA local density approximation LM local minima ML machine learning PBC periodic boundary condition PBE Perdew Burke Ernzerhof PES potential energy surface PH pseudohydrogen SCF self-consistent field VB valence band vdW van der Waals WBG wide band gap XC exchange correlation 9 1 Introduction Semiconductors are a class of materials that have significant importance in modern technology. Some of the numerous fields for applications of semiconductors include power and optoelectronics, renewable energy systems and electric vehicles. Notably, semiconductors are used in the field of power electronics to convert electrical energy from one form to another, and to control the flow of electric power. These so-called semiconductor power devices include diodes, transistors and thyristors, which utilize the semiconducting materials ability to work as a switch, by controlling the flow of electricity within the power device. The need for efficient conversion and control of electrical power is essential in modern society. The global electricity consumption is continuously increasing, with demand rising by 4.3 % in 2024, and the trend predicted to continue similarly for the next few years. This growth is facilitated by the growing demand of industrial production, especially in emerging economies, electrification, for which one example is the increased use of electric vehicles, and the global rise of data centers [1]. Additionally, increased use of renewable energy technologies also poses challenges for efficient grid integration and electricity conversion, as nature-derived sources of power, such as solar and wind tend to generate fluctuating electric power flows [2]. Both the rise in electrification and integration of renewable energy sources into the grid produces the crucial need to efficiently manage the large electric power flows from the varying generation sources to users, while optimizing efficiency [3]. Semiconductor power devices have integral role in controlling these flows of electric power, as their purpose is to provide the optimum amount and form of electric power for given application [4], [5]. The band gap (Eg) is an important concept in semiconductor theory, as it affects the conductive properties of a material. Band gap describes the energy difference between the valence band (VB) maximum and conduction band (CB) minimum of a material, defined by its electronic structure. VB corresponds to the occupied electron states involved in covalent bonding within the material, while CB represents free electron states. Eg corresponds to the energy required to promote an electron from the VB to the CB, a process which allows for the material to conduct electrical current through the free flow of electrons in the CB. [6] According to the band gap, materials can be classified into three types; conductors, semiconductors and insulators. Conductors don’t have an energy gap between the CB and VB, and thus electrons can move freely to the conduction band. Insulators on the other hand have a large band gap, typically above 3 eV [7], which usually prevents the electrons from 10 crossing the gap. Semiconductors are intermediate materials, with conductive properties between insulators and conductors. They have a narrow band gap, typically around 0-4 eV [7]. The key advantage of semiconductors for many applications is that their properties, including electrical conductivity, can be tuned by optimizing the band gap for specific application. This tuning enables the use of semiconductors in varying operating conditions and advanced device compositions. Band gap can be modified either by heating, doping or by applying light or magnetic field [8]. Doping is the process of introducing trace amounts of other elements with different number of valence electrons to the semiconductor lattice. Doping alters the electronic properties of the material, as it either increases the electrons (n- doping) or holes (p-doping) [6]. Semiconductors can be classified as elemental or compound semiconductors. The former consists of a single element, while the latter is formed by two or more elements. Elemental semiconductors, composed of group IV elements silicon (Si) or germanium (Ge), were so called first-generation semiconductors. The second generation of semiconductors includes compound semiconductors, formed typically from elements of groups III and V, or II and VI, such as gallium arsenide (GaAs). [9] The third and most recent generation of semiconductors are the wide band gap (WBG) semiconductors, which include silicon carbide (SiC) and gallium nitride (GaN) [2]. WBG semiconductors are defined to have Eg larger than 3 eV [10]. This thesis focuses on GaN, which properties are described in more detail the Chapter 2.1. Silicon has a Eg of 1.1 eV [2], and it has been the most widely used semiconductor in numerous applications due to its availability and refined manufacturing processes [5]. However, despite the development of practically defect-free manufacturing processes, silicon- based semiconductor devices have in some areas met their physical limits [4], [11]. Due to the modern technological requirements of higher power levels, frequencies and operating temperatures, improved efficiency [11], as well as decreased size for the devices, intrinsic properties of silicon are not simply able to match these requirements. WBG materials demonstrate improved potential as an alternative to Si-devices because their wider band gap enables operation in higher temperatures, voltages and switching frequencies. The energy efficiency and power density of WBG devices is also higher than their Si counterparts, which in theory improves the overall sustainability of these devices, due to the reduced power losses and material consumption [2], [4]. However, currently the wide-scale production of WBG semiconductor devices is still very energy-consuming, which stems from the extraction and 11 purification of the elemental raw materials, as well as manufacturing processes that utilize high temperature epitaxial growth technologies [2]. The growing need for a better performance in novel semiconductors requires sophisticated production methods, with the possibility to tune the material properties based on the desired application requirements. A constant trend in semiconductor technology is to increase the power density of devices, and decrease their physical volume [4], [12]. This means the manufacturing methods should operate in a very small scale to provide extreme conformality control for the semiconductor material. Semiconductor thin films can be manufactured from bottom-up, defined to be 1 μm or less in thickness and operating in the nanometer scale. In addition to the structural controllability, the ideal manufacturing process for semiconductors should aim to minimize costs, time and energy consumption, as well as the environmental impact. Atomic layer deposition (ALD) is a one promising method for production of next-generation thin films, which is also able to meet the quality standards of the desired end products [13]. In the ALD process, the target material can be produced from precursors, typically small molecules, which are sequentially injected into a reaction chamber, where they react with available surface groups of a substrate to form one atom-layer thick films. The growth of the material is induced through the self-limiting adsorption of the precursor molecules to the current top layer of the thin film, after which the unwanted byproducts are removed from the chamber. This leaves behind a single- or multi-element thin film. [14] More in-depth description of the ALD process is presented in Chapter 2.2. One notable challenge for ALD stems from the fact that there can be numerous possible precursor combinations and process conditions to evaluate for suitability in regards to the production for given objective material. Therefore, optimizing the experimental ALD process can be time consuming, expensive and unsustainable, and in some cases extremely complex. Computational approaches and modelling are a viable additional method to use in accelerating the research of different ALD conditions, as they can provide a more systematic way to quantify and identify the underlying chemical phenomena and corresponding surface interactions happening during the ALD process. Modelling can also be used to optimize the process conditions, as the experimental process conditions can sometimes be outside the actual global optimum. Additional benefits of a computational approach include being less time-consuming and expensive than experiments. [15] 12 A concept relating to the quantitative study of materials, is the structure-property relationship, which represents the relation between the physical properties of a material with its structure. These properties can be explained by the atomic structure, including the electron configuration describing the arrangements of the electrons for a specific material. Compared to experiments, systematic study of the structure-property relationships can be performed with computational methods. Due to developments in recent decades, quantum mechanical methods like density functional theory (DFT) have become an accurate and efficient tool in the research of complex and increasingly larger systems [13]. DFT simulations operate in the quantum mechanical scale, which means the ability to describe the behavior and properties of small particles like atoms and electrons. Therefore, DFT can be used determine the atomic and electronic structure of a material and the properties derived from them. The principles of DFT are presented in Chapter 2.3. In recent years, increasing focus has also been shifted to combining computational simulations with artificial intelligence (AI) and machine learning (ML) methods in materials science research. The accuracy and relevance of atomistic simulation results rely on chemical intuition and previous experimental findings, as the quality of the outputs correlates to the accuracy of the inputs, and is therefore somewhat dependent on the current state of knowledge. To fill these gaps in the existing knowledge, ML methods can be implemented along atomistic simulations to discover new patterns and insights form existing data. This in turn can help steer the research in previously unknown directions with more information gain. Additionally, ML methods can accelerate the study of complex and multidimensional systems, which would be time-consuming and computationally expensive to systematically analyze with atomistic simulations alone. This approach provides a way to combine the accuracy of atomistic computer simulations with the high-throughput ML methods, which can shift focus on relevant areas of research, and find patterns on smaller amount of data. This means the number of new costly atomistic computations needed for thorough understanding of material properties can be decreased. ML methods can be used for example to find out the optimal parameter set for experiments, or experimental settings with the maximum information gain [16], as well as for predicting materials properties according to the structure- property relationship [17], and discovering and designing of new materials [18]. 13 1.1 Research objectives The objective of my thesis was to study the adsorption of a single ALD (atomic layer deposition) precursor molecule GaCl3 onto a GaN(0001) semiconductor surface with computational methods, to gain insight of the possible adsorption configurations, chemical and physical behavior, and energetics of the system during the ALD process. The research objectives were determined as follows: 1. Building and validating simulation models for the GaCl3 molecule in vacuum, and the GaN(0001) surface slab, accurately representing their geometry, as well as their physical and chemical behavior. 2. Utilizing AI-driven computational structure search algorithm (BOSS) together with density functional theory (DFT) simulations to study the adsorption energy of different GaCl3 molecule configurations on the GaN(0001) surface, and creating the adsorption energy landscape based on the adsorption energies obtained with DFT. 3. Analyzing the local minima structures, obtained from the adsorption energy landscape, to identify the optimal adsorption configurations for the GaCl3-GaN system as well as gain understanding of the general adsorption characteristics of the pairing. 14 1.2 Thesis structure In this thesis, I describe the necessary steps to be taken into account when performing a computational simulation project for surface adsorption. For motivation and background, I introduced the importance of semiconductors in modern society, and the novel use of computational simulations coupled with artificial intelligence in materials science, and especially ALD research in Chapter 1. The second chapter includes general background information about gallium nitride and ALD, as well as the theoretical concepts and computational methods relevant in my work, including density functional theory, Bayesian optimization and Gaussian process. Additionally, Chapter 2 includes introduction to the topic of computational surface adsorption studies. I describe the computational details and methodology used for the DFT simulations in this work in Chapter 3. Additionally, this chapter includes descriptions of the methodology used for the computational structure search part of this thesis, Bayesian optimization structure search, comprising of low-dimensional preliminary structure search tests, and the full structure search, for the search of local minima adsorption configurations. Chapter 3 lastly includes details about the calculations used to post-process and analyze the found minima. Chapter 4 presents the results of my work, including the minima structures identified with the full structure search, and description of their mechanical and electronic properties. This chapter also includes details about the typical model validation and convergence tests for individual molecules, bulk structures, and surfaces, which are overall important in computational work regarding surface adsorption. Lastly, in Chapter 5 I provide some discussion about the results and conclusions drawn from the adsorption behavior of these found local minima, as well as some possible future steps for research in this area. 15 2 Theoretical background 2.1 Gallium nitride 2.1.1 Properties and applications Gallium nitride (GaN) is a wide band gap semiconductor, with an experimental band gap of 3.4 eV [19], suitable for various optoelectronic and electronic applications. As discussed in Chapter 1, WBG semiconductors are demonstrating great promise as alternatives for traditional Si-semiconductors. GaN is famously used in light emitting diodes (LEDs), due to the large and direct band gap. Doped GaN can be produced by introducing either Si or Mg into the structure to create n- and p-doped GaN, respectively. A GaN LED is typically constructed by placing p-type GaN and n-type GaN between two electrodes. This leads to the formation of p-n homojunction, where holes and electrons flowing from their respective electrodes recombine, to create light. [20] GaN semiconductors are also attractive candidates for applications requiring high operating temperatures, including high-temperature and high-power electronics, due to the beneficial thermal properties of GaN, as the thermal conductivity of GaN facilitates heat dissipation in devices. [19], [21], [22]. The high operating voltages and frequencies of GaN have also allowed it to be used in solar inverters and wind turbine power electronics, to enable efficient power conversion and grid integration, consequently improving their output and easing the use of renewable energy [2]. WBG semiconductor materials, such as GaN, attract a lot of interest, due to their inherent properties outperforming pure silicon’s physical limits. For example, GaN-based switching power semiconductor devices can be used as a more efficient alternative to traditional silicone power converters. However currently, GaN power devices can reach their full potential only in theory, as fabrication methods for GaN films and standardized structure architecture for GaN devices are still in need for development and further research [4], [11]. 16 2.1.2 Structure of GaN Exact knowledge and understanding of the GaN structure is important when developing atom- scale manufacturing processes and device structures. Bulk GaN can be found as hexagonal wurtzite and cubic zincblende structures [22]. In ambient conditions the thermodynamically stable form of GaN is the wurtzite structure, which is why it is primarily used in applications compared to the less stable form zincblende [6], [23]. Wurtzite GaN has a hexagonal lattice, with lattice vectors a and c, and crystallographic parameter u describing the structure (Figure 1). The parameter u describes the relative distances of Ga and N layers in the [0001] direction, and in ideal wurtzite GaN, u would be a constant value of 0.375 (3/8). The value for u can be calculated as the Ga-N bond length divided by the c vector length. [24], [25], [26] Wurtzite GaN unit cell contains two gallium and two nitrogen atoms stacked in c-axis direction, forming one GaN bilayer (BL) (Figure 2a) [24], [27]. The alternating Ga and N layers are close packed in an ABABAB sequence, characteristic for wurtzite structure [21]. Figure 1. The crystallographic parameters for GaN unit cell. a) Hexagonal lattice with lattice vectors a and c represented as red arrows. b) Primitive wurtzite GaN cell, with 2 Ga atoms and 2 N atoms, forming one bilayer (BL). Lattice parameters a, c and u describe the full structure of the GaN cell. Ga- Ga distance is depicted by lattice vector a, while c describes the height of the bilayers in total. Ga atoms are represented as brown, and N as blue. Adapted from [4] (a) and [25] (b). Wurtzite GaN surface structures can be divided into three categories: polar, non-polar and semi-polar based on the crystallographic orientation of the surface plane. The crystal polarity is caused by the lack of inversion plane perpendicular to the c-axis. This asymmetry of the wurtzite GaN surfaces results in dangling bonds (DB) at the surface atoms. [24] DBs in solids are caused by unsatisfied valence due to unpaired electrons [28]. The DBs affect the electronic properties and reactivity of the surface, as they result in available surface states. 17 The polar surface orientations of GaN are GaN(0001) and GaN(0001̅) (Figure 2a) [24]. GaN(0001) is Ga-polar, meaning that the surface plane is terminated by top atom layer of Ga, while GaN(0001̅) is N-polar, and thus is terminated by nitrogen atoms. The GaN(0001) surface is currently the most used form of GaN in electronic applications [24], [29], which is why it is focused on in this thesis. Despite the similar terminology, the crystal polarity does not describe the polarity caused by the distribution of electric charges in the GaN lattice. Due to the electronegativity difference in gallium and nitrogen atoms, the bonds between Ga and N are slightly ionic, and the electrons tend to locate closer to the nitrogen. This causes a surface normal static dipole moment in polar GaN surfaces. In the bulk these dipoles are cancelled, but result in spontaneous polarization at the asymmetric cut surface. When performing atomistic simulations, the polarity of the material has to be taken into account, which will be further discussed in Chapter 2.4.1. Another form of polarization, piezoelectric polarization, also occurs in GaN, caused by distortion in the unit cell due to mechanical strain. However, piezoelectric effect typically only affects heterostructures and interfaces, and spontaneous polarization is the more noteworthy phenomenon in surface science studies. The electronic characteristics of the surface are affected by the polarity, and therefore the intrinsic surface polarization of GaN provides high conductivity for the material, which is beneficial for many of its electronic applications. [4], [21], [24], [27], [30] When studying the ALD process between GaCl3 and GaN(0001), it is important to know what are the possible locations on the surface where adsorption is most likely to occur. GaN(0001) has four potential high symmetry surface adsorption sites for the adsorbate, depicted in Figure 2b. They are T4 (N top) site, H3 (hollow) site, T1 (Ga top) site, and Br (bridge) site. In T4 site, the adsorbate is located on top of the N atom underneath the top Ga layer, while simultaneously bonded to three Ga atoms on the top layer. Adsorbate in H3 site is similarly bonded to three Ga atoms, but is situated in the hollow site, where the underlaying site is vacant. In T1 site, the adsorbate is situated directly on top of the surface Ga atom. Lastly in the bridge site, the adsorbate is bonded to two Ga surface atoms and is located in the bridge region between them. [24] 18 Figure 2. Surface structure for GaN. a) Side view of the ideally terminated GaN with three bilayers. Ga-polar face is in the [0001] direction, while N-polar face is in the opposite [0001̅] direction. b) Top- down view of the GaN(0001) adsorption sites; T1 is the Ga-top site, T4 is the N-top site, H3 is the hollow site, and Br is the bridge site between two Ga atoms. The red rhombus indicates the primitive 1x1 GaN(0001) surface unit cell. Adapted from [24]. 2.2 Atomic Layer Deposition Modern semiconductor devices rely on high-quality thin-films, which require precise and tuneable fabrication methods. As described in Chapter 1, Atomic Layer Deposition (ALD) is such method that offers great promise for WBG semiconductor manufacturing. ALD is a method of depositing nanometer scale thin films on a substrate surface, one atomic layer at a time. The basic principles of ALD process were independently discovered twice. Firstly, the so-called molecular layering method was the earliest form of ALD, developed in the 1960s in the Soviet Union by Professors Valentin Aleskovskii and Stanislav Koltsov [31]. Another basis for ALD method was proposed by Dr. Tuomo Suntola in the 1970s in Finland, by the name atomic layer epitaxy (ALE) [32]. Nowadays modern ALD is utilized to produce thin films for applications including microelectronics, photovoltaics, semiconductor devices and catalysis [14], [33]. The ALD thin film is formed from chemical reactants, small molecules also known as precursors, which are deposited to a substrate surface. In the ALD process (Figure 3), gaseous precursor molecules are sequentially introduced, or pulsed, into a reaction chamber in alternating fashion. The temperature and pressure conditions at the chamber are set to enable chemical reactions between the precursor and the substrate surface. Firstly, the precursor molecules react with the available surface groups on the substrate, leading to a one monolayer 19 of the adsorbed precursor. When all the available surface groups on the substrate are populated with the adsorbed precursor, they can no longer react with additional precursor still in the chamber. Because the surface reactions would naturally end at this step, ALD is fundamentally self-limiting, which enables structure control in atomic scale. In the next step of the ALD cycle, the excess precursor and possible reaction by-products are removed, or purged, from the reaction chamber with an inert purging or carrier gas. Following the purge step, a second precursor can be injected into the chamber, after which it can interact with the available surface groups of precursor 1. What follows is another purge, to once again remove excess precursor and by-products from the chamber. The pulsing and purging steps together form one full ALD cycle. This ALD cycle can be repeated until the desired film thickness has been achieved. The number of different precursors in the ALD process can be only one or several, depending on the material of the desired thin film. [14], [33], [34] The focus of my thesis was to study the adsorption of a single precursor, GaCl3, to gain understanding of that specific step of the full ALD cycle for GaN(0001) fabrication. Figure 3. Representation of the ALD process with two fictional precursor compounds. Precursor 1 is represented as blue-green, while precursor 2 as red-orange. The goal material is double layer thin film with alternating layers of two elemental species (blue and red). a) In the first stage of the ALD process, precursor 1 is pulsed to the reaction chamber, where it reacts with the substrate surface. b) Excess precursor and reaction by-products are pulsed out of the chamber in the second stage. c-d) These two steps are repeated with precursor 2. The full cycle is repeated until desired film thickness is achieved. Adapted from [14]. The precursor molecules can be divided into two classes: non-metals and metals. Hydride molecules. e.g. H2O, H2S and NH3, are typically used as non-metal precursors, while metal 20 halides, e.g. chlorides, have been used as metal precursors in ALD production of oxide, sulfide and nitride films [33]. Today, most commonly used ALD types are thermal ALD, where the thermal energy of the chamber activates the surface reactions of precursors, and plasma-enhanced ALD (PEALD) [14]. In PEALD the reactants are activated by the introduction of plasma to the reaction chamber. This creates highly reactive radicals and ions, facilitating surface growth [35]. The PEALD method is used especially for production of metal and nitride thin films, as the surface reactions may not be activated by thermal energy alone [33]. The advantages of ALD in comparison to other deposition methods, such as chemical and physical vapor deposition, are that ALD produces high-quality films due to its stoichiometric and conformality control caused by the self-limiting nature, and it operates in relatively low temperatures compared to other thin film deposition methods. However, disadvantages of ALD include slow process rate, because the chemical surface reactions have to have enough time to take place, and high energy and chemical waste rates, because precursor chemicals have to be introduced to the chamber in excess. Approximately 60% of the precursor initially pulsed to the chamber is wasted in the ALD process. [14] Another advantage of ALD is the tunability of the process, stemming from wide variety of existing and possible precursors, with many potential reaction pathways and complex surface reactions [13]. This aspect of ALD also poses challenges for development of the ALD procedures for growth of novel materials, because there can be numerous possible precursor combinations and process conditions to evaluate for suitability in regards to the production for given objective material. For these reasons, optimizing the ALD process would require a lot of resources and time. Computational methods like simulations are an important tool in ALD research, as their use can significantly reduce the materials cost. Computational modelling is also an effective way to systematically study large amount of precursor-surface combinations, and identify their reaction behavior and pathways, and whether the pairing is beneficial or not for the material fabrication. [13], [14] In previous studies, GaN has been produced with ALD from precursors GaCl3 and NH3 [36], [37] with GaCl3 acting as the Ga-source, and NH3 as the N-source for GaN. These studies utilized thermal ALD, with the reaction temperature between 500-700 ºC. In gas phase, GaCl3 in monomer form is a trigonal planar molecule [38]. GaCl3 has good thermal stability, which allows larger ALD temperature windows, and it is relatively volatile [36]. The studies 21 for this specific pairing of GaCl3 precursor for GaN thin film fabrication material are however very sparse, and the exact surface adsorption mechanism is not well documented. The goal of this thesis is to systematically analyze the adsorption configurations and behavior of GaCl3 vial computational methods, and fill in these gaps in the knowledge. 2.3 Principles of density functional theory Density functional theory (DFT) is a computational modelling method based on quantum mechanics that is used to determine the ground state electronic structure for various materials. These materials, such as solids and molecules can be modelled as many-body systems, consisting of multiple interacting particles: atoms and electrons. The defining aspect of DFT is that it utilizes the electron density of the model to calculate the total ground state energy. [39] The foundation of DFT is based on quantum mechanics, where the time-independent Schrödinger equation [40] (Equation 2.3.1) can be used to determine the total energy E of a system: ?̂?𝛹 = 𝐸𝛹, (2.3.1) where Ψ is the many-body wavefunction, describing the positions of all the electrons and nuclei in the system. For a system with N electrons and M nuclei, Ψ can be defined as 𝛹=𝛹(r1, …, rN; R1, …, RM), where r and R are the vector representations for spatial coordinates, x, y and z, for electrons and nuclei, respectively. In Equation 2.3.2, ?̂? is the Hamiltonian operator: ?̂? = 𝑇𝑛 + 𝑇𝑒 + 𝑉𝑛𝑛 + 𝑉𝑛𝑒 + 𝑉𝑒𝑒 , (2.3.2) where the terms Tn and Te represent the kinetic energy of the nuclei and electrons, and the three remaining terms Vnn, Vne, and Vee the potential energy of nucleus-nucleus interaction, nucleus-electron interaction, and electron-electron interaction. The interaction between the nucleus and the electron is attractive, while the interactions between similar particles are repulsive. The Hamiltonian can be simplified by applying the Born-Oppenheimer approximation [41]. Born-Oppenheimer approximation states that the wavefunction of nuclei and electrons can be separated because of the significant mass difference between atomic 22 nuclei (proton) and electron. At the point of view of the electrons, the much larger nuclei can be seen as stationary, and thus purely nuclei terms can be described with a constant term. The electronic energy can be described with the following electronic Hamiltonian (Equation 2.3.3): 𝐻?̂? = 𝑇𝑒 + 𝑉𝑛𝑒 + 𝑉𝑒𝑒 (2.3.3) and the electronic Schrödinger equation (Equation 2.1.4), 𝐻?̂?𝛹𝑒 = 𝐸𝑒𝛹𝑒, (2.3.4) where 𝛹𝑒 = 𝛹𝑒 (r1, …, rN) is the electronic wavefunction, which describes the electron coordinates of the system consisting of N-electrons, while Ee is the electronic ground state energy of this system. The many body wavefunction includes 3N coordinates, but it can be approximated as a product of the individual wavefunctions of the single electrons as 𝛹𝑒 = 𝛹1 (r1), …, 𝛹𝑁 (rN). This simplified approach is still not enough to determine the solution to many-body Schrödinger equation for a system with multiple interacting particles. One approach to solving such case is to determine the total energy E as a functional of the electron density n(r) (Equation 2.3.5), instead of directly using the wavefunction representation. Functional is defined as a function with a function as an argument: [39] 𝐸 = 𝐸[𝑛(𝒓)], (2.3.5) where r is a position vector representation of the three spatial coordinates x, y, z. The electron density approach is based on theorems by Hohenberg and Kohn [42]. The first theorem states that the ground-state electron density n0(r) of a system uniquely determines the Hamiltonian, and thus all ground-state properties of the system for a given external potential Vext. Vext is the potential arising from the nucleus, interacting with electrons in the absence of electromagnetic field [43]. The second Hohenberg and Kohn theorem states that the electron density minimizing the energy functional corresponds to the true ground-state electron density n0(r) of the system. In practice this means that the true ground state electron density can be determined by varying the electron density until the functional minima is found (variational principle). This however would require knowledge of the exact form of the energy functional. Hohenberg and Kohn theorems were further expanded by Kohn and Sham (KS) [44], where the many-body wavefunction is formulated into a set of independent non-interacting single 23 electron wavefunctions (Equation 2.3.6). As a result of this representation, the total energy functional can be separated into known and unknown terms (Equation 2.3.7). 𝑛(𝒓) = ∑ |𝛹𝑖(𝒓)| 2𝑁 𝑖=1 (2.3.6) 𝐸[𝑛(𝒓)] = 𝑇𝑠 + 𝑉𝑒𝑥𝑡 + 𝑉𝐻 + 𝐸𝑋𝐶 (2.3.7) With the KS-formulation, the known terms include the effects of the kinetic energies of the non-interacting electrons, Ts[n(r)], external Coulomb interactions between the electrons and the nuclei, Vext[n(r)], and internal classical Coulomb interactions between electrons, e.g. Hartree energy, VH[n(r)]. The unknown part is included in EXC[n(r)], which is the exchange- correlation (XC) functional including all many-body interactions, such as the exchange energy between interchangeable electrons (non-classical potential), and energy of correlated motion of the electrons (kinetic) [39], [45]. Solving the Kohn-Sham set of self-consistent equations (Equation 2.3.8) is done iteratively by first making an initial guess of the electron density n(r), which is used to calculate the effective potential Veff. Veff consists of the summed effects of Vext, VH, and VXC, which is the exchange-correlation potential derived from EXC. With the effective potential, the single- electron wavefunctions 𝛹𝑖 and thus their energy eigenvalues 𝜀𝑖 can be determined from Equation 2.3.8. Then the new electron density is consequentially determined from Equation 2.3.6. This iterative cycle between Equation 2.3.8 and Equation 2.3.6 is called the self- consistent cycle. This cycle is repeated until the obtained electron density is converged to the same as the initial electron density. [− ħ2 2𝑚 ∇𝑖 2 + 𝑉𝑒𝑓𝑓(𝒓)] 𝛹𝑖(𝒓) = 𝜀𝑖𝛹𝑖(𝒓) (2.3.8) This type of calculation, where the atom positions are not moved, and the result is the ground state total energy, is called a single-point calculation. In addition to monitoring the convergence of the self-consistent cycle, the convergence of total energy differences between iterations are typically also monitored. [46] Besides finding the ground-state energy of the system, DFT can be used to find the minimum energy structure of a system by geometry optimization, also known as structural optimization or relaxation. The interatomic forces of the atoms can be calculated from the gradient of the total energy. In relaxation, the atoms are moved to lower energy positions by minimizing these forces. The total energy and forces are then calculated in the new positions. This cycle is 24 repeated until the minimum energy structure is obtained when the force convergence threshold is reached. [45] The exchange-correlation functional is a source of uncertainty in DFT, as it is merely an approximation of the true system. There exists many different exchange-correlation functionals with varying levels of accuracy. When choosing a functional it is important to take into consideration the increased computational cost with the level of accuracy, as well as the specific materials properties simulated, as XC functionals are often case-specific. [46] The simplest case for the XC functional is based on the local density approximation (LDA), which assumes the XC energy to be that of a uniform electron gas with corresponding density. In this case n(r) is constant in all points in space. More accurate class of XC functionals are the generalized gradient approximations (GGA), which also include the gradient of the electron density while determining the XC energy. [46] One example of GGA functional is the Perdew Burke Ernzerhof (PBE) functional [47], which is well suited for solid-state systems. Most accurate class of XC functionals are the hybrid functionals that increase the accuracy by including some fraction of the Hartree-Fock exact exchange energy. [46] They help mitigating self-interaction error of electrons, which stems from incomplete cancellation of the electron’s interaction with itself [45]. Other noteworthy shortcoming of DFT is its poor ability to model van der Waals (vdW) interactions, and inaccurate determination of band gaps. Standard DFT functionals do not take into account the dynamic dispersion interactions between particles. To include these effects in the DFT calculations, either special vdW functionals, or other VdW correction schemes to the system’s energy has to be implemented [46]. Tkatchenko-Scheffler (TS) method [48] applies post-DFT vdW corrections to the system energy via Hirshfeld electron density partitioning [45]. DFT implemented with LDA or GGA functionals has a tendency to underestimate the band gaps of the materials due to the self-interaction error. Therefore, the band properties and band gap can be more accurately modelled using hybrid functionals [45], [46] . In DFT, the electronic wavefunction is represented with basis sets, which consists of basis functions. The two main types of basis sets are atomic-like orbital sets and planewave sets. Atomic like orbitals are constructed by radial functions centered around atoms, multiplied with some spherical harmonic. Planewave basis sets are formed from wavefunctions, and they are typically used with solid-state calculations. The choice of basis set is important for accuracy, and the sufficient set can be determined in regards of convergence with the total 25 energy differences. Other important aspect in DFT calculations is deciding which electrons are represented explicitly. In most cases, the valence electrons of the atom are of interest, as they mainly contribute to bonding and other chemical properties. However, the core electrons affect the nuclei-electron external potential of the valence electrons due to shielding effects, so they cannot be completely ignored in the calculations. Pseudopotential is an approximation that replaces the full external nuclei potential and core electrons with a fictional combined new potential, defined within some radius from the core. [46] One important aspect of DFT calculations for periodic systems, is the concept of k-point sampling. For symmetrically equivalent sites r and r + R in a periodic structure, Bloch’s theorem (Equation 2.3.9) describes how their electronic wavefunctions differ by a phase upon translation, characterized by some wavevector k: 𝛹(𝒓 + 𝑹) = 𝑒𝑖𝒌𝑹𝛹(𝒓) (2.3.9) When calculating the total DFT energy for periodic systems, the Bloch’s theorem can be applied to the KS-single electron wavefunctions to reduce the computational costs. The total electronic energy can be calculated by integrating in a grid over the Brillouin zone, which corresponds to the unit cell in the reciprocal lattice. The reciprocal space can be defined in terms of the k wavevectors, which is why it is also called the k-space. Instead of sampling the full Brillouin zone, sampling can only be done with a specific set of critical k-points in the k- space. Convergence in reference to the density of the k-point grid should be monitored against the total energy of the system, to determine the optimal k-grid in each direction. [39], [46], [49] The energy eigenvalues 𝜀𝑖 determined from KS-equations (Equation 2.3.8), correspond to the allowed energy states, or bands, of the molecular system. In a periodic system, such as a crystal lattice, the bands can be analyzed based on the specific path of critical k-points (high symmetry points) on the Brillouin zone. [46] There exists specific high symmetry points, denoted by Greek capital letters, for different lattice types [50]. The band structure of a material offers valuable information about its electronic properties, including band gap. The electronic structure of a material model can be further analyzed by determining the density of states (DOS). DOS quantifies the number of electronic states corresponding to a specific energy level or window. Atom-projected DOS (pDOS) presents the DOS projected for single atoms or groups of atoms of the model. Calculating the pDOS is especially useful for surface 26 adsorption studies, where the local electronic structure can offer more information than the full DOS. [46] When the electronic structure of a material is determined with DFT, the partial charges for atoms in a model can be calculated based on the electron density. In general, charge belonging to a certain atom can be assigned by defining some area surrounding the atom, and partitioning the electron density based on these assigned regions for each atom. There are multiple different methods for assigning partial charges, including Hirshfeld and Becke schemes, as well as Bader charges, and Mulliken method. Because these methods assign the charge differently, the absolute values for partial charges for a material can vary based on the method, and should not be relied explicitly. However, the general trends presented by the partial charges can be used to analyze the bonding and charge transfer between atoms. [46] In this work the chosen method for determining the partial charges was the Mulliken analysis method [51], which utilized density and overlap matrices to assign charges over the atom- centered basis functions. 2.4 Computational modelling of surface adsorption 2.4.1 Model preparation Modelling of surface adsorption requires accurate representation of the geometry and the bonding behavior between the adsorbate and the surface. The adsorbing molecule can be modelled as being in a gas-phase and isolated in a vacuum. The structure of the molecule, including bonds lengths and angles should be converged with reference to the DFT basis set for accurate representation of the geometry. [46] Surfaces on the other hand are modelled as slabs cut from the bulk structure in the desired crystallographic direction. [46] As for wurtzite GaN, the GaN(0001) surface can be obtained by cutting the bulk perpendicular to the z-axis, corresponding to the lattice vector c (Figure 2a). As discussed in Chapter 2.1, the termination of the bulk structure can create dangling bonds, which will cause the electronic behavior of the surface to change. This will have to be taken into consideration when constructing the simulation model to accurately represent properties of the material. For this reason, the bottom of the slab should be passivated with some elemental species to even out the charge disparity. Electron counting rule (ECR) 27 determines the correct termination of the DBs at the bottom of the slab, and it was applied to GaN to determine the correct amount of balancing charge. According to ECR, Ga has three valence electrons, and in the GaN bulk it forms four bonds with surrounding nitrogen atoms. Thus, Ga contributes 0.75 electrons for each Ga-N bond, which means in the pristine GaN(0001) surface there is excess electron density of 0.75 at the surface Ga DBs. In this work the excess charge was balanced by introducing one monolayer of pseudohydrogens (PH) to the bottom of the slab to saturate the nitrogen DBs. These PHs have a fictious nuclear charge and electron occupancy of 0.75, which satisfies the ECR. [19], [24] Periodic boundary conditions (PBC) are utilized expand the structure and simulate infinity. Due to PBC, the dimensions of the model in x, y and z direction have to be carefully considered, to avoid unphysical interactions between the periodic images of the adsorbate (Figure 4). [46] In xy-direction, the adsorbate interaction with periodic copies of itself can be avoided by expanding the surface supercell, so that the separation between the adsorbates stays around 10 Å. This provides sufficient distance between adsorbate images to mitigate the adsorbate-adsorbate interactions. To avoid the interaction between the surface slabs in z- direction, a sufficient vacuum layer can be added between the periodic slabs to separate them [45], [46]. Additionally, to accurately model the adsorption behavior, the thickness of the slab in z-direction has to be large enough to model the effect of the bulk to surface electronic structure. [46] For material models with surface normal dipole moment, such as GaN(0001) (Chapter 2.1), the addition of vacuum in z-direction can cause an inaccurate electric field in the vacuum gap due to polarization [24]. This can severely distort the calculations of surface properties. Solving this problem is commonly done by implementing dipole correction [52] in the simulations. This confines the electric field into a thin dipole layer placed in the middle of the vacuum gap, which prevents it from affecting the surface reactions. Also, as discussed in Chapter 2.3 DFT calculations cannot describe van der Waals interactions, which are important to represent in the model in case of physisorption. Therefore, some form of dispersion correction should also be applied to the model when studying surface adsorption. 28 Figure 4. The criteria of separation for periodic models. Left: The vertical separation zsep, should be enough to prevent adsorbate interaction with the bottom surface due to PBC. Right: ysep and xsep should be large enough to avoid interaction between periodic images of the adsorbates. Adapted from [45]. 2.4.2 Adsorption energy Since the ALD process involves the adsorption of the precursor molecules onto the surface, calculating the adsorption energies with DFT is a useful method for studying the adsorption reactions and mechanisms taking place between the surface and the adsorbing molecule [13]. This process involves first calculating the DFT total energies of each of the individual models (molecule, surface, and adsorbed molecule on surface), after which the adsorption energy can be determined for different configurations. By definition, adsorption energy is used to describe the strength of interaction between an adsorbate (molecule) and an adsorbent (surface). Depending on the type of interaction, adsorption can be classified into chemisorption or physisorption. In chemisorption, adsorbate can form a chemical bond with the surface, either by sharing electrons, or via electrostatic forces through opposite charge ions. Physisorption results in a weaker bonding between the adsorbate and the surface, based on the dispersive van der Waals forces arising from the polarity fluctuations of the systems. [45] Adsorption energy (Equation 2.4.1) can be calculated as 𝐸𝑎𝑑𝑠 = 𝐸𝑡𝑜𝑡 − (𝐸𝑚𝑜𝑙 + 𝐸𝑠𝑢𝑟𝑓) (2.4.1) where Etot is the total energy of the system comprising of the adsorbate and the surface, Emol is the total energy of the adsorbate molecule in a vacuum, and Esurf is the total energy of the 29 isolated surface [45]. Negative adsorption energy values indicate that the adsorbate prefers interaction with the surface, i.e. the adsorption process is energetically favorable. The strength of the adsorption is indicated by the magnitude of Eads. Typically, more negative values of Eads indicate chemisorption, while less negative values physisorption [53]. The stability of different structure configurations of a molecular system can be determined by constructing the Born-Oppenheimer Potential Energy Surface (PES), which maps the energy of the structure as a function of the atomic coordinates (Figure 5). The local minima of the PES correspond to the stable configurations, and the global minimum to the ground state configuration. [54] In the case of adsorption, the stable adsorption configurations are found in the local minima of Adsorption Energy Surface (AES), which represents the adsorption energy as a function of the molecular system coordinates. AES can be constructed by energy sampling the configurational space of this system with DFT calculations and locating the minima with phase-space exploration methods, such as minima hopping, metadynamics, or Monte Carlo method [53]. For complex geometries, these conventional methods can be very computationally expensive. In this work, the AES is mapped by utilizing the novel Bayesian Optimization Structure Search (BOSS) (Chapter 2.5) algorithm along DFT calculations, which offers improved computational efficiency. Figure 5. Example of a fictional PES. Global minimum corresponds to the lowest energy structure. 30 2.5 Bayesian Optimization Structure Search (BOSS) 2.5.1 Bayesian optimization (BO) Due to the computational costs associated with extensive atomistic energy calculations for multidimensional structure search, machine learning (ML) methods can be used to reduce the amount of energy sampling needed to map out the AES, while still maintaining accuracy. Active learning is a form of ML, where the ML algorithm iteratively chooses the datapoints to be added to the training set, with the goal of using as few computationally costly datapoint evaluations as possible [55]. Bayesian Optimization Structure Search (BOSS) is a machine learning method, that uses active learning in a form of Bayesian optimization (BO) to efficiently sample the configurational structure space, and find the structure configurations optimizing a specific target property. In this work, BOSS is used to determine the AES (with DFT calculations), while optimizing (minimizing) the adsorption energy Eads. BOSS is implemented in Python [56], and can be used in combination of wide range of simulation software for various optimization problems. BO is a machine learning method based on Bayesian statistics that is used to predict an unknown N- dimensional objective function, based on some existing data. In materials design, the objective function of interest can be any material property. In Bayesian statistics a prior or prior probability distribution describes the likelihood of values for some unknown quantity. Prior is based on intuition on the domain behavior. [57] With the prior, and some observed data, the posterior distribution can be determined according to the Bayes’ rule [58]. The conditional posterior distribution is updated as observed datapoints are added. 2.5.2 Gaussian process (GP) The prediction model in BO is based on a machine learning technique Gaussian process regression (GP) [57]. A stochastic process is defined as a collection of random variables with a joint probability distribution. When the distribution takes the form of a Gaussian distribution, the process is a Gaussian process. [59] AES can be represented as a Gaussian process, by assigning each structure configuration a random variable, representing the adsorption energy. GP is defined by its prior mean function and covariance function [59], [60]. The mean function describes the AES before any energy datapoints have been added to the model. It is possible to estimate the possible AES during this stage, but typically the mean 31 of GP is set to zero, so not to make any assumptions of its shape. The covariance function describes the covariance between two random variables, and it encodes knowledge about the similarity between datapoints in the objective function [59]. The choice for the covariance function, or kernel should be chosen according to the characteristics of the expected functions. The kernel includes hyperparameters, which can be optimized to improve the GP model performance, and better describe the similarity information between data. These hyperparameters encode the lengthscale, variance and periodicity into the covariance function. Lengthscale describes how rapidly the function values change in each dimension, and variance depicts the magnitude of the range of the function values. [45] Two common kernels are the radial basis function (RBF) and standard periodic (STDP) kernels which are non-periodic, and periodic, respectively. Periodic kernels can be used when the studied system includes periodic boundary conditions [61]. Bayesian optimization utilizes GP to build a surrogate model for the unknown objective function (Figure 6). Starting from some initial training data, the posterior distribution for the model is formed from the prior defined by the GP by applying Bayes’ rule to the evaluated datapoints of the objective function. The posterior describes the surrogate model, and its mean is the most probable value for the objective function, and variance its uncertainty. The variance across the function space is used to evaluate the confidence of the model, which is high in areas close to the sampled points, and low in unexplored areas. [61] Figure 6. Construction of the surrogate model with Bayesian optimization. a) Prior distribution with mean zero. b) Posterior distribution updated from the prior with some training data according to the Bayes’ rule. c) GP model, where the mean (blue) and variance (grey) of the posterior distribution describe the current prediction and uncertainty of the model. Adapted from [61]. 32 The next evaluated point in the model is determined by minimizing the acquisition function A(x). The acquisition function is calculated based on the current mean (prediction of the function) and variance (uncertainty) of the model. There are acquisition functions with different purposes, e.g., minimizing uncertainty or finding the global minimum. [45] [61] Acquisition function used in this work is the exploratory lower confidence bound (eLCB) function [62], which balances between exploitation and exploration. Exploitation means sampling new data close to the expected global minimum point of the objective function, while exploration prioritizes sampling areas of high posterior variance. [45] After the next to be evaluated point of the objective function is determined by the acquisition function, the new point is added to the dataset. The GP model is then fitted to this updated set of training data (Figures 7 & 8). This cycle of acquiring new datapoints is continued until the surrogate model converges to the objective function, or some other stopping criterion, such as maximum number of iterations, is reached. When converged, the uncertainty of the model practically disappears and the model has completely mapped the true function. [45], [61] Figure 7. BO working principle: Based on some known data, a surrogate model for the objective function is modelled with GP according to the Bayes’ rule. The location for new data acquisition is determined with the acquisition function A(x). New datapoints are added iteratively to the model, until the surrogate model converged to the objective function. Adapted from [45]. 33 Figure 8. The GP surrogate model and the acquisition function A(x). a) GP model: x represents an arbitrary simulation variable, while y is the objective target property. The black curve is the true objective function f(x), while blue is the GP model μ(x). The model is fitted to the datapoints (red circles) sampled from the true model. Uncertainty ν(x) of the model is shown as the grey area around the prediction. b) A(x) is used to calculate the next evaluation point for surrogate model. The red line indicates the current global minimum prediction of the model, and green dashed line the location of the next evaluated point, located in the A(x) minimum. 34 3 Methodology 3.1 Construction of DFT simulation models In this work I utilized the FHI-aims simulation package [63] to perform DFT calculations. Due to the numeric atom-centered basis approach, FHI-aims is suitable for large periodic systems, as well as for surface studies with vacuum in the model, because simulation of vacuum is computationally inexpensive with atom-centered orbitals. FHI-aims is an all- electron code, meaning that it considers all electrons explicitly in the calculations via numeric atom-centered basis functions. The basis functions are listed in different tiers, providing additional accuracy if needed. The basis sets are further categorized in to light, intermediate, tight, and really tight. All DFT calculations were performed with the Perdew-Burke- Ernzerhof (PBE) exchange correlation functional. For accurate portrayal of the non-covalent interactions, I used the Tkatchenko-Scheffler van der Waals correction scheme [48] with FHI- aims. Relativistic effects due to gallium were accounted for with the “atomic ZORA” approach in FHI-aims [64]. For the GaCl3 molecule the total energy convergence criterion for DFT was set to 10-5 eV in the self-consistent cycle, and the geometry was set to relax until convergence of maximum of 10-3 eV/Å force component per atom was obtained. In the case of the bulk GaN model, both the atoms and the simulation cell were set to relax in all dimensions until convergence of maximum of 10-2 eV/Å force component per atom was obtained. The total energy convergence criterion for the molecule was set to 10-6 eV in the SCF cycle. For slab calculations, dipole correction was implemented in FHI-aims to counteract the effect of the dipole moment in the [0001] direction. Before implementing the BOSS method, I performed specific convergence tests for all the simulation models to obtain optimal basis sets, k-grid density, and other settings for the DFT calculations. 3.1.1 Adsorbate molecule (GaCl3) The propositioned precursor molecule for GaN fabrication with the ALD process studied in this work was the trigonal planar molecule GaCl3 (Figure 9). The initial geometry of GaCl3 was constructed manually with Avogadro software [65] and the geometry was pre-optimized within Avogadro with Universal Force Field (UFF) [66] method. UFF is a molecular mechanics force field, applicable to the full periodic table. It uses element and hybridization information to describe the total potential energy of the atoms in the molecular system, 35 comprised of non-bonded and bonded interactions. Force field methods do not consider electrons explicitly, which makes them less accurate than DFT, but useful for quick geometry optimization. After the pre-optimization, I relaxed the molecule with DFT to obtain the accurate geometry. Figure 9. The GaCl3 molecule viewed from the top (z-direction). The molecule is planar in the xy- plane, while the theoretical angles between the Cl-Ga-Cl bonds are 120º. The Cl atoms are annotated as Cl1, Cl2, and Cl3. Ga atom is represented as brown and Cl as green. 3.1.2 Semiconductor bulk (GaN) and surface (GaN(0001)) I obtained the initial geometry for the bulk wurtzite GaN (Figure 1) from materialsproject.com [67], which corresponded to the pristine experimental geometry of the material. The k-grid density was carefully tested for convergence before band structure calculations, and subsequent slab simulations. For band structure calculations, the path Γ, M, K, Γ, A, L, H, A was set, which describes the band path in a hexagonal wurtzite structure [50]. The desired (0001) surface slab for GaN (Figure 2a) was created with atomic simulation environment (ASE) python library [67] from the converged relaxed bulk geometry. 50 Å of vacuum was placed between the periodic slabs in z-direction. To satisfy the ECR, the nitrogen DBs in the bottom surface of GaN(0001) slab with 6 bilayers were passivated with 0.75 partial charge pseudohydrogens. The position of the pseudohydrogens was optimized by fixing the atomic positions of the rest of the slab and performing structural relaxation only on the pseudohydrogens. The optimization resulted in a N-H bond length of 1.057 Å. The pseudohydrogens were kept fixed at their optimized bong length for the subsequent slab simulations. 36 3.2 BOSS model 3.2.1 Working principle of BOSS in practice In BOSS method, structures of the surface and adsorbate can be represented as chemical building blocks (BB) to simplify the configurational search space. BBs represent rigid parts of the structure [53], such as the adsorbing molecule, or the surface slab. In general, a N-atom structure has 3*N degrees of freedom (DOF), corresponding to the three spatial coordinates for each atom. With the BB approach, the DOFs for the adsorbing GaCl3 molecule can be reduced to six: three translations along the x, y and z axis, and three counterclockwise rotations α, β and γ around the translation axes (Figure 10). [45]. The axes are defined through the molecular center of gravity, located at the Ga atom center of GaCl3. Figure 10 also shows the orientation of the molecule as parallel to the surface plane, where all rotations were defined to be 0 º. Figure 10. BOSS search dimensions for the molecule are three translational degrees of freedom along the cartesian axes x, y and z, and corresponding rotations α, β and γ counterclockwise around the axes. Ga atoms are represented as brown, N as blue, and Cl as green. The relative atom sizes are not in scale. The workflow of BOSS is represented in Figure 11. BOSS is combined with the atomistic simulation code (DFT) via a user function. The role of BOSS is to perform the active learning search for building a surrogate model of the AES, while DFT is used to calculate the adsorption energies for different configurations, which are then used as training data for BOSS. In this work, the full six degrees of freedom are used to define the configuration of the GaCl3 adsorbate molecule on the GaN(0001) surface. Each unique combination of the 6 DOFs x, y, z, α, β and γ corresponds to a different GaCl3 configuration of the AES. The purpose of the user function script is to transfer and rotate the molecule BB to a given configuration, after 37 which DFT is used to calculate its adsorption energy. The obtained energy can be then parsed from the DFT output into BOSS as a new data point. [45] Figure 11. BOSS working principle: The adsorption energy for a given molecule-surface configuration is calculated with DFT, corresponding to a single datapoint for BOSS search space, after which that datapoint is added to the BOSS training data set. BOSS iteratively builds a surrogate model for the AES as new DFT data is acquired until the surrogate model of AES converges to the true AES. The local minima configurations from the converged model can be extracted from the minima of the AES. Local minima structures are then relaxed with DFT without the building block approximation to acquire the true structures. The final configurations can be then analysed to gain information about the orientation and corresponding properties of interest. Adapted from [53]. New energy datapoints should be added to the model until the surrogate model of AES has converged. The convergence of the surrogate model can be monitored by observing the change in location of the predicted global minimum, as well as the individual minimum predictions for x, y, z, α, β and γ locations. Additionally, the kernel hyperparameters, lengthscale and variance should be monitored with regards to convergence. It is also important to investigate the model convergence with respect to the adsorption energies of the identified minima. Stable adsorption configurations can be extracted from the local minima of the AES constructed by the adsorption energy sampling when the BOSS model is sufficiently converged. Because BOSS uses the BB block approach to decrease computational costs, the local minima configurations should be subsequentially fully optimized with DFT to acquire the true configurations. The final configurations can be analysed in a number of ways, 38 depending on the research objectives. Interesting mechanical properties of the local minima configurations include adsorption sites, adsorbate orientation, degree of possible dissociation (number of fragments), and the distances and bond angles between these fragments. The adsorption energies of the minima and electronic structure analysis can additionally provide information about the stability, charge transfer, and bonding of the configurations. 3.2.2 BOSS parameters After I had obtained converged DFT simulation models for the molecule and surface, the different adsorption configurations of GaCl3 on GaN(0001) surface were sampled with BOSS, while using DFT via FHI-aims to calculate the adsorption energies. The full simulation model comprised of the GaCl3 molecule and GaN(0001) slab model. The DFT calculations were performed as single-point energy evaluations, where the static molecule and surface slab acted as the BOSS building blocks. DFT calculations were performed with light tier 1 basis set, and 2x2x1 k-grid, to reduce the computational times. The goal of my work was to comprehensively study the molecule configuration with full six DOFs: x, y, z, α, β and γ. The bounds for the configurational sampling were set as [0, 1], [0, 1], [3.0 Å, 4.5 Å], [0 º, 360 º], [0 º, 360 º], and [0 º, 120 º] for x, y, z, α, β and γ. For x and y, the bounds [0,1] defined the area of the primitive surface cell of GaN(0001), transformed into unitless fractional coordinates. The z bounds were defined as the height of the GaCl3 molecule measured from the center of gravity of the molecule (Ga-atom). Importantly, due to the BB approximation, the minimum height had to be considered carefully, so not to make the molecule collide with the surface. Full 360 º rotations were allowed in regards to α and β, while the γ rotations could be reduced to 120 º due to the trigonal symmetry of GaCl3. For BOSS I used the stdp kernel the with periodic variables x, y, α, β, and γ, while the rbf kernel for non-periodic z. 3.2.3 BOSS models Before performing a full BOSS run with 6 DOFs, I conducted preliminary studies with 1, 2 and 3 degrees of freedom to study individually the adsorption height z (1D), xy-diffusion (2D), and rotational orientation (3D) of the GaCl3 molecule on GaN surface. These tests were 39 performed to gain understanding of the qualitative adsorption behavior of the molecule, including a suitable sampling range for the height of the molecule. The datapoints from these preliminary tests were also recycled as the initial data for full 6D structure search. For these preliminary tests, BOSS initial datapoints were set as 5, sampled with the Sobol sequence [68]. The full structure search was performed with 6 DOFs with all three translational and rotational directions (Figure 10). New datapoints were sampled in batches of 250 points until the 6D model had converged. Due to the BB approximation and the shape of the rigid GaCl3, some α and β rotations could cause the Cl atoms from GaCl3 to locate very close to the surface. These unfavourable configurations could lead to repulsion between the adsorbate and the surface, which in turn could cause large Eads values. Hight energy configurations would significantly increase the search space for the BOSS model, but because the purpose of this study was to identify the local minima configurations, the exact information about the high energy region of the AES was not of interest. Possible effects of repulsion forces causing large positive values of Eads were counteracted by transforming Eads values larger than 1 eV into their log10 values, only after which the datapoint was added to the BOSS model. This approach allowed for accurate modelling of the minima region, without the added computational cost of unnecessarily sampling the maxima in detail. 3.3 DFT calculations for electronic structure analysis I extracted the local minima structures from the converged 6D BOSS model for further analysis. The building block approximation was removed, and the structures were relaxed with more accurate tier 2 light basis set settings to obtain the true local minima geometries. During relaxation, the bottom two bilayers and the pseudohydrogens were kept fixed in the slab model, while the top of the slab and the molecule were allowed to relax freely until convergence of maximum of 10-2 eV/Å force component per atom was obtained. Finally, a single-point DFT calculation was performed on the relaxed local minima, for calculating the atom-projected DOS (pDOS), and to perform Mulliken charge analysis, both of which were implemented in the FHI-aims code. The change in the electronic states of the minima caused by adsorption can be inspected with pDOS. The broadening and shift of electronic states depicted with pDOS can provide information about possible chemical bonding taking place in the configurations. The atom partial charges determined with the Mulliken method can be 40 used to explain the charge transfer between the adsorbate and surface atoms, which in turn can provide information about the type and strength of bonding, and thus adsorption. The Mulliken analysis by FHI-aims defines the per-atom partial charge q as follows: 𝑞 = 𝑍 − 𝑁, (3.2.2) where Z is the integer number of electrons in a neutral isolated atom, and N is the actual total number of electrons calculated to belonging to the atom based on the Mulliken scheme. Based on this definition, q is negative for atoms which have gained electron density, and positive when the atom has given up electrons. In subsequent analysis of the Mulliken charges, Δq is defined as the sum of these atom partial charges of a structure, where negative value indicates the whole structure has gained electron density, and lost electron density if Δq is positive. The total charge difference of a structure before and after adsorption is defined as ΔQ = ΔqB- ΔqA, where superscripts B and A refer to Δq before and after adsorption. 41 4 Results 4.1 Simulation model validation and convergence tests 4.1.1 Precursor adsorbate molecule (GaCl3) To ensure optimal simulation settings for the GaCl3 molecule, I performed geometry optimization tests in reference to the basis set, while monitoring the total energy and Ga-Cl bond length of the molecule. Comparison of suitable basis was performed between light and tight settings (Figure 12a), while also evaluating the convergence of tiers, especially in regards to the Cl basis set, which included a large g orbital. (Figure 12b). Based on these convergence tests in regards to the total energy, light tier 2 basis sets were deemed suitable for GaCl3 simulations. Figure 12. Total energy convergence for GaCl3 molecule presented as a function of different basis set tiers. a) Comparison of light and tight settings. b) Convergence of the basis set with light settings. The Ga-Cl bond lengths of the molecule after geometry optimization with each basis set configuration are presented in Table 1. The bond length of 2.124 Å was sufficiently converged with light settings and tier 2 basis set, while taking into account the computational time. The experimental Ga-Cl bond length for GaCl3 is 2.108 Å [69], meaning that the DFT calculations overestimate the bond length slightly. 42 Table 1. Bond lengths and angles for GaCl3 after optimization with different basis settings. Basis set configuration Ga-Cl (Å) optimization time (s) Experimental reference 2.108 - Light tier 1 without Cl g orbital 2.128 8.26 Light tier 1 2.126 9.23 Light tier 2 2.124 21.16 Light tier 3 2.124 19.23 Light tier 4 2.124 28.95 Tight tier 1 2.126 26.17 Tight tier 2 2.123 43.25 Tight tier 3 2.123 62.24 Tight tier 4 2.123 78.30 Additionally, the formation energy (Equation 3.2.2) for the molecule with respect to different tiers of light basis set were determined. The formation energy was calculated as: 𝐸𝑓𝑜𝑟 = 𝐸𝑚𝑜𝑙 − ∑ 𝑛𝑖𝐸𝑖 = 𝐸𝑚𝑜𝑙 − (𝐸𝐺𝑎 + 3𝐸𝐶𝑙) (3.2.2) where the total energies of Ga atom (EGa) and Cl atom (ECl), were subtracted from the energy of the GaCl3 molecule (Emol). The converged value of formation energy in regards to the tiers was -12.50 eV for the whole molecule, and -3.10 eV/atom, which was achieved with light tier 2 basis set (Figure 13). This value of Efor indicates GaCl3 is relatively stable in the molecular form, which is backed up by previous literature [36]. Dividing the Efor value with the number of bonds in the molecule produces the Ga-Cl bond dissociation energy of -4.16 eV. This value is slightly higher than the bond dissociation energy for O-H bond in methanol, which is around -3.99 eV, and slightly lower than the bond dissociation energy of -5.15 eV for H2O O-H bonds, which are considered very strong single bonds [70]. Compared to these reference energies, Ga-Cl bonds are quite strong in the isolated GaCl3 molecule. 43 Figure 13. Formation energy convergence for GaCl3 molecule presented as a function of different basis set tiers. 4.1.2 Bulk semiconductor (GaN) Before building the surface model, the optimal simulation settings were determined for the bulk GaN model. Convergence of the basis set and k-point grid density were validated with respect to the DFT single point total energies. For the k-grid, convergence criterion was set as 1 meV, which was first obtained with 9x9x9 k-grid . It converged within 0.2 meV for the energy for the subsequent 12x12x12 grid (Figure 14). Figure 14. Total energy convergence for GaN bulk model in regards to k-point grid density. Density of states (DOS) and band structure for the GaN bulk model were also calculated with DFT to inspect the convergence of the band gap values, important for semiconductors. The DOS and band structure calculated with light tier 1 settings are presented in Figure 15. The band gap can be seen in the calculated DOS, and the band structure also shows the gap between valence band maximum and conduction band minimum. The calculated band gap for 44 GaN was 1.95 eV with light tier 1 and 1.93 eV with tier 2 settings. The values were in agreement with previous computational PBE results of 1.80 eV [19] and 1.91 eV [71], but underestimating the experimental band gap of 3.40 eV. Based on these tests, tier 1 basis set was chosen to be sufficient in representing the band gap behavior of the GaN bulk. Figure 15. Band structure and DOS for GaN bulk with light settings and tier 1 basis set. After I had obtained the converged settings for the basis set and k-grid, the cohesive energy (Equation 3.2.3) for the bulk model was calculated to inspect the effect of Van der Waals correction in the GaN bulk structure. Light tier 1 basis and 9x9x9 k-grid were used for these calculations. Cohesive energy was calculated as the difference between the total energy of the bulk GaN per cation-anion pair (Etot/2), and the energy of free elements Ga and N (EGa, EN) with single point energy calculations [72]. The Ecoh calculated without VdW correction was 10.28 eV, while the use of VdW correction yielded Ecoh of 10.83 eV. These values sufficiently agreed with the experimental Ecoh value of 9.06 eV [72]. 𝐸𝑐𝑜ℎ = − 𝐸𝑡𝑜𝑡 2 + 𝐸𝐺𝑎 + 𝐸𝑁 (3.2.3) Subsequently, the bulk geometry was relaxed with and without the Van der Waals correction. The hexagonal lattice constants a, and c, and the parameter u were measured to obtain the final converged structure for the bulk GaN (Table 2). Experimental values for a, c and u are 3.19 Å, 5.19 Å, and 0.377, while similar GaN DFT work with the PBE functional has produced 3.22 Å, 5.24 Å, and 0.377 for a, c, and u [19]. Based on these results, the values for a and c, calculated with VdW correction were closer to their experimental counterparts. 45 Table 2. GaN lattice parameters after geometry optimization with and without VdW correction. Model a (Å) c (Å) u Experimental 3.19 5.19 0.377 PBE reference 3.22 5.24 0.377 PBE (without VdW correction) 3.22 5.24 0.377 PBE (with VdW correction) 3.20 5.21 0.376 4.1.3 Semiconductor surface (GaN(0001)) Suitable dimensions for the GaN(0001) slab model were inspected with respect to the xy size of the surface supercell, and the number of bilayers in z direction. The xy dimensions of the model were deemed reasonable, when the separation of the GaCl3 molecules would be circa 10 Å from periodic images of each other. The separation was calculated as the molecular diameter subtracted from the lattice vector. For the calculated Ga-Cl bond length of 2.124 Å, GaCl3 diameter was around 4.25 Å. The computational cost of increasing the number of atoms in the model was also taken into account when choosing the final surface supercell dimensions (Table 3). Based on this analysis, the 4 x 4 supercell in xy-plane provided sufficient separation for the GaCl3 molecules. Table 3. Adsorbate-adsorbate separation in periodic cell with increasing supercell size Supercell Supercell xy size (Å) GaCl3 separation Number of atoms in the surface layer in xy-plane 1 x1 3.2 x 3.2 -1.1 (overlapping) 2 3 x 3 9.6 x 9.6 5.4 18 4 x 4 12.8 x 12.8 8.6 32 5 x 5 16.0 x 16.0 11.8 50 6 x 6 19.2 x 19.2 15.0 72 Sufficient thickness for the slab was evaluated against the changes in the top surface layer bond lengths of the GaN model after geometry optimization (Figure 16). To mimic bulk structure in the slab, the bottom two BLs and the pseudohydrogens were kept fixed while the analysis of suitable thickness was performed with 4, 5 and 6 BLs in total. The results of the optimization tests are presented in Table 4, which shows that the model with four GaN bilayers with two free and fixed two bottom BLs along with pseudohydrogens was converged in reference to the changes in the surface structure within 0.01 Å accuracy. The final slab model used for subsequent BOSS simulations of this work is presented in Figure 16. 46 Figure 16. The final model for GaN(0001) slab with four BLs and 4x4 xy supercell. Ga atoms are represented as brown, and N as blue. Table 4. Changes in the GaN surface layer bonds lengths in regards to the slab thickness. The bonds are presented in Figure 16. 4.2 BOSS preliminary studies For the 1D run, only the z-coordinate of GaCl3 was varied during the sampling. Two cases were studied, one with the molecule flat on the surface (α and β set to 0º), and other, where the molecule was rotated 90 º around the β-axis to an upright position, where one Cl atom was pointing straight down (Figure 17a-b). The two cases, flat and upright, were studied to initially determine the optimal sampling height range for the molecule model in reference to the adsorption behavior between the molecule and the surface. The GaCl3 molecule was placed on the GaN(0001) slab, with its center of gravity (Ga atom) over the Ga-top site, which corresponded to the highest point on the surface. The sampling bounds for z were set so that the distance from the molecule to the Ga-top site would be at minimum of 1.5 Å and maximum of 10.0 Å. This range was chosen as an initial guess, to avoid high energy peaks due to repulsion forces, and find out a reasonable upper heigh limit after which there would be no interaction between the molecule and the surface. The search was initialized with 5 points, and additional 15 new datapoints were acquired to the BOSS model. GaN model Bond length 1 (Å) Bond length 2 (Å) Unrelaxed slab 1.96 1.96 4 bilayers (2 relaxing) 1.97 1.99 5 bilayers (2 relaxing) 1.97 2.00 6 bilayers (2 relaxing) 1.97 2.00 47 Figure 17. The geometry models and convergence for 1D BOSS test. a-b) Two molecule configurations sampled for the 1D test; “flat” (a) and “upright” (b). c-d) Predicted surrogate model for flat (c) and upright (d) configurations. Blue line is the surrogate model, red circles are the sampled datapoints, red vertical line is the global minimum prediction, and green dashed line is the next acquisition point. e-f) The global minimum predictions for Eads (black) and acquisition locations for z for flat (e) and upright (f) models. 48 Total of 20 energy evaluations were sufficient to converge the location of the global minimum prediction for z, and map out the true height function. The corresponding plots for the converged 1D surrogate models are presented in Figure 17c-d. In the regions where z is lowest, there arises repulsion forces between the molecule and the surface, which produces very unfavorable adsorption configurations with large positive adsorption energies. The optimal adsorption height would be situated where the global minimum of the 1D model is located, corresponding to the minimum Eads value. For the flat configuration of GaCl3, the global minimum value for z was 2.5 Å, corresponding to Eads value of -1.74 eV. In the global minimum configuration of the upright molecule, the distance to the surface from the GaCl3 Ga-atom was 4.9 Å, and 2.8 Å from the downwards pointing Cl atom, with a minimum Eads value of -0.16 eV. The values and locations of the global minimum prediction are presented in Figure 17e-f. In both flat and upright configurations, the global minimum is identified within the first energy evaluations, after which the model explores the 1D space to refine the model. The rotational symmetry and adsorption height range for the molecule were inspected with a 3D BOSS search. The optimal height for a 3D run to study the molecule orientation was chosen as 4.9 Å corresponding to the global minimum of the adsorption height in the upright orientation. This allowed full rotational freedom to the molecule, without the risk for high energy configurations due to repulsion forces. For the 3D tests, the molecule was initialized in the flat configuration, where all rotations were initially defined to be 0º. Similarly to the 1D run, the GaCl3 molecule was placed on top of Ga-top site. For the 3D run, only the α, β and γ rotations of the molecule around the [100], [010], and [001] directions were sampled. Initial sampling points were set as 5, and 100 BOSS iterations were performed to converge the 3D model in regards to the global minimum predictions for α, β, γ, and Eads. Two-dimensional cross-sections of the rotational model AES, showing high and low Eads regions of the configurational sample space were inspected for possible periodicity in the rotations, which could be used to reduce the sampling bounds (Figure 18a). The αβ cross-section presented some symmetry in the α and β rotations at around 90 º and 270 º for α, and at 30 º, 150 º and 270 º for β. These rotations corresponded to GaCl3 local minima configurations where none of the Cl atoms were oriented straight down to the surface, indicating that might be an unfavorable adsorption position. The global minimum configuration with regards to α and β corresponded to Eads of -0.21 eV, which was in line with the 1D results in the same z height. Based on the magnitude of the Eads, the z value of 4.9 Å produced relatively low adsorption energies, indicating further need for refinement of the height sampling bounds. 49 Lastly the translation of GaCl3 along the xy-surface was preliminary tested with a 2D BOSS run (Figure 18b). The molecule was placed flat on the GaN surface, and it was allowed to move within the primitive 1x1 surface cell region (Figure 2b), while keeping the orientation and height fixed. Initial sampling points were set as 5, and 50 BOSS iterations were performed. The global minimum configuration with regards to x and y locations corresponded to Eads value of -0.17 eV. The most important observation from the preliminary tests was that the adsorption energy range between minima and maxima was very narrow with both 2D and 3D models, which indicated that the fixed value of 4.9 Å for z was too high to produce meaningful adsorption energies, and the minimum z bound for subsequent 6D BOSS should be set lower than that. Additionally, the rotational bounds for α and β were set as [0 º, 360 º], to not make any false assumptions of the periodicity based on the non-optimal adsorption height of the 3D model. Figure 18. Cross-sections of the preliminary BOSS searches. a) 2D cross-section of the BOSS predicted 3D AES for α and β. Purple regions indicate minima and yellow maxima. The slice indicates some periodicity in global minima predictions around 90 º and 270 º for α, and at 30 º, 150 º and 270 º for β. b) 2D cross-section of the BOSS predicted 2D AES for x and y. The narrow range for the Eads surrogate model (μ(x)) values indicates that the molecule is located too high above the slab model to produce significant interactions with the surface. 4.3 6D BOSS simulation The goal of my work was to study the GaCl3 adsorption configuration with full six DOFs, with a 6D BOSS simulation. The 6D model was initialized with 369 datapoints, recycled from preliminary BOSS studies. I sampled new datapoints in batches of 250, while recurringly inspecting the convergence of the model. Based in the preliminary tests, the lower bound for 50 adsorption height was lowered to 3.0 Å. In low z region, due to the BB approximation and the shape of the rigid GaCl3, some unfavourable configurations in regards to rotations could lead to very large Eads values. This was counteracted by transforming Eads values larger than 1.00 eV into their log10 values, only after which the datapoint was added to the BOSS model. The convergence of the 6D model was validated against the global minimum prediction for adsorption energy, sampling locations for x, y, z, α, β, and γ, convergence in model hyperparameters, and the number of found local minima. Based on all these criteria, which I will next describe individually, the model was converged with a total of 2119 datapoints. Figure 19 depicts the global minimum prediction for adsorption energy, and corresponding Eads acquisitions for each sampled configuration obtained from the DFT calculations before applying log10 on values higher than 1.00 eV. The global minimum prediction does not change noticeably after 1250 datapoints have been added into the model. As can be seen from the acquisitions, very high adsorption energies at the beginning and end indicated that the BOSS sampled some unfavorable adsorption configurations of the molecule, which can be explained by the model using the exploration approach in choosing the next sampled point in high uncertainty areas, according to the eLCB acquisition function. Figure 19. Convergence of the global minimum prediction for Eads, and the corresponding acquired datapoints. Left: The global minimum prediction for the adsorption energy. Red vertical lines indicate the start of new batch of acquisitions. Right: Acquired Eads values calculated for sampled configurations, before transforming the Eads > 1.00 values into log10 values. There is noticeable difference in the energy ranges for minimum prediction and initially acquired datapoints. The sampling locations for x, y, z, α, β, and γ are presented in Figure 20. The search space for x, y, and γ was sampled thoroughly. Locations for α and β sampling showed a slightly denser areas in 0 º, 180 º and 360 º for α, and in 0 º and 180 º for β, which indicated that those regions 51 were especially focused on within the model. The sampling for z values was concentrated on the upper and lower bounds of the assigned z range, but also the region between is sufficiently explored. Based on the sampling of all six DOFs, the number of iterations for this model was enough to fully explore the whole space of these variables, and additional sampling would not have provided much information gain. Figure 20. Sampling locations of x, y, z, α, β, and γ as a function of datapoints in the model. The global minimum predictions corresponding to x, y, z, α, β, and γ are presented in Figure 21 as a function of datapoints in the model. Initially the minimum prediction changes a lot, which can be seen from the oscillations across the sampling ranges. The minimum of z is found with quite a few datapoints, corresponding to the lowest possible height within the set bounds. This indicates the GaCl3 molecule prefers possibly even lower adsorption heights than 3.0 Å. The predictions for x, y, α, β, and γ are more defined as new datapoints are added to the model, and at around 1300 datapoints they change very littler with further iterations. However, new oscillations for global minimum predictions are again present after additional sampling, which may be caused by the sampling shifting focus to more unknown regions of the space after the global minimum predictions are already quite certain. 52 Figure 22 presents the convergence of the variance and lengthscale hyperparameters as a function of datapoints in the model. Both are initially high, but steadily decrease and eventually only show small changes as enough datapoints are added to the model. Figure 21. Convergence of the global minimum predictions for x, y, z, α, β, and γ as a function of datapoints in the model. Figure 22. Convergence of the BOSS hyperparameters with reference to the datapoints in the model. Left: Variance. Right: lengthscale. The locations for the global minimum values of x, y, z, α, β, and γ were further inspected with two-dimensional cross-sections of the 6D AES. They were converged, when there were no prominent visual changes in the global minimum locations after including additional points to the model. The cross-sections of αβ, βγ, xz, and αγ corresponding to the converged model with 53 2119 datapoints are presented in Figure 23. The αβ cross-section displays periodicity in local minima regions around 0 º, 180 º and 360 º for α, and similarly for β. Due to the BB approximation these individual rotation angles correspond to the molecule being in a somewhat flat position on top of the surface, which minimizes the adsorption distance for the center Ga atom. The periodicity of α and β rotations is further confirmed by cross-sections of the αγ and βγ spaces, where the minima regions are represented as purple vertical areas. These graphs also indicate symmetry for the γ variable, corresponding to the rotation around the z- axis, in every 60 º, between the sampling range from 0 º -120 º. This could be caused by the three-fold symmetry of Cl-atoms around the center Ga, which reduces the possible unique configurations of the fixed molecule. Based on the αγ and βγ relationships, the global minimum is mostly defined by the rotations of α and β, and γ has very little effect. Lastly, the cross-section of x and z shows clear preference for smaller adsorption heights, already indicated by the global minimum prediction of z in Figure 21. Additionally, there is only one global minimum area for variable x. Figure 23. 2D cross-sections of the 6D AES. a) 2D cross-section of the BOSS predicted 6D AES for α and β. Purple regions indicate minima and yellow maxima. Both α and β show periodicity in 0 º, 180 º and 360 º. b) 2D cross -section of β and γ, where the minima for β is located around 0 º, 180 º and 360 º, and 0 º, 60 º and 120 º for γ. c) 2D cross-section of x and z, where the z minima are located close to the lower bound of the sampling range. d) 2D cross-section of α and γ, where minima are located similarly than in βγ cross-section. 54 The xy cross-section of the 6D AES (Figure 24) indicates one global minimum adsorption region in purple, and one maximum region in yellow. This indicates there is one specific adsorption site in reference to x and y that is preferred, and one site that is less favourable for the rigid GaCl3. In Figure 24 the positions of Ga and N atom of the primitive surface cell are represented as brown and blue stars. The minimum region coincides with the Ga site, while the maximum with the N site. The locations of Ga and N are slightly offset to the minimum and maximum, which indicates GaCl3 does not adsorb directly on top of the surface atoms. The Eads range for the surrogate model μ(x) minimum and maximum is around 1.00 eV, which indicates meaningful differences between minima and maxima regions of the AES. Notably even the maximum region has negative Eads values, which means the N-top site is still favourable for adsorption. Figure 24. 2D cross-section of the BOSS predicted 6D AES for x and y, with one global minimum region and one maximum region. For reference the 1x1 primitive surface cell is shown on the right. BOSS can be used the extract the local minima from the AES. After each batch of 250 datapoints in the model were added, I compared these identified local minima structures in reference to their adsorption energies. The model is converged when the adsorption energy of the found minima don’t change when new configurations are evaluated [45]. The number of identified minima and their adsorption energies are presented in Figure 25, where each line represents a 6D BOSS model with specific number of training datapoints. The minima can be seen located in energy plateaus. The locations of the plateaus with regards to their adsorption energies in the low energy region should converge with enough datapoints in the model, which in this case required 2119 datapoints. 55 Based on all aspects used to verify the convergence of the 6D model, there were 2119 total datapoints needed to converge the global minimum predictions for Eads, x, y, z, α, β, and γ, and the number of local minima. The global minimum structure had an adsorption energy of -1.33 eV, and it consisted of a GaCl3 adsorbed of slightly offset Ga-top site within 3.0 Å of the surface. Figure 25. Identified local minima and their adsorption energies. Left: Number of found local minima for models with increasing training data. Rose colour curve M2119 corresponds to the converged 6D model. Right: A zoomed in view from the local minima curves in the low energy region, displaying the first 150 found minima. 4.4 Found local minima structures 4.4.1 Local minima before relaxation The local minima of the AES were extracted from the converged model. In total there were 746 minima found in the model with 2119 datapoints. The analysis of the minima was restricted to a thermal ALD reaction temperature of 500 ºC previously used in experimental ALD studies with GaCl3 [36], [37], which corresponded to energy range of 0.066 eV above the global minimum energy -1.33 eV. This energy window included 146 local minima, including the lowest energy global minimum, and they are presented in Figure 26, located in 20 energy plateaus. The lowest energy structure from each plateau was chosen as a representative structure for corresponding adsorption energy value. These 20 unique configurations are described in Table 5, which lists the Eads values and adsorption sites for the GaCl3 molecules, as well as the α, β, and γ rotations describing the tilt of the adsorbate. All 20 structures have varying levels of tilt, and no directly flat or upright configurations were observed. These minima also 56 all had an adsorption height of 3.0 Å measured from the center of mass of GaCl3 (Ga-atom) to the surface, indicating that the optimal height would be lower. The Eads values are relatively similar between the 20 minima, which is reflected on by their similar structure due to BB approximation, and adsorption height. Figure 26. All found local minima within 500 ºC range (0.066 eV) from the global minimum. One structure from each plateau was chosen to represent the structure type. Table 5. 20 identified unique local minima structures and their adsorption energies, adsorption sites, and rotations. The structures are unrelaxed and include the BB approximation. Plateau number (representative structure) Eads (eV) Adsorption site (center of gravity) α (º) β (º) γ (º) 1 -1.33 Ga top (offset) 0 201 120 2 -1.33 Bridge (offset) 154 373 25 3 -1.32 Ga top 357 349 60 4 -1.31 Ga top 180 174 120 5 -1.30 Bridge (offset) 0 201 0 6 -1.30 Bridge (offset) -24 165 84 7 -1.29 Ga/N top (offset) 22 173 97 8 -1.29 Ga top (offset) -2 198 68 9 -1.29 Ga top (offset) 358 198 69 10 -1.29 Ga top (offset) 202 12 46 11 -1.29 Ga top 184 8 58 12 -1.29 Ga top 190 8 54 13 -1.29 Ga top (offset) 358 -18 56 14 -1.28 Ga top 188 5 125 57 Plateau number (representative structure) Eads (eV) Adsorption site (center of gravity) α (º) β (º) γ (º) 15 -1.27 Ga top 190 175 63 16 -1.27 Ga top (offset) 188 189 58 17 -1.27 Bridge (offset) 155 169 82 18 -1.27 Ga top (offset) 353 375 54 19 -1.26 Ga top (offset) 187 192 58 20 -1.26 Ga top (offset) 351 15 53 4.4.2 Local minima after relaxation To obtain the true structures without the BB approximation, the 20 molecule-slab configurations were relaxed with DFT using light settings and tier 2 basis set to ensure proper representation of the Cl-orbitals in GaCl3. After relaxation, the 20 different minima were categorized into 5 unique identifiable local minima (LM) configuration types, where the lowest energy one of each category is presented in Figures 27-31, indicated as LM1-LM5, with LM1 having the lowest (most favorable) Eads with the surface, and LM5 the highest Eads. LM1 was therefore the global minimum structure. The most important characteristics of each LM are presented in Table 6. In general, the adsorption energies significantly decreased upon relaxation, indicating the BB approximation was restricting the natural movement of the bonds in GaCl3 . The adsorbate also dissociated into fragments in LM1-LM4, indicating that is the preferred state for the GaCl3 molecule when interacting with GaN(0001) surface. The adsorption heights also decreased after relaxation to circa 2.0 Å, compared to the unrelaxed global minimum prediction of 3.0 Å. I will next describe the individual geometric characteristics of each LM in more detail. The adsorption sites and distances are defined from the main atom of the fragments, corresponding to Ga-atom center in fragments with Ga, or Cl center in other cases. The main fragment containing gallium is indicated as F1. In general, in all five LM configurations, some surface layer atoms experienced displacements from their ideal structure due to the forces caused by the adsorbing fragments, which can be seen in Figures 27-31. The z-displacement of atoms belonging to the adsorption site of each fragment are presented in Tables 7-11. The atom(s) belonging to each adsorption site are defined as Ga (Ga-top), N (N-top), and two adjacent Ga atoms (bridge). In the unrelaxed structure, the z-position for the surface layer Ga-atoms was defined as 0.0, meaning the whole surface plane was also defined to be positioned at z=0. The 58 z displacement for the adsorption site atoms was calculated by subtracting the original unrelaxed z-position in the lattice from their corresponding value in the relaxed structure. Negative displacement indicates that the atom in the surface had shifted lower in z-direction compared to the original unrelaxed position in the lattice. The surface atoms at the adsorption site shifted upwards in the case of Ga-fragment adsorption on the bridge site, but downwards in all other cases. Table 6. Unique relaxed LM and their Eads values and degree of dissociation. Structure Eads (eV) Number of fragments LM1 (global minimum) -7.24 3 LM2 -6.84 2 LM3 -6.82 2 LM4 -6.45 3 LM5 -6.18 1 The global minimum structure LM1 dissociated into three fragments, visible in Figure 27. Differentiating factor for LM1 from other minima was the adsorption site for F1, which was the N-top site. LM1 also included two dissociated Cl fragments F2 and F3, which were situated equal distance apart from F1, on bridge and Ga-top sites. F1 had a Ga-Cl bond length of 2.17 Å, meaning the bond had increased slightly from the GaCl3 bond lengths of 2.12 Å, after the two Cl atoms had dissociated. The GaCl fragment was oriented directly upright on top of the N-stop site, meaning the Ga-Cl bond was not tilted. Figure 27. LM1 configuration from top and side. F1, F2 and F3 correspond to GaCl, Cl and Cl. 59 Table 7. Characteristics of LM1 adsorbed fragments. Fragment Chemical species Adsorption site Distance to surface plane (Å) Dissociation distance from F1 (Å) Mean z displacement of adsorption site surface atom(s) (Å) F1 GaCl N-top 2.0 - -0.19 F2 Cl Bridge 1.8 5.0 -0.15 F3 Cl Ga-top 2.2 5.0 -0.02 LM2 was the second lowest energy structure, and it included two fragments, GaCl2 and Cl atom, both of which had adsorbed onto bridge sites (Figure 28, Table 8). The Cl-Ga-Cl angle in GaCl2 was 103 º, with Ga-Cl bond lengths of 2.17 and 2.65 Å. Former of these bonds corresponded to the Cl directly above adsorbing Ga, while the latter Cl was situated on adjacent Ga-top site. The adsorption of the bonded Cl on the surface thus seemed to increase the existing Ga-Cl bond. Figure 28. LM2 configuration from top and side. F1 and F2 correspond to GaCl2 and Cl. Table 8. Characteristics of LM2 adsorbed fragments. Fragment Chemical species Adsorption site Distance to surface plane (Å) Dissociation distance from F1 (Å) Mean z displacement of adsorption site surface atom(s) (Å) F1 GaCl2 Bridge 2.1 - 0.13 F2 Cl Bridge 1.8 5.3 -0.16 The structure of LM3 (Figure 29, Table 9) was very similar to LM2, because it also included two fragments, GaCl2 and Cl, with only the adsorption site of Cl being Ga-top instead of bridge. This similarity in structure is highlighted by their adsorption energies as well, with 60 only 0.02 eV difference between them. Similarly to LM2, the Cl-Ga-Cl bond angle was 104 º, and the individual Ga-Cl bond lengths were 2.16 and 2.67 Å. Figure 29. LM3 configuration from top and side. F1 and F2 correspond to GaCl2 and Cl. Table 9. Characteristics of LM3 adsorbed fragments. Fragment Chemical species Adsorption site Distance to surface plane (Å) Dissociation distance from F1 (Å) Mean z displacement of adsorption site surface atom(s) (Å) F1 GaCl2 Bridge 2.1 - 0.16 F2 Cl Ga-top 2.1 4.9 -0.03 LM4 structure (Figure 30, Table 10) included three fragments, GaCl and two Cl atoms. The two dissociated Cl atoms were situated closest to F1 compared to the other LM structures, with 3.9 Å dissociation distance for both. In F1 the Ga-Cl bond length was 2.15 Å, which was in line with the similar bonds in the other LM structures. Figure 30. LM4 configuration from top and side. F1, F2 and F3 correspond to GaCl, Cl and Cl. 61 Table 10. Characteristics of LM4 adsorbed fragments. Fragment Chemical species Adsorption site Distance to surface plane (Å) Dissociation distance from F1 (Å) Mean z displacement of adsorption site surface atom(s) (Å) F1 GaCl Bridge 2.1 - 0.13 F2 Cl Ga-top 2.1 3.9 -0.05 F3 Cl Ga-top 2.2 3.9 -0.02 LM5 (Figure 31, Table 11) was the least favourable adsorption configuration of the 5 minima, although it still has a quite negative value of Eads, -6.18 eV. It was the only one of the five unique local minima that did not dissociate after relaxation., and the adsorbate remained as GaCl3. It was also the only LM where the Ga-containing fragment adsorbed on the Ga-top site. The GaCl3 fragment was oriented in a way that one Cl atom was pointing directly upwards above the adsorbed Ga, while the two remaining bonded chlorines were adsorbed on adjacent Ga-top sites. The Cl-Ga-Cl bond angles were 101 º and 103 º between the top-facing Cl and two adsorbed Cl atoms. The bond angle within the adsorbed Cl atoms was smaller, 86 º. Figure 31. LM5 configuration from top and side. F1 corresponds to GaCl3. Table 11. Characteristics of LM5 adsorbed fragments. Fragment Chemical species Adsorption site Distance to surface plane (Å) Dissociation distance from F1 (Å) Mean z displacement of adsorption site surface atom(s) (Å) F1 GaCl3 Ga-top 2.4 - -0.04 Based on the analysis of all LM, adsorbed fragments are located in either Ga-top, N-top or bridge sites, but noticeably not on the hollow site. Based on the adsorption energies, N-top could be the most preferrable adsorption site for Ga-containing fragments, which indicates 62 that the adsorbing Ga-fragment will likely prefer to situate as close to the surface as possible, indicating the interaction between the Ga-atom and the surface is favorable. This attraction of Ga to the surface is therefore so strong, that it either causes the Ga-Cl bonds to break and GaCl3 to dissociate (LM1-LM4), or the bonds will bend to accumulate preferred adsorption configuration for Ga in case the molecule does not dissociate (LM5). Also, notably in all LM structures, no dissociated Cl atoms adsorbed onto nitrogen, but rather either on Ga-top or bridge sites. To further investigate the bonding and chemical meaning behind these adsorption configurations I performed pDOS and Mulliken analysis with DFT to study their electronic structure properties. 4.5 Electronic structure properties of local minima 4.5.1 Projected density of states Projected density of states (pDOS) method was used to study the electronic properties of the LM structures after the GaCl3 adsorption on GaN(0001) surface. The pDOS was analyzed for the top bilayer of the surface slab where the surface states would be localized, instead of the whole slab, which mainly would provide additional information about the bulk states. The pDOS plots of the isolated GaCl3 molecule and the GaN(0001) top bilayer (1 BL), were compared with the pDOS contributions of the adsorbed molecule, and 1 BL after adsorption in each local minima LM1-LM5. Figure 32 presents the pDOS for these isolated components. In the isolated GaN 1 BL, there exist bands in the band gap region, arising from the surface dangling bonds of Ga atoms. This electronic behavior is aligned with similar previous results [73], [74]. The surface states between -1.0 and 1.0 eV indicate that the 1 BL is slightly metallic, and the surface Ga atoms are free to bond with adsorbates. 63 Figure 32. pDOS for the isolated GaN 1 BL, total GaN slab, and the GaCl3 molecule. The bands in the band gap region are caused by dangling bonds in Ga atoms. The pDOS for GaCl3 in vacuum (Figure 33) displays peaks in regions corresponding to -4.0, -3.5, 1.9, and 3.5 eV. The peak in -3.5 eV corresponds to the highest occupied molecular orbital (HOMO), and the peak in 1.9 eV the lowest unoccupied molecular orbital (LUMO). The individual Ga and Cl atom pDOS contributions to GaCl3 indicate that the states corresponding to the HOMO region arise mostly from Cl atoms, while the states in LUMO are caused by contribution of Ga orbitals. Figure 33. Isolated GaCl3 and contributions of the individual atoms to the pDOS. Left: Contribution of the Ga atom. Right. Contribution of a Cl atom. Only one pDOS for the Cl atoms is presented here, as the contributions are identical due to the symmetric nature of the molecule. The changes in the electronic structure of GaN 1 BL after GaCl3 adsorption are presented in Figure 34 (left column), which presents the pDOS for the isolated bilayer as refence and each LM 1 BL contribution. In all LM structures, the available surface states of the GaN 1 BL are partly saturated by the adsorbate, which is indicated by the decrease in the number of states in 64 the region of -1.0-1.0 eV. This suggests that some of the dangling bonds in the surface are engaged in covalent bonding with the adsorbate. The 1 BL still remains slightly metallic after adsorption. The effect of adsorption on GaCl3 electronic structure is presented in Figure 34 (right column), which shows the pDOS for all LM adsorbates, as well as the isolated GaCl3 as reference. In LM1-LM4 the molecule is partly dissociated and the pDOS shows combined effects of the dissociated fragments, while in LM5 the adsorbate remains as GaCl3 after relaxation. Individual atom contributions of GaCl3 to pDOS are presented in the Appendix. LM1 is the global minimum structure, with a GaCl adsorbed on N-top site and two dissociated Cl atoms adsorbed on Ga-top and bridge sites. There is visible shift and broadening of the HOMO states, caused by the dissociation and subsequent adsorption of the two Cl atoms. There also is a new peak around 0.0 eV. LM2 and LM3 structures each have two fragments, GaCl2 molecule on bridge site, and a dissociated Cl atom on either bridge (LM2) or Ga-top (LM3) site. Because the structures are similar except for the Cl adsorptions site, the pDOS plots also show similar characteristics. Both of these configurations have a largest peak around the HOMO, but LM2 has less states in that energy region than LM3. Additionally, LM2 and LM3 both have two similar smaller peaks at around -1.0 and 0.0 eV. LM4 fragments include a GaCl adsorbed on the bridge site, and two Cl atoms adsorbed on Ga-top sites. There is a concentration of states In the HOMO region, around -3.5 eV, and two small peaks in the HOMO-LUMO gap. In LM5, the non-dissociated GaCl3 fragment has its Ga-atom adsorbed on the Ga-top site. Because the adsorbate has not dissociated, corresponding pDOS shows least changes in the positions of the peaks and the HOMO- LUMO gap compared to the isolated GaCl3 molecule. Most importantly, for all LM configurations the pDOS results indicate some type of chemisorption rather than physisorption. Physisorption would not affect the electronic states noticeably, because it involves weak van der Waals interactions, and won’t result in chemical bond formation via shared electrons. The observed saturation of 1 BL surface states indicates hybridization and covalent bonding between the adsorbate and the surface. There is also noticeable shift and broadening of the HOMO and LUMO peaks of the LM adsorbates, meaning their electronic structure is affected by the adsorption. 65 Figure 34. pDOS for the five LM. Left: pDOS for all LM GaN 1 BL contributions. Right: pDOS for all LM adsorbate contributions. The plots also include the pDOS for the isolated reference structures. 66 4.5.2 Mulliken charges Mulliken analysis of partial charges was conducted to analyze the charge distribution of the local minima structures. In an isolated state the GaCl3 molecule is neutral, with Ga atom having a partial charge of 0.81 electrons (e), while the three Cl atoms each having equal charge of -0.27 e. This means the electrons in Ga-Cl bonds are more attracted to Cl, which is also supported by their electronegativity values of 1.81 for Ga and 3.16 for Cl. Isolated GaN surface has a partial charge of effectively zero, indicating the slab is close to neutral due to the pseudohydrogens balancing the ECR. The sum of partial charges for isolated GaCl3 and GaN slab, as well as the corresponding values extracted from the local minima partial charges, are shown in Table 12. The atom- specific partial charges for GaCl3 and LM are also visualized in Figure 35. In the local minima structures, the adsorbing molecule gains negative charge, while the GaN slab loses charge an equal amount. The charge transfer is strongest in the global minimum, which indicates LM1 configuration is the most suitable for GaCl3 to chemically bond with the surface. The magnitude of charge transfer also correlates to the strength of the chemical bonds formed in each LM, also explaining the relative ordering of the Eads values. From Figure 35, it can be seen that the adsorption results in both Ga and Cl atoms in GaCl3 to gain electrons, indicating Ga-Cl bonds in GaCl3 break in favor of the Ga-fragment to bond with the surface to gain electrons. Similarly, dissociated Cl atoms also gain electrons by bonding with the surface atoms. The partial charges of the Cl atoms across all LM are relatively similar, but the change in charge for Ga-atom is noticeably different for the LM structures. In LM1, Ga-atom has gained the most electron density, which correlates to it bonding the strongest with the surface. The Ga-atom in the non-dissociated LM5 has gained the least electron density, while LM2, LM3 and LM4 each present similar amount of charge transfer for adsorbed Ga-atom. Table 12. The sum of atom-specific partial charges of the isolated GaN slab and GaCl3, and the corresponding value for each LM structure. The sign of Δq indicates whether the whole structure has gained (-) or lost (+) electron density. Structure Δqslab (e) Δqmol (e) Isolated model 0.0 0.0 LM1 (global minimum) 1.08 -1.08 LM2 0.86 -0.86 LM3 0.82 -0.82 LM4 0.83 -0.83 LM5 0.55 -0.55 67 Figure 35. The atom-specific partial charges for the isolated GaCl3 and the LM adsorbates. The grey bars indicate the sum of partial charges for the whole adsorbate. To further quantify the charge transfer, the change in summed partial charges ΔQ due to adsorption for each fragment is presented in Table 13. The results suggest that the adsorption site of the Ga-fragment (F1) in the LM configuration affects the strength of bonding between the surface and adsorbate the most, because F1 adsorbed on N-top site has gained the most charge, followed by bridge, and lastly Ga-top site. These charges agree with the adsorption energy values, also indicating strongest bonding on the N-top, and weakest on the Ga-top site. However, the amount of charge transfer for the adsorption site atoms is not directly corresponding to the charge gained by the adsorbed fragments. This indicates the charge transfer between the molecule and the surface is not entirely localized between the fragment and the underlaying adsorption site atoms. The adsorbed F1 fragments gain at minimum over half an electron of charge, so some of the surface atoms must donate the electrons. Further analysis on the individual partial charges for all the surface supercell atoms would be needed to exactly pinpoint the atoms donating charge to the fragments, and magnitude of this charge transfer. For all LM structures, dissociated Cl atoms gain charge due to bonding with their adsorption site atoms. The dissociation and subsequent adsorption of Cl atoms on either Ga-top or bridge sites results in the Cl atoms having slightly more negative charge than when engaged 68 in bonding in GaCl3. The gained charge is on average -0.11 e, and it is not dependent on the adsorption site. For the adsorption site atom(s), bonding with dissociated Cl decreases the electron density, indicated by positive ΔQsite values. The charge on adsorption sites changes on average by 0.23 e. The adsorption site atoms lose more charge than what is gained by the Cl atoms, so roughly half of the electron density (0.10 e) of the site is distributed elsewhere in the structure. Table 13. The change in total sum of partial charge for adsorbed fragments and corresponding adsorption site atom(s) after adsorption. Local minima Fragment Adsorption site ΔQfrag (e) ΔQsite (e) LM1 GaCl N-top -0.86 -0.01 Cl Bridge -0.11 0.27 Cl Ga-top -0.10 0.24 LM2 GaCl2 Bridge -0.73 0.00 Cl Bridge -0.13 0.27 LM3 GaCl2 Bridge -0.71 0.03 Cl Ga-top -0.11 0.21 LM4 GaCl Bridge -0.60 0.04 Cl Ga-top -0.11 0.23 Cl Ga-top -0.12 0.23 LM5 GaCl3 Ga-top -0.55 0.08 69 5 Conclusions In this thesis I conducted a computational study to evaluate the behaviour and suitability of GaCl3 molecule as an ALD precursor for GaN(0001) fabrication. GaN and other WBG materials have attracted interest in semiconductor technology due to their improved properties compared to silicon. However, currently there is a knowledge gap on the detailed adsorption mechanism of GaCl3 on GaN(0001), which is why I utilized computational methods to quantitatively study the adsorption of GaCl3 in atomic scale. I did this with combining DFT simulations with ML-based BOSS method, which provided me a way to study this complex adsorption system with lesser number of computations, while still maintaining accuracy. My research topic provided an optimum target application for BOSS, a multi-dimensional problem, with a computationally expensive model system. Before the 6D structure search I constructed and validated DFT simulation models for the GaCl3 molecule and GaN(0001) surface, which was defined as my research objective 1. This extensive validation of the simulation models was done to ensure the adsorption energy data used for BOSS would be representative of the chemical and physical behaviour of the model system, and thus also the optimal adsorption configurations found with BOSS would be reasonable. This part of my thesis can also act as a guide of important steps when performing similar computational work on surface adsorption. My second research objective was to map out the 6D AES of the adsorbing GaCl3 molecule, which was successfully done with 2119 DFT adsorption energy evaluations. The BOSS model convergence was thoroughly verified, which ensured the AES and corresponding local minima was well defined. The third and main objective of this thesis was to extract the local minima structures found with the BOSS method, and analyse the optimal adsorption configurations of the adsorbate. I found 20 unique local minima structures in the propositioned ALD temperature range. After removing the BB approximation and relaxing the minima structures with DFT, these 20 original minima could be categorized into five unique structure types. All but the least favourable configuration of the five experienced dissociation of the GaCl3 molecule, into Ga-containing fragments and Cl atoms. Notably in the case of my studied adsorption system, the BB approximation was significantly restricting the preferred adsorption configurations of the GaCl3 molecule, which included the aforementioned dissociation into fragments. Despite this, the subsequent DFT relaxation still managed to produce unique and meaningful final local minima configurations form the BB 70 approximated models, suggesting BOSS is suitable to study systems which might exhibit dissociation. The five unique local minima structures LM1-LM5 had adsorption energies ranging from -7.24 eV for the global minimum to -6.18 eV for the least favourable adsorption configuration of the five. The magnitude of these negative Eads values indicated quite strong interaction between the molecule and the surface. The global minimum configuration LM1 dissociated into GaCl and two Cl fragments. LM4 also dissociated similarly to LM1. LM2 and LM3 included two fragments, GaCl2 and Cl atom, while LM5 consisted of non-dissociated GaCl3. Based on the Eads values, from all the GaN(0001) surface adsorption sites, N-top seemed to be the most preferred for Ga-fragment adsorption, followed by bridge and Ga-top, while hollow site was absent on the observed adsorption configurations. The bond dissociation energy for Ga-Cl bonds in GaCl3 was calculated to be -4.16 eV, distinctly smaller than the Eads values of -6.18 to -7.24 eV for the LM, indicating the bonds formed between surface and the molecule during adsorption were stronger than in the isolated molecule. To further study the adsorption properties of the minima, I conducted an electronic structure analysis by pDOS and Mulliken partial charges. pDOS analysis implied the adsorption mechanism between the surface and the molecule is chemisorption, due to the change in their electronic states upon adsorption, suggesting chemical bonding. Analysis of the partial charges further confirmed the formation of bonds, as the adsorbed fragments had gained significant amount of electron density after adsorption, with the global minimum LM1 F1 fragment gaining almost a full electron, and the least favourable LM5 F1 fragment approximately half an electron. The adsorption site atoms however did not lose charge a corresponding amount, which suggests the electrons gained by the fragments are from multiple surface atoms, and not caused by direct charge transfer between the fragment and the site. Therefore, further research is needed to verify which surface atoms are most involved in the charge transfer and bonding with the fragments. To sum up, the main conclusions that can be drawn based on this work are that the GaCl3 molecule partly dissociates when introduced to the GaN(0001) surface, implying that there is preference for the Ga-atom in GaCl3 molecule to break existing Ga-Cl bonds in favour of stronger bonding with the surface atoms. For ALD precursor this is a useful property, because the adsorption of the Ga-atoms is the main goal, and Cl atoms should be easily removable from the system. Additionally for the purposes of ALD, it would be beneficial if adsorbing 71 Ga atoms and Cl atoms would not prefer to adsorb on the same sites on the surface, as this would complicate the stoichiometric control over the deposition process. The global minimum adsorption site for Ga-fragments was the N-top site, while dissociated Cl was observed either on Ga-top or bridge sites, which means in theory the preferred adsorption sites are different for Ga-fragments and Cl. Most significant impact of my thesis work has been the expansion of the previously unknown chemical understanding of the fundamental surface reactions enabling GaN(0001) deposition with GaCl3 through ALD. This understanding can aid in the evaluation of the potential for GaCl3 as an ALD precursor, which could be further tested in experimental conditions. For future work on this topic, as mentioned previously, more detailed understanding of the exact adsorption and ALD process could be achieved with additional research on the charge transfer between the individual surface atoms and the adsorbate. Other logical next step for research includes modelling additional aspects of the ALD cycle, such as the interaction of multiple adsorbing GaCl3 molecules on the GaN(0001) surface. This approach would provide information about the adsorption process more representative of the true ALD conditions, where the GaCl3 molecules are pulsed onto the surface in excess. These studies would be useful, because there is a scarcity of information about the mechanics of adsorption of multiple GaCl3 molecules, and GaCl3-GaCl3 interactions and their possible effect on the molecule dissociation. However, this approach would also add a new level of complexity to the model system, which might require the need to reconsider the best computational methodology for this task. BOSS could be potentially utilized by considering the adsorbed GaCl3 on Ga(0001) LMs the as the base structures, where an additional GaCl3 molecule would be introduced. Additionally, further studies could be done on the effect of the purging gas and nitrogen- source precursor molecules to the adsorption of GaCl3. In the case of GaCl3 usage for ALD, the existing research is quite sparse and over 15 years old. As mentioned in Chapter 1, computational methods are a viable tool to systematically study a large number of possible molecules and their suitability for ALD. Previously unexplored precursor combinations could be tested based on novel developments and knowledge about reactants for GaN(0001) manufacturing with ALD. The use of computational high-throughput sampling would enable to efficiently inspect multiple potential precursor molecules for GaN(0001) deposition, and thus find the combination which would present most promising properties for the material. 72 References [1] IEA International Energy Agency, Electricity 2025 report, (2025), https://www.iea.org/reports/electricity-2025 (accessed 9.3.2025). [2] L. A. Yeboah, P. A. Oppong, A. Abdul Malik, P. S. Acheampong, J. A. Morgan. R. A. Addo, Sustainable Manufacturing and Applications of Wide-Bandgap Semiconductors -A Review, Intell. Sustain. Manuf., 2, (2025), 10010, doi: 10.70322/ism.2025.10010. [3] M. Buffolo, D. Favero, A. Marcuzzi, C. De Santi, G. Meneghesso, E. Zanoni, Review and Outlook on GaN and SiC Power Devices: Industrial State-of-the-Art, Applications, and Perspectives, IEEE Trans. Electron Devices, 71, (2024), 1344–1355, doi: 10.1109/TED.2023.3346369. [4] A. Udabe, I. Baraia-Etxaburu, D. G. Diez, Gallium Nitride Power Devices: A State of the Art Review, IEEE Access, 11, (2023), 48628–48650, doi: 10.1109/ACCESS.2023.3277200. [5] T. J. Flack, B. N. Pushpakaran, S. B. Bayne, GaN Technology for Power Electronic Applications: A Review, J. Electron. Mater., 45, (2016), 2673–2682, doi: 10.1007/s11664-016- 4435-3. [6] J. Lutz, H. Schlangenotto, U. Scheuermann, R. De Doncker, Semiconductor Power Devices: Physics, Characteristics, Reliability, 2nd Edition, Springer Cham, eBook, 2018, p. 24, 26, 35, doi: 10.1007/978-3-319-70917-8. [7] P. Yu, M. Cardona, Fundamentals of Semiconductors: Physics and Materials Properties, 4th Edition, Springer Berlin, Heidelberg, eBook, 2010, p. 1, doi: 10.1007/978-3-642-00710-1. [8] J. Song, The history and trends of semiconductor materials’ development, J. Phys. Conf. Ser., 2608, (2023), 012019, doi: 10.1088/1742-6596/2608/1/012019. [9] C. P. C. V. Bernardo, R. A. M. Lameirinhas, J. P. de Melo Cunha, J. P. N. Torres, A revision of the semiconductor theory from history to applications, Discov. Appl. Sci., 6, (2024), 316, doi: 10.1007/s42452-024-06001-1. [10] G. Iannaccone, C. Sbrana, I. Morelli, S. Strangio, Power Electronics Based on Wide-Bandgap Semiconductors: Opportunities and Challenges, IEEE Access, 9, (2021), 139446–139456, doi: 10.1109/ACCESS.2021.3118897. [11] F. Roccaforte, P. Fiorenza, G. Greco, R. Lo Nigro, F. Giannazzo, F. Iucolano, M. Saggio, Emerging trends in wide band gap semiconductors (SiC and GaN) technology for power devices, Microelectron. Eng., 187–188, (2018), 66–77, doi: 10.1016/j.mee.2017.11.021. [12] D. J. Schepis, The role of thin film in nanothecnology, in Handbook of Thin Film Deposition, D. Schepis, K. Seshan, Eds., 5th Edition, Elsevier, eBook, 2024, p. 4, doi: 10.1016/C2022-0-01474- 4. [13] T. J. L. Mustard, H. S. Kwak, A. Goldberg, J. Gavartin, T. Morisato, D. Yoshidome, M. D. Halls, Quantum Mechanical Simulation for the Analysis, Optimization and Accelerated Development of Precursors and Processes for Atomic Layer Deposition (ALD), J. Korean Ceram. Soc., 53, (2016), 317–324, doi: 10.4191/kcers.2016.53.3.317. [14] P. O. Oviroh, R. Akbarzadeh, D. Pan, R. A. M. Coetzee, T.-C. Jen, New development of atomic layer deposition: processes, methods and applications, Sci. Technol. Adv. Mater., 21, (2019) 465–496, doi: 10.1080/14686996.2019.1599694. [15] S. Shankar, H. Simka, M. Haverty, Density functional theory and beyond -opportunities for quantum methods in materials modeling semiconductor technology, J. Phys. Condens. Matter, 20, (2008), 064232, doi: 10.1088/0953-8984/20/6/064232. [16] J. Ling, M. Hutchinson, E. Antono, S. Paradiso, B. Meredig, High-Dimensional Materials and Process Optimization Using Data-Driven Experimental Design with Well-Calibrated Uncertainty Estimates, Integrating Mater. Manuf. Innov., 6, (2017), 207–217, doi: 10.1007/s40192-017-0098-z. [17] T.-S. Vu, M.-Q. Ha, D.-N. Nguyen, V.-C. Nguyen, Y. Abe, T. Tran, H. Tran, H. Kino, T. Miyake, K. Tsuda, H.-C. Dam, Towards understanding structure–property relations in materials with interpretable deep learning, Npj Comput. Mater., 9, (2023) 1–12, doi: 10.1038/s41524-023- 01163-9. 73 [18] R. Arróyave, D. Khatamsaz, B. Vela, et al., A perspective on Bayesian methods applied to materials discovery and design, MRS Commun., 12, (2022), 1037–1049, doi: 10.1557/s43579- 022-00288-0. [19] F. Al-Quaiti, A. A. Demkov, Theoretical study of GaN (0001) surface reconstructions and La and Ga adatoms under N- and Ga-rich conditions, Phys. Rev. Mater., 5, (2021), 044602, doi: 10.1103/PhysRevMaterials.5.044602. [20] M. S. Kang, C.-H. Lee, J. B. Park, H. Yoo, G.-C. Yi, Gallium nitride nanostructures for light- emitting diode applications, Nano Energy, 1, (2012), 391–400, doi: 10.1016/j.nanoen.2012.03.005. [21] L. Liu, J. H. Edgar, Substrates for gallium nitride epitaxy, Mater. Sci. Eng. R Rep., 37, (2002), 61–127, doi: 10.1016/S0927-796X(02)00008-6. [22] R. M. Feenstra, J. E. Northrup, J. Neugebauer, Review of Structure of Bare and Adsorbate- Covered GaN(0001) Surfaces, MRS Internet J. Nitride Semicond. Res., 7, (2002), doi: 10.1557/S1092578300000296. [23] B. Shen, N. Tang, X. Q. Wang, Z. Z. Chen, F. J. Xu, X. L. Yang, T. J. Yu, J. J. Wu, Z. X. Qin, W. Y. Wang, Y. X. Feng, W. K. Ge, III-Nitride Materials and Characterization in Handbook of GaN Semiconductor Materials and Devices, W. Bi, H. Kuo, P. Ku, B. Shen, Eds., 1st Edition, CRC Press, eBook, 2017 p. 4, doi: 10.1201/9781315152011. [24] V. M. Bermudez, The fundamental surface science of wurtzite gallium nitride, Surf. Sci. Rep., 72, (2017), 147–315, doi: 10.1016/j.surfrep.2017.05.001. [25] A. Zaharo, A. Purqon, The Calculation Study of Electronic Properties of Doped RE (Eu, Er and Tm)-GaN using Density Functional Theory, J. Phys. Conf. Ser., 877, (2017), 012051, doi: 10.1088/1742-6596/877/1/012051. [26] Y. Qian, F. Shang, Q. Wan, Y. Yan, A molecular dynamics study on indentation response of single crystalline wurtzite GaN, J. Appl. Phys., 124, (2018), 115102, doi: 10.1063/1.5041738. [27] Razia, M. Chugh, M. Ranganathan, Surface energy and surface stress of polar GaN(0001), Appl. Surf. Sci., 566, (2021), 150627, doi: 10.1016/j.apsusc.2021.150627. [28] J. Zhou, G. Zheng, X. Liu, G. Dong, J. Qiu, Defect engineering in lanthanide doped luminescent materials, Coord. Chem. Rev., 448, (2021), 214178, doi: 10.1016/j.ccr.2021.214178. [29] M. Grodzicki, Properties of Thin Film-Covered GaN(0001) Surfaces, Mater. Proc., 2, (2020) 30, doi: 10.3390/CIWC2020-06833. [30] D. Ueda, Properties and advantages of gallium nitride, in Power GaN Devices: Materials, Applications and Reliability, M. Meneghini, G. Meneghesso, E. Zanoni, Eds., 1st Edition, Springer Cham, eBook, 2017, p. 6-7, doi: 10.1007/978-3-319-43199-4. [31] R. L. Puurunen, A Short History of Atomic Layer Deposition: Tuomo Suntola’s Atomic Layer Epitaxy, Chem. Vap. Depos., 20, (2014), 332–344, doi: 10.1002/cvde.201402012. [32] T. Suntola, Atomic layer epitaxy, Thin Solid Films, 216, (1992), 84–89, doi: 10.1016/0040- 6090(92)90874-B. [33] M. Leskelä, M. Ritala, Atomic layer deposition (ALD): from precursors to thin film structures, Thin Solid Films, 409, (2002), 138–146, doi: 10.1016/S0040-6090(02)00117-7. [34] J. A. Oke, T.-C. Jen, Atomic layer deposition and other thin film deposition techniques: from principles to film properties, J. Mater. Res. Technol., 21, (2022), 2481–2514, doi: 10.1016/j.jmrt.2022.10.064. [35] H. Kim, Characteristics and applications of plasma enhanced-atomic layer deposition, Thin Solid Films, 519, (2011), 6639–6644, doi: 10.1016/j.tsf.2011.01.404. [36] O. H. Kim, D. Kim, T. Anderson, Atomic layer deposition of GaN using GaCl3 and NH3, J. Vac. Sci. Technol. Vac. Surf. Films, 27, (2009), 923–928, doi: 10.1116/1.3106619. [37] H. Tsuchiya, M. Akamatsu, M. I. Masahiko Ishida, F. H. Fumio Hasegawa, Layer-by-Layer Growth of GaN on GaAs Substrates by Alternate Supply of GaCl3 and NH3, Jpn. J. Appl. Phys., 35, (1996) L748, doi: 10.1143/JJAP.35.L748. [38] V. I. Tsirelnikov, B. V. Lokshin, P. Melnikov, V. A. Nascimento, On the Existence of the Trimer of Gallium Trichloride in the Gaseous Phase, J Inorg. General Chem., 638, (2012), 2335-2339, doi: 10.1002/zaac.201200282. 74 [39] D. S. Sholl, J. A. Steckel, Density Functional Theory: A Practical Introduction, 1st Edition, John Wiley & Sons, Incorporated, ProQuest Ebook Central, 2009, p. 15-19, 50-53, doi:10.1002/9780470447710. [40] E. Schrödinger, An Undulatory Theory of the Mechanics of Atoms and Molecules, Phys. Rev., 28, (1926), 1049–1070, doi: 10.1103/PhysRev.28.1049. [41] M. Born, R. Oppenheimer, Zur Quantentheorie der Molekeln, Ann. Phys., 389, (1927), 457–484, doi: 10.1002/andp.19273892002. [42] P. Hohenberg, W. Kohn, Inhomogeneous Electron Gas, Phys. Rev., 136, (1964), B864–B871, doi: 10.1103/PhysRev.136.B864. [43] T. Tsuneda, Density Functional Theory in Quantum Chemistry, 1st Edition, Springer Tokyo, eBook, 2014, p. 81, doi: 10.1007/978-4-431-54825-6. [44] W. Kohn, L. J. Sham, Self-Consistent Equations Including Exchange and Correlation Effects, Phys. Rev., 140, (1965), A1133–A1138, doi: 10.1103/PhysRev.140.A1133. [45] J. Järvi, Structure search of molecular adsorbates with Bayesian inference and density- functional theory, Doctoral Thesis, Aalto University, Finland, 2023, https://aaltodoc.aalto.fi/handle/123456789/120176 [46] V. Brázdová, D. R. Bowler, Atomistic Computer Simulations: A practical guide, Wiley‐VCH Verlag GmbH & Co. KGaA, eBook, 2013, p. 44-46, 113-114, 128-133, 139-141, 180, 182-183, 214, 269-275, doi:10.1002/9783527671816. [47] J. P. Perdew, Generalized Gradient Approximation Made Simple, Phys. Rev. Lett., 77, (1996), 3865–3868, doi: 10.1103/PhysRevLett.77.3865. [48] A. Tkatchenko, Accurate Molecular Van Der Waals Interactions from Ground-State Electron Density and Free-Atom Reference Data, Phys. Rev. Lett., 102, (2009), 073005, doi: 10.1103/PhysRevLett.102.073005. [49] P. Kratzer, J. Neugebauer, The Basics of Electronic Structure Theory for Periodic Systems, Front. Chem.,7, (2019), 106, doi: 10.3389/fchem.2019.00106. [50] W. Setyawan, S. Curtarolo, High-throughput electronic band structure calculations: Challenges and tools, Comput. Mater. Sci., 49, (2010), 299–312, doi: 10.1016/j.commatsci.2010.05.010. [51] R. S. Mulliken, Electronic Population Analysis on LCAO–MO Molecular Wave Functions. I, J. Chem. Phys., 23, (1955), 1833–1840, doi: 10.1063/1.1740588. [52] L. Bengtsson, Dipole correction for surface supercell calculations, Phys. Rev. B - Condens. Matter Mater. Phys., 59, (1999), 12301–12304, doi: 10.1103/PhysRevB.59.12301. [53] J. Järvi, P. Rinke, M. Todorović, Detecting stable adsorbates of (1S)-camphor on Cu(111) with Bayesian optimization, Beilstein J. Nanotechnol., 11, (2020), 1577–1589, doi: 10.3762/bjnano.11.140. [54] S. Goedecker, W. Hellmann, T. Lenosky, Global Minimum Determination of the Born- Oppenheimer Surface within Density Functional Theory, Phys. Rev. Lett., 95, (2005), 055501, doi: 10.1103/PhysRevLett.95.055501. [55] B. Settles, Active Learning Literature Survey: Computer Sciences Technical Report 1648, University of Wisconsin–Madison, 2010, http://digital.library.wisc.edu/1793/60660, accessed (15.3.2025) [56] BOSS Code Repository, https://gitlab.com/cest-group/boss (accessed 21.3.2025) [57] P. I. Frazier, J. Wang, Bayesian optimization for materials design, Springer Series in Materials Science, 2015, doi: 10.48550/arXiv.1506.01349. [58] S. M. Lynch, Basics of Bayesian Statistics, in Introduction to Applied Bayesian Statistics and Estimation for Social Scientists, 1st Edition, Springer New York, eBook, 2007, p. 47, doi: 10.1007/978-0-387-71265-9. [59] A. Denzel, J. Kästner, Gaussian Process Regression for Geometry Optimization, J. Chem. Phys., 148, (2018), 094114, doi: 10.48550/arXiv.2009.05803. [60] C. E. Rasmussen, C. K. I. Williams, Gaussian processes for machine learning, MIT Press, 2006, eBook, p. 13, doi: 10.7551/mitpress/3206.001.0001. [61] H. Paulamäki, Bayesian optimization structure search, Master’s thesis, University of Helsinki 2019, http://hdl.handle.net/10138/308237 75 [62] M. U. Gutmann, J. Corander, Bayesian Optimization for Likelihood-Free Inference of Simulator- Based Statistical Models, J. Mach. Learn. Reasearch, 17, (2016), 1–47, doi: 10.48550/arXiv.1501.03291. [63] V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu, X. Ren, K. Reuter, M. Scheffler, Ab initio molecular simulations with numeric atom-centered orbitals, Comput. Phys. Commun., 180, no. (2009), 2175–2196, doi: 10.1016/j.cpc.2009.06.022. [64] Fritz Haber Institute ab initio molecular simulations: FHI-aims -A Users’ Guide manual (2025) https://fhi-aims.org/uploads/documents/FHI-aims.250320_1.pdf (accessed 22.2.2025) [65] M. D. Hanwell, D. E. Curtis, D. C. Lonie, T. Vandermeersch, E. Zurek, G. R. Hutchison, Avogadro: an advanced semantic chemical editor, visualization, and analysis platform, J. Cheminformatics, 4, (2012), 17, doi: 10.1186/1758-2946-4-17. [66] A. K. Rappe, C. J. Casewit, K. S. Colwell, W. A. I. Goddard, W. M. Skiff, UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations, J. Am. Chem. Soc., 114, (1992), 10024–10035, doi: 10.1021/ja00051a040. [67] The Materials Project web page, GaN mp-804 structure, https://next- gen.materialsproject.org/materials/mp-804?formula=GaN (accessed 24.3.2025) [68] I. M. Sobol, On the distribution of points in a cube and the approximate evaluation of integrals, USSR Comput. Math. Math. Phys., 7, (1967), 86–112, doi: 10.1016/0041-5553(67)90144-9. [69] A. Haaland, A. Hammel, K.-G. Martinsen, J. Tremmel, H. V. Volden, Molecular structures of monomeric gallium trichloride, indium trichloride and lead tetrachloride by gas electron diffraction, J. Chem. Soc. Dalton Trans., issue 14, (1992), 2209–2214, doi: 10.1039/DT9920002209. [70] S. J. Blanksby, G. B. Ellison, Bond Dissociation Energies of Organic Molecules, Acc. Chem. Res., 36, (2003), 255–263, doi: 10.1021/ar020230d. [71] AFLOW database: Automatic – FLOW for Materials Discovery, GaN (ICSD #41481), https://aflow.org/material/?id=aflow:46542a27230c53ac (accessed 23.2.2025) [72] C. Stampfl, Density-functional calculations for III-V nitrides using the local-density approximation and the generalized gradient approximation, Phys. Rev. B, 59, (1999), 5521– 5535, doi: 10.1103/PhysRevB.59.5521. [73] R. González-Hernández, Vanadium adsorption and incorporation at the GaN(0001) surface: A first-principles study, Phys. Rev. B, 81, (2010), 195407, doi: 10.1103/PhysRevB.81.195407. [74] M. Himmerlich, GaN(0001) surface states: Experimental and theoretical fingerprints to identify surface reconstructions, Phys. Rev. B, 88, (2013) 125304, doi: 10.1103/PhysRevB.88.125304. [75] A. Ogawa, H. Fujimoto, Lewis Acidity of Gallium Halides, Inorg. Chem., 41, (2002), 4888– 4894, doi: 10.1021/ic020268m. 76 Appendix The individual atom contributions for pDOS in LM1-LM5 are presented here. Figure 36. pDOS contributions from individual atoms in the LM structures. The isolated GaCl3 pDOS is shown for reference. a) LM1 b) LM2 c) LM3 d) LM4 e) LM5.