In previous weeks, both in class and in lab, we have been learning how to use theory and computation to investigate molecular structure. In particular, calculating the molecular orbital structure of a given molecule gives us information about its energetics, electron density, electrostatic potential, type of stationary point, and types of molecular motions and corresponding vibrational frequencies, as well as a variety of other properties that we have yet to discover.

As we have seen, calculating the molecular orbitals requires key input to be provided before the computation can be carried out. This includes a) geometric information, b) specification of the type of computation to be performed, and c) a starting ansatz that fixes the mathematics, the approximations and the degree of accuracy of the computation. In the last weeks we have looked in detail at (a) and (b), and in week 6 we looked at the details of the Hartree-Fock procedure, which is an element of (c). This week we look in more detail at another aspect of (c): the basis set approximation for our molecular description.

There are two general categories of basis sets that are used to describe a molecule in a program such as GAMESS. The first is the minimal basis sets, which describe only the most basic aspects of the orbitals: one basis function for each atomic orbital angular momentum component. The second is the extended basis sets, which describe the orbitals in greater detail and often include multiple functions for each angular momentum component in your electronic configuration description.

In the most general sense, a basis set is a table of numbers that mathematically estimates where the electrons can be found. The American physicist J. C. Slater developed analytic orbital functions by linear least-squares fitting to calculated data. These became known as **Slater Type Orbitals** (**STOs**). The general expression for such a basis function is given as:

$$\phi^{\mathrm{STO}}(r) = N \exp(-\alpha r)$$

__N__: normalization constant

__alpha__ ($\alpha$): orbital exponent

__r__: radius in angstroms

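So that, for a 1s orbital, the normalization constant is $N = (\alpha^3/\pi)^{1/2}$ and

$$\phi_{1s}^{\mathrm{STO}}(r) = (\alpha^3/\pi)^{1/2} \exp(-\alpha r)$$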

Plot such a function for the hydrogen atom 1s orbital. Slater functions are exact wave functions for hydrogen-like atoms in the non-relativistic case. Therefore, they are a natural set of functions in which to expand the one-particle orbitals.
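Before plotting, it can help to tabulate the function. The following is a minimal sketch, assuming atomic units (r in bohr, exponent 1.0 for hydrogen) rather than angstroms; the function names `sto_1s` and `radial_norm` are made up for this illustration. It evaluates the normalized 1s Slater function at a few distances and checks numerically that it integrates to one.

```python
import math

def sto_1s(r, alpha=1.0):
    """Normalized 1s Slater-type orbital at distance r (atomic units assumed)."""
    norm = math.sqrt(alpha ** 3 / math.pi)
    return norm * math.exp(-alpha * r)

def radial_norm(func, r_max=30.0, steps=30000):
    """Trapezoid estimate of the integral of f(r)^2 * 4*pi*r^2 from 0 to r_max."""
    h = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(r) ** 2 * 4.0 * math.pi * r * r
    return total * h

if __name__ == "__main__":
    # Values that could be fed to any plotting tool.
    for r in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(f"r = {r:4.1f}   STO = {sto_1s(r):.5f}")
    print("norm =", round(radial_norm(sto_1s), 6))
```

The table shows the cusp at r = 0 and the simple exponential decay that characterize a Slater function.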

Although the STO turned out to be a great approximation for the molecular orbital description, calculations involving such functions become computationally problematic. Because of this, another scientist, **S.F. Boys**, developed a method of using a linear combination of Gaussian Type Orbitals to express a single STO. In this way, the Gaussian Type Orbital (GTO), or primitive Gaussian function (GF), was defined as follows:

$$\phi^{\mathrm{GTO}}(r) = N \exp(-\alpha r^2)$$

What is the main difference in the functional form between STO and GTO?

The GTO squares the "r", so that the product of two Gaussian "primitives" (the individual Gaussian functions) is another Gaussian. This makes GTOs computationally much more tractable. However, as you might expect, the use of **GTOs is less accurate than the use of STOs**. To compensate for this loss, we find that the **more Gaussians we combine in a linear combination, the more accurate the result**.

A procedure that has come into wide use is to fit a Slater-type orbital (STO) to a linear combination of N = 1, 2, 3, ... primitive Gaussian functions (GTOs). This is the **STO-NG** procedure.
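To make the fit concrete, here is a small sketch comparing a 1s Slater function with its three-Gaussian (STO-3G) representation. It assumes atomic units and a Slater exponent of 1.0, and it uses the exponents and contraction coefficients commonly tabulated for that case; the helper names are invented for this example.

```python
import math

# Commonly tabulated STO-3G fit to a 1s Slater function with exponent 1.0
# (assumed here; check against your own basis set source).
STO3G_EXPONENTS = (0.109818, 0.405771, 2.22766)
STO3G_COEFFS = (0.444635, 0.535328, 0.154329)

def sto_1s(r, zeta=1.0):
    """Normalized 1s Slater function."""
    return math.sqrt(zeta ** 3 / math.pi) * math.exp(-zeta * r)

def gto_1s(r, alpha):
    """Normalized 1s primitive Gaussian."""
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * r * r)

def sto3g_1s(r):
    """Fixed linear combination of three primitive Gaussians."""
    return sum(d * gto_1s(r, a) for d, a in zip(STO3G_COEFFS, STO3G_EXPONENTS))

def radial_overlap(f, g, r_max=30.0, steps=30000):
    """Trapezoid estimate of the integral of f(r)*g(r)*4*pi*r^2 from 0 to r_max."""
    h = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(r) * g(r) * 4.0 * math.pi * r * r
    return total * h

if __name__ == "__main__":
    for r in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(f"r = {r:4.1f}   STO = {sto_1s(r):.5f}   STO-3G = {sto3g_1s(r):.5f}")
    print("overlap <STO|STO-3G> =", radial_overlap(sto_1s, sto3g_1s))
```

The two functions agree well at intermediate distances; the Gaussian combination misses the cusp at the nucleus and falls off too quickly at large r.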

For electronic wavefunction calculations one would prefer to use the Slater functions. They describe the qualitative features of the molecular orbitals more correctly than Gaussian functions do, and fewer STOs than GTOs would be needed in the basis function expansion of $\psi_i$ for comparable results. It is possible to show, for example, that at large distances molecular orbitals decay as $\psi_i \sim \exp(-a_i r)$, which is of the Slater rather than the Gaussian form. In particular, the exact solution for the 1s orbital of the hydrogen atom is the Slater function $\pi^{-1/2} \exp(-r)$.

The reason why one considers Gaussian functions at all is that, in an SCF calculation, one must compute of the order of $K^4/8$ two-electron integrals $(\mu\nu|\lambda\sigma)$, where K is the number of basis functions. These integrals are of the form

$$(\mu_A \nu_B | \lambda_C \sigma_D) = \int d\mathbf{r}_1\, d\mathbf{r}_2\; \phi_\mu^{A*}(\mathbf{r}_1)\, \phi_\nu^{B}(\mathbf{r}_1)\, r_{12}^{-1}\, \phi_\lambda^{C*}(\mathbf{r}_2)\, \phi_\sigma^{D}(\mathbf{r}_2)$$

where $\phi_\mu^{A}$ is a basis function centered on nucleus A. The general integral involves four different centers, $R_A$, $R_B$, $R_C$ and $R_D$. Evaluation of these four-center integrals is very difficult and time-consuming with Slater basis functions. With Gaussian basis functions, however, they are relatively easy to evaluate. The reason is that the product of two 1s Gaussian basis functions, each on a different center, is, apart from a constant, a 1s Gaussian function on a third center, with an exponent equal to the sum of the two exponents.

As a result, the four-center integral reduces to a two-center integral, which can be readily evaluated.
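The product rule for Gaussians can be checked numerically. The sketch below uses unnormalized 1s Gaussians $\exp(-\alpha|\mathbf{r}-\mathbf{A}|^2)$ and arbitrary illustrative exponents and centers (all numbers and function names are assumptions for this example). It builds the third center $\mathbf{P} = (\alpha\mathbf{A} + \beta\mathbf{B})/(\alpha+\beta)$, the combined exponent $p = \alpha + \beta$ and the prefactor $K = \exp(-\alpha\beta|\mathbf{A}-\mathbf{B}|^2/p)$, then compares both sides at a test point.

```python
import math

def gaussian(r, alpha, center):
    """Unnormalized 1s Gaussian exp(-alpha * |r - center|^2) in 3D."""
    d2 = sum((ri - ci) ** 2 for ri, ci in zip(r, center))
    return math.exp(-alpha * d2)

def gaussian_product(alpha, A, beta, B):
    """Return (K, p, P) such that g(alpha, A) * g(beta, B) = K * g(p, P)."""
    p = alpha + beta
    P = tuple((alpha * a + beta * b) / p for a, b in zip(A, B))
    ab2 = sum((a - b) ** 2 for a, b in zip(A, B))
    K = math.exp(-alpha * beta / p * ab2)
    return K, p, P

if __name__ == "__main__":
    alpha, A = 0.8, (0.0, 0.0, 0.0)    # illustrative values only
    beta, B = 1.5, (1.4, 0.0, 0.0)
    K, p, P = gaussian_product(alpha, A, beta, B)
    r = (0.3, -0.2, 0.5)
    lhs = gaussian(r, alpha, A) * gaussian(r, beta, B)
    rhs = K * gaussian(r, p, P)
    print("third center P =", P, " exponent p =", p)
    print("product:", lhs, " single Gaussian:", rhs)
```

Applying this rule once to each electron's pair of functions is what collapses a four-center integral into a two-center one.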

The compromise between these two approaches is to use as basis functions fixed linear combinations of the primitive Gaussian functions.

These linear combinations, called contractions, lead to *contracted Gaussian functions* (**CGF**).

At this point, you should understand and perform **Exercise I**.

Nowadays the one-particle basis sets are expressed directly in Gaussians, without any reference to the Slater functions.

Look, for example, at the STO-3G basis set. This basis set represents each Slater Type Orbital (STO) as a linear combination of 3 Gaussian primitive functions. In general, STO-NG designates that "N" Gaussian primitives are used to simulate each STO basis function. The bigger the "N", the more accurate the result, but also the greater the computational cost. All basis sets of the form STO-NG are considered "minimal" basis sets.

In the case of hydrogen in molecules, you may be surprised that calculations use a Slater exponent of $\zeta = 1.24$ instead of the value of 1.0. This is because the hydrogen 1s orbital in average molecules is known to be "smaller" or "denser" than in the atom.
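A change of Slater exponent carries over to the Gaussian fit by a simple rescaling: each primitive exponent is multiplied by $\zeta^2$, while the contraction coefficients stay the same. The sketch below assumes the commonly tabulated STO-3G exponents for $\zeta = 1.0$ and the value $\zeta = 1.24$ usually quoted for hydrogen in molecules; `scale_exponents` is a name invented for this example.

```python
# Commonly tabulated STO-3G exponents for a 1s Slater function with zeta = 1.0
# (assumed here for illustration).
STO3G_EXPONENTS_ZETA1 = (0.109818, 0.405771, 2.22766)

def scale_exponents(exponents, zeta):
    """Rescale Gaussian exponents fitted at zeta = 1 to another Slater exponent."""
    return tuple(a * zeta ** 2 for a in exponents)

if __name__ == "__main__":
    scaled = scale_exponents(STO3G_EXPONENTS_ZETA1, 1.24)
    for a0, a1 in zip(STO3G_EXPONENTS_ZETA1, scaled):
        print(f"alpha(zeta=1.00) = {a0:.6f}  ->  alpha(zeta=1.24) = {a1:.6f}")
```

Larger exponents mean a tighter function, which matches the picture of a "denser" hydrogen 1s orbital in a molecule.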

Beyond the minimal basis sets are what are called extended basis sets. Here there are many strategies, typically resulting in multiple sets of functions being used for some or all of the angular momentum components of a molecule. In addition, one might consider higher-order angular momentum functions for particular molecular cases, and also more diffuse functions, depending on the nature of the molecule.

In a double-zeta basis set, each atomic orbital is expressed as the sum of two Slater-type orbitals with different exponents. The SCF procedure will weight the coefficient of either the dense or the diffuse component according to whether the molecular environment requires the effective orbital to be expanded or "contracted". In addition, an extra degree of anisotropy is allowed relative to an STO-3G basis, since, for example, p orbitals in different directions can have effectively different sizes.

The triple- and quadruple-zeta basis sets work the same way, except that they use three and four Slater functions instead of two. Better accuracy of course requires more computational time.

Often it takes too much effort to use a double-zeta description for every orbital. Instead, many scientists simplify matters by using a double-zeta description only for the valence orbitals. Since the inner-shell electrons are not as vital to the calculation, they are described with a single Slater orbital, represented as a contraction of N primitive Gaussians. This method is called a **split-valence basis**.
The valence shell is described with two such sets of functions, each with a value of "N" that differs from that of the core and from the other set. Examples include such basis sets as

__3-21G__: The inner shells are described using a linear combination of 3 Gaussians, while the valence shell is described using two sets of basis functions, one expanded in a set of 2 Gaussians and the other in a single Gaussian.

There are a great many basis sets to choose from, with many combinations of functions in optimized proportions. Many people have devoted a great deal of time to optimizing basis sets for atoms using a variety of different methods and functions. A compilation of such basis sets for many atoms can be found at

http://www.emsl.pnl.gov/forms/basisform.html

If one asks for the 3-21G basis set for carbon, one gets the following output, which can be directly included in GAMESS (see exercises).

The first column of numbers corresponds to the exponents of the primitive Gaussians, and the other columns to the contraction coefficients d for each orbital. Usually the exponents for the s and p orbitals of the same shell are the same, whereas the d coefficients are different.

In this case, for each carbon atom, we have:

- one atomic orbital for the 1s (formed by contraction of 3 gaussians)
- two atomic orbitals for the 2s (contraction of 2 and 1 gaussians)
- two atomic orbitals for each of the 2px, 2py and 2pz (contraction of 2 and 1 gaussians)

The total is 9 atomic orbitals for each atom, and the SCF will optimize the coefficients of these 9 atomic orbitals.
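The count above can be reproduced with a few lines of bookkeeping. This is only a sketch: the list `CARBON_3_21G` and the helper names are invented here to encode the shell structure described in the text (one 3-Gaussian core function, and valence s and p functions split into a 2-Gaussian and a 1-Gaussian part).

```python
# Each entry: (label, angular type, number of primitives in the contraction).
# Hypothetical encoding of the 3-21G shell structure for carbon.
CARBON_3_21G = [
    ("1s", "s", 3),
    ("2s inner", "s", 2),
    ("2s outer", "s", 1),
    ("2p inner", "p", 2),
    ("2p outer", "p", 1),
]

# Number of spatial components per angular type (p covers px, py, pz).
COMPONENTS = {"s": 1, "p": 3}

def count_basis_functions(shells):
    """Number of contracted basis functions (atomic orbitals in the text)."""
    return sum(COMPONENTS[kind] for _, kind, _ in shells)

def count_primitives(shells):
    """Number of primitive Gaussians behind those contracted functions."""
    return sum(COMPONENTS[kind] * n for _, kind, n in shells)

if __name__ == "__main__":
    print("contracted functions:", count_basis_functions(CARBON_3_21G))
    print("primitive Gaussians: ", count_primitives(CARBON_3_21G))
```

The SCF coefficients are optimized over the contracted functions only; the primitives inside each contraction keep their fixed coefficients.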

The next step in improving a basis set could be to go to triple zeta, quadruple zeta, etc. If one goes in this direction rather than adding functions of higher angular quantum number, the basis set would not be well balanced. In the limit of a large number of only s and p functions, one finds, for example, that the equilibrium geometry of ammonia actually becomes planar!

The next step beyond double zeta usually involves adding *polarization functions*, i.e., adding *d-type* functions to the first row atoms Li-F and *p-type* functions to H. To see why these are called polarization functions, consider the hydrogen atom.

The exact wavefunction for an isolated hydrogen atom is just the *1s orbital*. If the hydrogen atom is placed in a uniform electric field, however, the electron cloud is attracted along the direction of the electric field, and the charge distribution about the nucleus becomes asymmetric. It is polarized. The lowest order solution to this problem is a mixture of the original 1s orbital and a *p-type* function, i.e., the solution can be considered to be a hybridized orbital.

A hydrogen atom in a molecule experiences a similar, but nonuniform, electric field arising from its nonspherical environment. By adding polarization functions to a basis set for H we directly accommodate this effect. In a similar way, d-type functions, which are not occupied in first row atoms, play the role of polarization functions for the atoms Li to F. One denotes this improvement with a star (*) or a (d) when only the heavy atoms are corrected with *d-type* functions, or with two stars (**) (or with a (d,p)) when the hydrogen (or helium) atom is corrected as well with *p-type* functions.

In chemistry, one is mainly concerned with the valence electrons, which interact with other molecules. Sometimes the behavior of the electrons when they are far from the nucleus is important, and this cannot be described by basis functions with large Gaussian exponents. To compensate for this deficiency, computational scientists use **diffuse** functions, which have small exponents. These basis sets are denoted by "+" signs. One "+" adds diffuse s- and p-type functions to the heavy atoms, while "++" also adds a diffuse s-type function to hydrogen.
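The naming scheme for split-valence basis sets can be decoded mechanically. The following is a rough sketch, not a complete parser: it handles only names of the form `3-21G`, `6-31G*`, `6-31+G(d)` or `6-311++G(d,p)`, and it assumes the usual convention that "+" places diffuse functions on heavy atoms while "++" also places them on hydrogen. The function name `describe_basis` and the dictionary keys are invented for this example.

```python
import re

# core digit, valence digits, optional + or ++, G, optional polarization suffix
_NAME = re.compile(r"^(\d)-(\d+)(\+{0,2})G(\*{1,2}|\(d(?:,p)?\))?$")

def describe_basis(name):
    """Decode a split-valence basis set name into its ingredients."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized basis set name: {name!r}")
    core, valence, plus, pol = match.groups()
    pol = pol or ""
    return {
        "core_primitives": int(core),                  # Gaussians per core function
        "valence_split": [int(c) for c in valence],    # Gaussians per valence function
        "diffuse_on_heavy": len(plus) >= 1,
        "diffuse_on_hydrogen": len(plus) == 2,
        "d_on_heavy": pol in ("*", "**", "(d)", "(d,p)"),
        "p_on_hydrogen": pol in ("**", "(d,p)"),
    }

if __name__ == "__main__":
    for name in ("3-21G", "6-31G*", "6-31+G(d,p)", "6-311++G**"):
        print(name, describe_basis(name))
```

Reading a name this way makes it easier to compare basis sets before choosing one for a calculation.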