Gear Design Relevant Cleanness Metrics

January 20, 2017

A method is presented to characterize premium quality clean steels using statistics of extreme values (SEV) and quantitative stereology. The data can be used to perform gear design relevant engineering analysis of the potential for a gear failure due to bending fatigue in the root or flank or rolling/sliding contact fatigue of the gear tooth face.

The “Metallurgical Specifications for Steel Gearing” information sheet (AGMA 923-B05) [1] details non-metallic inclusion limits to meet AGMA grade expectations for a range of gear metallurgical processing methods (case carburized, through hardened, etc.). These limits allow gear designers and producers to select material suppliers that will meet the minimum expectations for material fatigue performance, but they do not provide the data designers need to meet today's ever-increasing demands for high power density gearing. Modern electric arc furnace (EAF) and vacuum refining (VR) steelmaking technologies have enabled steelmakers to improve steel oxide inclusion cleanness to levels that rival vacuum arc re-melted (VAR) steels at a fraction of the cost. The ability to fully characterize the geometric and chemical characteristics of micro and macro oxide inclusion populations using automated image analysis SEM allows the steelmaker to understand how practices employed in the melting and teeming (pouring and solidification) of the steel affect oxide cleanness. The geometric characteristics of the inclusion population, combined with a gear design engineer’s understanding of the contact and bending stresses and duty cycles, allow for the prediction of inclusion-related fatigue risks as described in this paper.

TimkenSteel offers Ultrapremium™ air-melted and vacuum-refined steels. Ultrapremium air-melted technology is TimkenSteel’s premium clean steelmaking practice. All steels produced with TimkenSteel’s Ultrapremium practice come with a Steel Certificate of Test with design-relevant oxide inclusion content values. The presence of hard oxide inclusions can result in fatigue failures. The oxide inclusion content limits for Ultrapremium steels are on par with typical values for oxide inclusions in vacuum arc re-melted (VAR) steels, but at a much lower cost. To produce and certify steel to Ultrapremium air-melted quality, TimkenSteel employs advanced vacuum refining and teeming practices and measures the steel cleanness with SEM image analysis and statistical evaluation.

TimkenSteel can produce any of its grades to this new steel micro-cleanness standard. At this time, the Ultrapremium practice and certification limits apply to steels produced using bottom-poured ingots. The Ultrapremium practice for the strand cast process path is under development.

Quantification of Inclusion Populations

Current/Common Oxide Cleanness Rating Methods

Historically, inclusions are measured against ASTM E45 [2] or similar micro cleanness specifications using light optical microscopy (LOM) and six samples to represent a heat. (See Table 1.) The AGMA 923 information sheet on steels for gears requires that oxide inclusion ratings meet certain limits in order to meet grade 2 or 3 gear steel quality requirements. In the rating method called out in ASTM E45 Method A, an operator uses a light optical microscope to scan the polished surface of a specimen at 100 times magnification looking for the worst field for each of four inclusion types, with thin and heavy categories for each type. The field size used for rating is 0.5 mm², and the total scanned area per sample is at least 160 mm². The total inspected area for this method is 6 × 160 mm² = 960 mm², and the actual area used for rating for each inclusion type and thickness combination is only 6 × 0.5 mm² = 3 mm². With four inclusion types, and thin and heavy categories, the total sample rating is based on 8 × 3 mm² = 24 mm².

Once the worst 0.5 mm² field is identified for each stringer inclusion type (A, B, C) and thickness (thin, heavy), the total length of stringers in the field is summed, and a numerical rating between 0 and 5 is selected from a table based on accumulated length in the single worst field. This process is repeated for each stringer inclusion type (A, B, C). For D type globular oxides, the worst 0.5 mm² field is defined by the number count of globular oxides in the field and is again rated between 0 and 5 based on a table.

This method of steel cleanness rating has been used for decades in order to assure the level of steel cleanness is achieved per a specified limit. However, this method does not provide inclusion metrics that are relevant to gear design, and it cannot provide the statistically robust data needed to predict gear performance. When gears are in critical applications and/or when power densification or light weighting is driving better designs, a more robust set of inclusion metrics is needed. Steels rated with this method and meeting the AGMA grade 3 requirements can have very different inclusion populations when examined more closely.

SEM-Based Automated Image Analysis

SEM (scanning electron microscopy) is particularly well-suited for performing automated inclusion analysis when compared to light optical microscopy, because of the SEM’s ability to assess Z-contrast, or atomic number contrast using back-scattered electron detection. Z-contrast facilitates the automated identification of oxide particles in a steel matrix as a result of the significant difference between the atomic number of iron, with an atomic number of 26, and oxygen, with an atomic number of 8. Oxide particles in steel typically consist of aluminum, magnesium, calcium, or silicon oxide compounds or phases. In each case, the Z-contrast of these particles against the steel matrix makes them readily detectable. Figure 2 shows an example of a Z-contrast SEM image of inclusions in steel.

The SEM is also capable of inclusion particle chemical analysis through energy-dispersive X-ray spectroscopy, or EDS. When high-energy electrons from the electron beam strike the sample, some of the inner-shell electrons contained in the elements of the sample may be excited to a higher-energy shell, leaving an electron hole in the inner shell. An outer-shell, high-energy electron then fills the hole, and the difference in energy is released as a characteristic X-ray. Because each element has a unique atomic structure, each element has a unique set of peaks on its X-ray emission spectrum. Figure 1 shows an example of EDS analysis of a macro oxide stringer inclusion. The ability to characterize the chemistry of inclusion populations is critical to developing a strong understanding of how steelmaking practices affect the generation of inclusions. This capability then facilitates the systematic study and optimization of steelmaking practices to minimize oxide inclusion population density.

Characterization of the inclusion population for a heat of steel requires tens of hours of SEM run time, but only tens of minutes of sample preparation and operator time. Forty-eight samples are collected from six locations in the heat, covering the front, middle, and back portions of the heat and the top and bottom positions of the ingot. These sampling locations are in accordance with ASTM E45 [2] sampling requirements. The 48 metallurgical samples are prepared on the longitudinal plane and polished. The operator loads a carousel of 12 samples into the SEM and starts the automated analysis process. No further operator interaction is needed until the analysis is completed and the system is ready for the next carousel of samples. The SEM scans each 200 mm² sample and stops on any particle larger than 3 µm in square root area. The total inspected area for this method is 48 × 200 mm² = 9,600 mm², and since all inclusions are accounted for, the area used for rating is also 9,600 mm².
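The area arithmetic for the two rating methods can be laid side by side in a few lines. The sketch below only restates the figures quoted in the text (six samples of 160 mm² with eight 0.5 mm² worst-field categories for ASTM E45 Method A, and 48 samples of 200 mm² for the SEM method); the function names are illustrative and not part of either procedure.

```python
# Inspected area versus area actually used for rating, per the figures in the text.

def e45_method_a_areas(samples=6, scanned_mm2=160.0, field_mm2=0.5, categories=8):
    """Return (inspected, rated per category, rated total) in mm^2."""
    inspected = samples * scanned_mm2
    rated_per_category = samples * field_mm2      # one worst field per sample
    rated_total = categories * rated_per_category  # 4 types x thin/heavy
    return inspected, rated_per_category, rated_total

def sem_areas(samples=48, scanned_mm2=200.0):
    """Return (inspected, rated) in mm^2; every detected inclusion is counted."""
    inspected = samples * scanned_mm2
    return inspected, inspected

e45_inspected, e45_per_category, e45_rated = e45_method_a_areas()
sem_inspected, sem_rated = sem_areas()
rated_area_ratio = sem_rated / e45_rated

print(f"E45 Method A: inspected {e45_inspected} mm^2, rated {e45_rated} mm^2")
print(f"SEM analysis: inspected {sem_inspected} mm^2, rated {sem_rated} mm^2")
print(f"Rated-area ratio (SEM / E45): {rated_area_ratio:.0f}")
```

On these numbers the SEM method bases its rating on 400 times more area than the worst-field method.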

When an inclusion particle is encountered, its chemistry is measured and recorded, and its geometry and position are recorded. The chemical content of the inclusions is of particular use to the steelmaker in refining steelmaking practices, while the inclusion geometry and distribution information is of particular use to the gear designer. If the particle is in proximity to other particle(s) such that they meet the standard criterion for stringers, then the group is categorized and assessed as a stringer inclusion. Individual or isolated globular oxide particles (see Figure 2, lower-left) are recorded as micro inclusions, and their geometry is reported by square root area. Stringers of continuous or intermittent oxide particles (see Figure 2, upper-right) are recorded as macro inclusions, and geometry is recorded as individual stringer lengths and widths. A wide range of other geometric measures can be selected as needed.

Figure 3a shows the raw data from automated SEM image analysis in the form of a histogram of micro globular oxide inclusions per square millimeter. Figure 3b shows the raw data in histogram form for macro stringer oxide inclusion lengths per square millimeter. Each of these histograms compares five different steel producers. In each case, the steel types are nominally equivalent carburizing steel chemistries. The first is TimkenSteel Ultrapremium air-melted steel. The next two are from domestic special bar quality (SBQ) steel mills. Each of the first three steels was produced by electric arc furnace and vacuum refining, and each meets AGMA grade 3 and ASTM A534 [3] bearing steel quality requirements for oxide inclusion cleanness. The remaining two are air-melted and vacuum arc re-melted (Air-VAR) and vacuum induction melted and vacuum arc re-melted (VIM-VAR) steels.

Comparing Figure 3a to Figure 3b, one notes that the population of stringers tends to run about one order of magnitude less than the micro inclusion concentrations. While stringers do tend to be larger and therefore more injurious when present in a critically loaded location, it is much more probable that an injurious micro inclusion will be located in a critical location compared to a stringer. In the section “Estimating Inclusion-Related Fatigue Risks,” an analysis is presented that establishes critical inclusion sizes of approximately 10 µm square root area (and larger) where contact stresses exceed 1500 MPa and approximately 20 µm square root area (and larger) where contact and/or bending stresses exceed 1000 MPa. These histograms are particularly useful in comparing the inclusion population between the five steel sources and in considering the concentration of inclusions greater than 10 or 20 µm and stringers longer than 100 and 200 µm for each steel source. As such, these data alone have great utility in identifying steel sources that can meet the cleanness requirements demanded by highly loaded, power-dense transmission systems. Figure 4 compares the sum of micro oxide inclusions greater than 10 and 20 µm square root area, and Figure 5 compares the sum of stringer oxide inclusions greater than 100 and 200 µm in length. Micro inclusions at the surface of a gear can be considered directly from these data. Doing the same for stringers would require that the gear be machined from bar stock such that all stressed surfaces lie along the original longitudinal plane.

In order to provide a more direct linkage between gear design and steel cleanness effects on gear fatigue performance, some further analytical processes can be performed on these automated SEM image analysis data. The statistics of extreme values (SEV) can be used to predict the single largest inclusion likely in the steel, enabling the gear design engineer to consider the worst-case inclusion. Quantitative stereology can be employed to convert the measured area concentration of inclusions to mean-free path between inclusions and volumetric concentration, enabling the gear design engineer to make direct comparisons of stressed volumes and volumetric inclusion concentrations. These analytical techniques and their resulting outputs are described in the following two sections.

Statistics of Extreme Values

The statistics of extreme values technique [4] applied to inclusion populations has been described in detail by Murakami [5], [6] and is summarized here. With SEV analysis, one can use the population of inclusions measured on a limited but statistically robust set of samples to predict a worst-case, or maximum likely, inclusion size.

A set of n samples, indexed j = 1, 2, …, n (n = 32 in this example), each of the same inspected area (A0 = 200 mm² in this example), is evaluated for inclusion content, and the largest (extreme value) inclusion is recorded for each sample. The data set is then arranged in rank order from the smallest extreme value to the largest. Next, a measure of the accumulated inspected area for each of the rank-ordered samples, described as the reduced variate, Y, is calculated at each j value. The reduced variate is a log-log measure of the accumulated inspected area over the set of extreme value samples. The reduced variate is calculated as follows:

$$Y_j = -\ln\left[-\ln\left(\frac{j}{n+1}\right)\right] \qquad \text{(Equation 1)}$$



A total reference area (Atot) is selected, which provides the limit for extrapolating the data set in order to predict the SEV value. Murakami proposes that a total reference area of 30,000 mm² be used, which, with A0 = 200 mm², equates to a reduced variate value of 5.007. The Ylim value for the extrapolation is calculated based on the return period, T, and the relevant areas as follows:

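$$T = \frac{A_{tot}}{A_0} \qquad \text{(Equation 2)}$$

$$Y_{lim} = -\ln\left[-\ln\left(\frac{T-1}{T}\right)\right] \qquad \text{(Equation 3)}$$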



As illustrated in Figure 6, a linear fit, obtained by the maximum-likelihood method, is made to the rank-ordered extreme value data, and the value at which this fitted line intersects the Ylim value is the SEV maximum likely inclusion value. Table 2 shows the maximum likely globular oxide inclusions for each of the steels reviewed previously. It is important to point out that the SEV value is useful in considering what the largest likely inclusion in the steel is, but it does not provide information about the number density of inclusions that exceed a critical value. As a result, the SEV value is best used in conjunction with other metrics described in the following section.
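The SEV procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it uses the Gumbel reduced variate and return period as published by Murakami, it substitutes an ordinary least-squares line for the maximum-likelihood fit, and the per-sample maxima in `sample_maxima_um` are hypothetical values chosen only to make the example run.

```python
import math

def reduced_variate(j, n):
    """Gumbel reduced variate for rank j (1-based) among n ranked sample maxima."""
    return -math.log(-math.log(j / (n + 1)))

def y_limit(a_tot_mm2, a0_mm2):
    """Reduced variate at the return period T = A_tot / A_0."""
    t = a_tot_mm2 / a0_mm2
    return -math.log(-math.log((t - 1) / t))

def sev_max_inclusion(maxima_um, a0_mm2=200.0, a_tot_mm2=30000.0):
    """Extrapolate the ranked maxima to Y_lim with a least-squares line.

    Least squares is a stand-in here for the maximum-likelihood fit
    mentioned in the text.
    """
    xs = sorted(maxima_um)
    n = len(xs)
    ys = [reduced_variate(j, n) for j in range(1, n + 1)]
    y_mean = sum(ys) / n
    x_mean = sum(xs) / n
    slope = (sum((y - y_mean) * (x - x_mean) for x, y in zip(xs, ys))
             / sum((y - y_mean) ** 2 for y in ys))
    intercept = x_mean - slope * y_mean
    return slope * y_limit(a_tot_mm2, a0_mm2) + intercept

# Hypothetical largest-inclusion sizes (square root area, micrometers), one per sample.
sample_maxima_um = [6.1, 7.4, 8.0, 8.8, 9.5, 10.3, 11.9, 14.2]

print(f"Y_lim for 30,000 mm^2 reference area: {y_limit(30000.0, 200.0):.3f}")
print(f"Predicted maximum inclusion: {sev_max_inclusion(sample_maxima_um):.1f} um")
```

With a 200 mm² sample area and a 30,000 mm² reference area, `y_limit` reproduces the reduced variate of 5.007 quoted in the text.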

Quantitative Stereology

Quantitative stereology [7], [8] can be used to predict the mean-free path between inclusions or the volumetric concentration of inclusions based on the measured unit area data. Mean-free path can quickly be used to consider the likelihood of an inclusion being present in a component and is valid for both globular oxides and stringers. The volumetric inclusion concentration can be considered against the stressed volume of a gear, and the probability of encountering a critically sized inclusion in a gear or a population of gears can be estimated. Similarly, volumetric inclusion populations can be used to populate loading models, in combination with Monte Carlo simulation, to predict relative fatigue life between different populations.

The mean-free path, λ, between inclusions can be calculated using the oxide inclusion count per unit area as follows [7]:

Eq 4 Equation 4



Figure 7 shows the calculated mean-free paths for the globular oxide inclusions greater than 10 µm in square root area and stringers longer than 100 µm in length. Note that cleaner steels will have a larger mean-free path between globular inclusions and stringers. These data can quickly be considered with respect to a gear size or a test coupon size to get a sense of the probability of inclusion-related fatigue failures.

The Saltykov [9] method is a frequently used method to convert area density of spheres to volume density. In this method, the three-dimensional distribution of spheres is approximated by first dividing the two-dimensional frequency per unit area data, NA, into K discrete size classes, where K is an integer between 7 and 15. The width of each size class is then the largest square root area particle size divided by K, or ∆ = (√a)max / K. The data presented in Figure 3 meet these criteria. A series of multipliers is generated based on the number of size classes selected. The table of values used to generate the series of multipliers is included in Saltykov’s original work and is available in handbooks [7] and textbooks [8], [10]. The formula for calculating the volume density, NV, for each size class, (NV)j, is then:

Eq 5 Equation 5



Figure 8 shows a comparison of the number of inclusions greater than 10 µm and greater than 20 µm in square root area per cubic centimeter. If it is determined that inclusions greater than 10 µm represent a risk for failure at stress levels exceeding 1500 MPa, then the gear design engineer can assess the design and determine what volume of the gear is exposed to principal stresses of 1500 MPa and higher. For example, an automotive ring gear might see stresses at and near the surface in the contact region in excess of 1500 MPa. If the volume stressed in excess of 1500 MPa is determined to be 0.01 cm³, then, referring to Figure 8, there will be approximately two such inclusions per gear in this stressed volume for the Ultrapremium steel and one to two for the vacuum re-melted steels, compared to 83 or 22 per gear for the SBQ steels. Similarly, if the same automotive ring gear is determined to have 0.1 cm³ of material stressed in excess of 1000 MPa, then the expected number of inclusions larger than 20 µm in this stressed volume is about two per gear for Ultrapremium, zero for the vacuum re-melted steels, and seven to eight for the SBQ steels.
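The stressed-volume arithmetic in this example amounts to multiplying a volumetric inclusion density by a volume. The sketch below adds one step that is not in the original analysis: it assumes inclusions are scattered independently (a Poisson assumption) to turn the expected count into a probability of finding at least one inclusion. The densities are back-calculated from the per-gear counts quoted in the text for a 0.01 cm³ stressed volume, not read from Figure 8, and the function names are illustrative.

```python
import math

def expected_inclusions(n_v_per_cm3, stressed_volume_cm3):
    """Expected number of critically sized inclusions in the stressed volume."""
    return n_v_per_cm3 * stressed_volume_cm3

def prob_at_least_one(n_v_per_cm3, stressed_volume_cm3):
    """Probability of one or more inclusions, assuming a Poisson distribution."""
    return 1.0 - math.exp(-expected_inclusions(n_v_per_cm3, stressed_volume_cm3))

# Densities (>10 um, per cm^3) implied by the quoted counts per 0.01 cm^3.
implied_density_per_cm3 = {
    "Ultrapremium": 200.0,   # about 2 per gear
    "SBQ (83 per gear)": 8300.0,
    "SBQ (22 per gear)": 2200.0,
}

stressed_volume_cm3 = 0.01
for steel, n_v in implied_density_per_cm3.items():
    mean = expected_inclusions(n_v, stressed_volume_cm3)
    p = prob_at_least_one(n_v, stressed_volume_cm3)
    print(f"{steel}: expected {mean:.1f} per gear, P(at least one) = {p:.3f}")
```

The probability becomes more informative than the expected count as the stressed volume shrinks, since cleaner steels then leave a meaningful fraction of gears with no critical inclusion at all.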

As noted previously, macro stringer inclusions are typically an order of magnitude less frequent than globular oxides and therefore much less likely to be in a critical area compared to one or more critically sized globular oxides. The Saltykov method assumes that the features being addressed are all spherical. This is a good assumption with micro globular oxide inclusions but clearly does not work for macro stringer oxide inclusions. As a result, the current work cannot provide a figure for oxide stringers equivalent to Figure 8. In most instances, if a component’s life is limited by oxide inclusions, the globular oxide population shown in Figure 8 will be the controlling population.

Estimating Inclusion-Related Fatigue Risks

Linear elastic fracture mechanics (LEFM) is a field developed in the solid mechanics and materials science communities [12], [13]. This well-established science applies the physics of solid mechanics stress and strain and the physics of energetic fracture processes to calculate the driving force for propagation of cracks in materials and predict crack behavior. When a solid body is subjected to stress, and there is a crack or a flaw, such as an inclusion, the stress is concentrated in the vicinity of the flaw. The degree of concentration is expressed as the stress intensity value, K, with units of MPa√m. In the laboratory, one can apply a known cyclic stress intensity to a crack and evaluate the threshold stress intensity, Kth, which is the minimum stress intensity required to begin to drive crack growth. One can also assess the crack growth rate when Kth is exceeded, as well as the plane strain fracture toughness, K1C, at which a crack becomes unstable and the component or test coupon fails.

Figure 9 illustrates the progression of a gear tooth bending fatigue failure as a result of a globular oxide micro inclusion. The leftmost photo shows a gear tooth that has fractured off of the gear. Further evaluation at higher magnifications, shown in the progression of pictures moving to the right, reveals a fatigue thumbnail, and in the middle of the thumbnail is a globular oxide micro inclusion that measures approximately 20 µm in square root area. Figure 10 illustrates the test procedure for performing an LEFM crack growth rate test [14]. A sample is machined with a notch and subjected to an initial fatigue cycle to pre-crack the sample. The stress intensity at the crack tip is then varied, and the rate of crack growth is measured. As the stress intensity is reduced, the crack stops, and the threshold stress intensity for the material is established. At intermediate stress intensity levels, the crack growth is log-log linear, and at higher stress intensities, the critical stress intensity is reached, and rapid fracture occurs. The inset pictures from the gear failure illustrate how these LEFM data relate to the gear failure. Early in the running of this gear, the bending stresses, combined with the inclusion, were large enough to result in a stress intensity that exceeded the threshold, Kth, hence allowing a fatigue crack to initiate at the inclusion. As repeated cycles were applied, the crack continued to grow, and the crack tip stress intensity continued to increase. When the stress intensity reached the critical value, K1C, the tooth fractured off completely.

Murakami and co-workers [15] used stress analysis to evaluate a broad range of defect shapes and sizes, under a range of loading conditions. From this work, they determined that the stress intensity was related to the square root area of the defect in the plane normal to the principal stress, and they developed a stress intensity equation as a function of the square root area of an inclusion as follows:

Eq 6 Equation 6



Typical threshold stress intensity values for high-strength hardened steels such as case-carburized steels are on the order of 4 to 8 MPa√m. Figure 11 plots Equation 6 when solved for stress and using a threshold stress intensity of 6 MPa√m, hence providing a fatigue limit stress level as a function of the square root area of an inclusion. Under high contact stress conditions of approximately 1500 MPa, a 10 µm inclusion is predicted to be of a critical size, while for a root bending stress of approximately 1000 MPa, a 20 µm inclusion is predicted to be of a critical size. Conversations with multiple gear designers and failure analysts indicate that these predicted values are consistent with real-world experiences with heavily loaded gears.
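The relationship between inclusion size and fatigue limit stress can be explored numerically. The sketch below assumes the form of Murakami's √area model published for a surface defect, K = 0.65 σ √(π √area); the coefficient actually used for Figure 11 is not shown in this copy of the article, so the 0.65 value and the function names here are assumptions for illustration only.

```python
import math

SURFACE_COEFF = 0.65  # assumed: Murakami's published coefficient for a surface defect

def stress_intensity(stress_mpa, sqrt_area_um, coeff=SURFACE_COEFF):
    """Stress intensity in MPa*sqrt(m) for an inclusion of given sqrt(area) in um."""
    sqrt_area_m = sqrt_area_um * 1e-6
    return coeff * stress_mpa * math.sqrt(math.pi * sqrt_area_m)

def threshold_stress(k_th, sqrt_area_um, coeff=SURFACE_COEFF):
    """Stress in MPa at which the stress intensity reaches the threshold k_th."""
    sqrt_area_m = sqrt_area_um * 1e-6
    return k_th / (coeff * math.sqrt(math.pi * sqrt_area_m))

k_th = 6.0  # MPa*sqrt(m), the mid-range value used in the text
for size_um in (10.0, 20.0):
    limit = threshold_stress(k_th, size_um)
    print(f"sqrt(area) = {size_um:.0f} um -> fatigue limit stress ~ {limit:.0f} MPa")
```

Under these assumptions the limits come out at roughly 1,650 MPa for a 10 µm inclusion and 1,165 MPa for a 20 µm inclusion, the same order as the approximate values quoted above. The result is sensitive to both the coefficient and the threshold chosen within the 4 to 8 MPa√m range.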

Improving Gear Performance with Affordable Clean Steels

The current “Metallurgical Specifications for Steel Gearing” information sheet (AGMA 923-B05) provides requirements for AGMA grade 1, 2, or 3 steel gearing. When gears are in critical applications and/or when power densification or light weighting is driving better designs, a more robust set of inclusion metrics is needed to give designers relevant guidance on the risk of inclusion-initiated fatigue failures. This paper has illustrated how inclusion population data gathered with SEM automated image analysis can provide the statistically robust basis needed to address design concerns. Stereological methods are employed to generate volumetric inclusion population density, and linear elastic fracture mechanics is employed to illustrate the critical flaw size for initiation and growth of a fatigue crack in application.

These techniques are also well-suited for incorporation into gear computational analysis tools in order to make predictions. TimkenSteel has developed a virtual gear life model that allows the simulation of stresses in gears in a baseline state, without inclusions, and then again with inclusions. Inclusions can be placed parametrically by the user to study various scenarios. Inclusions can also be placed via a Monte Carlo algorithm into the gears based on the measured inclusion population described in this paper. Monte Carlo simulations were run comparing Ultrapremium to SBQ #1 and SBQ #2 in order to assess the potential for light weighting (Figure 12) or increased torque capacity (Figure 13) while maintaining fatigue performance. The fatigue performance axis in these figures is based on the average maximum stress observed in each of the Monte Carlo sets. When there is a greater population of inclusions, more of the gears in the Monte Carlo set exhibited inclusions in load zones. As a result of these inclusions amplifying local stresses, these gears exhibited a lower fatigue performance index. The results indicate that a 12 to 30 percent weight reduction can be achieved with Ultrapremium steel or a 10 to 35 percent increase in torque capacity with the same gear using Ultrapremium steel.
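The idea of seeding a population of simulated gears with inclusions can be illustrated with a much simpler stand-in than the virtual gear life model described above. The sketch below does not simulate stresses at all: it only draws, for each simulated gear, a Poisson-distributed number of critically sized inclusions in a stressed volume and reports the fraction of gears with at least one. The densities and the 0.001 cm³ stressed volume are hypothetical inputs, and the Poisson assumption is an addition, not something stated in the paper.

```python
import math
import random

def sample_poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (small means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def fraction_of_gears_with_inclusion(n_v_per_cm3, stressed_volume_cm3,
                                     n_gears=20000, seed=0):
    """Fraction of simulated gears with one or more inclusions in the load zone."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    mean = n_v_per_cm3 * stressed_volume_cm3
    hits = sum(1 for _ in range(n_gears) if sample_poisson(rng, mean) > 0)
    return hits / n_gears

# Hypothetical densities (per cm^3) for a cleaner and a less clean steel.
for label, n_v in (("cleaner steel", 200.0), ("less clean steel", 2200.0)):
    frac = fraction_of_gears_with_inclusion(n_v, stressed_volume_cm3=0.001)
    print(f"{label}: {frac:.1%} of simulated gears have an inclusion in the load zone")
```

A fuller model would then assign each placed inclusion a size and location and evaluate the local stress, which is the part this sketch leaves out.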


  1. American Gear Manufacturers Association, 2005, “Metallurgical Specifications for Steel Gearing” AGMA Information Sheet 923-B05.
  2. ASTM International, 2013, “Standard Test Methods for Determining the Inclusion Content of Steel” ASTM E45-13.
  3. ASTM International, 2014, “Standard Specification for Carburizing Steels for Anti-Friction Bearings” ASTM A534-14.
  4. E. J. Gumbel, 1957, Statistics of Extremes, Columbia University Press, New York, NY.
  5. Murakami, Y., 1994, “Inclusion Rating by Statistics of Extreme Values and Its Application to Fatigue Strength Prediction and Quality Control of Materials,” J. Res. Natl. Inst. Stand. Technol. 99, pg. 345.
  6. Murakami, Y., Toriyama, T., and Coudert, E. M., 1994, “Instructions for a New Method of Inclusion Rating and Correlations with the Fatigue Limit,” Journal of Testing and Evaluation, JTEVA, Vol. 22, No. 4. pp. 318–326.
  7. A.M. Gokhale, 2004, Quantitative Characterization and Representation of Global Microstructural Geometry, Metallography and Microstructures, Vol 9, ASM Handbook, ASM International, pp. 428–447, Materials Park, OH.
  8. E.E. Underwood, 1970, Quantitative Stereology, Addison Wesley, London.
  9. S.A. Saltykov, 1958, Stereometric Metallography, 2nd ed., Metallurgizdat, Moscow.
  10. Russ, J. C., and Dehoff, R. T., 2000, Practical Stereology, 2nd ed., Springer Science + Business Media, New York.
  11. ASTM International STP 504, 1971, Stereology and Quantitative Metallography, ASTM Intl., Philadelphia, PA.
  12. Murakami, Y., 2002, Metal Fatigue: Effects of Small Defects and Nonmetallic Inclusions, 1st ed., Elsevier, London.
  13. Anderson, T. L., 1995, Fracture Mechanics, Fundamentals and Applications, 2nd ed., CRC Press, Boca Raton, FL.
  14. ASTM International, 2011, “Standard Test Method for Measurement of Fatigue Crack Growth Rates” ASTM E647-11.
  15. Murakami, Y., Toriyama, T., 1993, “The √area Parameter Model for Quantitative Evaluation of Effects of Nonmetallic Inclusions on Fatigue Strength,” Proc. Fatigue 93, J. P. Bailon and J. I. Dickson, eds., Vol. I (1993) pp. 303–309.
  16. The Welding Institute, 2016, “Compact Tension and the J Integral Test,” Cambridge, United Kingdom, From
Printed with permission of the copyright holder, the American Gear Manufacturers Association, 1001 N. Fairfax Street, Suite 500, Alexandria, Virginia 22314. Statements presented in this paper are those of the authors and may not represent the position or opinion of the American Gear Manufacturers Association (AGMA). This paper was presented October 2016 at the AGMA Fall Technical Meeting in Pittsburgh, Pennsylvania. 16FTM08.


About The Authors

E. Buddy Damm

steel solutions scientist at TimkenSteel Corporation, is responsible for developing new or improved products for TimkenSteel’s customers and developing new or improved processes for TimkenSteel’s manufacturing operations. He is passionate about understanding and solving customer needs in order to build TimkenSteel solutions. In his 20-year tenure, he has served as a research and development engineer, failure analyst, and engineering manager. He has expertise in Integrated Computational Materials Engineering (ICME), thermodynamics, and kinetics of microstructure evolution, thermo-mechanical processing, fatigue and fracture mechanics, and failure analysis. He has served on the board of the Iron and Steel Society and is active in metallurgy and product-related professional societies such as AGMA, the Forging Industry Association (FIA), and The Minerals, Metals and Materials Society (TMS). Damm holds a bachelor’s degree in metallurgical engineering from Michigan Technological University and a master's degree and doctorate in material science and engineering from Colorado School of Mines. He can be reached at

Dr. Peter C. Glaws

works for TimkenSteel Corp.