Physics in Medicine & Biology

PAPER • OPEN ACCESS

A quantitative assessment of Geant4 for predicting
the yield and distribution of positron-emitting
fragments in ion beam therapy
To cite this article: Andrew Chacon et al 2024 Phys. Med. Biol. 69 125015

 


Phys. Med. Biol. 69 (2024) 125015 https://doi.org/10.1088/1361-6560/ad4f48


RECEIVED 5 February 2024
REVISED 10 May 2024
ACCEPTED FOR PUBLICATION 22 May 2024
PUBLISHED 11 June 2024

Original Content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence.
Any further distribution of this work must maintain attribution to the author(s) and the title of the work,
journal citation and DOI.

PAPER

A quantitative assessment of Geant4 for predicting the yield and
distribution of positron-emitting fragments in ion beam therapy
Andrew Chacon¹, Harley Rutherford¹,², Akram Hamato³, Munetaka Nitta³, Fumihiko Nishikido³,
Yuma Iwao³, Hideaki Tashima³, Eiji Yoshida³, Go Akamatsu³, Sodai Takyu³, Han Gyu Kang³,
Daniel R Franklin⁴, Katia Parodi⁵, Taiga Yamaya³, Anatoly Rosenfeld²,⁶, Susanna Guatelli²,⁶
and Mitra Safavi-Naeini¹,²,⁷,∗
1 Australian Nuclear Science and Technology Organisation (ANSTO), Lucas Heights, NSW, Australia
2 Centre for Medical Radiation Physics, University of Wollongong, Wollongong, NSW 2522, Australia
3 National Institutes for Quantum Science and Technology, Chiba, Japan
4 School of Electrical and Data Engineering, University of Technology Sydney, Ultimo, Australia
5 Department of Medical Physics, Faculty of Physics, Garching b, Ludwig-Maximilians-Universität München, Munich, Germany
6 Illawarra Health and Medical Research Institute, University of Wollongong, Wollongong, NSW 2522, Australia
7 Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia
∗ Author to whom any correspondence should be addressed.

E-mail: mitras@ansto.gov.au

Keywords: hadronic models, fragmentation models, ion beam therapy, carbon ion beam therapy,
positron emission tomography (PET), Geant4 Monte Carlo simulation toolbox, quality assurance

Supplementary material for this article is available online

Abstract
Objective. To compare the accuracy with which different hadronic inelastic physics models across
ten Geant4 Monte Carlo simulation toolkit versions can predict positron-emitting fragments
produced along the beam path during carbon and oxygen ion therapy. Approach. Phantoms of
polyethylene, gelatin, or poly(methyl methacrylate) were irradiated with monoenergetic carbon
and oxygen ion beams. Post-irradiation, 4D PET images were acquired and the contributions of the parent
11C, 10C and 15O radionuclides in each voxel were determined from the extracted time-activity curves.
Next, the experimental configurations were simulated in Geant4 Monte Carlo versions 10.0 to 11.1,
with three different fragmentation models: binary ion cascade (BIC), quantum molecular
dynamics (QMD) and the Liège intranuclear cascade (INCL++), giving a total of 30 model-version
combinations. The total positron annihilation and parent isotope production yields predicted by each
simulation were compared with the experimental results using the normalised mean squared
error and the Pearson cross-correlation coefficient. Finally, we compared the depth of the maximum
positron annihilation yield and the distal point at which the positron yield decreases to 50% of
peak between each model and the experimental results. Main results. Performance varied
considerably across versions and models, with no one version/model combination providing the
best prediction of all positron-emitting fragments in all evaluated target materials and irradiation
conditions. BIC in Geant4 10.2 provided the best overall agreement with experimental results in
the largest number of test cases. QMD consistently provided the best estimates of both the depth of
peak positron yield (10.4 and 10.6) and the distal 50%-of-peak point (10.2), while BIC also
performed well and INCL generally performed the worst across most Geant4 versions. Significance.
The best predictions of the spatial distribution of positron annihilations and positron-emitting
fragment production along the beam path during carbon and oxygen ion therapy were obtained
using Geant4 10.2.p03 with BIC or QMD. These version/model combinations are recommended
for future heavy ion therapy research.

© 2024 The Author(s). Published on behalf of Institute of Physics and Engineering in Medicine by IOP Publishing Ltd


1. Introduction

One of the chief advantages of particle therapy as a treatment for cancer is the high dose gradient between
the treatment area and surrounding regions (Durante et al 2017). This precision necessitates the use of
sophisticated treatment planning and quality assurance methods to ensure proper delivery of the prescribed
dose to the target only. These methods, in turn, are heavily reliant on Monte Carlo simulation methods,
which are used for modelling the interaction of high-energy charged particles with the patient.

Good models for nuclear fragmentation processes are especially critical for faithfully simulating imaging
applications in particle therapy, such as positron emission tomography (PET)-based dose estimation
methods for quality assurance, since the production and distribution of positron-emitting radionuclide
fragments directly affects the quality of the resulting image (Parodi and Polf 2018, Hofmann et al 2019a,
2019b, Rutherford et al 2020). One of the leading fully open source Monte Carlo toolkits for modelling the
interaction of radiation and matter, Geant4, currently offers a choice of three hadronic inelastic
fragmentation models that are appropriate for particle therapy: binary ion cascade (BIC), quantum
molecular dynamics (QMD), and Liège intranuclear cascade (INCL++) (Agostinelli et al 2003, Mancusi
et al 2014, G Collaboration 2018)⁸. In a previous study, we evaluated these models by comparing the spatial
distributions of positron-emitting radionuclides predicted following irradiation of PMMA, gelatin and
polyethylene targets by monoenergetic carbon and oxygen ion beams (simulated using Geant4 10.2.p03) to
equivalent results estimated from experimentally-obtained PET data (Chacon et al 2019). The BIC model
was found to provide the best estimates overall; however, none of the models provided a perfect fit in all
evaluated cases, and some significant discrepancies were observed.

Since the publication of our previous study, there have been several updates to Geant4; specifically, six
minor releases (versions 10.x) and one major release (version 11, which has since been updated to version
11.1). Each of these releases includes modifications to the physics models implemented in Geant4, which can
affect the simulation of positron-emitting fragment production in particle therapy.

In this work, we have extended our earlier study, and present a quantitative evaluation of Geant4’s ability
to predict positron-emitting fragment production across a total of ten different stable versions (10.0.p04,
10.1.p03, 10.2.p03, 10.3.p03, 10.4.p03, 10.5.p01, 10.6.p03, 10.7.p02, 11.0 and 11.1) which have followed the
previous major release (10.0) for each of the three different fragmentation models (Chacon et al 2019). In
addition to the normalised mean squared error (NMSE) metric used in the previous study, three additional
metrics—the Pearson cross-correlation coefficient (CCC), the depth of the positron annihilation peak, and
the depth at which the positron annihilation intensity has decreased to 50% of the peak—are also used to
compare the shape of the predicted positron-emitting fragment distributions with the experimentally
measured distributions.

2. Materials and methods

This section describes the methods used for obtaining and quantitatively comparing the experimental and
simulated positron annihilation profiles. The general approach is similar to that used in our previous study
(see Chacon et al 2019); however, it has been extended to include a much wider range of Geant4 versions,
and additional comparison metrics are introduced.

The experimental methods used to estimate the total positron annihilation profile and activity of the
dominant positron-emitting fragment isotopes (11C, 10C and 15O) are briefly summarised in section 2.1.
Equivalent simulation configurations were constructed for each Geant4 version under test, and the total
positron annihilation profile and activity of 11C, 10C and 15O were predicted for each beam ion/energy, target
material, hadronic inelastic fragmentation model and Geant4 version; the design and parameters of these
simulations are described in detail in section 2.2.

Results in each of the three target materials and five ion/energy combinations were then compared to those
predicted in equivalent simulations performed in Geant4 using each of the hadronic fragmentation models
(BIC, QMD and INCL++) across the ten evaluated Geant4 versions, for a total of 150 unique
target/ion/energy/version/model test conditions. The total positron yields and yields of the individual
positron-emitting fragment species from each model and Geant4 version were then compared with the
experimental annihilation profiles using the following metrics in each of the entrance, build-up and Bragg
peak, and tail regions:

• NMSE; and
• Pearson CCC.

Additionally, the depth of the positron annihilation peak and the depth of the distal point at which the
magnitude of the positron annihilation profile decreases to 50% of the peak value are evaluated. All metrics
are described in detail in section 2.3.

8 INCL++ is considered the most appropriate option for neutron spallation simulations, but is included here
for completeness (Boudard et al 2002).

Table 1. Beam parameters for each ion species and energy. The energy spread is 0.2% of nominal energy in each case; 95% confidence
intervals are given for beam flux.

Ion   Energy (MeV/u)   σx (mm)   σy (mm)   Beam flux (pps)
12C   148.5            2.77      2.67      1.8 × 10⁹ ± 3.8 × 10⁷
12C   290.5            3.08      4.70      1.8 × 10⁹ ± 6.4 × 10⁷
12C   350              2.50      2.98      1.8 × 10⁹ ± 4.6 × 10⁷
16O   148              2.79      2.89      1.1 × 10⁹ ± 2.8 × 10⁷
16O   290              2.60      4.90      1.1 × 10⁹ ± 7.0 × 10⁷

2.1. Experimental configuration
The experimental data obtained in our 2019 paper were used as the ground truth for this simulation study; a
detailed description of the experimental procedures is presented in that paper (Chacon et al 2019). In
summary, phantoms constructed from either pure PMMA, polyethylene or gelatin (encased in a thin-walled
PMMA container), each with dimensions of 100 mm × 100 mm × 300 mm, were irradiated with
monoenergetic carbon or oxygen ion beams of various energies—three for carbon ions and two for oxygen
(see table 1). Positron annihilation profiles (with respect to depth in the target) were estimated across the full
width at tenth maximum (FWTM) of the beam using the whole-body DOI-PET scanner prototype
developed at QST (Akamatsu et al 2019). These profiles were decomposed into the individual population of
each of the dominant parent positron-emitting fragments (11C, 10C and 15O) at t = 0 (end of irradiation
period) by fitting the observed time-decay curves in each voxel to a multiexponential decay model.
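
The decomposition step can be illustrated with a minimal sketch. It is not the fitting procedure of Chacon et al (2019); it assumes an idealised, noise-free time-activity curve sampled at discrete instants and fixes the decay constants from approximate standard half-lives (values supplied here, not taken from this paper). With the decay constants fixed, the model A(t) = Σ A_k(0) exp(−λ_k t) is linear in the initial activities, so an ordinary least-squares solve is sufficient. The function names are hypothetical.

```python
import math

# Approximate half-lives in seconds (standard nuclear data; an assumption of this sketch).
HALF_LIFE_S = {"11C": 1221.8, "10C": 19.29, "15O": 122.24}


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_initial_activities(times_s, activity, isotopes=("11C", "10C", "15O")):
    """Least-squares estimate of each isotope's activity at t = 0 for one voxel."""
    lambdas = [math.log(2) / HALF_LIFE_S[iso] for iso in isotopes]
    basis = [[math.exp(-lam * t) for lam in lambdas] for t in times_s]
    n = len(isotopes)
    # Normal equations (B^T B) x = B^T y for the linear model y = B x.
    btb = [[sum(row[i] * row[j] for row in basis) for j in range(n)] for i in range(n)]
    bty = [sum(row[i] * y for row, y in zip(basis, activity)) for i in range(n)]
    return dict(zip(isotopes, solve_linear(btb, bty)))
```

The population of each parent isotope at t = 0 then follows from N_k(0) = A_k(0)/λ_k. Real PET frames integrate counts over finite time windows and are noisy, so a practical fit would need frame integration and possibly non-negativity constraints.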

2.2. Simulation parameters
The same beam parameters, phantom compositions and geometries used in the experimental measurements
were modelled in each version of Geant4. Apart from minor modifications to the simulation source code
required due to version-to-version changes in certain Geant4 application programming interfaces (APIs), the
code was identical across versions. Simulations were performed using each of the 10 most recent stable
releases of Geant4: 10.0.p04, 10.1.p03, 10.2.p03, 10.3.p03, 10.4.p03, 10.5.p01, 10.6.p03, 10.7.p02, 11.0 and
11.1. For brevity, the patch number will be dropped when referring to the version of Geant4.

For each version of Geant4, three alternative hadronic ion fragmentation models were evaluated: the BIC,
QMD and Liège Intranuclear Cascade (INCL) models⁹ (Mancusi et al 2014, G Collaboration 2018). All
simulations modelled electromagnetic interactions using the standard option 3 list
(G4EmStandardPhysics_option3). The remaining physics processes, including hadronic physics models,
are listed in table 10.

The location of each positron annihilation, as well as the identity of the parent isotope which decayed to
emit each positron (principally 11C, 10C and 15O), were scored with a resolution of 1.5 mm³ to match the
voxel dimensions of the experimental OpenPET image reconstruction output. The pristine positron
annihilation profiles were convolved with a 2.3 mm FWHM Gaussian filter to simulate the measured point
spread function of the PET system (Akamatsu et al 2019).
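
As an illustration of this smoothing step, the sketch below applies a discrete Gaussian kernel to a one-dimensional depth profile. It is a simplification under stated assumptions: the profile is treated as 1D (the simulation scores in 3D voxels), the kernel is truncated at four standard deviations, and samples beyond the profile edges are treated as zero. The function names are hypothetical; the only values taken from the text are the 2.3 mm FWHM and the 1.5 mm voxel width.

```python
import math


def gaussian_kernel(fwhm_mm, voxel_mm, half_width_sigmas=4.0):
    """Discrete, normalised Gaussian kernel sampled at the voxel spacing."""
    # Standard conversion between FWHM and standard deviation of a Gaussian.
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = int(math.ceil(half_width_sigmas * sigma / voxel_mm))
    weights = [math.exp(-0.5 * (k * voxel_mm / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_profile(profile, fwhm_mm=2.3, voxel_mm=1.5):
    """Convolve a 1D depth profile with the Gaussian kernel (zero outside the profile)."""
    kernel = gaussian_kernel(fwhm_mm, voxel_mm)
    half = len(kernel) // 2
    smoothed = []
    for i in range(len(profile)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(profile):
                acc += w * profile[j]
        smoothed.append(acc)
    return smoothed
```

Because the kernel is normalised, the total yield is preserved away from the profile edges; only the sharpness of features such as the annihilation peak is reduced.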

A total of 20 runs, each with 10⁸ primary particles, were simulated for each version/model combination.
In our previous work, we established that this is sufficient to limit the run-to-run ratio of standard deviation
to mean across the build-up and Bragg peak region of the profiles to less than 5% (Chacon et al 2019). Each
of the simulated profiles was randomly paired with one of the experimental profiles (for the same target, ion
species and beam energy) and the performance metrics were then calculated; the statistical distribution of
each metric was used to generate the confidence intervals shown in the results presented in the supplementary
materials.
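
The run-to-run stability criterion can be checked with a short sketch such as the one below. It assumes that the ratio is evaluated voxel by voxel and that the worst voxel in the region is reported, and it uses the sample standard deviation; the text does not specify either choice, so both are assumptions, and the function name is hypothetical.

```python
import statistics


def max_relative_spread(runs):
    """Largest per-voxel ratio of standard deviation to mean across runs.

    runs: list of profiles (one per run), each a list of per-voxel yields
    restricted to the region of interest.
    """
    worst = 0.0
    for voxel_values in zip(*runs):
        mean = statistics.mean(voxel_values)
        if mean > 0:
            # Sample standard deviation across runs for this voxel.
            worst = max(worst, statistics.stdev(voxel_values) / mean)
    return worst
```

A returned value below 0.05 would correspond to the 5% limit quoted above.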

9 The INCL model was developed specifically for spallation reactions but is included in this study as it can also model fragmentation.

2.3. Evaluation methods and metrics
The irradiated target was divided into three separate regions for analysis since different physics processes
dominate in each: the entrance region, the build-up and Bragg peak region, and the tail region. This
segmentation is defined in the same way as in our previous paper (Chacon et al 2019); in summary, the
central build-up and Bragg peak region is defined as follows:

• The proximal edge in the z dimension (along the path of the beam) is defined as the first point at which the
dose deposited along the central axis exceeds the entrance plateau dose by more than 5% of the difference
between peak dose and the entrance plateau dose; and

• The distal edge in z is defined as the last point at which the deposited dose is greater than 5% of the absolute
peak dose value.

The entrance region is then defined as the region proximal to the build-up and Bragg peak region, while the
tail region is defined as the region distal to the build-up and Bragg peak region.
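
The region boundaries defined above can be located with a sketch like the following. The entrance plateau dose is not defined numerically in this section, so the sketch assumes it is the mean of the first few voxels of the central-axis depth-dose profile (the parameter n_plateau is hypothetical, as is the function name).

```python
def bragg_region_bounds(dose, n_plateau=5):
    """Return (proximal_index, distal_index) of the build-up and Bragg peak region.

    dose: central-axis depth-dose profile, one value per voxel along z.
    n_plateau: number of leading voxels averaged to estimate the entrance
               plateau dose (an assumption of this sketch).
    """
    plateau = sum(dose[:n_plateau]) / n_plateau
    peak = max(dose)
    # Proximal edge: first voxel exceeding the plateau by more than 5% of (peak - plateau).
    proximal_threshold = plateau + 0.05 * (peak - plateau)
    proximal = next(i for i, d in enumerate(dose) if d > proximal_threshold)
    # Distal edge: last voxel with dose greater than 5% of the absolute peak dose.
    distal_threshold = 0.05 * peak
    distal = max(i for i, d in enumerate(dose) if d > distal_threshold)
    return proximal, distal
```

Voxels with index below the proximal edge then form the entrance region, and voxels beyond the distal edge form the tail region.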

The yields of the positron-emitting nuclei are defined by (1):

$$
\mathrm{Yield}(\mathrm{Isotope}) = \frac{N(\mathrm{Isotope})}{N(\mathrm{Primary})} \tag{1}
$$

where $N(\mathrm{Isotope})$ is the number of nuclei of the isotope under study produced in that region and
$N(\mathrm{Primary})$ is the total number of primary particles. Yields were calculated for each voxel along the
beam path.

Three different metrics were chosen to quantify the accuracy of each model in Geant4: the NMSE, the
Pearson CCC, and the range (depth along the path of the beam) of both the positron annihilation peak and
the point beyond the peak at which positron annihilation intensity decreases by 50%.

NMSE measures the average squared difference between the experimental measurements and
simulation-predicted positron yields in each region. NMSE is most useful in regions of relatively high yield
(especially in the entrance and build-up and Bragg peak regions); the relatively low statistics available in the
tail region limit the value of the NMSE there.

NMSE is defined as:

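$$
\mathrm{NMSE} = \frac{\sum_{i=1}^{N_{\mathrm{reg}}} \left|S_i - E_i\right|^2}{\sum_{i=1}^{N_{\mathrm{reg}}} \left|E_i\right|^2} \tag{2}
$$

where $S_i$ and $E_i$ are the simulation and experimental yields in the $i$th voxel of the $N_{\mathrm{reg}}$ voxels in
region reg (with a lower value indicating a better match).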
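
Equation (2) translates directly into a few lines of code. The sketch below assumes the simulated and experimental profiles have already been restricted to the same region and share the same voxel grid; the function name is illustrative.

```python
def nmse(sim, exp):
    """Normalised mean squared error between simulated and experimental yields."""
    if len(sim) != len(exp):
        raise ValueError("profiles must cover the same voxels")
    numerator = sum((s - e) ** 2 for s, e in zip(sim, exp))
    denominator = sum(e ** 2 for e in exp)
    return numerator / denominator
```

For example, with simulated yields (1, 2, 3) and experimental yields (1, 2, 4), the numerator is 1 and the denominator is 21, giving an NMSE of 1/21.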

For the NMSE metric, we identify the best-performing model (with the lowest mean NMSE) and
consider any other model whose mean NMSE is within two standard deviations of the best-performing
version/model as being statistically equal. For a Gaussian random distribution, this would correspond to a
95% confidence interval (although, as can be seen in the box plots of the NMSE results included in the
supplementary materials, the NMSE distributions often deviate from the Gaussian model).
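
The selection rule can be sketched as follows. The text does not state whose standard deviation defines the two-sigma window; this sketch assumes it is the sample standard deviation of the best-performing combination, and the labels and function name are purely illustrative.

```python
import statistics


def best_and_equal(nmse_samples):
    """Return the best label and all labels counted as statistically equal to it.

    nmse_samples: dict mapping a version/model label to its list of NMSE values.
    """
    means = {label: statistics.mean(values) for label, values in nmse_samples.items()}
    best = min(means, key=means.get)
    # Assumption: the window is two sample standard deviations of the best combination.
    tolerance = 2.0 * statistics.stdev(nmse_samples[best])
    winners = sorted(label for label, m in means.items() if m - means[best] <= tolerance)
    return best, winners
```

Each label returned in the second element would be credited with a best or equal-best result for that test case.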

The Pearson CCC compares the degree of linear dependence of one profile to another—that is, the
degree to which changes in the profiles occur at the same location and in the same direction. Thus, the
Pearson CCC quantifies the differences in shape between the simulation-predicted positron-emitting
fragment distributions and the experimental measurements, without regard to differences in the magnitude
of the profiles. The Pearson CCC is defined as:

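$$
\mathrm{CCC} = \frac{\sum_{i=1}^{N_{\mathrm{reg}}} \left(S_{\mathrm{norm},i} - \bar{S}_{\mathrm{norm}}\right)\left(E_{\mathrm{norm},i} - \bar{E}_{\mathrm{norm}}\right)}{\sqrt{\left(\sum_{i=1}^{N_{\mathrm{reg}}} \left(S_{\mathrm{norm},i} - \bar{S}_{\mathrm{norm}}\right)^2\right)\left(\sum_{i=1}^{N_{\mathrm{reg}}} \left(E_{\mathrm{norm},i} - \bar{E}_{\mathrm{norm}}\right)^2\right)}} \tag{3}
$$

where $S_{\mathrm{norm},i}$ and $E_{\mathrm{norm},i}$ are the normalised simulation and experimental yields in the $i$th voxel
of the $N_{\mathrm{reg}}$ voxels in region reg. Normalisation is performed by dividing each $S_i$ and $E_i$ by the maximum
value in its respective region. $\bar{S}_{\mathrm{norm}}$ and $\bar{E}_{\mathrm{norm}}$ are the mean values of the normalised yields in
each region.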
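
A minimal sketch of equation (3) is given below, assuming both profiles cover the same voxels of a single region and that neither is constant (a constant profile would make the denominator zero). The function name is illustrative.

```python
import math


def pearson_ccc(sim, exp):
    """Pearson cross-correlation coefficient of two max-normalised profiles."""
    if len(sim) != len(exp):
        raise ValueError("profiles must cover the same voxels")
    # Normalise each profile by its maximum value in the region.
    s_norm = [s / max(sim) for s in sim]
    e_norm = [e / max(exp) for e in exp]
    s_mean = sum(s_norm) / len(s_norm)
    e_mean = sum(e_norm) / len(e_norm)
    covariance = sum((s - s_mean) * (e - e_mean) for s, e in zip(s_norm, e_norm))
    s_var = sum((s - s_mean) ** 2 for s in s_norm)
    e_var = sum((e - e_mean) ** 2 for e in e_norm)
    return covariance / math.sqrt(s_var * e_var)
```

Because the coefficient is invariant to positive rescaling of either profile, two profiles with the same shape but different magnitudes give a value of +1, which is why the CCC complements the magnitude-sensitive NMSE.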

When comparing the models, the closer the CCC between the simulation output and the
experimental estimate of the positron-emitting fragment distribution is to +1, the more accurate the prediction.
A Pearson CCC greater than +0.8 is generally considered to be ‘very strong’ (Swinscow 2021). In this work,
we aim to identify the very best version/model combinations; therefore, a Pearson CCC threshold of 0.95 is
chosen to identify those combinations which have produced exceptionally good predictions of the shape of
the yield profiles. It is important to note that this threshold is quite arbitrary, and the most appropriate
threshold depends on the application; readers are referred to the supplementary data for the complete set of
results.

For each version of Geant4, phantom, beam type and energy, the NMSE and CCC were calculated for
both total annihilation photon yield profiles and also for the profiles of the three main positron-emitting
fragment species (10C, 11C and 15O). The calculation was repeated for each of the three regions (entrance,
build-up and Bragg peak, and tail regions). The NMSE and the CCC were then compared across all evaluated
Geant4 versions for each region, phantom material and beam type.

A total of five energy/ion combinations are evaluated (carbon ions at three energies and oxygen ions at two
energies). For oxygen ions, three target materials (gelatin, PMMA and polyethylene) are evaluated for total
positron annihilation yield and 11C/10C/15O yield. For carbon ions, the same three target materials are
evaluated for total positron annihilation yield and 11C/10C yield, and two for 15O yield (polyethylene is
omitted since it is not possible to produce 15O fragments with a 12C ion beam and a polyethylene target, which
contains only carbon and hydrogen). Thus, a total of 15 cases are evaluated for total positron annihilation, 11C
yield and 10C yield, while 12 cases are evaluated for 15O yield.
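
The case counts can be reproduced by enumerating the beam and target combinations. The sketch below uses the beam energies from table 1; the identifiers are illustrative only.

```python
from itertools import product

# Ion species and energies (MeV/u) from table 1.
BEAMS = [("12C", 148.5), ("12C", 290.5), ("12C", 350.0), ("16O", 148.0), ("16O", 290.0)]
TARGETS = ["gelatin", "PMMA", "polyethylene"]


def cases_for(quantity):
    """List the (ion, energy, target) test cases evaluated for a given quantity."""
    cases = []
    for (ion, energy), target in product(BEAMS, TARGETS):
        # 15O cannot be produced by a 12C beam in polyethylene (carbon and hydrogen only).
        if quantity == "15O" and ion == "12C" and target == "polyethylene":
            continue
        cases.append((ion, energy, target))
    return cases
```

Calling cases_for with "total", "11C" or "10C" returns 15 cases, while "15O" returns 12.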

For range calculations, the difference between the depths at which the positron annihilation yield reached
its maximum value in the experiment and simulation was calculated (see (4)). Additionally, the point distal
to this maximum at which positron annihilation yield decreases to 50% of the maximum value was also
compared between experiment and simulation. For each version and model, the mean differences between
the experimental and simulation-based values, as well as the standard deviations and maximum differences
were calculated across all test cases (ion species, energies and target materials),

$$
\delta_{\mathrm{voxel}} = R_{\mathrm{simulation}} - R_{\mathrm{experiment}} \tag{4}
$$

where $R_x$ is the range (depth) of the voxel with the maximum value (or, for distal 50% of peak, the first distal
voxel to fall below 50% of the maximum value) in either the simulation or experiment.
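
The two range metrics can be sketched as below, assuming one-dimensional depth profiles on a common 1.5 mm voxel grid with the same origin. Where several voxels share the maximum value, this sketch takes the most proximal one, which is an assumption; the function names are hypothetical.

```python
def peak_index(profile):
    """Index of the voxel with the maximum positron annihilation yield."""
    return max(range(len(profile)), key=profile.__getitem__)


def distal_half_index(profile):
    """Index of the first voxel distal to the peak that falls below 50% of the peak."""
    peak = peak_index(profile)
    half = 0.5 * profile[peak]
    for i in range(peak + 1, len(profile)):
        if profile[i] < half:
            return i
    raise ValueError("profile never falls below 50% of the peak")


def depth_difference_mm(sim, exp, locate=peak_index, voxel_mm=1.5):
    """Equation (4): simulated minus experimental depth, in mm."""
    return (locate(sim) - locate(exp)) * voxel_mm
```

Because depths are quantised to the voxel grid, every difference obtained this way is a multiple of 1.5 mm; a positive value indicates that the simulated feature lies deeper than the measured one.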

3. Results and discussion

The number of cases in which each version/model combination performed the best or equal-best in terms of
each of the evaluated metrics was counted across all simulations in the entrance, build-up and Bragg peak, and
tail regions; the counts are summarised in this section. Detailed results for each experiment are included in the
supplementary materials.

3.1. Entrance region
In the entrance region, positron-emitting fragments are created by target fragmentation rather than
projectile fragmentation. The projectile ions lose energy via Coulomb interactions, slowing down at an
approximately constant rate as they traverse this region, with only gradual changes to projectile/target cross
sections. As a result, the positron-emitting fragment distributions are expected to exhibit an approximately
flat depthwise profile in this region.

NMSE and Pearson CCC results between simulation and experimental total positron annihilation
profiles in the entrance region are summarised in tables 2 and 3, respectively, with corresponding figures
shown in supplementary material section 1.

For the entrance region, the BIC model implemented in Geant4 versions 10, 10.1, 10.3 and 10.4 provided
the (equal) lowest NMSE of the yields of total positron annihilation in 5 out of 15 cases. The BIC model in
Geant4 10, 10.1 and 10.2 also provided the (equal) lowest NMSE for 11C fragment production (11/15 cases),
whereas for 10C the best version/model combination was 10.5/INCL (8/15 cases) and for 15O it was 10.6/BIC
(9/12 cases).

Geant4 versions 10.5 to 11 with BIC and 10.3/10.4 with INCL each achieved a Pearson CCC greater than
0.95 (3/15 cases) for total positron yield; QMD did not reach the threshold for any test case in any version of
Geant4.

Results for individual radionuclides were also mixed: 10/BIC, 10.1/BIC, 10.4/BIC and 10.5/INCL
achieved the threshold in 4/15 cases for 11C; versions 10 to 10.4 with BIC and all versions with QMD reached
the threshold in 2/15 cases for 10C; and all versions with BIC, together with 10/INCL, 10.1/INCL, 10.2/INCL
and 10.5/INCL, reached the threshold for 15O.

3.2. Build-up and Bragg peak region
In the build-up and Bragg peak region, positron-emitting fragments are produced via a combination of
target fragmentation and projectile fragmentation. There is a rapid change in positron-emitting fragment
yield with respect to depth, especially since different positron-emitting fragments stop at different distances
from their point of production.


Table 2. Number of test cases for which each Geant4 version/model combination achieved the lowest or equal-lowest NMSE in the
entrance region. Bold text denotes the version/model achieving the highest (or equal-highest) number of best results for each
combination of ion/energy/target.

Version  | Total          | 11C            | 10C            | 15O
         | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL
10       | 6    0    0    | 11   3    2    | 0    0    2    | 6    2    0
10.1     | 6    0    0    | 11   3    2    | 0    0    3    | 6    2    0
10.2     | 5    0    0    | 11   3    2    | 0    0    3    | 5    2    0
10.3     | 6    0    0    | 6    0    0    | 5    3    6    | 4    2    0
10.4     | 6    0    0    | 1    0    0    | 2    3    6    | 9    2    0
10.5     | 2    0    0    | 0    0    0    | 2    5    8    | 3    1    0
10.6     | 4    0    0    | 1    0    0    | 0    0    5    | 5    4    0
10.7     | 3    0    0    | 1    0    0    | 0    0    5    | 5    2    0
11       | 4    0    0    | 1    0    0    | 0    0    5    | 5    2    0
11.1     | 4    0    0    | 0    0    0    | 0    0    5    | 5    2    0

Table 3. Number of test cases for which each Geant4 version/model combination achieved a CCC greater than 0.95 in the entrance
region. Bold text denotes the version/model achieving the highest number of best results for each combination of ion/energy/target.

Version  | Total          | 11C            | 10C            | 15O
         | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL
10       | 2    0    1    | 4    0    0    | 2    2    0    | 3    0    3
10.1     | 1    0    1    | 4    0    0    | 2    2    0    | 3    0    3
10.2     | 2    0    1    | 3    0    0    | 2    2    0    | 3    0    3
10.3     | 2    0    3    | 3    1    3    | 2    2    0    | 3    0    2
10.4     | 2    0    3    | 4    1    3    | 1    2    0    | 3    0    2
10.5     | 3    0    1    | 3    2    4    | 1    2    0    | 3    0    3
10.6     | 3    0    1    | 2    1    2    | 1    2    0    | 3    0    2
10.7     | 3    0    1    | 3    1    3    | 1    2    0    | 3    0    2
11       | 3    0    1    | 2    1    3    | 1    2    0    | 3    0    2
11.1     | 1    0    1    | 2    1    2    | 1    2    0    | 3    0    2

Table 4. Number of test cases for which each Geant4 version/model combination achieved the lowest or equal-lowest NMSE in the
build-up and Bragg peak region. Bold text denotes the version/model achieving the highest (or equal-highest) number of best results
for each combination of ion/energy/target.

Version  | Total          | 11C            | 10C            | 15O
         | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL
10       | 4    0    0    | 11   6    3    | 0    0    2    | 3    2    0
10.1     | 5    0    0    | 11   6    3    | 0    0    2    | 3    2    0
10.2     | 11   1    0    | 14   6    3    | 0    0    1    | 5    1    0
10.3     | 1    0    0    | 5    0    0    | 4    1    2    | 3    0    0
10.4     | 1    0    0    | 1    0    0    | 3    2    2    | 4    0    0
10.5     | 0    0    0    | 0    0    0    | 3    4    7    | 1    2    0
10.6     | 1    0    0    | 0    0    0    | 3    0    9    | 2    1    2
10.7     | 1    0    0    | 0    0    0    | 3    2    7    | 0    0    1
11       | 2    0    0    | 0    0    0    | 3    2    9    | 2    0    2
11.1     | 1    0    0    | 0    0    0    | 3    2    9    | 2    0    2

NMSE and Pearson CCC results between simulation and experimental total positron annihilation
profiles in the build-up and Bragg peak region are summarised in tables 4 and 5, respectively, with
corresponding figures shown in supplementary material section 2.

In the build-up and Bragg peak region, according to the NMSE metric, total positron yield is most
accurately predicted by the BIC model in Geant4 version 10.2, being (equal) best in 11/15 cases. This is much
higher than the next-best combinations (10.1/BIC with 5/15 cases followed by 10/BIC with 4/15). Similar
results are observed for 11C yield, with 10.2/BIC achieving (equal) best performance in 14/15 cases, and
10/BIC and 10.1/BIC each achieving (equal) best results in 11/15 cases; QMD also performs reasonably well
in this case with 10, 10.1 and 10.2 achieving wins in 6/15 cases. For 10C, 10.6/INCL, 11/INCL and 11.1/INCL
are the best performers (each winning in 9/15 cases). Finally, for 15O, 10.2/BIC is the best-performing model
with 5/12 wins, followed by 10.4/BIC with 4 wins.


Table 5. Number of test cases for which each Geant4 version/model combination achieved a CCC greater than 0.95 in the build-up and
Bragg peak region. Bold text denotes the version/model achieving the highest number of best results for each combination of
ion/energy/target.

Version  | Total          | 11C            | 10C            | 15O
         | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL
10       | 9    4    3    | 3    3    2    | 4    3    6    | 3    2    3
10.1     | 9    4    4    | 3    3    3    | 4    3    6    | 3    2    3
10.2     | 10   8    4    | 6    6    3    | 5    4    5    | 3    3    3
10.3     | 9    6    6    | 6    5    3    | 2    2    2    | 4    3    3
10.4     | 6    8    7    | 6    5    3    | 1    1    3    | 4    3    3
10.5     | 8    8    6    | 6    5    4    | 2    2    3    | 3    3    4
10.6     | 9    9    6    | 6    7    6    | 2    1    4    | 4    3    4
10.7     | 6    6    4    | 3    5    2    | 2    1    4    | 3    2    3
11       | 8    8    6    | 4    5    4    | 3    2    4    | 3    3    3
11.1     | 6    5    6    | 2    3    4    | 3    2    4    | 3    3    3

Table 6. Differences between the depths of the maximum positron annihilation yield in experimental and simulation results. Each voxel
has a width of 1.5 mm; the maximum error is always in multiples of 1.5 mm increments.

Version  | BIC                           | QMD                           | INCL
         | µ (mm)    σ (mm)    max (mm)  | µ (mm)    σ (mm)    max (mm)  | µ (mm)    σ (mm)    max (mm)
10       | 1         1.85      6         | −0.20     1.69      3         | 1.10      3.82      10.50
10.1     | 1         1.85      6         | −0.20     1.69      3         | 1         3.91      10.50
10.2     | 0.60      1.37      3         | −0.60     1.58      −3        | 0.60      3.39      9
10.3     | 1.30      1.78      6         | −0.30     1.41      −3        | 2.30      4.12      9
10.4     | 1.60      1.55      6         | −0.10     1.44      −3        | 4         4.45      10.50
10.5     | 0.69      0.99      3         | 0.60      2.03      6         | 2.20      4.24      9
10.6     | 0.60      1.37      3         | −0.10     1.55      3         | 0.10      2.50      7.50
10.7     | 1.20      1.72      4.50      | 0.60      1.95      4.50      | 0.70      3.40      10.50
11       | 1.10      1.65      4.50      | 0.60      1.95      4.50      | 0.60      3.09      9
11.1     | 0.30      1.62      3         | −0.30     1.62      3         | 0         2.78      7.50

Table 7. Differences between the distal depths at which the positron annihilation yield has decreased to 50% of the peak value in
experimental and simulation results. Each voxel has a width of 1.5 mm; the maximum error is always in multiples of 1.5 mm increments.

Version  | BIC                           | QMD                           | INCL
         | µ (mm)    σ (mm)    max (mm)  | µ (mm)    σ (mm)    max (mm)  | µ (mm)    σ (mm)    max (mm)
10       | 0.70      1.49      3         | 0.30      1.41      3         | −0.20     1.49      −3
10.1     | 0.70      1.49      3         | 0.30      1.41      3         | −0.20     1.49      −3
10.2     | 0.30      1.01      1.50      | 0         1.13      1.50      | −0.50     1.22      −3
10.3     | 0.50      1.09      3         | 0.20      1.11      1.50      | −0.20     1.37      −3
10.4     | 0.60      1.11      3         | 0.30      1.01      1.50      | 0.10      1.20      −3
10.5     | 0.35      0.90      1.50      | 0.20      1.11      1.50      | −0.50     1.22      −3
10.6     | 0.40      0.89      1.50      | 0.20      1.11      1.50      | −0.40     1.33      −3
10.7     | 1         1.46      3         | 0.70      1.59      3         | −0.10     1.65      3
11       | 1         1.46      3         | 0.70      1.59      3         | 0         1.60      3
11.1     | 0         1.60      −3        | −0.10     1.44      −3        | −0.60     1.68      −3

Using the Pearson CCC metric, the best-performing version/model combinations for overall positron
yield are 10.2/BIC (10/15 cases), followed by 10/BIC, 10.1/BIC, 10.3/BIC, 10.6/BIC and 10.6/QMD (9/15
cases). Generally, BIC performed very well, with all Geant4 versions achieving (equal) best performance in at
least 6 cases. 11C yield was best predicted by 10.6/QMD (7/15 cases); however, many version/model
combinations also did well here, with 10.2/BIC, 10.2/QMD, 10.3/BIC, 10.4/BIC, 10.5/BIC, 10.6/BIC and
10.6/INCL all achieving 6/15 wins. 10C yield was best predicted by 10/INCL and 10.1/INCL (6/15 cases),
closely followed by 10.2/BIC and 10.2/INCL, which won in 5/15 cases. The best-performing version/model
combinations for 15O yield were 10.3/BIC, 10.4/BIC, 10.5/INCL, 10.6/BIC and 10.6/INCL with 4/12 wins
each, with all other version/model combinations achieving 2 or 3 wins.

Table 6 lists the differences between the experimental and simulated depths of the positron annihilation
peak, while table 7 lists the differences between the experimental and simulated depths of the distal 50%
fall-off point.


Table 8. Number of test cases for which each Geant4 version/model combination achieved the lowest or equal-lowest NMSE in the tail
region. Bold text denotes the version/model achieving the highest (or equal-highest) number of best results for each combination of
ion/energy/target.

Version  | Total          | 11C            | 10C            | 15O
         | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL | BIC  QMD  INCL
10       | 3    2    2    | 5    9    4    | 2    2    1    | 4    4    4
10.1     | 4    2    2    | 6    9    4    | 1    1    2    | 4    4    4
10.2     | 12   10   2    | 11   12   4    | 1    1    3    | 7    7    4
10.3     | 1    1    1    | 1    1    1    | 2    4    0    | 3    4    4
10.4     | 1    1    1    | 1    1    1    | 2    3    0    | 3    4    3
10.5     | 1    1    1    | 1    1    1    | 3    4    2    | 2    2    2
10.6     | 1    2    1    | 1    1    1    | 5    3    6    | 4    10   4
10.7     | 1    1    1    | 1    1    1    | 3    3    5    | 3    4    3
11       | 1    1    1    | 1    1    1    | 4    3    6    | 4    4    3
11.1     | 1    1    1    | 1    1    1    | 4    3    5    | 4    4    3

Table 9. Number of test cases for which each Geant4 version/model combination achieved a CCC greater than 0.95 in the tail region.
Bold text denotes the version/model achieving the highest number of best results for each combination of ion/energy/target.

          Total            11C              10C              15O
Version    BIC  QMD  INCL   BIC  QMD  INCL   BIC  QMD  INCL   BIC  QMD  INCL
10          12   11    11    10   11    11     4    3     4     8    7     9
10.1        12   11    11    11   11    11     4    4     4     8    7     9
10.2        12   11    11    10   11    11     4    4     4     8    7     7
10.3        11   11    10    11   11    11     2    2     1     8    7     8
10.4        10   10    10     9   11    10     2    2     0     7    7     8
10.5        11   11    11    11   11    11     2    2     3     7    7     8
10.6        11   11    12    11   11    11     2    3     3     7    7     9
10.7        12   11    12    11   11    12     3    3     3     7    7     9
11          12   11    12    11   11    13     3    3     3     8    8     9
11.1        12   11    12    11   11    13     3    3     3     8    8     9

The smallest differences between the experimental and simulated depth of maximum positron
annihilation were obtained with Geant4 10.4/QMD (µ = −0.1 mm; max = −3 mm) and 10.6/QMD
(µ = −0.1 mm; max = +3 mm). While a smaller mean value was obtained with 11.1/INCL, the maximum
value and standard deviation were much larger (+7.5 mm and 2.78 mm, respectively) compared to 10.4/QMD and
10.6/QMD. Differences in the depth of the distal 50%-of-peak point were much smaller; the best estimates
were obtained with 10.2/QMD (µ = 0 mm; max = +1.5 mm), 11.1/BIC (µ = 0 mm; max = −3 mm) and
11/INCL (µ = 0 mm; max = +3 mm).
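
As an illustration of how the two depth metrics discussed above can be extracted from a one-dimensional annihilation profile, the following Python sketch locates the depth of maximum yield and the distal depth at which the yield falls to 50% of the peak. The function name, the linear interpolation between voxels, the sign convention and the example profile values are all assumptions made for illustration; they are not taken from the analysis code used in this work, whose reported differences appear to be quantised to the 1.5 mm voxel depth.

```python
def peak_and_distal_half(depths_mm, profile):
    """Return (depth of maximum, distal depth where profile falls to 50% of maximum).

    The second value is None if the profile never drops to half-maximum
    distal to the peak. If the maximum is repeated, the first one is used.
    """
    if len(depths_mm) != len(profile) or not profile:
        raise ValueError("depths_mm and profile must be non-empty and equal in length")
    i_peak = max(range(len(profile)), key=profile.__getitem__)
    half = 0.5 * profile[i_peak]
    # Walk distally from the peak until the yield first reaches half-maximum.
    for i in range(i_peak + 1, len(profile)):
        if profile[i] <= half:
            # Linear interpolation between the last voxel above half-maximum
            # and the first voxel at or below it (an illustrative choice).
            x0, x1 = depths_mm[i - 1], depths_mm[i]
            y0, y1 = profile[i - 1], profile[i]
            frac = (y0 - half) / (y0 - y1)
            return depths_mm[i_peak], x0 + frac * (x1 - x0)
    return depths_mm[i_peak], None


# Hypothetical profiles on a 1.5 mm grid (values are illustrative only).
depths = [0.0, 1.5, 3.0, 4.5, 6.0, 7.5]
experiment = [1.0, 2.0, 4.0, 10.0, 4.0, 1.0]
simulation = [1.0, 2.0, 10.0, 4.0, 1.0, 0.5]

exp_peak, exp_half = peak_and_distal_half(depths, experiment)
sim_peak, sim_half = peak_and_distal_half(depths, simulation)

# Sign convention assumed here: simulation minus experiment.
print("peak difference (mm):", sim_peak - exp_peak)
print("distal 50% difference (mm):", sim_half - exp_half)
```

With this assumed convention, a negative value means the simulated feature lies shallower than the measured one.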

3.3. Tail region
In the tail region, positron-emitting radionuclides are primarily produced through fragmentation of the
target material caused by light fragments created upstream from the primary beam. As such, the production
of positron-emitting fragments in the tail region is highly dependent on fragmentation and scattering cross
sections upstream. Therefore, the yield of positron annihilation is not expected to rapidly change across this
region compared to the build-up and Bragg peak region.

NMSE and Pearson CCC results between simulation and experimental total positron annihilation
profiles in the tail region are summarised in tables 8 and 9, respectively, with corresponding figures shown in
supplementary material section 3.

Using the NMSE metric, 10.2/BIC was the best-performing version/model combination for overall
positron yield (12/15 cases), with 10.2/QMD being the second-best (10/15). Results were similar for 11C
yield, with the best version/model combinations being 10.2/QMD (12/15 cases) and 10.2/BIC (11/15). For
10C, the most wins were obtained by 10.6/INCL and 11/INCL (6/15 cases) followed by 10.6/BIC, 10.7/INCL
and 11.1/INCL (5/15 cases). Finally, for 15O, the best results were obtained with 10.6/QMD (10/12 cases)
followed by 10.2/BIC and 10.2/QMD (7/12 cases).
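
The "wins" reported in table 8 count the test cases in which a version/model combination achieved the lowest or equal-lowest NMSE. A minimal Python sketch of such a tally is given below; the function name, the tie tolerance and the NMSE values are hypothetical and serve only to show how ties are credited to every tied combination.

```python
from collections import Counter


def count_wins(nmse_by_case, tol=1e-12):
    """Count, per version/model, the cases with the lowest or equal-lowest NMSE."""
    wins = Counter()
    for scores in nmse_by_case.values():
        best = min(scores.values())
        for combo, value in scores.items():
            # Every combination within `tol` of the best score is credited.
            if value - best <= tol:
                wins[combo] += 1
    return wins


# Hypothetical NMSE values for three test cases (illustrative only).
nmse_by_case = {
    "case A": {"10.2/BIC": 0.010, "10.2/QMD": 0.010, "10.6/INCL": 0.030},
    "case B": {"10.2/BIC": 0.020, "10.2/QMD": 0.025, "10.6/INCL": 0.040},
    "case C": {"10.2/BIC": 0.050, "10.2/QMD": 0.045, "10.6/INCL": 0.030},
}

wins = count_wins(nmse_by_case)
print(dict(wins))
```

Because ties are credited to all tied combinations, the wins in a column can sum to more than the number of test cases, which is consistent with how the tables in this section read.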

The Pearson CCC results in the tail region were all very similar across Geant4 versions, with only a few
wins separating the best and worst-performing version/model combinations in most instances. All
version/models exceeded the threshold of 0.95 for a clear majority of cases for total positron yield as well as
11C and 15O production. For total positron annihilation yield, 10/BIC, 10.1/BIC, 10.2/BIC, 10.6/INCL,
10.7/BIC, 10.7/INCL, 11/BIC, 11/INCL, 11.1/BIC and 11.1/INCL all exceeded the target threshold for 12/15
cases. Even the worst-performing version/model combinations still exceeded the threshold in 10/15 cases.
For 11C yield, 11/INCL and 11.1/INCL reached the threshold in 13/15 cases (with the worst-performing
combination scoring 9/15 wins). Fewer wins were seen with 10C; the best results were obtained with 10/BIC,
10/INCL, 10.1/BIC, 10.1/QMD, 10.1/INCL, 10.2/BIC, 10.2/QMD and 10.2/INCL (4/15 cases). Finally, 15O
yield was best predicted by 10/INCL, 10.1/INCL, 10.6/INCL, 10.7/INCL, 11/INCL and 11.1/INCL (9/12
cases); again, even the worst-performing version/model combinations exceeded the threshold in
7/12 cases.

3.4. Overall recommendation
The accuracy of Geant4’s hadronic inelastic physics models (BIC, QMD and INCL) in predicting both total
positron annihilation yield and individual positron-emitting fragment production is not consistent between
different versions of Geant4; furthermore, later releases do not necessarily provide a more accurate
prediction of experimental observations than preceding versions. In some cases, NMSE and Pearson CCC
yielded conflicting results, because each metric emphasises different features of the respective profiles:
NMSE quantifies the overall average squared difference between the profiles, while Pearson CCC
quantifies the degree of linear dependence, independent of relative or absolute magnitude.
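
The following Python sketch illustrates how the two metrics can disagree. The NMSE normalisation used here (sum of squared differences divided by the sum of squared experimental values) is an assumption made for this example and may differ from the definition given in section 2.3; the Pearson coefficient is the standard one. All profile values are hypothetical.

```python
import math


def nmse(sim, exp):
    """Assumed form: sum of squared differences over sum of squared experimental values."""
    return sum((s - e) ** 2 for s, e in zip(sim, exp)) / sum(e * e for e in exp)


def pearson(sim, exp):
    """Standard Pearson correlation coefficient between two profiles."""
    n = len(exp)
    mean_s, mean_e = sum(sim) / n, sum(exp) / n
    cov = sum((s - mean_s) * (e - mean_e) for s, e in zip(sim, exp))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_e = sum((e - mean_e) ** 2 for e in exp)
    return cov / math.sqrt(var_s * var_e)


experiment = [1.0, 2.0, 4.0, 10.0, 4.0, 1.0]
# Same shape as the experiment, but 50% too high everywhere.
scaled = [1.5 * e for e in experiment]
# Roughly correct magnitude, but a distorted shape.
distorted = [1.2, 1.8, 4.5, 9.0, 4.8, 0.9]

for name, sim in (("scaled", scaled), ("distorted", distorted)):
    print(name, "NMSE:", nmse(sim, experiment), "Pearson:", pearson(sim, experiment))
```

In this example the scaled profile is perfectly correlated with the experiment yet has the larger NMSE, whereas the distorted profile has the smaller NMSE but the lower correlation, so the two metrics rank the profiles in opposite order.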

In the entrance region, BIC was clearly the best-performing model, with the best choice of Geant4
version depending on the particular metric and fragmentation product of interest. NMSE results generally
favoured 10-10.4/BIC (especially 10.2/BIC), except for 10C yield, which was better predicted by 10.3+/INCL.
Pearson CCC performance did not strongly favour any particular version/model combination, with at most
1/3 of test cases achieving the target CCC threshold of 0.95 for any version/model.

In the build-up and Bragg peak region and tail region, the results are more conclusive. The NMSE metric
conclusively shows that version 10.2/BIC is the best choice for total positron yield as well as 11C and 15O
yield, while 10.5-11.1/INCL performed the best for 10C. Pearson CCC results are more mixed, but again,
10.2/BIC gives the best results for total positron annihilation yield, with most versions of Geant4 with BIC
performing well. 10.6/QMD performed the best for 11C, 10/INCL and 10.1/INCL performed the best for 10C,
and there was no clear winner for 15O.

Using the depth-of-maximum-yield metric, the smallest mean differences were obtained with 10.4/QMD
and 10.6/QMD. These versions/models also achieved the equal-smallest maximum difference (−3mm and
+3mm, respectively). Across all versions of Geant4, QMD demonstrated the best overall accuracy (lowest
average mean difference in peak depth) and highest precision (lowest average standard deviation). INCL was
the worst-performing model across all versions, with much larger maximum differences, and a consistent
underestimation of depth of maximum yield across Geant4 versions, with the exception of version 11.1
(which, despite a mean difference of 0, exhibited a large standard deviation and maximum value). Standard
deviations obtained using INCL were generally around double those of QMD and BIC. BIC also showed a
consistent underestimation in depth of maximum yield, although the maximum differences were much
smaller than for INCL. For context, the difference between the depth of the positron annihilation peak and
the Bragg peak with monoenergetic ion beams is of the order of −5.6 ± 0.8 mm for 12C and −6.6 ± 0.8 mm
for 16O (Augusto et al 2018, Mohammadi et al 2019, Chacon et al 2020).

Results were generally better for the distal depth at 50% of peak metric. In this case, 10.2/QMD, 11.1/BIC
and 11/INCL all achieved a mean of zero, with 10.2/QMD also having the equal-lowest maximum value of
1.5 mm (a depth difference of one voxel). QMD’s maximal values were slightly smaller overall compared to
BIC, and INCL’s were the largest at ±3 mm for all versions. INCL tended to consistently overestimate the
depth of this point, with both mean and maximum differences being negative in most cases. BIC and QMD
both tended towards underestimating the 50%-of-peak depth, with the exception of version 11.1 (negative
maxima for both, and means of 0 and −0.1 mm, respectively). Standard deviations were quite small for all
versions and models (with the maximum standard deviation being 1.68 mm, for 11.1/INCL).
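
The mean, standard deviation and maximum differences quoted above summarise the per-test-case depth differences for each version/model. A short Python sketch of such a summary follows. It assumes that the sample standard deviation is used and that the "maximum" is the signed difference of largest magnitude; neither choice is confirmed by the text, and the input values are hypothetical.

```python
import statistics


def summarise_differences(diffs_mm):
    """Return (mean, sample standard deviation, signed difference of largest magnitude)."""
    mean = statistics.mean(diffs_mm)
    std = statistics.stdev(diffs_mm)  # n - 1 denominator (assumed)
    # If two differences share the largest magnitude, the first one is returned.
    worst = max(diffs_mm, key=abs)
    return mean, std, worst


# Hypothetical depth differences (mm) across six test cases, in 1.5 mm steps.
diffs = [0.0, -1.5, 1.5, -3.0, 1.5, 1.5]
mean, std, worst = summarise_differences(diffs)
print(f"mean = {mean:.2f} mm, std = {std:.2f} mm, max = {worst:+.1f} mm")
```

The example shows why a mean of zero alone is not sufficient evidence of accuracy: positive and negative differences cancel, so the spread and the worst case need to be read alongside it.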

Finally, in the tail region, Geant4 10.2 with BIC and QMD again provided the best prediction of total
positron and 11C yield in terms of NMSE, while 10.6/INCL performed the best for 10C and 15O. All
version/model combinations performed well for total positron annihilation, 11C and 15O yield according to
the Pearson CCC metric, while no version/model performed especially well for 10C.

Across all regions, ion species, beam energies, and target materials evaluated, the combination of Geant4
version 10.2 and BIC is best able to reproduce experimental results as evaluated using the NMSE and Pearson
CCC metrics—especially in the build-up and Bragg peak region and tail region. Since the build-up and
Bragg peak region is the location where (1) the majority of the dose resulting from carbon or oxygen ion
beam irradiation in heavy ion therapy is deposited, and (2) where the strongest positron annihilation signal
is observed, the results in this region are the most relevant to PET image-based QA simulation work. Version
10.2 also provided the best estimate of the distal depth at which positron yield decreased to 50%
of peak, although this was obtained with QMD rather than BIC; the most accurate estimate of the depth of
the peak itself was also achieved with QMD, but with Geant4 versions 10.4 and 10.6. As QMD exhibited the
best accuracy and precision across Geant4 versions, it is the recommended model if the depth of the yield
peak is critical.

The BIC model implemented in Geant4 version 10.5 suffered from a run-time stability error which
resulted in it being unable to simulate all test scenarios; therefore, we recommend that this version/model
combination should be avoided for future studies.

In the evaluation of individual positron-emitting fragment yield profiles, predictions of 10C distribution
were generally the least accurate in terms of both the NMSE and Pearson CCC. Interestingly, the INCL model
often performed the best for prediction of 10C fragment yield, although it rarely performed the best for total
positron annihilation and 11C or 15O. Therefore, INCL should be considered for studies focusing on 10C
fragmentation, with the caveat that range estimation will be less accurate with this model.

Not all models met or exceeded the set threshold of 0.95 for the Pearson CCC metric. This means that in
these cases, the shape of the predicted positron distribution differs significantly from the experimental
measurements. This is of particular concern if these models are to be used for dose estimation using a
deconvolution approach (Hofmann et al 2019a, 2019b) or for the training of machine learning models for
feature extraction (Rutherford et al 2022).

One may reasonably ask why the performance of the fragmentation models in Geant4 has not continued
to steadily improve with each release, and in fact has regressed at times. Positron-emitting isotope
production channels represent only a fraction of all possible reaction outcomes, so it may be the case that by
improving results for one subset of reaction processes, the positron-emitting nuclide production cross
sections became worse. Another possible reason is the implementation of different numbers of de-excitation
channels in the Fermi break-up model in different versions of Geant4. Unfortunately, to date, no detailed
investigation has been conducted into Geant4 to pin down the specific cause, and it is unknown at this stage
if there are other contributing factors as well. In order to more strictly monitor the impact of the evolution of
Geant4 on the results of a simulation application of interest, the Geant4 developers are developing an
automated benchmarking system for medical applications in Geant4 (the G4-Med project), which should
help to document, with higher granularity, the reasons behind differing results between Geant4 releases
(Arce et al 2020).

In the next release of Geant4, 11.2, a new QMD model, ‘Light Ion QMD’, will be introduced (see footnote 10) with a
specific focus on hadron therapy (Sato et al 2022). In future work, we will be collaborating with the
developers of this model to compare its performance against the other models included in Geant4 11.2 with a
focus on in vivo PET applications.

Finally, it is worth noting that current evaluations of fragmentation cross sections exhibit uncertainties
exceeding 10%, which must be tightened in order to accurately model positron fragmentation, particularly
in the case of complex fragmentation reactions such as the production of 10C (Bolst et al 2017, Toppi et al
2022). These uncertainties are especially due to the effective cross-sections that are double-differential in
angle and energy. Since these cross-sections provide a strong constraint on nucleus-nucleus reaction models,
access to improved experimental measurements of these cross-sections is vital to constraining these models
and improving their accuracy. This also impacts other Monte Carlo simulation platforms (such as FLUKA,
MCNP and PHITS), which also rely on accurate cross-section data (although notably PHITS uses a newer
version of the QMD model, JQMD2, which tries to correct the main flaw of the QMD model, namely the drop in
effective cross-sections at low angles (Ogawa et al 2015)).

4. Conclusion

In this study, the accuracy with which Geant4 is able to predict the distribution of total positron annihilation
yield and the distributions of individual positron-emitting fragmentation products (11C, 10C and 15O)
during carbon or oxygen ion therapy was compared to experimental data. Three different hadronic inelastic
physics models (BIC, QMD and the Liège Intranuclear Cascade model, INCL) were used with ten different
versions of Geant4 (10.0.p04, 10.1.p03, 10.2.p03, 10.3.p03, 10.4.p03, 10.5.p01, 10.6.p03, 10.7.p02, 11.0 and
11.1) in three different homogeneous phantoms. The simulated and experimental data were compared using
two different metrics: NMSE and the Pearson CCC. Additionally, the differences between the simulated and
experimental depth of maximum positron annihilation yield, as well as the distal point at which positron
yield declines to 50% of the peak were evaluated. It was found that the accuracy of the hadronic inelastic
physics models strongly depends on the version of Geant4 in which they were implemented, and newer versions
of Geant4 were not always more accurate at predicting positron-emitting fragmentation compared to older
versions. Furthermore, it was found that not all version/model combinations were able to satisfactorily
predict the shape of positron annihilation or positron-emitting fragment distributions, even though they
could provide a good estimation of the total positron annihilation yield and range. For future simulation
studies of therapeutic irradiation using carbon or oxygen ion beams, it is recommended that Geant4 version
10.2 with the BIC model be used as it is currently the version/model combination best able to replicate the
experimentally-observed total positron yield and the fragmentation product distributions, while the depth of
the maximum positron yield and distal 50%-of-peak point were best predicted using the QMD model from
Geant4 10.4, 10.6 (peak) and 10.2 (distal 50%-of-peak).

10 Note: this model had not been included in Geant4 prior to the submission of this manuscript.

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary
information files).

Acknowledgment

The authors would like to acknowledge the following organisations for providing access to their
high-performance computing resources: the Multi-modal Australian Sciences Imaging and Visualisation
Environment (MASSIVE) ‘M3’ cluster and Australia’s Nuclear Science and Technology Organisation
(ANSTO) ‘Tesla’ cluster. This research has been conducted with the support of the Australian government
research training program scholarship. The authors acknowledge the scientific and technical assistance of the
National Imaging Facility, a National Collaborative Research Infrastructure Strategy (NCRIS) capability at
the Australian Nuclear Science and Technology Organisation, ANSTO.

Appendix

Table 10 lists the physics models which were used in the simulations.

Table 10. Hadronic physics processes and models used in all simulations.

Interaction          Energy range       Geant4 model
Radioactive decay    All energies       G4RadioactiveDecayPhysics
Particle decay       All energies       G4Decay
Hadron elastic       0–100 TeV          G4HadronElasticPhysicsHP
Ion inelastic        <100 MeV           Binary Light Ion Cascade
                     100 MeV–10 GeV     BIC or QMD or INCL++
Neutron capture      0–20 MeV           NeutronHPCapture
                     >19.9 MeV          nRadCapture
Neutron inelastic    0–20 MeV           NeutronHPInelastic
                     >19.9 MeV          Binary Cascade
Proton inelastic     990 eV–10 TeV      Binary Cascade

ORCID iDs

Akram Hamato https://orcid.org/0000-0002-3274-4261
Hideaki Tashima https://orcid.org/0000-0003-3155-0083
Eiji Yoshida https://orcid.org/0000-0003-4269-2618
Go Akamatsu https://orcid.org/0000-0001-9686-8901
Daniel R Franklin https://orcid.org/0000-0002-9563-5943
Anatoly Rosenfeld https://orcid.org/0000-0001-5116-6308
Susanna Guatelli https://orcid.org/0000-0002-9289-7956
Mitra Safavi-Naeini https://orcid.org/0000-0002-6975-9563

References

Agostinelli S et al 2003 Geant4: a simulation toolkit Nucl. Instrum. Methods Phys. Res. A 506 250–303
Akamatsu G et al 2019 Performance evaluation of a whole-body prototype PET scanner with four-layer DOI detectors Phys. Med. Biol.
  64 095014
Arce P et al 2020 Report on G4Med, a Geant4 benchmarking system for medical physics applications developed by the Geant4 Medical
  Simulation Benchmarking Group Med. Phys. 48 19–56
Augusto R S, Mohammadi A, Tashima H, Yoshida E, Yamaya T, Ferrari A and Parodi K 2018 Experimental validation of the FLUKA Monte
  Carlo code for dose and β+-emitter predictions of radioactive ion beams Phys. Med. Biol. 63 215014
Bolst D et al 2017 Validation of Geant4 fragmentation for heavy ion therapy Nucl. Instrum. Methods Phys. Res. A 869 68–75
Boudard A, Cugnon J, Leray S and Volant C 2002 Intranuclear cascade model for a comprehensive description of spallation reaction
  data Phys. Rev. C 66 044615
Chacon A et al 2019 Comparative study of alternative Geant4 hadronic ion inelastic physics models for prediction of positron-emitting
  radionuclide production in carbon and oxygen ion therapy Phys. Med. Biol. 64 155014
Chacon A et al 2020 Experimental investigation of the characteristics of radioactive beams for heavy ion therapy Med. Phys. 47 3123–32
Durante M, Orecchia R and Loeffler J S 2017 Charged-particle therapy in cancer: clinical uses and future perspectives Nat. Rev. Clin.
  Oncol. 14 483–495
Geant4 Collaboration 2018 Physics Reference Manual for Geant4 Technical Report CERN
Hofmann T et al 2019a Dose reconstruction from PET images in carbon ion therapy: a deconvolution approach Phys. Med. Biol.
  64 025011
Hofmann T, Fochi A, Parodi K and Pinto M 2019b Prediction of positron emitter distributions for range monitoring in carbon ion
  therapy: an analytical approach Phys. Med. Biol. 64 105022
Mancusi D, Boudard A, Cugnon J, David J-C, Kaitaniemi P and Leray S 2014 Extension of the Liège intranuclear-cascade model to
  reactions induced by light nuclei Phys. Rev. C 90 054602
Mohammadi A, Tashima H, Iwao Y, Takyu S, Akamatsu G, Nishikido F, Yoshida E, Kitagawa A, Parodi K and Yamaya T 2019 Range
  verification of radioactive ion beams of 11C and 15O using in-beam PET imaging Phys. Med. Biol. 64 145014
Ogawa T, Sato T, Hashimoto S, Satoh D, Tsuda S and Niita K 2015 Energy-dependent fragmentation cross sections of relativistic 12C
  Phys. Rev. C 92 024614
Parodi K and Polf J C 2018 In vivo range verification in particle therapy Med. Phys. 45 e1036–50
Rutherford H et al 2020 Dose quantification in carbon ion therapy using in-beam positron emission tomography Phys. Med. Biol.
  65 235052
Rutherford H et al 2022 An inception network for positron emission tomography based dose estimation in carbon ion therapy Phys.
  Med. Biol. 67 194001
Sato Y-h, Sakata D, Bolst D, Simpson E C, Guatelli S and Haga A 2022 Development of a more accurate Geant4 quantum molecular
  dynamics model for hadron therapy Phys. Med. Biol. 67 225001
Swinscow T 2021 Statistics at Square One (Wiley)
Toppi M et al 2022 Elemental fragmentation cross sections for a 16O beam of 400 MeV/u kinetic energy interacting with a graphite target
  using the FOOT ∆E-TOF detectors Front. Phys. 10 979229

