Teak genetic diversity in Ghana shows a narrow base for further breeding and a need for improved international collaboration for provenance exchange

We evaluated the genetic diversity of teak (Tectona grandis L.) provenances at a newly established provenance trial with 52 provenances collected from Africa, South America and Asia in Tain II Forest Reserve in Central Ghana. This provenance trial was established to widen the genetic basis for teak establishment in West Africa. Using Genotyping by Sequencing (GBS) we analysed the genetic diversity of these provenances. Results of the study revealed that, although acquired from a wide geographical range, most teak provenances in the trial belong to only two distinct groups that are closely related. The implication of this finding is that, for breeding, a wider range of provenances is needed from the original teak distribution areas, and more specifically from Southern India. We conclude that urgent protection of older existing sources of genetic variation in teak, as well as an improvement of international collaboration under the Nagoya protocol with countries with native teak populations, is necessary.


Introduction
Teak is a high-quality timber species of great importance in plantation establishment throughout the tropics. Teak (Tectona grandis L.) is naturally distributed in Myanmar, India, Laos and Thailand (White, 1991) but can now be found in about 36 countries in Tropical Asia, South America and Africa (Koskela et al, 2014). A record area of about 5.7 million ha of teak has been planted (Bhat and Hwan, 2004; Nair, 2007), underlining the economic importance of teak for tropical forestry. Teak has reached shape of the trees under many circumstances in various countries (Chaix et al, 2011; Goh and Monteuuis, 2012; Ugalde-Arias, 2013). Although such clones contribute to the present success and financial attractiveness of teak planting, tree planters should heed their genetic diversity. Genetic diversity is key to mitigating the effects of climate change and the appearance of new diseases, and to allowing improvement of other qualities such as growth speed and heartwood formation; it should therefore receive more attention (Graudal and Moestrup, 2017). It is important to initiate and support selection and testing of superior individuals in local breeding programmes because many traits such as bole straightness, proportion of heartwood and fine branching, which are important for commercial production of teak, have a genetic background (Kjaer et al, 1996; Fofana et al, 2008), but the phenotypic manifestation of traits is not the same in each locality.
In Ghana, teak is the prime plantation species with well over 150,000 hectares planted since 2002 (FSD-FC Ghana, 2017). At present, for the development of teak plantations in Ghana, a very limited number of seed sources are available (Wanders, 2014; FSD-FC Ghana, 2017). Most of these seed sources are 'unproven', which means that the stands have been identified as seed stands, but progeny trials are not available to support the selection of these stands for this purpose. The material from these stands is now systematically evaluated at the Tain II Forest Reserve provenance trial at Form Ghana Ltd. The stands that are currently used as seed sources were mostly planted in the 1980s and 1990s, many of them with material from Kihuhwi in Tanzania (FSD-FC Ghana, 2017). Other sources of planting material are what remains of an international teak provenance trialing effort containing 13 provenances from India, Laos, Indonesia and Ghana planted between 1972 and 1975 by the Danish Development Cooperation DANIDA and the Forest Research Institute of Ghana FORIG (Keiding et al, 1986). A clonal seed orchard has been developed based on material from this trial by FORIG at Jimira in Ghana. From these sources, plantation developers presently obtain seeds for plantation establishment. Some also resort to importation of seed from other countries.
Recognising the need for a wider genetic pool to source seeds, Form Ghana Ltd started a new provenance trial in 2015 in which, over several years, 52 accessions of teak coming from Ghana, Tanzania, Côte d'Ivoire, Malaysia, Brazil, Costa Rica, Honduras, and indirectly from India and Indonesia, were planted (Wanders, 2020).
While these provenances represent a global distribution of teak, there is great uncertainty on their genetic kinship. In this study, we investigated the genetic diversity of teak provenances presently grown for testing in Ghana. We evaluated how closely related the provenances are and whether the aim of a wider genetic base for the teak industry in West Africa can be achieved under the present conditions and with the material currently accessible.

Trial location
The trial is located in block A42 in the Tain II Forest Reserve (Figure 1). The coordinates of the location are 7°37'53.78"N and 2°38'26.31"W. The layout of the trial is a block design with blocks of 49 trees per provenance. For most provenances there is at least one replicate, but some have several replicates. Planting started in 2015 and new material has been added annually, while also adding already present material to make comparisons within and between the years possible. In 2020 the trial covered 12 hectares.

Genetic sampling and sample library preparation
Leaf samples (one sample per provenance) were collected at Form Ghana's Tain II provenance trial and at Form Ghana's nursery in May 2019. Leaf samples were immediately dried with silica gel and stored for further processing. A total of 41 trees of 37 different accessions were sampled.
Genomic DNA extraction was done using a Nucleospin 96 Plant II Kit from Bioké, following the manufacturer's instructions. Genetic variation was measured using Genotyping by Sequencing (GBS) (Elshire et al, 2011). First, 88 to 278 ng of genomic DNA (gDNA) of each of 41 samples was digested by two restriction enzymes (AseI and NsiI), after which two indexed adapters were ligated to the DNA fragments. The main change in the adapter design was the incorporation of three random Unique Molecule Identifier (UMI) nucleotides per adapter for the identification of PCR duplicates within each amplified GBS library. After ligation, individual samples were cleaned by two subsequent NucleoMag (Macherey-Nagel, Germany) clean-up steps using 1x and 0.8x beads, respectively. A small-volume test PCR (15 cycles) was performed using KAPA HiFi HotStart ReadyMix (Roche Diagnostics, Switzerland). The resulting product was diluted 10,000-fold prior to qPCR quantification (KAPA Library Quantification Kit for HTS, Roche Diagnostics, Switzerland). The result of the qPCR was subsequently used to equimolarly pool the original cleaned digestion/ligation products. This pooled product was concentrated using a column-based NucleoMag PCR clean-up (Macherey-Nagel, Germany) and nick repaired using DNA polymerase I (50 µL reaction). The nick-repaired product was amplified in five reactions of 10 µL each and cleaned by two subsequent NucleoMag (Macherey-Nagel, Germany) clean-up steps using 1x and 0.8x beads, respectively. The average library size was 1,177 bp. The final GBS library was quantified by qPCR, pooled with other libraries and spiked with 10% PhiX prior to sequencing. Spiking increases the DNA complexity of the library, which improves the HiSeq colour matrix estimation; this estimation uses the first 11 sequencing cycles, which overlap with our index region. Sequencing was performed by Novogene (Hong Kong) on an Illumina HiSeq X-Ten sequencer, producing 2 x 150 bp Paired-End (PE) sequencing reads.
In total, 0.4 of a sequencing lane was devoted to the 41 teak GBS libraries, providing a total of 232,995,422 raw reads.
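The role of the UMI nucleotides in flagging PCR duplicates can be illustrated with a minimal Python sketch. It assumes that each read pair is represented as a tuple of the two adapter UMIs and the two read sequences, and that pairs identical in all four fields are PCR duplicates. The function name and the toy reads are hypothetical; the duplicate removal in this study was done in Stacks (see Data analysis), not with this code.

```python
def remove_pcr_duplicates(read_pairs):
    """Keep the first read pair seen for each (UMI1, UMI2, seq1, seq2) combination."""
    seen = set()
    unique_pairs = []
    for pair in read_pairs:
        if pair not in seen:
            seen.add(pair)
            unique_pairs.append(pair)
    return unique_pairs

# Three random nucleotides on each of the two adapters give 4**6 possible UMI combinations.
N_UMI_COMBINATIONS = 4 ** (3 + 3)

# Toy read pairs (arbitrary sequences, for illustration only).
toy_reads = [
    ("ACG", "TTA", "TGCAGGATC", "ATTAATCCG"),  # original fragment
    ("ACG", "TTA", "TGCAGGATC", "ATTAATCCG"),  # same UMIs and sequences: treated as PCR duplicate
    ("GGT", "CAT", "TGCAGGATC", "ATTAATCCG"),  # same sequences, different UMIs: kept
]

deduplicated = remove_pcr_duplicates(toy_reads)
print(len(toy_reads), "read pairs before,", len(deduplicated), "after duplicate removal")
```

Without UMIs, the third toy read pair would be indistinguishable from a PCR duplicate, because GBS fragments from the same restriction site always start at the same position.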

Data analysis
Demultiplexing, de novo reference construction, mapping and SNP calling of the DNA sequences were conducted using Stacks version 2.4 (Catchen et al, 2013). PCR duplicates were removed using clone_filter based on the UMI nucleotides, followed by demultiplexing using process_radtags. To identify SNPs from the reads we used the "denovo_map.pl" script with -m 3 -M 5 -n 5, chosen after exploratory runs over a range of values (-m 2-6, -M 3-7 and -n equal to -M) to maximise the quality of SNPs for this dataset (Paris et al, 2017). After mapping, data were filtered using VCFtools (Danecek et al, 2011). The applied filter first removed all loci which were not present in more than 50% of individuals, had a genotype quality below 30 or had a mean depth lower than six. After this, individuals with more than 80% missing data were removed. All SNPs that were not present in all individuals and had an individual sample depth of less than 10 were removed. Four duplicate datasets were removed from the analysis, resulting in a total of 37 samples. We used STRUCTURE (Pritchard et al, 2000) on 1000 randomly selected SNPs to assess patterns of genetic structure in the samples, with a number of assumed populations (K) of 1-7 and 10 replicates per K. We used 1,000,000 burn-in iterations and 500,000 reps. Afterwards, the output data were analysed using StructureSelector (Li and Liu, 2018). Clustering was done using the adegenet package (Jombart and Ahmed, 2011) in R version 3.5.3 (R Core Team, 2019). Genetic distance was calculated using the R package adegenet, using dist(method="euclidean"). Principal component analysis was done with the function dudi.pca from the R package ade4. All scripts used in this analysis are available at https://github.com/MaartenPostuma/Teak-analysis. Demultiplexed reads are available under BioProject PRJNA756980 at NCBI (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA756980).
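The order of the first two filtering steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the VCFtools commands used in the study: genotypes are assumed to be stored per locus as a list with one entry per individual, each entry being either None (missing) or a tuple of (alternative allele count, read depth). The genotype-quality filter is omitted for brevity, and all function names, sample names and toy values are hypothetical.

```python
def filter_loci(loci, min_call_rate=0.5, min_mean_depth=6):
    """Keep loci called in more than `min_call_rate` of individuals
    with a mean read depth (over called genotypes) of at least `min_mean_depth`."""
    kept = {}
    for name, calls in loci.items():
        called = [c for c in calls if c is not None]
        if not called:
            continue
        call_rate = len(called) / len(calls)
        mean_depth = sum(depth for _, depth in called) / len(called)
        if call_rate > min_call_rate and mean_depth >= min_mean_depth:
            kept[name] = calls
    return kept

def drop_sparse_individuals(loci, names, max_missing=0.8):
    """Remove individuals with more than `max_missing` missing genotypes."""
    n_loci = len(loci)
    if n_loci == 0:
        return loci, names
    keep_idx = [
        i for i in range(len(names))
        if sum(calls[i] is None for calls in loci.values()) / n_loci <= max_missing
    ]
    filtered = {locus: [calls[i] for i in keep_idx] for locus, calls in loci.items()}
    return filtered, [names[i] for i in keep_idx]

# Toy data: four individuals, five loci; entries are (alt allele count, depth) or None.
names = ["A", "B", "C", "D"]
loci = {
    "snp1": [(0, 12), (1, 10), (2, 8), None],
    "snp2": [(0, 9), (0, 7), (1, 11), None],
    "snp3": [(1, 3), (1, 4), (0, 2), (0, 3)],   # mean depth below 6: removed
    "snp4": [(2, 15), None, None, None],        # called in only 25% of individuals: removed
    "snp5": [(0, 20), (2, 14), (1, 9), None],
}

kept_loci = filter_loci(loci)                                    # step 1: locus filter
kept_loci, kept_names = drop_sparse_individuals(kept_loci, names)  # step 2: individual filter
print(sorted(kept_loci), kept_names)
```

Filtering loci before individuals matters: individual D is only recognised as having excessive missing data once the poorly covered loci have been removed.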

Results
A total of 23,182 SNPs were retained after filtering and used for the subsequent genetic diversity and population structure analyses. The optimal value of K was determined by Evanno's delta K method (Evanno et al, 2005). Two clearly defined main clusters (K=2; Figure 2) and a maximum of 4 clusters (K=4; Figure 2) were revealed. The most likely number of clusters was K=2. These two main clusters are at a great genetic distance from each other and are clustered primarily according to geographical region: one cluster consisted of teak stands originating from Asia and the Pacific and the other of stands originating from South America and Africa. In addition, K=3 and K=4 showed some variation within the two main clusters. At K=3, accessions originating from Côte d'Ivoire and Malaysia, for example, could be distinguished, and at K=4 accessions from the Solomon Islands separated from the other accessions in the Asia-Pacific cluster. The list of provenances, ID and their collected and expected origin (as inferred from the genetic analysis) is shown in Table 1. 'Origin of collection' in this table means how Form Ghana obtained the material and the 'expected origin' in the table refers to the origin to which the material can be traced in the literature. Further analysis of the two main clusters showed more genetic similarity within the South America-Africa cluster, indicating less genetic variability than in the Asia-Pacific cluster, which showed less similarity and hence more genotypic variation, especially among the Asian accessions. This was also illustrated by a principal component analysis based on genetic distance (Figure 3) and the number of polymorphic sites (SNPs) within the two clusters. In the Asia-Pacific cluster, 94% of SNPs were polymorphic compared to 74% in the South America-Africa cluster, even though the latter had more individuals.
In addition, only 1364 private alleles were found in the South America-Africa cluster as compared to 6201 private alleles in the Asia-Pacific cluster. Mean Euclidean-based genetic distance was calculated as 87.25 ± 19.35 within the South America-Africa cluster (green + pink), 117.19 ± 43.9 within the Asia-Pacific cluster (yellow + red), and 161.35 ± 32.35 between these two main clusters (Figure 2). These data show substantially higher levels of genetic variation in the Asia-Pacific cluster and suggest low levels of genetic variation in the South America-Africa cluster.
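How the three diversity measures reported here relate to the genotype data can be shown with a small Python sketch. It assumes biallelic SNPs coded as the number of alternative alleles per individual (0, 1 or 2) and no missing data. The toy genotypes, group labels and function names are hypothetical, and the resulting numbers are not comparable to the values above, which were computed in R on the full dataset.

```python
from itertools import combinations, product
from math import sqrt

def alleles_present(genotypes, locus):
    """Return the alleles ('ref', 'alt') observed at one locus within a group."""
    alleles = set()
    for g in genotypes:
        if g[locus] < 2:
            alleles.add("ref")
        if g[locus] > 0:
            alleles.add("alt")
    return alleles

def polymorphic_fraction(genotypes):
    """Fraction of loci at which both alleles occur within the group."""
    n_loci = len(genotypes[0])
    n_poly = sum(len(alleles_present(genotypes, l)) == 2 for l in range(n_loci))
    return n_poly / n_loci

def private_alleles(group, other):
    """Number of alleles observed in `group` but absent from `other`."""
    n_loci = len(group[0])
    return sum(
        len(alleles_present(group, l) - alleles_present(other, l))
        for l in range(n_loci)
    )

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_distance(pairs):
    distances = [euclidean(a, b) for a, b in pairs]
    return sum(distances) / len(distances)

# Toy genotypes: rows are individuals, columns are four SNP loci.
africa_sam = [[0, 0, 1, 0], [0, 0, 1, 0], [0, 1, 0, 0]]
asia_pac = [[2, 1, 0, 0], [0, 2, 2, 1]]

within_africa_sam = mean_distance(combinations(africa_sam, 2))
within_asia_pac = mean_distance(combinations(asia_pac, 2))
between = mean_distance(product(africa_sam, asia_pac))

print(polymorphic_fraction(africa_sam), polymorphic_fraction(asia_pac))
print(private_alleles(africa_sam, asia_pac), private_alleles(asia_pac, africa_sam))
print(within_africa_sam, within_asia_pac, between)
```

In the toy data, the larger group has the lower polymorphic fraction and no private alleles, which mirrors the qualitative pattern reported for the South America-Africa cluster.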

Discussion
The results in Figure 2 show two main clusters of genetic variation for the 37 teak provenances sampled in the Tain II Forest Reserve. The first cluster mainly consists of teak stands from Africa and South America and the second of provenances from Asia and the Pacific. Data on the number of polymorphic sites in the two clusters and genetic distance within and between the clusters indicate less genetic variation between provenances in the Africa-South America cluster and a high genetic variation between provenances within the Asia-Pacific cluster and especially in the Asia cluster.
Grouping the material shows that material from Indonesia and Africa is closely related which confirms the conclusions of Verhaegen et al (2010) that teak from Ghana and Indonesia could be originating from Laos while teak from other African places can be traced back to North India (Fofana et al, 2008). Together, they form a group that is different from the Thai and South India provenance groups. In this study we can now add the South American provenances to the latter group. The attribution of the Indonesian provenances to Laos was also found by Hansen et al (2017), who unfortunately did not sample from Ghana. The grouping of material from Malaysia and India in one group can be explained by the collection of Indian provenances in provenance trials in Côte d'Ivoire for the establishment of the Malaysian plots (Goh and Monteuuis, 2009). The link between the material from the Solomon Islands and India should not be surprising as the Solomon Islands have no indigenous teak population and their population was built up from foreign material which mostly came from India (Raomae, 2012).
Most teak provenances within the Africa-South America cluster showed less genetic variation in this study which confirms that African teak provenances most likely originate from a limited range in North India and none of the African provenances are from South India (Verhaegen et al, 2010).
Some uncertainty on the exact origin of provenances in our study remains. Attribution to a certain origin as indicated by the structure analysis was based on the genetic relatedness of samples from single trees representing each provenance. Based on this, provenances that were genetically more related were then assigned to the same cluster. However, some provenances originated from mixed clonal seed orchards (Jimira, Kiroka, Sangoué and La Téné) as presented in Table 1. The seed obtained from such seed orchards is potentially more diverse and sampling may have covered only part of the locally present diversity. As a consequence, more sampling in the same population of seedlings from such orchards could potentially also identify genetic material from the other cluster.
Despite the uncertainty of the origin of some provenances, the results show that although imports were made from very different areas in the tropics, the achieved gain in genetic diversity is very limited and reflects that, over time, teak provenances from a limited number of sources have spread over a wide area (Fofana et al, 2008). This also means that at present, new imports of teak seeds into e.g. Ghana mostly do not constitute a new genetic accession added to the gene pool. Before going through the process of obtaining permits and importing seeds from a presumed new accession, it is important to compare its genetic makeup with the existing provenances. It is also important to further investigate the current collection of provenances so that the search for additional genetic material for teak provenance pools in West Africa can be conducted with more focus.
Figure 2. Clustering of teak stands of different provenances. Shown is a dendrogram based on genetic distance (right part) and the different clusters as identified by the structure analysis (green, cluster 1; yellow, cluster 2; pink, cluster 3 and red, cluster 4). The dendrogram was generated by hierarchical clustering (UPGMA) based on genetic dissimilarity. Vertical lines in the dendrogram give the amount of genetic dissimilarity and represent genetic lineages. Each row represents an individual tree per provenance, with the length of the different colour segments representing the proportion of a cluster in an individual's genetic makeup. K = 2-4 indicates the number of genetic clusters that were revealed in this structure analysis from 2 to 4. The most likely number of clusters was K=2.
Our findings emphasise the need for acquiring teak provenances from areas of its original distribution that are high in genetic diversity and are not in the present provenance trial, one such area being South-West India (Hansen et al, 2017) and the semi-moist east coast of India (Hansen et al, 2015). The analysis of Vaishnav and Ansari (2018) indicates that genetic resources in India may be a source for screening resilient superior provenances for improvement strategies for sustainable production of quality timber on a large scale. Various examples exist for the benefit of matching specific provenances to specific local conditions. Indigenous teak populations from Annamalai Hills in the Indian states of Kerala and Tamil Nadu contain well performing provenances for Tanzania (Madoffe and Maghembe, 1988;Pedersen et al, 2007), while a Nilambur provenance from India and a Savannahket provenance from Laos have been assessed as very suitable for Ghana (Adu-Bredu et al, 2019).
Currently it is difficult to obtain accessions from some of the countries containing the high diversity areas, as they have banned the export of seeds and sometimes also of clones of their genetic heritage. It is, for instance, impossible to import seeds from India (Government of India, 2002). This makes it all the more urgent to get a full view of the genetic make-up of trees planted in old (pre-Nagoya protocol) provenance trials such as the series of international provenance trials planted in the 1970s (Keiding et al, 1986). More and more of these trials are lost to felling, e.g. recently the Longuza provenance trial in Tanzania (Wanders, personal observation), and to disaster, as is the case for St. Croix in Puerto Rico, which was part of the series of international provenance trials set up by DANIDA and was destroyed by hurricanes (Morgan, personal communication, 2016). The original series of international provenance trials by DANIDA contained 75 provenances which were under test at over 50 locations, with 41 original teak provenances originating from the natural range of teak (Keiding et al, 1986). These trials potentially remain a very important source of genetic variety for any breeding programme (Koskela et al, 2014; Adu-Bredu et al, 2019) and their conservation should be a high priority. As the climate is changing and forestry has to adapt to conditions becoming either wetter or drier, the need to access a wider range of genetic material may become more and more important in tropical forestry (Koskela et al, 2014).
At present, the Nagoya Protocol on access to genetic resources and benefit sharing (ABS) (CBD, 2011) could govern the sharing of benefits resulting from exchanges of genetic material in a more structured and mutually beneficial manner. It is not yet clear if the signing of the Nagoya protocol will make it possible to again obtain seeds from countries having interesting genetic resources, but prohibiting export of seeds and other propagation materials. Documents that need to be elaborated per seedlot, such as the Prior Informed Consent and benefit sharing agreement, create barriers that need urgent addressing at supranational level. Koskela et al (2014) provide an insight into the amount of paperwork necessary in order to plant a provenance trial, which is another argument to carefully conserve pre-Nagoya planted trials and exchange genetic material from these. The amount of work going into the drafting and signing of ABS and mutually agreed terms (MAT) may make it worthwhile to engage in only for commercially high returning crops.
In international forestry, non-profit initiatives to exchange seeds exist. One of these, CAMCORE (https://camcore.cnr.ncsu.edu/), has done excellent work on the collection and distribution of seeds for broad testing of Pinus and Eucalyptus species. CAMCORE has organised expeditions for the collection of seeds of species interesting for forestry and tree breeding and distributed these seeds to be planted in trials at member organisations and companies. CAMCORE has recently also started work on teak (Hodge et al, 2019). The membership fee for this organisation, however, is not affordable for all organisations involved in plantation development. More exchange would certainly improve the possibilities of increasing the gene pool for teak breeding.

Conclusion
As our work has shown, it can be difficult to have access to diverse genetic materials. With the uncertainties about the long-term fitness of currently available genetic material under climate change and possible disease vulnerability, having access to genetic diversity is becoming increasingly important. Because not all genetic diversity has a direct commercial interest, the creation and maintenance of a national gene bank or national collection (NCCPG, 2007;FAO, 2014) should be a national priority. Conservation of teak genetics in Thailand has been described by Kaosa-Ard et al (1998) and Graudal et al (1999). In India genebanks have also been created such as the National Teak Germplasm Bank in Chandrapur (Maharashtra) whose genetic diversity has been analysed (Mahesh et al, 2016). Lack of formal protection of tree genetic resources can cause genetic material to be lost unnoticed. Through cooperation between the countries that took part in past international provenance trials on teak, each participant country could, through exchanges, build up a collection of most, if not all, accessions of teak originally distributed. This should be done in addition to addressing of international barriers for exchange of genetic material of teak from its original range mentioned earlier. A national teak genebank collection for Ghana (and other African teak producing countries) would then become an excellent centre for the distribution and conservation of genetic material. The facility managing such a collection should become a member of CAMCORE or a similar organisation to further facilitate exchange.
contribution to the analysis of the findings. This study also received co-funding from Wageningen University.

Author contributions
Tieme Wanders, Philippine Vergeer, James Ofori and Elmar Veenendaal conceived and planned the experiments; James Ofori, Alexander Amoako and Tieme Wanders collected data; Philippine Vergeer, Niels Wagemaker, Maarten Postuma, James Ofori, Tieme Wanders and Elmar Veenendaal contributed to analysis and interpretation of the results; Tieme Wanders took the lead in writing the manuscript. All authors provided critical feedback and helped shape the research, analysis and manuscript.