# Structural annotation

 - EGN-EP 2.1.1
 - Transcript Datasets: transcriptome assemblies used for XRQv2 annotation
 - gmap -n2 for unphased heterozygous genomes
 
# Functional annotation
 - blast2go v1.5.1
 - e2p2 v4
 - iprscan 5.67-99.0
 - iTAK 1.7a
 - kofamscan 1.3.0
 - PlantTFCat 2014
 - UniProt:"Helianthus annuus" 20240507
 - NR2024:"Embryophyta"

## Statistics

#ANNOTATION	mRNA	ncRNA	tRNA	rRNA
Hann83HR4RM-20240117-HIFIASM	52434	8163	1682	6269
HannANN1064-20240117-r1	139944	23562	3436	4412
HannANN1438-20240117-r1	140378	18479	3415	2650
HannANN1510-20240117-r1	140128	24318	3614	5702
HannANN1184-20240117-r1	140693	23350	3430	9304
HannANN1430-20240117-r1	146691	22128	3671	7135
				
#ANNOTATION COMPLETION (BUSCO EMBRYOPHYTA)	Complete	Single Copy	Duplicated	Fragmented	Missing
Hann83HR4RM-20240117-HIFIASM	96.7%	86.5%	10.2%	1.7%	1.6%
HannANN1064-20240117-r1	97.8%	7.5%	90.3%	1.1%	1.1%
HannANN1438-20240117-r1	97.9%	4.1%	93.8%	1.2%	0.9%
HannANN1510-20240117-r1	97.8%	5.0%	92.8%	1.1%	1.1%
HannANN1184-20240117-r1	97.7%	6.6%	91.1%	1.2%	1.1%
HannANN1430-20240117-r1	98.4%	6.3%	92.1%	0.8%	0.8%
				
#FUNCTIONAL ANNOTATION	Proteins with at least one Blast2GO GO term	Proteins with EC Number
Hann83HR4RM-20240117-HIFIASM	38404	19445
HannANN1064-20240117-r1	89577	51978
HannANN1438-20240117-r1	90211	53218
HannANN1510-20240117-r1	89996	52885
HannANN1184-20240117-r1	81545	52668
HannANN1430-20240117-r1	85191	55392
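
The statistics above are tab-separated blocks, each introduced by a header line starting with `#`. The following is a minimal sketch of how such blocks could be loaded for downstream use; it is not part of the annotation pipeline. The helper name `parse_stat_blocks` and the `STATS` excerpt variable are assumptions made for illustration, and only a few rows from the tables are reproduced.

```python
import csv
import io

# Excerpt copied from the statistics above (tab-separated, headers start with '#').
STATS = (
    "#ANNOTATION\tmRNA\tncRNA\ttRNA\trRNA\t\n"
    "Hann83HR4RM-20240117-HIFIASM\t52434\t8163\t1682\t6269\t\n"
    "HannANN1064-20240117-r1\t139944\t23562\t3436\t4412\t\n"
    "\t\t\t\t\n"
    "#FUNCTIONAL ANNOTATION\tProteins with at least one Blast2GO GO term"
    "\tProteins with EC Number\t\t\t\n"
    "Hann83HR4RM-20240117-HIFIASM\t38404\t19445\t\t\t\n"
)


def parse_stat_blocks(text):
    """Return {table name: {annotation: {column: value}}} (hypothetical helper)."""
    tables = {}
    columns = []
    current = None
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        cells = [c.strip() for c in row]
        # Drop empty trailing cells left over from the spreadsheet export.
        while cells and cells[-1] == "":
            cells.pop()
        if not cells:
            continue  # separator line between tables
        if cells[0].startswith("#"):
            # A '#' line opens a new table; its remaining cells are column names.
            current = tables.setdefault(cells[0].lstrip("#").strip(), {})
            columns = cells[1:]
        elif current is not None:
            current[cells[0]] = dict(zip(columns, cells[1:]))
    return tables


tables = parse_stat_blocks(STATS)
print(tables["ANNOTATION"]["HannANN1064-20240117-r1"]["mRNA"])
```

Values are kept as strings so that counts and percentages can be converted by the caller as needed.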


#
# HiFi + Hi-C assemblies
#

 - EGN-EP 2.2.1
 - Annotation haplotype by haplotype
 - Helixer training
 - Transcript Datasets: transcriptome assemblies used for XRQv2 annotation
 - gmap -n0 (one copy / haplotype)
 
# Functional annotation
 - blast2go v1.5.1
 - e2p2 v4
 - iprscan 5.71-102.0
 - iTAK 1.7a
 - kofamscan 1.3.0
 - PlantTFCat 2014
 - UniProt:"Asteraceae" 20241127
 - NR2024:"Viridiplantae"

# busco 5.6.1 vs. embryophyta_odb10						
						
Genome	Haplotype	Complete	Single Copy	Duplicated	Fragmented	Missing
HannANN1430	H1	96.9%	86.0%	10.9%	1.6%	1.5%
HannANN1430	H2	96.8%	85.8%	11.0%	1.8%	1.4%
HannANN1184	H1	96.7%	85.6%	11.1%	2.0%	1.3%
HannANN1184	H2	96.8%	86.2%	10.6%	1.5%	1.7%
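
In BUSCO output, Complete is the sum of Single Copy and Duplicated, and Complete, Fragmented and Missing together account for 100% of the searched orthologs. A small sketch of how the per-haplotype rows above could be checked against those two identities is given below. The function name `busco_inconsistencies` and the rounding tolerance are assumptions chosen for this example.

```python
# (genome, haplotype, complete, single copy, duplicated, fragmented, missing),
# all in percent, copied from the table above.
BUSCO = [
    ("HannANN1430", "H1", 96.9, 86.0, 10.9, 1.6, 1.5),
    ("HannANN1430", "H2", 96.8, 85.8, 11.0, 1.8, 1.4),
    ("HannANN1184", "H1", 96.7, 85.6, 11.1, 2.0, 1.3),
    ("HannANN1184", "H2", 96.8, 86.2, 10.6, 1.5, 1.7),
]

# Assumed allowance for values rounded to one decimal place.
TOLERANCE = 0.11


def busco_inconsistencies(rows, tol=TOLERANCE):
    """List rows that break Complete = Single + Duplicated or C + F + M = 100."""
    problems = []
    for genome, hap, complete, single, dup, frag, missing in rows:
        if abs(complete - (single + dup)) > tol:
            problems.append((genome, hap, "Complete != Single Copy + Duplicated"))
        if abs(complete + frag + missing - 100.0) > tol:
            problems.append((genome, hap, "categories do not sum to 100%"))
    return problems


problems = busco_inconsistencies(BUSCO)
print(problems or "all rows consistent")
```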
						
						
# egn-ep 2.2.1 --training_use_helixer						
						
Genome	mRNA	ncRNA	tRNA	rRNA		
HannANN1430	152971	18108	3483	11924		
HannANN1184	153814	19098	3525	14348		
