Constrained neighbour joining


Hi all,

In the bootscanning procedure of the Rega genotyping tool, PAUP* is used to perform neighbour joining with constraints, for example:

begin paup;
    log file=paup.log replace=yes;
    export format=nexus file=paup.nex replace=yes;
    set rootmethod=midpoint;
    set criterion=distance;
    dset distance=HKY NegBrLen=SetAbsVal;
    constraints CONSTRE=((2,3,4),((5,6,7),(11,12,13)),(8,9,10),(14,15),(16,17,18),(19,20,21),(22,23,24),(25,26));
    NJ constraints=CONSTRE enforce=yes;
    savetree format=nexus brlens=yes file=paup.tre replace=yes;
    bootstrap nreps=100 search=nj;
end;
quit;

Does anyone know what's going on under the hood here?

Best,
Simon


Sorry, I don’t know for sure. To be certain, you should ask Tulio de Oliveira or one of his REGA coworkers who designed the tool. But I can make a pretty good guess. I am assuming you are talking about HIV-1 genotyping and not HCV or another virus. In the HIV-1 subtype reference set, the sequences of subtypes B and D are more closely related to each other than to the other subtypes, and the rest form essentially a star phylogeny. The constraint tree puts B and D together on one branch: ((5,6,7),(11,12,13)).

The tool is looking to place each query sequence into one of the clades of the HIV-1 M group, and to check whether it is within that clade or outside of it. So it makes sense to constrain the tree to have those clades from the beginning.
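To see which groups the constraint actually fixes, here is a small Python sketch that parses the constraint tree from the PAUP block and lists every enforced clade. It is only an illustration, not part of the REGA tool: the helper names are made up, and it assumes the taxa are numbered 1 to 26 (26 is simply the highest number in the constraint). Any taxon that appears in no group is free to attach anywhere in the NJ tree, which is what you would expect for the query sequence, though that reading is also a guess.

```python
# Constraint tree copied from the PAUP block in the question.
CONSTRAINT = ("((2,3,4),((5,6,7),(11,12,13)),(8,9,10),(14,15),"
              "(16,17,18),(19,20,21),(22,23,24),(25,26))")


def parse_constraint(text):
    """Parse a parenthesised list of integer taxon numbers into nested tuples."""
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # skip "("
            children = [node()]
            while text[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos].isdigit():
            pos += 1
        return int(text[start:pos])

    return node()


def leaves(tree):
    """Return the set of taxon numbers below a node."""
    if isinstance(tree, int):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree, is_root=True):
    """Yield the taxon set of every group below the root."""
    if isinstance(tree, int):
        return
    if not is_root:
        yield leaves(tree)
    for child in tree:
        yield from clades(child, is_root=False)


tree = parse_constraint(CONSTRAINT)
groups = sorted(clades(tree), key=lambda g: (min(g), len(g)))
for group in groups:
    print(sorted(group))

# Assumption: taxa are numbered 1..26; whatever is missing is unconstrained.
unconstrained = set(range(1, 27)) - leaves(tree)
print("not in any constraint group:", sorted(unconstrained))
```

Under that numbering assumption, taxon 1 is the only one left out of the constraint, and the only nested group is the one joining 5-7 with 11-13, which would be the B and D branch if the guess above is right.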

The newer version of the REGA tool has a more extensive list of subtypes and sub-subtypes of the HIV-1 M group:

Subtype A1: A1_UA_01_01UADN139_DQ823357, A1_RU_00_RU00051_EF545108, A1_UG_UG031_AB098330
Subtype A2: A2_CD9797CDKTB48, A2_CY9494CY01741
Subtype B: B_US_1986_5019_86_AY835780, B_US_90_US2_AY173953, B_BR_03_BREPM1028_EF637053
Subtype C: C_ZA_98_TV001_AY162223, C_IN_x_VB39_EF694033, C_IL_99_99ET1_AY255823
Subtype D: D_UG_99_99UGG35093_AF484495, D_UG_98_98UG57143_AF484514, D_ZA_84_R2_AY773338
Subtype F1: F1_ES_x_X1670_DQ979024, F1_BR_89_BZ126_AY173957
Subtype F2: F2_CM_95_MP255_AJ249236, F2_CM_97_CM53657_AF377956
Subtype G: G_NG_01_01NGPL0669_DQ168576, G_ES_05_ES_EU786670, G_GH_2003_GHNJ175_AB231893
Subtype H: H_CF_90_056_AF005496, H_BE_93_VI997_AF190128, H_BE_93_VI991_AF190127
Subtype J: J_SE_93_SE7887_AF082394, J_SE_1994_SE7022_AF082395_1, J_CD_97_J_97DC_KTB147_EF614151
Subtype K: K_CD_97_EQTB11C_AJ249235, K_CM_96_MP535_AJ249239