Peptide retention time prediction for immobilized artificial membrane phosphatidylcholine stationary phase: method development and preliminary observations

Development of the first peptide retention prediction model for the immobilized artificial membrane phosphatidylcholine (IAM.PC) stationary phase is reported. 2D liquid chromatography coupled to tandem mass spectrometry (2D LC-MS/MS) analysis of a whole cell lysate of S. cerevisiae yielded a retention dataset of ~29,500 tryptic peptides, sufficient for confident assignment of the retention coefficients that determine the contribution of individual amino acids to peptide retention. Retention data from the first dimension were used for the modelling: an IAM.PC.DD2 column with pH 7.4 ammonium bicarbonate and a water/acetonitrile gradient. Peptide separation using the IAM.PC.DD2 phase was compared to a standard C18 phase (Luna C18(2)). There was a significant reduction in peptide retention (~14 % acetonitrile on average), indicating that the phosphatidylcholine stationary phase is significantly more hydrophilic. In comparison to the C18 phase, a substantial increase was found in the relative retention contribution of the positively charged Arg and Lys and the aromatic Tyr, Trp and His residues. A decrease in retention contribution was observed for the negatively charged Asp and Glu. This indicates an involvement of electrostatic interactions with the glycerophosphate functional groups and, possibly, of delocalization effects from hydrogen bonds between the phosphate group and the aromatic side chains in the separation mechanism.


Introduction
Modern applications of chromatography have spread far beyond its original role as a method for preparative separations. Years of development have established chromatography as a leading analytical technique covering virtually all fields of analytical chemistry. The contribution of high-performance liquid chromatography (HPLC) to the study of the physicochemical fundamentals of interactions in heterogeneous biological systems is also well appreciated. Establishing hydrophobicity scales of amino acids [1] to support original hydrophobicity measurement data obtained by X-ray crystallography [2], and studying interactions of various substances in systems mimicking real biological environments [3], represent some of the most interesting fundamental biochemical applications of chromatography. The latter, now known under the broad term of biomimetic chromatography, has found applications in developing fast assays to determine lipophilicity, protein binding, and phospholipid binding, all critically important parameters in drug design. Measuring the retention properties of molecules on a C18 reversed-phase support provides information on the hydrophobicity of these molecules. Similar measurements on phospholipid-modified phases such as immobilized artificial membrane phosphatidylcholine (IAM.PC) provide more biologically relevant information on phospholipid binding, i.e. cell membrane permeability [4].
Reversed-phase HPLC has been pinpointed as a powerful technique for elucidating the hydrophobicity contribution of individual amino acids [5] and the interacting domains between the peptide and hydrophobic surfaces [6]. Houghten [6][7][8] and Hodges [9][10][11] were the first to address the influence of amphipathic helicity in the stabilization of helical peptides upon contact with a hydrophobic surface, a key mechanism in the action of antimicrobial peptides [11]. Studies such as these are a perfect example of bridging the gap between HPLC as a method of physicochemical study and drug development. All of these efforts originated from the first attempts to model and predict peptide behaviour in reversed-phase liquid chromatography (RPLC) [12,13]. The goal of the early modelling studies was to simplify method development for peptide analytical HPLC with UV detection. The arrival of high-throughput proteomic technology led to the expansion of these applications into protein/peptide identification [14], development of quantitative LC-MS methods [15], and the guided design of multi-dimensional peptide separation systems [16]. Proteomics has provided a significant increase in the size of peptide retention datasets, paving the way for the development of the first sequence-specific peptide retention prediction models [17][18][19].
The majority of retention modelling studies in the proteomics era targeted RPLC separations with formic acid as the ion-pairing modifier. Our Sequence-Specific Retention Calculator (SSRCalc) model has been a benchmark tool in this field since 2004 (available on-line at http://hs2.proteome.ca/SSRCalc/SSRCalcQ.html); it followed the same trend, with the addition of models for trifluoroacetic acid [19] and high pH reversed-phase [20]. Other peptide separation modes have largely been excluded because of the poor compatibility of the eluents with ESI-MS. However, the last few years have witnessed an expansion of prediction studies into other separation mechanisms. In 2017 our laboratory applied SSRCalc methodology to Hydrophilic Interaction Liquid Chromatography (HILIC) [21], Strong-Cation Exchange (SCX) [22], Strong-Anion Exchange (SAX) (manuscript in preparation), and Capillary Zone Electrophoresis (CZE) [23]. All of the listed prediction models that we have generated are the most accurate models reported for the respective modes of peptide separation. Application of 2D LC-MS/MS to complex tryptic digests was the key innovation allowing collection of large retention datasets for peptide separation modes that are not compatible with on-line ESI-MS detection [20,21]. Standard RP (formic acid) LC-MS/MS was used in the second dimension as a "standard detection device", while the first-dimension separation information was used as a modelling dataset: e.g. HILIC-RPLC-MS for HILIC models, SCX-RPLC-MS for SCX, etc.
We are not aware of any peptide retention modelling studies on chromatographic supports for biomimetic applications. Such models would provide a significant advantage by expanding predictive approaches to peptide-based drug design. Having extensive experience in peptide retention modelling in various separation systems, we concluded that the use of 2D IAM.PC-RPLC to study peptide separation in biomimetic applications would be the first step in this direction. The goal of this study was to establish a large-scale retention data collection protocol for peptide separation on the IAM.PC.DD2 stationary phase and to gain a first insight into the peptide retention mechanism in this system. This was done by developing a peptide retention prediction model and assigning the contributions of individual amino acids to peptide retention.

Experimental
Experimental procedures were identical to the previously reported modelling studies of HILIC and SCX [21,22], except for the chromatographic parameters (columns and eluents) in the first-dimension separation. Overall, the procedure (Figure 1) included a tryptic digestion of the whole cell S. cerevisiae extract, first-dimension separation in reversed-phase mode using a Luna C18(2) or an IAM.PC.DD2 column, fraction collection, and LC-MS/MS analysis of the individual fractions followed by peptide identification using the X!Tandem search algorithm.

First dimension separation
An Agilent 1100 series HPLC system fitted with a UV detector (214 nm) and a 100 µL injection loop, operating at a 150 µL/min flow rate, was used for separations of standard peptide mixtures and the complex digest. Identical gradients of 1 % acetonitrile per minute were used for both columns. Eluent A consisted of 20 mM ammonium bicarbonate in water. A 200 mM ammonium bicarbonate stock solution was diluted 10 times and the pH was adjusted with formic acid to 7.4. Eluent B consisted of 20 mM ammonium bicarbonate, pH 7.4, in 70/30 acetonitrile/water. The gradient program included the following steps: a linear increase from 0 to 71.4 % B in 50 min, a 10 min wash with 90 % B, and a 30 min equilibration with 100 % A. One-minute fractions were collected within the expected interval of peptide elution.
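The stated slope of 1 % acetonitrile per minute follows from the composition of eluent B (70 % acetonitrile) and the 0 to 71.4 % B ramp over 50 min. The short Python sketch below reproduces that arithmetic; the function names are illustrative, and the calculation assumes ideal volumetric mixing and ignores system dwell volume.

```python
def acetonitrile_percent(percent_b: float, acn_in_b: float = 70.0,
                         acn_in_a: float = 0.0) -> float:
    """Acetonitrile content (% v/v) of the mixed eluent at a given % B.

    Assumes ideal volumetric mixing of eluents A and B.
    """
    frac_b = percent_b / 100.0
    return acn_in_a * (1.0 - frac_b) + acn_in_b * frac_b


def gradient_slope(start_b: float, end_b: float, duration_min: float,
                   acn_in_b: float = 70.0) -> float:
    """Slope of a linear % B ramp, expressed in % acetonitrile per minute."""
    delta_acn = (acetonitrile_percent(end_b, acn_in_b)
                 - acetonitrile_percent(start_b, acn_in_b))
    return delta_acn / duration_min


# First-dimension program: 0 to 71.4 % B in 50 min, eluent B = 70 % acetonitrile.
final_acn = acetonitrile_percent(71.4)
first_dim_slope = gradient_slope(0.0, 71.4, 50.0)
print(f"final composition: {final_acn:.2f} % acetonitrile")
print(f"slope: {first_dim_slope:.3f} % acetonitrile per minute")
```

Because the slope is very close to 1 % per minute, retention expressed in minutes and in % acetonitrile is numerically almost the same for this gradient.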
Fractions were lyophilized and re-suspended in 30 µL of 0.2 % formic acid in water and spiked with ~200 fM of standard peptides P1-P6 for retention time alignment purposes. One-third of each collected fraction (10 µL) was injected in the second dimension.

Second dimension LC-MS/MS
Second dimension LC-MS/MS was performed with a standard data-dependent acquisition protocol using a 2D LC Ultra system (Eksigent, Dublin, CA) and a TripleTOF5600 mass spectrometer (Sciex, Concord, ON) as described [19]. LC settings featured a 100 µm x 200 mm analytical column packed with 3 µm Luna C18(2) (Phenomenex, Torrance, CA) and a 300 µm x 5 mm PepMap 100 trap-column (Thermo Fisher). A 500 nL/min flow rate was used with a gradient of ~0.4 % acetonitrile per minute. Both buffers A (water) and B (acetonitrile) contained 0.1 % formic acid. The gradient program consisted of the following steps: a linear increase from 0.4 to 31 % buffer B (acetonitrile) in 77 minutes, 5 minutes at 80 % B, and then 8 minutes at 0.4 % B for column equilibration (90 min total analysis time).

Data Analysis and retention time assignment
X!Tandem's search algorithm was used with the following parameters: 20 ppm and 50 ppm mass tolerance for parent and daughter ions, respectively; constant modification of Cys with iodoacetamide. All identified tryptic non-modified peptides (log (e) < -1) were additionally filtered using retention time prediction in the second dimension. Retention times in the first dimension were assigned as equal to the fraction number in which the peptide was found. When the peptide signal was distributed between two or more fractions, the intensity-weighted average fraction number was used.
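The retention assignment rule described above can be written as a small calculation. The following Python sketch is an illustration only: the function name, the input layout (fraction number mapped to summed peptide intensity) and the example intensities are assumptions, not part of the reported pipeline.

```python
def weighted_fraction(intensities: dict[int, float]) -> float:
    """Intensity-weighted average fraction number for one peptide.

    `intensities` maps a first-dimension fraction number to the peptide
    signal intensity observed in that fraction. A peptide found in a single
    fraction is simply assigned that fraction number.
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return sum(fraction * signal for fraction, signal in intensities.items()) / total


# Hypothetical peptide detected in fractions 12 and 13, mostly in fraction 13.
split_peptide = weighted_fraction({12: 3.0e5, 13: 9.0e5})
# Hypothetical peptide detected in a single fraction.
single_peptide = weighted_fraction({20: 4.2e5})
print(split_peptide, single_peptide)
```

With one-minute fractions and a gradient of about 1 % acetonitrile per minute, the resulting value can be read either as a retention time in minutes or as an approximate % acetonitrile.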

Results and Discussion
Selection of peptide reversed-phase separation conditions on C18 and IAM.PC.DD2 phases at pH 7.4.
The majority of peptide retention time modelling studies have been performed using acidic eluent conditions (usually formic acid), a standard setting in proteomic LC-MS. However, biomimetic separations usually use physiological pH to maximize the similarity between biological systems and the artificial biphasic separation environment. We decided to employ an ammonium bicarbonate-based buffer at pH 7.4 and performed separations on both the IAM.PC.DD2 phase and a standard C18 phase for comparison.
Figure 2 (A, B) shows the separation of a standard mixture of 6 peptides on these two columns. The P1-P6 peptide mixture was designed to cover the entire hydrophobicity range for tryptic peptides, i.e. the elution window of the reversed-phase separations. At acidic eluent conditions (0.1 % trifluoroacetic acid) they elute between 4 (P1) and 29.6 (P6) % acetonitrile (min) [24], very similar to the ~26 % acetonitrile (min) retention window on the C18 phase at pH 7.4 (Figure 2A). Peptide retention on the IAM.PC phase is significantly lower (Figure 2B). Peptides P1-P3 are not retained, while the retention time decrease for P4-P6 was ~16.7 min (% acetonitrile) on average compared to C18. A significant decrease in separation efficiency for IAM.PC.DD2 was also obvious and is likely due to the introduction of mixed-mode interactions on the phosphatidylcholine phase and the larger particle size.
Most tryptic peptides are expected to elute in the range between 5 and 45 min from the Luna C18(2) column under the chromatographic conditions used. Based on preliminary experiments with standard peptides, the elution window for IAM.PC was expected to be smaller. With this in mind, we performed separations of complex S. cerevisiae digests (~150 µg, Figure 2(C,D)) and collected fractions up to 45 min for each run. As expected, both chromatograms showed no well-resolved chromatographic peaks due to the extremely high complexity of the mixture. At the same time, the overall retention profiles of tryptic peptides in these two systems were quite different. The majority of peptides were retained on the C18 column and eluted as a typical bell-shaped profile within a 5-40 min window (Figure 2C). The separation on the IAM.PC.DD2 phase exhibited a significant "break-through": a very high peak at the beginning of the chromatogram containing peptides that were not retained under the starting gradient conditions, similar to P1-P3 in Figure 2B.

Identification outputs for both 2D LC-MS/MS runs
Identification outputs for both 2D LC-MS/MS runs are shown in Table 1, indicating significantly higher redundancy in identification for the IAM.PC.DD2 run. Due to the lower separation efficiency in the first (IAM.PC.DD2) dimension, individual peptides were distributed through a larger number of fractions. This led to the acquisition of a larger number of MS/MS spectra and more identified spectra, but a lower number of unique peptide IDs. Figure 3A shows the correlation between retention time (fraction number) on the two columns. As expected, a significant portion of the peptides that show moderate retention on Luna C18 are not retained on IAM.PC.DD2 and thus elute in the early fractions.
* These numbers include ~10 % peptides with post-translational modifications (the default X!Tandem PTM settings were used: methionine oxidation, deamidation, and N-terminal cyclization of Cys and Gln), which were excluded from the retention modelling.

Optimization of peptide retention prediction models
The optimization of the peptide retention prediction models was performed using the standard SSRCalc workflow [21,22]: 1) retention coefficients for individual amino acids were optimized to produce the best fit for the experimental vs. predicted retention plot using an additive model with peptide length correction; 2) position-dependent retention coefficients were introduced for four terminal positions from each terminus; 3) sequence-dependent corrections related to peptide helicity and the presence of hydrophobic clusters were applied in an attempt to improve correlations. It should be noted that modelling peptide helicity in reversed-phase separations still represents a major problem and has not been fully implemented in the SSRCalc model. In this work, we used our helicity model developed for acidic C18 conditions and applied it directly to C18 at pH 7.4. Its application to IAM.PC.DD2 data (not shown here) did not provide a significant improvement. Therefore, we applied its simplified version (counting i to i+3 and i to i+4 interactions of hydrophobic residues) to the phospholipid phase data. The resulting correlation plots for Luna C18(2) and IAM.PC.DD2 are shown in Figure 3B and 3C, respectively. The final model accuracy for the C18 packing material was found to be similar to other SSRCalc models for C18 separations (R² value of 0.96). It should be noted that only peptides with a retention of 3 min and higher were used for the IAM.PC.DD2 model development. Peptides with lower retention were considered unretained under the chromatographic conditions used, and therefore their retention values could not be accurately assigned. As a result, out of 37,327 non-modified tryptic peptides identified, only 28,558 were used for modelling, as shown in Figure 3C. The accuracy of the IAM.PC.DD2 algorithm is lower than that for the C18 column due to a narrower range of peptide elution (~30 % acetonitrile vs. ~40 %) and the possible involvement of novel sequence-specific features of retention on the phosphatidylcholine stationary phase, yet to be discovered.
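The simplified helicity feature mentioned above, counting i to i+3 and i to i+4 pairs of hydrophobic residues, can be sketched in a few lines of Python. This is only an illustration of the counting idea: the set of residues treated as hydrophobic and the function name are assumptions, and the text does not specify how SSRCalc weights or uses the resulting count.

```python
# Residues treated as hydrophobic here; this set is an assumption made for
# illustration and is not taken from the SSRCalc model.
HYDROPHOBIC = frozenset("LIFWVMY")


def count_hydrophobic_pairs(sequence: str, spacings: tuple[int, ...] = (3, 4)) -> int:
    """Count residue pairs (i, i+3) and (i, i+4) where both are hydrophobic.

    In an alpha-helix such pairs sit on the same face, so a high count is a
    crude indicator of a potential amphipathic helix.
    """
    seq = sequence.upper()
    count = 0
    for i, residue in enumerate(seq):
        if residue not in HYDROPHOBIC:
            continue
        for step in spacings:
            j = i + step
            if j < len(seq) and seq[j] in HYDROPHOBIC:
                count += 1
    return count


# Hypothetical sequences: L at positions 0, 3 and 4 gives the pairs (0,3) and (0,4).
helical_like = count_hydrophobic_pairs("LAALLAAK")
no_pairs = count_hydrophobic_pairs("GGGGGK")
print(helical_like, no_pairs)
```

A count of this kind could then enter a retention model as one additional sequence-dependent term alongside the additive residue contributions.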

Retention coefficients (Rc)
Retention coefficients (Rc) represent a measure of the participation of individual amino acids in peptide retention on different chromatographic columns. SSRCalc models encode Rc for individual residues in a position-dependent manner, with four to five N- and C-terminal Rc values and internal Rc values. Since tryptic peptides are fairly large, the latter represent the bulk of the residues and provide the most valuable information on the contribution of the residues. Figure 4A shows the comparison of internal retention coefficients for C18 and IAM.PC.DD2 separations at pH 7.4 and establishes the difference in retention contributions of amino acids. Table 2 additionally compares these values to RP separations at acidic and basic conditions. When analyzing Figure 4A, both the hydrophobic character and the charge state of the residues should be taken into account. The amino acids usually considered to be hydrophobic are shown in red and the hydrophilic ones in green. At pH 7.4 Arg and Lys are protonated, His is neutral, while Asp and Glu carry a negative charge. Positively charged Lys and Arg and the aromatic Tyr, His, and Trp are among the residues that showed an increase in interaction on IAM.PC.DD2. At the same time, negatively charged Glu and Asp exhibit reduced interaction. This suggests that additional electrostatic interactions with glycerophosphate groups play a major role in the separation through the attraction of Lys and Arg and the repulsion of Asp and Glu. The increased interaction of aromatic residues on IAM.PC.DD2 is most likely caused by the delocalization of the negative charge on the phosphate groups through the formation of a hydrogen bond.
The contribution of the different amino acids to peptide retention is often found to be position dependent [17,21,22]. These effects occur due to various mechanisms, such as ion-pairing at the positively charged N-terminus [17] or the peptide orientation effect [22] characteristic of cation-exchange peptide separations. Optimizing position-dependent Rc values is a mandatory procedure for all SSRCalc models and was applied in this study. Figure 4(B,C) shows nine position-dependent Rc values (four on each side plus internal) for selected residues: hydrophobic, negatively charged, and positively charged. IAM.PC.DD2 does not exhibit significant position-dependent changes except for a small decrease of hydrophobic interactions from the N- to the C-terminus (Figure 4C). The respective plots for the C18 stationary phase show a substantial increase in the retention contribution for Arg and Lys and increased retention of the hydrophobic residues for the internal positions (Figure 4B).
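To make the "four on each side plus internal" scheme concrete, the Python sketch below sums position-dependent coefficients over a peptide. It is a minimal illustration, not the SSRCalc implementation: the coefficient values are invented, the class labels N1-N4 and C1-C4 are naming assumptions, N-terminal classes take precedence for peptides shorter than eight residues (also an assumption), and the peptide length correction and helicity terms are omitted.

```python
def position_class(index: int, length: int, n_term: int = 4) -> str:
    """Label a residue position as N1..N4, C1..C4 (C1 = last residue) or internal."""
    if index < n_term:
        return f"N{index + 1}"
    from_c_terminus = length - 1 - index
    if from_c_terminus < n_term:
        return f"C{from_c_terminus + 1}"
    return "internal"


def additive_retention(sequence: str, coefficients: dict[str, dict[str, float]]) -> float:
    """Sum position-dependent retention coefficients over all residues.

    A residue without a coefficient for its terminal class falls back to its
    internal coefficient.
    """
    total = 0.0
    for i, residue in enumerate(sequence):
        table = coefficients[residue]
        total += table.get(position_class(i, len(sequence)), table["internal"])
    return total


# Illustrative values only; these are NOT fitted SSRCalc coefficients.
demo_rc = {
    "A": {"internal": 1.0},
    "L": {"internal": 5.0, "N1": 4.0},
    "K": {"internal": -1.0, "C1": 0.5},
}
score = additive_retention("LAAAALAAK", demo_rc)
print(score)
```

In a fitted model, each residue would carry all nine coefficients, and the sum would be followed by the length and sequence-dependent corrections described in the optimization workflow.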
Comparing the retention contributions on the Luna C18(2) and IAM.PC.DD2 columns requires understanding the differences in chemistry between the two stationary phases. IAM.PC.DD2 is more hydrophilic because of its shorter aliphatic chain (C14 vs. C18), hydrophilic linkers, and the presence of a zwitterionic head group. The positively charged choline group is located on the outside of the functional layer, while the negatively charged glycerophosphate is positioned below it, separated by two additional methylene groups [4]. A peptide has to penetrate this zwitterionic bilayer to be partitioned into the hydrophobic environment. Hydrophobic residues are found to have greater Rc values in Figure 4A, suggesting that the majority of the separation is driven by hydrophobic interaction on both stationary phases. The differences in observed retention contributions between C18 and IAM.PC.DD2 therefore have to come from the interactions of the peptides with the zwitterionic head group, and the relative changes in retention for aromatic and charged residues support this. We observe that the negatively charged amino acids Asp and Glu have decreased retention, and the positively charged Arg and Lys have increased retention, on IAM.PC.DD2 compared to the C18 stationary phase. This suggests the involvement of electrostatic interactions (repulsion/attraction) with the negatively charged glycerophosphate groups.
Consider next the behaviour of the aromatic amino acids (Trp, Tyr, His, and Phe), all of which, except for Phe, have larger Rc values on IAM.PC.DD2. Aromatic rings contain conjugated pi-systems, which possess electronegative character. We conclude that these residues increase in retention because of their interaction with the phosphate group. Tyr, Trp, and His all contain nitrogen or oxygen bonded to a hydrogen in their side chains, connected to their aromatic rings, while Phe does not. The partial positive charge of the hydrogen allows a hydrogen bond to form between the side chain and the phosphate. This, in turn, delocalizes the negative charge of the phosphate across the entire aromatic ring, which is known to provide a stabilizing effect and thus would increase retention. Because Phe cannot form such a hydrogen bond, it behaves like the other hydrophobic residues and does not exhibit an increased contribution. The delocalization effect is further supported by the relatively large increase in retention of Arg in comparison to Lys. Although Lys and Arg have nearly identical Rc values on the C18 phase, on the IAM.PC.DD2 phase the retention of Arg is much greater than that of Lys. This could be explained by the delocalized positive charge across the two amine groups of Arg interacting more strongly with the negatively charged phosphate than the single positively charged amine group of Lys. Overall, the major changes in retention between the columns can be explained by the interaction with the zwitterionic head groups, in addition to the aliphatic chains that are similar to those on the C18 stationary phase.

Conclusions
A peptide retention prediction model for reversed-phase separation on the immobilized artificial membrane phosphatidylcholine (IAM.PC) stationary phase has been developed. Chromatographic conditions for high-throughput measurements of peptide retention on an IAM.PC.DD2 phase in reversed-phase separation mode at pH 7.4 have been established and compared to a standard C18 phase. IAM.PC.DD2 was found to be more hydrophilic and to exhibit lower peptide retention (by ~14 % acetonitrile on average, calculated for all S. cerevisiae peptides retained in both systems) compared to octadecyl-silica (C18). 2D LC-MS/MS analysis of a complex S. cerevisiae digest with IAM.PC or C18 columns in the first dimension allowed the measurement of the retention properties of tens of thousands of peptides, sufficient for the confident assignment of retention coefficients and the development of a sequence-specific prediction algorithm. Peptide retention on the IAM.PC.DD2 phase is driven by hydrophobic interactions. However, compared to C18, we found a substantial increase in the relative retention contribution for positively charged (Arg and Lys) and aromatic (Tyr, Trp and His) residues, and a decrease for negatively charged (Asp and Glu) residues. This indicates the involvement of other types of interactions (electrostatic and electron delocalization), which results in a mixed-mode retention mechanism. Due to the lower overall hydrophobicity of the IAM.PC.DD2 stationary phase, the effect of amphipathic helicity on retention is less pronounced. At the same time, the IAM.PC.DD2 version of the SSRCalc algorithm showed a lower accuracy compared to the C18 version, suggesting that additional sequence-specific features (yet to be discovered) play a role in peptide separation on the phosphatidylcholine stationary phase.

Figure 1. Overview of the experimental procedure for the large-scale peptide retention data collection using 2D LC-MS/MS.

Table 1. Identification output of 2D (RP-RP) LC-MS/MS and 2D (IAM.PC-RP) LC-MS/MS for the analysis of whole cell yeast tryptic digest.

Figure 3. Representation of the separation space of tryptic peptides on IAM.PC.DD2 and C18, and their respective SSRCalc model accuracy. A: correlation between the retention times (fraction number) for the two chromatographic systems (26,594 peptides identified in both runs); B and C: the accuracy of the custom versions of the SSRCalc model for the Luna C18(2) (40,105 peptides) and the IAM.PC.DD2 (28,558 peptides), respectively.

Figure 4. Retention contributions of individual residues in peptide retention on C18 and IAM.PC.DD2 phases at pH 7.4. A: comparison of retention coefficients for the two columns; B and C: position-dependent retention coefficients for selected amino acids for Luna C18(2) and IAM.PC.DD2 columns, respectively.

Table 2. Retention coefficients for SSRCalc peptide retention prediction models in different RP separation