Multivariate analysis of hydrophobic descriptors

Multivariate approaches such as principal component analysis (PCA) are powerful tools to investigate hydrophobic descriptors and to discriminate between intrinsic hydrophobicity and polar contributions such as hydrogen bonds and other electronic effects. PCAs of log P values measured for 37 solutes in eight solvent-water systems and of hydrophobic octanol-water substituent constants $\pi$ for 25 meta- and para-substituents from seven phenyl series were performed (re-analysis of previous work). In both cases, the descriptors are reproduced within experimental error by two principal components, an intrinsic hydrophobic component and a second component accounting for differences between the systems due to electronic interactions. Underlying effects were identified by multiple linear regression analysis. Log P values depend on the water solubility of the solvents and on the hydrogen bonding capabilities of both the solute and the solvents. The results indicate different impacts of hydrogen bonds in nonpolar and polar solvent-water systems on log P and their dependence on isotropic and hydrated surface areas. In the case of the $\pi$-values, the second component (loadings and scores) correlates with electronic substituent constants. More detailed analysis of the data as $\pi$-values of disubstituted benzenes XPhY has led to extended symmetric bilinear Hammett-type models relating interaction increments to cross products $\rho_X \sigma_Y$, $\rho_Y \sigma_X$ and $\sigma_X \sigma_Y$, which are mainly due to mutual effects on hydrogen bonds with octanol.


Introduction
According to the IUPAC definition [1], hydrophobicity "is the association of nonpolar groups or molecules in an aqueous environment which arises from the tendency of water to exclude nonpolar molecules". In the strict sense, this operational definition specifies "hydrophobic effects" and "hydrophobic bonding" as intramolecular or intermolecular interactions in an aqueous phase due to attracting forces (van der Waals forces based on orientation, induction and dispersion) and structural reorganization of water adjacent to nonpolar groups. Ordered water molecules in hydration shells with enthalpically stronger H-bonds are transferred into bulk water with a lower degree of order, leading to an increase of both enthalpy and entropy (for review, see [2]). With a predominating entropy term (which is not necessarily the case if restriction of flexible solutes is taken into account), hydrophobic interactions are stronger than the attracting forces themselves. In this context, hydrophobic effects are closely related to the interacting nonpolar surface and volume fractions of groups or molecules.
Hydrophobicity is operationally discriminated from lipophilicity which "represents the affinity of a molecule or a moiety for a lipophilic environment", commonly "measured by its distribution behavior in a biphasic system" (IUPAC [1]). With this definition, experimental aspects come into play: Hydrophobic or lipophilic effects are quantified by measurement of partition coefficients P in solvent-water systems or retention indices in RP-HPLC or TLC (for reviews of methods, see [3] and articles therein). These quantities do not simply reflect "hydrophobicity" as defined above, but depend on the interactions of the complete solute molecules with both phases and on the phase transfer. I.e., not only the nonpolar surface fraction of the solute plays a role, but also polar effects like dipole-dipole interactions as well as formation and breaking of hydrogen bonds (see [4] and references therein). Different conformations and intramolecular interactions in both phases may also be of impact. Hydrophobic (or lipophilic) descriptors are therefore rather complex. Nevertheless, in a first approximation they may be regarded as consisting of a nonpolar and a polar component, making factorization an attractive tool to better understand their nature.
The introduction of hydrophobic effects and parameters into systematic quantitative structure-activity relationship (QSAR) analysis was an essential part of the pioneering work of Corwin Hansch, Albert Leo and their coworkers. On the one hand, they collected or measured a huge number of partition coefficients in different solvent-water systems [5,6], documented as linear free-energy related quantities log P, derived the hydrophobic substituent constant $\pi$ [7] in analogy to Hammett's electronic $\sigma$ parameter, and developed a constructional fragment method of calculating partition coefficients [8]. On the other hand, the exploration of numerous QSAR on different levels of integration led to a substantial advancement of the theoretical background, namely of the role of hydrophobic effects in ADMET (especially in transport, distribution, membrane passage) and in protein-ligand interactions ("hydrophobic bonding").
Beginning with the work of Collander [9], the correlation of log P values from different solvent-water systems and the decomposition of log P into more fundamental molecular descriptors like solubility, surface fractions, polarizability and hydrogen-bond strengths have contributed to quantitative structure-property analysis (QSPR) of hydrophobic effects (for review, see [4,10]). However, as early as 1964, octanol-water partition coefficients were implemented as the standard in QSAR analysis because of the similarity of n-octanol and lipophilic biophases [7]. QSPR with a focus on octanol-water log P were integral parts of the foundation of log P calculation software such as Rekker's fragment additivity method [11,12], CLOGP (for review, see [10]), ACDLabs [13], several atomistic approaches and the recent surface-integral model using local properties from semiempirical MO calculations and their integrals over the molecular surface [14]. Based on large log P databases (in 1995, Hansch, Leo and Hoekman documented ca. 17,000 values [6]), these methods have become more and more predictive, but also rather opaque for an ordinary user with respect to the underlying QSPR, i.e., to the "factors", the specific inter- and intramolecular interactions and forces which affect the log P under consideration. Thus, in addition to available papers and manuals, detailed QSPR of hydrophobic descriptors may be helpful to interpret and validate calculated quantities.
At this point, multivariate analysis may come into play. The linear decomposition of correlated hydrophobic descriptors into uncorrelated "inner variables" (factors or principal components, PCs) by factor or principal component analysis (PCA) yields the underlying, "inner" data structure (dimensionality, common and specific components). Correlation of the PCs with physicochemical parameters identifies basic effects accounting for the multivariate QSPR. With this feedback, hydrophobic descriptors may be modeled as linear functions of nonpolar and polar components by comparative multiple regression analysis. In the following sections, previous multivariate approaches jointly investigating log P values from diverse solvent-water systems and substituent constants $\pi$ derived from different aromatic scaffolds will be reviewed, accompanied by recalculations based on more recent data if available.

Experimental
The present study is based on principal component analysis and multiple linear regression analysis of hydrophobic descriptors. In brief, PCA factorizes correlated data $x_{ij}$ from m variables (systems) and n objects (compounds) into uncorrelated PCs according to the model

$$x_{ij} = \sum_{k=1}^{p} a_{ik}\, t_{kj} + e_{ij}$$

where $a_{ik}$ are system-specific PC loadings and $t_{kj}$ compound-specific PC scores. The number of significant PCs, p, yields the dimensionality of the data, i.e., their recombination by the model except for an error $e_{ij}$. Scaling of the variables (original data, normalized with zero means, or standardized with zero means and standard deviations of one) determines whether the cross product, the covariance or the correlation matrix is diagonalized. The PCs 1 to p are calculated via successive extraction of the maximal (residual) "correlation", i.e., via the eigenvalues and eigenvectors of the matrix under consideration. The present recalculations were performed with the in-house programs FAPCA and REGRE.
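The factorization described above can be sketched in a few lines of Python. This is a minimal illustration using NumPy, not a reproduction of the in-house programs FAPCA and REGRE; the function name `pca_correlation`, the parameter `p` and the synthetic data matrix are assumptions made for this example only. The sketch diagonalizes the correlation matrix, so that loadings are correlation coefficients between variables and unit-variance scores.

```python
import numpy as np

def pca_correlation(X, p):
    """PCA of an (n objects x m variables) array via the correlation matrix.

    Returns loadings a_ik (m x p), scores t_kj (n x p, unit variance) and the
    fraction of total variance carried by each of the m eigenvalues.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Standardize every variable: zero mean, standard deviation of one.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = (Z.T @ Z) / (n - 1)                  # correlation matrix (m x m)
    eigval, eigvec = np.linalg.eigh(R)       # ascending eigenvalues
    order = np.argsort(eigval)[::-1]         # largest eigenvalue first
    eigval, eigvec = eigval[order], eigvec[:, order]
    explained = eigval / eigval.sum()
    root = np.sqrt(eigval[:p])
    loadings = eigvec[:, :p] * root          # correlation of variable i with PC k
    scores = (Z @ eigvec[:, :p]) / root      # uncorrelated, unit-variance scores
    return loadings, scores, explained

# Synthetic stand-in for a log P table: 37 "solutes", 8 "solvent systems",
# generated from two hidden factors plus small noise (not data from this paper).
rng = np.random.default_rng(0)
latent = rng.normal(size=(37, 2))
mixing = rng.normal(size=(2, 8))
X = latent @ mixing + 0.01 * rng.normal(size=(37, 8))

loadings, scores, explained = pca_correlation(X, p=2)
print("variance explained by PC1 and PC2:", explained[:2])
```

With standardized variables, the product of scores and transposed loadings recombines the standardized data up to the error term, which mirrors the model equation above.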

Multivariate analysis of partition coefficients log P from different solvent-water systems
The correlation of log P-values from different solvent-water systems was first reported by Collander [9]. The Collander equation

$$\log P_{\text{solvent 1}} = a_0 + a_1 \log P_{\text{solvent 2}} \qquad (1)$$

is restricted to homologous series or purely nonpolar solutes and models the different contribution of nonpolar solute-solvent interactions in the two solvents. The more hydrophobic solvent 1 is, the higher the slope $a_1$. The intercepts $a_0$ are positively correlated with the water solubility of solvent 1, i.e., hydrophilic solutes (log P < 0) result in higher log P if solvent 1 is more polar than solvent 2. PCAs of log P-values from such restricted series extract just one significant PC. The loadings increase with the hydrophobicity of the solvent, and the scores are strongly correlated with the nonpolar surface of the solutes.
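A Collander-type relationship is an ordinary straight-line fit, which the following sketch illustrates. The log P values are hypothetical numbers chosen to resemble a homologous series; they are not taken from the tables of this paper, and the variable names are assumptions of the example.

```python
import numpy as np

# Hypothetical log P values of one homologous series in two solvent-water
# systems (illustrative numbers only, not measured data).
logp_solvent2 = np.array([-0.8, -0.3, 0.3, 0.9, 1.5, 2.0])
logp_solvent1 = np.array([-1.9, -1.3, -0.6, 0.1, 0.8, 1.4])

# Least-squares fit of log P(solvent 1) = a0 + a1 * log P(solvent 2).
a1, a0 = np.polyfit(logp_solvent2, logp_solvent1, deg=1)

predicted = a0 + a1 * logp_solvent2
ss_res = np.sum((logp_solvent1 - predicted) ** 2)
ss_tot = np.sum((logp_solvent1 - logp_solvent1.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"a0 = {a0:.2f}, a1 = {a1:.2f}, r2 = {r2:.3f}")
```

In this made-up case the slope exceeds one and the intercept is negative, the pattern expected when solvent 1 is the more hydrophobic and less water-miscible of the two.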
With variable polar solute moieties, the situation becomes more complex. The phase transfer then comes along with different contributions of electrostatic interactions and, in particular, of broken solute-water and newly formed solute-solvent hydrogen bonds. PCAs of such extended series commonly lead to two significant PCs, a "hydrophobic" and a "polar" one. All analyses known from the literature comply with this rule, namely PCAs of log P-values (i) from 18 solutes in six solvent-water systems (n-octanol, diethylether, chloroform, benzene, toluene, cyclohexane) [15]; (ii) from 28 solutes [16], 50 solutes [17], and 69 solutes [18], respectively, in six solvent-water systems (n-octanol, diethylether, chloroform, carbon tetrachloride, benzene, n-hexane). Whereas Dove et al. [15] applied the standard PCA method, diagonalization of the correlation matrix, the results of the group of Bill Dunn [16,17,18] were based on diagonalizing the matrix of cross products as implemented in the SIMCA software package.
In the following section, a newly calculated PCA of log P-values from 37 solutes will be presented to exemplify common principles and results. The series is a subset of the solutes analyzed by Koehler et al. [18]; the number of solvents was extended to eight (by toluene and cyclohexane). Data, either taken from ref. [18] or from the tables of Hansch and Leo [5,6], are presented in Table 1. Table 2 shows PCA results with respect to the solvents as well as the means and variances of the solvent columns, which had to be additionally considered since merely the correlation structure of the data was analyzed by our PCA. The order of the means and variances reflects the general rules derived above from the Collander equation. The means are highly correlated with the water solubility of the solvents, expressed on a logarithmic scale [19] (eq. 2). The variances are inversely related to the hydrogen bonding component of the solvent solubility in water, $\delta_h$ [20] (data from ref. [21]):

$$\text{Variance} = -0.065\,(\pm 0.042)\,\delta_h + 1.13\,(\pm 0.10) \qquad (3)$$

with $r^2$ = 0.70, s = 0.08 and a significance level of 0.009. Thus, log P scales from water-soluble solvents capable of forming hydrogen bonds show higher means and lower variances compared to log P scales from nonpolar solvents.
Two PCs account for 96.9 % of the data variance (first PC, 83.8 %; second PC, 13.1 %). Thus, PCA of log P-values again results in a two-component model, as in the case of the previous analyses [15,16,17,18]. The loadings $a_{ik}$ (Table 2) represent correlation coefficients between log P from solvent i and scores $t_k$ (Table 1). Obviously all nonpolar solvents, in particular benzene and toluene, are sufficiently described by the first PC which, however, extracts only ca. 55 % of the variance of the hydrogen bonding solvents n-octanol and diethylether. Therefore, the first PC represents "pure" hydrophobic effects due to the transfer of solutes from water into inert solvents. For polar solvents, a second PC accounting for hydrogen bonds and electrostatic interactions between solutes and solvents is necessary. This relationship may be modeled by correlation of $a_2$ with the water solubility of the solvents (eq. 4): large positive loadings of highly water-soluble solvents are in contrast to negative loadings of carbon tetrachloride, n-hexane and cyclohexane.
Inspection of the scores $t_k$ and their correlation with suitable solute descriptors enables more detailed insights into the QSPR. Figure 1 presents a plot of $t_1$ vs. $t_2$, accounting for "purely hydrophobic" effects and polar corrections, respectively, as described above. A homologous series such as the aliphatic alcohols is characterized by a flat line nearly parallel to the abscissa, i.e., by variation of mainly the hydrophobic component. The amines under consideration are clustered, and their contribution to PC2 is significantly negative. Aniline (14) resides close to the other amines, whereas phenol (15) has a positive $t_2$ value, indicating strong interactions with hydrogen-bond acceptor solvents like n-octanol and diethylether. Positional effects in disubstituted benzenes are evident. On the one hand, m-nitrophenol and m-nitroaniline are slightly more hydrophobic than their p-substituted isomers (compare 29, 30 and 32, 33, respectively). On the other hand, o-substituted benzenes are located down and to the right with respect to their m- and p-isomers, as is obvious from Figure 1 for hydroxybenzoic acids (24 vs. 25), chlorophenols (16 vs. 17 and 18), nitrophenols (28 vs. 29 and 30) as well as nitroanilines (31 vs. 32 and 33). The rightward shift is due to intramolecular hydrogen bonds and/or proximity effects, both increasing the nonpolar surface and reducing hydrogen bonds as well as electrostatic interactions with water. However, these effects are much more significant in nonpolar solvents.
Diethylether and n-octanol are strong hydrogen bond acceptors (n-octanol additionally a weak donor), preventing the formation of internal hydrogen bonds in solutes similarly as water (for review, see [22]). The downward shift of o-substituted isomers is a consequence of this phenomenon which is most pronounced in the case of o-methoxybenzoic acid (27). Also o-hydroxyanisole (26) shows a small negative "ortho-factor" in polar solvents rather due to twist than to a hydrogen bond effect [23].
Suitable descriptors for the identification of scores from PCA were provided by Dunn et al. [16,17,18], who defined and calculated the isotropic surface area, ISA [24], of solutes as the surface of the molecule accessible to nonspecific interactions with the solvent. The surface area of the solutes involved in specific hydrogen bonds with water, HSA, was excluded from the ISA. For calculation of ISA and HSA, Dunn et al. constructed hydrated solutes, "supermolecules", from empirical hydration rules based on crystallographic data, quantum-chemical approaches, solution modeling and experimental data from solute-gas phase equilibria (see [25,26] and references therein). The scores of PC1, $t_1$, are highly correlated with the isotropic surface area (data from [18]; eq. 5). Thus, the first PC accounts for the well-known dependency of log P values on the nonpolar surface fraction of solutes based on entropic effects of water exclusion and nonspecific solute-solvent interactions. In the PCA approach of Koehler et al. [18], the matrix of cross products was diagonalized, leading to orthogonal but correlated scores which also depend on the means and variances of log P in the six analyzed solvent-water systems. The correlation of $t_1$ from ref. [18] with ISA was weaker than that shown in eq. 5 ($r^2$ = 0.81).
The scores of PC2, $t_2$, are significantly related to the hydrated surface area ($r^2$ = 0.46) [24], but even if four outliers (compounds 11, 12, 27, 36) are excluded from the analysis, the correlation remains rather weak:

$$t_2 = 2.67\,(\pm 0.86)\,\text{HSA} - 1.56\,(\pm 0.60) \qquad (6)$$

with $r^2$ = 0.56, s = 0.63 and a significance level below 0.001. I.e., the hydrated surface area plays a role in increasing log P in polar solvents (positive PC2 loadings) and decreasing log P in nonpolar solvents (negative PC2 loadings), but HSA is not sufficient to quantitatively describe this effect. Compared to eq. 6, Koehler et al. [18] obtained a better correlation of the scores from the second PC with the hydrated fraction of the solvent-accessible surface area, f(HSA) (69 solutes, $r^2$ = 0.74) [24]. In our PCA approach, HSA is superior to f(HSA). However, the scores $t_2$ from ref. [18] account only for nonpolar solvent-water systems, as is evident from the correlation coefficients between log P and $t_2$ (in analogy to loadings $a_2$ from our PCA): octanol, 0.23; diethylether, 0.12; chloroform, 0.84; carbon tetrachloride, 0.88; benzene, 0.78; hexane, 0.89. Thus, the correlation of $t_2$ with f(HSA) in the paper of Koehler et al. [18] mainly reflects a negative impact of the hydrated surface area fraction on the hydrophobicity of solutes in nonpolar solvents.
The discriminative effect of f(HSA) on log P-values from different solvent-water systems may be explored in more detail by regression analysis of log P as a function of ISA and f(HSA) (see Table 3). All equations for nonpolar solvents provide a sufficient decomposition of log P into surface area terms. In contrast, the equations for n-octanol and diethylether explain only ca. 60 % of the data variance. Whereas the regression coefficients of ISA and the intercepts do not differ significantly, the effects of f(HSA) are distinctive with respect to the solvent class: polar solvents are characterized by a positive, nonpolar solvents by a negative or no impact of the hydrated surface area fraction of solutes on log P. Solutes with a high hydration potential are poorly transferred just into carbon tetrachloride, n-hexane and cyclohexane. Koehler et al. [18] correlated log P-values calculated from loadings and scores with ISA and f(HSA). Because of the nonsignificance of f(HSA) in the equations for n-octanol and diethylether, they suggested that the solutes partition into the solvent as the hydrated "supermolecule" due to the water solubility of these solvents, which compete with water for the hydrogen bonding sites of the solutes, leading to displacement of water from the "supermolecule". However, f(HSA) is significant if measured n-octanol and diethylether log P-values are correlated (Table 3), although neither equation sufficiently models hydrophobicity. Therefore, additional descriptors must be taken into account for polar solvents.
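The kind of two-descriptor regression summarized in Table 3 can be sketched as follows. The helper `fit_mlr`, the synthetic descriptor ranges and the generating coefficients are assumptions of this example; they only mimic a "nonpolar solvent" case with a negative f(HSA) term and do not reproduce any equation of the paper.

```python
import numpy as np

def fit_mlr(y, *descriptors):
    """Ordinary least squares of y on the given descriptors plus an intercept.

    Returns the coefficients (intercept first), r^2 and the residual
    standard deviation s.
    """
    y = np.asarray(y, dtype=float)
    columns = [np.ones_like(y)] + [np.asarray(d, dtype=float) for d in descriptors]
    A = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = residuals @ residuals
    r2 = 1.0 - ss_res / np.sum((y - y.mean()) ** 2)
    s = np.sqrt(ss_res / (len(y) - A.shape[1]))
    return coef, r2, s

# Synthetic stand-ins for ISA (arbitrary units) and f(HSA), with log P
# generated from an assumed linear law plus noise (not data from this paper).
rng = np.random.default_rng(1)
isa = rng.uniform(1.0, 4.0, size=40)
f_hsa = rng.uniform(0.0, 0.6, size=40)
logp_nonpolar = 0.5 + 1.0 * isa - 5.0 * f_hsa + rng.normal(scale=0.1, size=40)

coef, r2, s = fit_mlr(logp_nonpolar, isa, f_hsa)
print("intercept, ISA, f(HSA) coefficients:", coef)
print("r2 =", r2, " s =", s)
```

Running the same fit separately for each solvent column and comparing the signs of the f(HSA) coefficients corresponds to the comparison between solvent classes made in the text.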
The free energy of solute partition from water into n-octanol and diethylether depends on the difference of hydrogen bond interactions in both phases. Suitable descriptors considering these effects on log P have been derived by Taft, Kamlet and Abraham et al. [27,28,29,30,31]. Based on the solvatochromic model [32,33,34], log P-values are factorized into four parameters: the molecular volume V for nonpolar interactions, the solute's dipolarity/polarizability $\pi^*$ for orientation and induction forces, as well as $\alpha$ and $\beta$ for the hydrogen bond donor acidity and acceptor basicity, respectively. For example, solvatochromic analysis of octanol-water partition coefficients for 103 solutes resulted in the following equation [35]:

$$\log P = 5.15\,(\pm 0.16)\,V/100 - 1.29\,(\pm 0.16)\,\pi^* - 3.60\,(\pm 0.18)\,\beta + 0.45\,(\pm 0.12) \qquad (7)$$

with $r^2$ = 0.98 and s = 0.16. From the series of Koehler et al. [18], solvatochromic descriptors were available for 45 compounds [27,30]. In the case of this subset, too, the correlations of n-octanol and diethylether log P-values with ISA and HSA ($r^2$: 0.77 and 0.63, respectively) or with ISA and f(HSA) ($r^2$: 0.74 and 0.60, respectively) are not sufficient. Combining these parameters with the solvatochromic descriptors leads to eqs. 8-13 (the first of them for n-octanol). Taken together, these equations represent the different impacts of hydrogen bonds in nonpolar and polar solvent-water systems. The hydrated surface area reflects hydrogen bond donor acidity and acceptor basicity in equal parts and is a suitable descriptor of the detrimental effect of solute-water hydrogen bonds on log P-values in nonpolar solvents such as chloroform, n-hexane and cyclohexane. In contrast, n-octanol and diethylether are strong hydrogen bond acceptors themselves. Donor solutes are favored, i.e., log P increases if a large hydrated surface area is mainly due to a high $\alpha$-term. The net effect of $\beta$ on log P is negative (compare eqs. 8-11 with 12-13).
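Eq. 7 can be evaluated directly, which makes the weight of the individual terms easy to see. The sketch below uses only the central coefficient values of eq. 7 (the confidence intervals are ignored); the function name and the two descriptor sets are hypothetical and chosen purely for illustration.

```python
def logp_octanol_solvatochromic(v, pi_star, beta):
    """Central estimate from eq. 7: log P = 5.15 V/100 - 1.29 pi* - 3.60 beta + 0.45."""
    return 5.15 * v / 100.0 - 1.29 * pi_star - 3.60 * beta + 0.45

# Two hypothetical solutes of equal volume and dipolarity that differ only in
# their hydrogen bond acceptor basicity beta (values are not from any table).
weak_acceptor = logp_octanol_solvatochromic(v=60.0, pi_star=0.5, beta=0.1)
strong_acceptor = logp_octanol_solvatochromic(v=60.0, pi_star=0.5, beta=0.5)

print("log P, weak acceptor:  ", round(weak_acceptor, 3))
print("log P, strong acceptor:", round(strong_acceptor, 3))
print("difference:            ", round(weak_acceptor - strong_acceptor, 3))
```

Because eq. 7 is linear, the difference between the two cases equals 3.60 times the difference in $\beta$, which illustrates the negative net effect of acceptor basicity on the octanol-water log P.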
Accordingly, hydrogen bond acceptor solutes are less hydrophobic in these solvents, in particular in diethylether since n-octanol is also a hydrogen bond donor, but weak compared to water. In conclusion, the decomposition of log P-values must always consider differences of solute-water and solute-solvent interactions.

Multivariate analysis of hydrophobic substituent constants $\pi$ from disubstituted benzenes
The hydrophobic substituent constant $\pi$ was first introduced by Hansch et al. [36] as the difference between the octanol-water log P of substituted and unsubstituted phenoxyacetic acid. Following this concept, numerous $\pi$-values were derived for ortho-, meta- and para-substituents X in various aromatic systems PhY such as benzenes, nitrobenzenes, anilines, phenols, phenylacetic acids and phenoxyacetic acids [7]. Whereas meta-para positional effects on $\pi_X$ and $\pi_X$-differences of inert substituents were only marginal, $\pi_X$-values of hydrogen-bonding, electron-attracting or electron-releasing substituents depend significantly on the nature of the "functional group" Y. Thus, log P values of disubstituted benzenes XPhY are not simply the sum of log P (benzene), $\pi_X$ and $\pi_Y$ from the benzene system, but include "interaction increments" due to electronic effects. In a first approximation, their nature was identified by correlations (eq. 14) [7] in which the difference of the $\pi_X$-values from a series PhY and benzene depends on the electronic properties of X, described by Hammett's $\sigma$ constant, and on the specific impact of Y on X, reflected by k.
To further investigate these effects, principal component analyses of $\pi_{XY}$-values from different meta- and para-substituted aromatic series PhY [5,7,37] were performed by Franke et al. [38,39]. Both approaches, with separate [38] and simultaneous [39] consideration, respectively, of 27 meta- and para-substituents (PhY: benzenes, nitrobenzenes, anilines, phenols, benzoic acids, phenylacetic acids, phenoxyacetic acids, piperidinoacetanilides), resulted in two significant PCs. The first PC accounted for the "average" hydrophobicity of the substituents, and the second PC was due to electronic interactions between X and Y and correlated with $\sigma_X$. However, these PCAs suffered from too many unknown $\pi_{XY}$ values (36 of 216, 17 %), which had to be estimated by regression analysis in order to obtain a full data matrix.
Consideration of more recent experimental log P-values [6] and withdrawal of the $\pi$-values of piperidinoacetanilides and the CH2OH group enables a substantial reduction of calculated data (10 of 175, 6 %). With this update and some substitutions by more reliable values [6,40], the simultaneous PCA of meta- and para-disubstituted benzenes was recalculated. The data matrix is shown in Table 4. Table 5 presents PCA results for the systems PhY. Two PCs account for 98.7 % of the data variance (first PC, 92.6 %; second PC, 6.1 %). The loadings $a_{Yk}$, as correlation coefficients between $\pi_{XY}$ from series Y and scores $t_k$ (Table 4), show that $\pi$-values from benzenes, phenoxyacetic, phenylacetic and benzoic acids are sufficiently reproduced by the first PC, indicating only weak effects of X on Y and vice versa in these systems. Thus, PC 1 represents the hydrophobicity of substituents X largely unaffected by interaction with Y. The hydrogen bonding acceptor system nitrobenzene and the donor-acceptor systems aniline and phenol bear considerable, opposed loadings in the second PC, which is significantly correlated with the $\sigma_p$-values of the "functional groups" Y:

$$a_2 = 0.49\,(\pm 0.26)\,\sigma_{Yp} + 0.02\,(\pm 0.12) \qquad (15)$$

with $r^2$ = 0.83, s = 0.11 and a significance level of 0.005. The correlation with $\sigma_{Ym}$ is only weak ($r^2$ = 0.63). These results must be interpreted in context with the scores $t_1$ and $t_2$ (Table 4). Figure 2 presents a plot of $t_1$ vs. $t_2$, accounting for "unaffected" hydrophobicity of substituents X and X-Y interactions, respectively. There is obviously no positional effect; corresponding m- and p-substituents overlap apart from small differences in the case of OH, OMe, COMe, CN and Br. Thus, the joint analysis of meta- and para-disubstituted benzenes is justified. The arrangement of the substituents along the abscissa (PC 1) corresponds to the common hydrophobicity scale (polar, hydrogen-bonding substituents < H < Me, F < Cl < Br < I).
Eq. 17 is better than the correlation with position-dependent $\sigma_{Xm}$- and $\sigma_{Xp}$-values ($r^2$ = 0.68). Taken together, eqs. 15 and 17 reflect an electronic X-Y interaction increment described by the product $\sigma_{Xp} \sigma_{Yp}$. The hydrophobicity of a substituent X increases if its electronic effects on the phenyl nucleus are counterbalanced by Y (electron-attracting X combined with electron-releasing Y and vice versa).
These findings obtained from the multivariate PCA approach may be explored in more detail by individual consideration of the series. Instead of correlating differences in $\pi_X$ with $\sigma_X$ as in eq. 14 [7], the $\pi_{XY}$-values were directly related to $\pi_{XH}$ and $\sigma_X$ by multiple regression analysis (see Table 6). Calculated $\pi_{XY}$-values were omitted. Since correlations with $\sigma_{Xp}$ and with position-dependent $\sigma_{Xm}$- and $\sigma_{Xp}$-values led to approximately equivalent equations, the latter, "correct" descriptors were used. All regression equations except that for the nitrobenzene series result in an intercept of approximately zero and explain more than 95 % of the data variance.
As expected from the PCA (eqs. 15, 17), the regression coefficients for $\sigma_X$ depend on electronic properties of Y. However, the coefficients for $\pi_{XH}$, and in particular their deviation from unity, seem to follow $\sigma_{Ym}$ (eqs. 18 and 19). In both cases, the correlation with $\sigma_m$- and $\sigma_p$-values, respectively, is significantly better than with Hammett constants of the other position ($r^2$ = 0.71, 0.53). Eqs. 18 and 19 indicate that $\pi_{XY}$-values include two X-Y interaction increments depending on the products $\pi_{XH} \sigma_{Ym}$ and $\sigma_{Xp} \sigma_{Yp}$. However, the drawback of this model is that it has been derived from separate analyses of the seven systems. A common model for all systems must be based on equivalent, "symmetric" consideration of substituents X and Y in meta- and para-disubstituted benzenes. Fujita [40] published such a model relying on bidirectional Hammett-type relationships (eq. 20). In this equation, $\rho_Y$ is the difference of the susceptibility constants $\rho_Y$ (octanol) - $\rho_Y$ (water) of hydrogen bonding association between the respective solvent and the fixed substituent Y to the effect of variable substituents X, and $\rho_X$ is the equivalent difference for the impact of Y on X. Thus, the transmission of electronic effects of substituents from X to Y is assumed to be independent of transmission from Y to X.
Making the Hammett-type relationships from eqs. 18, 19 and Table 6 bidirectional implies the introduction of an additional X-Y interaction increment $\pi_{YH} \sigma_X$. Common models for meta- and para-disubstituted benzenes, respectively, were derived from the data in Table 4 (eqs. 21 and 22). In bidirectional approaches, the contribution of the interaction to a property such as solvation energy is a function of the electronic effects of the fragments and of $\rho$ constants depending on the skeleton between them. Comparing these models with such bidirectional approaches [40,41] indicates that the "susceptibility constants" $\rho_X$ and $\rho_Y$ depend on $\pi_{XH}$ and $\pi_{YH}$, respectively. Polar substituents (negative $\pi$-values) increase $\pi_{XY}$ in combination with a second, electron-attracting substituent. By this, polarity is reduced and interactions with the strong hydrogen bond acceptor octanol become slightly more favorable. In the case of hydrophobic substituents (positive $\pi$-values), the interaction of a second, electron-releasing group with octanol is favored for the same reason. The $\sigma_{Xp} \sigma_{Yp}$ cross term may be due to electron-releasing hydrogen bond donor substituents such as OH and NH2. Combined with electron-attracting groups, their hydrogen bonds with octanol are facilitated. This effect is even underestimated by the cross term, since the greatest differences between measured and calculated $\pi_{XY}$-values of +0.3 to +0.4 occur just in the case of phenols and anilines with strongly electron-withdrawing substituents such as NO2, CF3 and CN. Combination of two hydrogen bond acceptors with positive $\sigma_{Xp}$-values reduces $\pi_{XY}$ most. In contrast to previous suggestions [38], electronic effects on the phenyl nucleus play a minor role, since otherwise meta-disubstituted benzenes should be correlated with a $\sigma_{Xm} \sigma_{Ym}$ cross term. Taken together, the Hammett-type relationships represented by eqs. 21 and 22 indirectly account for mutual interactions of substituents favorable or detrimental for hydrogen bonds with octanol.
Models of this type may be used for the calculation of log P values of meta- and para-disubstituted benzenes:

$$\log P\,(\text{XPhY}) = \log P\,(\text{benzene}) + \pi_{YH} + \pi_{XY}$$

where $\pi_{XY}$ is the difference between log P (XPhY) and log P (PhY) and comprises all X-Y interactions. Thus:

$$\log P\,(\text{XPhY}) = 2.13 + \pi_{XH} + \pi_{YH} - c_X\, \pi_{XH} \sigma_{Ym} - c_Y\, \pi_{YH} \sigma_{Xm} - c_{XY}\, \sigma_{Xp} \sigma_{Yp} \qquad (25)$$

where $c_X$ (ca. 0.25), $c_Y$ (ca. 0.4) and $c_{XY}$ (ca. 0.5) quantify the influence of the three Hammett-type increments. This is an extension of the method applied in CLOGP [10], where electronic interactions in meta- and para-disubstituted benzenes are considered by factors $F_{XY} = \rho_Y \sigma_X$ (here, $\sigma_X$ is not a Hammett constant, but is derived from log P values). For ortho-disubstituted benzenes, additional factors come into play representing an ortho-effect ($F_{ortho}$), intramolecular hydrogen bonding ($F_{HB}$) and alkyl-aryl interaction ($F_A$), so that in this case

$$F_{XY} = \rho_Y \sigma_X + F_{ortho} + F_{HB} + F_A \qquad (26)$$
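As a worked illustration of eq. 25, the following sketch evaluates the additive part and the three interaction increments separately. The default coefficients are the approximate values quoted above ("ca."), so the result is only a rough estimate; the function name and the substituent constants in the example are hypothetical inputs, not entries of Table 4.

```python
LOGP_BENZENE = 2.13  # log P (benzene) as used in eq. 25

def logp_disubstituted(pi_xh, pi_yh, sigma_xm, sigma_ym, sigma_xp, sigma_yp,
                       c_x=0.25, c_y=0.4, c_xy=0.5):
    """Evaluate eq. 25 for a meta- or para-disubstituted benzene XPhY.

    Y denotes the "functional group" of the series and X the variable
    substituent. Returns (log P, additive part, interaction part).
    """
    additive = LOGP_BENZENE + pi_xh + pi_yh
    interaction = (-c_x * pi_xh * sigma_ym
                   - c_y * pi_yh * sigma_xm
                   - c_xy * sigma_xp * sigma_yp)
    return additive + interaction, additive, interaction

# Hypothetical constants: X is a polar, electron-releasing hydrogen bond donor,
# Y a polar, electron-attracting acceptor (illustrative values only).
logp, additive, interaction = logp_disubstituted(
    pi_xh=-0.7, pi_yh=-0.3,
    sigma_xm=0.1, sigma_ym=0.7,
    sigma_xp=-0.4, sigma_yp=0.8,
)
print("additive estimate:    ", round(additive, 4))
print("interaction increment:", round(interaction, 4))
print("log P (XPhY):         ", round(logp, 4))
```

For this combination all three increments are positive, in line with the statement that an electron-releasing donor combined with an electron-attracting group becomes more hydrophobic than simple additivity predicts.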

Conclusions
Multivariate, simultaneous analysis of hydrophobic descriptors by PCA may provide valuable information about the data structure (dimensionality of two, two common components). Correlated parameters from different solvent-water systems and phenyl series have been transformed into uncorrelated "inner" variables discriminating between the systems and leading to suggestions about underlying interactions. Via identification of such interactions by multiple linear regression analysis, the different impact of hydrogen bonds in nonpolar and polar solvent-water systems on log P values and their dependence on isotropic and hydrated surface areas has become obvious. The analysis of $\pi$-values of meta- and para-disubstituted benzenes has led to extended symmetric bilinear Hammett-type models relating interaction increments to three cross products $\rho_X \sigma_Y$, $\rho_Y \sigma_X$ and $\sigma_X \sigma_Y$. The resulting models from both approaches provide detailed insight into the nature of hydrophobic descriptors and fall into line with numerous other theoretical investigations on the background of hydrophobicity and lipophilicity.

doi: 10.5599/admet.2.1.35