An Empirical Investigation of Color Temperature and Gender Effects on Web Aesthetics

Constantinos K. Coursaris, Sarah J. Swierenga, and Ethan Watrall

Journal of Usability Studies, Volume 3, Issue 3, May 2008, pp. 103-117

Instrument Scales and Validity

The questionnaire used for data collection contains scales that measure the constructs shown in the research model; the scale items are provided in Table 1. All scales were adapted from a prior study (Lavie & Tractinsky, 2004), which had established their reliability and validity, thereby satisfying content validity. These scales were used to measure the users' perceived attractiveness of Web sites through assessments of classical aesthetics and expressive aesthetics. These 7-point Likert scales (anchored "Strongly Disagree/Agree") measured responses to the shared question "My perception of this Web site is that it is…" for each of the following items: clean, clear, symmetrical, aesthetic, and pleasant for classical aesthetics; and original, creative, fascinating, sophisticated, and uses special effects for expressive aesthetics.

When the questionnaire was administered, items within the same construct group were randomized to prevent systematic response bias. Further testing showed that non-response, temporal, and common method biases were not present in our data set. The factor loadings for the full set of items used in this study are summarized in Table 1. Shimp and Sharma (1987), Carmines and Zeller (1979), and Hulland (1999) suggest that an item is significant if its factor loading is greater than 0.7 to ensure construct validity. Adherence to this criterion required the modification of only one scale (classical aesthetics) through the removal of two items: ClasAes1 (clean) and ClasAes5 (symmetrical). After the removal of these items, each remaining item was re-validated by testing its item-to-total correlation, and all items exceeded the 0.35 threshold suggested by Saxe and Weitz (1982).

Table 1. Construct items and their factor loadings
| Item | Question: Thinking about my impression with the Web site, it is … | Loading | Item-total correlation |
|---|---|---|---|
| ClasAes1* | Clean | 0.661 | 0.593 |
| ClasAes2 | Clear | 0.746 | 0.607 |
| ClasAes3 | Aesthetic | 0.863 | 0.701 |
| ClasAes4 | Pleasant | 0.895 | 0.547 |
| ClasAes5* | Symmetrical | 0.605 | 0.442 |
| ExprAes1 | Original | 0.848 | 0.763 |
| ExprAes2 | Sophisticated | 0.851 | 0.728 |
| ExprAes3 | Fascinating | 0.895 | 0.825 |
| ExprAes4 | Creative | 0.883 | 0.816 |
| ExprAes5 | Uses special effects | 0.777 | 0.688 |

Note: * denotes items removed from the subsequent analysis. ClasAes = classical aesthetics; ExprAes = expressive aesthetics.
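The screening rule described above can be sketched in a few lines of Python. This is only an illustration: the values are transcribed from Table 1, the names `ITEMS`, `LOADING_MIN`, `ITEM_TOTAL_MIN`, and `screen_items` are hypothetical, and the item-total check uses the correlations as listed in Table 1 rather than recomputing them from raw responses.

```python
# (factor loading, item-total correlation) per item, transcribed from Table 1.
ITEMS = {
    "ClasAes1": (0.661, 0.593),
    "ClasAes2": (0.746, 0.607),
    "ClasAes3": (0.863, 0.701),
    "ClasAes4": (0.895, 0.547),
    "ClasAes5": (0.605, 0.442),
    "ExprAes1": (0.848, 0.763),
    "ExprAes2": (0.851, 0.728),
    "ExprAes3": (0.895, 0.825),
    "ExprAes4": (0.883, 0.816),
    "ExprAes5": (0.777, 0.688),
}

LOADING_MIN = 0.7      # loading criterion cited in the text
ITEM_TOTAL_MIN = 0.35  # item-total threshold (Saxe & Weitz, 1982)


def screen_items(items, loading_min=LOADING_MIN):
    """Split items into those whose loading exceeds the cutoff and the rest."""
    retained = [name for name, (loading, _) in items.items() if loading > loading_min]
    removed = [name for name in items if name not in retained]
    return retained, removed


retained, removed = screen_items(ITEMS)
print("removed:", removed)

# Second step: every retained item should exceed the item-total threshold.
weak = [name for name in retained if ITEMS[name][1] <= ITEM_TOTAL_MIN]
print("retained items below the item-total threshold:", weak)
```

With the Table 1 values, the loading cutoff removes only the two starred items, and no retained item falls below the item-total threshold.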

Results of tests for convergent validity (Bagozzi, 1981), discriminant validity (Bagozzi, 1981; Fornell & Larcker, 1981), construct means, and Cronbach's alpha can be found in Table 2. All constructs had adequate reliability (Carmines & Zeller, 1979) and internal consistency well above the 0.7 threshold (Nunnally, 1978). Cronbach's α values were satisfactory for our constructs (0.771-0.906), and each construct's AVE exceeded the 0.5 benchmark for convergent validity (Fornell & Larcker, 1981).

Table 2. Construct statistics
| Statistic | ClasAes | ExprAes |
|---|---|---|
| Arithmetic means (all items) | 5.457 | 3.294 |
| Arithmetic means (used items) | 5.342 | 3.294 |
| Cronbach's α reliability | 0.771 | 0.906 |
| Internal consistency | 0.875 | 0.930 |
| Convergent validity (AVE) | 0.701 | 0.726 |
| Discriminant validity (sqrt[AVE]) | 0.837 | 0.852 |
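The text does not spell out how the internal consistency and AVE figures were computed. A minimal sketch, assuming the standard Fornell and Larcker (1981) formulas applied to the standardized loadings of the retained items in Table 1, is given below; the function names are illustrative. Under that assumption the results agree with the corresponding rows of Table 2 to three decimals.

```python
import math

# Standardized loadings of the retained items, transcribed from Table 1.
LOADINGS = {
    "ClasAes": [0.746, 0.863, 0.895],
    "ExprAes": [0.848, 0.851, 0.895, 0.883, 0.777],
}


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Internal consistency: (sum of loadings)^2 over itself plus error variance."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_variance)


for construct, loadings in LOADINGS.items():
    ave = average_variance_extracted(loadings)
    print(
        construct,
        "AVE:", round(ave, 3),
        "internal consistency:", round(composite_reliability(loadings), 3),
        "sqrt(AVE):", round(math.sqrt(ave), 3),
    )
```

Cronbach's α is not reproduced here because it requires the raw item responses, which are not reported.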

The square root of the variance shared between a construct and its items was greater than the correlations between the construct and any other construct in the model (see Table 3), suggesting discriminant validity (Fornell & Larcker, 1981). Discriminant validity was further confirmed by verifying that all items load highly on their corresponding factors and load less on other factors (see Table 4). Although the correlation between the two aesthetics constructs was quite high (0.622), a phenomenon also observed in the work by Lavie and Tractinsky (2004), it is not exceedingly high according to Kline's (1998) suggestion that correlations between factors should not be greater than 0.85, thus further supporting the discriminant validity of the two aesthetic factors.

Table 3. Correlation matrix and discriminant validity assessment
| Item | ClasAes | ExprAes |
|---|---|---|
| ClasAes | 0.985¹ | |
| ExprAes | 0.622 | 0.832¹ |

¹ Fornell and Larcker (1981) measure of discriminant validity, which is the square root of the average variance extracted compared to the construct correlations. The diagonal values (bold in the original) are supposed to be greater than those in corresponding rows and columns.
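The two discriminant validity checks described above (the Fornell and Larcker comparison and Kline's 0.85 ceiling) can be expressed as a short sketch. The function and variable names are illustrative, and the square-root-of-AVE inputs used here are the values reported in the last row of Table 2.

```python
# Square root of AVE per construct, as reported in Table 2.
SQRT_AVE = {"ClasAes": 0.837, "ExprAes": 0.852}

# Inter-construct correlation, as reported in the text and Table 3.
CORRELATIONS = {("ClasAes", "ExprAes"): 0.622}

KLINE_MAX = 0.85  # ceiling for factor correlations (Kline, 1998)


def fornell_larcker_ok(sqrt_ave, correlations):
    """True if each construct's sqrt(AVE) exceeds all of its correlations."""
    return all(
        abs(r) < sqrt_ave[a] and abs(r) < sqrt_ave[b]
        for (a, b), r in correlations.items()
    )


def kline_ok(correlations, limit=KLINE_MAX):
    """True if no inter-construct correlation is greater than the limit."""
    return all(abs(r) <= limit for r in correlations.values())


print("Fornell-Larcker criterion met:", fornell_larcker_ok(SQRT_AVE, CORRELATIONS))
print("Kline criterion met:", kline_ok(CORRELATIONS))
```

With more than two constructs, the same functions apply once every pairwise correlation is added to `CORRELATIONS`.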

Table 4. Matrix of loadings and cross-loadings
| Item | ClasAes | ExprAes |
|---|---|---|
| ClasAes2 | 0.746 | 0.455 |
| ClasAes3 | 0.863 | 0.547 |
| ClasAes4 | 0.895 | 0.554 |
| ExprAes1 | 0.474 | 0.847 |
| ExprAes2 | 0.638 | 0.849 |
| ExprAes3 | 0.571 | 0.896 |
| ExprAes4 | 0.524 | 0.885 |
| ExprAes5 | 0.391 | 0.779 |
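The cross-loading criterion (each item loading more highly on its own factor than on the other) can be checked mechanically. The sketch below uses the values in Table 4; the helper names are illustrative, and the rule for inferring an item's construct from its label (dropping the trailing digit) is an assumption about the naming scheme.

```python
# Loadings and cross-loadings, transcribed from Table 4.
CROSS_LOADINGS = {
    "ClasAes2": {"ClasAes": 0.746, "ExprAes": 0.455},
    "ClasAes3": {"ClasAes": 0.863, "ExprAes": 0.547},
    "ClasAes4": {"ClasAes": 0.895, "ExprAes": 0.554},
    "ExprAes1": {"ClasAes": 0.474, "ExprAes": 0.847},
    "ExprAes2": {"ClasAes": 0.638, "ExprAes": 0.849},
    "ExprAes3": {"ClasAes": 0.571, "ExprAes": 0.896},
    "ExprAes4": {"ClasAes": 0.524, "ExprAes": 0.885},
    "ExprAes5": {"ClasAes": 0.391, "ExprAes": 0.779},
}


def own_construct(item):
    """Infer the construct from the item label, e.g. 'ClasAes2' -> 'ClasAes'."""
    return item.rstrip("0123456789")


def misloaded_items(table):
    """Return items that load at least as highly on another factor as on their own."""
    flagged = []
    for item, row in table.items():
        own = own_construct(item)
        if any(value >= row[own] for factor, value in row.items() if factor != own):
            flagged.append(item)
    return flagged


print("items failing the cross-loading check:", misloaded_items(CROSS_LOADINGS))
```

For the Table 4 values the list is empty, which is consistent with the discriminant validity conclusion stated above.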

