An Investigation of Abstract Construal on Impression Formation: A Multi-Lab Replication of McCarthy and Skowronski (2011)

McCarthy, R., et al. (2018). An Investigation of Abstract Construal on Impression Formation: A Multi-Lab Replication of McCarthy and Skowronski (2011). International Review of Social Psychology, 31(1): 15, 1–6, DOI: https://doi.org/10.5334/irsp.133

Perceivers often view individuals described as "warm" to be generally positive and individuals described as "cold" to be generally negative. Consistent with the tenets of Construal Level Theory, McCarthy and Skowronski (2011) demonstrated this difference was larger among perceivers who were instructed the information was psychologically distant rather than psychologically near; however, those results have never been subjected to replication attempts. To test the replicability of those results, we closely replicated the methods of McCarthy and Skowronski (2011) Study 1b at eight separate data collection sites and pooled the results into a random-effects meta-analysis. Within the replication attempts, the overall effect was not significantly different from zero (d = 0.10, 95% CI [-0.01, 0.22]) and an equivalence test confirmed this effect was smaller than our smallest effect size of interest. However, when the original study was incorporated into the meta-analysis, the overall effect was significantly different from zero in the theoretically-consistent direction (d = 0.13, 95% CI [0.02, 0.24]). The weight of the overall evidence suggests the traits "warm" and "cold" are more influential among participants who were presented with information that was psychologically distant; however, this effect is small. Future research should try to identify more potent moderators, which would make the effect more affordable to detect.
Keywords: Construal Level Theory; impression formation; replication; central traits

including the descriptor “warm” or “cold” within a list of otherwise identical trait words affected perceivers’ impressions of the described individual. Specifically, those who were exposed to the list of traits “intelligent, skillful, industrious, warm, determined, practical, and cautious” formed relatively positive overall impressions about the described individual; whereas, those exposed to the list of traits “intelligent, skillful, industrious, cold, determined, practical, and cautious” formed relatively negative overall impressions of the described individual. Thus, in an otherwise identical list of traits, perceivers’ impressions of the described individual are reliably affected by the inclusion of the trait words “warm” or “cold,” an effect that has subsequently been conceptually replicated (e.g., Anderson & Barrios, 1961; Singh, Onglacto, Sriram & Tay, 1997) and directly replicated (e.g., Nauts, Langer, Huijsmans, Vonk & Wigboldus, 2014). The effect of including the descriptors “warm” or “cold” on perceivers’ impressions is commonly attributed to participants aggregating the information from the individual traits into one coherent global impression (although different processes of how these traits are aggregated have been proposed [cf. Anderson, 1971; Hamilton & Zanna, 1974; Kaplan, 1975; Wyer, 1974]). Accordingly, it is plausible the impact of including “warm” versus “cold” may be more influential when perceivers interpret the list of traits as one global and interconnected piece of information, rather than as several discrete pieces of information.
Following the logic in the previous paragraph, McCarthy and Skowronski (2011) demonstrated that manipulating the distance between the perceivers and described individuals, a manipulation believed to promote perceivers’ tendencies to process information about the described individuals globally versus piecemeal (e.g., Fujita, Henderson, Eng, Trope & Liberman, 2006; Fujita, Trope, Liberman & Levin-Sagi, 2006; Trope & Liberman, 2010), affected perceivers’ impressions of described individuals. Specifically, in two studies, McCarthy and Skowronski presented participants with an individual who was described with a list of traits that included either "warm" or "cold." Further, in each study, participants were told the described individual was either psychologically distant or psychologically near. For example, in McCarthy and Skowronski Study 1b, half of the participants were told the described individual was a current student at the participants' university (a temporally near condition) and half of the participants were told the described individual attended the participants' university ten years ago (a temporally distant condition). All participants then rated how sociable, generous, likable, and agreeable they perceived the individuals to be. The results showed that individuals described as distant were rated as more extremely positive or more extremely negative than individuals described as near, which is consistent with the notion that the inclusion of "warm" or "cold" is more influential on impression formation within situations that promote global information processing.
Despite the initial positive results, the effects reported in McCarthy and Skowronski (2011) have not been subjected to replication attempts. Thus, the current studies attempted to replicate the Warm/Cold × Abstract Construal/Concrete Construal interaction observed in McCarthy and Skowronski, Study 1b. Specifically, we sought to replicate the findings that including the descriptor "warm" will result in generally positive impressions, that including the descriptor "cold" will result in generally negative impressions, and that the inclusion of "warm" or "cold" will be more influential on perceivers' impressions when the individual is described as temporally distant.

Motivation for Replication Attempts
The current replication attempts were motivated by three reasons. First, although finding conditions that moderate the impressions that individuals form from trait information is important, to date, this particular effect of psychological distance has not been subjected to a replication attempt. Confidence in the effect would greatly increase if the effect were found to be replicable. Second, the results of the original studies are not as conclusive as they were believed to be at the time of the original publication (e.g., Świątkowski & Dompnier, 2017). For example, although the two p-values of the key Near/Far × Warm/Cold interactions from McCarthy and Skowronski (2011; i.e., F(1, 92) = 6.29, p = .014 and F(1, 98) = 4.12, p = .045) were both significant using a Type I error rate of 5%, they are each relatively high within the possible range of p-values that are less than .05. Third, there are aspects of the original data collection procedure that could have affected the results. Although the first author (RJM) can confirm there were no unreported studies, conditions, or outcome variables in the original studies, the first author also can confirm the sample sizes were based on "rules of thumb" (e.g., collect ~20 participants per cell) and convenience, rather than being justified by theory. Further, the first author cannot recollect whether the decision to stop collecting data in the original studies was determined by the results obtained (i.e., optional stopping); if optional stopping occurred, it could have inflated the observed effect size.
Collectively then, the effects that were reported in McCarthy and Skowronski can be considered tentative, but far from conclusive. We sought to address the tentativeness of the original findings by closely replicating the original methods and using high-powered tests to potentially detect the effect.

Methods
Prior to data collection, the methods used in the current research were approved by the human subjects review board at the authors' home institutions. We report all data exclusions (if any), all manipulations, and all relevant measures in our research (Simmons, Nelson & Simonsohn, 2012). All data needed to replicate our analyses for the current studies can be acquired either by contacting the first author or by visiting this project's page on the Open Science Framework (https://osf.io/b9mnr/).

Procedure
The current data were part of a larger, collaborative data collection effort by the authors. Although we only present the information that is relevant to the effect of temporal distance on impression formation, the entire survey and data can be accessed on this project's Open Science Framework page (https://osf.io/b9mnr/; note: these data overlap with Edlund et al. [in press]). The overall study was modeled on Klein et al. (2014). The authors agreed on a common data collection procedure, each author recruited participants at their institution, another sample was recruited online through Amazon.com's Mechanical Turk, and these samples were aggregated into common analyses. Ethics approval and data collection were coordinated by individual researchers at their site.
About halfway through the study, participants viewed a list of traits that described an individual and reported their impressions of the described individual. The program randomly assigned participants to one of two temporal framing conditions: a temporally near frame where the description was ostensibly about an individual who was currently a student at the participants' university or a temporally far frame where the description was ostensibly about an individual who was a student at the participants' university ten years ago. Participants then viewed one of two lists of seven traits that ostensibly described the individual: intelligent, skillful, industrious, [cold/warm], determined, practical, and cautious. Participants randomly viewed either "cold" or "warm" in the middle position of the list. Participants then rated the described individual on four 5-point scales: unsociable-sociable, ungenerous-generous, unlikable-likable, and disagreeable-agreeable. Thus, participants were randomly assigned to one of four conditions: near-warm, far-warm, near-cold, or far-cold. Finally, after completing the remainder of the study, participants reported demographics and were debriefed and compensated.
The materials and general procedure for this effect were the same as McCarthy and Skowronski (2011), with two exceptions. First, the original study was a paper-and-pencil survey, whereas the current replication attempt was an online survey. Second, the methods of the original study consisted entirely of the materials to test this effect, whereas the methods of the current study contained materials to test several effects. Neither of these changes was believed to be critical for testing the hypothesis of interest.

Analytic Approach
We first analyzed the data from the replication attempts only (i.e., we did not consider the effect from the original study). This analytic approach is the same as Klein et al. (2014). Effect sizes for the hypothesized 2 (Descriptor: Warm vs. Cold) × 2 (Distance: Near vs. Far) interaction were computed by converting the F-ratio for the interaction into a d effect size using the compute.es package in R (Del Re, 2015). The effect sizes from each data collection site were then analyzed in a random-effects meta-analysis using the metafor package in R (Viechtbauer, 2017).
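The F-to-d conversion described above can be sketched as follows. This is a minimal illustration of the standard conversion for an F-ratio with one numerator degree of freedom, together with the usual large-sample variance of d; it is not the compute.es source code. The function name `f_to_d`, the `sign` argument, the choice of which two group sizes to pass for an interaction contrast, and the input values are all assumptions made for illustration.

```python
import math


def f_to_d(f_ratio, n1, n2, sign=1):
    """Convert a 1-df F-ratio to Cohen's d and its sampling variance.

    n1 and n2 are the sizes of the two groups being contrasted
    (how to define them for an interaction is an assumption here).
    sign is +1 if the effect is in the hypothesized direction, -1 otherwise,
    because an F-ratio alone carries no direction.
    """
    scale = (n1 + n2) / (n1 * n2)
    d = sign * math.sqrt(f_ratio * scale)
    var_d = scale + d ** 2 / (2 * (n1 + n2))
    return d, var_d


# Hypothetical inputs, not values from any site in this study.
d, var_d = f_to_d(f_ratio=4.0, n1=50, n2=50)
se_d = math.sqrt(var_d)
print(f"d = {d:.3f}, variance = {var_d:.4f}, SE = {se_d:.3f}")
```

Because the square root discards direction, the sign has to be assigned from the pattern of cell means at each site before the effect sizes are pooled.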
As a first analysis, we tested whether the observed meta-analytic effect size was significantly different from zero within the replication attempts. The effect size was computed such that a positive value would indicate an effect in the direction observed in McCarthy and Skowronski (2011). We then conducted an equivalence test to determine whether the observed effect was smaller than our smallest effect size of interest. Our smallest effect size of interest was deemed to be d = 0.20, which is an effect that is conventionally considered "small" (e.g., Cohen, 1988). Thus, |d| = 0.20 was used as the upper and lower bounds of the range of equivalence. Two one-sided tests would then be conducted to separately test whether the observed effect was significantly greater than the lower bound (i.e., d > -0.20) and less than the upper bound (i.e., d < +0.20). An effect both significantly greater than the lower bound and significantly less than the upper bound (i.e., -0.20 < d < +0.20) would be considered evidence that the observed effect was smaller than our smallest effect size of interest.
As a second analysis, we conducted a random-effects meta-analysis of the current replication attempts and the corresponding effect from McCarthy and Skowronski (2011), Study 1b. We followed the same analytic strategy that was described in the previous paragraph. This second meta-analysis allowed us to synthesize all of the available evidence for the effect of interest and to compare the magnitude of the original effect and the effects from the replication attempts.
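Both meta-analyses pool per-site effect sizes under a random-effects model, which can be sketched as below. The text does not state which between-site variance estimator was used, so this sketch swaps in the DerSimonian-Laird estimator because it has a simple closed form; results from other estimators (e.g., REML) can differ. The function name `random_effects_dl` and the three site values are hypothetical and are not the data from this study.

```python
import math


def random_effects_dl(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-site variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"estimate": pooled, "se": se, "z": pooled / se, "tau2": tau2, "q": q}


# Hypothetical per-site d values and sampling variances.
site_d = [0.10, 0.30, -0.05]
site_var = [0.04, 0.05, 0.03]
result = random_effects_dl(site_d, site_var)
ci_low = result["estimate"] - 1.96 * result["se"]
ci_high = result["estimate"] + 1.96 * result["se"]
print(result, (ci_low, ci_high))
```

When Q is no larger than its degrees of freedom, the estimated between-site variance is truncated to zero and the random-effects estimate coincides with the fixed-effect estimate.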

Participants
The data were collected at nine different data collection sites. Of the 1,246 potential participants, 1,075 provided usable data. One data collection site was only able to obtain data from 6 total participants; because this meant there was only 1 participant in some cells of the design at this site, the site was omitted from the analyses. This left a final sample of 1,069 participants from eight sites. The final sample was mostly female (53.3%; male, 43.2%; missing, 3.5%), White (59.6%; Black/African-American, 11.4%; Asian/Asian-American, 8.0%; Hispanic, 11.2%; Other, 5.2%; missing, 4.5%), and had a mean age of 21.66 years (SD = 10.96).

Meta-Analysis of Current Replication Attempts Only
The data from the replication attempts were first analyzed in an ANOVA with a 2 (Descriptor: Warm vs. Cold) × 2 (Distance: Near vs. Far) between-participants design separately at each data collection site. Table 1 contains the descriptive statistics and the Descriptor × Distance interaction for each individual site. Although participants consistently rated the individuals described as "warm" more positively than the individuals described as "cold," the hypothesized interaction was not statistically significant within any individual sample.
An effect size was then computed for each interaction by converting the F-ratio into a standardized mean difference effect size, and these effect sizes were entered into a random-effects meta-analysis. The overall magnitude of the effect size estimate was one tenth of a standard deviation (d = 0.10, 95% CI [-0.01, 0.22]). Although the effect was in the hypothesized direction in six of the eight samples, the overall effect was not significantly different from zero (z = 1.76, p = .08). We thus proceeded with the equivalence test.
Within the replication attempts, the overall effect size was significantly greater than the lower bound of d = -0.20 (z = 5.11, p < .001) and significantly less than the upper bound of d = +0.20 (z = -1.70, p = .04), using one-sided hypothesis tests. Thus, our meta-analytic effect size estimate from the replication attempts was deemed to be smaller than our smallest effect size of interest.
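The two one-sided tests reported above can be approximately reproduced from the rounded estimate and confidence interval. The sketch below assumes a normal sampling distribution and back-calculates the standard error from the width of the 95% CI; that back-calculation, and the helper names `se_from_ci` and `tost_z`, are assumptions for illustration rather than the authors' procedure.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def se_from_ci(low, high, level=0.95):
    """Recover a standard error from a symmetric normal-theory CI."""
    z_crit = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return (high - low) / (2 * z_crit)


def tost_z(estimate, se, bound):
    """Two one-sided z-tests against equivalence bounds of -bound and +bound."""
    z_lower = (estimate + bound) / se  # tests H0: effect <= -bound
    z_upper = (estimate - bound) / se  # tests H0: effect >= +bound
    p_lower = 1 - STD_NORMAL.cdf(z_lower)
    p_upper = STD_NORMAL.cdf(z_upper)
    return z_lower, p_lower, z_upper, p_upper


# Rounded values reported for the replication-only meta-analysis.
se = se_from_ci(-0.01, 0.22)
z_lower, p_lower, z_upper, p_upper = tost_z(0.10, se, bound=0.20)
print(f"lower: z = {z_lower:.2f}, p = {p_lower:.4f}")
print(f"upper: z = {z_upper:.2f}, p = {p_upper:.4f}")
```

Because the inputs are rounded to two decimals, test statistics recomputed this way should be expected to match the reported values only approximately.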

Meta-Analysis Including McCarthy & Skowronski (2011), Study 1b
The effect size from McCarthy and Skowronski (2011), Study 1b was d = 0.41 (95% CI [0.02, 0.80]). A second meta-analysis was conducted that included this original effect in a common meta-analysis with the effects from the current replication attempts. When including the effect size from the original study, the overall effect was d = 0.13 (95% CI [0.02, 0.24]) and was significantly greater than zero (z = 2.25, p = .02) (see Figure 1).
Across all of the samples, the overall effect size was significantly greater than the lower bound of d = -0.20 (z = 5.81, p < .001) but was not significantly less than the upper bound of d = +0.20 (z = -1.30, p = .10), using one-sided hypothesis tests. Thus, although the meta-analytic effect size estimate was modest in magnitude, it was not significantly smaller than our smallest effect size of interest.
We also examined whether the magnitude of the effects within the current replication attempts was different from the original effect. There was not a significant amount of variability beyond what would be expected by chance present in the meta-analysis (Q(8) = 8.41, p = .39, I² = 0.00%, τ = 0.0004). Further, the original effect (i.e., d = 0.41) was only the second largest effect size in the meta-analysis (i.e., one sample produced an effect size of d = 0.63 in the hypothesized direction). Collectively, the effect sizes from the replication attempts do not appear to be different from the effect size of the original study.

Discussion
The results of the current replication attempts demonstrated that individuals described as "warm" were judged more positively than individuals described as "cold," which is consistent with several previous findings (e.g., Anderson & Barrios, 1961; Asch, 1946; Nauts et al., 2014). However, the current results are more mixed regarding whether this effect was moderated by the information being framed as temporally distant or temporally near. On the one hand, based on an overall effect that was not significantly different from zero, the current replication studies failed to detect the effect observed in McCarthy and Skowronski (2011). On the other hand, the overall effect size estimate was significantly greater than zero when the effect from the original study was considered. The seeming discrepancy between these two conclusions arises because the lower bound of the 95% confidence interval is nearly zero in both meta-analyses and barely becomes positive when the effect from the original study is included. Notably, the magnitude of the effect size estimate is nearly the same in both meta-analyses. Thus, we believe a reasonable assessment is that the population effect is likely to be positive and small.
In addition to demonstrating a moderator of how participants aggregate trait information into an impression, the current results are consistent with the tenets of Construal Level Theory (e.g., Shapira, Liberman, Trope & Rim, 2012; Trope & Liberman, 2010). Construal Level Theory posits that increasing psychological distance (hypotheticality or spatial, temporal, or social distance) will increase the extent to which perceivers will process information globally. Thus, to the extent the temporal distance manipulations actually affected perceivers' tendencies to process information globally, and to the extent the impression formation task is an appropriate outcome to capture changes in perceivers' tendencies to process information globally, the current results are potentially of broad interest for social-cognitive researchers. However, despite the current results being in the hypothesized direction, the effect was small, which suggests the specific manipulation was not very potent. Although it is encouraging to find evidence that is consistent with Construal Level Theory, it would behoove researchers to find reliable and more potent manipulations that can produce effects that are more affordable for researchers to study.
It also is notable that the effects of the current replication attempts were slightly smaller in magnitude than those of McCarthy and Skowronski (2011). We offer two speculations on why this may be. First, it is possible the effects in the replication attempts were smaller than the effect in the original study merely because of sampling error. This possibility seems likely given that the heterogeneity of the effect sizes was not significantly different from what would be expected by chance. This would mean that the original studies overestimated the population effect size and that the replication attempts are not inconsistent with the effects in the original studies but were merely closer to the smaller population effect size. Second, it is possible that changes in the methods resulted in a smaller effect size estimate in the replication attempts. We do not believe this is likely because we used the same stimuli as the original studies. The most obvious modification of the methods is that the current data were collected online, but there is no obvious reason why this would greatly affect the magnitude of the effect.
Given the overall evidence, we believe that temporal distance has an impact on participants' impressions that are based on lists of descriptors that include "warm" or "cold." However, this effect is likely to be small, which means that future research in this area would need to use much more statistically powerful tests (e.g., larger samples, more potent manipulations, within-participants designs) than the previous research on this effect.